Originally posted at techsays.com on February 25th, 2007.
As you might have read from an earlier post, I’ve been given the task of building a Proxy server and an Anti-Spam/Anti-Virus server for a client.
I haven’t picked a software application for the mail portion of this server, but I’m going to be using Squid for proxying web traffic. I’m also going to use Sarg to parse the Squid logs and make pretty graphs for me.
So let’s get right to it.
Squid Configuration
Here’s my Squid configuration template that I use:
cat /etc/squid/squid.conf
# /etc/squid/squid.conf
# To find what these entries mean, see /etc/squid/squid.conf.original
http_port 3128
# visible_hostname sub.domain.local
log_ip_on_direct off
log_fqdn on
error_directory /etc/squid/errors
cache_access_log /var/log/squid/access.log
cache_log /var/log/squid/cache.log
# This entry means that by invoking squid -k rotate, the logfiles
# will get rotated. Remove the logrotate.d/squid file and call
# squid -k rotate after the sarg.monthly job
logfile_rotate 1
# These are default values from the original squid.conf file
hierarchy_stoplist cgi-bin ?
acl QUERY urlpath_regex cgi-bin \?
no_cache deny QUERY
refresh_pattern ^ftp: 1440 20% 10080
refresh_pattern ^gopher: 1440 0% 1440
refresh_pattern . 0 20% 4320
#----------[ ACCESS LISTS
acl all src 0.0.0.0/0.0.0.0
acl manager proto cache_object
acl localhost src 127.0.0.1/255.255.255.255
acl to_localhost dst 127.0.0.0/8
acl purge method PURGE
acl CONNECT method CONNECT
# acl GROUP1_SRC src 1.1.1.102 # Chris
# acl GROUP2_SRC src 1.1.1.102 # Chris
# acl GROUP2_DST dstdomain .google.com # Give only group2 access to Google
#----------[ SPYWARE
acl SPYWARE_LIST_1 dstdomain "/etc/squid/spyware_list_1.txt"
acl SPYWARE_LIST_2 dstdomain "/etc/squid/spyware_list_2.txt"
#----------[ ALLOWED DOMAINS
acl ALLOWED_LIST_1 dstdomain "/etc/squid/allowed_domains.acl"
#----------[ ALLOWED PORTS
acl ssl_ports port 443
acl safe_ports port 80
acl purge method PURGE
acl CONNECT method CONNECT
no_cache deny all
#----------[ ALLOW/DENY LIST
http_access allow manager localhost
http_access deny manager
http_access allow purge localhost
http_access deny purge
http_access deny !safe_ports
http_access deny CONNECT !ssl_ports
http_access allow localhost
#----------[ SPECIAL ALLOWS
# Block all access from this source
#http_access deny GROUP1_SRC all
# Normal allow lists
#http_access allow GROUP2_SRC GROUP2_DST
#http_access deny GROUP2_SRC all
# Allow everyone else access to whitelisted sites
#http_access allow ALLOWED_LIST_1
#----------[ DENY
# Deny everyone access to these sites
# This list is updated by a script that runs every night
http_access deny SPYWARE_LIST_1
# This list is updated manually by me
http_access deny SPYWARE_LIST_2
# Show these error messages
deny_info ERR_SPYWARE_ACCESS_DENIED SPYWARE_LIST_1
deny_info ERR_SPYWARE_ACCESS_DENIED SPYWARE_LIST_2
#----------[ DEFAULT ALLOW
# If you weren’t blocked, then you’re allowed out
http_access allow all
icp_access allow all
#----------[ MISC SETTINGS
coredump_dir /var/spool/squid
cache_mgr root
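Since several of the ACLs above point at external files, it can help to confirm those files exist before reloading Squid. The following is only a sketch: missing_acl_files is a hypothetical helper name, not something that ships with Squid, and it assumes the file paths appear in double quotes on acl lines as they do in my config.

```shell
#!/bin/bash
# Hypothetical helper (not part of Squid): print every file path that an
# "acl" line in the given config references in double quotes but that
# does not exist on disk.
missing_acl_files() {
    local conf="$1" path
    # Keep only uncommented acl lines, then pull out the quoted path.
    grep -E '^[[:space:]]*acl[[:space:]]' "$conf" \
        | sed -n 's/.*"\([^"]*\)".*/\1/p' \
        | while read -r path; do
              [ -e "$path" ] || echo "$path"
          done
}

# Example call (commented out so the sketch can be sourced anywhere):
# missing_acl_files /etc/squid/squid.conf
```

If the function prints nothing, every quoted ACL file was found.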
Blocking malicious sites
SPYWARE_LIST_1 and SPYWARE_LIST_2 exist to block malicious sites, starting with the ones that the MVPS group finds. They publish their list of malicious sites as a HOSTS file for Windows that you can import to protect yourself. I’ve taken that list and written a parser that turns it into something Squid can use.
The difference between list 1 and list 2 is that list 1 is overwritten daily by the MVPS file and list 2 is manually updated by me. That way I can add custom hosts to block that I know will always be blocked.
The MVPS list is very similar to AdBlock for Firefox. If you visit a site that has a lot of advertisements, with AdBlock you can pick these advertisement sites out of the list to block. With my MVPS list, I don’t have to worry about keeping up with distributing these lists to all of my users running Firefox and, with the list being on the proxy, it also applies to the users running Internet Explorer, obviously.
I wrote a custom error message for the sites that get blocked by my spyware lists; it simply consists of a bunch of blank lines. Without it, each blocked element of a page shows the standard Squid error message, and on poorly written sites that don’t specify table sizes, that message can take up a lot more space than the advertisement did. With my error message, all you’ll see is a blank spot on the page. There are probably more elegant ways to pull this off, but I’ve been running my stuff this way for at least 2 years with no problems.
Thanks to Luke from http://terminally-incoherent.com/blog/ my snippet below actually looks like HTML code.
cat /etc/squid/errors/ERR_SPYWARE_ACCESS_DENIED
<html>
<head>
<title> </title>
</head>
<body topmargin="0" leftmargin="0" marginheight="0" marginwidth="0">
<br><br><br><br><br><br><br>
<br><br><br><br><br><br><br>
<br><br><br><br><br><br><br>
<br><br><br><br><br><br><br>
<br><br><br><br><br><br><br>
<br><br><br><br><br><br><br>
<br><br><br><br><br><br><br>
<br><br><br><br><br><br><br>
<br><br><br><br><br><br><br>
<br><br><br><br><br><br><br>
<br><br><br><br><br><br><br>
<br><br><br><br><br><br><br>
<br><br><br><br><br><br><br>
</body>
</html>
The script that I use to download and parse the MVPS hosts file:
cat bin/update_spyware_rules.sh
#!/bin/bash
#----------[ Changelog
# Created 2007.02.18 by Chris Davis
#----------[ Notes
# Here is the cronjob for this script
# Make sure the cronjob script is executable
# cat /etc/cron.d/update_spyware_rules
# 1 2 * * * root /home/user/bin/update_spyware_rules.sh
#----------[ Variables
URL1=http://everythingisnt.com/hosts
URL2=http://www.mvps.org/winhelp2002/hosts.txt
SPYWARE_DIR=/home/user/spyware/
TEMP_SPYWARE_LIST_2=$SPYWARE_DIR/temp_spyware_list_2.txt
SPYWARE_LIST_2=$SPYWARE_DIR/spyware_list_2.txt
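# Squid's SPYWARE_LIST_1 is the automatically generated list (list 2 is
# edited by hand), so the parsed output goes to spyware_list_1.txt
SQUID_SPYWARE_LIST_2=/etc/squid/spyware_list_1.txt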
EMAIL_THIS=$SPYWARE_DIR/email_this.txt
ADMIN_EMAIL=user@domain.local
#----------[ Script
# If the spyware directory doesn't exist, create it.
if [ ! -d "$SPYWARE_DIR" ]
then mkdir -p "$SPYWARE_DIR"
fi
# Download the MVPS HOSTS file
wget $URL2 -O $TEMP_SPYWARE_LIST_2
# Parse the newly downloaded file to work with Squid
grep '127\.0\.0\.1' "$TEMP_SPYWARE_LIST_2" | sed 's/127\.0\.0\.1 //g' > "$SPYWARE_LIST_2"
grep -v localhost "$SPYWARE_LIST_2" | cut -d "#" -f 1 > "$SQUID_SPYWARE_LIST_2"
# Get a few stats about the new file and email them to the admin
wc -l $SPYWARE_LIST_2 > $EMAIL_THIS
echo "---- " >> $EMAIL_THIS
ls -lt /etc/squid >> $EMAIL_THIS
cat $EMAIL_THIS | mail -s SQUID $ADMIN_EMAIL
# Reload Squid to load the new files
/etc/init.d/squid reload
If you don’t have the mail command, you can find it in the Debian mailx package.
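The parsing step in the script above can be tried on sample input without touching Squid. This is a slightly stricter variant of my pipeline, not a copy of it: hosts_to_domains is a name made up for the example, the sample domains are placeholders, and it assumes each entry starts with 127.0.0.1 followed by whitespace.

```shell
#!/bin/bash
# Sketch: read HOSTS-style lines on stdin, write one domain per line.
hosts_to_domains() {
    # 1. keep only entries that start with 127.0.0.1
    # 2. strip the address, any trailing "# comment", and trailing
    #    whitespace (including a carriage return from DOS line endings)
    # 3. drop the localhost entry and any lines left empty
    grep '^127\.0\.0\.1[[:space:]]' \
        | sed -e 's/^127\.0\.0\.1[[:space:]]*//' \
              -e 's/[[:space:]]*#.*$//' \
              -e 's/[[:space:]]*$//' \
        | grep -v -e '^localhost$' -e '^$'
}

# Made-up sample input; the domain names are placeholders.
printf '%s\n' \
    '# comment line' \
    '127.0.0.1  localhost' \
    '127.0.0.1  ads.example.com  #[tracker]' \
    '127.0.0.1  banner.example.net' \
    | hosts_to_domains
```

For the sample input this prints ads.example.com and banner.example.net, one per line. Unlike my original pipeline, it only drops the exact localhost entry rather than every line containing the word.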
Now that Squid is installed and configured, we need a way to create pretty HTML reports from the logs. I use a script called Sarg for this task.
I installed Sarg using Debian’s package management system so all of my files ended up in /etc/squid/. You’ll also need to manually download and edit the sarg-reports file here: http://www.initzero.it/products/opensource/sarg-reports/download/sarg-reports
Read the file for instructions on how to set everything up, including the cron job.
There are four jobs that are going to run: today, daily, weekly, and monthly.
The Today job runs every hour from 8am to 6pm. This keeps your Squid reports updated every hour.
The Daily job runs at midnight of every day.
The Weekly job runs on the first hour of the first day of every week.
The Monthly job runs on the 30th minute of the second hour of the first day of every month.
Since you’re going to be processing Monthly reports, it is very important to update your log rotation schedule to NOT rotate the logs on a daily basis.
After you edit the sarg-reports script, which I keep in /etc/squid with the rest of the Squid files, you’ll need to edit the /etc/logrotate.d/squid file. Basically, I comment out everything in that file rather than deleting it. This way, an apt-get upgrade won’t recreate the file without telling me and mess up my Sarg log rotation. Debian’s apt-get is good about telling me when a modified conf file is about to be replaced, so by commenting out the contents of the file, I’m pretty sure that I’ll be notified if the file is ever updated.
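Commenting out the file by hand is easy enough, but here is a rough sketch of doing it from the shell. comment_out_file is a hypothetical helper name, it relies on GNU sed’s -i option, and it takes the backup location as a second argument so that the backup copy can live outside /etc/logrotate.d.

```shell
#!/bin/bash
# Sketch: comment out every line of a file that is not already a comment.
#   $1 = file to edit in place
#   $2 = where to save an untouched copy first (keep it outside the
#        logrotate.d directory so logrotate cannot pick it up)
comment_out_file() {
    local file="$1" backup="$2"
    cp -p "$file" "$backup" || return 1
    # Prefix "# " to lines whose first character is not "#".
    # Empty lines are left alone. Assumes GNU sed for -i.
    sed -i -e 's/^\([^#]\)/# \1/' "$file"
}

# Example call (paths are placeholders for illustration):
# comment_out_file /etc/logrotate.d/squid /root/logrotate-squid.orig
```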
I add the squid -k rotate job at the end of the monthly report creation to make sure the logs are rotated immediately afterwards.
grep sarg /etc/crontab
00 08-18/1 * * * root /etc/squid/sarg-reports today
00 00 * * * root /etc/squid/sarg-reports daily
00 01 * * 1 root /etc/squid/sarg-reports weekly
30 02 1 * * root /etc/squid/sarg-reports monthly && squid -k rotate
And you’re done. The only step left to do is to reconfigure your browser (I highly recommend SwitchProxy for Firefox) to use your new proxy.
[EDIT]
Oh yeah, I forgot to include the Webmin part of this. If you read the SoW that Jeff originally sent me, then you’ll know that they want to have control over what gets blocked and what doesn’t.
So I installed Webmin and the webmin packages for postfix and sarg. I don’t have access to a Linux box right now but if you search for them you’ll find them.
apt-cache search webmin postfix; apt-cache search webmin sarg
I just followed the defaults to install them.
The cool thing about Webmin is that you don’t need Apache to use it (for those that don’t want to run unnecessary services on your servers).
I had to edit the /etc/webmin/miniserv.conf file to give my IP access to the GUI. After editing the file, make sure you restart Webmin using /etc/init.d/webmin restart.
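For reference, here is a sketch of that edit done from the shell. It assumes access is controlled by a single allow= line with space-separated addresses, so check your own miniserv.conf before relying on it. allow_webmin_ip is a made-up helper name and 192.168.1.50 is a placeholder address.

```shell
#!/bin/bash
# Sketch: add an address to the allow= line of a miniserv.conf-style file.
# Assumption: the setting is one "allow=" line with space-separated entries.
#   $1 = config file, $2 = address to allow
allow_webmin_ip() {
    local conf="$1" ip="$2"
    if grep -q '^allow=' "$conf"; then
        # Exit status 0 from awk means the address is already listed;
        # otherwise append it to the existing line (GNU sed -i assumed).
        awk -F'[= ]' -v ip="$ip" '
            /^allow=/ { for (i = 2; i <= NF; i++) if ($i == ip) found = 1 }
            END { exit !found }' "$conf" \
            || sed -i "s|^allow=.*|& ${ip}|" "$conf"
    else
        # No allow= line yet, so add one.
        echo "allow=${ip}" >> "$conf"
    fi
}

# Example call (commented out; the address is a placeholder):
# allow_webmin_ip /etc/webmin/miniserv.conf 192.168.1.50
```

Running it twice with the same address leaves the file unchanged the second time.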
And that’s literally it. I didn’t need to edit anything to make Webmin recognize my spyware config files. If you click on one of the filenames, Webmin opens its own text editor and lets you edit the files directly. Perfect for what the client needs.
Webmin is not pretty by default (there are a lot of themes for it), but it definitely gets the job done.
[/EDIT]