Re: [squid-users] Force ASCII encoding for access.log fields?

2014-06-27 Thread Mark DeCheser
>> [serverIP],[clientIP],
>> 4012,692,498,GET,200,º^_x°*,username,20/Jun/2014:00:06:36
>
> The log format you used does not match this log line. The format produces:
>
> [squid-listening-IP],[clientIP],
> 4012,692,498,GET,200,º^_x°*,username,20/Jun/2014:00:06:36

Thanks for the correction.  To expand on that point, on some of our
proxies, we have more than one IP being serviced by a single daemon. 
Recording which IP received the traffic is essential to proper accounting
(e.g. FreeRADIUS).

> URL-encoding is the %xx character encoding, it can be (and is) applied
> to anything which can legitimately contain non-ASCII characters or ASCII
> special characters. Content-Type header is not one of those places.
>
> You can use the '#' format modifier to URL-encode that %mt field
> explicitly. Like so:  %#mt

Amos, thank you so much for sharing this.  I plan to try it as soon as ...

> If you will share the exact Squid version you are using I would also
> like to check the code to see if the mt code is being correctly setup,
> that log entry looks a bit like random memory being displayed as if it
> were text.

... as soon as I finish upgrading from squid-3.1.10-16.el6 to
3.1.10-20.el6, both of which are packaged and delivered via the CentOS
repo :).  Totally ashamed I didn't even notice there was an update
available before posting.  I plan to schedule an outage to patch and I'll
report back with my findings.  If you suspect that random memory is being
written to the log because of this outdated version of Squid, and that even
the newer build I plan to move to will not fix it, please let me know.

This particular proxy is pretty active.  We're averaging between 800,000 and
1.2M lines in the access log per day.  The proxy is non-caching, running
with 512MB RAM and 1GB swap (don't ask).

More soon,
MD



[squid-users] Force ASCII encoding for access.log fields?

2014-06-26 Thread Mark DeCheser
Hi everyone --

I recently ran into a strange condition within my Squid access logs which
is making importing the events into a database a bit more difficult.
Note, I am not logging directly to a database, but rather parsing events
into a centralized database via batch/cron.

Some fields in the access log, mainly the Content-Type field as far as I
can see, are being recorded with non-ASCII characters.  When I attempt to
import the log into PostgreSQL, psql barfs.

Our logfile format in our Squid config looks like this:

logformat my-custom %la,%>a,%10tr,%>st,%Hs,%mt,%[un,%tg
access_log /var/log/squid/access.log my-custom

Some examples of the events look like this:

[serverIP],[clientIP],4012,692,498,GET,200,º^_x°*,username,20/Jun/2014:00:06:36
[serverIP],[clientIP],4012,564,795,GET,200,text/css,username,20/Jun/2014:00:06:36
[serverIP],[clientIP],,4191,681,528,GET,200,application/javascript,username,20/Jun/2014:00:06:36

[serverIP],[clientIP],4322,457,25813,GET,200,application/javascript,eadqnfkx,20/Jun/2014:00:07:21
[serverIP],[clientIP],627,907,499,GET,200,°Z<90><8f>^X+,username,20/Jun/2014:00:07:21
[serverIP],[clientIP],627,912,499,GET,200,@Ì^Px°*,username,20/Jun/2014:00:07:21
[serverIP],[clientIP],627,898,499,GET,200,<90>KPñx+,username,20/Jun/2014:00:07:21
[serverIP],[clientIP],627,907,497,GET,200,p<91><96>^U,username,20/Jun/2014:00:07:21

I'm running Squid instances on VPSes in a number of different countries. 
This particular Squid instance is in Norway, and coincidentally enough
happens to be the only VPS delivered to my organization that wasn't
already set to en_US.UTF-8.

# cat /etc/sysconfig/i18n
LANG="en_US.UTF-8"
SYSFONT="latarcyrheb-sun16"
# echo $LANG
en_US.UTF-8

It could be a coincidence, but given that I have instances all over the
world and only this one is giving me trouble, it strikes me as an odd one.

Ideally, if it's possible for Squid to force some kind of hex encoding for
this Content-Type (or really, for any field that receives non-ASCII
characters), that would be optimal.  There are downstream alternatives,
which include finding and replacing non-ASCII chars in a preparation script.
There's also the option to change the charset of the database itself so
that it doesn't complain, but these alternatives seem a little reactive.
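
As a rough illustration of the preparation-script alternative mentioned
above, the following Python sketch rewrites a log line so that every byte
outside printable ASCII becomes a %XX escape.  This is only a sketch under
assumptions, not something Squid provides: the names escape_line and
escape_file are hypothetical, the sample bytes are made up, and the code
works on raw bytes so that no locale or charset guess is needed.

```python
def escape_line(raw: bytes) -> str:
    """Return raw as pure-ASCII text, writing unsafe bytes as %XX."""
    out = []
    for b in raw:
        # Keep printable ASCII (space through tilde) except '%', which is
        # escaped so the result can be decoded unambiguously later.
        if 0x20 <= b <= 0x7E and b != 0x25:
            out.append(chr(b))
        else:
            out.append("%{:02X}".format(b))
    return "".join(out)


def escape_file(src, dst) -> None:
    """Copy binary stream src to text stream dst, escaping line by line."""
    for raw in src:
        dst.write(escape_line(raw.rstrip(b"\r\n")) + "\n")


# Made-up sample bytes, loosely modelled on the damaged Content-Type field.
print(escape_line(b"GET,200,\xba\x1fx\xb0*,username"))
```

From cron, this could be driven with something like
escape_file(sys.stdin.buffer, sys.stdout) after importing sys.  It cannot
repair a record whose garbage happens to contain a comma or a newline; it
only guarantees that the output is plain ASCII.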

I've reviewed:  http://www.squid-cache.org/Doc/config/logformat/
I also tried using iconv unsuccessfully: 
http://stackoverflow.com/questions/12999651/how-to-remove-non-utf-8-characters-from-text-file

It essentially leaves me with offset fields/columns in the logfile.
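
One way to spot the offset fields/columns described above is to count the
comma-separated fields on each line before importing.  The sketch below is
hypothetical: the helper name find_offset_lines is made up, and the expected
count of 10 is an assumption counted from the sample events earlier in this
message (treating each server/client prefix and the values that follow it as
one record), so it would need adjusting to the logformat actually in use.

```python
def find_offset_lines(lines, expected_fields=10, sep=","):
    """Return (line_number, field_count) for every line whose count differs.

    expected_fields=10 is an assumption based on the sample events; change
    it to match the real logformat.
    """
    bad = []
    for number, line in enumerate(lines, start=1):
        count = len(line.rstrip("\r\n").split(sep))
        if count != expected_fields:
            bad.append((number, count))
    return bad


sample = [
    "[serverIP],[clientIP],4012,564,795,GET,200,text/css,username,20/Jun/2014:00:06:36\n",
    # A made-up damaged line: a stray comma inside the Content-Type field
    # pushes every later column one position to the right.
    "[serverIP],[clientIP],627,907,499,GET,200,ab,cd,username,20/Jun/2014:00:07:21\n",
]
print(find_offset_lines(sample))
```

Only the second sample line is flagged here.  Lines reported this way could
be set aside for manual review rather than loaded into the database with
shifted columns.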

I also reviewed Amos' comment here: 
http://www.squid-cache.org/mail-archive/squid-users/201109/0343.html

The difference in my case is that I'm dealing with Content-Type, not URL. 
The potential for this condition to be found elsewhere is within the realm
of possibility (username, for example), but presently not an immediate
concern.

The community's advice would be greatly appreciated.

Thanks,
Mark DeCheser




[squid-users] Squid response time (tr)

2013-11-05 Thread Mark DeCheser
Hello everyone --

I'm cooking up a custom log in Squid, and one of my planned data points is
Squid response time (tr), as defined here:

http://www.squid-cache.org/Doc/config/logformat/

I am not running a caching proxy.

Question 1:

Does the response time value reflect the time spent transferring a file
from the remote web server to the proxy server, or the time spent by the
squid daemon handling the request in total?

Question 2:

Can someone demonstrate the usefulness of capturing this value for the
purposes of tuning or troubleshooting the Squid proxy?

Thanks everyone,

Mark DeCheser