Endre,

You also might try this filter script I've written to cleanup
the logs before passing to calamaris, it removes some characters
that cause calamaris problems.

#!/usr/local/bin/gawk -f
{
        if ( $0 ~ /[^\x20-\x7E]/ ) {
                gsub( /\x20\x0C/, "" )
                gsub( /[\x00-\x1F]/, "" )
                gsub( /[\x7F-\xFF]/, "" )
        }

        if ( $5 < 0 ) {
                $5 *= -1
        }

        if ( $2 !~ /[^0-9]/ && $5 !~ /[^0-9]/ && ( $11 == "ALLOW" || $11 == "DENY" ) ) 
{
                print $1,$2,$3,$4,$5,$6,$7,$8,$9,$10
        }
        else {
                next
        }
}



--
Kirk Schneider                          972-952-4645 (work)
Raytheon Corporate IT Security          214-912-8679 (cell)
[EMAIL PROTECTED]                 888-431-7621 (pager)

"If you think the problem is bad now just wait until we've solved it."



-------- Original Message --------
Subject: [squid-users] Calamaris
Date: Mon, 1 Mar 2004 17:43:52 +0100
From: Endre Szekely-Bencedi <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]

Hello List,

I have a problem with Calamaris (v2.58).

I am using squid 2.5stable3, compiled from sources, with SmartFilter
plugin.
As far as I know, I have to use the squid-extended input type for this. But
this will give some errors:

[EMAIL PROTECTED] logs]# date;cat test.log | /usr/local/squid/bin/calamaris
-f squid-extended -F html > /var/www/html/calamaris2.html;date
Mon Mar  1 17:44:08 CET 2004
Malformed UTF-8 character (unexpected non-continuation byte 0x31,
immediately after start byte 0xf3) in split at (eval 1) line 20, <> line
369578.
Malformed UTF-8 character (unexpected non-continuation byte 0x31,
immediately after start byte 0xf3) in split at (eval 1) line 20, <> line
369578.
Split loop at (eval 1) line 20, <> line 369578.
Mon Mar  1 17:48:05 CET 2004
[EMAIL PROTECTED] logs]#

Generated log shows:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META http-equiv=Content-Type content="text/html;
charset=iso-8859-1"></HEAD>
<BODY></BODY></HTML>

Which is an empty page.

A sample from the logfile:

1077780471.441     93 3.227.65.74 TCP_MISS/302 476 GET
http://sher.index.hu/ad? - DEFAULT_PARENT/10.20.20.254 text/html text/html
ALLOW Portal
 Sites
1077780471.466     64 3.227.65.74 TCP_MISS/200 1722 GET
http://sher.index.hu/get? - DEFAULT_PARENT/10.20.20.254 text/html text/html
ALLOW Port
al Sites
1077780471.479     72 3.227.65.74 TCP_MISS/302 477 GET
http://sher.index.hu/ad? - DEFAULT_PARENT/10.20.20.254 text/html text/html
ALLOW Portal
 Sites
1077780471.508     59 3.227.65.74 TCP_MISS/302 477 GET
http://sher.index.hu/ad? - DEFAULT_PARENT/10.20.20.254 text/html text/html
ALLOW Portal
 Sites
1077780471.699     73 3.227.65.74 TCP_MISS/200 1585 GET
http://sher.index.hu/get? - DEFAULT_PARENT/10.20.20.254 text/html text/html
ALLOW Port
al Sites
1077780471.713     83 3.227.65.74 TCP_MISS/200 1607 GET
http://sher.index.hu/get? - DEFAULT_PARENT/10.20.20.254 text/html text/html
ALLOW Port
al Sites
1077780471.726     86 3.227.65.74 TCP_MISS/200 1589 GET
http://sher.index.hu/get? - DEFAULT_PARENT/10.20.20.254 text/html text/html
ALLOW Port
al Sites
1077780471.885    256 3.227.65.74 TCP_MISS/200 726 GET
http://as.fotexnet.hu/adserver.ads/153/0///937480 -
DEFAULT_PARENT/10.20.20.254 text/ht
ml text/html ALLOW
1077780473.212    229 3.227.65.74 TCP_MISS/200 23713 GET
http://index.hu/ad/lipton/banner1_120x240.swf? -
DEFAULT_PARENT/10.20.20.254 applicat
ion/x-shockwave-flash application/x-shockwave-flash ALLOW Portal Sites
1077780473.298     72 3.227.65.74 TCP_MISS/302 477 GET
http://sher.index.hu/ad? - DEFAULT_PARENT/10.20.20.254 text/html text/html
ALLOW Portal
 Sites
1077780473.388    279 3.227.65.74 TCP_MISS/200 17697 GET
http://index.hu/ad/microsoft_wss.swf? - DEFAULT_PARENT/10.20.20.254
application/x-sho
ckwave-flash application/x-shockwave-flash ALLOW Portal Sites
1077780473.439    106 3.227.65.74 TCP_MISS/302 476 GET
http://sher.index.hu/ad? - DEFAULT_PARENT/10.20.20.254 text/html text/html
ALLOW Portal
 Sites
1077780473.458     47 3.227.65.74 TCP_MISS/302 476 GET
http://sher.index.hu/ad? - DEFAULT_PARENT/10.20.20.254 text/html text/html
ALLOW Portal
 Sites
1077780473.480    368 3.227.65.74 TCP_MISS/200 4292 GET
http://as.fotexnet.hu/adserver.ads/196/0///27236 -
DEFAULT_PARENT/10.20.20.254 text/ht
ml text/html ALLOW
1077780473.643    162 3.227.65.74 TCP_MISS/302 477 GET
http://sher.index.hu/ad? - DEFAULT_PARENT/10.20.20.254 text/html text/html
ALLOW Portal
 Sites
1077780473.646    144 3.227.65.74 TCP_MISS/302 477 GET
http://sher.index.hu/ad? - DEFAULT_PARENT/10.20.20.254 text/html text/html
ALLOW Portal
 Sites
1077780473.673    487 3.227.65.74 TCP_MISS/200 10319 GET
http://as.fotexnet.hu/adserver.ads/200/0///378158 -
DEFAULT_PARENT/10.20.20.254 text/
html text/html ALLOW
1077780473.799    280 3.227.65.74 TCP_MISS/200 26216 GET
http://index.hu/ad/teluzoallo_120x240.swf? - DEFAULT_PARENT/10.20.20.254
application/
x-shockwave-flash application/x-shockwave-flash ALLOW Portal Sites
1077780473.819    122 3.227.65.74 TCP_MISS/200 216 GET
http://sher.index.hu/get? - DEFAULT_PARENT/10.20.20.254 text/html text/html
ALLOW Porta
l Sites
1077780473.824    124 3.227.65.74 TCP_MISS/200 355 GET
http://sher.index.hu/get? - DEFAULT_PARENT/10.20.20.254 text/html text/html
ALLOW Porta
l Sites
1077780473.842    136 3.227.65.74 TCP_MISS/200 1603 GET
http://sher.index.hu/get? - DEFAULT_PARENT/10.20.20.254 text/html text/html
ALLOW Port
al Sites
1077780473.846     47 3.227.65.74 TCP_MISS/200 353 GET
http://sher.index.hu/get? - DEFAULT_PARENT/10.20.20.254 text/html text/html
ALLOW Porta
l Sites

Am I doing something wrong?

Thanks,
Endre.




Reply via email to