I'm using Analog to analyze the Web server logs on a site that
currently receives 1.5 million hits per month and growing. In the
Request Report I'm seeing some annoying results that I'd like to find
a way to eliminate. In each case, requests that returned the same file
to the user were formatted in a multitude of different ways, and each
variant gets its own line in the Request Report. Here is an excerpt
from a recent report:
181807: /money/milpay/
21: /money/milpay/index.htm/
1: /money/milpay/index.htm/99pay/
1: /money/milpay/index.htm/99pay/99pay/
1: /money/milpay/index.htm/99pay/99pay/99pay/
1: /money/milpay/index.htm/99pay/99pay/99pay/98pay/
1: /money/milpay/index.htm/99pay/99pay/99pay/99pay/
1: /money/milpay/index.htm/99pay/99pay/99pay/99pay/99pay/
1: /money/milpay/index.htm/99pay/99pay/99pay/99pay/99pay/99pay/
Since these requests all returned the same file
(/money/milpay/index.htm) I'd like to find a way to roll all these
items into one single entry on the report. If it were just the file
in my example causing the problem I'd try using FILEALIAS or something
similar to fix it. Unfortunately, it's happening all over the report in
ways that are difficult to predict.
What I'm looking for is the ability in Analog to process the request
line as follows:
If the request portion of the line reads something like
"/money/milpay/index.htm/98pay/99pay/99pay/" return a line that reads
"/money/milpay/index.htm" instead. This would necessitate an ALIAS type
command that would read:
AliasCommandOfSomeKind "*.htm*" "*.htm"
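
To make the intended rewrite concrete, here is a minimal Python sketch of the same rule applied outside Analog. It is only an illustration, not an Analog feature: the helper name `collapse_extra_path` is made up for this example, and it assumes the files of interest end in ".htm" or ".html". Requests without extra path after the file name are left alone.

```python
import re

# Hypothetical rule: keep everything up to and including the first
# ".htm"/".html" file name and drop any extra path that follows it.
_EXTRA_PATH = re.compile(r"^(.*?\.html?)/.*$", re.IGNORECASE)


def collapse_extra_path(path: str) -> str:
    """Return the request path with anything after the .htm(l) file removed."""
    return _EXTRA_PATH.sub(r"\1", path)


if __name__ == "__main__":
    for request in (
        "/money/milpay/",
        "/money/milpay/index.htm/",
        "/money/milpay/index.htm/98pay/99pay/99pay/",
    ):
        print(request, "->", collapse_extra_path(request))
```

Note that this sketch leaves "/money/milpay/" and "/money/milpay/index.htm" as two distinct entries; merging those would need a separate rule.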
For those of you who are curious as to how the multiple lines are
occurring, I did some detective work to try to find out. Many of the
cases come from users who came to my site via a lookup on a search
engine. For whatever reason the search site returns a hit with a URL
formatted like "/money/milpay/index.htm/". You'll notice the extra "/"
on the end. Once the page from my site loads in the requesting user's
browser, each relative link on the page is then misformatted. The Web browser
appends the URL in the link to the end of "/money/milpay/index.htm/".
The user clicks on a link which should lead them to a subdirectory
"99pay/". The actual URL in the source code of the user's browser
reads "/money/milpay/index.htm/99pay/". When the user clicks on the
link, instead of going to the page they thought they were going to get,
they get "/money/milpay/index.htm", the same page they're currently
viewing. In many cases the user clicks on the link five or more
times until they finally figure out that they're not leaving the
page. This same problem is being repeated in varying locations across
the site.
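
The link resolution described above can be reproduced with Python's standard `urllib.parse.urljoin`, which follows the same relative-URL rules a browser uses. This is only a sketch of the mechanism; the host name `www.example.com` is a placeholder, not the real site.

```python
from urllib.parse import urljoin

# Placeholder host; only the path part matters for this illustration.
base_ok = "http://www.example.com/money/milpay/index.htm"
base_bad = "http://www.example.com/money/milpay/index.htm/"  # extra trailing "/"
link = "99pay/"  # relative link as it appears in the page source

# Without the trailing slash, the last segment (index.htm) is replaced.
resolved_ok = urljoin(base_ok, link)

# With the trailing slash, index.htm is treated as a directory and kept.
resolved_bad = urljoin(base_bad, link)

# Clicking the same link again from the bad URL appends another segment.
resolved_again = urljoin(resolved_bad, link)

print(resolved_ok)
print(resolved_bad)
print(resolved_again)
```

Each further click on the same link from the malformed URL adds one more "99pay/" segment, which matches the growing chain of entries in the report excerpt.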
Is anyone else seeing anything like this? If Analog does not have the
ability to fix this, does anyone have something like a Perl script
that I might use to preprocess the log files? (I'm not a Perl coder, so
I would not be able to build one myself.)
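
As a rough idea of what such a preprocessor might look like, here is a short sketch in Python rather than Perl. It rests on several assumptions: that the logs are in Common Log Format with the request in double quotes ("GET /path HTTP/1.0"), that only ".htm"/".html" files are affected, and that the names `rewrite_line` and `rewrite_stream` are hypothetical. The sample line at the bottom is made up for the demonstration.

```python
import re
import sys

# Quoted request field of a Common Log Format line (assumed log format).
REQUEST = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+)(?P<rest>[^"]*)"')
# Anything after the first .htm/.html file name counts as extra path.
EXTRA_PATH = re.compile(r"^(.*?\.html?)/.*$", re.IGNORECASE)


def rewrite_line(line: str) -> str:
    """Return the log line with extra path after a .htm(l) file removed."""
    def fix(match):
        path = EXTRA_PATH.sub(r"\1", match.group("path"))
        return f'"{match.group("method")} {path}{match.group("rest")}"'
    return REQUEST.sub(fix, line, count=1)


def rewrite_stream(src, dst) -> None:
    """Copy log lines from src to dst, rewriting each request path."""
    for line in src:
        dst.write(rewrite_line(line))


if __name__ == "__main__":
    # Made-up sample line, used only to show the transformation.
    sample = ('192.0.2.1 - - [01/Jan/1999:00:00:00 -0500] '
              '"GET /money/milpay/index.htm/99pay/99pay/ HTTP/1.0" 200 1234\n')
    sys.stdout.write(rewrite_line(sample))
```

To filter a real log, one could call `rewrite_stream(sys.stdin, sys.stdout)` and redirect the log file through the script before handing the result to Analog.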
Any assistance will be greatly appreciated. Thanks in advance,
Ed Kabat
Web Administrator
Defense Finance and Accounting Service
--------------------------------------------------------------------
This is the analog-help mailing list. To unsubscribe from this
mailing list, send mail to [EMAIL PROTECTED]
with "unsubscribe analog-help" in the main BODY OF THE MESSAGE.
--------------------------------------------------------------------