From: Barbara Kantzos@EDF on 05/06/99 12:22 PM

Subject:  HELP! using analog for search term analysis via get query string
      isolation

Hi There,

I am using analog 3.11 running on Windows NT for MS IIS 4.0 log files.  I
have been attempting to use analog to analyze which search terms people
submit when using my site's MS IIS 4.0 provided search engine.  I have tried
to do this by limiting the requests with REQINCLUDE to
http://www.edf.org/cgi-bin/search/edf.idq*  which has the query string
attached after this URL.
I realize that IIS log files cut the query string off of the request field
and store it at the end of the log file line (whereas this doesn't seem to
be the case with the referrer field of the log file).  I have appropriately
defined the logfile format in my configuration file and included %q to
include the query string.

I have encountered several problems --

When I run a monthly report with no restrictions other than pageinclude for
html and idq pages, my request report indicates 6888 visits to the search
page before submission of a search term.  The referrer report indicates
that the search page was a referrer 16912 times.  I don't get it.

I thought the difference might be explained if images were included, but
they aren't, and there are quite a few on this page, so the difference would
be more than ~2x if I were overlooking some parameter.

Also, I have a quick index on the search page which I am sure that many
people use to find what they are looking for -- meaning they leave the page
without using the search engine at all.  This makes me think that there
would be more requests for the search index page than actual searches.
But there are about 10,000 requests for cgi-bin/search...  Again, more
searches than requests for the search index page.  Any thoughts?

Next problem.  I manually analyzed one day's log file to see which numbers
are correct.  I get half as many requests in the analog report with the
search index page isolated with REQINCLUDE as I do counting manually.
What's up with that?  It may be related to the above numbers differing by
approximately 2x, but I am not sure.
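
For reference, the kind of manual count described above could be scripted
roughly as follows.  This is only a sketch under assumptions: it presumes
the log is in W3C extended format with a "#Fields:" line naming the
cs-uri-stem and cs-uri-query columns, and the function name and default
path prefix are made up for illustration.

```python
from collections import Counter


def count_requests(lines, stem_prefix="/cgi-bin/search/edf.idq"):
    """Count log entries whose URI stem starts with stem_prefix.

    Assumes W3C extended log lines preceded by a '#Fields:' directive.
    Returns a Counter with keys 'with_query' and 'without_query'.
    """
    fields = []
    counts = Counter()
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The directive names the columns of the entries that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue  # skip blanks, other directives, and headerless data
        row = dict(zip(fields, line.split()))
        stem = row.get("cs-uri-stem", "")
        if not stem.lower().startswith(stem_prefix.lower()):
            continue
        query = row.get("cs-uri-query", "-")
        # A lone '-' is the conventional placeholder for an empty field.
        key = "with_query" if query not in ("", "-") else "without_query"
        counts[key] += 1
    return counts
```

Running it over one day's file, e.g. count_requests(open("one_day.log"))
with a hypothetical file name, gives two totals that can be compared
directly against the analog request report for the same day.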

This is all confusing for me, but the real purpose is to isolate the query
string to determine most frequently searched terms.

So, I tried this with REQINCLUDE, but only got about 1000 query strings for
the month when I know that there are 10x as many searches indicated by a
listing in the report of the cgi-bin/search... without any strings
attached.  How come I only get the query string 10% of the time?  I reduced
the reqfloor to 1, so I know it's not a 'not listed problem'.
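
To make the goal concrete, here is a rough sketch of the tally I am after,
done outside analog.  It assumes the query strings have already been pulled
out of the log, and that the search term travels in a parameter named
CiRestriction; that name is an assumption about how the .idq form is set
up, and the function name is made up for illustration.

```python
from collections import Counter
from urllib.parse import parse_qs


def top_search_terms(query_strings, param="CiRestriction", n=20):
    """Return the n most frequent values of `param` across query strings.

    `param` is an assumed parameter name; adjust it to match the form.
    """
    terms = Counter()
    for qs in query_strings:
        # parse_qs decodes '+' and %XX escapes and drops blank values.
        for value in parse_qs(qs).get(param, []):
            term = value.strip().lower()
            if term:
                terms[term] += 1
    return terms.most_common(n)
```

Terms are lower-cased so that "Ozone" and "ozone" count as one search;
whether that is the right grouping depends on how the search engine itself
treats case.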

Also,  when I do a refinclude, does that apply to my other analyses such as
my request report?  Or if I do a reqinclude, will that apply to my referrer
report?  My pageinclude applies to all reports?  Maybe I am making some
erroneous assumptions based on includes and excludes.

I think this search term analysis would be a really useful way for me to
find out what content visitors are really interested in, so that the content
developers have an idea of what's missing from our site etc., and I don't
seem to have an IIS provided search log.  Please don't send 'idiot mail' if
there is some help on this in the documentation or the answers are obvious.
I am new to this.

Any help on this would be appreciated.  Thanks so much.
Barbara Kantzos
Environmental Defense Fund NYC


--------------------------------------------------------------------
This is the analog-help mailing list. To unsubscribe from this
mailing list, send mail to [EMAIL PROTECTED]
with "unsubscribe analog-help" in the main BODY OF THE MESSAGE.
--------------------------------------------------------------------
