Re: [analog-help] complete list of accessed files?

2009-08-12 Thread Aengus
Christoph Kukulies  wrote:
> Aengus wrote:
>> On 8/12/2009 10:47 AM, Christoph Kukulies wrote:
>> 
>> Can you post 3 or 4 lines from your log file, including some PDF
>> requests? 
> 87.79.34.253 - - [11/Aug/2009:17:58:38 +0200] "GET
> /export/download/de/AB-lang/AB-3-5-7.pdf HTTP/1.1" 200 158955
> "http://www.mysite.de/de/produkte/AB-lang//index.htm" "Mozilla/5.0
> (Windows; U; Windows NT 5.1; de; rv:1.9.1.2) Gecko/20090729
> Firefox/3.5.2 (.NET CLR 3.5.30729)"

Just as a test, I ran Analog against these 4 lines with no analog.cfg, using 
only the hardcoded defaults in Analog plus REQFLOOR 1r, and it displayed these 
entries in the Request Report.

Listing files, sorted by the number of requests.
reqs  %bytes  last time        file
   2  61.16%  12/Aug/09 15:36  /export/download/de/ab-lang/ab-plan.pdf
   1  15.78%  12/Aug/09 14:21  /export/download/de/ab-lang/xyzpp.pdf
   1  23.06%  11/Aug/09 17:58  /export/download/de/ab-lang/ab-3-5-7.pdf


analog pdf.log -G +O"report.html" +C"reqfloor 1r"

If these .pdf files aren't showing up in your reports, then you have something 
in your analog.cfg that is excluding them.
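
One way to cross-check these numbers independently of Analog is to tally the PDF requests straight from the log. The following is a minimal Python sketch, not part of Analog: the regular expression and the `tally_pdf_requests` helper are assumptions made for illustration. It assumes Apache combined-format lines, counts status 200 and 206 as successful requests, and lowercases paths to mirror the file names in the report above.

```python
import re
from collections import defaultdict

# Picks the path, status and byte count out of an Apache combined-format line.
# This pattern is an illustrative assumption, not Analog's own parser.
LOG_RE = re.compile(r'"(?:GET|HEAD|POST) (\S+) HTTP/[\d.]+" (\d{3}) (\d+|-)')


def tally_pdf_requests(lines):
    """Return {path: [requests, bytes]} for successful .pdf requests."""
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        m = LOG_RE.search(line)
        if not m:
            continue  # skip lines that do not look like requests
        path, status, size = m.group(1).lower(), m.group(2), m.group(3)
        if status in ("200", "206") and path.endswith(".pdf"):
            totals[path][0] += 1
            totals[path][1] += 0 if size == "-" else int(size)
    return dict(totals)


# The four log lines posted in this thread, trimmed to the fields used here
# (referrer and user agent dropped).
SAMPLE = [
    '87.79.34.253 - - [11/Aug/2009:17:58:38 +0200] "GET /export/download/de/AB-lang/AB-3-5-7.pdf HTTP/1.1" 200 158955',
    '217.91.80.223 - - [12/Aug/2009:09:31:33 +0200] "GET /export/download/de/AB-lang/AB-Plan.pdf HTTP/1.1" 200 212734',
    '159.51.236.51 - - [12/Aug/2009:14:21:36 +0200] "GET /export/download/de/AB-lang/XYZPP.pdf HTTP/1.1" 200 108759',
    '85.3.36.7 - - [12/Aug/2009:15:36:27 +0200] "GET /export/download/de/AB-lang/AB-Plan.pdf HTTP/1.1" 206 208857',
]

if __name__ == "__main__":
    totals = tally_pdf_requests(SAMPLE)
    all_bytes = sum(nbytes for _, nbytes in totals.values())
    # Most requested files first, with each file's share of the bytes.
    for path, (reqs, nbytes) in sorted(totals.items(), key=lambda kv: -kv[1][0]):
        print(f"{reqs:4d} {100 * nbytes / all_bytes:6.2f}% {path}")
```

If the log contains PDF requests by this count but the Request Report shows none, that points at the configuration rather than at the log.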

Aengus

+
|  TO UNSUBSCRIBE from this list:
|http://lists.meer.net/mailman/listinfo/analog-help
|
|  Analog Documentation: http://analog.cx/docs/Readme.html
|  List archives:  http://www.analog.cx/docs/mailing.html#listarchives
|  Usenet version: news://news.gmane.org/gmane.comp.web.analog.general
+


Re: [analog-help] complete list of accessed files?

2009-08-12 Thread Christoph Kukulies

Aengus wrote:
> On 8/12/2009 10:47 AM, Christoph Kukulies wrote:
>> I introduced this setting (REQFLOOR 1r) now in my config file, I also
>> included
>>
>> FILEINCLUDE *.pdf
>>
>> Still I don't see a single .pdf in the list of requested files (last
>> listing in the report).
>> I have about 3000 .pdf file requests in the original apache
>> access_log file.
>>
>> I assume the file extension (FILEINCLUDE syntax) isn't case sensitive.
>
> Actually, if Analog is running on a case sensitive OS, then the
> FILEINCLUDE probably is case sensitive.
>
> http://analog.cx/docs/alias.html#CASE
>
> Can you post 3 or 4 lines from your log file, including some PDF requests?

87.79.34.253 - - [11/Aug/2009:17:58:38 +0200] "GET
/export/download/de/AB-lang/AB-3-5-7.pdf HTTP/1.1" 200 158955
"http://www.mysite.de/de/produkte/AB-lang//index.htm" "Mozilla/5.0
(Windows; U; Windows NT 5.1; de; rv:1.9.1.2) Gecko/20090729
Firefox/3.5.2 (.NET CLR 3.5.30729)"


217.91.80.223 - - [12/Aug/2009:09:31:33 +0200] "GET 
/export/download/de/AB-lang/AB-Plan.pdf HTTP/1.1" 200 212734 
"http://www.mysite.de/de/produkte/AB-lang/AB-Plan.htm" "Mozilla/5.0
(Windows; U; Windows NT 5.1; de; rv:1.9.0.13) Gecko/2009073022 
Firefox/3.0.13"


159.51.236.51 - - [12/Aug/2009:14:21:36 +0200] "GET 
/export/download/de/AB-lang/XYZPP.pdf HTTP/1.1" 200 108759 
"http://www.mysite.de/de/produkte/AB-lang/ABACC-XYZYPP.htm" "Mozilla/4.0
(compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.0.3705; .NET CLR 
1.1.4322; .NET CLR 2.0.50727)"


85.3.36.7 - - [12/Aug/2009:15:36:27 +0200] "GET 
/export/download/de/AB-lang/AB-Plan.pdf HTTP/1.1" 206 208857 "-" 
"Mozilla/5.0 (Windows; U; Windows NT 6.0; de; rv:1.9.0.13) 
Gecko/2009073022 Firefox/3.0.13 (.NET CLR 3.5.30729)"


The first three are referred from our own pages (mysite.de).
I found that most of the requests with return code 200 are from bots 
and crawlers (Yandex, msnbot, Google search results).
I will try the suggested PAGEINCLUDE and see what happens when I 
include SEARCHENGINES again.


I hope the syntax is correct, since I must admit that I had to 
reconstruct (paste together) the main log file from different files 
(referrer.log, agents.log) because the log format changed at some 
point.


--
Christoph Kukulies



> Aengus


[analog-help] Re: complete list of accessed files?

2009-08-12 Thread Paul Wade
Christoph Kukulies  writes:

> I introduced this setting (REQFLOOR 1r) now in my config file, I also 
> included
> FILEINCLUDE *.pdf
> 
> Still I don't see a single .pdf in the list of requested files (last 
> listing in the report).
> I have about 3000 .pdf file requests in the original apache access_log file.

@Christoph - try 

PAGEINCLUDE *.pdf

Paul



Re: [analog-help] complete list of accessed files?

2009-08-12 Thread Aengus

On 8/12/2009 10:47 AM, Christoph Kukulies wrote:

> I introduced this setting (REQFLOOR 1r) now in my config file, I also
> included
>
> FILEINCLUDE *.pdf
>
> Still I don't see a single .pdf in the list of requested files (last
> listing in the report).
> I have about 3000 .pdf file requests in the original apache access_log
> file.
>
> I assume the file extension (FILEINCLUDE syntax) isn't case sensitive.


Actually, if Analog is running on a case sensitive OS, then the 
FILEINCLUDE probably is case sensitive.


http://analog.cx/docs/alias.html#CASE
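
Whether case matters for a given log can be checked against the requested paths themselves. The sketch below is only an illustration in Python and is not Analog's matcher: it uses `fnmatch.fnmatchcase` as a stand-in for a case-sensitive wildcard match, and `case_mismatches` is a hypothetical helper name. It lists the paths that a pattern misses purely because of letter case.

```python
from fnmatch import fnmatchcase


def case_mismatches(paths, pattern):
    """Return paths that match `pattern` only when letter case is ignored."""
    return [
        p for p in paths
        if not fnmatchcase(p, pattern)
        and fnmatchcase(p.lower(), pattern.lower())
    ]


# The first path is taken from the log lines in this thread; the second is a
# hypothetical variant with an upper-case extension.
paths = [
    "/export/download/de/AB-lang/AB-Plan.pdf",
    "/export/download/de/AB-lang/AB-Plan.PDF",
]
print(case_mismatches(paths, "*.pdf"))
```

An empty result would suggest that case is not what is hiding the files.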

Can you post 3 or 4 lines from your log file, including some PDF requests?

Aengus


Re: [analog-help] complete list of accessed files?

2009-08-12 Thread Christoph Kukulies

Jeremy Wadsack wrote:

> ...
>
> However, I'm not sure that answers your original question. Analog's
> reports will show the top number of results based on FLOOR settings.
> You can change the floor for a given report to have it show more data.
> For example, this will show all requests (because everything has at
> least one request):
>
> REQFLOOR 1r


I introduced this setting (REQFLOOR 1r) now in my config file, I also 
included

FILEINCLUDE *.pdf

Still I don't see a single .pdf in the list of requested files (last 
listing in the report).

I have about 3000 .pdf file requests in the original apache access_log file.

I assume the file extension (FILEINCLUDE syntax) isn't case sensitive.


> See http://analog.cx/docs/othreps.html#FLOOR for details.
>
> Also, if you just want a report that shows PDF, you can use the
> FILEINCLUDE to get that (assuming no previous FILEEXCLUDEs exist in
> your config):
>
> FILEINCLUDE *.pdf
>
> As you'll find if you peruse the archives of this list, PDF requests
> often show multiple times for a given file because some versions of
> Adobe Acrobat Reader will load a page at a time from the server. So
> the number of requests to the PDF may be higher than the actual number
> of reads.
>
> As HTTP is stateless and cached, there's no accurate way to ensure
> that you know how many actual reads you had.

OK. Good to know. I would just like to see whether there is an overall
increase, assuming the behaviour of the clients remains the same.
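
For watching a trend rather than an exact count, one rough approach is to report full responses (status 200) separately from partial ones (status 206, as in the fourth log line posted in this thread), since page-at-a-time loading shows up as several partial requests for the same file. This is a minimal Python sketch under stated assumptions: the requests are already reduced to (host, status, path) tuples, and both the sample tuples and the `summarize` helper are hypothetical.

```python
from collections import Counter


def summarize(requests):
    """Map each path to (full responses, partial responses, distinct hosts)."""
    full, partial, hosts = Counter(), Counter(), {}
    for host, status, path in requests:
        if status == 200:
            full[path] += 1
        elif status == 206:
            partial[path] += 1
        else:
            continue  # ignore redirects, errors and so on
        hosts.setdefault(path, set()).add(host)
    return {p: (full[p], partial[p], len(hosts[p])) for p in hosts}


# Hypothetical sample: one host fetching a PDF in three byte-range pieces,
# another host downloading the same file in one go.
requests = [
    ("192.0.2.1", 206, "/docs/plan.pdf"),
    ("192.0.2.1", 206, "/docs/plan.pdf"),
    ("192.0.2.1", 206, "/docs/plan.pdf"),
    ("192.0.2.2", 200, "/docs/plan.pdf"),
]
for path, (n_full, n_partial, n_hosts) in summarize(requests).items():
    print(path, "full:", n_full, "partial:", n_partial, "hosts:", n_hosts)
```

None of these figures is an exact read count, but any of them, measured the same way each month, can show whether downloads are going up.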


--
Christoph Kukulies


> --
> Jeremy Wadsack
>
> On Fri, Jul 24, 2009 at 9:52 AM, Christoph Kukulies wrote:
>
>> Our marketing people are interested in how often certain .PDF
>> documents are accessed (downloaded),
>> but depending on the hit statistics these do not appear in the
>> listed files (only under others).
>>
>> I already thought of putting index.htm (the most accessed files in
>> folders and subfolders)
>> on the FILEEXCLUDE list. This doesn't seem to work.
>>
>> FILEEXCLUDE index.htm
>>
>> doesn't make it disappear in the statistics of accessed files.
>>
>> --
>> Christoph
