[analog-help] FROM TO daily time window (possible?)
I would like to confine the statistics to only consider page visits within a certain daily time window, e.g. 7 am to 7 pm. Is this possible with a FROM/TO syntax?

TIA
Christoph

TO UNSUBSCRIBE from this list: http://lists.meer.net/mailman/listinfo/analog-help
Analog Documentation: http://analog.cx/docs/Readme.html
List archives: http://www.analog.cx/docs/mailing.html#listarchives
Usenet version: news://news.gmane.org/gmane.comp.web.analog.general
Re: [analog-help] FROM TO daily time window (possible?)
Klaus Johannes Rusch wrote:
> Christoph Kukulies wrote:
>> I would like to confine the statistics to only consider page visits within a certain daily time window, e.g. 7 am to 7 pm. Is this possible with a FROM/TO syntax?
>
> Hi Christoph,
>
> FROM/TO can only specify a contiguous timeframe, so you would be limited to a single 7am to 7pm window on a given day. To extract certain times on multiple days you will need to filter the logs prior to piping them to analog, or use logfile preprocessing: http://analog.cx/docs/logfile.html#UNCOMPRESS

Thanks, Klaus. I was already suspecting that it wouldn't work with FROM/TO. I tried

FROM *:0700 TO *:1900

or

FROM :0700 TO :1900

but both triggered syntax errors. It probably wouldn't be very difficult to implement this as a feature, but I'm not sure how "frozen" analog 6.0 is anyway :)

I couldn't work out from the link you gave me, though, how it would lead me to a means of preprocessing the logs. I could run a self-crafted program that greps/seds/awks through the logs in the analog.bat job and have analog.bat (.exe) work on the preprocessed files. But I didn't find a config keyword that specifies a preprocessor for the log file.

--
Christoph
Re: [analog-help] FROM TO daily time window (possible?)
Klaus Johannes Rusch wrote:
> Christoph Kukulies wrote:
>> I couldn't get from the link you gave me, though, in the first place, how this would lead me to a means to preprocess the logs?
>
> Correct. If you cannot get to the documentation at http://analog.cx/docs/logfile.html#UNCOMPRESS, check the page "Choosing a logfile" in your local analog documentation.
>
> Basically you specify the preprocessor as the program to "uncompress" a file (of course, if the files really are compressed, you will then need to do the actual uncompression yourself), e.g.
>
> UNCOMPRESS *.gz /var/sample/filter.sh

I see the trick. Thanks.

--
Christoph
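For readers following the thread, here is a minimal sketch of what such a preprocessing filter might look like, written in Python rather than as a shell script. It keeps only lines whose logged time of day lies between 07:00 and 19:00. The script name filter.py, the function names, and the window constants are illustrative and not from the thread. It assumes Apache common/combined log timestamps, compares against the time exactly as logged (server-local), and assumes the filter is invoked with the logfile name as its argument and writes the filtered lines to standard output, the way a decompressor such as gzip -cd would.

```python
#!/usr/bin/env python3
"""Print only the log lines whose timestamp falls inside a daily time window."""
import re
import sys

# Time-of-day part of an Apache timestamp, e.g. [11/Aug/2009:17:58:38 +0200]
TIMESTAMP = re.compile(r"\[\d{2}/[A-Za-z]{3}/\d{4}:(\d{2}):(\d{2}):\d{2} [+-]\d{4}\]")

WINDOW_START = 7 * 60   # 07:00 in minutes since midnight, inclusive
WINDOW_END = 19 * 60    # 19:00 in minutes since midnight, exclusive


def in_window(line, start=WINDOW_START, end=WINDOW_END):
    """Return True if the line's logged (server-local) time is within the window."""
    match = TIMESTAMP.search(line)
    if match is None:
        return False  # drop lines without a recognisable timestamp
    minute_of_day = int(match.group(1)) * 60 + int(match.group(2))
    return start <= minute_of_day < end


def filter_lines(lines, start=WINDOW_START, end=WINDOW_END):
    """Return an iterator over the lines that fall inside the daily window."""
    return (line for line in lines if in_window(line, start, end))


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical invocation: filter.py <logfile>; filtered lines go to stdout.
    with open(sys.argv[1], encoding="latin-1") as logfile:
        sys.stdout.writelines(filter_lines(logfile))
```

Hooked in through the UNCOMPRESS command described above, the second argument would be whatever command runs this script on your system. Whether Analog passes the file name in exactly this way on Windows is worth checking against the "Choosing a logfile" page before relying on it.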
[analog-help] complete list of accessed files?
Our marketing people are interested in how often certain .PDF documents are accessed (downloaded), but depending on the hit statistics these do not appear in the listed files (only under "others").

I already thought of putting index.htm (the most accessed file in folders and subfolders) on the FILEEXCLUDE list. This doesn't seem to work.

FILEEXCLUDE index.htm

doesn't make it disappear from the statistics of accessed files.

--
Christoph
Re: [analog-help] complete list of accessed files?
Jeremy Wadsack wrote:
> ...
> However, I'm not sure that answers your original question. Analog's reports will show the top number of results based on FLOOR settings. You can change the floor for a given report to have it show more data. For example, this will show all requests (because everything has at least one request):
>
> REQFLOOR 1r

I have now introduced this setting (REQFLOOR 1r) in my config file. I also included

FILEINCLUDE *.pdf

Still I don't see a single .pdf in the list of requested files (the last listing in the report). I have about 3000 .pdf file requests in the original apache access_log file. I assume the file extension (FILEINCLUDE syntax) isn't case sensitive.

> See http://analog.cx/docs/othreps.html#FLOOR for details.
>
> Also, if you just want a report that shows PDF, you can use the FILEINCLUDE to get that (assuming no previous FILEEXCLUDEs exist in your config):
>
> FILEINCLUDE *.pdf
>
> As you'll find if you peruse the archives of this list, PDF requests often show multiple times for a given file because some versions of Adobe Acrobat Reader will load a page at a time from the server. So the number of requests to the PDF may be higher than the actual number of reads. As HTTP is stateless and cached, there's no accurate way to know how many actual reads you had.

OK. Good to know. I would just like to know about a possible overall increase, assuming the behaviour of the clients remains the same.

--
Christoph Kukulies

> --
> Jeremy Wadsack
>
> On Fri, Jul 24, 2009 at 9:52 AM, Christoph Kukulies <k...@kukulies.org> wrote:
>> Our marketing people are interested in how often certain .PDF documents are accessed (downloaded), but depending on the hit statistics these do not appear in the listed files (only under "others").
>>
>> I already thought of putting index.htm (the most accessed file in folders and subfolders) on the FILEEXCLUDE list. This doesn't seem to work. FILEEXCLUDE index.htm doesn't make it disappear from the statistics of accessed files.
Re: [analog-help] complete list of accessed files?
Aengus wrote:
> On 8/12/2009 10:47 AM, Christoph Kukulies wrote:
>> I introduced this setting (REQFLOOR 1r) now in my config file, I also included FILEINCLUDE *.pdf
>> Still I don't see a single .pdf in the list of requested files (last listing in the report). I have about 3000 .pdf file requests in the original apache access_log file. I assume the file extension (FILEINCLUDE syntax) isn't case sensitive.
>
> Actually, if Analog is running on a case sensitive OS, then the FILEINCLUDE probably is case sensitive. http://analog.cx/docs/alias.html#CASE
>
> Can you post 3 or 4 lines from your log file, including some PDF requests?

87.79.34.253 - - [11/Aug/2009:17:58:38 +0200] "GET /export/download/de/AB-lang/AB-3-5-7.pdf HTTP/1.1" 200 158955 "http://www.mysite.de/de/produkte/AB-lang//index.htm" "Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 (.NET CLR 3.5.30729)"
217.91.80.223 - - [12/Aug/2009:09:31:33 +0200] "GET /export/download/de/AB-lang/AB-Plan.pdf HTTP/1.1" 200 212734 "http://www.mysite.de/de/produkte/AB-lang/AB-Plan.htm" "Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9.0.13) Gecko/2009073022 Firefox/3.0.13"
159.51.236.51 - - [12/Aug/2009:14:21:36 +0200] "GET /export/download/de/AB-lang/XYZPP.pdf HTTP/1.1" 200 108759 "http://www.mysite.de/de/produkte/AB-lang/ABACC-XYZYPP.htm" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.0.3705; .NET CLR 1.1.4322; .NET CLR 2.0.50727)"
85.3.36.7 - - [12/Aug/2009:15:36:27 +0200] "GET /export/download/de/AB-lang/AB-Plan.pdf HTTP/1.1" 206 208857 "-" "Mozilla/5.0 (Windows; U; Windows NT 6.0; de; rv:1.9.0.13) Gecko/2009073022 Firefox/3.0.13 (.NET CLR 3.5.30729)"

The first three are referred from our own pages (mysite.de). I found that most of the requests with return code 200 are bots, crawlers, Yandex, msnbot, and Google search results. I will try the suggested page-include and see what happens when I include SEARCHENGINES again.

I hope the syntax is correct, since I must admit that I had to reconstruct (paste together) the main log file from different files (referrer.log, agents.log) because the log format changed at some point.

--
Christoph Kukulies
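As an independent cross-check on what Analog reports, the PDF requests can also be tallied straight from the log. The following is a rough sketch only, assuming log lines shaped like the four samples above. The function name and the split into "full" (status 200) and "partial" (status 206) counts are illustrative choices, not something Analog itself provides. The extension match is deliberately case-insensitive, and partial responses are counted separately because page-at-a-time loading can inflate the raw request count.

```python
import re
from collections import Counter

# Request path and status code of a GET in an Apache common/combined log line
REQUEST = re.compile(r'"GET (\S+) HTTP/[\d.]+" (\d{3}) ')


def count_pdf_requests(lines):
    """Return (full, partial): per-file counts of 200 and 206 responses for PDFs."""
    full = Counter()
    partial = Counter()
    for line in lines:
        match = REQUEST.search(line)
        if match is None:
            continue  # not a GET request or not a parsable line
        path = match.group(1).split("?", 1)[0]  # ignore any query string
        status = match.group(2)
        if not path.lower().endswith(".pdf"):   # matches .pdf and .PDF alike
            continue
        if status == "200":
            full[path] += 1
        elif status == "206":
            partial[path] += 1
    return full, partial
```

Calling `count_pdf_requests(open("access_log", encoding="latin-1"))` and printing `full.most_common()` gives a per-file list to compare against the request report. Filtering out bots and crawlers, which are mentioned above as a large share of the 200 responses, would need an additional check on the user-agent field.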
[analog-help] a little bit OT - doing backups
Sorry if this is not 100% analog-centric, but I guess everyone on this list is familiar with the issue.

I recently resurrected a site which did not do any backups, or at least whose backup strategy was not very sophisticated. Now the system admin has decided to do a nightly backup to a 40 GB tape or something. A full backup. It takes 4 hours, and he stopped tomcat, apache and mysql for that. For 4 hours the site is now offline (!). Incredible.

I'm trying to advise him to run the backup without worrying about locked or open files. We have scheduled this for the coming nightly backup and want to watch the outcome.

Is analog picky in any way about the log files? The dnsquery processes also take quite a while, but I believe apache doesn't have to be stopped for this. (I'm talking about Windows XP, btw.)

Thanks for your attention.

--
Christoph