Re: [analog-help] Frequently updated site, sharing data with third-party websites - what effect on analog?
Natalia Lis [EMAIL PROTECTED] wrote: Hello, I didn't set Analog and I have no background in IT. I'm just trying to interpret the results for my company's site. It is a financial website with a lot of graphs updated every minute and other sections which are also frequently updated. We also share our real-time graphs which are very often displayed on third-party websites. 1.Given that most of our pages contain frequently updated elements, what may be the effect of this on the cache issue? Is it reasonable to expect that visitors will be less likely to use the cached version and our results are less likely to be skewed by this problem? It's not a problem as such, it's just something to be aware of when interpreting the data in your log files - some people will be able to see your website without actually making any connection to it, if they use a caching proxy server and an earlier user has already cached the requested pages, and other people may appear to be multiple people, if they access your site through an array of proxy servers. In the case of dynamic data that is generated on the fly every time a user accesses it, there should be a no-cache header that will tell a caching proxy to always request a fresh copy. But if frequently updated means that you generate a new copy every 10 minutes, then it's quite possible that each copy may get cached somewhere. Having said all that, caching doesn't usually have a major impact on the numbers, and, more importantly, that impact doesn't tend to change much. So even if there's an X% skew in the numbers, the skew is likely to be X% next month and the month after that, as long as there haven't been any major changes in the environment. A different site, though, might have a Y% skew, because their customer base is different, and their usage pattern is different. 2.What can be the effect of sharing our graphs on Individual hosts served category? 
If people see our graphs on a third-party website, those views will count as requests in Analog (right?) but will those people be included in the individual hosts served count or will Analog only see the page on which the graphs are displayed? There are 2 different issues here - Individual Hosts Served means the number of end users who make a request against your server. Sharing your graphs could mean a number of different things - it could mean that some 3rd party server copies your graphs and puts them on their site, in which case you'll never see the requests for those graphs in your log files. Or it could mean that those sites simply point to the data on your website, so that the end user sends the request to your server, and that end user is an Individual Host Served. The key difference between such a 3rd party visitor and a direct visitor is that the 3rd party visitor will indicate that 3rd party in their referrer field. For example, if you go to the home page for Analog (http://analog.cx/), you'll see a button for Sourceforge on the right, below the blue box. The source for that image is the Sourceforge server. The log files for analog.cx don't contain any information about whether or not users ever see that Sourceforge button, but the logfiles for the sourceforge.net server will show that that image was requested by you, and that you were told to load that image by the analog.cx page. Aengus + | TO UNSUBSCRIBE from this list: |http://lists.meer.net/mailman/listinfo/analog-help | | Analog Documentation: http://analog.cx/docs/Readme.html | List archives: http://www.analog.cx/docs/mailing.html#listarchives | Usenet version: news://news.gmane.org/gmane.comp.web.analog.general +
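A rough Python sketch of the referrer distinction described in the reply above: requests that arrive with a third-party referrer are grouped by that site, while requests with no referrer are treated as direct or unknown. The function names and the host names in the usage note are hypothetical, and this is not how Analog itself builds its Referring Site Report.

```python
from collections import Counter
from urllib.parse import urlparse


def referrer_site(referrer, own_hosts):
    """Classify one referrer value as 'direct/unknown', 'internal', or a third-party host."""
    if not referrer or referrer == "-":
        # Many servers log "-" when the browser sent no Referer header.
        return "direct/unknown"
    host = urlparse(referrer).hostname or ""
    if not host:
        return "direct/unknown"
    if host in own_hosts:
        return "internal"
    return host


def tally_referrers(referrers, own_hosts):
    """Count requests per referring site (requests, not individual hosts)."""
    return Counter(referrer_site(r, own_hosts) for r in referrers)
```

For example, `tally_referrers(["http://partner.example/page", "-"], {"www.example.com"})` counts one request for `partner.example` and one for `direct/unknown`. It counts requests rather than hosts, which matches the limitation discussed later in the thread.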
Re: [analog-help] Frequently updated site, sharing data with third-party websites - what effect on analog?
Natalia Lis [EMAIL PROTECTED] wrote: Thanks a lot, this is very useful, Our shared graphs are hosted on our sites. So just to make sure: if (putting aside the cache and other issues) 1000 different people/hosts view our graph on page XYZ (which is not our website), analog will show that we've got at least 1000 requests from 1000 individual hosts referred to us by the site XYZ? Sort of. If you have 5 different 3rd party servers displaying your graphs (for example), you can get a report saying that 40% of the requests were referred from site 1, 25% from site 2, 20% from site 3, etc. But you can't get a report saying that 1000 individual hosts were referred from site 1, 700 from Site 2, etc. This is in part because any given user could have visited site 1, 2 and 3, and so they could be counted multiple times, whereas any given request can only occur once - so requests and individual hosts are different types of data, so they can't be reported on in exactly the same way. (Note also that the Referrer field is optional - not all web servers log it. And not all web browsers report it, and it's something that can only be measured with the cooperation of the user). Regarding cache, I have another question. Is it possible to cache a page and only refresh some of its items (like graphs) without making a new request for page? The reason I ask is because we have several sites and for one of them, analog shows that the number of requests for pages per week is three times smaller than the number of individual hosts served per week. In other words, an individual host requests less than one page. And this is precisely the site that does not host any graphs – even the graphs which are displayed there come from our other websites. Thus, as far as I know no files from this website are displayed elsewhere and it would seem to me that in order to make any requests from the site, one would have to actually visit it and request at least one page. 
I know that one possible problem may be the definition of a page, but analog shows that all major files on the site are .html

Hmmm. You've already covered the obvious explanation. But I wouldn't look at caching for the explanation here. I presume you have the Host report on (I think the Individual Hosts Served figure is only calculated if you're generating the Host Report). I'd change the Columns displayed for that report to show both Requests and Page requests:

HOSTCOLS RPb
HOSTSORTBY Requests

This will give you an indication of who is making Requests, but not making page requests. Next, I'd have a look at the Status Report, and see if you're getting an unusual number of Redirects - these probably count as requests, rather than Page Requests. Lastly, I'd have a look at the File Type report. You said that analog shows that all major files on the site are .html but you didn't say how you reached that conclusion. If you looked at the Request Report, and saw that it only lists .html requests, your configuration might have a REQINCLUDE PAGES command that excludes any non-Page requests from the report. The File Type Report (FILETYPE ON) will give you more detailed information about the type of requests being made against the server. (You can also change the File Type report to show Requests and Page Requests with TYPECOLS RPb)

The bottom line is that a Host gets into your log file, and therefore gets counted, when it makes a Request. The discrepancy that you're seeing is because not all Requests are Page Requests, so if you're seeing lots of Requests that aren't Page Requests, then your assumption about .html files must be incorrect. Once you find out what non-Page requests are causing this discrepancy, you can look at the referrers for those particular requests to see if the problem might be down to cached .html files, or, more likely, due to some dynamic content not being counted as a Page Request.
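To make the Requests versus Page Requests distinction concrete, here is a small Python sketch that tallies both per host and lists hosts that never requested a page. It assumes a page is anything ending in .html, .htm or / (which is my understanding of Analog's default, and should be checked against your own PAGEINCLUDE settings); the function names are hypothetical and the input is already-parsed (host, path) pairs rather than a real log file.

```python
from collections import defaultdict

# Assumed page definition; adjust to match your own PAGEINCLUDE configuration.
PAGE_SUFFIXES = (".html", ".htm", "/")


def is_page(path):
    """Return True if the path (ignoring any query string) looks like a page."""
    return path.split("?", 1)[0].lower().endswith(PAGE_SUFFIXES)


def summarize(entries):
    """entries: iterable of (host, path). Returns {host: [requests, page_requests]}."""
    totals = defaultdict(lambda: [0, 0])
    for host, path in entries:
        totals[host][0] += 1
        if is_page(path):
            totals[host][1] += 1
    return dict(totals)


def hosts_without_pages(totals):
    """Hosts that made requests but no page requests, sorted by name."""
    return sorted(h for h, (reqs, pages) in totals.items() if reqs and not pages)
```

A host that only ever fetches an image or a stylesheet is still counted as a host served, which is one way the host count can exceed the page request count.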
Sorry I can't give a cut and dried answer here - this is exactly the sort of problem that Analog is really good at solving, once you know the right questions to ask of your log files. Hope that helps, Aengus
Re: [analog-help] Availability of Windows 64 bit binary
Nick Altmann [EMAIL PROTECTED] wrote: I've reviewed the archives (and found plenty of 64 bit discussion), but have not been able to find a binary compiled for 64 bit Windows. Has anyone made an analog_60w64.zip available for download? Compiling from source (on Windows) is currently outside my skill set, but the 2GB memory limit of 32 bit is becoming increasingly bothersome. The Windows binary is compiled using MinGW (http://lists.meer.net/pipermail/analog-help/2007-January/020117.html) and at the moment, MinGW is only available for Win32. There is a MinGW-64 project underway, but as far as I can tell, it's not simply a matter of recompiling existing applications under MinGW-64: https://sourceforge.net/projects/mingw-w64/ At this point, it looks like the only tools available for compiling Win64 binaries are the Microsoft tools. If there are any full-time students reading this list, you may be able to get free access to Microsoft Visual Studio Pro (https://downloads.channel8.msdn.com/) which can compile 64 bit apps - the Express version that's free for the rest of us (http://msdn2.microsoft.com/en-us/express/default.aspx) won't do 64-bit, as far as I know. (And I don't know whether there are any issues compiling Analog in Visual Studio - I haven't tried it). Aengus
Re: [analog-help] how to uninstall analog V6?
Randall Wilson [EMAIL PROTECTED] wrote: On MS Windows XP Pro, how does one uninstall analog V6? I have reviewed FAQ and several archive files, with no results, and cannot find any uninstall file. Thanks There is no install file for Analog, and therefore there is no uninstall file. If you don't want it anymore, just delete the Analog directory/folder. Aengus
Re: [analog-help] Escaping Quotes in Logs
Roberto Hoyle [EMAIL PROTECTED] wrote: I have a log entry of the form: Entry however, the Entry above may have escaped quotes (\) in it. Is there a way to have Analog differentiate between escaped quotes and regular quotes in its parsing? ?? Can you expand on your description of the problem? I don't understand what you're trying to do, and what you expect to get, versus what you're actually getting. Aengus
Re: [analog-help] Logformat Problems
At Friday, February 29, 2008 6:20 PM, The Wolf [EMAIL PROTECTED] wrote: Hello everyone! I've been messing around with LOGFORMAT, trying to get it working with our non-standard log setup. Unfortunately, there seem to be some rather weird errors going on. LOGFORMAT (%v\t%j\t%t\t%s\t%u\t[%d/%M/%Y:%h:%n:%j %j]\t%r\t%c\t%b\t%f\t%B) matches our real setup. All fields are tab delimited, except for the date which is ISO compliant. servername, gzip Ratio, processing time, client IP, username, [DD/MMM/:HH:NN:SS -TZTZ], Request, status code, size, Referrer, Useragent However, analog thinks that all lines are corrupt, and the error seems to point towards a different place in lines. A sample (sanitized for our users' protection): C: server - - 1.2.3.4- [27/Feb/2008:18:54:24 -0500]GET /images/image.png HTTP/1.1 200 4373 www.somesite.comMozilla/5.0 (Windows; U; Windows NT 6.0; en-US;) Gecko/ Firefox/ C: * If we put a space where the first tab is (some of the tabs appear a space long, but cat -e confirms they are actually tabs), it changes to this: C: server - - 1.2.3.4- [27/Feb/2008:18:54:24 -0500]GET /images/image.png HTTP/1.1 200 4373 www.somesite.comMozilla/5.0 (Windows; U; Windows NT 6.0; en-US;) Gecko/ Firefox/ C: * I'm not sure what to do next, so I turn to you. If I change your %t to %j, Analog parses the line - it appears that "-" isn't a valid %t. (Tab delimited log files are almost impossible to debug - you have to change the tabs to spaces in a sample line, and the \t to %w in the LOGFORMAT to have any chance of figuring out what Analog doesn't like). Aengus
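Since tab delimited lines are hard to read by eye, a short Python sketch like the one below can pair each field with the label expected at that position and flag a field that should be numeric but is not (such as a "-" where a processing time is expected). The labels and function names are hypothetical, and this only illustrates the idea; it does not reproduce Analog's own LOGFORMAT parser.

```python
import re

NUMBER = re.compile(r"\d+(\.\d+)?")


def show_fields(line, labels):
    """Pair each tab-separated field with the label expected at that position."""
    fields = line.rstrip("\n").split("\t")
    return list(zip(labels, fields))


def non_numeric(pairs, numeric_labels):
    """Return the labels whose values should be numbers but are not."""
    return [
        label
        for label, value in pairs
        if label in numeric_labels and not NUMBER.fullmatch(value)
    ]
```

Running `non_numeric(show_fields(line, labels), {"processing time"})` on a line whose third field is "-" reports "processing time", which points at the same field that had to be changed from %t to %j.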
Re: [analog-help] Logformat Problems
At Friday, February 29, 2008 9:52 PM, The Wolf [EMAIL PROTECTED] wrote: That's unfortunate. Still, I guess I can make an awk script to ignore that part. No need to do that - just specify %j in the LOGFORMAT - that's what I did! Aengus
Re: [analog-help] multiple domains one log file
Hildreth, Steve [EMAIL PROTECTED] wrote: I have one access log that contains information for multiple virtual domains. Can someone help me out with the parameter that will allow me to pull out only the information for one (or two domains)? If each line of the log file includes a field that indicates which VHOST (in Analog terminology) the entry belongs to, and your LOGFORMAT has %v in that position, then VHOSTINCLUDE will tell Analog to only generate a report for the entries that match the specified VHOSTs. For example:

1.2.3.4 - - server1 [07/Mar/2008:00:55:43 -0500] GET /index.html HTTP/1.1 200

LOGFORMAT (%S %u %j %v [%d/%M/%Y:%h:%n:%t %j] %j %r %j %c)
VHOSTINCLUDE server1

Aengus
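As an illustration of what selecting on a virtual host field amounts to, this Python sketch keeps only the log lines whose fourth whitespace-separated field matches one of the wanted names. The field position is an assumption taken from the example line above, the function name is hypothetical, and VHOSTINCLUDE in Analog may differ in details such as wildcard handling.

```python
def filter_by_vhost(lines, wanted, vhost_index=3):
    """Yield the lines whose virtual-host field is one of the wanted names."""
    wanted = {name.lower() for name in wanted}
    for line in lines:
        fields = line.split()
        # Skip short or malformed lines instead of raising an IndexError.
        if len(fields) > vhost_index and fields[vhost_index].lower() in wanted:
            yield line
```

If no field in the log identifies the virtual host, there is nothing for this kind of filter (or for VHOSTINCLUDE) to match on, so the server's logging has to be changed first.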
Re: [analog-help] multiple domains one log file
Hildreth, Steve [EMAIL PROTECTED] wrote: I attempted to implement the parameter below and was not successful. Let me include a couple of lines from a log file to make sure I correctly explained what I am asking. Also, I am using analog 5.32. What I am trying to report on are all log entries that are for the domain www.lastudentscount.org while excluding all log entries for the domain www.lausd.net. 10.82.26.193 - - [04/Mar/2008:10:05:27 -0800] GET /Clinton_MS/work_files/compass.gif HTTP/1.1 304 - http://www.lausd.net/Clinton_MS/; Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; InfoPath.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727) 204.108.65.10 - - [04/Mar/2008:10:10:18 -0800] GET /images/LAUSDmap.jpg HTTP/1.1 200 441023 http://www.lastudentscount.org/aboutlausd.html; Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; InfoPath.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30) Okay, there's nothing in those log entries to indicate that they are from different virtual domains. The 2 entries in the middle that start with http:// are referrer fields - I'm pretty sure that if you look at your log files closely, you'll find http://www.google.com showing up in that field on a regular basis, and I'm pretty sure you don't have a virtual domain called www.google.com :-) You'll have to modify your web server's log settings to do what you want to do. Right now, it's just logging all requests for both servers into a single logfile, without indicating which entries belong to which server. Aengus
Re: [analog-help] Configuring Analog, Host IP not displayed correctly
Leonard Kramer [EMAIL PROTECTED] wrote: Hi everybody, I have searched for a solution to this problem, but I have been unable to find anything, I'm afraid I'm still a bit wet behind the ears when it comes to setting up webserver stuff. Whenever a report is generated, it seems that analog only reads the 2 first numbers of the host IP, this is the Organisation report for instance: reqs%bytesorganisation 6788.09%78 1511.67%205.142 20.16%[domain not given] 1 66.249 10.08%216.145 Does anyone have any idea where it goes wrong, There's nothing going wrong - that's the Organization Report, which shows what Organizations users are coming from. If you don't have DNS resolution set up, then Organizations are calculated by grouping IP addresses. and how to make things work as they should? Any help will be greatly appreciated. It sounds like you actually want the Host Report, which lists the IP address of individual users. You need to turn it on by adding HOST ON to your analog.cfg file. You might also want to look into doing DNS lookups. While your logfiles are relatively small (a few thousand entries), you can have Analog do the DNS lookups by adding these entries to your analog.cfg file:

DNSFILE dnscache
DNS WRITE

http://analog.cx/docs/dns.html
http://analog.cx/docs/reports.html

Aengus
Re: [analog-help] Odd stats - need help
steve [EMAIL PROTECTED] wrote: I'm showing some odd stats in my analog, as compared to a simple impression tracker I have scripted on a page. Quite simply, I pull a random image from a database and display it on the screen, then I update that database record by adding 1 to the impression count. Very simple. 1. get random image from database 2. display image 3. update database impression count (+1) But I get very skewed results, here are my impression counts for one particular banner through my simple script: Impressions: 100121 Yet Analog shows this: 4297 0.16% 19/Mar/08 15:50 /banners/468x60.jpg So I've recorded over 100,000 impressions, but analog only recorded 4297 requests. Actually, Analog said that your web server only recorded 4297 requests. It's not immediately obvious how a request for 468x60.jpg is supposed to trigger the update process you describe above - .jpg is usually a static file type. But a useful sanity check would be to see what Analog shows as the number of requests for the page that the .jpg file is embedded in. Is it more in line with the 4297 number or the 100121 number? Aengus
Re: [analog-help] Advice on using includes or excludes
At Monday, March 31, 2008 4:34 PM, Esposito, Richard [EMAIL PROTECTED] wrote: Hello, I use Analog to count traffic on a ColdFusion internal news and events calendar site. There are dozens of different types of URLs captured within the log files, but the two kinds of pages I need to count -- stories and calendar items -- have URLs like this: STORIES: http://bnn.ids.web.boeing.com/index.cfm?content=story.cfmid=5226bu=11 CALENDAR ITEMS: http://bnn.ids.web.boeing.com/index.cfm?content=include/event_calendar_site.cfmregion=4site=49event_id=11970 Based on these URLs, what would be the best way to configure Analog to isolate just these 2 types of pages? I've tried a few approaches but haven't gotten consistent results. Thanks I'd be inclined to use FILEALIAS to separate those entries out in the Request Report.

FILEALIAS /index.cfm?content=story.cfm* story.cfm$1
FILEALIAS /index.cfm?content=include/event_calendar_site.cfm* calendar.cfm$1

(I'm assuming that the http://bnn part isn't in your log file. If it is, you can just put * before the /index.cfm part) If you don't care about the additional parameters, you can leave the $1 out. Aengus
Re: [analog-help] How to condense referring pages?
Michael Crawford [EMAIL PROTECTED] wrote: Hi, I'd like to condense all the pages on my own site that refer to my homepage into a single line with the total number of referrals. I've done a redesign of my site navigation that now has a more obvious link to my homepage, and want to see how effective it now is. I'm hoping that by getting visitors to visit my homepage, I can encourage them to visit the rest of my site rather than clicking away from it. I know that I could do it manually by tallying them up from the referrer report, but I'm sure there is an automated way. I'd like this total to only include referrals from my own domain. I'm not quite sure that I understand what you are trying to achieve, but the only way to get specific information about referrers to just your home page is to do a report on just your home page - if you are reporting on all the pages on your site, then there's no way to pick out the referrers that only point to your home page. You could do FILEINCLUDE /index.html to generate a report on just the home page, and then use the Referrer Report and the Referring Site report to get greater detail on where the links to your home page are. Aengus
Re: [analog-help] Advice on using includes or excludes
Esposito, Richard [EMAIL PROTECTED] wrote: I'd be inclined to use FILEALIAS to separate those entries out in the Request Report. Would doing this be enough to keep the page view count down to just those 2 kinds of pages? I'm not sure I understand. With those two FILEALIAS commands, your story.cfm and event_calendar_site.cfm requests will not be included with everything else in the index.cfm count - they'll be separate entries in the Request Report, with their own Page count columns. Aengus
Re: [analog-help] Advice on using includes or excludes
At Tuesday, April 01, 2008 6:25 PM, Esposito, Richard [EMAIL PROTECTED] wrote: Your confusion is probably due to my lack of understanding of how this website works. What I hope to end up with is a Successful Requests for Pages number that reflects only these 2 types of files. I wasn't sure if I needed to write FILEEXCLUDES for every other kind of file shown in the request report, or if there were some other easier way to do it. Thanks. Okay, the simplest solution is to just use these commands:

FILEALIAS /index.cfm?content=story.cfm* /story.cfm
FILEALIAS /index.cfm?content=include/event_calendar_site.cfm* /calendar.cfm
FILEINCLUDE /story.cfm
FILEINCLUDE /calendar.cfm

The FILEALIAS commands will convert all references to content=story.cfm to plain old /story.cfm, throwing away all the additional parameters, and the same for the calendar requests. Then the FILEINCLUDE commands will tell analog to ignore all the log entries except for the story and calendar entries, and that should give you what you want. Aengus
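The alias-then-include idea can be sketched in Python as follows. This is only an emulation for illustration: it assumes that * matches any run of characters, that ? is taken literally, and that $1 stands for the text matched by the first *. The function names are hypothetical, the sample paths in the usage note use & as the parameter separator (an assumption, since the separators do not survive in the archived URLs), and Analog's real matching rules may differ.

```python
import re


def wildcard_to_regex(pattern):
    """Turn a pattern where '*' matches any text into an anchored regex with capture groups."""
    parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile("^" + "(.*)".join(parts) + "$")


def apply_aliases(path, aliases):
    """Return the target of the first matching alias, with $1, $2, ... filled in."""
    for pattern, target in aliases:
        match = wildcard_to_regex(pattern).match(path)
        if match:
            for i, text in enumerate(match.groups(), start=1):
                target = target.replace(f"${i}", text)
            return target
    return path  # no alias matched: leave the path unchanged


def count_included(paths, aliases, includes):
    """Alias every path, then count only the ones named in `includes`."""
    counts = {}
    for path in paths:
        aliased = apply_aliases(path, aliases)
        if aliased in includes:
            counts[aliased] = counts.get(aliased, 0) + 1
    return counts
```

With the two alias pairs from the message and `includes={"/story.cfm", "/calendar.cfm"}`, a path such as `/index.cfm?content=story.cfm&id=5226` is counted under `/story.cfm`, while any other `index.cfm` request is dropped.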
Re: [analog-help] Trouble interpreting log / Getting User Report
Michael Summerfield [EMAIL PROTECTED] wrote: I'm guessing Analog isn't finding the %u variable in my logformat statement. This one includes the user, mhurley: 2007-12-10 20:40:37 W3SVC1 APOLLO 192.168.32.134 GET /include/styles/author.css - 80 - 66.250.5.66 HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.1.4322;+.NET+CLR+2.0.50727;+.NET+CLR+3.0.04506.30) mhurley http://www.riedthunberg.com/headlines/reload_headlines.aspx www.riedthunberg.com 200 0 0 445 651 62 I've interpreted it as: LOGFORMAT (%Y-%m-%d %h:%n:%j %j %S %j %j - %j - %s %j %A %u %r %j %j %j %j %j %j) That LOGFORMAT doesn't match the log entry. Your LOGFORMAT has 4 fields between the time and the first -, but the log entry has 5. Changing the LOGFORMAT to LOGFORMAT (%Y-%m-%d %h:%n:%j %j %j %S %j %j - %j - %s %j %A %u %r %j %j %j %j %j %j) generates a User Report for your sample line. But I don't think that LOGFORMAT is correct - the first %S looks like your server's address, rather than the remote address. %s (lowercase) is only used if %S (uppercase) is blank. And I'm pretty sure that that particular entry is a request for /include/styles/author.css, and that http://www.riedthunberg.com/headlines/reload_headlines.aspx is actually the referrer (%f), not the request (%r). I'd try LOGFORMAT (%Y-%m-%d %h:%n:%j %j %j %j %j %r - %j - %S %j %A %u %f %j %c %j) instead (no need for trailing %js). This also gets the status code of the request, so that you can tell failures from successes. There is a second log format I discovered while trying to figure out how to get the User Report which I've ignored so far: 2007-12-10 13:51:30 W3SVC1 192.168.32.134 GET /include/images/logo.gif - 80 - 204.179.96.51 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+.NET+CLR+1.1.4322) 200 0 0 I can interpret this one as: LOGFORMAT (%Y-%m-%d %h:%n:%j %S %j %j - %j - %s %j %j %j) LOGFORMAT (%Y-%m-%d %h:%n:%j %j %j %j %r - %j - %S %A %c %j) Should I list both formats? 
If you're using both logfiles, you'll need to list both logformats. Aengus
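One way to sanity-check a positional LOGFORMAT like the ones above is to split a sample line on whitespace and label each field by position. The Python sketch below follows the field order of the first sample line and the interpretation suggested in the reply (request, client address, browser, user, referrer, status); the label names, including the two "unused" placeholders for the "-" fields, are hypothetical, and the user agent in the usage note is shortened.

```python
# Labels follow the order of the first sample line in the message above.
FIELDS = [
    "date", "time", "site", "computer", "server_ip", "method", "request",
    "unused_1", "port", "unused_2", "client", "protocol", "agent",
    "user", "referrer", "vhost", "status",
]


def parse_line(line):
    """Label the leading whitespace-separated fields; return None if the line is too short."""
    parts = line.split()
    if len(parts) < len(FIELDS):
        return None
    # zip() stops at the shorter sequence, so trailing fields are ignored.
    return dict(zip(FIELDS, parts))
```

For the first sample line, `parse_line(...)["client"]` is `66.250.5.66` and `["user"]` is `mhurley`. A line in the shorter second format has too few fields and returns `None`, which is the same reason it needs its own LOGFORMAT.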
Re: [analog-help] Request Report SubFloor Configuration
Terry Chambers [EMAIL PROTECTED] wrote: My current Request Report list looks like this: /www.blah.com/section/page.jsp /www.blah.com/section/page.jsp?id=59595tab=1 /www.blah.com/section/page.jsp?id=59595tab=2 /www.blah.com/section/page.jsp?id=59595tab=3 I'd like the report to ignore the tab value and treat it as if all the URLs were: /www.blah.com/section/page.jsp?id=59595 Is there any way to do that? Something like FILEALIAS /www.blah.com/section/page.jsp?id=** /www.blah.com/section/page.jsp?id=$1 might do the job. If it matters, this is not the only type of URL in the report. Is it the only one that you want to clean up? Second, is there a way to suppress reporting of the subfloor in all cases if we wanted to? I see commands that allow you to control them but I am not sure how to just turn it off. I'm not sure that I understand - subfloors typically aren't reported. If you're talking about the description at the top of each report (for example Listing the top 30 files by the number of failed requests, sorted by the number of failed requests.) then, no, I don't recall any way to suppress that. Aengus
Re: [analog-help] Re: GeoIP Patch for Analog
Paul Wade [EMAIL PROTECTED] said: This looks good but has anybody tried it? If somebody wants to compile everything for Win32 and upload it I'll definitely give it a go. Otherwise, there are two things putting me off doing this myself. Firstly finding and reinstalling MinGW which I have only ever used for Analog patches and secondly this lurking at the bottom of the instructions on how to make GeoIP work... You must have OS-report on and Browser-report and Browser-summary off. I guess it's more important for me to know what browser people are using than where they actually are. Isn't there an add-on available already that will find people's locations anyhow? I haven't been able to look at Ravikumar's modifications to tree.c, so I'm not sure exactly what changes he's made, but I assume he decided to sacrifice the Browser reports because they already provide a mechanism for sorting the GeoIP information in a 2 stage hierarchy. Another approach that might work without sacrificing the Browser Report, or recompiling Analog itself, would be to use the GeoIP code to create a DNS file (or to modify the IP address using the UNCOMPRESS technique Ravikumar used). Using the DNS information, the Organization or Domain Report could be used to display the Country and City information, with one slight modification - you would also have to create a custom DOMAINFILE (eg geoip.tab) that set the depth of each country to 2. The GeoIP code would set the IP address to bombay.in, madras.in, chennai.in, madrid.es, barcelona.es, chicago.us, etc, and the Domain Report would look like

123 .in (India)
 57 bombay.in
 43 madras.in
 12 chennai.in
102 .es (Spain)
 47 madrid.es
 23 barcelona.es

etc. It's not quite as pretty as Ravikumar's solution, because Analog automatically lowercases domain names, and placenames with spaces would have to use dashes or underscores, but it's somewhat easier to implement.
The GeoIP API code is also available in a number of different languages, including perl, so there are a number of options available. http://www.maxmind.com/download/geoip/api/ Aengus
Re: [analog-help] Re: Analog-GeoIP and DNS lookup combat.
[EMAIL PROTECTED] said: Hey guys, You all are very much aware that Analog is a web log analyzer. So I think it is of great interest to know which region of the world people have accessed your site from. GeoIP resolution has its own significance if you are dealing with some professional stuff. I want to make clear that those who are interested in Browser-report or OS-report should not follow this patch. After thinking over every possible way I chose to replace OS-report and Browser-report with GeoIP-report. Aengus suggested using DNS look-up for GeoIP resolution as an alternate and easy way but the problem with it is that it affects the following reports and you probably would have to sacrifice all of them. 1 Host Report, Host Redirection Report, Host Failure Report : It will show garbage Country and City information lines from DNS look-up file. 2 Organisation Report : It will show garbage Country and City information lines from DNS look-up file. 3 Domain Report : It will show garbage Country and City information lines from DNS look-up file. 4 Virtual Host Report, Virtual Host Redirection Report, Virtual Host Failure Report : It will show garbage Country and City information lines from DNS look-up file. You have to sacrifice something to get the GeoIP information into your Analog reports - you chose to sacrifice the Browser and OS information, and the advantage of using a single version of Analog. Or you can add GeoIP information to the Host address, and use the reports that are associated with IP names and addresses to display the information. The Domain and Organization Reports only show garbage if you consider the GeoIP information to be garbage - and you would hardly bother doing this if you considered the GeoIP data to be garbage. 
If you have a lot of international visitors, the Domain report is already a long list of countries - that's the very reason why the Domain report is a natural place for displaying this information - add SUBDOMAIN *.* and you can see city and country information instead of just country information. The other advantage of this approach is that Analog is already designed to cache this type of information in user-defined locations (DNSFILE), so switching back and forth between the GeoIP and the clean reports is easy. A GeoIP script that would go through a log file and create a DNSFILE would provide a very simple way for users to look at the GeoIP data, and decide whether it is of any real value (given the already discussed weakness of the actual data).

As I said, there are cosmetic advantages to your approach - Domain and Host reports are case insensitive, and place names with spaces probably wouldn't work. But you had to modify the Analog source and recompile to get this to work - similar modifications would fix the cosmetic issue for the Domain approach.

This isn't a criticism of the work you've done, or the decisions that you made. I was prompted to respond by Paul Wade's concern that your approach doesn't work well for most Windows users who don't have the tools or skills needed to recompile Analog. In those circumstances, the reports that are based on IP information are a more obvious choice for displaying GeoIP information.

Aengus
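The "GeoIP script that creates a DNSFILE" idea could be sketched roughly as below. This is only an illustration under stated assumptions: `lookup_location` is a hypothetical stand-in for whichever GeoIP API you use (its sample table is made up), the cache line layout (minutes since the epoch, IP address, name) is assumed from the dns.txt examples posted on this list, and the `city.country` naming is just one way to make SUBDOMAIN *.* show cities under countries.

```python
import re
import time

# Leading IPv4 address of a log line (assumes the host is the first field).
IP_RE = re.compile(r"^(\d{1,3}(?:\.\d{1,3}){3})\s")


def lookup_location(ip):
    """Hypothetical stand-in for a real GeoIP lookup.

    Returns (city, country_code) or None. The table is made-up sample data.
    """
    sample = {
        "192.0.2.1": ("Dublin", "ie"),
        "198.51.100.7": ("New York", "us"),
    }
    return sample.get(ip)


def fake_hostname(city, country):
    # Lower-case the city and replace anything outside [a-z0-9] with a hyphen,
    # because names with spaces probably would not work as host names.
    label = re.sub(r"[^a-z0-9]+", "-", city.lower()).strip("-")
    return f"{label}.{country.lower()}"


def build_dnsfile_lines(log_lines, lookup=lookup_location, now=None):
    """Return DNS-cache-style lines, one per distinct IP that the lookup knows."""
    minutes = int((now if now is not None else time.time()) // 60)
    seen = set()
    out = []
    for line in log_lines:
        m = IP_RE.match(line)
        if not m or m.group(1) in seen:
            continue
        ip = m.group(1)
        seen.add(ip)
        loc = lookup(ip)
        if loc is None:
            continue  # leave unknown addresses out of the cache
        out.append(f"{minutes} {ip} {fake_hostname(*loc)}")
    return out
```

Writing the returned lines to a separate file and pointing DNSFILE at it would keep the GeoIP view apart from the normal DNS cache, so switching between the two stays a one-line config change.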
Re: [analog-help] How to group related URLs in referrer report
Jay [EMAIL PROTECTED] said:

You know how in the Referrer Report, when one has a URL with a ? in it, the URLs are displayed indented underneath the main one, like:
http://www.example.com/index.php
http://www.example.com/index.php?Category=Something
http://www.example.com/index.php?Category=SomethingElse
Is there any way to get URLs in the following format to display in the same way?
http://www.example.com/index.php/
http://www.example.com/index.php/Something
http://www.example.com/index.php/SomethingElse

I don't have any log files to hand to test this, but you could try REFSORTBY ALPHABETICAL
http://analog.cx/docs/othreps.html#SORTBY

Aengus
Re: [analog-help] AnalogX QDNS
Hajj abujamal [EMAIL PROTECTED] wrote:

1. How do I prevent QDNS from attempting to add these bogus canonical names ~ the too-long names and the names containing asterisks or unprintable characters ~ to the dns.txt file, or ignoring them when it encounters the unprintable characters in the dns.txt file? and Where are these problematical names coming from? Is QDNS getting these names from the DNS server, and writing them into the dns.txt file, and then choking on them later when it tries to read the dns.txt file back, or is it choking on them when it gets the answer back from the DNS server? (Can you give me one or 2 examples of IP addresses that cause QDNS to choke in this way?)

2. How do I make QDNS simply merge the resolved entries in the temporary dnst.txt file into the main dns.txt file without trying to resolve them (they're already resolved!!!)?

Maybe you don't need to get QDNS to merge the files. A DNS cache file is a fairly simple text file. You should be able to just join multiple small DNS cache files into a single file with the DOS copy command (COPY DNS1.txt+DNS2.txt DNS3.txt)

Or perhaps a third question: Where do I look for answers to these questions? QDNS is, in my opinion, a marvelous companion to Analog, of comparable workmanship, but has not been updated since 2000. Is there a solution for this?

QDNS was written by someone who (as far as I can recall) never contributed to this list (or never identified themselves if they did contribute), and the source was never made available. This is probably as good a place to ask your question as any, but don't expect any definitive answers. It doesn't seem likely that any of the identified bugs in QDNS will ever be fixed at this stage, because the author is the only one who has the source, and he or she doesn't seem to be interested in fixing it.
Aengus
Re: [analog-help] LOGFORMAT question
Joshua S. Freeman [EMAIL PROTECTED] wrote:

Hi analog folks, Here is my current logformat string and two example log lines:

(%S - - [%d/%M/%Y:%h:%n:%j %j] %j %r %j %c %b %f %B %v)

24-247-100-8.dhcp.aldl.mi.charter.com - - [30/Dec/2007:00:00:07 -0500] (GET /kermit/postal-ca.html HTTP/1.1) 200 9449 (ref http://search.yahoo.com/search;_ylt=A0geu_AQJXdHVYsAap1XNyoA?p=montreal%2C+quebec+postal+codesfr=yfp-t-501ei=UTF-8) (client Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; SU 3.005; .NET CLR 1.1.4322; HbTools 4.8.4; InfoPath.2))

c01.ba.accelovation.com - - [30/Dec/2007:00:00:10 -0500] (GET /edit_entry.php?area=63room=67hour=16minute=30year=2007month=12day=06 HTTP/1.0) 302 2976 (ref http://meeting.cc.columbia.edu/day.php?year=2007month=12day=06area=63) (client Mozilla/5.0 (compatible; heritrix/1.12.0+http://www.accelobot.com)) vhost meeting.cc.columbia.edu

Your referrer, browser and vhost fields are all preceded by an identifier and delimited by quotes. Your logformat delimits the fields by spaces, and there are lots of spaces in your Browser field, so you get chunks of the browser string in your reports. You need two LOGFORMAT strings to deal with the fact that only some of your entries have a VHost entry:

LOGFORMAT (%S - - [%d/%M/%Y:%h:%n:%j %j] %j %r %j %c %b %j %f %j %B)
LOGFORMAT (%S - - [%d/%M/%Y:%h:%n:%j %j] %j %r %j %c %b %j %f %j %B %j %v)

Aengus
Re: [analog-help] AnalogX QDNS
Hajj abujamal [EMAIL PROTECTED] wrote:

Salaam! Aengus wrote: (Can you give me one or 2 examples of IP addresses that cause QDNS to choke in this way?)

20160835 38.118.71.121 *.upiasiaonline.com
18459349 193.225.86.51 *.cicero.itak.sztaki.hu
18459204 202.140.141.30 *.enetmall.com

Those appear to create problems. I also get

Just using a 4 line log file with these addresses (and the one below), QDNS doesn't crash for me, either when creating the DNS cache file from scratch or when re-reading it.

The instruction at 0x00402384 referenced memory at 0x6e2e7873. The memory could not be read. errors (as I was writing this reply). These appear to be the end-of-line corruptions. Just now I loaded a dns.txt file containing 853,309 resolved IP addresses ~ QDNS loaded only 718,170, so I checked the file:

20156554 207.101.74.26 churchilldevelopment-churchilldevelopment-psr2081256.z74-101-207.customer.algx.n4¿ÛC¡3

On my machine, QDNS resolved this to

20163681 207.101.74.26 churchilldevelopment-churchilldevelopment-psr2081256.z74-101-207.customer.algx.nö

So it exhibited the same error, but the outcome is slightly different. Perhaps that has something to do with the default Regional and Language settings in the Control Panel. (Mine are all set to English (United States)). So the crash could be caused by the size of the file, or by the additional characters being generated by QDNS on your machine. What happens if you create a 1 line log file with just 207.101.74.26 as the address, and run QDNS against it? Does QDNS still crash?

Maybe you don't need to get QDNS to merge the files. A DNS cache file is a fairly simple text file. You should be able to just join multiple small DNS cache files into a single file with the DOS copy command (COPY DNS1.txt+DNS2.txt DNS3.txt)

Aaah, that works ~ except that the COPY command adds an EOF character at the end of the new file. QDNS removes it when I strip all unresolved with a command line like:

qdns /D dns.txt /Y 63.135.48.130 /S

and that works. So far.
Does Analog care about the presence of the EOF?

Aengus
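As an alternative to COPY, the merge could be done with a few lines of script that never write the EOF character in the first place. The following is a minimal sketch, not a tested replacement for QDNS: it assumes each cache line is "timestamp IP name" as in the examples quoted in this thread, that the EOF character is byte 0x1A (Ctrl-Z), and that dropping entries whose names contain unprintable or non-ASCII characters is acceptable. `merge_dns_caches` is a made-up name.

```python
def merge_dns_caches(chunks):
    """Merge the raw bytes of several DNS cache files.

    Keeps one entry per IP (the one with the newest timestamp), strips any
    Ctrl-Z (0x1A) bytes, and skips malformed or corrupted lines.
    """
    entries = {}
    for raw in chunks:
        # latin-1 maps every byte to a character, so decoding never fails.
        text = raw.replace(b"\x1a", b"").decode("latin-1")
        for line in text.splitlines():
            parts = line.split()
            if len(parts) != 3 or not parts[0].isdigit():
                continue  # not a "timestamp IP name" line
            stamp, ip, name = parts
            if not all(32 < ord(c) < 127 for c in name):
                continue  # name contains unprintable or non-ASCII characters
            if ip not in entries or int(stamp) >= int(entries[ip][0]):
                entries[ip] = (stamp, name)
    return [f"{stamp} {ip} {name}" for ip, (stamp, name) in entries.items()]
```

To use it on real files, read each file in binary mode (`open(path, "rb").read()`), pass the results in as a list, and write the returned lines out joined with newlines.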
Re: [analog-help] LOGFORMAT question
Joshua S. Freeman [EMAIL PROTECTED] wrote:

Hi Steve, Thanks so much.. I'll try them. I already tried Aengus's LOGFORMAT lines and here's the result:
http://www.columbia.edu/~jf2412/report2/
http://www.columbia.edu/~jf2412/report3/

I forgot about the closing parentheses, so the Referrer and Browser strings using my LOGFORMATs have a trailing ). Not really a problem for the Browser Summary, but if you click on any of the Referrers in your Referrer report, you should get an error response. Stephen's version tells Analog that the closing parentheses around these fields aren't actually part of the Browser and Referrer strings.

Aengus
Re: [analog-help] undefined log field
On 5/27/2008 4:48 PM, Terry L Maluk wrote:

We are analyzing our internal logs from using the Google Search Appliance. These logs include a variable for the number of search results returned to the client. Rather than defining this as %j (junk) in the LOGFORMAT, is there some way to capture this field with Analog?

You can use any field that isn't otherwise being used. The User field (%u) is probably the most commonly re-used field in these circumstances. Just put %u in that location in your LOGFORMAT, and turn the User Report on (USER ON), and you should see how many times "user" 1 turned up (that is, how many searches returned 1 result), how many times "user" 2 turned up, etc.

Aengus
Re: [analog-help] [amount of data - monthly?]
On 5/30/2008 7:13 AM, Sergiusz Pawlowicz wrote:

Hi, how can I set up Analog to show me the amount of data served monthly?

Turn on the Monthly Report (MONTHLY ON) and make sure that the Bytes column is displayed (MONTHCOLS BR).

Aengus
Re: [analog-help] Report Request feature stuck in
Bai Macfarlane [EMAIL PROTECTED] wrote:

I use the stats feature from analog with our godaddy website. Of late, the left menu Report Navigation bottom option of Request Report website is virtually useless. I used to be able to track the number of requests for particular pages, and as of April 22, most of the pages have been showing no change in number of requests, and the last reporting date is April 22 (date and time of last access). I know pages have been requested many, many times since then but it is not counting in this report anymore. Why is this happening or how can I fix it?

Your description sounds like you're using a service offered by GoDaddy, rather than downloading the logfiles and running Analog against them yourself. If that's the case, then you need to contact GoDaddy, and ask them to fix it for you. Analog is running properly, but GoDaddy hasn't fed the latest logfiles into the process.

Aengus
Re: [analog-help] -21474836.-48% in %bytes field
Grzasko, James (RBI-UK) [EMAIL PROTECTED] wrote:

Hi Stephen, I managed to identify an offending logfile and can consistently reproduce the error on our reports server. However I then tried running the same report on my local PC and it worked fine (no -21474836.-48% figures coming out). The only change to our report server was the installation of .NET 3.5 a few months back. The reports server is Win 2003 versus Win XP on my desktop (which also has .NET 3.5). I will see about removing .NET 3.5 from the reports server and see if this helps.

It might be easier to install a clean copy of Analog in a different directory on the server, and see if that works - it won't even require a reboot of the server :-).

Aengus
[analog-help] Web hits used to pinpoint earthquakes
This is not entirely relevant to this list, but there was a short piece in this week's New Scientist magazine about an interesting and novel application of web log analysis that I thought some of you might find amusing. It seems that traffic to the European-Mediterranean Seismological Centre's (EMSC) website spikes in the immediate aftermath of a tremor, and by using GeoIP techniques to map the IP addresses, researchers can locate the area where the earthquake occurred.
http://technology.newscientist.com/channel/tech/mg19826626.300-web-hits-used-to-pinpoint-earthquakes.html

Aengus
Re: [analog-help] Identifying Known Spiders?
On 7/3/2008 3:48 AM, Michael Crawford wrote:

I'd like to know the success of my efforts to submit a new site to all the search engines; some spiders won't visit a site until it's been online for a while, and some will only visit the home page. I can see some of the spiders in the BROWSERREP and BROWSERSUM, but it's missing some because it's definitely missing Googlebot and Yahoo Slurp. Also the BROWSERREP shows all the browsers used by my human visitors; it will get hard to spot spiders when my traffic picks up. Is there a report specifically for known spiders?

No, the only special treatment for spiders in Analog is the ROBOTINCLUDE command, which tells Analog to count the requests with the specified User-Agents as Search Engines in the OS Report. There used to be a list of Spider User-Agents at http://www.wadsack.com/robot-list.html but it seems to be empty at the moment. There's a list from May 2007 at http://www2.owen.vanderbilt.edu/mike.shor/diversions/analog/RobotInclude.txt

You might want to do a report with FILEINCLUDE /robots.txt, which should give you a good indication of which search engines are hitting your site.

Aengus
Re: [analog-help] Identifying Known Spiders?
On 7/4/2008 12:30 AM, Michael Crawford wrote:

On Thu, Jul 3, 2008 at 8:37 AM, Jeremy Wadsack [EMAIL PROTECTED] wrote: The robots list from which that page was built no longer exists. The group that was maintaining it decided that it didn't make sense to maintain a database of known robots any more as anyone can make a robot.

In my personal case, it's not so much that I want to watch all the bots, as to monitor my progress at getting a new site indexed by the search engines. While Google, Yahoo and MSN together provide the vast majority of search engine referrals, there are still a few small, independent players such as JGDO. There are lots of reasons for running a bot, some good, some bad. I'd be happy if I could get a report of visits by the bots belonging to, say, the top half-dozen search engines. Note that it often happens, with new sites, that a search engine spider may not visit at all for months, and even then will only fetch the home page. By creating config files for each of my pages, I hope to monitor spider visits throughout my site. If this isn't yet possible with Analog, I don't think it would be hard to implement, and would be very popular, and so would get Analog a lot more users, and maybe some consulting fees for Analog experts.

It all comes down to the same simple question - how do you decide that any given request is from a spider/bot rather than a real person? If you rely on the User-Agent string you then have to decide how to identify the relevant strings - assume that everything that isn't a well known browser is a spider, or assume that everything that asks for /robots.txt is a spider. Unfortunately, there's nothing to stop a bot using a well known browser User-Agent (see recent controversy about the AVG LinkScanner, for example), and there's nothing to stop an ordinary user from requesting /robots.txt.
That means that there's no simple way to automate the identification of spiders - it requires some judgement, and Analog doesn't do judgement :-). Once you come up with a set of rules that work for you (or for the set of log files that you're working with at the moment), then it's not difficult to use Analog to delve deeper into the robot traffic. You can use FILEINCLUDE /robots.txt to get a list of IP addresses or Browser strings that have requested /robots.txt. You can then use this information with HOSTINCLUDE or with BROWINCLUDE to get a view of the rest of the traffic from either one specific spider, or all of the spiders as a whole, bearing in mind that the job of spidering your site might be spread between a number of different machines, so you might need to HOSTINCLUDE a range of machines if you use that technique.

So you can certainly use Analog to watch this type of traffic - indeed Analog's configurability makes it an ideal tool for the job. But because there are no black and white rules for deciding what is or is not a robot/spider, this functionality can't be built in to Analog. The decisions that you might make today to do this analysis on your site might be different for someone else, and might be different in a few months' time, as the list of search engines changes.

Aengus
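The first step described above (finding out who asked for /robots.txt) can also be done outside Analog. Here is a minimal sketch, assuming the log is in the common or combined format with quoted request, referrer and User-Agent fields; `robots_requesters` is a made-up name, and the result is only a list of candidates, since an ordinary user can request /robots.txt too.

```python
import re

# Common/combined log line: host, two ignored fields, [date], "request",
# status, bytes, and optionally "referrer" "user-agent".
LINE_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]*\] "(?:\S+) (?P<path>\S+)[^"]*" \d{3} \S+'
    r'(?: "(?P<ref>[^"]*)" "(?P<agent>[^"]*)")?'
)


def robots_requesters(log_lines):
    """Return (hosts, agents) that requested /robots.txt, each sorted."""
    hosts, agents = set(), set()
    for line in log_lines:
        m = LINE_RE.match(line)
        if not m:
            continue
        # Ignore any query string when comparing the requested path.
        if m.group("path").split("?")[0] != "/robots.txt":
            continue
        hosts.add(m.group("host"))
        if m.group("agent"):
            agents.add(m.group("agent"))
    return sorted(hosts), sorted(agents)
```

Each host in the result could then be turned into a `HOSTINCLUDE` line, or each User-Agent into a `BROWINCLUDE` line, for a second Analog run that looks at the rest of that traffic.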
Re: [analog-help] Help on the LogFormat
On 7/7/2008 7:19 AM, Arnab Ganguly wrote:

Hi All, I am using Analog for the first time. In my Apache I have tailor-made the log in the following format:

LogFormat %h %l %u %t \%r\ \Transaction time in Sec= %T\ \Bytes received = %I\ %s %b common

My requirement is that Analog should be able to understand the above format. I tried out the following in the analog.cfg file,

APACHELOGFORMAT (%h %l %u %t \%r\ \Transaction time in Sec= %T\ \Bytes received = %I\ %s %b)
LOGFILE /data/servers/testgm1/logs/access_log
OUTFILE /data/servers/testgm1/pages/output.html
HOSTNAME 10.146.163.301
HOSTNAME http://10.146.163.301;

but the output.html file generated is not proper. Any help would be very much appreciated.

Can you post 2 or 3 lines from your logfile? Analog's debug output usually makes it possible to figure out any discrepancies in the log format.

Also one more question, is it possible for Analog to monitor logs other than Apache?

Analog can be configured to read log files from most web servers (Analog predates Apache). It's sometimes possible to use it to parse log files that aren't web server logs, but that depends on the format of the logfiles, and the type of information you're actually interested in.

Aengus
Re: [analog-help] Help on the LogFormat
On 7/7/2008 7:19 AM, Arnab Ganguly wrote:

Hi All, I am using Analog for the first time. In my Apache I have tailor-made the log in the following format:

LogFormat %h %l %u %t \%r\ \Transaction time in Sec= %T\ \Bytes received = %I\ %s %b common

My requirement is that Analog should be able to understand the above format. I tried out the following in the analog.cfg file,

APACHELOGFORMAT (%h %l %u %t \%r\ \Transaction time in Sec= %T\ \Bytes received = %I\ %s %b)
LOGFILE /data/servers/testgm1/logs/access_log
OUTFILE /data/servers/testgm1/pages/output.html
HOSTNAME 10.146.163.301
HOSTNAME http://10.146.163.301;

but the output.html file generated is not proper. Any help would be very much appreciated.

I'm sorry, I jumped to the wrong conclusion when I saw a question that referenced a Log format. It looks like your APACHELOGFORMAT is fine - it parses the sample logfile lines that you posted. What do you mean when you say that the output.html file generated is not proper?

Aengus
Re: [analog-help] Re: Help on the LogFormat
Arnab Ganguly [EMAIL PROTECTED] wrote:

I rechecked again. Actually the pie charts are not getting generated? OS information is also blank.

The OS information comes from the User Agent string, which isn't in your logs.

How do I generate the pie charts?

I get Pie Charts for File Size and Processing Time from your sample log file - if you don't have those two reports turned on, you won't get any charts (at least from the sample lines you provided). The charts for the Domain, Organisation, Host, Status Code, File Type, Directory and Request Reports aren't generated, because there is only one entry in each of those reports (for the 10 sample lines that you provided). When I ran Analog with ALL ON to turn on all available reports, Analog generated the following warnings:

analog: Warning R: Turning off empty Redirection Report
analog: Warning R: Turning off empty Failure Report
analog: Warning R: Turning off empty Host Redirection Report
analog: Warning R: Turning off empty Host Failure Report
analog: Warning R: Turning off empty Referrer Report
analog: Warning R: Turning off empty Referring Site Report
analog: Warning R: Turning off empty Redirected Referrer Report
analog: Warning R: Turning off empty Failed Referrer Report
analog: Warning R: Turning off empty Browser Report
analog: Warning R: Turning off empty Virtual Host Report
analog: Warning R: Turning off empty Virtual Host Redirection Report
analog: Warning R: Turning off empty Virtual Host Failure Report
analog: Warning R: Turning off empty User Report
analog: Warning R: Turning off empty User Redirection Report
analog: Warning R: Turning off empty User Failure Report
analog: Warning R: Turning off empty Search Query Report
analog: Warning R: Turning off empty Search Word Report
analog: Warning R: Turning off empty Internal Search Query Report
analog: Warning R: Turning off empty Internal Search Word Report
analog: Warning R: Turning off empty Browser Summary
analog: Warning R: Turning off empty Operating System Report
analog: Warning R: In Domain Report, turning off pie chart of only one wedge
analog: Warning R: In Organisation Report, turning off pie chart of only one wedge
analog: Warning R: In Host Report, turning off pie chart of only one wedge
analog: Warning R: In Status Code Report, turning off pie chart of only one wedge
F: Opening Charts/proctime.png as pie chart file
F: Closing Charts/proctime.png
F: Opening Charts/size.png as pie chart file
F: Closing Charts/size.png
analog: Warning R: In File Type Report, turning off pie chart of only one wedge
analog: Warning R: In Directory Report, turning off pie chart of only one wedge
analog: Warning R: In Request Report, turning off pie chart of only one wedge

The Empty Report warnings occur because there isn't enough information in the logfile to generate those reports (no Referrer or no User Agent, in most cases). Some of the other reports may show up in your real logs, such as the Failure Report. The "turning off pie chart" warnings occur because all of your sample lines come from the same IP address, and request the same file, so there is only one entry, therefore Analog doesn't generate a Pie Chart.

Aengus
Re: [analog-help] Re: Help on the LogFormat
On 7/8/2008 3:44 AM, Arnab Ganguly wrote:

Hi All, Thanks a lot for all your help. Explanation was really comprehensive and things are clear. Lastly I face one more issue. I am able to generate the output.html file and related png files. But the issue is that when I try to access those files from the browser the output in the web page doesn't come properly. Below I have given the snippet

If the problem is that the HTML page itself is not displayed properly, then it sounds like you have some sort of issue with MIME Types on your web server - I really have no idea what might be going on there. (Though I have a vague recollection that there was a problem with using anlgform.pl and IE6 at one point).

If the web page is displaying properly on your server, but the graphs aren't being displayed, then you need to look into the CHARTDIR and LOCALCHARTDIR commands.
http://analog.cx/docs/othreps.html#CHARTDIR

Aengus
Re: [analog-help] Quick question...
Joshua S. Freeman [EMAIL PROTECTED] wrote:

Hi all, I'm using Analog to analyze and report on our helix server (realmedia) log files. It's going great. I have one particular customer interested in these reports. It's the Law School at Columbia University. Is there any way I can filter the analysis so that it ONLY parses logfile lines with either */law/* and/or */lawlive.rm? How would I write that filter to the analog.cfg file?

If you use FILEINCLUDE */law/* and/or FILEINCLUDE */lawlive.rm then Analog should ignore any log entries that don't match those patterns.

Aengus
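To get a feel for which requested paths the two patterns above would keep, a quick check can be sketched with Python's `fnmatch` module. This only approximates Analog's wildcard matching (an assumption: that `*` stands for any run of characters, slashes included), and the sample paths are invented.

```python
from fnmatch import fnmatchcase

# The two patterns from the FILEINCLUDE suggestion.
PATTERNS = ["*/law/*", "*/lawlive.rm"]


def keep(path, patterns=PATTERNS):
    """Return True if the path matches at least one include pattern."""
    return any(fnmatchcase(path, p) for p in patterns)


if __name__ == "__main__":
    for path in ["/media/law/lecture1.rm", "/broadcast/lawlive.rm",
                 "/media/business/intro.rm", "/lawyers/intro.rm"]:
        print(path, keep(path))
```

Note that a path such as /lawyers/intro.rm is not kept, because the first pattern requires a complete /law/ directory component.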
Re: [analog-help] Analog and Ezproxy
Shannon E. Fox [EMAIL PROTECTED] wrote:

I want to use Analog to analyze my Ezproxy stats. I have entered the following command into the ezproxy.cfg file to set up daily logs that can then be analyzed monthly:

LogFile -strftime log/ezp%Y%m%d.log

The problem is that this command is generating file names that include the entire year, rather than the appended '08, and Analog doesn't like that. Does anyone know how I should amend the command line above so that my log file name will look like ezp080701 instead of ezp20080701

Analog doesn't care what your logfiles are called. If you're using codes in your Analog LOGFILE command, there's a code for a 4 digit year as well as a 2 digit year. Assuming you're using LOGFILE log/ezp%y%M%D.log in Analog now, just change it to LOGFILE log/ezp%Y%M%D.log (uppercase Y for 4-digit year).
http://analog.cx/docs/logfile.html#datecodes

Aengus
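The two sets of date codes are easy to mix up, so here is a small sketch that puts them side by side. The Ezproxy side uses strftime codes, which Python's `strftime` shares; the Analog side is imitated by a made-up helper, `expand_analog_codes`, that only knows the four codes mentioned in this thread (%Y, %y, %M, %D) and is not Analog's actual implementation.

```python
from datetime import date


def ezproxy_name(day):
    # strftime codes: %Y = 4-digit year, %m = month, %d = day of month.
    return day.strftime("log/ezp%Y%m%d.log")


def expand_analog_codes(pattern, day):
    """Expand the Analog-style date codes named in the message above."""
    table = {
        "%Y": f"{day.year:04d}",        # 4-digit year
        "%y": f"{day.year % 100:02d}",  # 2-digit year
        "%M": f"{day.month:02d}",       # month number
        "%D": f"{day.day:02d}",         # day of month
    }
    for code, value in table.items():
        pattern = pattern.replace(code, value)
    return pattern


if __name__ == "__main__":
    d = date(2008, 7, 1)
    print(ezproxy_name(d))
    print(expand_analog_codes("log/ezp%Y%M%D.log", d))
    print(expand_analog_codes("log/ezp%y%M%D.log", d))
```

With the uppercase %Y on the Analog side, the expanded name lines up with what Ezproxy writes, so the Ezproxy configuration can stay as it is.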
Re: [analog-help] analog error during analyzing a set of large logfiles
Christian Grosse [EMAIL PROTECTED] wrote:

Hello, we have a problem analyzing our logfiles. The error "Fatal error: Ran out of memory: cannot continue: exiting" occurred when the combined size of the logfiles crossed a limit of about 80 GB. At this time the memory load of the analog process reached 4081 MB, but more than 3 GB of memory were still free and available. We read the FAQs and manuals, and performed the described solutions with config item HOSTLOWMEM - but this didn't eliminate the error. We used analog 6.0 on a Sun-Fire-V240 with 2 CPUs and 8 GB RAM. OS is Solaris 8. We hope that somebody has a solution for our problem.

According to an old post in the archives (June 2001), there was a hard limit of 4G in the version of Solaris being used at that time.
http://lists.meer.net/pipermail/analog-help/2001-June/009019.html
4081MB is pretty close to that limit. I don't deal with Solaris on a regular basis, and 4G would be much more restrictive today than it was then, but is there any chance that this might be a Solaris issue?

Aengus
Re: [analog-help] Command Line processing question
Terry Chambers [EMAIL PROTECTED] wrote:

My site's log files are multiple .gz files each day (about 30 of them). I want to create a command line to perform the analog process. Here is my example:

/analog-6.0/analog -G +g/analog/configs/analog-daily-full.cfg logfile /www_logs/2008/07/09/*.gz O /www_reports/daily/20080709.html

When I execute this, Analog complains that it cannot understand the log format of the files. When I embed the location of the LOGFILE and OUTFILE into the CFG file, it works fine. Is there a problem with my command line? Is the problem in my config file itself?

LOGFILE is sort of assumed on the command line. For example, analog 20080714.log will analyze 20080714.log. To specify the outfile, the command line argument is +O, and, just like the +g for your additional .cfg file, there shouldn't be a space between the +O and the outfile.

/analog-6.0/analog -G +g/analog/configs/analog-daily-full.cfg /www_logs/2008/07/09/*.gz +O/www_reports/daily/20080709.html

http://analog.cx/docs/syntax.html
http://analog.cx/docs/indx.html#clargs

Aengus
Re: [analog-help] Referrer report - what it shows exactly and why it lists only foreign sites
Natalia Lis [EMAIL PROTECTED] wrote:

The site has tens of millions of requests per day so I think the floor of 20 is good. Also, the graphs that we share are embedded images; they are displayed on other sites, but are hosted on our websites. Based on what you wrote, I'm not surprised that the Referrer report only lists referrers for about 5% of all requests.

If you've got millions of requests per day, I'd find it amazing if only 5% of them have referrers. I would consider that to be a very unusual outcome, and I would take a much closer look at the data to see if I have missed something important. It would be much more likely that 95% of the successful requests had referrers than that only 5% of them did.

The site is very busy with graphics and I suppose that most users have it bookmarked.

People don't usually bookmark graphs, they bookmark the pages that the graphs are embedded in. The request for the page itself won't have a referrer, but the request for the image will have the page itself listed as the referrer. You need to look again at your referrer figures - they are so far out of the ordinary that they indicate that something is wrong somewhere.

The difference between a referrer report and a referring site report is that the former lists URLs while the latter lists websites? So if a website shows our graphs on two different pages, the referrer report will show 2 URLs, while the referring site report will show one website address?

Yes. It will essentially aggregate the referrer numbers by site.

If you want not to count referrers to images, you can use the command REQINCLUDE pages

This is interesting. So it will only count instances when another website links to us and a person clicks on that link and actually visits our site?

Yes. (With certain caveats - you could define your graphs as pages, especially if they're dynamically generated).
If the main purpose of your site is to host graphs for other people's sites, it wouldn't make a lot of sense to exclude the requests for those graphs from your reports. On the other hand, if you want to be able to tell how much of your traffic is due to the graphs, and how much is people actually visiting your site directly, excluding the graphs may be useful.

Let's say we have a graph: www.oursite.com/graph and it is embedded on another website in the following way:

<A HREF="http://www.oursite.com"><IMG SRC="http://www.oursite.com/graph"></A>

If I follow Stephen's recommendation, the report will only count clicks on the graph, not just requests for an updated image?

Yes.

Aengus
Re: [analog-help] Large Log Files
Kush [EMAIL PROTECTED] wrote:

My site receives a whopping 4GB per day!!! What can I do?? My hosts are saying they can't help me. Can you process transfer.log files?

Transfer.log files often refer to FTP server type logs, which often have multiple entries for each transaction. If that's the type of log file that you're talking about, then you may be able to use Analog to coax some information from the data in the log files, but it's not a natural fit, and won't work out of the box.

If that's not what you mean by transfer.log, can you expand on your question?

Aengus
Re: [analog-help] Referrer report - what it shows exactly and why it lists only foreign sites
Natalia Lis [EMAIL PROTECTED] wrote:

People don't usually bookmark graphs, they bookmark the pages that the graphs are embedded in. The request for the page itself won't have a referrer, but the request for the image will have the page itself listed as the referrer.

But this referrer would be our own site, right?

Not if the page that is bookmarked is the page at some other site that has your graphs embedded in it.

And there is a way to exclude one's own sites from being listed in the referrer report? If this is the case, and I assume that it is because I don't see any of our sites in the report, then wouldn't the result make sense?

You can use REFREPEXCLUDE to exclude matching referrers from the Referrer and Referring Site Reports. (Note that REFREPEXCLUDE only excludes entries from the Referrer Reports. REFEXCLUDE will exclude the logfile entries that have matching referrers from the whole analysis process. So

REFREPEXCLUDE *.oursite.com/*

will just keep the internal referrers out of the Referrer Reports.

REFEXCLUDE *.oursite.com/*

will keep any log entries that have internal referrers out of your report altogether).

On the other hand, if you want to be able to tell how much of your traffic is due to the graphs, and how much is people actually visiting your site directly, excluding the graphs may be useful.

That's what I would like to do. Is there a way to set the report so that it only counts referrals to certain pages (so as to be sure that no graphs are included)?

If you use

REFINCLUDE *.oursite.com/Jul/16/report008.html

(for example) then Analog will ignore all log entries that don't have that referrer.

(I'm sorry if it's a basic question, I don't set the reports, I just interpret the results, if I want changes I will have to speak to IT).

I would strongly recommend getting a copy of even a single logfile on your own computer, and installing Analog and running your own copy of Analog against that sample log file.
It's a lot easier to understand the nuances of what each report does when you can very quickly make a small modification to the configuration and then run a new report to see what the differences are. Analog can easily crunch through a million lines of log files in under a minute, so you can very quickly see the difference as you change configuration options.

Aengus
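The difference between REFREPEXCLUDE and REFEXCLUDE described in this message can be pictured with a toy model. The sketch below is not how Analog is implemented: the four log entries and the partner.example hostname are invented, and Python's fnmatch merely stands in for Analog's own wildcard matching.

```python
from fnmatch import fnmatch

# Hypothetical (request, referrer) pairs standing in for parsed log lines.
entries = [
    ("/index.html", "http://www.oursite.com/home.html"),
    ("/graph.png", "http://www.oursite.com/index.html"),
    ("/graph.png", "http://partner.example/markets.html"),
    ("/index.html", "http://partner.example/markets.html"),
]
INTERNAL = "*.oursite.com/*"


def referrer_report(rows, hide=None):
    """Count requests per referrer, optionally hiding referrers that match a pattern."""
    counts = {}
    for _request, referrer in rows:
        if hide and fnmatch(referrer, hide):
            continue
        counts[referrer] = counts.get(referrer, 0) + 1
    return counts


# REFREPEXCLUDE-like: every entry is still analysed, internal referrers are
# only hidden from the referrer listing.
analysed_with_refrepexclude = len(entries)
report_with_refrepexclude = referrer_report(entries, hide=INTERNAL)

# REFEXCLUDE-like: entries with an internal referrer are dropped before any
# report is built, so the overall totals shrink as well.
kept = [e for e in entries if not fnmatch(e[1], INTERNAL)]
analysed_with_refexclude = len(kept)
report_with_refexclude = referrer_report(kept)

print(analysed_with_refrepexclude, report_with_refrepexclude)
print(analysed_with_refexclude, report_with_refexclude)
```

In this toy data both referrer listings look identical, but the number of requests that reach the rest of the analysis differs, which is the point of the note in the reply.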
Re: [analog-help] Large Log Files
Kush [EMAIL PROTECTED] wrote:

Yes, that is the type of log file it is. I was asking to see if Analog will process that type of file. What do you mean it won't work out of the box, and that it's not a natural fit?

Analog is designed to work with web server log files where every single line represents a complete transaction, and consists of a date, time, IP address, request, status code and some other optional fields such as the bytes transferred, user name, referrer and Browser Agent string. If your logfiles look like that, then Analog will work very well with those logs.

I have no idea what exactly you mean by a transfer.log. But most FTP server logs tend not to have all the information that Analog expects, or else have multiple log entries for a single transaction (for example an entry when a file transfer starts, and another one when it finishes). Analog isn't designed to parse that type of information, though if you know what you're doing, you can use Analog to extract some information from that type of log.

Can they be converted?

Beats me, as you haven't told us what you mean by transfer.log. Analog is designed to analyse and report on Web server access logs. If you're not dealing with standard web server log files, and don't have time to read the documentation, then Analog isn't for you.

Aengus
Re: [analog-help] Large Log Files
On 7/16/2008 7:32 PM, Kush wrote:

transfer.log is what I read when I log in to my server by way of my FTP and look into the log folder. That's all. Can Analog process past log info?

That tells me nothing about what the data in the logfile looks like. Post the first 8 lines from one of your logfiles and someone on the list will tell you whether it's something that Analog will handle.

I've read that Analog can process large files like no other program. Will my 4GB per day log files be an issue??

Analog can handle 4G of logfiles. You might encounter problems if you try to analyse a couple of weeks' worth of such logs in a single report. If you need to do something like that, Analog can summarise the information in log files into a cache file that will allow it to generate reports on very large amounts of data.

Aengus
Re: [analog-help] Large Log Files
On 7/16/2008 10:45 PM, Kush wrote:

Can I remove the files from the server and process them from my computer using Analog? Do you provide set-up services?

This is a self-help mailing list for users of Analog. Analog is a freeware application that you download and install yourself. There are no set-up services - just read the documentation at http://www.analog.cx

How many requests equal an actual visit? I was told by my host that I receive 2,000 requests per second but that doesn't mean visits. Are there other report display options?

Read the documentation at http://www.analog.cx

Aengus
Re: [analog-help] Help with Analog 6.0 on Windows
Hanif [EMAIL PROTECTED] wrote:

Hello all, I am using Analog 6.0 for Windows XP SP2. I've encountered two necessary functions that need to be completed for successful logging:

1. Must implement log file paths to logs stored in two different directories/folders of a network drive. Ex. \\192.168.0.100\apache\logsa; 192.168.0.100\apache\logsb
2. Will incorporate both locations mentioned above to generate one report.

I've used online analog configurators which proved helpful and was able to output the report in a specified directory. However, #1 and #2 are most important and I would appreciate any help provided.

Analog can have multiple LOGFILE statements. The report that it creates will be a report on the information in all of the logs it is told to analyse.

If you're going to analyze logfiles over the network, it's often a good idea to compress the files first - network file access is usually orders of magnitude slower than local access, and the overhead of unzipping the files is a lot less than the delay in network access. This is particularly true if you are likely to be running many different reports, with changing parameters.

Aengus
Re: [analog-help] Processing Past Log Files
As I was about to send this reply, I noticed that you had addressed your e-mail directly to my address, and that of some other contributors. I consider that to be an abuse of the voluntary support process that the analog-help mailing list facilitates. Nobody is entitled to any response or support from any subscriber to this list, and if someone has not responded because they are too busy, or are away, or because they have decided to unsubscribe from the list, that's their business, and sending direct e-mail to individual responders on this list as well as to the list is essentially spam.

On 7/28/2008 11:36 PM, Kush wrote:

I have log files that have not been processed being that they're too large...4GB per day. Is there a standard operating procedure for processing past log files stored on the server?

Analog's cache mode essentially summarises the information in a log file, so that you can generate reports from these summaries. This allows you to generate reports from sets of log files that would require vastly more memory than Analog would be able to handle.

It's important to understand that the caching mode works by discarding a great deal of linking information, so that while you can tell that X hosts visited your sites, and that there were Y requests for a particular page on your site, you can't use a cache file to tell how many hosts requested just that particular page - you'd have to go back to the original log files to find that out.

The other important thing to bear in mind about cache files is that they can only generate reliable reports for the data that they were designed to capture. That means that you must generate your report design (the .cfg files that control which reports are turned on, what the FLOORs and SUBFLOORs are, what is INCLUDEd and EXCLUDEd, etc) before generating your cache files. If you subsequently decide that you need to change the .cfg files, you may need to recreate the cache files for all the logs.
Analog Cache files can help you generate reports from very large log files, but they can be complex, and you can make mistakes if you're not careful.

http://analog.cx/docs/cache.html

Aengus
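The point about cache files discarding "linking information" can be illustrated with a small model. This is emphatically not Analog's cache file format, just a sketch with invented hosts and pages: each day is reduced to independent per-host and per-page tallies, the tallies merge cheaply, and the host-to-page pairing is gone.

```python
from collections import Counter

# Hypothetical parsed log lines for two days: (host, page).
day1 = [("10.0.0.1", "/a.html"), ("10.0.0.2", "/a.html"), ("10.0.0.1", "/b.html")]
day2 = [("10.0.0.1", "/a.html"), ("10.0.0.3", "/b.html")]


def summarise(rows):
    """Keep separate per-host and per-page tallies, dropping the pairing."""
    return Counter(host for host, _ in rows), Counter(page for _, page in rows)


def merge(first, second):
    """Combine two summaries; Counter addition sums the tallies."""
    return first[0] + second[0], first[1] + second[1]


hosts, pages = merge(summarise(day1), summarise(day2))

# Answerable from the summaries alone:
distinct_hosts = len(hosts)
requests_for_a = pages["/a.html"]

# Only answerable from the original lines: which hosts asked for /a.html?
hosts_for_a = len({host for host, page in day1 + day2 if page == "/a.html"})

print(distinct_hosts, requests_for_a, hosts_for_a)
```

Nothing in hosts or pages alone lets you recover hosts_for_a, which is why a question like that sends you back to the raw logs.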
Re: [analog-help] Special Organisation Report
Brian Clanton [EMAIL PROTECTED] wrote:

I have two types of log entries for a Proxy Server that I am analyzing with Analog:

local 75.149.219.73 - - [18/Jul/2008:13:58:19 -0600] GET http://infotrac.galegroup.com/itweb/caro83221?id=caro83221db=LitRC HTTP/1.1 301 555
proxy 67.167.91.241 - fGKzhzjLu6flE6G [18/Jul/2008:14:33:07 -0600] GET http://mldv.permissiontv.com/channels/carol_il HTTP/1.1 200 0

Each entry can be either local or proxy. This designates whether a request is coming from inside of an IP range or outside of an IP range. My question is: how can I get a report within the Organisations Report to tell me how many requests are internal (local) and how many are remote (proxy)?

I'm not sure what you mean by a report within the Organisations Report, but you can use one of the reports that you aren't currently using, and hijack its field to report on how often a specific term turns up in that field. For instance, if you're not using the User Report, you can create a logformat with %u in the first field, turn on the User Report, and it will show you how many requests were made by the user local and how many were made by the user proxy.

If you are using the User Report already, you can also repurpose the Virtual Host report in the same way. (You might be able to do something with the Browser Report too).

Aengus
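As a quick cross-check on whatever the repurposed User Report shows, the same tally can be made outside Analog by counting the first field of each line. A minimal sketch, using the two sample lines from the message as stand-in data:

```python
from collections import Counter

# The two sample log lines quoted in the message above.
lines = [
    "local 75.149.219.73 - - [18/Jul/2008:13:58:19 -0600] GET "
    "http://infotrac.galegroup.com/itweb/caro83221?id=caro83221db=LitRC HTTP/1.1 301 555",
    "proxy 67.167.91.241 - fGKzhzjLu6flE6G [18/Jul/2008:14:33:07 -0600] GET "
    "http://mldv.permissiontv.com/channels/carol_il HTTP/1.1 200 0",
]

# The first whitespace-separated token is the local/proxy marker.
counts = Counter(line.split(None, 1)[0] for line in lines if line.strip())
print(counts)
```

Unlike Analog's reports, this counts every line regardless of status code, so the two numbers will only agree once redirected requests are accounted for.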
Re: [analog-help] Distinct Host Served
On 8/4/2008 6:05 PM, Kush wrote:

Can distinct hosts served be considered as visits, or come close to it?

If I visit your website at 9AM and at 9PM, would you consider that to be 2 visits or one 12-hour long visit? There'll only be one distinct host served.

If a company with 1,000 employees has a proxy server, and 30 people from that company visit your website, there'll only be one IP address showing in your logs. Do you want to count those 30 visitors as 1 visit, or as 30 visits?

If some of your users are with ISPs that use load balancing proxy servers (a farm of proxy servers, with requests being shared between them), then a single user might show up in your logfiles with 10 different IP addresses: a single visit, but 10 distinct hosts served.

http://analog.cx/docs/webworks.html

The closest that you can get to a reliable count of visits is to assign each visitor a session_ID, a transient cookie with a limited lifetime of 15 or 20 minutes. That will typically require specific support on the server side, and will undoubtedly annoy some part of your audience. And it requires assumptions about the way your users use your site. And if you're going to start making assumptions, you can start by admitting that if you change your assumptions, you're going to get different answers, even if you start with the same data.

Aengus
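The closing remark, that different assumptions give different answers from the same data, is easy to demonstrate. The sketch below is not something Analog does; it is a naive timeout-based visit counter over invented hits, where a new visit starts whenever a host has been silent for longer than the chosen timeout.

```python
def count_visits(hits, timeout_minutes):
    """Count visits from (host, minute_of_day) pairs using an inactivity timeout."""
    last_seen = {}
    visits = 0
    for host, minute in sorted(hits, key=lambda hit: hit[1]):
        # A host seen for the first time, or after a long gap, starts a new visit.
        if host not in last_seen or minute - last_seen[host] > timeout_minutes:
            visits += 1
        last_seen[host] = minute
    return visits


# Hypothetical hits: one host at 9:00, 9:05 and 21:00, another at 10:00 and 10:25.
hits = [
    ("192.0.2.1", 540),
    ("192.0.2.1", 545),
    ("192.0.2.1", 1260),
    ("192.0.2.7", 600),
    ("192.0.2.7", 625),
]

distinct_hosts = len({host for host, _ in hits})
visits_20 = count_visits(hits, 20)
visits_30 = count_visits(hits, 30)
print(distinct_hosts, visits_20, visits_30)
```

Two distinct hosts become four visits with a 20-minute timeout and three with a 30-minute one, and none of these numbers says anything about how many people sat behind each address.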
Re: [analog-help] RE: Help on reporting failures correctly from Xferlog format
On 8/7/2008 1:40 AM, Read, Kimberly wrote:

Hi, Well, I have figured out one way around this and that is to convert the last character in the log to an http return code. I didn't want to have to do that, but it seems to work and then obviously I added a %c to the LOGFORMAT command. If anyone knows of another way to do this without changing the input file(s), please let me know.

Your original message indicates that you just want to ignore the records that end in i, so I presume that you only want to count the records that end in c.

LOGFORMAT (%j %M %d %h:%n:%j %Y %t %S %b %r %j %j o %j %u ftp %j %j)
LOGFORMAT (%j %j %j %j %j %j %j %j %j %j %j i%j)

Just change the first record so that it will only match records that end in c, like this:

LOGFORMAT (%j %M %d %h:%n:%j %Y %t %S %b %r %j %j o %j %u ftp %j c)

(Alternatively, just swapping the order of your LOGFORMAT commands might work, as it should cause the i records to be matched as junk before they are parsed as real records.)

Aengus
Re: [analog-help] Running out of memory
Laurent FAILLIE [EMAIL PROTECTED] wrote:

Hi all, I have quite large log files and I'm trying to generate statistics on a machine with little memory. After many tests, I found that I get the error when enabling SEARCHQUERY. Is there any way to ask analog to consider only the last 7 days of requests?

If you can't use wildcards to specify just a subset of logfiles to analyse, you can tell Analog to ignore all requests that fall outside a specific time range using the FROM and TO commands.

http://analog.cx/docs/include.html#FROMTO

Aengus
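If the report is produced by a script, the FROM and TO lines for a rolling window can be generated rather than edited by hand. The sketch below assumes the commands accept an absolute yymmdd date; that is my recollection of the FROM/TO documentation linked above, not something stated in this thread, so check it against that page before relying on it. The helper name is made up.

```python
from datetime import date, timedelta


def last_n_days_window(today, days=7):
    """Return FROM/TO config lines covering the last `days` days, today included."""
    start = today - timedelta(days=days - 1)
    # Assumption: FROM and TO take a yymmdd argument (see the linked docs).
    return ["FROM " + start.strftime("%y%m%d"), "TO " + today.strftime("%y%m%d")]


for line in last_n_days_window(date(2008, 8, 7)):
    print(line)
```

The generated lines can be written to a small extra .cfg file and passed with +g alongside the main configuration.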
Re: [analog-help] Weekly and Monthly Reports
Terry Chambers [EMAIL PROTECTED] wrote:

Our site produces about 1 GB of logs per day so I have them organized into directories like this:

2008/08/01
2008/08/02
etc

In each directory are about 100 logs in .gz format. I can easily produce a report for a single day. If I want to produce a report for a week, how would I go about specifying all of the logs to process? For example, if I wanted to process all the logs in 2008/08/01 - 2008/08/08, how would I do that?

LOGFILE 2008/08/01/*
LOGFILE 2008/08/02/*
LOGFILE 2008/08/03/*
LOGFILE 2008/08/04/*
LOGFILE 2008/08/05/*
etc.

You can have multiple LOGFILE commands, or you can specify multiple logfiles on the command line, if that's how you run Analog.

Aengus
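Writing out one LOGFILE line per day gets tedious for longer ranges, so the list above could be generated. A minimal sketch, assuming the yyyy/mm/dd directory layout from the question; the function name and the optional root prefix are illustrative only.

```python
from datetime import date, timedelta


def logfile_lines(first, last, root=""):
    """Return one LOGFILE line per day from `first` to `last`, inclusive."""
    lines = []
    day = first
    while day <= last:
        lines.append("LOGFILE " + root + day.strftime("%Y/%m/%d") + "/*")
        day += timedelta(days=1)
    return lines


week = logfile_lines(date(2008, 8, 1), date(2008, 8, 8))
print("\n".join(week))
```

The output can be saved as its own .cfg file so that the report design stays in the main configuration and only the date range changes from run to run.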
Re: [analog-help] Special Organisation Report
Brian Clanton [EMAIL PROTECTED] wrote:

When I use the User Report for identifying local versus proxy users, I only get a readout of the proxy users, not local. I've played with some of the FLOOR commands as they apply to the User Report, but I cannot get Analog to recognize the local requests in the log file. The floor command that would seem to apply is:

USERFLOOR 1r

Which to me would mean: list all users with at least 1 request. My logformat command is

APACHELOGFORMAT (%u %h %l %w %t %r %s %b)

Any input would be appreciated.

If all the local entries have a 301 status code (Permanent Redirection), then you'll need to look at the User Redirection Report to see the local users.

REDIRUSER ON

Aengus
Re: [analog-help] Changing report titles
Brian Clanton [EMAIL PROTECTED] wrote:

Has anyone ever tried changing the titles in the reports? I am running Analog to get statistics for a proxy server that a number of libraries use to authenticate library patrons. I want to change the titles of some of the reports. For example:

User Report - Local Requests
User Redirection Report - Remote Requests
Request Report - Database Usage

You can modify the .lng file in the lang folder.

http://analog.cx/docs/output.html#LANGUAGE

Note that the default language is uk.lng.

Aengus
Re: [analog-help] Remote and local statistics
Brian Clanton [EMAIL PROTECTED] wrote:

The problem is that the proxy server (Ezproxy) redirects the local access requests (301) to save on bandwidth and only proxies the remote requests. Therefore Analog only shows stats for the successful proxied requests in the other subsequent reports, namely the REQUEST REPORT. Is there a way to configure Analog to consider all requests, both local and proxy, as legitimate requests so the REQUEST REPORT will show totals for both local and remote patrons, as well as still having separate reports showing total remote requests (User Report) and total local requests (REDIRECTED USER REPORT) for a given time period?

The issue of treating 301s as Successful requests is dealt with in the Analog FAQ:

http://analog.cx/docs/faq.html#faq181

"I want to be able to count requests with status code 301 and 302 as successes, so that they appear in the Request Report."

"No, you really don't, because that would lead to double counting when a request for /dir (code 301) is redirected to /dir/ (code 200). For CGI scripts etc. look in the Redirection Report instead of the Request Report."

If you want all the entries to turn up in the Request Report, then you'll have to modify the Response Code to be 200.

Aengus
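The double counting that the FAQ warns about is easy to see with a three-line example. The sketch below is a deliberate simplification with made-up entries: it treats only status 200 as a success, which is narrower than what Analog actually does, and then shows what happens if redirects are counted too.

```python
# Hypothetical log entries: (request, status code). The first two lines are
# a single visitor action: /dir is redirected to /dir/, which then succeeds.
entries = [("/dir", 301), ("/dir/", 200), ("/page.html", 200)]

# Simplified rule: only 200 counts as a successful request.
successes_strict = sum(1 for _, status in entries if status == 200)

# Treating redirects as successes counts the /dir action twice.
successes_with_redirects = sum(1 for _, status in entries if status in (200, 301, 302))

print(successes_strict, successes_with_redirects)
```

In the EZProxy case the 301 is the only trace of a local request, so there is no follow-up 200 to double count, which is why rewriting the status code in the logs is a workable option there.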
Re: [analog-help] Remote and local statistics
Brian Clanton [EMAIL PROTECTED] wrote:

Anyway, to me it seems I need to include the redirected logs since these are the only accounts of local access requests from patrons that are in the log file. Am I correct in saying this?

It doesn't make much difference whether you're correct or not. Analog won't treat log entries with a 301 response code as Successful requests. If you want those log entries to show up in the Request Report rather than the Redirection Report, then you'll have to either modify the logs, modify Analog's source code, or modify EZProxy.

You might have some success by ignoring the Status Code altogether (marking it as %j in the LOGFORMAT), but that depends on whether you are paying any attention to requests other than 200s and 301s.

Aengus
Re: [analog-help] Analog OUTPUT
Brian Clanton [EMAIL PROTECTED] wrote:

Is it possible to produce an html webpage for output AND create a delimited file that can be exported into a spreadsheet?

One output per run.

The Analog README suggests it is either one or the other.

http://analog.cx/docs/faq.html#faq153

Aengus
Re: [analog-help] User Report
DINAKARAN SUNDARESAN [EMAIL PROTECTED] wrote:

Hello, I'm using the Analog tool for the first time. Before I get started on my question, I want to tell you that this is a cool tool and we love it. Here goes... I'm trying to extract a report that would give me a list of pages accessed by users. I tried the User Report. I was successfully able to get the details but it doesn't include the page names accessed by each user. How do I achieve this?

You can't:

http://analog.cx/docs/faq.html#faq128

You can get a list of all the pages read by a single user (IP address or userID), or you can get a list of all the Users (IP addresses or UserIDs) that accessed a single page, but you can't get a single report that shows every request made by every user.

Aengus
Re: [analog-help] Help on analog logging
On 8/22/2008 7:15 AM, Arnab Ganguly wrote:

Hi All, I am able to generate the reports properly using the Analog analog-6.0.3-1.el3.i386.rpm on Red-Hat 3.0. But some extension is required for my requirement.

1) Under the File Size report I see the upper limit as 1KB-10KB, so how can I increase the difference like 10KB-20KB...? Is there any configuration param for that?

If you're seeing an upper limit of 10KB, it's because you don't have any requests for files greater than 10KB. If there's a request for a file of 20KB, it'll be counted in the 10-100KB bucket. The only way to change the size of the buckets is to modify the Analog source code and recompile.

2) Under the Host Failure report it shows top 20. How can I list all?
3) Under the Host report it shows top 50. How can I list all or increase the limit?

http://analog.cx/docs/faq.html#faq114

There are floors for each report. You can change the number of lines displayed in the Host Failure Report by changing the FAILHOSTFLOOR, and you can change the number of entries in the Host Report by changing the HOSTFLOOR.

http://analog.cx/docs/othreps.html#FLOOR

4) Processing time: can we also see all the intervals?

As far as I know, there is no floor for the Processing Time Report - if there's a processing time in the log entry, it will be counted in the appropriate bucket. The Processing Time report uses a set of buckets (1-2, 2-5, 5-10, 10-20, 20-60, 60-120, 120-300, 300-600, etc). There are no configuration options for these buckets - you would have to change the source code to change the bucket sizes.

Aengus
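The reason a 20KB file never produces a "10KB-20KB" row is that the buckets grow by a factor of ten. The sketch below only illustrates that idea using the 1-10KB and 10-100KB ranges mentioned in the reply; the exact boundaries Analog uses, and how it treats sizes that fall exactly on a boundary, are assumptions here.

```python
def size_bucket_kb(size_kb):
    """Return the (low, high) power-of-ten bucket, in KB, that a size falls into."""
    if size_kb < 1:
        # Everything below 1KB is lumped together in this simplified model.
        return (0, 1)
    low, high = 1, 10
    while size_kb > high:
        low, high = high, high * 10
    return (low, high)


for size in (0.1, 5, 20, 250):
    print(size, size_bucket_kb(size))
```

A bucket only shows up in the report if at least one request lands in it, so a report that stops at 1KB-10KB simply means that no logged response was larger than that.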
Re: [analog-help] Suppress Charts
On 8/22/2008 9:12 AM, Brian Clanton wrote:

I would like to suppress the lines:

u *f -50Rr
r *f 1R r

Which I assume are the pie charts for the USER and REQUEST reports. However, when I include the command:

ALLCHARTS OFF

These lines are still present in the OUTPUT.

They look more like the Floors to me (top 50 Users, by Requests, and all Requests with more than 1 request). I don't know if there's any way to turn that off.

Aengus
Re: [analog-help] Help on analog logging
On 8/22/2008 9:21 AM, Arnab Ganguly wrote:

Hi, I have checked my access log entries. There are requests of more than 10KB, but they are not reported. Is 10KB the max by default? Let me know.

No, it's not the max. Pick out a couple of lines from your logfile that you think should be causing a higher entry in the File Size report, and run Analog against just those lines. If the results don't make sense, post the lines here and see if other people get the same results.

Aengus
Re: [analog-help] How to generate report for Bytes received, including request and headers
On 8/25/2008 5:20 AM, Arnab Ganguly wrote:

Hi All, I want to generate a report for %I - Bytes received, including request and headers - present in the Apache access log. Which is the corresponding report in Analog that needs to be turned on for this?

Analog doesn't have a bytes in report. The File Size report can be used to report on either Bytes In, or Bytes Out, but it defaults to Bytes Out. If you want it to report on Bytes In instead, then you just need to craft a LOGFORMAT that specifies %b in the Bytes In field.

That explains why your earlier reports weren't giving you what you expected - the File Size report is based on the %b field, and the %b fields in your example lines were 117, 6918, 117, 60 and 124.

I'm not sure that this is explicitly mentioned anywhere in the documentation, but it's because for the vast majority of websites, the inbound bandwidth is irrelevant - it's not even logged in most cases. For those sites that care more about inbound than outbound bandwidth, it's easy to modify the LOGFORMAT to report on it.

Aengus
Re: [analog-help] How to generate report for Bytes received, including request and headers
Arnab Ganguly [EMAIL PROTECTED] wrote:

Thanks for the update. I did something like that and saw the result coming out correct. I just swapped the %I with %b and it gave me the correct result. Not sure whether the approach is correct or not. Where

%I - Bytes received, including request and headers, cannot be zero. You need to enable mod_logio to use this.

and

%b - Size of response in bytes

as per http://httpd.apache.org/docs/2.2/mod/mod_log_config.html#formats

Let me know your views.

As I said earlier, for most websites, the Bytes Received is ignored, because it's usually trivial compared to the Bytes Sent. Unless you are allowing people to upload files to your site (photos, or code submissions, for example), then you'll usually only get a couple of dozen bytes of request for every couple of Kbytes of response.

As the Apache documentation confirms, you have to use the mod_logio extension, rather than the base mod_log_config, if you want to log the inbound bytes as well as the (default) outbound bytes. Swapping the %I and the %b is the best way to get a report on the inbound bytes, if that information is important. You can't report on inbound and outbound bytes at the same time in Analog.

Aengus
Re: [analog-help] Issues with OS report
On 8/28/2008 10:08 AM, Arnab Ganguly wrote: Hi Stephen, Thanks for your update. I fully agree with you about the programmed part. Is it possible to get the count for Win32 as well, by some configuration change or something else? testApp is my test application, which is sending the requests. For MacPPC we get the correct result; it would be nice if we got the same for Win32 as well.

You can modify your application to use the same convention that other Windows browsers use, or you can alias the string in your config file to a format that Analog will recognize as a Windows browser:

BROWALIAS "testApp/1.0; Win32" "testApp/1.0; (Windows NT 5.1)"

Aengus
Re: [analog-help] Unique Visits/Distinct Host Served/Request
On 8/29/2008 11:25 AM, Kush wrote: I've read that requests for pages can be used as page views. But I receive an average of 30 million requests per day (for files), yet it states that I receive almost 8,000 requests for pages, and 5 million distinct hosts served per day. How is that possible?

Define page. By default, Analog considers .html and .htm and directories (*/) as Pages. If you use something else to define a page, then you need to tell Analog about it. http://analog.cx/docs/include.html#PAGEINCLUDE

Is there an average I can come up with, or a mathematical formula I can come up with, to tell my potential advertisers?

As I said before, every website is different. A rule of thumb that might work for your website would be utterly useless for someone else's website. You'd have to have a pretty intimate understanding of any particular website to make a fair judgement about what metrics might make sense for that particular website.

Aengus
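To see why a site can log millions of requests but only a few thousand page requests, it may help to apply the default page definition by hand. The following is a rough Python sketch, not Analog's own matcher: the function names are hypothetical, the pattern list is taken from the reply above, and ignoring the query string is an assumption made for illustration.

```python
from fnmatch import fnmatchcase

# Default page definition as described in the reply: .html, .htm and directories.
DEFAULT_PAGE_PATTERNS = ("*.html", "*.htm", "*/")


def is_page(path, extra_patterns=()):
    """Return True if the requested path matches one of the page patterns."""
    # Assumption for this sketch: the query string plays no part in the match.
    path = path.split("?", 1)[0]
    patterns = DEFAULT_PAGE_PATTERNS + tuple(extra_patterns)
    return any(fnmatchcase(path, pattern) for pattern in patterns)


def count_pages(paths, extra_patterns=()):
    """Count how many requested paths qualify as pages."""
    return sum(1 for path in paths if is_page(path, extra_patterns))


if __name__ == "__main__":
    requests = ["/", "/index.html", "/news.php", "/img/logo.gif", "/docs/"]
    print(count_pages(requests))             # default definition only
    print(count_pages(requests, ["*.php"]))  # also treating *.php as pages
```

On a site built mostly from .php files, nearly every request falls outside the default patterns, which is the situation that PAGEINCLUDE is meant to address.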
Re: [analog-help] Create links on html output
On 8/29/2008 11:27 AM, Brian Clanton wrote: I am running analog on a monthly basis for 6 separate libraries to get monthly statistics on a Proxy Server for their online database resources. For each library, I run analog twice: once for an html output, and once for a text file they can download for archival purposes. I would like to incorporate a link on the html output that points to the text file output page, so library staff can get to this text file from the html output page. To do this, would it be more appropriate to make changes in the css stylesheets?

LOGOURL is the only simple way that I can think of to do what you want without modifying the CSS. http://analog.cx/docs/output.html#LOGO (Maybe you could do something with a dummy Virtual Host Report, and a dummy logfile with a single link to your text file.)

Aengus
Re: [analog-help] Unique Visits/Distinct Host Served/Request
Kush [EMAIL PROTECTED] wrote: I understand my website well enough to make a guess as to how many files are being requested, and that could be divided into the number of 'successful requests'. Does Analog read .PHP as a page?

By default, Analog considers .html and .htm and directories (*/) as Pages. If you use something else to define a page, then you need to tell Analog about it. http://analog.cx/docs/include.html#PAGEINCLUDE

Aengus
Re: [analog-help] Monthly Report
On 9/3/2008 6:22 AM, Terry Chambers wrote: I've got my logs being dropped into directories such that each day of logs has its own location, e.g. 2008/09/01, 2008/09/02, etc. I currently run only daily and weekly reports. I've used the command line to specify 7 different log file paths and that works. If I want to run a monthly report, will the command line accept 31 log file paths? Or will I need to write a script to either copy/move the log files into fewer directories?

The length of the command line depends on the Operating System. In the unlikely event that you ran into a problem with the length of the command line, you could create an additional config file, for example august.cfg, and call it from the command line, like this:

analog +gaugust.cfg

The config file would just contain LOGFILE entries:

LOGFILE 2008/08/01/*
LOGFILE 2008/08/02/*
LOGFILE 2008/08/03/*
etc.

Aengus
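If typing 31 LOGFILE lines by hand is unappealing, a config file like august.cfg can be generated. This is a minimal Python sketch that assumes the year/month/day directory layout described in the question; the function name is made up for illustration.

```python
import calendar


def monthly_logfile_lines(year, month):
    """Build one LOGFILE line per day of the month, for YYYY/MM/DD directories."""
    days_in_month = calendar.monthrange(year, month)[1]
    return [
        f"LOGFILE {year:04d}/{month:02d}/{day:02d}/*"
        for day in range(1, days_in_month + 1)
    ]


if __name__ == "__main__":
    # Redirect this output to august.cfg, then run: analog +gaugust.cfg
    print("\n".join(monthly_logfile_lines(2008, 8)))
```

Because calendar.monthrange supplies the number of days, the same function covers short months and leap years without special cases.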
Re: [analog-help] Help on Processing Time
On 9/5/2008 9:22 AM, Arnab Ganguly wrote: Hi All, I print the %T value in the access log. As per the Apache docs, %T is "The time taken to serve the request, in seconds". My question is this: my client and server timeout is set to 25 seconds. I am not using the Timeout value either, and my KeepAlive is set to Off. But under the Processing Time report I do get entries where the request was served with a value of more than 10 minutes, i.e. 600 seconds. My guess is that the client would have timed out before then, so how come the request was successful? The response code we get is also 200 OK. Can you please help me on this? Thanks and regards

If a user on a slow link (a dialup modem, for example) is downloading a large file, I believe that the %T value reflects the length of time it takes that particular download to complete. I'm not certain about that, but you should be able to confirm it by looking at the details of some of the log entries that show high %T values.

Aengus
Re: [analog-help] Help on Processing Time
Arnab Ganguly [EMAIL PROTECTED] wrote: Thanks for the help and for giving me a wonderful example. Can you give me some more input, a little bit off topic? Suppose some user is downloading a file: is the final 200 OK propagated from the server to the client only once the download is complete, or would there be message interaction with the client in between as well?

For a straightforward HTTP download, there would usually only be a single entry in the log file, with a 200 status code. But there are circumstances, most notably PDF files, where, if the server supports it, you may encounter partial downloads, with repeated 206 status codes. (http://analog.cx/docs/faq.html#faq143)

Also, where does the client/server message transaction timer come into the picture? In this case the download takes a long time, so the timer is going to expire.

That's really more of a server-specific question - different servers could deal with it in different ways. But strictly speaking a time-out doesn't occur, because there is continuous communication between the client and server during that download - the server only sends packets as the client requests them, otherwise the server, with its high speed connection, would just spit out the whole transaction faster than the client could accept it.

Aengus
Re: [analog-help] Bounty for 64-bit Windows binary
On 9/25/2008 11:01 AM, Nick Altmann wrote: What type of bounty would entice someone to compile a 64-bit Windows binary (i.e. one that is not limited to 2GB of memory)?

The open-source tools used for compiling the Windows version of Analog are 32-bit only. There are some people working on 64-bit versions of these tools, but the status of MinGW is still marked as 3-Alpha - https://sourceforge.net/projects/mingw-w64/

In all honesty, you might get a better response to your question from the people who are developing the compiler tools - if it's simply a question of recompiling the existing code as a 64-bit binary, then it'd be a simple task for them. If it involves any significant code changes to Analog, then there is very little history of such work being handled on this list.

Aengus
Re: [analog-help] LogFormat in analog.cfg broken
Ulf Hofemeier [EMAIL PROTECTED] wrote: Quoting Aengus [EMAIL PROTECTED]: Ulf Hofemeier [EMAIL PROTECTED] wrote: Hi folks: I could use some help figuring out the correct syntax for a new LogFormat line in my analog.cfg file after changing the LogFormat settings for Apache a few weeks ago. In my httpd.conf I have the following line:

LogFormat "%h %l \"%u\" %t \"%r\" %s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined

Have you tried just using your Apache command directly?

APACHELOGFORMAT ("%h %l \"%u\" %t \"%r\" %s %b \"%{Referer}i\" \"%{User-Agent}i\"")

Aengus

Hi Aengus: I tried APACHELOGFORMAT in my analog.cfg without success (still corrupt logfile lines). My Apache access_log looks like this:

1.2.3.4 - - [02/Oct/2008:09:31:33 -0600] GET /images/black.gif HTTP/1.1 200 43 http://www.unm.edu/; Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3
1.2.3.4 - - [02/Oct/2008:09:31:33 -0600] GET /images/white.gif HTTP/1.1 200 43 http://www.unm.edu/; Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3

My mistake - I forgot to strip the double-quotes from the start and end of your line from the Apache entry.

APACHELOGFORMAT (%h %l \"%u\" %t \"%r\" %s %b \"%{Referer}i\" \"%{User-Agent}i\")

This works for the sample lines that you posted.

Aengus
Re: [analog-help] LogFormat in analog.cfg broken
Ulf Hofemeier [EMAIL PROTECTED] wrote: I have to admit that the analog configuration is getting quite confusing to me. Unfortunately the APACHELOGFORMAT line doesn't solve my problem, so please allow me to provide you with a little more information regarding the purpose of the updated analog.cfg, as well as what I'm doing before the problem occurs.

1. Copy the previous month's Apache log to a temporary location
2. Run a script to extract page visitor data from the general Apache log file and store it in a separate file
3. Run a bash 'for i' loop on the new log files and store the data in page visitor subdirectories

Unfortunately I decided that Apache has to write more information to its access_log log file, which is ultimately the reason why there are issues with analog now. According to the analog documentation there is a way to set up a hierarchy so that it will understand a log file syntax even if it changes from old to new over time, but I haven't been able to figure out how to make it work.

If you have multiple LOGFORMAT statements, Analog will try them each in turn until it finds one that matches the entries in each of your logfiles. That means that if you have multiple logfiles, and they aren't all the same format, Analog can still create a single report from these different logfiles. Obviously the report may understate the items that weren't recorded in some of the logfiles - for example, you might have a million requests, but only 200,000 Browser strings if you only added that field in later log files.

LOGFORMAT commands apply to LOGFILEs that are specified after the LOGFORMAT in the .cfg file. DEFAULTLOGFORMAT commands apply to logfiles that are specified on the command line. It's not clear from your description whether your script calls Analog and passes it the name of the logfile as a parameter, or whether Analog picks up the logfile from the LOGFILE log--??.gz statement in your .cfg file.
If you're specifying the LOGFILEs in the .cfg file, then these lines should do the job:

APACHELOGFORMAT (%h %l \"%u\" %t \"%r\" %s %b)
APACHELOGFORMAT (%h %l \"%u\" %t \"%r\" %s %b \"%{Referer}i\" \"%{User-Agent}i\")

If you're calling Analog with the logfiles specified on the command line, then these lines should work:

DEFAULTAPACHELOGFORMAT (%h %l \"%u\" %t \"%r\" %s %b)
DEFAULTAPACHELOGFORMAT (%h %l \"%u\" %t \"%r\" %s %b \"%{Referer}i\" \"%{User-Agent}i\")

Aengus
Re: [analog-help] LogFormat in analog.cfg broken
Ulf Hofemeier [EMAIL PROTECTED] wrote: DEFAULTLOGFORMAT (%S %j %u [%d/%M/%Y:%h:%n:%j] %j %r %j %c %b) DEFAULTLOGFORMAT (%S %j %u [%d/%M/%Y:%h:%n:%j] %j %r %j %c %b) DEFAULTLOGFORMAT (%S %j %u [%d/%M/%Y:%h:%n:%j] %j %r %c %b) DEFAULTLOGFORMAT (%S %j %u [%d/%M/%Y:%h:%n:%j] %r %c %b)

Are these lines in your .cfg file for a reason?

Aengus
Re: [analog-help] LogFormat in analog.cfg broken
Ulf Hofemeier [EMAIL PROTECTED] wrote: I inherited the ancient analog.cfg configuration file and the scripts that trigger analog every month, so my answer is that the DEFAULTLOGFORMAT lines are supposed to cover every Apache access_log version there was going back to 2003. As I learned from you that DEFAULTLOGFORMAT is what analog uses for log files that are handed over as a parameter on the command line, they should work fine. Analog did run without reporting corrupt log file lines, at least until I changed the Apache logformat output to something else.

I just checked the documentation, and the correct directive is APACHEDEFAULTLOGFORMAT, rather than DEFAULTAPACHELOGFORMAT.

APACHELOGFORMAT is meant to be a convenience for those who have Apache configured with custom logformats - instead of translating the Apache configuration into Analog syntax, Analog will do it for you (most of the time - some complex Apache statements won't translate).

It's probably worth taking a few minutes to look at the LOGFORMAT documentation to see how the LOGFORMAT is created - it's a fairly straightforward substitution of letter codes for fields (%S for IP address, %b for bytes, %B for Browser, %c for status code, etc), so for a line like

1.2.3.4 - - [17/Sep/2008:13:12:52 -0600] GET /images/sandiamountains.jpg HTTP/1.1 200 61671 http://ladb.unm.edu/; Mozilla/5.0 ...

1.2.3.4 is %S. The two dashes that follow are the identd field, which is junk (%j), and the authenticated user (%u). GET /images/sandiamountains.jpg HTTP/1.1 is %j %r %j (I don't care about the GET or the HTTP/1.1, so they are coded as %j for junk). The timestamp has a day (%d), Month (%M), 4-digit Year (%Y), hour (%h), minutes (%n) and more junk (seconds and GMT offset), so [17/Sep/2008:13:12:52 -0600] is coded as [%d/%M/%Y:%h:%n:%j]. Note that case and spacing are important.
Put it all together, and you end up with a LOGFORMAT like this:

%S %j %u [%d/%M/%Y:%h:%n:%j] %j %r %j %c %b %f %B

Your existing .cfg file uses Analog syntax, not Apache syntax, so when you added two extra fields to your logfile (referrer and User Agent), you could have just copied the existing entries and added %f %B to the end (though there's a lot of redundancy in your existing setup - you really only need the first 1 of the 4 DEFAULTLOGFORMAT lines). Or you could copy the modified LogFormat command from your httpd.conf file and add it to the analog.cfg file with

APACHEDEFAULTLOGFORMAT (%h %l \"%u\" %t \"%r\" %s %b \"%{Referer}i\" \"%{User-Agent}i\")

Based on what you've posted, you only need

# If you need a LOGFORMAT command (most people don't -- try it without first!),
# it must go here, above the LOGFILE commands.
DEFAULTLOGFORMAT (%S %j %u [%d/%M/%Y:%h:%n:%j] %j %r %j %c %b)
DEFAULTLOGFORMAT (%S %j %u [%d/%M/%Y:%h:%n:%j] %j %r %j %c %b %f %B)
DEBUG ON

You should be able to delete the other 12 lines from your .cfg file, as they don't appear to be doing anything useful (I'd be particularly concerned about that LOGFILE line - are you counting access_log.-??.gz and log--??.gz?).

Aengus
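The field-by-field mapping above, together with the idea of trying an old and a new format in turn, can be sketched in Python. This is only an illustration of the idea and not how Analog is implemented: the regular expressions, group names and parse_line helper are hypothetical, and the sample lines assume the standard Apache layout in which the request, referrer and user agent are double-quoted.

```python
import re

# Older lines: host, identd (junk), user, timestamp, request, status, bytes.
COMMON = re.compile(
    r'(?P<host>\S+) \S+ (?P<user>\S+) '
    r'\[(?P<day>\d{2})/(?P<month>\w{3})/(?P<year>\d{4})'
    r':(?P<hour>\d{2}):(?P<minute>\d{2}):[^\]]*\] '
    r'"\S+ (?P<file>\S+) [^"]*" (?P<status>\d{3}) (?P<bytes>\d+|-)'
)

# Newer lines: the same fields followed by quoted referrer and browser.
COMBINED = re.compile(
    COMMON.pattern + r' "(?P<referrer>[^"]*)" "(?P<browser>[^"]*)"'
)


def parse_line(line):
    """Try each known format in turn; return the fields of the first match."""
    for pattern in (COMBINED, COMMON):
        match = pattern.fullmatch(line.strip())
        if match:
            return match.groupdict()
    return None  # the line matches neither format


if __name__ == "__main__":
    new_style = (
        '1.2.3.4 - - [17/Sep/2008:13:12:52 -0600] '
        '"GET /images/sandiamountains.jpg HTTP/1.1" 200 61671 '
        '"http://ladb.unm.edu/" "Mozilla/5.0"'
    )
    print(parse_line(new_style))
```

As with the two DEFAULTLOGFORMAT lines, older entries simply produce no referrer or browser fields, which is why reports built from mixed logfiles can understate those items.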
Re: [analog-help] Any possibility of recognising Windows Vista?
On 10/9/2008 4:25 AM, Philip Goddard wrote: I've greatly valued Analog as my website statistics program for many years now, and nowadays I'm using version 6. There is just one little thing that I wonder if it could be addressed, and that is getting Windows Vista to appear in the Operating System report. I've looked around to see if there was some little hack that I could do in the translation files, but I see nothing there that I could usefully change or add to. I'm assuming that the Vista entries in my logfiles are the ones listed by Analog as 'Unknown Windows'. So, any chance of my changing this? I wouldn't mind, actually, if I just had the Windows versions showing as their NT version numbers as they arrive in the logfiles - at least I'd know which is which - though the version names certainly add clarity to the report.

The issue first came up almost 2 years ago, and John Harman posted a few lines of source code that added Windows Vista to the OS Report. http://lists.meer.net/pipermail/analog-help/2006-November/020007.html

For Windows users who don't typically compile their own source code, there was a discussion on the list in early 2007, and Jeroen Feelders and Paul Wade both compiled the Windows version and posted it on their own websites: http://lists.meer.net/pipermail/analog-help/2007-January/020118.html http://lists.meer.net/pipermail/analog-help/2007-January/020122.html

Chris Tilley posted a version with some additional changes this time last year: http://lists.meer.net/pipermail/analog-help/2007-October/020754.html

Paul Wade posted a link to a version that includes some of these changes: http://lists.meer.net/pipermail/analog-help/2007-October/020804.html

I posted instructions for compiling Analog on Windows yourself. They include links to specific versions of the various tools that may or may not still be available, but I imagine that the current versions of the various components would do the job just fine.
http://lists.meer.net/pipermail/analog-help/2007-January/020117.html

Aengus
Re: [analog-help] Separate configuration files on Windows
On 10/16/2008 8:05 AM, Richard Heyes wrote: Should this work on Windows: analog +Gc:\analog\rgraph-analog.cfg ? Basically I have two logfiles that I want to analyse separately on Windows. No joy though - it keeps coming back with this:

analog: Warning C: Command line argument +Gfoo.cfg too long: ignoring end of it

Even if I change it to the name foo.cfg so it's shorter.

Analog command line arguments are case sensitive. Upper case G refers to the default .cfg file, and it's only used as -G to tell analog to ignore the default .cfg file. When you tack anything onto uppercase G the argument is too long, and Analog ignores the end of it. You want +gc:\analog\rgraph-analog.cfg, with a lower-case g.

Aengus
Re: [analog-help] Separate configuration files on Windows
Richard Heyes [EMAIL PROTECTED] wrote: Analog command line arguments are case sensitive. Upper case G refers to the default .cfg file, and it's only used as -G to tell analog to ignore the default .cfg file. When you tack anything onto uppercase G the argument is too long, and Analog ignores the end of it. You want +gc:\analog\rgraph-analog.cfg, with a lower-case g.

I don't, I want Analog to use a totally separate config file, ignoring the default cfg file.

Then you want

-G +gc:\analog\rgraph-analog.cfg

Config files aren't either/or - you can have multiple .cfg files, so the -G turns off the default .cfg file, and you can use +g to turn on your other .cfg file (you can have multiple +g commands). http://analog.cx/docs/syntax.html#specialcfgs

Aengus
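For anyone driving Analog from a script, the -G and +g rules above can be captured in a small helper. This Python sketch only assembles the argument list; the analog_args name is hypothetical, and it assumes the executable is reachable as "analog" on the PATH.

```python
def analog_args(config_files, use_default_config=True):
    """Build an Analog command line from a list of extra config files."""
    args = ["analog"]  # assumption: the binary is on the PATH under this name
    if not use_default_config:
        args.append("-G")  # upper-case G: ignore the default .cfg file
    # Lower-case +g, once per additional config file.
    args.extend("+g" + path for path in config_files)
    return args


if __name__ == "__main__":
    print(analog_args([r"c:\analog\rgraph-analog.cfg"], use_default_config=False))
```

The resulting list can be handed to subprocess.run, which avoids quoting problems with backslashes in Windows paths.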
Re: [analog-help] no stats or reports in analog cgi web interface
On 10/20/2008 7:39 PM, Tubal Deen wrote: I'm getting nothing out of this config and have tried everything to pull stats. I know there are stats there as apache mod_status is producing output...

Does Analog run properly at the command line? What do you get in your browser when you call anlgform.pl? What's the HTTP Status code in your log file for that request?

Aengus
Re: [analog-help] APACHELOGFORMAT and hosts report
On 10/24/2008 10:20 AM, Don Jones wrote: Hello Analog gurus, I've been using Analog on-and-off for a while, and I'm a big fan. I'm trying to get Analog to give me a hosts report. The problem I seem to have is that the logs are writing an X-Forwarded-For header, which is the only way I have of knowing what the actual browser IP address was (lots of network topology in the way). So based on the following log format in Apache httpd.conf (I'm pretty sure this is current, but I will double-check):

LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %s %b \"%{Referer}i\" \"%{User-Agent}i\"\"%{Cookie}i\" %D" webtrends

So in analog.cfg, I have:

APACHELOGFORMAT (%{X-Forwarded-For}i %l %u %t \"%r\" %s %b \"%{Referer}i\" \"%{User-Agent}i\"\"%{Cookie}i\" %D)

And here's a sample line from the Apache access log:

10.235.166.27 - - [22/Oct/2008:09:22:49 -0500] GET /wps/portal/xxx HTTP/1.1 400 65536 - Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)WT_FPC=id=10.234.239.40-2330051872.29954568:lv=1224706655084:ss=1224706491290; JSESSIONID=HDRNq7GzVKH0HRzrmcAv123:139i273in; erU47MFBA6M2SE7HASZ6CLAGK3341=PWD=CLX=EnhancedRTEHMS=ppdapz0131LGN=MJSW43TFNJZDC; __utma=101953745.1997367580080200200.1221591400.1221591400.1221591400.1; __utmz=101953745.1221591400.1.1.utmcsr=hostname.com|utmccn=(referral)|utmcmd=referral|utmcct=/wps/portal/!ut/p/c1/04_sb8k8xllm9msszpy8xbz9cp0os3gdfwnvj29dm2mxazmj91avl08jawjq9_piz03vl8h2vaqavxwhdw!!/dl2/d1/l2djqsevuut3qs9zqnb3lzzfme8ws0jlmtyzrda2mkdvskwxmjawmdawmda!/ 576318

Finally I get to my question: how can I get a hosts report from this? I tried making the APACHELOGFORMAT use %S as the first token, but that didn't work.

APACHELOGFORMAT is simply a mechanism for translating the line from the Apache configuration file into native Analog format.
Whenever your Apache logformat string gets a bit complex, you're going to have to give up on the convenience of this automatic translation mechanism, and tell Analog exactly how it should interpret the logfile, by writing an Analog LOGFORMAT string, rather than relying on Analog to do the translation for you. Try this:

LOGFORMAT (%S %j %u [%d/%M/%Y:%h:%n:%j] %j %r %j %c %b %f %B%j %D)

Aengus
Re: [analog-help] x-forwarded-for with multiple hosts in LOGFORMAT
Don Jones [EMAIL PROTECTED] wrote: I am wrestling with the fact that my logfiles occasionally have more than one entry for the X-Forwarded-For header. For the following Apache 2.0 LogFormat directive:

LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %s %b \"%{Referer}i\" \"%{User-Agent}i\"\"%{Cookie}i\" %D" webtrends

and given the following Analog LOGFORMAT directive:

LOGFORMAT (%S %j %u [%d/%M/%Y:%h:%n:%j] %j %r %j %c %b %f %B%j %D)

(which this board gave to me, thank you again very much). Most of the lines in my logfiles look like this:

10.234.232.167 - - [25/Oct/2008:23:01:10 -0500] GET ...

But over the course of a week, about 1/5 of them (enough to skew the statistics) look like this, or some variation:

10.236.188.189, 10.254.246.140 - - [25/Oct/2008:23:00:34 -0500] GET ..

Analog can cope with multiple LOGFORMATs in a single log file, so just add an additional entry for decoding the lines with the extra IP addresses.

LOGFORMAT (%S %j %u [%d/%M/%Y:%h:%n:%j] %j %r %j %c %b %f %B%j %D)
LOGFORMAT (%S, %j %j %u [%d/%M/%Y:%h:%n:%j] %j %r %j %c %b %f %B%j %D)

or

LOGFORMAT (%S %j %u [%d/%M/%Y:%h:%n:%j] %j %r %j %c %b %f %B%j %D)
LOGFORMAT (%j, %S %j %u [%d/%M/%Y:%h:%n:%j] %j %r %j %c %b %f %B%j %D)

Aengus
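The two alternatives above differ only in which address of the pair is treated as the host: the first pair of LOGFORMAT lines keeps the address before the comma, the second pair keeps the one after it. As a rough illustration in Python (the pick_client_address helper is hypothetical and has nothing to do with Analog's internals), the same choice looks like this:

```python
def pick_client_address(forwarded_for, which="first"):
    """Choose one address from a comma-separated X-Forwarded-For value."""
    addresses = [part.strip() for part in forwarded_for.split(",") if part.strip()]
    if not addresses:
        return None
    # "first" mirrors the (%S, %j ...) variant, anything else the (%j, %S ...) one.
    return addresses[0] if which == "first" else addresses[-1]


if __name__ == "__main__":
    value = "10.236.188.189, 10.254.246.140"
    print(pick_client_address(value, "first"))
    print(pick_client_address(value, "last"))
```

By convention each proxy appends to the header, so the leftmost address is usually the one closest to the browser, which may make the first variant the more natural choice for a hosts report.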
Re: [analog-help] does not update automatically
[EMAIL PROTECTED] wrote: Good morning, I downloaded and compiled version 6.0 of Analog on a Linux platform. I followed all the instructions, but it does not update automatically every day: why not? I set everything up on October 30, and it still shows only the contents of the log file up to that moment:

Program started Thu 30-Oct-2008 at 10:41. Analysed requests from Tue 28-Oct-2008 at 12:17 to Thu 30-Oct-2008 at 10:33 (1.93 days).

What's wrong? Thanks in advance to anyone who can help me.

Analog only generates output when you run it. To have Analog run automatically at a certain time every day, you can use the cron command. This might not work properly at first, because of file permissions, but give it a try. http://analog.cx/docs/faq.html#faq150

Aengus
Re: [analog-help] Still in development?
Joelle Tegwen [EMAIL PROTECTED] wrote: Hi, I realize that I'm asking the choir here, but I'm hoping to get an honest assessment. We're looking for a new log analyzer. The two that consistently came up were AWStats and Analog. Analog seems like a great product, but it seems to be no longer under development. We don't have anyone on staff who could update this, so if updates or changes needed to be made, we would be out of luck. Is it worth it to start using Analog now? Are there current open bugs that are a concern? How would we deal with new browsers/OS (like Vista)? Or is this project going away and we should look elsewhere for a log analyzer? Are there other issues I should be aware of?

Off the top of my head, I can't think of anything that I would describe as an open bug. While the current version of Analog was released 4 years ago, there haven't really been any significant changes in the Web log landscape in that time, except the addition of Windows Vista to the OS report, and there are user-contributed patches that address that. Presumably someone will add Windows 7 in the near future too.

A 64-bit compiled version for Windows might also be more useful now than it was 4 years ago, but that's a problem with the available tools, rather than with Analog itself. As the availability of 64-bit tools for Windows development improves, expect to see a user-contributed build of 64-bit Analog for Windows. (64-bit versions for other operating systems are already available.)

Analog has remained pretty static because the flexibility and customization that it offers come from changes made to config files, not from code modifications and recompilation. Is it worth starting to use Analog now? I'd turn that question around - is there any reason not to use Analog? If Analog doesn't deliver on some of your needs today, it's probably not going to change significantly at this point.
On the other hand, if you have a good idea what you want to get from your log files, and Analog delivers on those needs, then it's going to keep delivering.

Aengus
Re: [analog-help] Updated analog.exe with expanded memory (3GB) support
Nick Altmann [EMAIL PROTECTED] wrote: It's not as good as a full 64-bit version (any progress on that front?), but it may give some of you a little more breathing room.
My interest in the 64-bit version was purely academic - when I documented the process for compiling the 32-bit Windows version, I checked to see whether a 64-bit version would be a possibility, but the 64-bit version of MinGW was a very early alpha at that point. I had a quick look at the project page today (http://sourceforge.net/projects/mingw-w64/), and it's still listed as alpha, but the discussion forum indicates that it's a bit more usable at this point. If I had the time (and a pressing need), I'd invest an afternoon in trying to get it working, but I think I'll leave it on the to-do list for now :-) Aengus
Re: [analog-help] Analog issues
On 11/22/2008 3:52 PM, Ric Erwin wrote: I have downloaded the latest version of Analog (6.0) and attempted to install. The file unzips, the installer seems to start, there’s a glimpsed flash of the DOS-Command screen, and then nothing. I reboot, but Analog 5.1 is still there. What am I doing wrong? http://analog.cx/docs/faq.html#faq101 Aengus
Re: [analog-help] Need help to retrieve (and correct) all reports from this log (Maybe I need the LOGFORMAT)
Leung, Michael [EMAIL PROTECTED] wrote: Dear Analog experts, The following is an example of the log along with the description:
format=%Ses-client.ip% - %Req-vars.auth-user% [%SYSDATE%] %Req-reqpb.clf-request% %Req-srvhdrs.clf-status% %Req-srvhdrs.content-length% %Req-headers.referer% %Req-headers.user-agent% %Req-headers.cookie.vrsnsf% %Req-headers.cookie.JSESSIONID% %Req-headers.cookie.landing%
205.178.191.170 - - [23/Nov/2008:00:01:01 -0500] GET /manage-it/hosting-overview.jsp HTTP/1.1 200 55065 https://www.networksolutions.com/manage-it/private-registration-splash.jsp Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.18) Gecko/20081029 Firefox/2.0.0.18 4b31b171ac7c472da07cff3748a69 c7b881bcf1fcfe5edf3cd514469c -
If I don't specify a LOGFORMAT, it won't complain, but some of the reports don't seem to be giving any meaningful data. For example, the Domain report doesn't seem right.
What didn't look right? Analog won't convert IP addresses into hostnames automatically, so the Domain Report will be based purely on IP numbers, unless you set up DNS lookups. The reports also have certain floors, and they don't show information that falls below those floors, so for a report on a small logfile, you might not see entries for addresses that you expect to see.
Based on what I read, I tried to use the following LOGFORMAT statement, but it complains that something is wrong with it.
LOGFORMAT %s - %u [%d/%M/%Y:%h:%n:%j %j] %j %r %j %c %b %f %A %j %j -
Almost right. The LOGFORMAT string has to be delimited (usually with parentheses) and the browser string is usually indicated with %B, but %A seems to work too.
LOGFORMAT (%s - %u [%d/%M/%Y:%h:%n:%j %j] %j %r %j %c %b %f %B %j %j -)
Can someone give me some suggestion? First, what is wrong with my LOGFORMAT statement? Why did some of the reports not give meaningful information?
Can you describe the problem that you are having with the reports in greater detail?
Aengus
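For anyone following along, here is a minimal sketch of where the corrected line might sit in analog.cfg. The logfile path is a placeholder, not something from the original post. As far as I know, a LOGFORMAT command only applies to LOGFILE commands that come after it in the configuration, so the order matters.

```
# Describe the custom format first; it only affects LOGFILE lines below it
LOGFORMAT (%s - %u [%d/%M/%Y:%h:%n:%j %j] %j %r %j %c %b %f %B %j %j -)
# Placeholder path, substitute your own logfile
LOGFILE /path/to/access.log
```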
Re: [analog-help] Need help to retrieve (and correct) all reports (need help on LOGFORMAT)
Leung, Michael [EMAIL PROTECTED] wrote: Aengus, The below is what we see for the Domain Report, but it is not what we are expecting.
Listing domains, sorted by the amount of traffic.
reqs %bytes domain
655193 100% [unresolved numerical addresses]
Even if it is entirely based on IP numbers, I should see a list of several IP addresses, instead of what we have now.
My mistake - it's actually the Organization Report that shows a breakdown by IP address when DNS resolution isn't enabled - the Domain Report reports on Top Level Domains (.com, .org, .co.uk, etc.) so it requires IP names, not IP numbers. The Organization Report lists the organizations (companies, institutions, ISPs etc.) that the IP addresses are registered to. When you only have IP numbers, the Organization Report basically breaks the addresses down by Class, so anything from 12.x.y.z will be listed under 12 (a Class A address), but higher addresses will typically be listed with 2 or more octets.
But when I am using the above, instead of letting Analog use its auto-detect, I got the following error message in the output:
analog: Warning L: Large number of corrupt lines in logfile /source_data1/weblog/datafiles/1.log: turn debugging on or try different LOGFORMAT (For help on all errors and warnings, see docs/errors.html) Current logfile format: %S - %j [%d/%M/%Y:%h:%n:%j %j] %j %r %j %c %b %f %A %j %j -\n
What does it mean? Does it mean that I should use this suggested format?
It means that not all of the lines in your logfile match the LOGFORMAT that you told Analog to use. At a guess, the last field in your logformat isn't always -, so you might just want
LOGFORMAT (%S - %j [%d/%M/%Y:%h:%n:%j] %j %r %j %c %b %f %A %j)
The Domain report is one issue. And then, some of the search reports are turned off.
analog: Warning R: Turning off empty Search Query Report
analog: Warning R: Turning off empty Search Word Report
analog: Warning R: Turning off empty Internal Search Query Report
analog: Warning R: Turning off empty Internal Search Word Report
How do I verify if we have any data for these reports?
The Search Word and Query Reports rely on the Referrer field, and on the relevant search engine being defined in your analog.cfg (there are a couple of dozen of the more common ones listed in the default analog.cfg). If you have any referrers from Google or Yahoo, then your Search Word Reports should not be empty. The Internal Search Reports need you to define a particular URL on your web server as a search engine, and which field in the query string is the search term. The Internal Search Engine is not defined by default, so its reports will always be empty unless you've defined an Internal Search Engine. Aengus
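One way to follow the "turn debugging on" hint in that warning, sketched on the assumption that the DEBUG command behaves the way I remember from the Analog documentation (category C listing each corrupt logfile line). The logfile path is the one shown in the warning; check the DEBUG syntax against the docs for your version before relying on it.

```
# Shortened format from the reply: everything after the browser is one %j
LOGFORMAT (%S - %j [%d/%M/%Y:%h:%n:%j] %j %r %j %c %b %f %A %j)
LOGFILE /source_data1/weblog/datafiles/1.log
# Assumption: +C adds corrupt-line reporting to the current debug settings
DEBUG +C
```

The lines that Analog then reports are the ones that failed to match the LOGFORMAT, which usually makes it easier to see which field is causing the mismatch.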
Re: [analog-help] Need help to retrieve (and correct) reports (need help on LOGFORMAT)
On 12/1/2008 8:42 PM, Edward Spodick wrote: Perform DNS processing on your log files - either letting Analog do it with its slower code, or using a different tool to pre-process the logfile before triggering Analog (see http://www.analog.cx/helpers/#dns )
With 806393 requests in the logfile, using Analog's built-in lookups wouldn't be the best idea. For learning about DNS lookups, a sample from the logfile of a few hundred lines would be a better idea. For DNS lookups on anything larger than that, you really need to use one of the DNS helper apps. Aengus
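A rough sketch of the two approaches in analog.cfg terms. The cache filename is a placeholder, and the DNS and DNSFILE commands are written from memory of the DNS lookups section of the Analog documentation, so treat this as a starting point rather than a tested configuration.

```
# Small sample logfile: let Analog do the lookups itself and save the results
DNS WRITE
DNSFILE dnscache.txt

# Large logfile: have a helper app fill the same cache file beforehand,
# then let Analog read the cache without doing any new lookups itself
# DNS READ
# DNSFILE dnscache.txt
```

As far as I know, some of the helper apps write a cache file in Analog's own DNSFILE format, while others rewrite the logfile with hostnames in place of the addresses. In the second case no DNS command should be needed at all.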
Re: [analog-help] Need help to retrieve (and correct) reports (need help on LOGFORMAT)
On 12/1/2008 6:55 PM, Leung, Michael wrote: Based on my LOGFORMAT, my Referrer field shouldn't be empty. Right? The sample logfile entry that you provided had a referrer field (https://www.networksolutions.com/manage-it/private-registration-splash.jsp) Aengus
Re: Convert IP to Domain Name - RE: [analog-help] Need help to retrieve reports
Leung, Michael [EMAIL PROTECTED] wrote: It looks like the first one was converted
Original New
146.115.44.11 146-115-44-11.c3-0.sbo-ubr1.sbo.ma.cable.rcn.com
125.17.144.210 axcend.com
The second one seems right, but the first one looks a little odd to me.
C:\ping -a 146.115.44.11
Pinging 146-115-44-11.c3-0.sbo-ubr1.sbo.ma.cable.rcn.com [146.115.44.11] with 32 bytes of data:
Consumer ISPs often include the actual number in the DNS name that they record for each address. They have to have something to distinguish each entry in the host field, and for privacy reasons they wouldn't want to use the customer's name or account info, and they often use overlapping pools of numbers, so the geographic part of the name doesn't necessarily map into a specific subnet. So you end up with something like 146-115-44-11.c3-0.sbo-ubr1.sbo.ma.cable.rcn.com. It makes sense to someone at rcn.com. Aengus
Re: [analog-help] Need an example how to display only Top 20 in various reports
Leung, Michael [EMAIL PROTECTED] wrote: Dear Sir/Madam, It is basically what my subject said. I am trying to limit the Request Report, Referrer Report and the others to show only the top 20 entries. How do I accomplish this? Do we use a combination of Floor and Row commands? I went through the documentation on the website, and there seem to be several keywords/commands that could be related to this purpose, but I couldn't find an example of this.
Reports have Floors. To specify the top 20, you use a Floor of -20. But you also have to specify which top 20 you want - top 20 by requests, by page views, by bandwidth, requests in the last 7 days, etc.
To see the top 20 requests, by bytes transferred, you'd use
REQFLOOR -20b
To see the top 20 requests, by page views, over the last 7 days, you'd use
REQFLOOR -20q
http://analog.cx/docs/othreps.html#FLOOR Aengus
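Applied to the two reports named in the question, a minimal sketch might look like the following. REFFLOOR and the two SORTBY commands are not mentioned in the reply above; they are the corresponding commands as I remember them from the documentation, so check the names against the page linked above.

```
# Request Report: top 20 files by number of requests
REQFLOOR -20r
REQSORTBY REQUESTS
# Referrer Report: top 20 referrers by number of requests
REFFLOOR -20r
REFSORTBY REQUESTS
```

Keeping the sort method in step with the floor means the 20 rows that survive the floor are also listed in the same order they were chosen by.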
Re: [analog-help] Need help to retrieve (and correct) reports (need help on LOGFORMAT)
Leung, Michael [EMAIL PROTECTED] wrote: Well, I was hoping that the report will list the IP address under the domain column. For example, reqs: %bytes: domain 806393 40% 146.115.44.11
As Stephen suggested, you want the Host Report (which is not turned on by default). Turn it on with HOST ON. The full list of reports that Analog can produce is listed here: http://analog.cx/docs/reports.html#repoth The list of commands for turning these different reports on and off is here: http://analog.cx/docs/output.html#replist Aengus
Re: [analog-help] Need an example how to display only Top 20 reports
Leung, Michael [EMAIL PROTECTED] wrote: REQARGFLOOR 100%r That should be REQARGSFLOOR 100%r Aengus
Re: [analog-help] Regarding SEARCH WORD and INTERNAL SEARCH reports
Leung, Michael [EMAIL PROTECTED] wrote: If my Search Word Reports shouldn't be empty, what might go wrong? Or, where should I check first? I have tried with the following setup:
INTSEARCHWORD ON
INTSEARCHQUERY ON
INTSEARCHENGINE /Search.aspx keyword
FILEINCLUDE /Search.aspx*
ARGSINCLUDE /Search.aspx
But this will give me empty reports with warning messages saying all the reports are empty.
I created a one-line logfile based on your original sample logfile:
205.178.191.170 - - [23/Nov/2008:00:01:01 -0500] GET /Search.aspx?keyword=Analog6 HTTP/1.1 200 55065 http://www.google.com/search?q=analog+6.0; Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.18) Gecko/20081029 Firefox/2.0.0.18 - - -
I modified your request string to /Search.aspx?keyword=Analog6 to match the INTSEARCH entries you listed above. I modified your referrer string to http://www.google.com/search?q=analog+6.0; because Google is one of the search engines listed in the sample analog.cfg file:
SEARCHENGINE http://*google.*/* q,as_q,as_epq,as_oq
When I run Analog against this one-line logfile, I get the Search and Internal Search Reports. Aengus
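Pulling the pieces of this exchange together, the search-related part of the configuration might look something like this. It is only a sketch assembled from the commands quoted above, plus SEARCHQUERY and SEARCHWORD, which I believe are the switches for the two external search reports. It also assumes the LOGFORMAT in use actually captures the referrer (%f) and the requested file (%r), since that is where the search terms come from.

```
# External search engines: terms are taken from the referrer field
SEARCHQUERY ON
SEARCHWORD ON
SEARCHENGINE http://*google.*/* q,as_q,as_epq,as_oq

# Internal search: terms are taken from the query string of requests
# for /Search.aspx, using the "keyword" argument
INTSEARCHQUERY ON
INTSEARCHWORD ON
INTSEARCHENGINE /Search.aspx keyword
```

If the reports are still empty with settings like these, the next thing to look at is probably whether the logfile lines are being parsed at all, rather than the search commands themselves.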
Re: [analog-help] Support for Vista
On 12/11/2008 6:32 AM, Arnab Ganguly wrote: Hi All, At present does Analog support Vista? When I am sending an incident from a Vista machine I am getting the User Agent as Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; SLCC1; .NET CLR 2.0.50727; .NET CLR 3.0.04506). So is it possible to get the report as Vista under the OS report and how can it be done? http://lists.meer.net/pipermail/analog-help/2008-October/021339.html Aengus
Re: [analog-help] Support for Vista
On 12/11/2008 11:52 AM, Arnab Ganguly wrote: Hi All, Thanks for the update. Issue is that without changing the source code and compiling, is it possible with the existing binary to support Vista also how do I check about the Vista support is there or not in my binary? A number of people have posted modified binaries that you can download from their own websites. None of the binaries available from www.analog.cx have this support. Aengus
Re: [analog-help] Support for Vista
On 12/11/2008 11:52 AM, Arnab Ganguly wrote: Hi All, Thanks for the update. Issue is that without changing the source code and compiling, is it possible with the existing binary to support Vista also how do I check about the Vista support is there or not in my binary? If I use BROWALIAS Windows NT 6.0 Windows Vista will it work? I don't think that's ever come up on the list - have you tried it? It should only take a couple of seconds. Aengus
Re: [analog-help] Support for Vista
On 12/11/2008 9:54 PM, Arnab Ganguly wrote: Once the User-Agent comes as Windows NT 6.0 we get it as unknown windows which is for Vista in this case. So is there any workaround for the above like aliasing etc? The only discussion of this issue on this mailing list led to the creation of the patches mentioned earlier. You can download a copy of the modified executable for Windows here: http://www.xs4all.nl/~jfeelder/downloads/analog.exe Aengus
Re: [analog-help] Help with errors sortby errors
On 1/6/2009 6:56 PM, Casual Account wrote: Hi I am trying to set up a configuration file, mainly viewing data by pages but am getting sortby errors all the time regarding bytes and I can't seem to see why. The errors are typically around Warning D: In directory Report, SORTBY (pages) doesn’t match SUBSORTBY (bytes) I have commented out any sorts using bytes and can't seem to work out why this is occurring. Have you tried adding SUBDIRSORTBY PAGES? Aengus
Re: [analog-help] Help with errors sortby errors
On 1/7/2009 7:42 PM, Casual Account wrote: Thanks Aengus That sorted the problem. Does that command also apply to the browser report? (ie when I added it I got a mismatch on bytes and pages again, but could fix this by changing BROWREPSORTBY REQUESTS and BROWSUMSORTBY REQUESTS to PAGES.)
Some reports are hierarchical - the entries can be broken down in greater detail. These sub-reports have their own floor and sort settings - for example, you might only want to see the top 20 directories, sorted by requests, but only want to see requests if they totalled more than a certain number of bytes. http://analog.cx/docs/hierreps.html Note that Analog is just giving warnings - you can simply ignore the warnings - you only need to change the subfloors or the subsortbys if the information displayed in your reports isn't actually what you wanted to see. Aengus
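To make the pairing concrete, here is a small sketch of keeping the top-level and sub-level sort methods in step for two hierarchical reports. SUBBROWSORTBY is not mentioned in the thread; it is the sub-level counterpart for the Browser Summary as I recall it from the hierarchical reports page linked above, so treat that name as something to verify.

```
# Directory Report: sort both levels by page views
DIRSORTBY PAGES
SUBDIRSORTBY PAGES

# Browser Summary: same idea, both levels by page views
BROWSUMSORTBY PAGES
SUBBROWSORTBY PAGES
```

The alternative fix is the one described in the question: leave the sub-level setting alone and change the top-level SORTBY to match it. Either way the warning should go away once the two levels agree.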