Re: [analog-help] MSN Bot - related
Likely the number of results on MSN Search is an estimate until you delve deeper into the result set. If you are having trouble using MSN Search, I think the best forum is the Feedback link at the bottom of MSN pages. Whether to block a search engine from your site is a decision only you can make.

Innerlab wrote:

> I hope this is not too off topic. I used some software to check the link popularity of my important web pages in several search engines, and all except MSN displayed at least one result for each.
>
> Also, I typed my domain's name directly into the MSN search bar. It initially returned about 1000 results, along with the links needed to access them (i.e. Page1, Page2, Page3 ... Next). But after I clicked on page two, the total number of results containing my domain dropped to 20, and the links to more results just disappeared. I wonder why that would be.
>
> Considering that the MSN bot is really bombarding my website every day, and it's not really doing much for me, should I just BLOCK it???

--
Jeremy Wadsack
Seven Simple Machines

+
| TO UNSUBSCRIBE from this list:
|    http://lists.meer.net/mailman/listinfo/analog-help
|
| Usenet version: news://news.gmane.org/gmane.comp.web.analog.general
| List archives:  http://www.analog.cx/docs/mailing.html#listarchives
+
Re: [analog-help] using analog to pre-parse IIS logs for spiders/bots
Will wrote:

> Greetings everyone. This is my first post, so I apologize in advance if I'm doing something wrong. I have inherited stats duties for our company, which has about 20 domains and lots and lots of IIS logs. We are using Urchin and cannot switch at this time.
>
> I want to use Analog to pre-parse my IIS log files on a daily basis by removing all log entries made by spiders (as identified by some external machine-generated spiders.cfg file). Urchin has very crappy and limited functionality for filtering spiders. It is clearly not doing a good job identifying crawlers, so I figured this was my best bet: to pre-parse using Analog before Urchin gets its grubby hands on my log files.
>
> Can anyone help me with a .cfg file and command line syntax for accomplishing this? I don't want it to do any reporting or analyzing, just output the identical IIS log but with all spider/bot entries removed.

Analog won't modify your logfiles - it will only read them in and report on the contents. If you want to physically exclude robots/spiders from your logs, you can use something as simple as the FINDSTR command included in Windows, along with a list of strings that identify spiders. You can create that list from information on http://www.robotstxt.org/ or you could create a custom list by using Analog to analyse your logs for behaviour that you identify as spider-like. (For example, you could run a Full Browser report to get a list of browser names that are obviously spiders.)

You would use FINDSTR like this to create a no-spider version of your logfile:

    FINDSTR /V /I /G:spiders.txt ex050523.log > ns050523.log

Here /V prints only the lines that do not match, /I makes the match case-insensitive, and /G: reads the search strings from the named file. spiders.txt would contain a list of strings that match known spiders in your logfile. That might be agent strings or host addresses. For example, it might contain the following lines:

    googlebot
    msnbot
    slurp
    10.123.45.67

(where 10.123.45.67 is the IP address of a spider, for example).

Note that this approach can have unexpected consequences. If you have a lot of referrals from a page called slurpy.htm, for example, those entries would also be excluded by the "slurp" entry (the Inktomi spider) in the list above.

Aengus
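For anyone who would rather script this step than use FINDSTR, the same filtering can be sketched in Python. This is only an illustration of the approach described above, not something Analog or Urchin provides: the function names are made up for the example, and the matching is the same plain case-insensitive substring test, so it shares the slurpy.htm false-positive problem.

```python
from typing import Iterable, Iterator, List


def load_patterns(lines: Iterable[str]) -> List[str]:
    """Return lower-cased, non-empty search strings, one per input line."""
    return [line.strip().lower() for line in lines if line.strip()]


def drop_spider_lines(log_lines: Iterable[str], patterns: List[str]) -> Iterator[str]:
    """Yield only the log lines that contain none of the patterns.

    This is a plain case-insensitive substring test, so "slurp" will
    also match a request or referrer containing "slurpy.htm".
    """
    for line in log_lines:
        lowered = line.lower()
        if not any(pattern in lowered for pattern in patterns):
            yield line


def filter_logfile(patterns_path: str, source_path: str, dest_path: str) -> None:
    """Copy source_path to dest_path, omitting lines that match any pattern.

    All three paths are supplied by the caller; nothing here is specific
    to IIS, Analog, or Urchin.
    """
    with open(patterns_path) as pattern_file:
        patterns = load_patterns(pattern_file)
    with open(source_path) as source, open(dest_path, "w") as dest:
        dest.writelines(drop_spider_lines(source, patterns))
```

Calling filter_logfile("spiders.txt", "ex050523.log", "ns050523.log") would write a filtered copy and leave the original log untouched. The file names are only examples.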
[analog-help] CGI warning about unrequested logfile
I'm a new user of analog and I'm feeling my way through slowly, but I'm getting a warning that I don't understand. I'm hoping that someone will be able to enlighten me.

When I use the cgi program, I get the following in my apache error log:

    [Tue May 24 15:22:10 2005] [error] [client aa.bb.cc.dd] [Tue May 24 15:22:10 2005] /usr/bin/analog: analog version 6.0/Unix, referer: http://sam/cgi-bin/anlgform.pl
    [Tue May 24 15:22:10 2005] [error] [client aa.bb.cc.dd] [Tue May 24 15:22:10 2005] /usr/bin/analog: Warning F: Failed to open logfile, referer: http://sam/cgi-bin/anlgform.pl
    [Tue May 24 15:22:10 2005] [error] [client aa.bb.cc.dd] /usr/bin/logfile.log: ignoring it, referer: http://sam/cgi-bin/anlgform.pl
    [Tue May 24 15:22:10 2005] [error] [client aa.bb.cc.dd] (For help on all errors and warnings, see docs/errors.html), referer: http://sam/cgi-bin/anlgform.pl

From reading the docs, I presume this means that analog is looking for the default logfile in the executable directory. The only problem is that I'm actually specifying a logfile. Here's the output from the cgi using qv=1:

    CONFIGFILE /etc/analog/proxy.cfg
    CGI ON
    DNS NONE
    WARNINGS FL
    LOGFILE /var/log/05/14-Sat/apache-proxy
    SUBMIT Go
    DEBUG -C
    OUTFILE stdout

I also tried putting SETTINGS ON, but the logfile part of the output looks fine to me:

    Configuration files read:
      /etc/analog/analog.cfg
      standard input
      /etc/analog/proxy.cfg
    Warning types on: FL
    Debugging types on: none
    Reading cache files: none
    Reading logfiles:
      /var/log/05/14-Sat/apache-proxy
        Logfile format: Automatic detection

I _am_ getting the report that I ask for displayed in my browser. However, unexplained errors make me nervous (apart from being untidy) and I don't want to open up access to analog until I know what's happening here. If you need more detail then just ask, but there's nothing else that looks helpful to me.

I didn't get any warnings from using analog directly - I did as I was told and got this working before trying the cgi program :) (but I didn't do anything to the default settings for warnings or debug).
[analog-help] Analog help
Hi, I have read through lots of stuff but am unable to find how MRTG graphs can be constructed from Analog output. Can anyone help? Thanks.
Re: [analog-help] Virtual Hosts report
Cameron Biggart wrote:

> Dear all
>
> We have several virtual hosts defined in our apache.ctrl file, all of them writing to the same logfile. I turned on the VHOST report and set the VHOSTFLOOR 0b, but when I run analog I get
>
>     analog: Warning R: Turning off empty Virtual Host Report
>
> Why is it not seeing the virtual hosts? I checked the documentation but could only find examples of running it with separate log files. Do I need to start splitting up our logs for the VHOSTs?

Does your logfile include a field indicating which vhost each entry was intended for?

Aengus
Re: [analog-help] Virtual Hosts report
Aengus wrote:

> Does your logfile include a field indicating which vhost each entry was intended for?

It's just using the standard Apache log format:

    CustomLog /var/log/apache/access.log combined

Is there somewhere I need to turn on the virtual hosts logging? They show up in the log file as:

    203.111.75.196 - - [23/May/2005:10:59:20 +1000] "GET /images/nahlogo.jpg HTTP/1.1" 304 - "http://www.natureandhealth.com.au/" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.8) Gecko/20050511 Firefox/1.0.4"
    203.111.75.196 - - [23/May/2005:10:59:35 +1000] "GET /images/aphlogo.jpg HTTP/1.1" 304 - "http://www.ausphotography.com.au/" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.8) Gecko/20050511 Firefox/1.0.4"
    211.28.83.114 - - [22/May/2005:06:47:18 +1000] "GET /images/acrlogo.gif HTTP/1.1" 200 1800 "http://www.australiancreative.com.au/" "Mozilla/4.0 (compatible; MSIE 5.22; Mac_PowerPC)"
    203.12.97.92 - - [25/May/2005:14:13:14 +1000] "GET /yaffa.css HTTP/1.0" 200 284 "http://news.yaffa.com.au/subs.htm" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)"

So you can see the 4 vhosts there in the log file, but Analog isn't seeing them somehow.

--
Regards
Cameron Biggart
IT Manager
Yaffa Publishing
(02) 9281 2333
[EMAIL PROTECTED]