Re: [analog-help] MSN Bot - related

2005-05-24 Thread Jeremy Wadsack
Likely the number of results on MSN Search is an estimate until you 
delve deeper into the result set. If you are having trouble using MSN 
Search, I think the best forum is the Feedback link at the bottom of MSN 
pages.


Whether to block a search engine from your site is a decision only you
can make.


Innerlab wrote:


I hope this is not too off topic.

I used some software to check the link popularity of my important web pages in
several search engines, and all except MSN displayed at least one result for
each. I also typed my domain name directly into the MSN search bar. It
initially reported about 1000 results, along with the links needed to page
through them (e.g. Page1, Page2, Page3 ... Next). But after I clicked on page
two, the total number of results containing my domain dropped to 20, and the
links to further results just disappeared.

I wonder why that would be. Considering that the MSN bot is really
bombarding my website every day, and it's not really doing much for me,
should I just BLOCK it?

+
|  TO UNSUBSCRIBE from this list:
|http://lists.meer.net/mailman/listinfo/analog-help
|
|  Usenet version: news://news.gmane.org/gmane.comp.web.analog.general
|  List archives:  http://www.analog.cx/docs/mailing.html#listarchives
+


 



--
Jeremy Wadsack
Seven Simple Machines



Re: [analog-help] using analog to pre-parse IIS logs for spiders/bots

2005-05-24 Thread Aengus
Will wrote:
 Greetings everyone.  This is my first post so I apologize in advance
 if I'm doing something wrong.

 I have inherited stats duties for our company which has about 20
 domains and lots and lots of IIS logs.  We are using Urchin and cannot
 switch at this time.  I want to use Analog to pre-parse my IIS log
 files on a daily basis by removing all log entries made by spiders (as
 identified by some external machine-generated spiders.cfg file).

 Urchin has very crappy and limited functionality for filtering
 spiders.  It is clearly not doing a good job identifying crawlers so I
 figured this was my best bet, to pre-parse using analog before Urchin
 gets its grubby hands on my log files.

 Can anyone help me with a .cfg file and command line syntax for
 accomplishing this?  I don't want it to do any reporting or analyzing,
 just output the identical IIS log but with all spider/bot entries
 removed.

Analog won't modify your logfiles - it will only read them in and report on
the contents. If you want to physically exclude robots/spiders from your
logs, you can use something as simple as the FINDSTR command included in
Windows, along with a list of strings that identify spiders. You can create
that list from information on http://www.robotstxt.org/ or you could create
a custom list by using Analog to analyse your logs for behaviour that you
identify as spider-like. (For example, you could run a Full Browser report
to get a list of browser names that are obviously spiders).

You would use FINDSTR like this to create a no spider version of your
logfile:

FINDSTR /V /I /G:spiders.txt ex050523.log > ns050523.log

spiders.txt would contain a list of strings that match known spiders in your
logfile. That might be agent strings or host addresses. For example, it
might contain the following lines:

googlebot
msnbot
slurp
10.123.45.67

(where 10.123.45.67 is the IP address of a spider, for example).

Note that this approach can have unexpected consequences. If you have a lot
of referrals from a page called slurpy.htm, for example, it would also be
excluded by the reference to the Inktomi spider in the list above.
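
One way to reduce that risk is to match agent strings only against the
user-agent column and IP addresses only against the client-address column,
rather than against the whole line. The following is a minimal Python sketch,
not a tested replacement for FINDSTR. It assumes the logs are in IIS W3C
extended format with a #Fields: directive that names the c-ip and
cs(User-Agent) columns; the function name filter_spiders and the sample
pattern lists are hypothetical.

```python
def filter_spiders(lines, agent_substrings, spider_ips):
    """Yield W3C extended log lines that do not look like spider hits.

    Assumption: a '#Fields:' directive names the space-separated columns,
    and spaces inside values are logged as '+', so a plain split() works.
    """
    agents = [s.lower() for s in agent_substrings]
    ips = set(spider_ips)
    agent_col = ip_col = None
    for line in lines:
        if line.startswith("#"):
            if line.startswith("#Fields:"):
                names = line.split()[1:]
                agent_col = (names.index("cs(User-Agent)")
                             if "cs(User-Agent)" in names else None)
                ip_col = names.index("c-ip") if "c-ip" in names else None
            yield line  # keep directives so the output is still a usable log
            continue
        fields = line.split()
        agent = ""
        if agent_col is not None and agent_col < len(fields):
            agent = fields[agent_col].lower()
        ip = ""
        if ip_col is not None and ip_col < len(fields):
            ip = fields[ip_col]
        # Drop the entry only if the agent or the client address matches.
        if ip in ips or any(s in agent for s in agents):
            continue
        yield line
```

With this approach a hit referred from slurpy.htm is kept, because the
referrer column is never compared against the agent strings.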

Aengus



[analog-help] CGI warning about unrequested logfile

2005-05-24 Thread George Klein
I'm a new user of analog and I'm feeling my way through slowly, but I'm
getting a warning that I don't understand.  I'm hoping that someone
will be able to enlighten me.

When I use the cgi program, I get the following in my apache error log:

[Tue May 24 15:22:10 2005] [error] [client aa.bb.cc.dd] [Tue May 24 15:22:10 2005] /usr/bin/analog: analog version 6.0/Unix, referer: http://sam/cgi-bin/anlgform.pl
[Tue May 24 15:22:10 2005] [error] [client aa.bb.cc.dd] [Tue May 24 15:22:10 2005] /usr/bin/analog: Warning F: Failed to open logfile, referer: http://sam/cgi-bin/anlgform.pl
[Tue May 24 15:22:10 2005] [error] [client aa.bb.cc.dd]   /usr/bin/logfile.log: ignoring it, referer: http://sam/cgi-bin/anlgform.pl
[Tue May 24 15:22:10 2005] [error] [client aa.bb.cc.dd]   (For help on all errors and warnings, see docs/errors.html), referer: http://sam/cgi-bin/anlgform.pl

Now from reading the docs etc I presume this means that analog is
looking for the default logfile in the executable directory.  Only
problem is that I'm actually specifying a logfile.  Here's the output
from the cgi using qv=1:

CONFIGFILE /etc/analog/proxy.cfg
CGI ON
DNS NONE
WARNINGS FL
LOGFILE /var/log/05/14-Sat/apache-proxy
SUBMIT Go
DEBUG -C
OUTFILE stdout

I also tried putting SETTINGS ON but the logfile part of that which
follows looks fine to me:

Configuration files read:
  /etc/analog/analog.cfg
  standard input
  /etc/analog/proxy.cfg
Warning types on: FL
Debugging types on: none
Reading cache files:
  none
Reading logfiles:
  /var/log/05/14-Sat/apache-proxy
Logfile format:
  Automatic detection

I _am_ getting the report that I ask for displayed in my browser. 
However, unexplained errors make me nervous (apart from being untidy)
and I don't want to open up access to analog until I know what's
happening here.

If you need more detail then just ask but there's nothing that looks
helpful to me.  I didn't get any warnings from using analog directly - I
did as I was told and got this working before trying the cgi program :)
(but I didn't do anything to the default settings for warnings or
debug).



[analog-help] Analog help

2005-05-24 Thread Shweta R

Hi,

I have read through lots of stuff but am unable to find how MRTG graphs can 
be constructed from Analog output. Can anyone help?


Thanks.





Re: [analog-help] Virtual Hosts report

2005-05-24 Thread Aengus
Cameron Biggart wrote:
 Dear all

 We have several virtual hosts defined in our apache.ctrl file, all of
 them write to the same logfile. I turned on the VHOST report and set
 the VHOSTFLOOR 0b but when I run analog I get

 analog: Warning R: Turning off empty Virtual Host Report

 Why is it not seeing the virtual hosts? I tried the documents but
 could only find examples of running it with separate log files, do I
 need to start splitting up our logs for the VHOSTs?

Does your logfile include a field indicating which vhost each entry was
intended for?

Aengus



Re: [analog-help] Virtual Hosts report

2005-05-24 Thread Cameron Biggart

Aengus wrote:

Cameron Biggart wrote:


Dear all

We have several virtual hosts defined in our apache.ctrl file, all of
them write to the same logfile. I turned on the VHOST report and set
the VHOSTFLOOR 0b but when I run analog I get

analog: Warning R: Turning off empty Virtual Host Report

Why is it not seeing the virtual hosts? I tried the documents but
could only find examples of running it with separate log files, do I
need to start splitting up our logs for the VHOSTs?



Does your logfile include a field indicating which vhost each entry was
intended for?

Aengus



It's just using the standard Apache log format

CustomLog /var/log/apache/access.log combined

Is there somewhere I need to turn on the virtual hosts logging? They 
show up in the log file as:
203.111.75.196 - - [23/May/2005:10:59:20 +1000] "GET /images/nahlogo.jpg HTTP/1.1" 304 - "http://www.natureandhealth.com.au/" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.8) Gecko/20050511 Firefox/1.0.4"
203.111.75.196 - - [23/May/2005:10:59:35 +1000] "GET /images/aphlogo.jpg HTTP/1.1" 304 - "http://www.ausphotography.com.au/" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.8) Gecko/20050511 Firefox/1.0.4"
211.28.83.114 - - [22/May/2005:06:47:18 +1000] "GET /images/acrlogo.gif HTTP/1.1" 200 1800 "http://www.australiancreative.com.au/" "Mozilla/4.0 (compatible; MSIE 5.22; Mac_PowerPC)"
203.12.97.92 - - [25/May/2005:14:13:14 +1000] "GET /yaffa.css HTTP/1.0" 200 284 "http://news.yaffa.com.au/subs.htm" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)"


So you can see the 4 vhosts there in the log file but Analog isn't 
seeing them somehow.
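
To make the question about a vhost field concrete, here is a minimal Python
sketch that splits one of the entries above into the fields of the standard
Apache "combined" format. It assumes the usual layout with the request,
referrer and agent in double quotes; the names COMBINED and parse_combined
are hypothetical. The point it illustrates is that the site names in these
entries sit in the referrer field, and the combined format has no separate
field for the virtual host that served the request.

```python
import re

# Apache "combined": %h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-agent}i"
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_combined(line):
    """Return the combined-format fields as a dict, or None if no match."""
    m = COMBINED.match(line)
    return m.groupdict() if m else None

sample = (
    '203.12.97.92 - - [25/May/2005:14:13:14 +1000] '
    '"GET /yaffa.css HTTP/1.0" 200 284 '
    '"http://news.yaffa.com.au/subs.htm" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)"'
)
fields = parse_combined(sample)
# The only place a site name appears is the referrer, i.e. the page that
# linked to /yaffa.css, not the virtual host that answered the request.
```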


--
Regards

Cameron Biggart
IT Manager
Yaffa Publishing
(02) 9281 2333
[EMAIL PROTECTED]