Re: Custom Logging and User Tracking

2002-02-14 Thread Per Einar Ellefsen

At 15:22 13.02.2002 -0800, Ryan Parr wrote:
Nothing special to the way these sites work. You can check out
http://www.rileyjames.com and http://www.ryanparr.com (the programming on
the latter will leave you in awe :) I want to host my sites and have a
decent usage statistics location, but I just can't seem to get the logging
part down. I've got a long road ahead of me :)

For instance, the code below logs the following on entrance to
rileyjames.com (setup as PerlFixupHandler):
www.rileyjames.com  /   Wed Feb 13 16:17:15 2002
www.rileyjames.com  /index.html Wed Feb 13 16:17:15 2002
www.rileyjames.com  /topnavigation.htm  Wed Feb 13 16:17:15 2002
www.rileyjames.com  /white.htm  Wed Feb 13 16:17:15 2002
www.rileyjames.com  /green.htm  Wed Feb 13 16:17:15 2002
www.rileyjames.com  /index1.htm Wed Feb 13 16:17:15 2002
www.rileyjames.com  /topnav.css Wed Feb 13 16:17:15 2002
www.rileyjames.com  /graphics/redarrow.gif  Wed Feb 13 16:17:15 2002
www.rileyjames.com  /border.css Wed Feb 13 16:17:15 2002
www.rileyjames.com  /text.css   Wed Feb 13 16:17:15 2002
www.rileyjames.com  /graphics/frontpaglogo.gif  Wed Feb 13 16:17:15 2002


The problem you seem to be having is that:
1) The client is sent the main page as HTML (index.html)
2) As this file includes many references to other URLs, for images, CSS, 
frames, etc.., the client knows that it'll need these files, so sends out 
new requests for these files, many of them at the same time.
3) Apache processes these new requests, without knowing that they came from 
one other request.

You're faced with one problem (and feature) of the HTTP protocol: it's 
stateless, so the httpd could not possibly know that any requests are linked.
You have some ways of working around this, though. It's been tried over and 
over again, and as many people know, getting reliable statistics on visits 
(etc) is pretty hard. Here are some possible solutions:
1) as you're using frames on rileyjames.com, you could log only visits on 
/topnavigation.htm, which would be loaded only once. Of course, logging the 
number of visits is not really what you want.
2) Count one IP as a single visit within a certain amount of time: for 
example, ignore all requests from a specific IP for 5 seconds after the 
first one. The problems here are that:
 - IPs aren't reliable enough as a method (there is no one-to-one 
IP-to-computer match, because of NAT and proxies)
 - You might not have reached the logging phase of the first page 
when the other pages are requested (although this is unlikely)
3) What I think is the best solution: declare only some pages as loggable. 
Either log only specific pages, say the HTML files of your choice and some 
big pictures, *or* add a query string to the pages you want logged/don't 
want logged...
Say: /graphics/frontpaglogo.gif?log=yes would still get you the image, but 
you can get the query string in the logger, and check whether to log or not.
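That idea could be sketched as a PerlLogHandler along these lines (the package name, the log file path, and the log=yes convention are illustrative, not something from the thread):

```perl
package My::QueryLogger;
use strict;
use Apache::Constants qw(OK DECLINED);

sub handler {
    my $r = shift;

    # In list context, $r->args returns the parsed query string pairs.
    my %args = $r->args;

    # Log only requests explicitly tagged with ?log=yes.
    return DECLINED unless ($args{log} || '') eq 'yes';

    open my $fh, '>>', '/usr/local/www/usertracker.txt'
        or return DECLINED;
    print $fh join("\t", $r->hostname, $r->uri, scalar localtime), "\n";
    close $fh;
    return OK;
}
1;
```

Configured with `PerlLogHandler My::QueryLogger`, this leaves all other requests untouched.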

There are probably many other solutions... But just remember that while the 
line return DECLINED unless($r->is_main()); is useful for subrequests, it 
won't help you a bit in your situation here, because the requests you're 
seeing are indeed separate.



-- 
Per Einar Ellefsen
[EMAIL PROTECTED]




Re: Custom Logging and User Tracking

2002-02-14 Thread masta



On Wed, 13 Feb 2002, Ryan Parr wrote:

 I checked it out and it's a good mod. I've already got the ability to log
 the data, however. The issue that I'm having is that I can't seem to get
 only one log entry per hit. I can't seem to get around the fact that wherever
 I put my mod (PerlFixupHandler, PerlHandler, PerlLogHandler) or whatever
 statement I use ($r->is_main(), $r->is_initial_req()) I'm getting not only
 the requested page but every other request from the initial request. For
 instance, I'm getting and logging every graphic, css, javascript, or any
 other file that's linked in. But for my user tracking I want *just* the
 initial request, not that and all subrequests. I just can't seem to figure
 out how to do that. $r->is_main() and $r->is_initial_req() return true for
 everything.

Maybe I'm wrong, but is_initial_req is just there to keep track of
INTERNAL redirects: if you internally redirect one file to another, you may
not want to process the redirected request (because you don't care about it
anymore).

I think the best solution for your problem would be something like this...

1.) use a shared hash (IPC::SharedCache or something like it)
2.) compute an MD5 hash from some data about the user, like IP and
browser string
3.) put the MD5 (or whatever key) in the hash, and set a timeout value for
it
4.) on every request, look up the key and check/update the timeout value;
log the request only if the key doesn't exist inside the hash or the
timeout has expired
5.) run a cleanup every 50/100 requests, and clear timed-out values from
the cache (for shared memory, this could also be done by a script run
from crontab)
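A rough sketch of steps 2-5 might look like this. Note the hash here is per-process only, standing in for the shared cache; the package name and the 5-second window are made up for illustration:

```perl
package My::DedupLogger;
use strict;
use Digest::MD5 qw(md5_hex);
use Apache::Constants qw(DECLINED);

# Per-process hash for illustration only; with preforked Apache children
# you would need IPC::SharedCache (or similar) so all processes share it.
my %seen;
my $TIMEOUT = 5;    # seconds during which repeat requests count as one visit

sub handler {
    my $r = shift;

    # Step 2: key the client on IP plus browser string.
    my $key = md5_hex(($r->connection->remote_ip || '')
                      . ($r->header_in('User-Agent') || ''));
    my $now = time;

    # Steps 3-4: log only if the key is new or its timeout has expired.
    if (!exists $seen{$key} || $now - $seen{$key} > $TIMEOUT) {
        warn sprintf "VISIT %s %s\n", $r->hostname, $r->uri;
    }
    $seen{$key} = $now;

    # Step 5: occasionally purge expired keys so the hash can't grow forever.
    if (keys %seen > 100) {
        delete @seen{ grep { $now - $seen{$_} > $TIMEOUT } keys %seen };
    }
    return DECLINED;
}
1;
```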


I hope you get what I mean, and I hope it helps ;)))


ciao nico





Custom Logging and User Tracking

2002-02-13 Thread Ryan Parr



I'm trying to set up some custom logging, including the whole User/Session
tracking thing. The problem that I'm encountering is how to log the page
that was requested and ignore all the additional files that may be included
in the page, i.e. graphics, without trying to maintain session uniqueness
by comparing mod_unique_id values.

return DECLINED unless($r->is_main()); does nothing
return DECLINED unless($r->is_initial_req()); does nothing

PerlFixupHandler logs every included file (is this what a subrequest is?)
PerlLogHandler logs every included file
PerlHandler only logs the initial request, but only logs for the / URI
request. No other URIs are logged.

my $code = <<EO_CODE_SAMPLE;
sub handler {
    my $r = shift;

    open TRACK, ">>/usr/local/www/usertracker.txt"
        or die "Couldn't open log: $!";
    print TRACK join("\t",
        ($r->hostname, $r->uri, scalar(localtime),
         $r->connection->remote_ip, $r->connection->hostname || '-',
         $r->header_in('Referer') || '-', $r->header_in('User-agent'))), "\n";
    close TRACK;
    return DECLINED;
}
EO_CODE_SAMPLE


Re: Custom Logging and User Tracking

2002-02-13 Thread Ask Bjoern Hansen

On Wed, 13 Feb 2002, Ryan Parr wrote:

 I'm trying to setup some custom logging including the whole
 User/Session tracking thing. The problem that I'm encountering is
 how to log for the page that was requested and ignore all the
 additional files that may be included in the page. I.e. graphics.

return DECLINED if $r->content_type =~ m!^image/!;

?


 - ask

-- 
ask bjoern hansen, http://ask.netcetera.dk/ !try; do();
more than a billion impressions per week, http://valueclick.com




Re: Custom Logging and User Tracking

2002-02-13 Thread Ryan Parr

Unfortunately we do have areas on the site where a link would point directly
to a graphic file, which I'd like to log. Otherwise that would work quite
well.

I had always thought that these extra requests would be subrequests. If not,
though, what would be the definition of a sub-request?

-- Ryan

- Original Message -
From: Ask Bjoern Hansen [EMAIL PROTECTED]
To: Ryan Parr [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Wednesday, February 13, 2002 12:02 PM
Subject: Re: Custom Logging and User Tracking


 On Wed, 13 Feb 2002, Ryan Parr wrote:

  I'm trying to setup some custom logging including the whole
  User/Session tracking thing. The problem that I'm encountering is
  how to log for the page that was requested and ignore all the
  additional files that may be included in the page. I.e. graphics.

 return DECLINED if $r->content_type =~ m!^image/!;

 ?


  - ask

 --
 ask bjoern hansen, http://ask.netcetera.dk/ !try; do();
 more than a billion impressions per week, http://valueclick.com





Re: Custom Logging and User Tracking

2002-02-13 Thread Ryan Parr

Nothing special to the way these sites work. You can check out
http://www.rileyjames.com and http://www.ryanparr.com (the programming on
the latter will leave you in awe :) I want to host my sites and have a
decent usage statistics location, but I just can't seem to get the logging
part down. I've got a long road ahead of me :)

For instance, the code below logs the following on entrance to
rileyjames.com (setup as PerlFixupHandler):
www.rileyjames.com  /   Wed Feb 13 16:17:15 2002
www.rileyjames.com  /index.html Wed Feb 13 16:17:15 2002
www.rileyjames.com  /topnavigation.htm  Wed Feb 13 16:17:15 2002
www.rileyjames.com  /white.htm  Wed Feb 13 16:17:15 2002
www.rileyjames.com  /green.htm  Wed Feb 13 16:17:15 2002
www.rileyjames.com  /index1.htm Wed Feb 13 16:17:15 2002
www.rileyjames.com  /topnav.css Wed Feb 13 16:17:15 2002
www.rileyjames.com  /graphics/redarrow.gif  Wed Feb 13 16:17:15 2002
www.rileyjames.com  /border.css Wed Feb 13 16:17:15 2002
www.rileyjames.com  /text.css   Wed Feb 13 16:17:15 2002
www.rileyjames.com  /graphics/frontpaglogo.gif  Wed Feb 13 16:17:15 2002

The code follows:
sub handler {
    my $r = shift;
    return DECLINED unless ($r->is_main());
    # Same behavior when:
    # return DECLINED unless ($r->is_initial_req());

    open TRACK, ">>/usr/local/www/usertracker.txt"
        or die "Couldn't open log: $!";
    print TRACK join("\t", ($r->hostname, $r->uri, scalar(localtime))), "\n";
    close TRACK;
    return DECLINED;
}

-- Ryan

- Original Message -
From: Ask Bjoern Hansen [EMAIL PROTECTED]
To: Ryan Parr [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Wednesday, February 13, 2002 2:34 PM
Subject: Re: Custom Logging and User Tracking


 On Wed, 13 Feb 2002, Ryan Parr wrote:

  Unfortunately we do have areas on the site where a link would point
  directly to a graphic file, which I'd like to log. Otherwise that would
  work quite well.

  I had always thought that these extra requests would be subrequests. If
  not, though, what would be the definition of a sub-request?

 A subrequest is when, during processing of the original request, Apache
 makes a new internal request.

 What you are looking for might be the Referer header; but without
 knowing more exactly how your site works and what URLs you use, it's
 hard to tell.


  - ask

 --
 ask bjoern hansen, http://ask.netcetera.dk/ !try; do();
 more than a billion impressions per week, http://valueclick.com





Re: Custom Logging and User Tracking

2002-02-13 Thread Dave Rolsky

On Wed, 13 Feb 2002, Ryan Parr wrote:

 The code follows:
 sub handler {
     my $r = shift;
     return DECLINED unless ($r->is_main());
     # Same behavior when:
     # return DECLINED unless ($r->is_initial_req());

     open TRACK, ">>/usr/local/www/usertracker.txt"
         or die "Couldn't open log: $!";
     print TRACK join("\t", ($r->hostname, $r->uri, scalar(localtime))), "\n";
     close TRACK;
     return DECLINED;
 }

Hmm, no file locking for something being used by multiple processes?
Could be problematic.  Is print atomic?  Better be sure.

Also, if you just open the filehandle once (not in the handler) this'd
probably be a bit quicker.  And for increased perceived speed have the
writing occur in a cleanup handler.
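Dave's locking suggestion might look something like this (a sketch only: the log path comes from the thread, the package name is invented, and flock-based locking is one common way to make concurrent appends safe):

```perl
package My::LockedLogger;
use strict;
use Fcntl qw(:flock);
use Apache::Constants qw(DECLINED);

sub handler {
    my $r = shift;

    # Append mode plus an exclusive lock keeps concurrent Apache
    # children from interleaving partial lines in the log.
    open my $fh, '>>', '/usr/local/www/usertracker.txt'
        or die "Couldn't open log: $!";
    flock $fh, LOCK_EX or die "Couldn't lock log: $!";
    print $fh join("\t", $r->hostname, $r->uri, scalar localtime), "\n";
    close $fh;    # closing the filehandle releases the lock
    return DECLINED;
}
1;
```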


-dave

/*==
www.urth.org
we await the New Sun
==*/




Re: Custom Logging and User Tracking

2002-02-13 Thread Ryan Parr

All good points. This code is only to test mod_perl Perl*Handler mechanisms
to ensure that I can get the proper log. Once I figure out the necessary
routines to do this then I'll integrate it with the rest of my mod, which
logs request and session info to a database.

-- Ryan

- Original Message -
From: Dave Rolsky [EMAIL PROTECTED]
To: Ryan Parr [EMAIL PROTECTED]
Cc: mod_perl list [EMAIL PROTECTED]
Sent: Wednesday, February 13, 2002 4:23 PM
Subject: Re: Custom Logging and User Tracking


 On Wed, 13 Feb 2002, Ryan Parr wrote:

  The code follows:
  sub handler {
      my $r = shift;
      return DECLINED unless ($r->is_main());
      # Same behavior when:
      # return DECLINED unless ($r->is_initial_req());

      open TRACK, ">>/usr/local/www/usertracker.txt"
          or die "Couldn't open log: $!";
      print TRACK join("\t", ($r->hostname, $r->uri, scalar(localtime))), "\n";
      close TRACK;
      return DECLINED;
  }

 Hmm, no file locking for something being used by multiple processes?
 Could be problematic.  Is print atomic?  Better be sure.

 Also, if you just open the filehandle once (not in the handler) this'd
 probably be a bit quicker.  And for increased perceived speed have the
 writing occur in a cleanup handler.


 -dave

 /*==
 www.urth.org
 we await the New Sun
 ==*/





Re: Custom Logging and User Tracking

2002-02-13 Thread Ryan Parr

I checked it out and it's a good mod. I've already got the ability to log
the data, however. The issue that I'm having is that I can't seem to get
only one log entry per hit. I can't seem to get around the fact that wherever
I put my mod (PerlFixupHandler, PerlHandler, PerlLogHandler) or whatever
statement I use ($r->is_main(), $r->is_initial_req()) I'm getting not only
the requested page but every other request from the initial request. For
instance, I'm getting and logging every graphic, css, javascript, or any
other file that's linked in. But for my user tracking I want *just* the
initial request, not that and all subrequests. I just can't seem to figure
out how to do that. $r->is_main() and $r->is_initial_req() return true for
everything.

KeepAlive is on. This happens with MSIE, Netscape, Lynx, Opera; I would
assume Konqueror too.

I know that I have to be missing something pretty basic, I'm new to
programming in mod_perl.

-- Ryan

- Original Message -
From: Andrew Moore [EMAIL PROTECTED]
To: Ryan Parr [EMAIL PROTECTED]
Cc: mod_perl list [EMAIL PROTECTED]
Sent: Wednesday, February 13, 2002 5:00 PM
Subject: Re: Custom Logging and User Tracking



 On Wed, Feb 13, 2002 at 04:42:02PM -0800, Ryan Parr wrote:
  All good points. This code is only to test mod_perl Perl*Handler
  mechanisms to ensure that I can get the proper log. Once I figure out the
  necessary routines to do this then I'll integrate it with the rest of my
  mod, which logs request and session info to a database.
 
  -- Ryan
 

  You might check out Ask's Apache::DBILogger module. It's pretty simple
  source, so you can make it log whatever you like pretty easily.

 -Andy