Re: Custom Logging and User Tracking
At 15:22 13.02.2002 -0800, Ryan Parr wrote:

> Nothing special to the way these sites work. You can check out
> http://www.rileyjames.com and http://www.ryanparr.com (the programming on
> the latter will leave you in awe :) I want to host my sites and have a
> decent usage statistics location, but I just can't seem to get the logging
> part down. I've got a long road ahead of me :)
>
> For instance, the code below logs the following on entrance to
> rileyjames.com (set up as a PerlFixupHandler):
>
> www.rileyjames.com  /                           Wed Feb 13 16:17:15 2002
> www.rileyjames.com  /index.html                 Wed Feb 13 16:17:15 2002
> www.rileyjames.com  /topnavigation.htm          Wed Feb 13 16:17:15 2002
> www.rileyjames.com  /white.htm                  Wed Feb 13 16:17:15 2002
> www.rileyjames.com  /green.htm                  Wed Feb 13 16:17:15 2002
> www.rileyjames.com  /index1.htm                 Wed Feb 13 16:17:15 2002
> www.rileyjames.com  /topnav.css                 Wed Feb 13 16:17:15 2002
> www.rileyjames.com  /graphics/redarrow.gif      Wed Feb 13 16:17:15 2002
> www.rileyjames.com  /border.css                 Wed Feb 13 16:17:15 2002
> www.rileyjames.com  /text.css                   Wed Feb 13 16:17:15 2002
> www.rileyjames.com  /graphics/frontpaglogo.gif  Wed Feb 13 16:17:15 2002

The problem you seem to be having is this:

1) The client is sent the main page as HTML (index.html).

2) As this file includes many references to other URLs (for images, CSS,
frames, etc.), the client knows that it'll need these files, so it sends out
new requests for them, many of them at the same time.

3) Apache processes these new requests without knowing that they came from
one other request.

You're faced with one problem (and feature) of the HTTP protocol: it's
stateless, so the httpd could not possibly know that any requests are
linked. You have some ways of working around this, though. It's been tried
over and over again, and as many people know, getting reliable statistics on
visits (etc.) is pretty hard.
Here are some possible solutions:

1) As you're using frames on rileyjames.com, you could log only visits to
/topnavigation.htm, which would be loaded only once. Of course, logging only
the number of visits is not really what you want.

2) Say that one IP can only be counted as visiting once within a certain
amount of time: for example, all visits after the first one from a specific
IP are ignored for 5 seconds. The problems here are that:

   - IPs aren't reliable enough as a method (there is no IP-to-computer
     match, because of NAT and proxies);
   - you might not have reached the logging phase of the first page when the
     other pages are requested (although this is unlikely).

3) What I think is the best solution: declare only some pages as loggable.
Either log only specific pages, say the HTML files of your choice and some
big pictures, *or* add a query string to the pages you want logged / don't
want logged. Say: /graphics/frontpaglogo.gif?log=yes would still get you the
image, but you can read the query string in the logger and check whether to
log or not.

There are probably many other solutions... But just remember that while the
line

    return DECLINED unless $r->is_main();

is useful for subrequests, it won't help you a bit in your situation here,
because the requests you're seeing are indeed separate.

-- 
Per Einar Ellefsen
[EMAIL PROTECTED]
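[Solution 3 above is language-agnostic; here is an illustrative sketch in Python rather than mod_perl. The allow-list paths and the log=yes parameter name are just the examples from the message, not anything fixed.]

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical allow-list of pages that count as "visits".
LOGGABLE_PATHS = {"/index.html", "/topnavigation.htm"}

def should_log(url):
    """Decide whether a request URL should be counted.

    A request is logged if its path is on the explicit allow-list, or if
    its query string carries log=yes (e.g. /graphics/logo.gif?log=yes),
    so directly-linked images can still opt in.
    """
    parts = urlparse(url)
    if parts.path in LOGGABLE_PATHS:
        return True
    qs = parse_qs(parts.query)
    return qs.get("log", ["no"])[0] == "yes"

print(should_log("/index.html"))                 # True
print(should_log("/graphics/logo.gif?log=yes"))  # True
print(should_log("/topnav.css"))                 # False
```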
Re: Custom Logging and User Tracking
On Wed, 13 Feb 2002, Ryan Parr wrote:

> I checked it out and it's a good mod. I've already got the ability to log
> the data, however. The issue that I'm having is that I can't seem to get
> only 1 log entry per hit. I can't seem to get around the fact that wherever
> I put my mod (PerlFixupHandler, PerlHandler, PerlLogHandler) or whatever
> statement I use ($r->is_main(), $r->is_initial_req()), I'm getting not only
> the requested page but every other request from the initial request. For
> instance, I'm getting and logging every graphic, CSS, JavaScript, or any
> other file that's linked in. But for my user tracking I want *just* the
> initial request, not that and all subrequests. I just can't seem to figure
> out how to do that. $r->is_main() and $r->is_initial_req() return true for
> everything.

Maybe I'm wrong, but is_initial_req is just there to keep track of INTERNAL
redirects: if you internally redirect a file to another one, you don't want
to process the redirected one (because you don't care anymore). I think the
best fix for your problem would be something like this:

1) Use a shared hash (IPC::SharedCache or something like this).
2) Compute an MD5 hash from some data about the user, like the IP and the
   browser string.
3) Put the MD5 (or whatever) in the hash, and set a timeout value for it.
4) On every request, look up the key and check/update the timeout value, and
   log the request only if the key doesn't exist in the hash or the timeout
   has occurred.
5) Run a cleanup every 50/100 requests and clear the cache of timed-out
   values (for shared memory, this could also be done by a script run from
   crontab).

I hope you get what I mean, and I hope it helps ;)))

ciao
nico
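[The five steps above can be sketched in any language; here is an illustrative Python version. A plain dict stands in for the shared cache (IPC::SharedCache in the mod_perl case), and the 5-second timeout and cleanup interval are just example values.]

```python
import hashlib
import time

TIMEOUT = 5.0        # seconds during which repeat hits are not re-logged
CLEANUP_EVERY = 100  # purge expired entries every N requests (step 5)

seen = {}            # stand-in for a shared cache; maps key -> last-seen time
request_count = 0

def visitor_key(ip, user_agent):
    """Hash IP + browser string into a fixed-size cache key (step 2)."""
    return hashlib.md5(("%s|%s" % (ip, user_agent)).encode()).hexdigest()

def should_log(ip, user_agent, now=None):
    """Return True if this request counts as a new visit (steps 3-4)."""
    global request_count
    now = time.time() if now is None else now
    request_count += 1

    # Periodic cleanup of timed-out entries (step 5).
    if request_count % CLEANUP_EVERY == 0:
        for k in [k for k, t in seen.items() if now - t > TIMEOUT]:
            del seen[k]

    key = visitor_key(ip, user_agent)
    last = seen.get(key)
    seen[key] = now  # insert or refresh the timeout value (step 4)
    return last is None or now - last > TIMEOUT

# The first hit logs; the flurry of follow-up asset requests does not.
print(should_log("10.0.0.1", "Mozilla", now=0.0))  # True
print(should_log("10.0.0.1", "Mozilla", now=0.2))  # False
print(should_log("10.0.0.1", "Mozilla", now=6.0))  # True
```

In a real multi-process server, `seen` would live in shared memory or an external store, since each Apache child has its own interpreter.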
Re: Custom Logging and User Tracking
On Wed, 13 Feb 2002, Ryan Parr wrote:

> I'm trying to set up some custom logging, including the whole user/session
> tracking thing. The problem that I'm encountering is how to log the page
> that was requested and ignore all the additional files that may be included
> in the page, i.e. graphics.

    return DECLINED if $r->content_type =~ m!^image/!;

?

 - ask

-- 
ask bjoern hansen, http://ask.netcetera.dk/         !try; do();
more than a billion impressions per week, http://valueclick.com
Re: Custom Logging and User Tracking
Unfortunately we do have areas on the site where a link would point directly
to a graphic file, which I'd like to log. Otherwise that would work quite
well.

I had always thought that these extra requests would be subrequests. If not,
though, what would be the definition of a subrequest?

-- Ryan

----- Original Message -----
From: Ask Bjoern Hansen [EMAIL PROTECTED]
To: Ryan Parr [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Wednesday, February 13, 2002 12:02 PM
Subject: Re: Custom Logging and User Tracking

> On Wed, 13 Feb 2002, Ryan Parr wrote:
>
> > I'm trying to set up some custom logging, including the whole
> > user/session tracking thing. The problem that I'm encountering is how to
> > log the page that was requested and ignore all the additional files that
> > may be included in the page, i.e. graphics.
>
>     return DECLINED if $r->content_type =~ m!^image/!;
>
> ?
>
>  - ask
>
> -- 
> ask bjoern hansen, http://ask.netcetera.dk/         !try; do();
> more than a billion impressions per week, http://valueclick.com
Re: Custom Logging and User Tracking
Nothing special to the way these sites work. You can check out
http://www.rileyjames.com and http://www.ryanparr.com (the programming on
the latter will leave you in awe :) I want to host my sites and have a
decent usage statistics location, but I just can't seem to get the logging
part down. I've got a long road ahead of me :)

For instance, the code below logs the following on entrance to
rileyjames.com (set up as a PerlFixupHandler):

www.rileyjames.com  /                           Wed Feb 13 16:17:15 2002
www.rileyjames.com  /index.html                 Wed Feb 13 16:17:15 2002
www.rileyjames.com  /topnavigation.htm          Wed Feb 13 16:17:15 2002
www.rileyjames.com  /white.htm                  Wed Feb 13 16:17:15 2002
www.rileyjames.com  /green.htm                  Wed Feb 13 16:17:15 2002
www.rileyjames.com  /index1.htm                 Wed Feb 13 16:17:15 2002
www.rileyjames.com  /topnav.css                 Wed Feb 13 16:17:15 2002
www.rileyjames.com  /graphics/redarrow.gif      Wed Feb 13 16:17:15 2002
www.rileyjames.com  /border.css                 Wed Feb 13 16:17:15 2002
www.rileyjames.com  /text.css                   Wed Feb 13 16:17:15 2002
www.rileyjames.com  /graphics/frontpaglogo.gif  Wed Feb 13 16:17:15 2002

The code follows:

    sub handler {
        my $r = shift;

        return DECLINED unless $r->is_main();
        # Same behavior when:
        # return DECLINED unless $r->is_initial_req();

        open TRACK, ">>/usr/local/www/usertracker.txt"
            or die "Couldn't open log: $!";
        print TRACK join("\t", ($r->hostname, $r->uri, scalar(localtime))), "\n";
        close TRACK;

        return DECLINED;
    }

-- Ryan

----- Original Message -----
From: Ask Bjoern Hansen [EMAIL PROTECTED]
To: Ryan Parr [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Wednesday, February 13, 2002 2:34 PM
Subject: Re: Custom Logging and User Tracking

> On Wed, 13 Feb 2002, Ryan Parr wrote:
>
> > Unfortunately we do have areas on the site where a link would point
> > directly to a graphic file, which I'd like to log. Otherwise that would
> > work quite well.
> >
> > I had always thought that these extra requests would be subrequests. If
> > not, though, what would be the definition of a subrequest?
A subrequest is when, during the processing of the original request, Apache
makes a new internal request. What you are looking for might be the Referer
header; but without knowing more exactly how your site works and which URLs
you use, it's hard to tell.

 - ask

-- 
ask bjoern hansen, http://ask.netcetera.dk/         !try; do();
more than a billion impressions per week, http://valueclick.com
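[The Referer idea above can be sketched roughly as follows; this is an illustrative Python version, not the poster's code, and the hostname is just the site from this thread. Embedded images and stylesheets are normally fetched with the containing page as their Referer, so a request with no Referer, or an external one, is a reasonable proxy for a page visit. The caveat is that browsers and proxies may omit or falsify the header.]

```python
from urllib.parse import urlparse

SITE_HOST = "www.rileyjames.com"  # example hostname; substitute your own

def is_entry_request(referer):
    """Treat a request as a page visit when its Referer is absent or
    points outside the site; inline assets carry an internal Referer."""
    if not referer:
        return True  # typed-in URL, bookmark, or a browser that sends none
    return urlparse(referer).hostname != SITE_HOST

print(is_entry_request(None))                                    # True
print(is_entry_request("http://www.google.com/search?q=riley"))  # True
print(is_entry_request("http://www.rileyjames.com/index.html"))  # False
```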
Re: Custom Logging and User Tracking
On Wed, 13 Feb 2002, Ryan Parr wrote:

> The code follows:
>
>     sub handler {
>         my $r = shift;
>
>         return DECLINED unless $r->is_main();
>         # Same behavior when:
>         # return DECLINED unless $r->is_initial_req();
>
>         open TRACK, ">>/usr/local/www/usertracker.txt"
>             or die "Couldn't open log: $!";
>         print TRACK join("\t", ($r->hostname, $r->uri, scalar(localtime))), "\n";
>         close TRACK;
>
>         return DECLINED;
>     }

Hmm, no file locking for something being used by multiple processes? Could
be problematic. Is print atomic? Better be sure.

Also, if you just open the filehandle once (not in the handler) this'd
probably be a bit quicker. And for increased perceived speed, have the
writing occur in a cleanup handler.

-dave

/*==
www.urth.org
we await the New Sun
==*/
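[The locking concern above is worth making concrete; here is an illustrative Python sketch using POSIX advisory locking (fcntl.flock), which is one way to keep concurrent server processes from interleaving partial log lines. The path and field names are just the examples from this thread.]

```python
import fcntl

def append_log_line(path, hostname, uri, timestamp):
    """Append one tab-separated record under an exclusive advisory lock,
    so two processes cannot interleave halves of their lines."""
    line = "\t".join((hostname, uri, timestamp)) + "\n"
    with open(path, "a") as fh:
        fcntl.flock(fh, fcntl.LOCK_EX)  # blocks until the lock is free;
        fh.write(line)                  # released when fh is closed

append_log_line("/tmp/usertracker.txt",
                "www.rileyjames.com", "/", "Wed Feb 13 16:17:15 2002")
```

On many systems a single small write to a file opened in append mode is effectively atomic anyway, but the explicit lock removes the guesswork ("Is print atomic? Better be sure"). Note fcntl is POSIX-only.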
Re: Custom Logging and User Tracking
All good points. This code is only to test mod_perl Perl*Handler mechanisms,
to ensure that I can get the proper log. Once I figure out the necessary
routines to do this, I'll integrate it with the rest of my mod, which logs
request and session info to a database.

-- Ryan

----- Original Message -----
From: Dave Rolsky [EMAIL PROTECTED]
To: Ryan Parr [EMAIL PROTECTED]
Cc: mod_perl list [EMAIL PROTECTED]
Sent: Wednesday, February 13, 2002 4:23 PM
Subject: Re: Custom Logging and User Tracking

> On Wed, 13 Feb 2002, Ryan Parr wrote:
>
> > The code follows:
> >
> >     sub handler {
> >         my $r = shift;
> >
> >         return DECLINED unless $r->is_main();
> >         # Same behavior when:
> >         # return DECLINED unless $r->is_initial_req();
> >
> >         open TRACK, ">>/usr/local/www/usertracker.txt"
> >             or die "Couldn't open log: $!";
> >         print TRACK join("\t", ($r->hostname, $r->uri, scalar(localtime))), "\n";
> >         close TRACK;
> >
> >         return DECLINED;
> >     }
>
> Hmm, no file locking for something being used by multiple processes? Could
> be problematic. Is print atomic? Better be sure.
>
> Also, if you just open the filehandle once (not in the handler) this'd
> probably be a bit quicker. And for increased perceived speed, have the
> writing occur in a cleanup handler.
>
> -dave
>
> /*==
> www.urth.org
> we await the New Sun
> ==*/
Re: Custom Logging and User Tracking
I checked it out and it's a good mod. I've already got the ability to log
the data, however. The issue that I'm having is that I can't seem to get
only 1 log entry per hit. I can't seem to get around the fact that wherever
I put my mod (PerlFixupHandler, PerlHandler, PerlLogHandler) or whatever
statement I use ($r->is_main(), $r->is_initial_req()), I'm getting not only
the requested page but every other request from the initial request. For
instance, I'm getting and logging every graphic, CSS, JavaScript, or any
other file that's linked in. But for my user tracking I want *just* the
initial request, not that and all subrequests. I just can't seem to figure
out how to do that. $r->is_main() and $r->is_initial_req() return true for
everything.

KeepAlive is on. This happens with MSIE, Netscape, Lynx, and Opera; I would
assume Konqueror too. I know that I have to be missing something pretty
basic; I'm new to programming in mod_perl.

-- Ryan

----- Original Message -----
From: Andrew Moore [EMAIL PROTECTED]
To: Ryan Parr [EMAIL PROTECTED]
Cc: mod_perl list [EMAIL PROTECTED]
Sent: Wednesday, February 13, 2002 5:00 PM
Subject: Re: Custom Logging and User Tracking

> On Wed, Feb 13, 2002 at 04:42:02PM -0800, Ryan Parr wrote:
>
> > All good points. This code is only to test mod_perl Perl*Handler
> > mechanisms, to ensure that I can get the proper log. Once I figure out
> > the necessary routines to do this, I'll integrate it with the rest of my
> > mod, which logs request and session info to a database.
> >
> > -- Ryan
>
> You might check out Ask's Apache::DBILogger module. It's pretty simple
> source, so you can make it log whatever you like pretty easily.
>
> -Andy