Re: Logging user's movements
ben syverson wrote:
[...] The problem with this is that 99% of the time, the document won't contain any of the new node names, so mod_perl is wasting most of its time serving up cached HTML.

I have two suggestions:

1) Use a reverse proxy/cache and send proper Cache-Control and ETag/Content-Length headers, e.g.:

Last-Modified: Fri, 04 Feb 2005 11:11:11 GMT
Cache-Control: public, must-revalidate

2) Use a 307 Temporary Redirect and let thttpd serve it:

307 Temporary Redirect
Location: http://static.domain.com/WikiPage.html

RFC2616 13 Caching in HTTP
http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13

RFC2616 10.3.8 307 Temporary Redirect
http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.8

--
Regards
Christian Hansen
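A minimal sketch of the second suggestion as a mod_perl 2 response handler, using the pre-2.0 Apache:: namespace seen elsewhere in this digest. The static_url() mapping is hypothetical, and it assumes HTTP_TEMPORARY_REDIRECT is available from Apache::Const:

package My::WikiRedirect;

use strict;
use warnings;

use Apache::RequestRec ();
use Apache::Const -compile => qw(HTTP_TEMPORARY_REDIRECT);

sub handler {
    my $r = shift;

    # Point the client at the static copy and let thttpd serve it.
    $r->headers_out->set(Location => static_url($r->uri));
    return Apache::HTTP_TEMPORARY_REDIRECT;
}

# Hypothetical mapping from a wiki URI to its cached static URL.
sub static_url {
    my $uri = shift;
    $uri =~ s{^/wiki/}{};
    return "http://static.domain.com/$uri.html";
}

1;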
Re: Logging user's movements
First of all, thanks for the suggestions, everyone! It's giving me a lot to chew on. I now realize (sound of hand smacking forehead) that the main problem is not the list of links and tracking users, but rather the inline Wiki links:

On Feb 4, 2005, at 8:58 AM, Malcolm J Harwood wrote:
What are you doing with the data once you have it? Is there any reason that it needs to be 'live'?

Sort of -- imagine our Wiki scenario, but without delimiters (I think this is rather common in the .biz world). So if the "dinosaur" node contains: "Some scientists suggest that dinosaurs may actually have evolved from birds." it'll automagically link to the "birds" node. However, let's say the "scientist" node doesn't yet exist -- but when it does, we want it to link up. I wouldn't say it "needs to be live," but it would be nice to get that link happening sooner rather than later.

The way the system works now, it is live. Every time a page is generated, it stores the most recent node ID along with the cached file. The next time the page is viewed, it checks to see which node is the most recent, and compares it against what was the newest when the file was cached. If they're the same, nothing has changed, and the cache file is served. If they're different, the system looks through the node additions that happened since the page was cached, and sees if the original node's text contains any of those node names. If it does, it regenerates, recaches and serves the page. Otherwise, it revalidates the cache file by storing the new most recent node ID with the old cache file, and serves it up.

The problem with this is that 99% of the time, the document won't contain any of the new node names, so mod_perl is wasting most of its time serving up cached HTML. However, if you use a cron job log-analysis approach, every time a new node is added, you have to search through EVERY node's text to see if it needs a link to the new node. Imagine this with 1,000,000 two-page documents. So maybe my system is as optimized as it's going to get?

- ben
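A sketch of the revalidation scheme described above, with hypothetical helpers (latest_node_id, nodes_added_since, node_text, store_cache_watermark) standing in for the real MySQL/flatfile code:

use strict;
use warnings;

# Decide whether a cached page is still valid, per the scheme above.
# $cached_as_of is the newest node ID at the time the page was cached.
sub cache_is_fresh {
    my ($node, $cached_as_of) = @_;

    my $latest = latest_node_id();           # hypothetical DB lookup
    return 1 if $latest == $cached_as_of;    # nothing new anywhere

    my $text = node_text($node);             # hypothetical flatfile read
    for my $name (nodes_added_since($cached_as_of)) {
        # A newly created node name appears in this page's text,
        # so the page needs its inline links regenerated.
        return 0 if index($text, $name) >= 0;
    }

    # No new names matched: revalidate the cache entry by bumping
    # its stored "newest node ID" watermark to $latest.
    store_cache_watermark($node, $latest);   # hypothetical
    return 1;
}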
MP2, SOAP::Lite and Oracle
Hello,

I have a few custom modules that work nicely in a standalone SOAP::Lite server. After deciding we needed the performance boost of moving it to mod_perl 2, we have encountered some issues. I am getting this error in the apache error_log:

DBI connect('','username',...) failed: ERROR OCIEnvNlsCreate (check ORACLE_HOME and NLS settings etc.) at /usr/lp/lib/AMS/DB.pm line 32

This is in spite of having:

PerlSetVar ORACLE_HOME "/path"
PerlSetVar TWO_TASK "sidname"

in the Perl block of the httpd.conf; in fact, dumping %ENV reveals they are both set to the correct values.

SOAP::Lite is the latest version + patches from Randy Kobes (porting SOAP::Lite to MP2):
http://groups.yahoo.com/group/soaplite/message/4329

Any clues?

Thanks in advance,
Juan Natera
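One detail worth noting while debugging this: PerlSetVar populates the per-directory table read via $r->dir_config(), not the process environment. The environment-variable form would be the following sketch; whether DBD::Oracle can pick ORACLE_HOME up this late in the server lifecycle is exactly the sort of thing to verify:

PerlSetEnv ORACLE_HOME "/path"
PerlSetEnv TWO_TASK "sidname"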
Re: [mp2] threaded applications inside of mod_perl
Stas Bekman wrote:
Stas Bekman wrote:
Thanks for the details. I can now reproduce the segfault. I'll post again when this is fixed.

I've traced it down to a perl-core issue. I'm submitting a report to p5p and I've CC'ed you, so you can stay in the loop. Meanwhile, there are two workarounds:

In fact just using:

SetHandler modperl

and starting your script with:

my $r = shift;
tie *STDOUT, $r;

is sufficient. Below you will find all the workarounds that I've found working at the moment (added as a test to the mp2 test suite):

use strict;
use warnings FATAL => 'all';

# there is a problem when STDOUT, internally opened to an
# Apache::PerlIO layer, is cloned on a new thread start. PerlIO_clone
# in perl_clone() is called too early, before PL_defstash is
# cloned. As PerlIO_clone calls PerlIOApache_getarg, which calls
# gv_fetchpv via sv_setref_pv, and boom, the segfault happens.
#
# at the moment we should either not use streams internally opened to
# :Apache, so the config must be:
#
#     SetHandler modperl
#
# and then either use $r->print("foo") or tie *STDOUT, $r + print "foo"
#
# or close and re-open STDOUT to :Apache *after* the thread was spawned
#
# the above discussion equally applies to STDIN
#
# XXX: ->join calls leak under registry, this doesn't happen in the
# non-registry tests.

use threads;

my $r = shift;

$r->print("Content-type: text/plain\n\n");

{
    # now we can use the $r->print API:
    my $thr = threads->new(
        sub {
            my $id = shift;
            $r->print("thread $id\n");
            return 1;
        },
        1);
    # $thr->join; # XXX: leaks scalar
}

{
    # close and re-open STDOUT to :Apache *after* the thread was
    # spawned
    my $thr = threads->new(
        sub {
            my $id = shift;
            close STDOUT;
            open STDOUT, ">:Apache", $r
                or die "can't open STDOUT via :Apache layer : $!";
            print "thread $id\n";
            return 1;
        },
        2);
    # $thr->join; # XXX: leaks scalar
}

{
    # tie STDOUT to $r *after* the ithread was started,
    # in which case we can use print
    my $thr = threads->new(
        sub {
            my $id = shift;
            tie *STDOUT, $r;
            print "thread $id\n";
            return 1;
        },
        3);
    # $thr->join; # XXX: leaks scalar
}

{
    # tie STDOUT to $r *before* the ithread was started,
    # in which case we can use print
    tie *STDOUT, $r;
    my $thr = threads->new(
        sub {
            my $id = shift;
            print "thread $id\n";
            return 1;
        },
        4);
    # $thr->join; # XXX: leaks scalar
}

print "parent";

--
__
Stas Bekman     JAm_pH --> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:[EMAIL PROTECTED] http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org http://ticketmaster.com
Re: Sanity check on mod_rewrite and POST data [slightly OT]
Martin Moss wrote:
However after the rewrite, the POST data is lost. Can anybody throw any light on this? the rewrite rule is this:-
RewriteRule ^(.*)$ http://%{HTTP_HOST}$1 [R]

Not sure what you are trying to do here. You are making a non-ssl request back to the exact same server, with the exact same parameters - hopefully this is just for your example.

Since you are using the "R" flag, you are causing an external redirect. An external redirect will not cause the browser to send the POST information again to the new server. You will probably need to make sure you have mod_proxy installed on the server and use the "P" flag instead. This will proxy the request, which WILL send the POST data through.

As a side question, can anybody tell me if a https GET request would encrypt the parameters passed?

Yes, it would. Everything about the request is encrypted. That is why you cannot use some of the normal http/apache features, such as name-based virtual hosts.

--
[EMAIL PROTECTED]
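A sketch of the proxied form, assuming mod_proxy is loaded and internal.example.com stands in for the internal server:

RewriteEngine On
RewriteRule ^(.*)$ http://internal.example.com$1 [P]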
Sanity check on mod_rewrite and POST data [slightly OT]
All,

Can I get a sanity check on this:-

I have a form which POSTs to https://server/url. That https server uses mod_rewrite to forward the request on to another server internally as http://server/url. However after the rewrite, the POST data is lost. Can anybody throw any light on this? the rewrite rule is this:-

RewriteRule ^(.*)$ http://%{HTTP_HOST}$1 [R]

As a side question, can anybody tell me if a https GET request would encrypt the parameters passed?

Regards
Marty
Re: [mp2] threaded applications inside of mod_perl
Stas Bekman wrote:
Thanks for the details. I can now reproduce the segfault. I'll post again when this is fixed.

I've traced it down to a perl-core issue. I'm submitting a report to p5p and I've CC'ed you, so you can stay in the loop. Meanwhile, there are two workarounds:

You must start with not using a tied STDOUT, i.e. change the SetHandler setting to 'modperl':

SetHandler modperl
PerlResponseHandler ModPerl::Registry
PerlOptions +ParseHeaders +GlobalRequest
Options ExecCGI

Now you can either use $r->print(), or tie STDOUT to $r in each thread where you want to use it. Do not tie it before starting the threads, since you will hit the same problem. The following program demonstrates both techniques:

use strict;
use warnings FATAL => 'all';
use threads;

my $r = shift;

$r->print("Content-type: text/plain\n\n");

threads->create(
    sub {
        $r->print("thread 1\n");
    },
    undef);

threads->create(
    sub {
        tie *STDOUT, $r;
        print "thread 2\n";
    },
    undef);

$r->print("done");

As you use +GlobalRequest you can replace:

my $r = shift;

with:

my $r = Apache->request;

but it's a bit slower.

--
__
Stas Bekman     JAm_pH --> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:[EMAIL PROTECTED] http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org http://ticketmaster.com
RE: ModPerl Installation help
Since nobody else bit...

First read the user docs => http://perl.apache.org/docs/2.0/user/index.html (parts I and II at a minimum). (Actually, you should read ALL of the docs => http://perl.apache.org/docs/2.0/index.html)

Looking at the package list for FC3, you get httpd-2.0.52 and mod_perl-1.99_16 IF you indeed do the "everything" install. You just have to track down where the config files for Apache are located. Read the relevant portions of the mod_perl 2 docs for configuration.

Your other choice is rolling your own, as you mentioned. See the docs => http://perl.apache.org/docs/2.0/user/install/install.html
You are more likely to get help for this method on the list, as it appears most folks install this way. Plus you will be using the later and cleaner version.

Disclaimer: I am not an expert. I just play one at work.

> -----Original Message-----
> From: steve silvers [mailto:[EMAIL PROTECTED]
> Sent: Thursday, February 03, 2005 6:27 PM
> To: modperl@perl.apache.org
> Subject: ModPerl Installation help
>
> I just installed Fedora core 3, everything. The default Perl install is
> 5.8.5 and not sure about Apache, for httpd -v does not display the version.
> My question is how do I now install mod_perl and get it working. Do I have to
> download another version of Perl and Apache to rebuild? I have never used
> mod_perl before, let alone installed it. Could someone please point me in the
> right direction. I really want to learn this.
>
> Thank you
> Steve
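For the roll-your-own route, the install docs linked above boil down to roughly this sequence (a sketch; the apxs path is an assumption and must point at the Apache 2 you intend to build against):

perl Makefile.PL MP_APXS=/usr/local/apache2/bin/apxs
make
make test
make install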
Re: [mp2] threaded applications inside of mod_perl
Thanks for the details. I can now reproduce the segfault. I'll post again when this is fixed.

--
__
Stas Bekman     JAm_pH --> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:[EMAIL PROTECTED] http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org http://ticketmaster.com
Re: AW: Logging user's movements
On 4 Feb 2005, at 14:16, James Smith wrote:
On Fri, 4 Feb 2005, Denis Banovic wrote:
I have a very similar app running in mod_perl with about half a million hits a day. I need to do some optimisation, so I'm just interested in which of the optimisations you are using brought you the best improvements. Was it preloading modules in the startup.pl, or caching the 1x1 gif image, or maybe optimising the database cache (I'm using MySQL)? I'm sure you are also having usage peaks, so it would be interesting to know approximately how many hits (inserts)/hour a single server machine can handle.

Simplest thing to do is hijack the referer logs, and then parse them at the end. You just need to add a unique ID for each session (via a cookie or in the URL) which is added to the logs [or placed in a standard logged variable]

I totally agree with James. I'm thinking of switching to just using a log file for this rather than it being live (as I only generate reports once a day). I'm actually using a log-based system for user tracking, which was implemented after this counter. The counter system is used to count how many times a product appears in search results / how many times someone views it in detail. A good tip: if you have 20 products on the page, don't call the counter for each one, just pass all the IDs in -- obvious, but if you're implementing it in a rush you might miss it! It used to be part of the main search code, but this prevented caching.

The optimisations I did were (see the sketch below):

- Preload the module in startup.pl (and read the image in as a global from a BEGIN block)
- Use Apache::DBI->connect_on_init() so the DBH comes from the pool of connections and is not a new one each time (I'm using MySQL as well)

I have a light (non mod_perl) apache at the front, which proxies to a mod_perl apache that runs the module, and the database is on a 3rd machine. I've not got to the point of it overloading the system, so I haven't investigated the actual hit rate.

Cheers

Leo
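A minimal startup.pl sketch of those two optimisations, assuming a hypothetical My::Counter tracking module and an assumed DSN (mod_perl 1 style, matching the Apache::DBI usage above):

# startup.pl -- loaded once at server start via PerlRequire

use strict;

use Apache::DBI ();   # must be loaded before DBI so it can wrap connect()

# Open a DBH in each child as it is spawned, so requests draw from
# the per-child cached connection instead of reconnecting every time.
Apache::DBI->connect_on_init(
    'DBI:mysql:database=tracker;host=dbhost',   # assumed DSN
    'user', 'password',
    { RaiseError => 1, AutoCommit => 1 },
);

# Preload the tracking handler so it is compiled once in the parent
# and shared (copy-on-write) by all children.
use My::Counter ();   # hypothetical module; slurps c.gif into a global
                      # from a BEGIN block so it stays in memory

1;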
Re: Logging user's movements
On Friday 04 February 2005 3:13 am, ben syverson wrote:
> I'm curious how the "pros" would approach an interesting system design
> problem I'm facing. I'm building a system which keeps track of user's
> movements through a collection of information (for the sake of
> argument, a Wiki). For example, if John moves from the "dinosaur" page
> to the "bird" page, the system logs it -- but only once a day per
> connection between nodes per user. That is, if Jane then travels from
> "dinosaur" to "bird," it will log it, but if John moves back
> to "dinosaur" from "bird," it won't be logged. The result is a log of
> every unique connection made by every user that day.

What are you doing with the data once you have it? Is there any reason that it needs to be 'live'? If not, you could simply add the username in a field in the logfile, and post-process the logs (assuming you trust the referer field sufficiently). That removes all the load from the webserver.

> My initial thoughts on how to improve the system were to relieve
> mod_perl of having to serve the files, and instead write a perl script
> that would run daily to analyze the day's thttpd log files, and then
> update the database. However, certain factors (including the need to
> store user data in cookies, which have to be checked against MySQL)
> make this impossible.

Why does storing user data in cookies prevent you from logging enough to identify the user again later? Or are you storing something you need to reconstruct the trace that you can't get otherwise?

--
"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it."
- Brian W. Kernighan
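A sketch of that post-processing pass, assuming each log line carries "user referer page" fields (the exact LogFormat is an assumption) and we want each unique user/from/to connection reported once per daily log:

#!/usr/bin/perl
use strict;
use warnings;

# Reads the day's log on STDIN; prints each unique connection once.
my %seen;
while (<>) {
    chomp;
    my ($user, $from, $to) = split;
    next unless defined $to;
    next if $from eq '-';    # direct hit: no connection to record
    print "$user $from $to\n" unless $seen{"$user $from $to"}++;
}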
Re: Intercepting with data for mod_dav
Jeff Finn schrieb:
I've been doing this with mod_perl 2.. here's the relevant parts of my config:

Alias /dav_files /home/users

#
# hook into the other phases for
#
PerlOutputFilterHandler MyEncrypt::output
PerlInputFilterHandler MyEncrypt::input
PerlSetOutputFilter DEFLATE

#
# Request Phase Handlers
#
PerlAuthenHandler MyAuthenticate
AuthType basic
AuthName "xxx"
Require valid-user
Satisfy all

#
# Actual access to the files will already be authenticated
#
AllowOverride None
DAV on

==
Hope this helps.

That looks like my config ;-) The point is, must I take care of the DAV-specific things? For example, I wrote a sub named input (whoww!) which did nothing more than consume the input files on a PUT request; it should return to the upper/lower layers afterwards. But the apache process which handled my request hung. This code caused that effect:

sub input {
    my $req = shift;
    if ($req->method eq "PUT") {
        # maybe nothing here allowed ?
    }
    return Apache::OK;
}

1;

Am I thinking wrong?

To the list: My encryption is proprietary, based on the PW the user sends... anyone know a tested symmetric streaming (not block) encryption algorithm?

Mh, for streams with a token size of 32-64 bits, you could try http://www.simonshepherd.supanet.com/tea.htm
It's easy to implement in Perl, as the C source is easy. I use the C source to encrypt 64-bit tokens, which would be 8 bytes from a stream. Perhaps you can then tie that thing between your stream.

Jeff

-----Original Message-----
From: Stefan Sonnenberg-Carstens [mailto:[EMAIL PROTECTED]
Sent: Friday, February 04, 2005 9:11 AM
To: modperl@perl.apache.org
Subject: Intercepting with data for mod_dav

[...]
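For what it's worth on the hang described above: a mod_perl 2 input filter receives a filter object, not the request, and must consume and pass the data along itself or the request stalls. A minimal pass-through sketch using the streaming filter API (pre-2.0 Apache:: namespace as used elsewhere in this thread; the decrypt step is a hypothetical placeholder):

package MyEncrypt;

use strict;
use warnings;

use Apache::Filter ();
use Apache::Const -compile => qw(OK);

use constant BUFF_LEN => 1024;

sub input {
    my $f = shift;

    # Read whatever the upstream filters deliver and hand it on;
    # returning without consuming/forwarding data is what stalls a PUT.
    while ($f->read(my $buffer, BUFF_LEN)) {
        # $buffer = decrypt($buffer);   # hypothetical transformation
        $f->print($buffer);
    }

    return Apache::OK;
}

1;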
RE: Intercepting with data for mod_dav
I've been doing this with mod_perl 2.. here's the relevant parts of my config:

Alias /dav_files /home/users

#
# hook into the other phases for
#
PerlOutputFilterHandler MyEncrypt::output
PerlInputFilterHandler MyEncrypt::input
PerlSetOutputFilter DEFLATE

#
# Request Phase Handlers
#
PerlAuthenHandler MyAuthenticate
AuthType basic
AuthName "xxx"
Require valid-user
Satisfy all

#
# Actual access to the files will already be authenticated
#
AllowOverride None
DAV on

==
Hope this helps.

To the list: My encryption is proprietary, based on the PW the user sends... anyone know a tested symmetric streaming (not block) encryption algorithm?

Jeff

-----Original Message-----
From: Stefan Sonnenberg-Carstens [mailto:[EMAIL PROTECTED]
Sent: Friday, February 04, 2005 9:11 AM
To: modperl@perl.apache.org
Subject: Intercepting with data for mod_dav

[...]
Re: setting environment variables
Yes, I think it's more complicated. I don't have the original setup that caused my problem, but I'm pretty sure I found that if I set a mixed-case env var (say 'MyEnv_Var') with SetEnv, in my mod_perl app I got the variable set (exists == true) but with no value, whereas using PerlSetEnv with the same variable name, I got the value in %ENV but the var name was uppercased. At the moment I have just worked around the problem by using SetEnv but with an all-uppercase variable name.

Regards,
Colin

Randy Kobes wrote:
On Wed, 2 Feb 2005, Stas Bekman wrote:
Randy Kobes wrote:
[...]
So the behaviour of SetEnv changed from Apache-1 to Apache-2, as far as Win32 case goes, while PerlSetEnv maintained the same behaviour from mp1 to mp2. I suppose one could argue that we should change PerlSetEnv under mp2 to lower-case things, so as to be consistent with SetEnv?

I think yes. I'm sure you have a patch already :)

Actually, things are a bit more complicated on mp2 than I thought... The example I gave earlier had 2 SetEnv/PerlSetEnv directives, differing in case, which is a bit artificial. If there's just one such directive, then both SetEnv/PerlSetEnv seem to behave normally (taking into account that, on Windows, $ENV{FOO} and $ENV{foo} are the same). However, there does seem to be a problem (with SetEnv) when it's all lower-case:

SetEnv foo bar

in that $ENV{foo} doesn't seem to get set (irrespective of the case of "foo"). There's still a difference between PerlSetEnv and SetEnv, but I don't see the pattern yet; I'll keep looking.
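For anyone trying to reproduce this, a tiny registry-script probe of what actually lands in %ENV (a sketch; it assumes a SetEnv MyEnv_Var or PerlSetEnv MyEnv_Var line in the enclosing config):

use strict;
use warnings;

print "Content-type: text/plain\n\n";

# Show every casing of the test variable that made it into %ENV,
# and whether it arrived with a value.
for my $key (sort grep { lc eq 'myenv_var' } keys %ENV) {
    my $val = $ENV{$key};
    printf "%s => %s\n", $key,
        (defined $val && length $val) ? $val : '(empty)';
}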
Re: AW: Logging user's movements
On Fri, 4 Feb 2005, Denis Banovic wrote:
> Hi Leo,
>
> I have a very similar app running in mod_perl with about half a million hits a day.
> I need to do some optimisation, so I'm just interested in which of the optimisations
> you are using brought you the best improvements.
> Was it preloading modules in the startup.pl, or caching the 1x1 gif image, or
> maybe optimising the database cache (I'm using MySQL)?
> I'm sure you are also having usage peaks, so it would be interesting to know
> approximately how many hits (inserts)/hour a single server machine can handle.

Simplest thing to do is hijack the referer logs, and then parse them at the end. You just need to add a unique ID for each session (via a cookie or in the URL) which is added to the logs [or placed in a standard logged variable -- see the LogFormat sketch after this message].

Then write a parser which tracks usage - using referer + page viewed.

If you don't want to rely on referers then you could encrypt this in the URL... (but watch out for search engines, which could hammer your site!!)

James

> Thanks
>
> Denis
>
> -----Original Message-----
> From: Leo Lapworth [mailto:[EMAIL PROTECTED]
> Sent: Friday, 4 February 2005 10:37
> To: ben syverson
> Cc: modperl@perl.apache.org
> Subject: Re: Logging user's movements
>
> [...]
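A sketch of the "standard logged variable" variant, logging a session cookie straight into a custom access log (the cookie name "session" is an assumption):

# host, session cookie, request line, status, referer
LogFormat "%h %{session}C \"%r\" %>s \"%{Referer}i\"" tracking
CustomLog logs/tracking_log tracking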
Intercepting with data for mod_dav
Hi list,

I'm struggling a bit with the following: I set up a mod_dav DAV server, which works fine. One thing I *must* accomplish is to write the uploaded files encrypted in some way to the disk, and publish them back unencrypted. That should be perfectly possible with apache's filters. The problem seems to be that mod_perl doesn't see anything if DAV is set to on for a specific dir? Is that true?

What I need is a small (very small) hint how to get at the data that the PUT and GET requests offer, or if this is possible at all.

Thx in advance,
Stefan Sonnenberg
AW: Logging user's movements
Hi Leo,

I have a very similar app running in mod_perl with about half a million hits a day. I need to do some optimisation, so I'm just interested in which of the optimisations you are using brought you the best improvements. Was it preloading modules in the startup.pl, or caching the 1x1 gif image, or maybe optimising the database cache (I'm using MySQL)? I'm sure you are also having usage peaks, so it would be interesting to know approximately how many hits (inserts)/hour a single server machine can handle.

Thanks

Denis

-----Original Message-----
From: Leo Lapworth [mailto:[EMAIL PROTECTED]
Sent: Friday, 4 February 2005 10:37
To: ben syverson
Cc: modperl@perl.apache.org
Subject: Re: Logging user's movements

[...]
[JOB] Perl/PHP Web application development
Hi!

We are looking for a developer with good programming skills in (mod_)Perl / PHP for a full-time job in Salzburg, Austria. You should also have working experience with Linux and MySQL. We are the biggest internet agency in western Austria. If you are interested, we can help you find a place to stay.

Send your application to mailto:[EMAIL PROTECTED]. Please send us your resume, examples of work that you have done, and anything else that will describe you.

Looking forward to seeing your application,

Denis Banovic

"THINK THE WEB WAY."
---
NCM - NET COMMUNICATION MANAGEMENT GmbH
---[ Denis Banovic - CTO
mailto:[EMAIL PROTECTED]
---[ Mühlstrasse 4a, AT - 5023 Salzburg
Tel. 0662 / 644 688
---[ Fax: 0662 / 644 688 - 88
http://www.ncm.at
---
Re: Logging user's movements
On 4 Feb 2005, at 08:13, ben syverson wrote:

Hello,

I'm curious how the "pros" would approach an interesting system design problem I'm facing. I'm building a system which keeps track of user's movements through a collection of information (for the sake of argument, a Wiki). For example, if John moves from the "dinosaur" page to the "bird" page, the system logs it -- but only once a day per connection between nodes per user. That is, if Jane then travels from "dinosaur" to "bird," it will log it, but if John moves back to "dinosaur" from "bird," it won't be logged. The result is a log of every unique connection made by every user that day.

The question is, how would you do this with the least amount of strain on the server?

I think the standard approach for user tracking is a 1x1 gif. There are lots of ways of doing it; here are 2:

Javascript + Logs - update tracking when logs are processed
-
Use javascript to set a cookie (session or 24 hours) - if there isn't already one. Then use javascript to do a document write to the gif, so /tracker/c.gif?c=&page=dinosaur

It should then be fast (no live processing) and fairly easy to extract this information from the logs and into a db.

Mod_perl - live db updates
-
Alternatively, if you need live updates, create a mod_perl handler that sits at /tracker/c.gif, processes the parameters and puts them into a database, then returns a gif (I do this; read the gif in and store it as a global when the module starts so it just stays in memory). It's fast and means you can still get the benefits of caching with squid or whatever. A sketch of such a handler follows this message.

I get about half a million hits a day to my gif.

I think the main point is you should separate it from your main content handler if you want it to be flexible and still allow other levels of caching.

Cheers

Leo
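A minimal sketch of that gif-returning tracker, mod_perl 1 style; the package name, gif path, and log_hit() are assumptions:

package My::Tracker;

use strict;
use Apache::Constants qw(OK);

# Slurp the 1x1 gif once, at module load, so every child serves it
# straight from memory.
my $GIF;
BEGIN {
    local $/;
    open my $fh, '<', '/www/htdocs/c.gif' or die "c.gif: $!";  # assumed path
    binmode $fh;
    $GIF = <$fh>;
    close $fh;
}

sub handler {
    my $r = shift;

    # Record the hit (hypothetical; e.g. an INSERT via the pooled DBH).
    my %args = $r->args;
    log_hit($args{c}, $args{page});

    $r->content_type('image/gif');
    $r->send_http_header;
    $r->print($GIF);
    return OK;
}

sub log_hit { }   # placeholder for the real DB insert

1;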
Re: [mp2] threaded applications inside of mod_perl
On Thu, 3 Feb 2005, Stas Bekman wrote:
where is the modperl configuration? As far as you've shown there is no mod_perl involved in serving any requests. (Hint: show us the Directory/Location/etc container responsible for a request that has triggered the segfault)

SetHandler perl-script
PerlResponseHandler ModPerl::Registry
PerlOptions +ParseHeaders +GlobalRequest
Options ExecCGI

I have changed the startup.pl file to remove as many variables as possible. It still segfaults with this minimal configuration:

#!/usr/bin/perl
1;

Please reread my original reply. I still have no idea how you've invoked the script. i.e. show us the URL that you've called. And above I've asked you for the relevant config section.

I invoke the script by running http://localhost:8080/apps/test

resulting in a closed connection and:

[Thu Feb 03 21:25:19 2005] [notice] child pid 7393 exit signal Segmentation fault (11)

Is a script using threads beneath the mod_perl interpreter expected to work, or is this a dark corner of mod_perl best left untouched?

#!/usr/bin/perl
use strict;
require threads;

my $thread = threads->create(sub { print "I am a thread" }, undef);

Thank you for your time.
ModPerl Installation help
I just installed Fedora core 3, everything. The default Perl install is 5.8.5, and I'm not sure about Apache, for httpd -v does not display the version. My question is how do I now install mod_perl and get it working. Do I have to download another version of Perl and Apache to rebuild? I have never used mod_perl before, let alone installed it. Could someone please point me in the right direction. I really want to learn this.

Thank you
Steve
Logging user's movements
Hello,

I'm curious how the "pros" would approach an interesting system design problem I'm facing. I'm building a system which keeps track of user's movements through a collection of information (for the sake of argument, a Wiki). For example, if John moves from the "dinosaur" page to the "bird" page, the system logs it -- but only once a day per connection between nodes per user. That is, if Jane then travels from "dinosaur" to "bird," it will log it, but if John moves back to "dinosaur" from "bird," it won't be logged. The result is a log of every unique connection made by every user that day.

The question is, how would you do this with the least amount of strain on the server?

Currently, I'm using Squid to switch between thttpd (for non-"Wiki" files) and mod_perl, with the metadata in MySQL and the text data in flatfiles (don't worry, everything's write-once). The code I'm using to generate the "Wiki" pages is fairly fast as I'm testing it, but it's not clear (and impossible to test) how well it will scale as more nodes and users are added. As a defensive measure, I'm caching the HTML output of the mod_perl handler, but the cached files aren't being served by thttpd, because the handler still needs to register where people are going. So every time a page is requested, the handler looks and sees if this user has made this connection in the past 24 hours; if not, it logs it, and then either serves the cached file or generates a new one (they go out of date sporadically). (A sketch of the once-a-day check appears below.)

My initial thoughts on how to improve the system were to relieve mod_perl of having to serve the files, and instead write a perl script that would run daily to analyze the day's thttpd log files, and then update the database. However, certain factors (including the need to store user data in cookies, which have to be checked against MySQL) make this impossible.

Am I on the right track with this?

- ben
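One common way to implement the "only once a day per connection per user" check is to let MySQL deduplicate via a unique key on (user_id, from_node, to_node, day), so the duplicate case is a cheap no-op. A sketch, with the table and column names as assumptions:

use strict;
use warnings;
use DBI;

# INSERT IGNORE is a MySQL-ism: rows that would violate the assumed
# unique key (user_id, from_node, to_node, day) are silently skipped,
# so each connection is recorded at most once per day.
sub log_connection {
    my ($dbh, $user, $from, $to) = @_;
    $dbh->do(
        'INSERT IGNORE INTO connections (user_id, from_node, to_node, day)
         VALUES (?, ?, ?, CURRENT_DATE)',
        undef, $user, $from, $to,
    );
}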