Re: Logging user's movements

2005-02-04 Thread Christian Hansen
ben syverson wrote:
[...]
The problem with this is that 99% of the time, the document won't 
contain any of the new node names, so mod_perl is wasting most of its 
time serving up cached HTML.
I have two suggestions:
1) Use a reverse proxy/cache and send proper Cache-Control and 
ETag/Content-Length headers, e.g.:

  Last-Modified: Fri, 04 Feb 2005 11:11:11 GMT
  Cache-Control: public, must-revalidate
2) Use a 307 Temporary Redirect and let thttpd serve it.
  307 Temporary Redirect
  Location: http://static.domain.com/WikiPage.html
RFC2616 13 Caching in HTTP
http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13
RFC2616 10.3.8 307 Temporary Redirect
http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.8
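
In a mod_perl handler, the two suggestions might look roughly like this 
(a sketch only: the package name is invented, and there is no named 
constant for 307, so the status is set explicitly):

```perl
package My::WikiCache;    # hypothetical package name
use strict;
use Apache::Constants qw(OK);

sub handler {
    my $r = shift;

    # Suggestion 1: emit validators so a front-end cache can revalidate.
    $r->header_out('Last-Modified' => 'Fri, 04 Feb 2005 11:11:11 GMT');
    $r->header_out('Cache-Control' => 'public, must-revalidate');

    # Suggestion 2 (instead): hand the page off to thttpd.
    # $r->header_out(Location => 'http://static.domain.com/WikiPage.html');
    # $r->status(307);    # Temporary Redirect

    # ... generate or decline the response here ...
    return OK;
}
1;
```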
--
Regards
Christian Hansen


Re: Logging user's movements

2005-02-04 Thread ben syverson
First of all, thanks for the suggestions, everyone! It's giving me a 
lot to chew on. I now realize (sound of hand smacking forehead) that 
the main problem is not the list of links and tracking users, but 
rather the inline Wiki links:

On Feb 4, 2005, at 8:58 AM, Malcolm J Harwood wrote:
What are you doing with the data once you have it? Is there any reason 
that it
needs to be 'live'?
Sort of -- imagine our Wiki scenario, but without delimiters (I think 
this is rather common in the .biz world). So if the "dinosaur" node 
contains:

"Some scientists suggest that dinosaurs may actually have evolved from 
birds."

It'll automagically link to the "birds" node. However, let's say the 
"scientist" node doesn't yet exist -- but when it does, we want it to 
link up. I wouldn't say it "needs to be live," but it would be nice to 
get that link happening sooner rather than later.

The way the system works now, it is live. Every time a page is 
generated, it stores the most recent node ID along with the cached 
file. The next time the page is viewed, it checks to see what node is 
the most recent, and compares it against what was the newest when the 
file was cached. If they're the same, nothing has changed, and the 
cache file is served. If they're different, the system looks through 
the node additions that happened since the node was cached, and sees if 
the original node's text contains any of those node names. If it does, 
it regenerates, recaches and serves the page. Otherwise, it revalidates 
the cache file by storing the new most recent node ID with the old 
cache file, and serves it up.
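
The revalidation logic described above could be sketched like this 
(purely illustrative; every function name here is invented):

```perl
# Sketch of the cache-revalidation scheme described above.
sub serve_page {
    my ($node) = @_;
    my $cached_id = cached_newest_node_id($node);   # stored with the cache file
    my $newest_id = newest_node_id();               # current newest node

    if ($cached_id == $newest_id) {
        return send_cached($node);                  # nothing has changed
    }

    my @new_names = node_names_added_since($cached_id);
    my $text      = node_text($node);
    if (grep { $text =~ /\Q$_\E/ } @new_names) {
        return regenerate_and_cache($node);         # a new link target appeared
    }

    # No new names match: revalidate the cache under the new ID.
    store_newest_node_id($node, $newest_id);
    return send_cached($node);
}
```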

The problem with this is that 99% of the time, the document won't 
contain any of the new node names, so mod_perl is wasting most of its 
time serving up cached HTML.

However, if you use a cron-job log-analysis approach, every time a new 
node is added, you have to search through EVERY node's text to see if 
it needs a link to the new node. Imagine this with 1,000,000 two-page 
documents.

So maybe my system is as optimized as it's going to get?
- ben


MP2, SOAP::Lite and Oracle

2005-02-04 Thread Juan Natera
Hello,

I have a few custom modules that work nicely in a standalone SOAP::Lite
server. After deciding we needed the performance boost of moving it to
mod_perl 2, we have encountered some issues.

I am getting this error in the apache error_log:

DBI connect('','username',...) failed: ERROR OCIEnvNlsCreate (check
ORACLE_HOME and NLS settings etc.) at /usr/lp/lib/AMS/DB.pm line 32

This is in spite of having:

PerlSetVar ORACLE_HOME "/path"
PerlSetVar TWO_TASK "sidname"

in the Perl block of httpd.conf; in fact, dumping %ENV reveals they
are both set to the correct values.
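
One thing worth checking (an assumption on my part, not something from 
the post): PerlSetVar sets Apache per-directory variables readable via 
$r->dir_config, not the process environment, and the Oracle client may 
read ORACLE_HOME when its shared library first initialises. A common 
workaround is to set the variables in startup.pl before DBD::Oracle is 
loaded; a sketch, reusing the placeholder values from above:

```perl
# startup.pl -- sketch; path and SID are the placeholders from the post.
BEGIN {
    $ENV{ORACLE_HOME} = '/path';
    $ENV{TWO_TASK}    = 'sidname';
}
use DBI ();
use DBD::Oracle ();    # loaded only after ORACLE_HOME is in place
1;
```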

SOAP::Lite is the latest version + patches from Randy Kobes (porting
SOAP::Lite to MP2):

http://groups.yahoo.com/group/soaplite/message/4329

Any clues?

Thanks in advance,

Juan Natera





Re: [mp2] threaded applications inside of mod_perl

2005-02-04 Thread Stas Bekman
Stas Bekman wrote:
Stas Bekman wrote:
Thanks for the details. I can now reproduce the segfault. I'll post 
again when this is fixed.

I've traced it down to a perl-core issue. I'm submitting a report to p5p 
and I've CC'ed you, so you can stay in the loop.

Meanwhile, there are two workarounds:
In fact just using:
 SetHandler modperl
and starting your script with:
my $r = shift;
tie *STDOUT, $r;
is sufficient. Below you will find all the workarounds that I've found 
working at the moment (added as a test to the mp2 test suite):

use strict;
use warnings FATAL => 'all';
#
# there is a problem when a STDOUT that is internally opened to an
# :Apache PerlIO layer is cloned on a new thread start. PerlIO_clone
# in perl_clone() is called too early, before PL_defstash is
# cloned. As PerlIO_clone calls PerlIOApache_getarg, which calls
# gv_fetchpv via sv_setref_pv, boom -- the segfault happens.
#
# at the moment we should either not use streams internally opened
# to :Apache, so the config must be:
#
#     SetHandler modperl
#
# and then either use $r->print("foo") or tie *STDOUT, $r + print "foo"
#
# or close and re-open STDOUT to :Apache *after* the thread was spawned
#
# the above discussion equally applies to STDIN
#
# XXX: ->join calls leak under registry, this doesn't happen in the
# non-registry tests.
use threads;

my $r = shift;
$r->print("Content-type: text/plain\n\n");

{
    # now we can use the $r->print API:
    my $thr = threads->new(
        sub {
            my $id = shift;
            $r->print("thread $id\n");
            return 1;
        }, 1);
    # $thr->join; # XXX: leaks scalar
}

{
    # close and re-open STDOUT to :Apache *after* the thread was
    # spawned
    my $thr = threads->new(
        sub {
            my $id = shift;
            close STDOUT;
            open STDOUT, ">:Apache", $r
                or die "can't open STDOUT via :Apache layer : $!";
            print "thread $id\n";
            return 1;
        }, 2);
    # $thr->join; # XXX: leaks scalar
}

{
    # tie STDOUT to $r *after* the ithread was started,
    # in which case we can use print
    my $thr = threads->new(
        sub {
            my $id = shift;
            tie *STDOUT, $r;
            print "thread $id\n";
            return 1;
        }, 3);
    # $thr->join; # XXX: leaks scalar
}

{
    # tie STDOUT to $r *before* the ithread was started,
    # in which case we can use print
    tie *STDOUT, $r;
    my $thr = threads->new(
        sub {
            my $id = shift;
            print "thread $id\n";
            return 1;
        }, 4);
    # $thr->join; # XXX: leaks scalar
}

print "parent";

--
__
Stas Bekman            JAm_pH --> Just Another mod_perl Hacker
http://stason.org/ mod_perl Guide ---> http://perl.apache.org
mailto:[EMAIL PROTECTED] http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com


Re: Sanity check on mod_rewrite and POST data [slightly OT]

2005-02-04 Thread ___cliff rayman___
Martin Moss wrote:
However after the rewrite, the POST data is lost. Can
anybody throw any light on this?
the rewrite rule is this:-
RewriteRule ^(.*)$ http://%{HTTP_HOST}$1 [R]
Not sure what you are trying to do here.  You are making a non-ssl 
request back to the exact same server, with the exact same parameters - 
hopefully this is just for your example.  Since you are using the "R" 
flag, you are causing an external redirect.  An external redirect will 
not cause the browser to send the POST information again to the new 
server.  You will probably need to make sure you have mod_proxy 
installed on the server and use the "P" flag instead.  This will proxy 
the request, which WILL send the post data through.
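
The proxied variant of the rule might look like this (a sketch; the 
internal host name is a placeholder):

```apache
# Proxy instead of redirecting externally, so the POST body survives.
# Requires mod_proxy; "internal-server" is a placeholder host name.
RewriteRule ^(.*)$ http://internal-server$1 [P]
```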

As a side question, can anybody tell me if a https GET
request would encrypt the parameters passed?
Yes, it is.  Everything about the request is encrypted.  That is why 
you cannot use some of the normal http/apache features, such as 
name-based virtual hosts.

--
[EMAIL PROTECTED]



Sanity check on mod_rewrite and POST data [slightly OT]

2005-02-04 Thread Martin Moss
All,

Can I get a sanity check on this:-

I have a form which POSTs to https://server/url
That https servers uses mod_rewrite to forward the
request onto another server internally as
http://server/url

However after the rewrite, the POST data is lost. Can
anybody throw any light on this?

the rewrite rule is this:-

RewriteRule ^(.*)$ http://%{HTTP_HOST}$1 [R]

As a side question, can anybody tell me if a https GET
request would encrypt the parameters passed?

Regards

Marty











Re: [mp2] threaded applications inside of mod_perl

2005-02-04 Thread Stas Bekman
Stas Bekman wrote:
Thanks for the details. I can now reproduce the segfault. I'll post 
again when this is fixed.
I've traced it down to a perl-core issue. I'm submitting a report to p5p 
and I've CC'ed you, so you can stay in the loop.

Meanwhile, there are two workarounds:
You must start with not using a tied STDOUT, i.e. change the SetHandler 
setting to 'modperl':


  SetHandler modperl
  PerlResponseHandler ModPerl::Registry
  PerlOptions +ParseHeaders +GlobalRequest
  Options ExecCGI

now you can either use $r->print(), or tie STDOUT to $r in each thread 
where you want to use it. Do not tie it before starting the threads, since 
you will hit the same problem. The following program demonstrates both 
techniques:

use strict;
use warnings FATAL => 'all';
use threads;

my $r = shift;
$r->print("Content-type: text/plain\n\n");

threads->create(
    sub {
        $r->print("thread 1\n");
    }, undef);

threads->create(
    sub {
        tie *STDOUT, $r;
        print "thread 2\n";
    }, undef);

$r->print("done");
Since you use +GlobalRequest, you can replace:
    my $r = shift;
with:
    my $r = Apache->request;
but it's a bit slower.

--
__
Stas Bekman            JAm_pH --> Just Another mod_perl Hacker
http://stason.org/ mod_perl Guide ---> http://perl.apache.org
mailto:[EMAIL PROTECTED] http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com


RE: ModPerl Installation help

2005-02-04 Thread Barksdale, Ray
Since nobody else bit...

First read the user docs => http://perl.apache.org/docs/2.0/user/index.html
(parts I and II at a minimum.)
(Actually, you should read ALL of the docs =>
http://perl.apache.org/docs/2.0/index.html)

Looking at the package list for FC3 you get httpd-2.0.52 and
mod_perl-1.99_16
IF you indeed do the "everything" install. You just have to track down where
the config files for Apache are located. Read the relevant portions of the
modperl2 docs for configuration.

Your other choice is rolling your own, as you mentioned.
See docs => http://perl.apache.org/docs/2.0/user/install/install.html
You are more likely to get help for this method on the list, as it
appears most folks install this way. Plus, you will be using the later
and cleaner version.

Disclaimer: I am not an expert. I just play one at work.

> -Original Message-
> From: steve silvers [mailto:[EMAIL PROTECTED] 
> Sent: Thursday, February 03, 2005 6:27 PM
> To: modperl@perl.apache.org
> Subject: ModPerl Installation help
> 
> I just installed Fedora core 3, everything. The default Perl 
> install is 
> 5.8.5 and not sure about Apache for httpd -v does not display 
> the version. 
> My question is how do I now install modperl and get it 
> working. Do I have to 
> download another version of Perl and Apache to rebuild? I 
> have never used 
> modperl before let alone install it. Could someone please 
> point me in the 
> right direction. I really want to learn this.
> 
> Thank you
> Steve
> 
> 






Re: [mp2] threaded applications inside of mod_perl

2005-02-04 Thread Stas Bekman
Thanks for the details. I can now reproduce the segfault. I'll post again 
when this is fixed.

--
__
Stas Bekman            JAm_pH --> Just Another mod_perl Hacker
http://stason.org/ mod_perl Guide ---> http://perl.apache.org
mailto:[EMAIL PROTECTED] http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com


Re: AW: Logging user's movements

2005-02-04 Thread Leo Lapworth
On 4 Feb 2005, at 14:16, James Smith wrote:
On Fri, 4 Feb 2005, Denis Banovic wrote:
I have a very similar app running in mod_perl with about 1/2 million 
hits a day. I need to do some optimisation, so I'm just interested in 
which of the optimisations you are using brought you the best improvements.
Was it preloading modules in the startup.pl or caching the 1x1 gif 
image, or maybe optimising the database cache ( I'm using mysql ).
I'm sure you are also having usage peaks, so it would be interesting 
how many hits(inserts)/hour a single server machine can handle 
approx.
Simplest thing to do is hijack the referer logs, and then parse
them at the end. You just need to add a unique ID for each session
(via a cookie or in the URL) which is added to the logs [or placed
in a standard logged variable]
I totally agree with James. I'm thinking of switching to just using a 
log file for this rather than it being live (as I only generate reports 
once a day). I'm actually using a log-based system for user tracking, 
which was implemented after this counter. The counter system is used to 
count how many times a product appears in search results and how many 
times someone views it in detail. A good tip: if you have 20 products 
on the page, do not call the counter for every one, just pass all the 
IDs in -- obvious, but if you're implementing it in a rush you might 
miss it! It used to be part of the main search code, but this prevented 
caching.

The optimisations I did were: preloading in startup.pl (and reading the 
image in as a global from a BEGIN block), and using 
Apache::DBI->connect_on_init() so the DBH comes from the pool of 
connections and not a new one each time (I'm using MySQL as well).
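
For reference, connect_on_init in startup.pl might look like this (a 
sketch; the DSN, user and password are placeholders):

```perl
# startup.pl -- sketch; DSN and credentials are placeholders.
use Apache::DBI ();
Apache::DBI->connect_on_init(
    'dbi:mysql:database=tracker;host=dbhost',
    'user', 'secret',
    { RaiseError => 1, AutoCommit => 1 },
);
1;
```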

I have a light (non mod_perl) apache at the front, which proxies to a 
mod_perl apache that runs the module, and the database is on a 3rd 
machine.

I've not got to the point of it overloading the system, so I haven't 
investigated the actual hit rate.

Cheers
Leo


Re: Logging user's movements

2005-02-04 Thread Malcolm J Harwood
On Friday 04 February 2005 3:13 am, ben syverson wrote:

> I'm curious how the "pros" would approach an interesting system design
> problem I'm facing. I'm building a system which keeps track of user's
> movements through a collection of information (for the sake of
> argument, a Wiki). For example, if John moves from the "dinosaur" page
> to the "bird" page, the system logs it -- but only once a day per
> connection between nodes per user. That is, if Jane then travels from
> "dinosaur" to "bird," it will log it, but if "John" travels moves back
> to "dinosaur" from "bird," it won't be logged. The result is a log of
> every unique connection made by every user that day.

What are you doing with the data once you have it? Is there any reason that it 
needs to be 'live'? If not, you could simply add the username in a field in 
the logfile and post-process the logs (assuming you trust the referer field 
sufficiently). That removes all the load from the webserver.
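
A post-processing pass over such a log might be sketched like this (the 
log format here is invented for illustration; adjust the split to your 
actual LogFormat):

```perl
#!/usr/bin/perl
# Sketch: reduce a log with an added username field to one line per
# unique (user, from-page, to-page) connection per day.
use strict;
use warnings;

my %seen;
while (my $line = <>) {
    # assumed format: date user referer page, e.g.
    #   2005-02-04 jane /wiki/dinosaur /wiki/bird
    my ($date, $user, $referer, $page) = split ' ', $line;
    next unless defined $page;
    print "$date $user $referer -> $page\n"
        unless $seen{"$date\0$user\0$referer\0$page"}++;
}
```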

> My initial thoughts on how to improve the system were to relieve
> mod_perl of having to serve the files, and instead write a perl script
> that would run daily to analyze the day's thttpd log files, and then
> update the database. However, certain factors (including the need to
> store user data in cookies, which have to be checked against MySQL)
> make this impossible.

Why does storing user data in cookies prevent you from logging enough to 
identify the user again later? Or are you storing something you need to 
reconstruct the trace that you can't get otherwise?

-- 
"Debugging is twice as hard as writing the code in the first place.
 Therefore, if you write the code as cleverly as possible, you are,
 by definition, not smart enough to debug it."
- Brian W. Kernighan


Re: Intercepting with data for mod_dav

2005-02-04 Thread Stefan Sonnenberg-Carstens
Jeff Finn schrieb:
I've been doing this with mod_perl 2.. here are the relevant parts of my
config:

 Alias /dav_files /home/users
 #
 # hook into the other phases for
 #
 <Location /dav_files>
    PerlOutputFilterHandler MyEncrypt::output
    PerlInputFilterHandler  MyEncrypt::input
    PerlSetOutputFilter DEFLATE
    #
    # Request Phase Handlers
    #
    PerlAuthenHandler MyAuthenticate
    AuthType basic
    AuthName "xxx"
    Require valid-user
    Satisfy all
 </Location>
 #
 # Actual access to the files will be already authenticated
 #
 <Directory /home/users>
    AllowOverride None
    DAV on
 </Directory>
==
Hope this helps.
 

That looks like my config ;-)
The point is, must I take care of the DAV-specific things?
For example, I wrote a sub named input (whoww!)
which did nothing more than consuming the input files on a
PUT request; it should return to the upper/lower layers
afterwards. But the apache process which handled my request hung.

sub input {
    my $req = shift;
    if ($req->method eq "PUT") {
        # maybe nothing here allowed ?
    }
    return Apache::OK;
}
1;

caused that effect. Am I thinking wrong?
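
One guess at what is happening (an assumption, not a diagnosis): an 
input filter receives the filter object, not the request, and if it 
neither reads nor passes the incoming data along, the downstream DAV 
handler has nothing to consume, which would look exactly like a hang. 
A pass-through sketch, using the streaming filter API:

```perl
# Sketch of a pass-through input filter; a real one would transform
# $buffer (e.g. decrypt it) before passing it on.
sub input {
    my $f = shift;                      # the filter, not the request
    while ($f->read(my $buffer, 1024)) {
        $f->print($buffer);             # pass the data down to mod_dav
    }
    return Apache::OK;
}
```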
To the list:
My encryption is proprietary, based on the PW the user sends... anyone know a
tested symmetric streaming (not block) encryption algorithm?
 

Mh, for streams with a token size of 32-64 bits, you could try
http://www.simonshepherd.supanet.com/tea.htm
It's easy to implement in perl, as the C source is simple.
I use the C source to encrypt 64-bit tokens, which would
be 8 bytes from a stream.
Perhaps you can then slot that in between your stream layers.
Jeff
-Original Message-
From: Stefan Sonnenberg-Carstens [mailto:[EMAIL PROTECTED]
Sent: Friday, February 04, 2005 9:11 AM
To: modperl@perl.apache.org
Subject: Intercepting with data for mod_dav
Hi list,
I'm struggling a bit with the following:
I set up a mod_dav DAV server, which works fine.
One thing I *must* accomplish is to write the
uploaded files encrypted in some way to the disk,
and publish them back unencrypted.
That should be perfectly possible with
apache's filters.
The problem seems to be that mod_perl doesn't see
anything if DAV is set to on for a specific dir?
Is that true?
What I need is a small (very small) hint on how to get at the data
that the PUT and GET requests offer, or whether this
is possible at all.
Thx in advance,
Stefan Sonnenberg
 



RE: Intercepting with data for mod_dav

2005-02-04 Thread Jeff Finn
I've been doing this with mod_perl 2.. here are the relevant parts of my
config:

  Alias /dav_files /home/users

  #
  # hook into the other phases for
  #
  <Location /dav_files>
    PerlOutputFilterHandler MyEncrypt::output
    PerlInputFilterHandler  MyEncrypt::input

    PerlSetOutputFilter DEFLATE

    #
    # Request Phase Handlers
    #
    PerlAuthenHandler MyAuthenticate

    AuthType basic
    AuthName "xxx"
    Require valid-user
    Satisfy all
  </Location>

  #
  # Actual access to the files will be already authenticated
  #
  <Directory /home/users>
    AllowOverride None
    DAV on
  </Directory>
==

Hope this helps.

To the list:
My encryption is proprietary, based on the PW the user sends... anyone know a
tested symmetric streaming (not block) encryption algorithm?

Jeff

-Original Message-
From: Stefan Sonnenberg-Carstens [mailto:[EMAIL PROTECTED]
Sent: Friday, February 04, 2005 9:11 AM
To: modperl@perl.apache.org
Subject: Intercepting with data for mod_dav


Hi list,
I'm struggling a bit with the following:
I set up a mod_dav DAV server, which works fine.
One thing I *must* accomplish is to write the
uploaded files encrypted in some way to the disk,
and publish them back unencrypted.
That should be perfectly possible with
apache's filters.
The problem seems to be that mod_perl doesn't see
anything if DAV is set to on for a specific dir?
Is that true?
What I need is a small (very small) hint on how to get at the data
that the PUT and GET requests offer, or whether this
is possible at all.

Thx in advance,

Stefan Sonnenberg



Re: setting environment variables

2005-02-04 Thread colin_e
Yes, I think it's more complicated.
I don't have the original setup that caused my problem, but I'm pretty 
sure I found that if I set a mixed-case env var (say 'MyEnv_Var') with 
SetEnv, in my mod_perl app I got the variable set (exists == true) but 
with no value, whereas using PerlSetEnv with the same variable name, I 
got the value in %ENV but the var name was uppercased.

At the moment I have just worked around the problem by using SetEnv but 
with an all-uppercase variable name.

Regards; Colin
Randy Kobes wrote:
On Wed, 2 Feb 2005, Stas Bekman wrote:
Randy Kobes wrote:
[...]
So the behaviour of SetEnv changed from Apache-1 to
Apache-2, as far as Win32 case goes, while PerlSetEnv
maintained the same behaviour from mp1 to mp2.
I suppose one could argue that we should change
PerlSetEnv under mp2 to lower-case things, so as
to be consistent with SetEnv?
I think yes. I'm sure you have a patch already :)

Actually, things are a bit more complicated on mp2 than I
thought ... The example I gave earlier had 2
SetEnv/PerlSetEnv directives, differing in case, which is a
bit artificial. If there's just one such directive, then
both SetEnv/PerlSetEnv seem to behave normally (taking into
account that, on Windows, $ENV{FOO} and $ENV{foo} are the
same). However, there does seem to be a problem (with
SetEnv) when it's all lower-case:
  SetEnv foo bar
in that $ENV{foo} doesn't seem to get set (irrespective of
the case of "foo"). There's still a difference between
PerlSetEnv and SetEnv, but I don't see the pattern yet;
I'll keep looking.
 




Re: AW: Logging user's movements

2005-02-04 Thread James Smith
On Fri, 4 Feb 2005, Denis Banovic wrote:

> Hi Leo,
>
> I have a very similar app running in mod_perl with about 1/2 million hits a day. 
> I need to do some optimisation, so I'm just interested in which of the 
> optimisations you are using brought you the best improvements.
> Was it preloading modules in the startup.pl or caching the 1x1 gif image, or 
> maybe optimising the database cache ( I'm using mysql ).
> I'm sure you are also having usage peaks, so it would be interesting how 
> many hits(inserts)/hour a single server machine can handle approx.
>

Simplest thing to do is hijack the referer logs, and then parse
them at the end. You just need to add a unique ID for each session
(via a cookie or in the URL) which is added to the logs [or placed
in a standard logged variable]

Then write a parser which tracks usage, using referer + page viewed.
If you don't want to rely on referers, then you could encrypt this in
the URL... (but watch out for search engines, which could hammer your site!!)

James

>
> Thanks
>
> Denis
>
>
>
>
>
> -Original Message-
> From: Leo Lapworth [mailto:[EMAIL PROTECTED]
> Sent: Friday, February 4, 2005 10:37
> To: ben syverson
> Cc: modperl@perl.apache.org
> Subject: Re: Logging user's movements
>
>
> H
> On 4 Feb 2005, at 08:13, ben syverson wrote:
>
> > Hello,
> >
> > I'm curious how the "pros" would approach an interesting system design
> > problem I'm facing. I'm building a system which keeps track of user's
> > movements through a collection of information (for the sake of
> > argument, a Wiki). For example, if John moves from the "dinosaur" page
> > to the "bird" page, the system logs it -- but only once a day per
> > connection between nodes per user. That is, if Jane then travels from
> > "dinosaur" to "bird," it will log it, but if "John" travels moves back
> > to "dinosaur" from "bird," it won't be logged. The result is a log of
> > every unique connection made by every user that day.
> >
> > The question is, how would you do this with the least amount of strain
> > on the server?
> >
> I think the standard approach for user tracking is a 1x1 gif, there are
> lots of ways of doing it, here are 2:
>
> Javascript + Logs - update tracking when logs are processed
> 
> -
>
> Use javascript to set a cookie (session or 24 hours) - if there isn't
> already one. Then use javascript to do a document write to the gif.
>
> so /tracker/c.gif?c=&page=dinosaur
>
> It should then be fast (no live processing) and fairly easy to extract
> this information from the logs and into a db.
>
> Mod_perl - live db updates
> -
> Alternatively if you need live updates create a mod_perl handle that
> sits at /tracker/c.gif, processes the parameters and puts them into a
> database, then returns a gif (I do this, read the gif in and store it
> as a global when the module starts so it just stays in memory). It's
> fast and means you can still get the benefits of caching with squid or
> what ever.
>
> I get about half a million hits a day to my gif.
>
> I think the main point is you should separate it from your main content
> handler if you want it to be flexible and still allow other levels of
> caching.
>
> Cheers
>
> Leo
>
>
> 
> Virus checked by G DATA AntiVirusKit
> Version: AVK 15.0.2702 from 26.01.2005
> Virus news: www.antiviruslab.com
>
>


Intercepting with data for mod_dav

2005-02-04 Thread Stefan Sonnenberg-Carstens
Hi list,
I'm struggeling a bit with the following :
I set a mod_dav DAV server, which works fine.
One thing I *must* accomplish, is to write the
uploaded files encrypted in some way to the disk,
and publish them back unencrypted.
That should be perfectly possible with
apache's filters.
The problem seems to be, that mod_perl doesn't see
anything if dav is set to on on a specific dir ?
Is that true ?
What I need is small (very small) hint, how to get the data
that the PUT and GET requests offer, or if this
is possible at all.
Thx in advance,
Stefan Sonnenberg


AW: Logging user's movements

2005-02-04 Thread Denis Banovic
Hi Leo,

I have a very similar app running in mod_perl with about 1/2 million hits a 
day. I need to do some optimisation, so I'm just interested in which of the 
optimisations you are using brought you the best improvements.
Was it preloading modules in the startup.pl or caching the 1x1 gif image, or 
maybe optimising the database cache ( I'm using mysql ).
I'm sure you are also having usage peaks, so it would be interesting how many 
hits(inserts)/hour a single server machine can handle approx.


Thanks

Denis





-Original Message-
From: Leo Lapworth [mailto:[EMAIL PROTECTED] 
Sent: Friday, February 4, 2005 10:37
To: ben syverson
Cc: modperl@perl.apache.org
Subject: Re: Logging user's movements


H
On 4 Feb 2005, at 08:13, ben syverson wrote:

> Hello,
>
> I'm curious how the "pros" would approach an interesting system design  
> problem I'm facing. I'm building a system which keeps track of user's  
> movements through a collection of information (for the sake of  
> argument, a Wiki). For example, if John moves from the "dinosaur" page  
> to the "bird" page, the system logs it -- but only once a day per  
> connection between nodes per user. That is, if Jane then travels from  
> "dinosaur" to "bird," it will log it, but if "John" travels moves back  
> to "dinosaur" from "bird," it won't be logged. The result is a log of  
> every unique connection made by every user that day.
>
> The question is, how would you do this with the least amount of strain  
> on the server?
>
I think the standard approach for user tracking is a 1x1 gif, there are  
lots of ways of doing it, here are 2:

Javascript + Logs - update tracking when logs are processed
 
-

Use javascript to set a cookie (session or 24 hours) - if there isn't  
already one. Then use javascript to do a document write to the gif.

so /tracker/c.gif?c=&page=dinosaur

It should then be fast (no live processing) and fairly easy to extract  
this information from the logs and into a db.

Mod_perl - live db updates
-
Alternatively if you need live updates create a mod_perl handle that  
sits at /tracker/c.gif, processes the parameters and puts them into a  
database, then returns a gif (I do this, read the gif in and store it  
as a global when the module starts so it just stays in memory). It's  
fast and means you can still get the benefits of caching with squid or  
what ever.

I get about half a million hits a day to my gif.

I think the main point is you should separate it from your main content  
handler if you want it to be flexible and still allow other levels of  
caching.

Cheers

Leo






[JOB] Perl/PHP Web application development

2005-02-04 Thread Denis Banovic
Hi!


We are searching for a developer with good programming skills in (mod_)Perl / PHP for a full-time job in Salzburg, Austria. You should also have working experience with Linux and MySQL.

We are the biggest internet agency in western Austria. If you are interested, we can help you find a place to stay.


Send your application to mailto:[EMAIL PROTECTED]. 

Please send us your resume, examples of work that you have done,  and anything else that will describe you.



Looking forward to seeing your application,


Denis Banovic



"THINK THE WEB WAY."

---

    NCM - NET COMMUNICATION MANAGEMENT GmbH

---[  Denis Banovic - CTO

    mailto:[EMAIL PROTECTED]

---[  Mühlstrasse 4a

  AT - 5023 Salzburg

  Tel. 0662 / 644 688

---[  Fax: 0662 / 644 688 - 88 

  http://www.ncm.at

---






Re: Logging user's movements

2005-02-04 Thread Leo Lapworth
H
On 4 Feb 2005, at 08:13, ben syverson wrote:
Hello,
I'm curious how the "pros" would approach an interesting system design  
problem I'm facing. I'm building a system which keeps track of user's  
movements through a collection of information (for the sake of  
argument, a Wiki). For example, if John moves from the "dinosaur" page  
to the "bird" page, the system logs it -- but only once a day per  
connection between nodes per user. That is, if Jane then travels from  
"dinosaur" to "bird," it will log it, but if "John" travels moves back  
to "dinosaur" from "bird," it won't be logged. The result is a log of  
every unique connection made by every user that day.

The question is, how would you do this with the least amount of strain  
on the server?

I think the standard approach for user tracking is a 1x1 gif, there are  
lots of ways of doing it, here are 2:

Javascript + Logs - update tracking when logs are processed
 
-

Use javascript to set a cookie (session or 24 hours) - if there isn't  
already one. Then use javascript to do a document write to the gif.

so /tracker/c.gif?c=&page=dinosaur
It should then be fast (no live processing) and fairly easy to extract  
this information from the logs and into a db.

Mod_perl - live db updates
-
Alternatively if you need live updates create a mod_perl handle that  
sits at /tracker/c.gif, processes the parameters and puts them into a  
database, then returns a gif (I do this, read the gif in and store it  
as a global when the module starts so it just stays in memory). It's  
fast and means you can still get the benefits of caching with squid or  
what ever.

I get about half a million hits a day to my gif.
I think the main point is you should separate it from your main content  
handler if you want it to be flexible and still allow other levels of  
caching.
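
A minimal version of the gif-serving handler described above might look 
like this (a sketch: the package name, gif path, and the log_hit() 
helper are all invented):

```perl
package My::Tracker;    # hypothetical package name
use strict;
use Apache::Constants qw(OK);

# Read the gif once at module load and keep it in memory.
my $gif;
BEGIN {
    local $/;
    open my $fh, '<', '/path/to/c.gif' or die "can't read gif: $!";
    binmode $fh;
    $gif = <$fh>;
    close $fh;
}

sub handler {
    my $r = shift;
    my %args = $r->args;
    log_hit($args{c}, $args{page});    # hypothetical db-insert helper
    $r->content_type('image/gif');
    $r->send_http_header;
    $r->print($gif);
    return OK;
}
1;
```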

Cheers
Leo


Re: [mp2] threaded applications inside of mod_perl

2005-02-04 Thread bob-modperl
On Thu, 3 Feb 2005, Stas Bekman wrote:
where is the modperl configuration? As far as you've shown there is no 
mod_perl involved in serving any requests. (Hint: show us the 
Directory/Location/etc container responsible for the request that triggered 
the segfault)

  SetHandler perl-script
  PerlResponseHandler ModPerl::Registry
  PerlOptions +ParseHeaders +GlobalRequest
  Options ExecCGI


I have changed the startup.pl file to remove as many variables as 
possible.  It still segfaults with this minimal configuration:

#!/usr/bin/perl
1;

Please reread my original reply. I still have no idea how you've invoked the 
script. i.e. show us the URL that you've called. and above I've asked you for 
the relevant config section.

I invoke the script by requesting
http://localhost:8080/apps/test
Resulting in a closed connection and
[Thu Feb 03 21:25:19 2005] [notice] child pid 7393 exit signal 
Segmentation fault (11)


Is a script using threads beneath the mod_perl interpreter expected to 
work, or is this a dark corner of mod_perl best left untouched?

#!/usr/bin/perl
use strict;
require threads;
my $thread = threads->create(sub { print "I am a thread"},undef);

Thank you for your time.


ModPerl Installation help

2005-02-04 Thread steve silvers
I just installed Fedora Core 3, everything. The default Perl install is 
5.8.5, and I'm not sure about Apache, as httpd -v does not display the version. 
My question is: how do I now install mod_perl and get it working? Do I have to 
download another version of Perl and Apache to rebuild? I have never used 
mod_perl before, let alone installed it. Could someone please point me in the 
right direction? I really want to learn this.

Thank you
Steve




Logging user's movements

2005-02-04 Thread ben syverson
Hello,
I'm curious how the "pros" would approach an interesting system design 
problem I'm facing. I'm building a system which keeps track of users' 
movements through a collection of information (for the sake of 
argument, a Wiki). For example, if John moves from the "dinosaur" page 
to the "bird" page, the system logs it -- but only once a day per 
connection between nodes per user. That is, if Jane then travels from 
"dinosaur" to "bird," it will log it, but if "John" travels back 
to "dinosaur" from "bird," it won't be logged. The result is a log of 
every unique connection made by every user that day.

The question is, how would you do this with the least amount of strain 
on the server?

Currently, I'm using Squid to switch between thttpd (for non-"Wiki" 
files) and mod_perl, with the metadata in MySQL and the text data in 
flatfiles (don't worry, everything's write-once). The code I'm using to 
generate the "Wiki" pages is fairly fast as I'm testing it, but it's 
not clear (and impossible to test) how well it will scale as more nodes 
and users are added. As a defensive measure, I'm caching the HTML 
output of the mod_perl handler, but the cached files aren't being 
served by thttpd, because the handler still needs to register where 
people are going. So every time a page is requested, the handler checks 
whether this user has made this connection in the past 24 hours, logs 
it if not, and then either serves the cached file or generates a new 
one (they go out of date sporadically).

My initial thoughts on how to improve the system were to relieve 
mod_perl of having to serve the files, and instead write a perl script 
that would run daily to analyze the day's thttpd log files, and then 
update the database. However, certain factors (including the need to 
store user data in cookies, which have to be checked against MySQL) 
make this impossible.

Am I on the right track with this?
- ben