Re: Ways to scale a mod_perl site

2009-09-16 Thread Michael Peters

On 09/16/2009 11:49 AM, Igor Chudov wrote:


1) Use a load balancer like perlbal (I am already doing that)


A load balancer is good but so are proxies. If you can separate your 
application server from the server that serves static content then 
you'll get a boost even if they are on the same machine.



2) Separate a MySQL database server from webservers.


This is probably the first and easiest thing you should do.


3) Being enabled by item 2, add more webservers and balancers
4) Create a separate database for cookie data (Apache::Session objects)
??? -- not sure if good idea --


I've never seen the need to do that. In fact, I would suggest you drop 
sessions altogether if you can. If you need any per-session information 
then put it in a cookie. If you need this information to be tamper-proof 
then you can create a hash of the cookie's data that you store as part 
of the cookie. If you can reduce the # of times that each request needs 
to actually hit the database you'll have big wins.



5) Use a separate database handle for readonly database requests
(SELECT), as opposed to INSERTS and UPDATEs. Use replication to access
multiple slave servers for read only data, and only access the master
for INSERT and UPDATE and DELETE.


Reducing DB usage is more important than this. Also, before you go down 
that road you should look at adding a caching layer to your application 
(memcached is a popular choice).


--
Michael Peters
Plus Three, LP


Re: Ways to scale a mod_perl site

2009-09-16 Thread Igor Chudov
On Wed, Sep 16, 2009 at 11:05 AM, Michael Peters wrote:

> On 09/16/2009 11:49 AM, Igor Chudov wrote:
>
>  1) Use a load balancer like perlbal (I am already doing that)
>>
>
> A load balancer is good but so are proxies. If you can separate your
> application server from the server that serves static content then you'll
> get a boost even if they are on the same machine.
>

I have very little static content. Even images are generated. My site
generates images of math formulae such as (x-1)/(x+1) on the fly.


>
>  2) Separate a MySQL database server from webservers.
>>
>
> This is probably the first and easiest thing you should do.
>
>
 agreed

 3) Being enabled by item 2, add more webservers and balancers
>> 4) Create a separate database for cookie data (Apache::Session objects)
>> ??? -- not sure if good idea --
>>
>
> I've never seen the need to do that. In fact, I would suggest you drop
> sessions altogether if you can. If you need any per-session information then
> put it in a cookie. If you need this information to be tamper-proof then you
> can create a hash of the cookie's data that you store as part of the cookie.
> If you can reduce the # of times that each request needs to actually hit the
> database you'll have big wins.
>

I use sessions to keep users logged on. So the cookie is just an ID, and the
sessions table stores data such as authenticated userid.

I will double check, however, whether I give people sessions even if they
are not logged in.

Or maybe I can give them a cookie that will say "I am not logged in, do not
bother looking up my session".

Hm


>  5) Use a separate database handle for readonly database requests
>> (SELECT), as opposed to INSERTS and UPDATEs. Use replication to access
>> multiple slave servers for read only data, and only access the master
>> for INSERT and UPDATE and DELETE.
>>
>
> Reducing DB usage is more important than this. Also, before you go down
> that road you should look at adding a caching layer to your application
> (memcached is a popular choice).
>
>
It is not going to be that helpful due to dynamic content (which is my
site's advantage). But this may be useful for other applications.

i


Re: Ways to scale a mod_perl site

2009-09-16 Thread Igor Chudov
On Wed, Sep 16, 2009 at 11:15 AM, C. J. L. wrote:

>
> I would buy a fast server with 4 or more CPU cores and SSD or SAS
> drives, and run the backend db on a dedicated mysql instance.
>


By the way, guys, the performance difference between a regular SATA drive
and a fast SAS drive is comparatively small.

The difference between a SAS drive and an SSD drive is tremendous.

i


Re: Ways to scale a mod_perl site

2009-09-16 Thread Adam Prime

Igor Chudov wrote:



On Wed, Sep 16, 2009 at 11:05 AM, Michael Peters wrote:


On 09/16/2009 11:49 AM, Igor Chudov wrote:

1) Use a load balancer like perlbal (I am already doing that)


A load balancer is good but so are proxies. If you can separate your
application server from the server that serves static content then
you'll get a boost even if they are on the same machine.


I have very little static content. Even images are generated. My site 
generates images of math formulae such as (x-1)/(x+1) on the fly.


I can understand generating them on the fly for flexibility reasons, but 
I'd cache them, and serve them statically after that, rather than 
regenerate the images on every single request.  You can accomplish that 
in the app itself, or just by throwing a caching proxy in front of it 
(maybe you're already doing this with perlbal)


Adam


Re: Ways to scale a mod_perl site

2009-09-16 Thread Igor Chudov
On Wed, Sep 16, 2009 at 11:48 AM, Adam Prime  wrote:

> Igor Chudov wrote
>>
>>
>> I have very little static content. Even images are generated. My site
>> generates images of math formulae such as (x-1)/(x+1) on the fly.
>>
>
> I can understand generating them on the fly for flexibility reasons, but
> I'd cache them, and serve them statically after that, rather than regenerate
> the images on every single request.  You can accomplish that in the app
> itself, or just by throwing a caching proxy in front of it (maybe you're
> already doing this with perlbal)
>
>
I actually do cache generated pictures; I store them in a database table
called 'bincache'. This way I do not have to compute and draw every image on
the fly. If I have a picture in bincache, I serve it, and if I do not, I
generate it and save it. That saves some CPU, but makes mysql work harder.

i


Re: Ways to scale a mod_perl site

2009-09-16 Thread Michael Peters

On 09/16/2009 12:13 PM, Brad Van Sickle wrote:


Can I get you to explain this a little more? I don't see how this could
be used for truly secure sites because I don't quite understand how
storing a hash in a plain text cookie would be secure.


If you need to store per-session data about a client that the client 
shouldn't be able to see, then you just encrypt that data, base-64 
encode it and then put it into a cookie.


If you don't care if the user sees that information you just want to 
make sure that they don't change it then add an extra secure hash of 
that information to the cookie itself and then verify it when you 
receive it.


I like to use JSON for my cookie data because it's simple and fast, but 
any serializer should work. Something like this:


use JSON qw(to_json from_json);
use Digest::MD5 qw(md5_hex);
use MIME::Base64::URLSafe qw(urlsafe_b64encode urlsafe_b64decode);

# to generate the cookie
my %data = ( foo => 1, bar => 2, baz => 'frob' );
$data{secure} = generate_data_hash(\%data);
my $cookie = urlsafe_b64encode(to_json(\%data));
print "Cookie: $cookie\n";

# to process/validate the cookie
my $new_data = from_json(urlsafe_b64decode($cookie));
my $new_hash = delete $new_data->{secure};
if( $new_hash eq generate_data_hash($new_data) ) {
    print "Cookie is ok!\n";
} else {
    print "Cookie has been tampered with! Ignore.\n";
}

# very simple hash generation function
sub generate_data_hash {
    my $data = shift;
    my $secret = 'some configured secret';
    # sort the keys so generation and validation always see the same
    # order, even in different processes with different hash seeds
    return md5_hex($secret . join('|', map { "$_ - $data->{$_}" } sort keys %$data));
}
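
For the "shouldn't be able to see" case above, a minimal sketch of just the
encryption step (an assumption on my part: Crypt::CBC with Crypt::Blowfish
installed; the key and cipher choice are illustrative):

use Crypt::CBC;

my $cipher = Crypt::CBC->new(
    -key    => 'another configured secret',   # illustrative key
    -cipher => 'Blowfish',
);

# encrypt before base-64 encoding, decrypt after decoding
my $secret_cookie = urlsafe_b64encode($cipher->encrypt(to_json(\%data)));
my $plain_data    = from_json($cipher->decrypt(urlsafe_b64decode($secret_cookie)));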

Doing encryption and encoding on small bits of data (like cookies) in 
memory will almost always be faster than having to hit the database 
(especially if it's on another machine). But the biggest reason is that 
it takes the load off the DB and puts it on the web machines which are 
much easier to scale linearly.


> I know a lot of true app servers (WebSphere, etc.) store
> this data in cached memory,


You could do the same with your session data, or even store it in a 
shared resource like a BDB file. But unless it's available to all of 
your web servers you're stuck with "sticky" sessions and that's a real 
killer for performance/scalability.


--
Michael Peters
Plus Three, LP


Re: Ways to scale a mod_perl site

2009-09-16 Thread Michael Peters

On 09/16/2009 12:48 PM, Adam Prime wrote:


I have very little static content. Even images are generated. My site
generates images of math formulae such as (x-1)/(x+1) on the fly.


I can understand generating them on the fly for flexibility reasons, but
I'd cache them, and serve them statically after that, rather than
regenerate the images on every single request.


Definitely good advice. Especially if your images are generated the same 
each time and never change. For instance, I don't think the image 
generated by the formula "(x-1)/(x+1)" would ever change (unless you 
changed your application code, and in that case you can clear your cache).


--
Michael Peters
Plus Three, LP


Re: Ways to scale a mod_perl site

2009-09-16 Thread Michael Peters

On 09/16/2009 01:02 PM, Igor Chudov wrote:


I actually do cache generated pictures, I store them in a database table
called 'bincache'. This way I do not have to compute and draw every
image on the fly. If I have a picture in bincache, I serve it, and if I
do not, I generate it and save it. That saves some CPU, but makes mysql
work harder.


Then don't put it in your database. A cache is not a permanent store and 
its usage patterns will be different from a database's. I'd either use a 
real cache like memcached or have your proxies cache them. In addition 
to that you can send the appropriate HTTP cache headers so that browsers 
themselves will never request that image again. Make the client machine 
do the caching.
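
For example, a minimal mod_perl 2 sketch of the header part (the handler and
generate_or_fetch_image() are illustrative, not your actual code):

use Apache2::RequestRec ();
use Apache2::RequestIO ();
use Apache2::Const -compile => qw(OK);

sub handler {
    my $r   = shift;
    my $img = generate_or_fetch_image($r);   # hypothetical: cache hit or render
    $r->content_type('image/gif');
    # let browsers and proxies keep the image for a year
    $r->headers_out->set('Cache-Control' => 'public, max-age=31536000');
    $r->print($img);
    return Apache2::Const::OK;
}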


--
Michael Peters
Plus Three, LP


Re: Ways to scale a mod_perl site

2009-09-16 Thread Douglas Sims
I'm curious... what is the hardware like on the one server?  How many CPUs
and RAM?

Also, a few thoughts...

- You do a 301 from algebra.com to www.algebra.com.  That doesn't take much
work from the server, but why not just serve up everything from the original
location?

- The algebra problem I just tried returned twelve separate images.  What
if, instead of serving gifs you displayed each stage of transformation of
the equation using HTML and CSS?  That would be rather tricky with things
like root signs but I think it could be done - though a bit of work.

I wish this site had been around when I was in high school.



On Wed, Sep 16, 2009 at 11:48 AM, Adam Prime  wrote:

> Igor Chudov wrote:
>
>>
>>
>> On Wed, Sep 16, 2009 at 11:05 AM, Michael Peters wrote:
>>
>>On 09/16/2009 11:49 AM, Igor Chudov wrote:
>>
>>1) Use a load balancer like perlbal (I am already doing that)
>>
>>
>>A load balancer is good but so are proxies. If you can separate your
>>application server from the server that serves static content then
>>you'll get a boost even if they are on the same machine.
>>
>>
>> I have very little static content. Even images are generated. My site
>> generates images of math formulae such as (x-1)/(x+1) on the fly.
>>
>
> I can understand generating them on the fly for flexibility reasons, but
> I'd cache them, and serve them statically after that, rather than regenerate
> the images on every single request.  You can accomplish that in the app
> itself, or just by throwing a caching proxy in front of it (maybe you're
> already doing this with perlbal)
>
> Adam
>


Re: Ways to scale a mod_perl site

2009-09-16 Thread Igor Chudov
Guys, I completely love this discussion about cookies. You have really
enlightened me.

I think that letting users store cookie info in a manner that is secure
(involves both encryption and some form of authentication), instead of
storing them in a table, could possibly result in a very substantial
reduction of database use.

The cookie is

1) The encrypted string that I want, and
2) An MD5 of that string with a secret code appended that the users do not
know, which serves as a form of signature.

That should work. I will not change it now, but will do if I get 2x more
traffic.

That way I would need zero hits to the database to handle my users sessions.


(I only retrieve account information when necessary)

As far as I remember now, I do not store much more information in a session
beyond username. (I hope that I am not wrong). So it should be easy.

Even now, I make sure that I reset the cookie table only every several
months. This way I would let users stay logged on forever.

Thanks a lot.

Igor


Re: Ways to scale a mod_perl site

2009-09-16 Thread Igor Chudov
On Wed, Sep 16, 2009 at 12:21 PM, Douglas Sims  wrote:

> I'm curious... what is the hardware like on the one server?  How many CPUs
> and RAM?
>
>
AMD Athlon quad core, running 32 bit Ubuntu Hardy. 16 GB of RAM. Algebra.Com
data is stored on an SSD.


> Also, a few thoughts...
>
> - You do a 301 from algebra.com to www.algebra.com.  That doesn't take
> much work from the server, but why not just serve up everything from the
> original location?
>
>
Then I would end up serving all of algebra.com twice to the search engines,
once under each hostname.


> - The algebra problem I just tried returned twelve separate images.  What
> if, instead of serving gifs you displayed each stage of transformation of
> the equation using HTML and CSS?  That would be rather tricky with things
> like root signs but I think it could be done - though a bit of work.
>
>
I rather like the way I do it: I let my site render images exactly how I
want, as opposed to letting browsers do it.



> I wish this site had been around when I was in high school.
>
>
>
thanks. I have some real math addicts on my site, who solved many thousands
of problems and helped hundreds of kids. I am glad to serve them.

i



>
> On Wed, Sep 16, 2009 at 11:48 AM, Adam Prime wrote:
>
>> Igor Chudov wrote:
>>
>>>
>>>
>>> On Wed, Sep 16, 2009 at 11:05 AM, Michael Peters wrote:
>>>
>>>On 09/16/2009 11:49 AM, Igor Chudov wrote:
>>>
>>>1) Use a load balancer like perlbal (I am already doing that)
>>>
>>>
>>>A load balancer is good but so are proxies. If you can separate your
>>>application server from the server that serves static content then
>>>you'll get a boost even if they are on the same machine.
>>>
>>>
>>> I have very little static content. Even images are generated. My site
>>> generates images of math formulae such as (x-1)/(x+1) on the fly.
>>>
>>
>> I can understand generating them on the fly for flexibility reasons, but
>> I'd cache them, and serve them statically after that, rather than regenerate
>> the images on every single request.  You can accomplish that in the app
>> itself, or just by throwing a caching proxy in front of it (maybe you're
>> already doing this with perlbal)
>>
>> Adam
>>
>
>


Re: Ways to scale a mod_perl site

2009-09-16 Thread Perrin Harkins
On Wed, Sep 16, 2009 at 11:49 AM, Igor Chudov  wrote:
> Any thoughts?

In addition to the good advice you're getting on the thread, here are
some books you might find useful:

- Practical mod_perl -- http://modperlbook.org/ -- is old, but has a
lot of general architecture and tuning advice that really hasn't
changed much since then.

- High-Performance MySQL, the best book available on MySQL tuning.

- Building Scalable Websites, which is about PHP sites, but has good
food for thought.

- Scalable Internet Architectures, a book that is more about general
principles to apply to the problem.

And, the most important piece of advice: Devel::NYTProf.

Happy tuning,
Perrin


Re: Ways to scale a mod_perl site

2009-09-16 Thread Igor Chudov
Perrin, thanks a lot. I bought all the books recommended below. They should
be a good read.

I want to be ready when the need arises, and I do not want to do anything
stupid in the meantime that would make me not scalable.

Again, thank you.

Igor

On Wed, Sep 16, 2009 at 1:12 PM, Perrin Harkins  wrote:

> On Wed, Sep 16, 2009 at 11:49 AM, Igor Chudov  wrote:
> > Any thoughts?
>
> In addition to the good advice you're getting on the thread, here are
> some books you might find useful:
>
> - Practical mod_perl -- http://modperlbook.org/ -- is old, but has a
> lot of general architecture and tuning advice that really hasn't
> changed much since then.
>
> - High-Performance MySQL, the best book available on MySQL tuning.
>
> - Building Scalable Websites, which is about PHP sites, but has good
> food for thought.
>
> - Scalable Internet Architectures, a book that is more about general
> principles to apply to the problem.
>
> And, the most important piece of advice: Devel::NYTProf.
>
> Happy tuning,
> Perrin
>


Re: Ways to scale a mod_perl site

2009-09-16 Thread Scott Gifford
Igor Chudov  writes:

> My algebra.com server serves about 77k pageviews and a little over a million
> object requests per day (with half of it being served in just 4 hours). I
> peak out at 35 requests per second currently.

Some high-level advice: Profile everything you can to see where your
bottlenecks are.  If you don't have bottlenecks, simulate enough load
that you do.  I am frequently surprised by what turn out to be the
slow parts of my code.

At a high-level, you can use tools like top, vmstat, iostat, iotop,
etc. to check whether it's CPU, memory, or disk space that you're
running out of first.

For CPU, you can use top to see which process is using most of your
CPU, the database, app, or something else.

Inside your app, you can use Perl's profiling tools to see which parts
of your app need to be sped up.
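
For example, with Devel::NYTProf (a sketch; assumes the profiler and its
mod_perl helper module are installed):

# profile a standalone script, then build an HTML report
perl -d:NYTProf yourscript.pl
nytprofhtml

# under mod_perl, adding this to httpd.conf profiles each Apache child
PerlModule Devel::NYTProf::Apache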

Hope this is helpful!

Scott.


Re: Ways to scale a mod_perl site

2009-09-16 Thread Jeff Peng
How many servers?
We have run the systems with about 500 million PV each day, with many squid 
boxes + 200 apache webservers + 200 mysql hosts.
The applications were written with FastCGI.

-Original Message-

From: Igor Chudov 

Sent: Sep 16, 2009 11:49 AM

To: Mod_Perl 

Subject: Ways to scale a mod_perl site



My algebra.com server serves about 77k pageviews and a little over a million
object requests per day (with half of it being served in just 4 hours). I peak
out at 35 requests per second currently.


I use mod_perl, mysql, and perlbal with everything running on one server. 

The server has a solid state disk to hold mysql data. 

I believe that it can handle 3x-5x more traffic all by itself. However, I am 
thinking of ways to scale up a mod_perl installation. 


1) Use a load balancer like perlbal (I am already doing that)
2) Separate a MySQL database server from webservers. 
3) Being enabled by item 2, add more webservers and balancers
4) Create a separate database for cookie data (Apache::Session objects) ??? -- 
not sure if good idea --


(next level)

5) Use a separate database handle for readonly database requests (SELECT), as 
opposed to INSERTS and UPDATEs. Use replication to access multiple slave 
servers for read only data, and only access the master for INSERT and UPDATE 
and DELETE. 


Any thoughts?


Re: Ways to scale a mod_perl site

2009-09-17 Thread Cosimo Streppone

Jeff Peng  wrote:


How many servers?
We have run the systems with about 500 million PV each day, with many  
squid boxes + 200 apache webservers + 200 mysql hosts.

The applications were written with FastCGI.


Wow! Why don't you tell or blog a bit about this?
I would love to know more about what challenges
you went through.

Maybe someone else has also stories to tell.

At Opera Software, our team works on a social network website,
and we currently serve 2 million page views per day peak,
with something around 700,000 unique visitors per day peak,
and ~120M hits per day with under 20 servers.

Those include database servers, apache fronts,
mod_perl backends, varnish/memcached caches, upload servers,
cronjobs/mail, etc...

For the curious:
http://www.slideshare.net/cstrep/myoperacom-scalability-v20

And yes, we're looking for ways to optimize/scale better
our application, since we're growing more and more... :-)

--
Cosimo


Re: Ways to scale a mod_perl site

2009-09-17 Thread Cosimo Streppone
On 17 September 2009, at 09:43:50, Cosimo Streppone wrote:



Jeff Peng  wrote:


How many servers?
We have run the systems with about 500 million PV each day, with many  
squid boxes + 200 apache webservers + 200 mysql hosts.

The applications were written with FastCGI.


Wow! Why don't you tell or blog a bit about this?

[...]
and we currently serve [...]


Mmh, I just re-read that, and I realized it may sound
like: "Wow, look at how cool we are!".

It wasn't meant to sound like that, obviously,
but more like: here's our experience; we'd like to
compare notes with others on this list.

:-)

--
Cosimo


Re: Ways to scale a mod_perl site

2009-09-17 Thread Jeff Peng


-Original Message-
>From: Cosimo Streppone 
>Sent: Sep 17, 2009 3:43 AM
>To: Mod_perl users 
>Cc: Jeff Peng 
>Subject: Re: Ways to scale a mod_perl site
>
>Jeff Peng  wrote:
>
>> How many servers?
>> We have run the systems with about 500 million PV each day, with many  
>> squid boxes + 200 apache webservers + 200 mysql hosts.
>> The applications were written with FastCGI.
>
>Wow! Why don't you tell or blog a bit about this?
>I would love to know more about what challenges
>you went through.
>


Yup, at that time the primary pressure on performance was the database.
We used distributed MySQL servers with an Oracle index server.
Each MySQL host served 1 - 1.5 million users.
When a user logged in, the application queried Oracle, keyed on username, to
get the MySQL host id, then queried that MySQL host for whatever it needed.
The systems generated 2T of data each day (so we certainly had a large amount
of storage).

The front Apache servers with FastCGI ran under heavy load; I remember nearly
all of the 8G of memory being eaten.
Squid was useful for static resources, but for dynamic applications like CGI
there was no way to reduce the pressure except to add more machines.

Last, the applications were webmail, for the most popular provider here.
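
A minimal sketch of that lookup (table and column names are made up, and
$index_dbh / connect_to_mysql_host() stand in for the real plumbing):

sub dbh_for_user {
    my ($username) = @_;
    # the Oracle index server maps a username to its MySQL shard
    my ($host_id) = $index_dbh->selectrow_array(
        'SELECT mysql_host_id FROM user_index WHERE username = ?',
        undef, $username,
    );
    return connect_to_mysql_host($host_id);   # hypothetical per-shard DBI connect
}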


Regards,
Jeff Peng


Re: Ways to scale a mod_perl site

2009-09-17 Thread James Smith

Igor Chudov wrote:

Guys, I completely love this discussion about cookies. You have really
enlightened me.

I think that letting users store cookie info in a manner that is secure
(involves both encryption and some form of authentication), instead of
storing them in a table, could possibly result in a very substantial
reduction of database use.

  
Alternatively, store the information in a two-level cache (memcached plus
the database) with write-through - then most of the time you get the data
from memcached. You can do the same with the images...

write entry: write data to memcached; write data to the sql cache

read entry:  read data from memcached and return, OR
             read data from the sql cache, write it to memcached, and return

Should avoid most database reads! Works well for the images you create, to
minimize database accesses.
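
A minimal Perl sketch of that pattern (assumes Cache::Memcached as $memd, a
DBI handle $dbh, and a simple key/value cache table; all names illustrative):

sub cache_write {
    my ($key, $value) = @_;
    $memd->set($key, $value);
    $dbh->do('REPLACE INTO sql_cache (k, v) VALUES (?, ?)',
             undef, $key, $value);
}

sub cache_read {
    my ($key) = @_;
    my $value = $memd->get($key);
    return $value if defined $value;
    ($value) = $dbh->selectrow_array(
        'SELECT v FROM sql_cache WHERE k = ?', undef, $key);
    # repopulate memcached so the next read skips the database
    $memd->set($key, $value) if defined $value;
    return $value;
}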

The cookie is

1) Encrypted string that I want and
2) MD5 of that string with a secret code appended that the users do not
know, which serves as a form of signing

That should work. I will not change it now, but will do if I get 2x more
traffic.

That way I would need zero hits to the database to handle my users sessions.


(I only retrieve account information when necessary)

As far as I remember now, I do not store much more information in a session
beyond username. (I hope that I am not wrong). So it should be easy.

Even now, I make sure that I reset the cookie table only every several
months. This way I would let users stay logged on forever.

Thanks a lot.

Igor

  





--
The Wellcome Trust Sanger Institute is operated by Genome Research 
Limited, a charity registered in England with number 1021457 and a 
company registered in England with number 2742969, whose registered 
office is 215 Euston Road, London, NW1 2BE. 


Re: Ways to scale a mod_perl site

2009-09-17 Thread Phil Van
Just curious: since you are already running FastCGI, why not serve
dynamic content directly via it? Also, you may be able to eliminate Squid.
Apache is good enough for static content (it's easy to get 5k static PV per
second per server, or 400 million per day).


Phil



On 9/17/09, Jeff Peng  wrote:
>
>
> -Original Message-
>>From: Cosimo Streppone 
>>Sent: Sep 17, 2009 3:43 AM
>>To: Mod_perl users 
>>Cc: Jeff Peng 
>>Subject: Re: Ways to scale a mod_perl site
>>
>>Jeff Peng  wrote:
>>
>>> How many servers?
>>> We have run the systems with about 500 million PV each day, with many
>>> squid boxes + 200 apache webservers + 200 mysql hosts.
>>> The applications were written with FastCGI.
>>
>>Wow! Why don't you tell or blog a bit about this?
>>I would love to know more about what challenges
>>you went through.
>>
>
>
> Yup, at that time the primary pressure against performance was database.
> We used distributed Mysql servers with an oracle index server.
> Each mysql host served 1 - 1.5 million users.
> When an user logined, the application queried oracle to get the mysql host
> id with the key of username.
> Then the application queried to mysql and got anything it wanted.
> The systems generated 2T data each day (surely we had large amount of
> store).
>
> The front apache servers with FastCGI were running heavily, I remember 8G
> memory were almost eated.
> Squid was useful for static resources, but for dynamic applications like
> CGI, no way to reduce the pressure but adding more machines.
>
> Last, the applications are webmail, the best popolar provider here.
>
>
> Regards,
> Jeff Peng
>


Re: Ways to scale a mod_perl site

2009-09-17 Thread Torsten Foertsch
On Wed 16 Sep 2009, Igor Chudov wrote:
> >> I have very little static content. Even images are generated. My
> >> site generates images of math formulae such as (x-1)/(x+1) on the
> >> fly.
> >
> > I can understand generating them on the fly for flexibility
> > reasons, but I'd cache them, and serve them statically after that,
> > rather than regenerate the images on every single request.  You can
> > accomplish that in the app itself, or just by throwing a caching
> > proxy in front of it (maybe you're already doing this with perlbal)
>
> I actually do cache generated pictures, I store them in a database
> table called 'bincache'. This way I do not have to compute and draw
> every image on the fly. If I have a picture in bincache, I serve it,
> and if I do not, I generate it and save it. That saves some CPU, but
> makes mysql work harder.

I'd go for Apache's mod_cache + mod_disk_cache. The only thing you have 
to do is to set cache control headers. Mod_cache is really fast b/c it 
skips almost all of the http request cycle. And in your case it takes 
load from the database. The request won't even hit mod_perl.
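
A minimal httpd.conf sketch of that (assumes Apache 2.2 with both modules
built; the paths are illustrative, and the app still has to send the cache
control headers for these URLs):

LoadModule cache_module modules/mod_cache.so
LoadModule disk_cache_module modules/mod_disk_cache.so
CacheEnable disk /images/
CacheRoot /var/cache/apache2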

Torsten

-- 
Need professional mod_perl support?
Just hire me: torsten.foert...@gmx.net


Re: Ways to scale a mod_perl site

2009-09-17 Thread David Nicol
On Thu, Sep 17, 2009 at 4:23 PM, Torsten Foertsch
 wrote:

> I'd go for Apache's mod_cache + mod_disk_cache. The only thing you have
> to do is to set cache control headers. Mod_cache is really fast b/c it
> skips almost all of the http request cycle. And in your case it takes
> load from the database. The request won't even hit mod_perl.
>
> Torsten

It seems like an equivalent way to do this, possibly with less
configuration, would be to generate the cacheables with file names
representing their input parameters and do the construction of the new
ones with a custom 404 handler. TMTOWTDI.
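
One way that might look in mod_perl 2 (a sketch; render_formula() is a
hypothetical renderer and error handling is omitted):

use Apache2::RequestRec ();
use Apache2::RequestUtil ();
use Apache2::RequestIO ();
use Apache2::Const -compile => qw(OK);

sub handler {                 # configured as the custom 404 handler
    my $r   = shift;
    my $gif = render_formula($r->uri);   # draw from the parameter-encoded name
    # write it under the docroot so every later request is plain static service
    my $path = $r->document_root . $r->uri;
    open my $fh, '>', $path or die "can't write $path: $!";
    binmode $fh;
    print {$fh} $gif;
    close $fh;
    $r->content_type('image/gif');
    $r->print($gif);
    return Apache2::Const::OK;
}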



-- 
"As if you could kill time without injuring eternity!"  -- Henry David Thoreau


Re: Ways to scale a mod_perl site

2009-09-17 Thread Jeff Peng


-Original Message-
>From: Phil Van 
>Sent: Sep 18, 2009 4:10 AM
>To: Jeff Peng 
>Cc: modperl-list 
>Subject: Re: Ways to scale a mod_perl site
>
>Just curious: since you are already running FastCGI, why not serving
>dynamic contents directly via it? 

We needed some reverse proxies for a CDN.
For example, our primary webservers were in ISP A, while in ISP B we put some
Squid boxes as reverse proxies to serve the users local to that ISP.


>Also, you may eliminate Squid. Using
>Apache for static content is good enough (easy to get 5k static PV per
>second per server, or 400 millions per day).
>

No, I'm sure Apache is worse than Squid at serving static content.
When I was in another department, I maintained the systems for an ad network
(like Google's ads).
All content was static; PV each day was about 200 million,
but we had fewer than 20 Squid boxes (IIRC it was 18) handling that amount
of requests.
The same number of Apache boxes couldn't have handled that case.



Regards,
Jeff Peng


Re: Ways to scale a mod_perl site

2009-09-18 Thread Tina Mueller

On Wed, 16 Sep 2009, Igor Chudov wrote:


On Wed, Sep 16, 2009 at 11:05 AM, Michael Peters wrote:


Reducing DB usage is more important than this. Also, before you go down
that road you should look at adding a caching layer to your application
(memcached is a popular choice).



It is not going to be that helpful due to dynamic content (which is my
site's advantage). But this may be useful for other applications.


That's a common misconception, I think. Even if a website is completely
dynamic you can cache things.
First misunderstanding: people think about caching a whole HTML page.
In memcached you typically cache data structures.
As an example I will take my portal software. It has a forum, blog,
guestbook, it has a list of users who are online, the forum has a
"posts from the last 24 hours", and it has other things that are
shown with every request (new private messages, notes for moderators
about new forum threads, ...)

Now, should I fetch the online users from the database with every request?
Should I fetch all the threads and authors of the last 24 hours whenever
somebody requests that page, even if nothing has changed?

First answer: the list of online users can be cached for, say, 1 minute.
Nobody will care or even notice. Only when someone logs in do you expire
the entry in memcached. All other changes are not so important that
you cannot cache them for one single minute.

Make that 1 page view per second and you save 59 database requests per
minute.
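
In code, that's roughly (a sketch; assumes Cache::Memcached and a
hypothetical fetch_online_users_from_db()):

my $memd = Cache::Memcached->new({ servers => ['127.0.0.1:11211'] });

sub online_users {
    my $users = $memd->get('online_users');
    return $users if $users;
    $users = fetch_online_users_from_db();    # the expensive query
    $memd->set('online_users', $users, 60);   # cache for one minute
    return $users;
}

# on login, drop the entry so the list refreshes immediately
$memd->delete('online_users');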


Second: The list of the recent posts can be cached, let's say for 3 minutes.
The entry in memcached is only expired explicitly when somebody posts a new
thread/reply or a title is changed etc.

I believe *every* application has things like that which you can cache.
I know of a website that reduced its load dramatically by using
memcached. It's quite a big website (300 million page views per month,
mostly requests from German-speaking countries).
But some people are reluctant to use memcached. One person said to me,
"what if storing the data in memcached is more work than fetching it
from the database every time?" I don't know what to say. Try it out.
I know the example of the company that uses memcached.
Did you know Facebook uses memcached very, very extensively?
If you're not sure: analyse your website usage - what kind of data
is fetched, and how often. Make a testcase, use memcached for it, and
see what's faster.


Another thing you could do is: separate your database schema.
Some tables do not connect to others. For example, my portal software
is modular, so that you can activate/deactivate certain modules.
The easiest thing was to just create one (DBIx::Class) schema per
module. Of course, what connects all these is the user schema, and
because I cannot do joins to the user tables any more I might have
an extra request here and there. But I can separate all these schemas
and put each of them on its own database server. With every request,
you only need a part of all the schemas.
Typically the highest load is on the database, so splitting the db across
several servers like this might be an option.

And last but not least: for searching the database, use a search engine.
KinoSearch works quite well, and there are also other search engines for perl.

regards,
tina


Re: Ways to scale a mod_perl site

2009-09-18 Thread Brad Van Sickle





3) Being enabled by item 2, add more webservers and balancers
4) Create a separate database for cookie data (Apache::Session objects)
??? -- not sure if good idea --


I've never seen the need to do that. In fact, I would suggest you drop 
sessions altogether if you can. If you need any per-session 
information then put it in a cookie. If you need this information to 
be tamper-proof then you can create a hash of the cookie's data that 
you store as part of the cookie. If you can reduce the # of times that 
each request needs to actually hit the database you'll have big wins.





Can I get you to explain this a little more?  I don't see how this could 
be used for truly secure sites because I don't quite understand how 
storing a hash in a plain text cookie would be secure. 

The thing I hate most about my "secure" applications is the fact that I 
have to read the DB at the start of every request to ensure that the 
session cookie is valid and to extract information about the user from 
the session table using the session ID stored in the cookie.  Hitting 
the DB is the quickest way to kill performance and scalability in my 
experience. I know a lot of true app servers (WebSphere, etc.) store 
this data in cached memory, but I was unaware that there might be 
an option for doing this without using a DB with mod_perl.


Re: Ways to scale a mod_perl site

2009-09-18 Thread Jeff Peng


-Original Message-
>From: Brad Van Sickle 
>Sent: Sep 17, 2009 12:13 AM
>To: Michael Peters 
>Cc: Mod_Perl 
>Subject: Re: Ways to scale a mod_perl site

> but I was unaware that there might be 
>an option for doing this without using a DB with mod_perl .

As Tina said, how about using memcached for this case?

Regards,
Jeff Peng


Re: Ways to scale a mod_perl site

2009-09-18 Thread Igor Chudov
Michael, you inspired me to reimplement cookies this way. For my site, the
cookie table is the most frequently updated one (even though I do not grant
cookies to search engines). I will try to use this kind of implementation.

Even now, my users like the fact that they can stay signed on forever, but
now I can do it at no cost to myself.

A quick question, is there an existing perl module to do this sort of thing?

Igor

On Wed, Sep 16, 2009 at 12:11 PM, Michael Peters wrote:

> On 09/16/2009 12:13 PM, Brad Van Sickle wrote:
>
>  Can I get you to explain this a little more? I don't see how this could
>> be used for truly secure sites because I don't quite understand how
>> storing a hash in a plain text cookie would be secure.
>>
>
> If you need to store per-session data about a client that the client
> shouldn't be able to see, then you just encrypt that data, base-64 encode it
> and then put it into a cookie.
>
> If you don't care if the user sees that information you just want to make
> sure that they don't change it then add an extra secure hash of that
> information to the cookie itself and then verify it when you receive it.
>
> I like to use JSON for my cookie data because it's simple and fast, but any
> serializer should work. Something like this:
>
> use JSON qw(to_json from_json);
> use Digest::MD5 qw(md5_hex);
> use MIME::Base64::URLSafe qw(urlsafe_b64encode urlsafe_b64decode);
>
> # to generate the cookie
> my %data = ( foo => 1, bar => 2, baz => 'frob' );
> $data{secure} = generate_data_hash(\%data);
> my $cookie = urlsafe_b64encode(to_json(\%data));
> print "Cookie: $cookie\n";
>
> # to process/validate the cookie
> my $new_data = from_json(urlsafe_b64decode($cookie));
> my $new_hash = delete $new_data->{secure};
> if( $new_hash eq generate_data_hash($new_data) ) {
>print "Cookie is ok!\n";
> } else {
>print "Cookie has been tampered with! Ignore.\n";
> }
>
> # very simple hash generation function
> sub generate_data_hash {
>my $data = shift;
>my $secret = 'some configured secret';
>return md5_hex($secret . join('|', map { "$_ - $data->{$_}" } keys
> %$data));
> }
>
> Doing encryption and encoding on small bits of data (like cookies) in
> memory will almost always be faster than having to hit the database
> (especially if it's on another machine). But the biggest reason is that it
> takes the load off the DB and puts it on the web machines which are much
> easier to scale linearly.
>
>> I know a lot of true app servers (WebSphere, etc.) store
>> this data in cached memory,
>>
>
> You could do the same with your session data, or even store it in a shared
> resource like a BDB file. But unless it's available to all of your web
> servers you're stuck with "sticky" sessions and that's a real killer for
> performance/scalability.
>
>
> --
> Michael Peters
> Plus Three, LP
>


Re: Ways to scale a mod_perl site

2009-09-18 Thread Fayland Lam
This?
http://search.cpan.org/~jkrasnoo/ApacheCookieEncrypted-0.03/Encrypted.pm

Catalyst has a plugin:
http://search.cpan.org/~lbrocard/Catalyst-Plugin-CookiedSession-0.35/lib/Catalyst/Plugin/CookiedSession.pm

Thanks.

On Fri, Sep 18, 2009 at 9:06 PM, Igor Chudov  wrote:
> Michael, you inspired me to reimplement cookies this way. For my site, the
> cookie table is the most frequently updated one (even though I do not grant
> cookies to search engines). I will try to use this kind of implementation.
>
> Even now, my users like the fact that they can stay signed  on forever, but
> now I can do it at no cost to myself.
>
> A quick question, is there an existing perl module to do this sort of thing?
>
> Igor
>
> On Wed, Sep 16, 2009 at 12:11 PM, Michael Peters 
> wrote:
>>
>> On 09/16/2009 12:13 PM, Brad Van Sickle wrote:
>>
>>> Can I get you to explain this a little more? I don't see how this could
>>> be used for truly secure sites because I don't quite understand how
>>> storing a hash in a plain text cookie would be secure.
>>
>> If you need to store per-session data about a client that the client
>> shouldn't be able to see, then you just encrypt that data, base-64 encode it
>> and then put it into a cookie.
>>
>> If you don't care if the user sees that information you just want to make
>> sure that they don't change it then add an extra secure hash of that
>> information to the cookie itself and then verify it when you receive it.
>>
>> I like to use JSON for my cookie data because it's simple and fast, but
>> any serializer should work. Something like this:
>>
>> use JSON qw(to_json from_json);
>> use Digest::MD5 qw(md5_hex);
>> use MIME::Base64::URLSafe qw(urlsafe_b64encode urlsafe_b64decode);
>>
>> # to generate the cookie
>> my %data = ( foo => 1, bar => 2, baz => 'frob' );
>> $data{secure} = generate_data_hash(\%data);
>> my $cookie = urlsafe_b64encode(to_json(\%data));
>> print "Cookie: $cookie\n";
>>
>> # to process/validate the cookie
>> my $new_data = from_json(urlsafe_b64decode($cookie));
>> my $new_hash = delete $new_data->{secure};
>> if( $new_hash eq generate_data_hash($new_data) ) {
>>    print "Cookie is ok!\n";
>> } else {
>>    print "Cookie has been tampered with! Ignore.\n";
>> }
>>
>> # very simple hash generation function
>> sub generate_data_hash {
>>    my $data = shift;
>>    my $secret = 'some configured secret';
>>    return md5_hex($secret . join('|', map { "$_ - $data->{$_}" } keys
>> %$data));
>> }
>>
>> Doing encryption and encoding on small bits of data (like cookies) in
>> memory will almost always be faster than having to hit the database
>> (especially if it's on another machine). But the biggest reason is that it
>> takes the load off the DB and puts it on the web machines which are much
>> easier to scale linearly.
>>
>>> I know a lot of true app servers (WebSphere, etc.) store
>>> this data in cached memory,
>>
>> You could do the same with your session data, or even store it in a shared
>> resource like a BDB file. But unless it's available to all of your web
>> servers you're stuck with "sticky" sessions and that's a real killer for
>> performance/scalability.
>>
>> --
>> Michael Peters
>> Plus Three, LP
>
>



-- 
Fayland Lam // http://www.fayland.org/


Re: Ways to scale a mod_perl site

2009-09-18 Thread Igor Chudov
On Fri, Sep 18, 2009 at 8:12 AM, Fayland Lam  wrote:

> This?
> http://search.cpan.org/~jkrasnoo/ApacheCookieEncrypted-0.03/Encrypted.pm
>
> Catalyst has a plugin:
>
> http://search.cpan.org/~lbrocard/Catalyst-Plugin-CookiedSession-0.35/lib/Catalyst/Plugin/CookiedSession.pm
>
This module seems to want libapreq 1.34, which I interpret as not being
compatible with mod_perl 2?

I tried installing it with CPAN on Ubuntu Jaunty and failed.

  CPAN.pm: Going to build I/IS/ISAAC/libapreq-1.34.tar.gz

Please install mod_perl: 1.25 < version < 1.99
(Can't locate mod_perl.pm in @INC (@INC contains: /root/misc/life/modules
/root/lisleelectric.com /etc/perl /usr/local/lib/perl/5.10.0
/usr/local/share/perl/5.10.0 /usr/lib/perl5 /usr/share/perl5
/usr/lib/perl/5.10 /usr/share/perl/5.10 /usr/local/lib/site_perl .) at
Makefile.PL line 7.
) at Makefile.PL line 8.
BEGIN failed--compilation aborted at Makefile.PL line 18.
Warning: No success on command[/usr/bin/perl Makefile.PL INSTALLDIRS=site]
Warning (usually harmless): 'YAML' not installed, will not store persistent
state
  ISAAC/libapreq-1.34.tar.gz
  /usr/bin/perl Makefile.PL INSTALLDIRS=site -- NOT OK
Running make test

Igor


> Thanks.
>
> On Fri, Sep 18, 2009 at 9:06 PM, Igor Chudov  wrote:
> > Michael, you inspired me to reimplement cookies this way. For my site,
> the
> > cookie table is the most frequently updated one (even though I do not
> grant
> > cookies to search engines). I will try to use this kind of
> implementation.
> >
> > Even now, my users like the fact that they can stay signed  on forever,
> but
> > now I can do it at no cost to myself.
> >
> > A quick question, is there an existing perl module to do this sort of
> thing?
> >
> > Igor
> >
> > On Wed, Sep 16, 2009 at 12:11 PM, Michael Peters 
> > wrote:
> >>
> >> On 09/16/2009 12:13 PM, Brad Van Sickle wrote:
> >>
> >>> Can I get you to explain this a little more? I don't see how this could
> >>> be used for truly secure sites because I don't quite understand how
> >>> storing a hash in a plain text cookie would be secure.
> >>
> >> If you need to store per-session data about a client that the client
> >> shouldn't be able to see, then you just encrypt that data, base-64
> encode it
> >> and then put it into a cookie.
> >>
> >> If you don't care if the user sees that information you just want to
> make
> >> sure that they don't change it then add an extra secure hash of that
> >> information to the cookie itself and then verify it when you receive it.
> >>
> >> I like to use JSON for my cookie data because it's simple and fast, but
> >> any serializer should work. Something like this:
> >>
> >> use JSON qw(to_json from_json);
> >> use Digest::MD5 qw(md5_hex);
> >> use MIME::Base64::URLSafe qw(urlsafe_b64encode urlsafe_b64decode);
> >>
> >> # to generate the cookie
> >> my %data = ( foo => 1, bar => 2, baz => 'frob' );
> >> $data{secure} = generate_data_hash(\%data);
> >> my $cookie = urlsafe_b64encode(to_json(\%data));
> >> print "Cookie: $cookie\n";
> >>
> >> # to process/validate the cookie
> >> my $new_data = from_json(urlsafe_b64decode($cookie));
> >> my $new_hash = delete $new_data->{secure};
> >> if( $new_hash eq generate_data_hash($new_data) ) {
> >>print "Cookie is ok!\n";
> >> } else {
> >>print "Cookie has been tampered with! Ignore.\n";
> >> }
> >>
> >> # very simple hash generation function
> >> sub generate_data_hash {
> >>my $data = shift;
> >>my $secret = 'some configured secret';
> >>return md5_hex($secret . join('|', map { "$_ - $data->{$_}" } keys
> >> %$data));
> >> }
> >>
> >> Doing encryption and encoding on small bits of data (like cookies) in
> >> memory will almost always be faster than having to hit the database
> >> (especially if it's on another machine). But the biggest reason is that
> it
> >> takes the load off the DB and puts it on the web machines which are much
> >> easier to scale linearly.
> >>
> >>> I know a lot of true app servers (WebSphere, etc.) store
> >>> this data in cached memory,
> >>
> >> You could do the same with your session data, or even store it in a
> shared
> >> resource like a BDB file. But unless it's available to all of your web
> >> servers you're stuck with "sticky" sessions and that's a real killer for
> >> performance/scalability.
> >>
> >> --
> >> Michael Peters
> >> Plus Three, LP
> >
> >
>
>
>
> --
> Fayland Lam // http://www.fayland.org/
>




Re: Ways to scale a mod_perl site

2009-09-18 Thread Tina Mueller

On Wed, 16 Sep 2009, Michael Peters wrote:


On 09/16/2009 12:13 PM, Brad Van Sickle wrote:


Can I get you to explain this a little more? I don't see how this could
be used for truly secure sites because I don't quite understand how
storing a hash in a plain text cookie would be secure.


If you need to store per-session data about a client that the client 
shouldn't be able to see, then you just encrypt that data, base-64 encode it 
and then put it into a cookie.


How does the user invalidate that "session"? (in case the cookie leaked
or something like that). Or how can the website owner log out a certain
user?
If I have a session cookie with data in the server database I can always
invalidate that session by logging out and thus removing the database
entry.
I personally prefer to have control over such things...

Is one select per request that bad? If the website is completely
dynamic you will probably have other requests as well?

If you care about the number of selects you should IMHO better save those
with the help of caching.


Re: Ways to scale a mod_perl site

2009-09-18 Thread Michael Peters

On 09/18/2009 11:13 AM, Tina Mueller wrote:


How does the user invalidate that "session"? (in case the cookie leaked
or something like that). Or how can the website owner log out a certain
user?


When you generate the hash for the cookie, you can also include the 
timestamp and the IP address of the client. If the cookie leaks it can't 
be used (unless the person who steals it is also on the same NAT'd 
network and uses it quickly). But you'll have that same problem anyway.
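
In terms of the earlier generate_data_hash() example, that's roughly (a
sketch; $client_ip stands in for the request's remote IP):

$data{ip}     = $client_ip;
$data{issued} = time();
$data{secure} = generate_data_hash(\%data);
# when validating, also reject the cookie if $data{ip} ne $client_ip,
# or if time() - $data{issued} exceeds your session lifetime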



Is one select per request that bad? If the website is completely
dynamic you will probably have other requests as well?


One extra select on every request can add up. In most web architectures 
the DB is a scarce shared resource.



If you care about the number of selects you should IMHO better save those
with the help of caching.


Caching of sessions could help, but if you don't need to go down that 
road, why do it in the first place?


--
Michael Peters
Plus Three, LP


Re: Ways to scale a mod_perl site

2009-09-18 Thread Scott Gifford
Brad Van Sickle  writes:

>>
>>> 3) Being enabled by item 2, add more webservers and balancers
>>> 4) Create a separate database for cookie data (Apache::Session objects)
>>> ??? -- not sure if good idea --
>>
>> I've never seen the need to do that. In fact, I would suggest you
>> drop sessions altogether if you can. If you need any per-session
>> information then put it in a cookie. If you need this information to
>> be tamper-proof then you can create a hash of the cookie's data that
>> you store as part of the cookie. If you can reduce the # of times
>> that each request needs to actually hit the database you'll have big
>> wins.
>>
>>
>
> Can I get you to explain this a little more?  I don't see how this
> could be used for truly secure sites because I don't quite understand
> how storing a hash in a plain text cookie would be secure.

The general idea is that you store a cryptographic hash of the cookie
information plus a secret only your app knows.  Using | to show string
concatenation, your cookie would be:

YourCookieFields|HASH(YourCookieFields|YourSecret)

An attacker can't create the right hash because they don't know your
secret, and they can't change any fields in the cookie because the
hash would become invalid.

-Scott.


Re: Ways to scale a mod_perl site

2009-09-18 Thread Scott Gifford
Tina Mueller  writes:

> On Wed, 16 Sep 2009, Michael Peters wrote:
>
[...]
>> If you need to store per-session data about a client that the client
>> shouldn't be able to see, then you just encrypt that data, base-64
>> encode it and then put it into a cookie.
>
> How does the user invalidate that "session"? (in case the cookie leaked
> or something like that). Or how can the website owner log out a certain
> user?

Right, that is the trade-off for improved performance and scalability.
Different trade-offs will make sense for different sites.  For most
sites, the performance and scalability won't matter too much, but for
some it will.

Simple things like timestamping the cookie and expiring it after
a while can help some, but they will not get you the flexibility of
keeping everything in a database.

-Scott.


RE: Ways to scale a mod_perl site

2009-09-18 Thread Ihnen, David
It amounts to shared private key security.

Each web server, for instance, is configured with the key abcd1234

The session looks like 

my %session = (
    username   => 'dog',
    group      => 'canid',
    premium    => 0,
    login_time => 1253289574,
);

I serialize that into a string:

my $cookiebase = join '|', map { $_, $session{$_} } sort keys %session;

which gives

group|canid|login_time|1253289574|premium|0|username|dog

I apply md5_hex from the Digest::MD5 module:

my $signature = md5_hex($cookiebase . '|' . 'abcd1234');

which yields something like

68b07c585c18282ea418937266b031d7

I then construct my cookie:

my $cookie = join ':',
    (map { $_, $session{$_} } sort keys %session), $signature;

So the cookie string looks like

group:canid:login_time:1253289574:premium:0:username:dog:68b07c585c18282ea418937266b031d7

When I receive the cookie on a request I just do the inverse:

my @fields    = split /:/, $cookie;
my $signature = pop @fields;
my %cookie    = @fields;

die 'BOGUS SESSION' unless $signature eq
    md5_hex(join('|', map { $_, $cookie{$_} } sort keys %cookie) . '|' . 'abcd1234');

If you change the 'plaintext' string in any way, the md5_hex will change.  If
you change or drop the signature, the check will fail.

Its security through obscurity admittedly - security in that you can't see my 
code, methodology, or shared secret configuration.

But most people consider that plenty secure for securing the session data.

David


-Original Message-
From: Brad Van Sickle [mailto:bvansick...@gmail.com] 
Sent: Wednesday, September 16, 2009 9:13 AM
To: Michael Peters
Cc: Mod_Perl
Subject: Re: Ways to scale a mod_perl site


>
>> 3) Being enabled by item 2, add more webservers and balancers
>> 4) Create a separate database for cookie data (Apache::Session objects)
>> ??? -- not sure if good idea --
>
> I've never seen the need to do that. In fact, I would suggest you drop 
> sessions altogether if you can. If you need any per-session 
> information then put it in a cookie. If you need this information to 
> be tamper-proof then you can create a hash of the cookie's data that 
> you store as part of the cookie. If you can reduce the # of times that 
> each request needs to actually hit the database you'll have big wins.
>
>

Can I get you to explain this a little more?  I don't see how this could 
be used for truly secure sites because I don't quite understand how 
storing a hash in a plain text cookie would be secure. 

The thing I hate most about my "secure" applications is the fact that I 
have to read the DB at the start of every request to ensure that the 
session cookie is valid and to extract information about the user from 
the session table using the session ID stored in the cookie.  Hitting 
the DB is the quickest way to kill performance and scalability in my 
experience. I know a lot of true app servers (WebSphere, etc.) store 
this data in cached memory, but I was unaware that there might be 
an option for doing this without using a DB with mod_perl.


Re: Ways to scale a mod_perl site

2009-09-18 Thread Michael Peters

On 09/18/2009 12:16 PM, Ihnen, David wrote:


Its security through obscurity admittedly - security in that you can't see my 
code, methodology, or shared secret configuration.


No, it's not really security through obscurity. Even if someone found out 
your method of serialization your data would still be safe. It's only if 
they find out your secret key that you'll have problems. But that's the 
same for SSL, PGP and any other crypto.


--
Michael Peters
Plus Three, LP


Re: Ways to scale a mod_perl site

2009-09-18 Thread Igor Chudov
On Fri, Sep 18, 2009 at 10:13 AM, Tina Mueller  wrote:

> On Wed, 16 Sep 2009, Michael Peters wrote:
>
>  On 09/16/2009 12:13 PM, Brad Van Sickle wrote:
>>
>>  Can I get you to explain this a little more? I don't see how this could
>>> be used for truly secure sites because I don't quite understand how
>>> storing a hash in a plain text cookie would be secure.
>>>
>>
>> If you need to store per-session data about a client that the client
>> shouldn't be able to see, then you just encrypt that data, base-64 encode it
>> and then put it into a cookie.
>>
>
> How does the user invalidate that "session"? (in case the cookie leaked
> or something like that). Or how can the website owner log out a certain
> user?
>

Same way you do with a table: when the user logs out, you update their
cookie to a new one, where "userid" is not set.



> If I have a session cookie with data in the server database I can always
> invalidate that session by login out and thus removing the database
> entry.
> I personally prefer to have control over such things...
>
> Is one select per request that bad? if the website is completely
> dynamic you will probably have other requests as well?
>
>
Well, the cookie table is the one that gets hit a lot and grows out of
control. It is hard to scale and replicate. Storing cookies on the browsers
solves this completely. I can have a billion browsers connect to my site and
no database growth will occur from that.



> If you care about the number of selects you should IMHO better safe those
> with the help of caching.
>


Re: Ways to scale a mod_perl site

2009-09-18 Thread Igor Chudov
On Fri, Sep 18, 2009 at 12:11 PM, James Smith  wrote:

>  Igor Chudov wrote:
>
>
>
> On Fri, Sep 18, 2009 at 10:13 AM, Tina Mueller wrote:
>
>> On Wed, 16 Sep 2009, Michael Peters wrote:
>>
>>  On 09/16/2009 12:13 PM, Brad Van Sickle wrote:
>>>
>>>  Can I get you to explain this a little more? I don't see how this could
>>>  be used for truly secure sites because I don't quite understand how
>>>  storing a hash in a plain text cookie would be secure.

>>>
>>> If you need to store per-session data about a client that the client
>>> shouldn't be able to see, then you just encrypt that data, base-64 encode it
>>> and then put it into a cookie.
>>>
>>
>>  How does the user invalidate that "session"? (in case the cookie leaked
>> or something like that). Or how can the website owner log out a certain
>> user?
>>
>
> Same way you do with a table: when the user logs out, you update their
> cookie to a new one, where "userid" is not set.
>
>
>
> You missed the point in the previous email - that is when the "system" logs
> the user out. User X does something naughty, so you need to ban him from
> doing anything else by making his cookie invalid. This information can only
> live on the server side, so you delete the reference in the database
> referring to this session. You are now having to check this on each
> request, which is the same as getting the information out of the
> database...
>
> I also think that everybody putting lots of stuff in their cookies is not
> thinking about network latency, bandwidth, etc. Remember, unless you are
> very careful about how you scope your cookies you end up sending them on
> every request - for images/js/css/etc. - and this adds to both bandwidth
> and CPU. So it's a case of swings and roundabouts... balance your
> considerations.
>
> You will usually find a fast write-through cache is the best solution for
> most information on the backend... and being careful to only really create
> sessions when you have to!
>

Thanks. I think that I understand the issue a little better.

When I delete someone's account, I blow away everything: their users table
entry and all their content.

So when they have a cookie with an invalid userid, they cannot do that much
-- but I gotta admit that I need to think this through a little better.

i


Re: Ways to scale a mod_perl site

2009-09-18 Thread Matthew Paluch
While many great minds are here, I would like to focus for a moment on one
point which, in my experience, has been the most critical:

The database

Before addressing any of your other questions (all of which are valid),
I would ask myself:

 - What kind of database tables am I implementing (InnoDB, Berkeley, etc.)?
 What effect do they have on the filesystem or the pagefile?
 - How have I defined connections, connection pooling, shared resources,
partitions vs. logical drives, semaphores vs. shared-memory handles?
 - Have I analyzed which tables actually get used, and by which processes,
and paid attention to which operations only require simple foreign-to-primary
key relationships, and not complex JOINs?

And secondarily:

 - Is it possible to set up simple READ-ONLY copies of frequently read but
rarely changed data (such as login information) so that some work can be
off-loaded in an intelligent manner without regard to load balancing? (See
the sketch below.)
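
A minimal sketch of that kind of read/write split, assuming plain DBI
against MySQL; host names, credentials, and queries are hypothetical:

    use DBI;

    # Writes go to the master; frequently read but rarely changed data
    # (such as login information) is served from a read-only replica.
    my $dbh_rw = DBI->connect('DBI:mysql:database=app;host=db-master',
                              'app', 'secret', { RaiseError => 1 });
    my $dbh_ro = DBI->connect('DBI:mysql:database=app;host=db-replica',
                              'app_ro', 'secret', { RaiseError => 1 });

    my $userid = 42;    # example value

    # Read from the replica...
    my ($pw_hash) = $dbh_ro->selectrow_array(
        'SELECT pw_hash FROM users WHERE userid = ?', undef, $userid);

    # ...but write to the master.
    $dbh_rw->do('UPDATE users SET last_login = NOW() WHERE userid = ?',
                undef, $userid);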

In short, the database's interaction with the main application is most
commonly the issue, regardless of the underlying technologies.  Start there,
so says I.

Matthew P
gedanken

-- 
Matthew Paluch
404.375.8898


Re: Ways to scale a mod_perl site

2009-09-18 Thread Ask Bjørn Hansen


On Sep 16, 2009, at 9:13, Brad Van Sickle wrote:

I've never seen the need to do that. In fact, I would suggest you drop
sessions altogether if you can. If you need any per-session information
then put it in a cookie. If you need this information to be tamper-proof
then you can create a hash of the cookie's data that you store as part
of the cookie. If you can reduce the # of times that each request needs
to actually hit the database you'll have big wins.


Can I get you to explain this a little more? I don't see how this could
be used for truly secure sites because I don't quite understand how
storing a hash in a plain text cookie would be secure.



If you are just concerned about the cookie being changed, add a timestamp
and a hash to the cookie data.


There's an example on page 19 of http://develooper.com/talks/rww-mysql-2008.pdf ...


If you are concerned about the cookie being readable at all, you can
encrypt the whole thing.

Either way it's "tamper-proof".
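
A minimal sketch of the timestamp-plus-hash idea, assuming Digest::SHA's
HMAC functions; the field layout is illustrative and not necessarily the
scheme from the talk:

    use Digest::SHA qw(hmac_sha256_hex);

    my $secret = 'server-side secret, never sent to the client';

    # Build the cookie value: a readable payload plus an HMAC over it.
    sub issue_cookie {
        my ($userid) = @_;
        my $payload = "userid=$userid&ts=" . time();
        return "$payload&mac=" . hmac_sha256_hex($payload, $secret);
    }

    # Verify: recompute the HMAC, then check the timestamp window.
    sub check_cookie {
        my ($cookie, $max_age) = @_;
        my ($payload, $mac) = $cookie =~ /^(.+)&mac=([0-9a-f]+)$/
            or return;
        return unless hmac_sha256_hex($payload, $secret) eq $mac;
        my ($ts) = $payload =~ /\bts=(\d+)/;
        return if !$ts or time() - $ts > $max_age;
        return $payload;    # the client cannot forge a matching MAC
    }

The secret never leaves the server, so the client can read the payload but
cannot produce a valid MAC for an altered one; encrypt the payload before
signing if it must also be unreadable.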


  - ask

--
http://develooper.com/ - http://askask.com/




Re: Ways to scale a mod_perl site

2009-09-19 Thread Tina Müller


On Fri, 18 Sep 2009, Igor Chudov wrote:


> On Fri, Sep 18, 2009 at 10:13 AM, Tina Mueller wrote:
>
>> How does the user invalidate that "session"? (in case the cookie leaked
>> or something like that). Or how can the website owner log out a certain
>> user?
>
> Same way you do with a table: when the user logs out, you update their
> cookie to a new one, where "userid" is not set.


That doesn't invalidate the cookie. It resets the cookie in the browser,
but the string itself is still a valid session and can be reused. Since
there is nothing stored about it server side, the server just gets the
session string from the client and doesn't care (doesn't know) if
any browser "logged out".

And storing the IP in the session wouldn't work for users that get a
new IP very often. On the other hand, several users might share the
same IP from the server's point of view.


>> Is one select per request that bad? If the website is completely
>> dynamic, you will probably have other requests as well?
>
> Well, the cookie table is the one that gets hit a lot and grows out of
> control. It is hard to scale and replicate. Storing the session data in
> the cookie itself solves this completely. I can have a billion browsers
> connect to my site and no database growth will occur from that.


You said your site is completely dynamic, so you probably have other
database requests per page view. This is the point where I would start
to optimize; IMHO that will improve performance very quickly.
On many pages of my portal, in the ideal case there is only one select
against the session table; everything else is cached (of course this
only counts for overview data and forum threads that were recently
viewed, but those are the pages with the most views).
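
A get-or-compute pattern along those lines might look like this, assuming
Cache::Memcached; the key name and query are illustrative:

    use Cache::Memcached;

    my $memd = Cache::Memcached->new({ servers => ['127.0.0.1:11211'] });

    # Serve overview pages from the cache; fall back to the database and
    # cache the result for the next viewer.
    sub get_overview {
        my ($dbh, $forum_id) = @_;
        my $key  = "overview:$forum_id";
        my $data = $memd->get($key);
        return $data if defined $data;
        $data = $dbh->selectall_arrayref(
            'SELECT id, title FROM threads
              WHERE forum_id = ? ORDER BY updated DESC LIMIT 20',
            undef, $forum_id);
        $memd->set($key, $data, 300);    # cache for five minutes
        return $data;
    }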

A session table is quite small and your selects always use an indexed
column. You might even be able to separate the sessions into several
tables/databases (split by the first character of the sid, for
example), which enables you to split across different servers
without the tradeoff of replication.
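
A minimal sketch of that split, assuming hex sids; connect_shard() and
read_sid_from_cookie() are hypothetical helpers:

    # One database handle per shard, keyed by the first hex character of
    # the sid; connect_shard() is assumed to return a DBI handle.
    my %shard_dbh = map { $_ => connect_shard($_) } (0 .. 9, 'a' .. 'f');

    sub session_dbh {
        my ($sid) = @_;
        my $prefix = lc substr($sid, 0, 1);
        return $shard_dbh{$prefix} || die "no shard for sid $sid";
    }

    # Every session lookup routes straight to the shard that owns the sid.
    my $sid = read_sid_from_cookie();
    my $row = session_dbh($sid)->selectrow_hashref(
        'SELECT userid FROM sessions WHERE sid = ?', undef, $sid);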

-- 
http://darkdance.net/

http://perlpunks.de/
http://www.trashcave.de/



Re: Ways to scale a mod_perl site

2009-09-19 Thread Bill Moseley
On Sat, Sep 19, 2009 at 11:43 AM, Tina Müller  wrote:

> On Fri, 18 Sep 2009, Igor Chudov wrote:
>
>> On Fri, Sep 18, 2009 at 10:13 AM, Tina Mueller wrote:
>>
>>> How does the user invalidate that "session"? (in case the cookie leaked
>>> or something like that). Or how can the website owner log out a certain
>>> user?
>>
>> Same way you do with a table: when the user logs out, you update their
>> cookie to a new one, where "userid" is not set.
>
> That doesn't invalidate the cookie.
> It resets the cookie in the browser, but the string itself is still a valid
> session and can be reused.
>

That's why you have an expires time in the cookie data. On each request you
check and extend it. Then if you see one that's past the expires time, you
require authentication again.

"Logged out" is a fuzzy concept.  If it means the user must provide
credentials again then you flag logged out in the cookie and then it will
appear to the user that they are logged out.  Sure, if they copy the cookie
some place, log out, then they can use the cookie again seemingly w/o
logging in.  But it's just an appearance.Logging in just means you have
provided the credentials and given them a tempoary token (the cookie) that
says they don't need to re-authenticate every request.  It's a free pass for
the time allowed (regardless of the log out).

If you have much stricter business needs around "logging out", or need a way
to immediately disable a user, then you have to track that elsewhere -- set a
flag in memcached or use the db.
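
A minimal sketch of the memcached flag, assuming Cache::Memcached; the key
naming and TTL are illustrative:

    use Cache::Memcached;

    my $memd = Cache::Memcached->new({ servers => ['127.0.0.1:11211'] });

    my $userid = 42;    # example value

    # Ban or force-log-out a user by setting a server-side flag...
    $memd->set("revoked:$userid", 1, 24 * 60 * 60);    # expires after a day

    # ...and check the flag on each request, after the cookie itself verifies.
    sub session_revoked {
        my ($userid) = @_;
        return $memd->get("revoked:$userid") ? 1 : 0;
    }

One memcached get per request is still far cheaper than a session-table
select, and the flag can simply expire once the cookie itself would have.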



> Since there is nothing stored about it server side, the server just gets
> the session string from the client and doesn't care (doesn't know) if
> any browser "logged out".
>
> And storing the IP in the session wouldn't work for users that get a
> new IP very often. On the other hand, several users might share the
> same IP from the server's point of view.


Right, IPs are not much good.  I use them sometimes to force a captcha if
too many failed logins come from the same IP.


-- 
Bill Moseley
mose...@hank.org