Re: Wild Proposal :)

2000-10-12 Thread David E. Wheeler

Perrin Harkins wrote:
 
 My point was that Apache::DBI already gives you persistent connections,
 and when people say they want actual pooled connections instead they
 usually don't have a good reason for it.

Let's say that I have 20 customers, each of whom has a database schema
for their data. I have one Apache web server serving all of those
customers. Say that Apache has forked off 20 children. Each of the
customers who connects has to use their own authentication to their own
schema. That means that Apache::DBI is caching 20 different connections
- one per customer. Not only that, but Apache::DBI is caching 20
different connections in each of the 20 processes. Suddenly you've got
400 connections to your database at once! And only 20 can actually be in
use at any one time (one for each Apache child).

Start adding new customers and new database schemas, and you'll soon
find yourself with more connections than you can handle.

And that's why connection pooling makes sense in some cases.
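
The arithmetic above can be sketched as follows. This is Python purely for illustration; the function names are invented for this example and are not part of Apache::DBI or any other module.

```python
# Illustrative arithmetic only; the function names are invented for this sketch.

def cached_connections(customers: int, children: int) -> int:
    """Worst case with per-process caching: every child ends up holding
    one cached connection for each customer login it has served."""
    return customers * children


def max_in_use(children: int) -> int:
    """Each child handles one request at a time, so at most one
    connection per child can be busy at any moment."""
    return children


if __name__ == "__main__":
    customers, children = 20, 20
    total = cached_connections(customers, children)
    busy = max_in_use(children)
    print(f"cached: {total}, busy at most: {busy}, idle at least: {total - busy}")
```

With the figures from the example (20 customers, 20 children), that is 400 cached connections, of which at least 380 sit idle at any moment.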

David

-- 
David E. Wheeler
Software Engineer
Salon Internet ICQ:   15726394
[EMAIL PROTECTED]   AIM:   dwTheory



Re: Wild Proposal :)

2000-10-12 Thread Leslie Mikesell

According to David E. Wheeler:
 Perrin Harkins wrote:
  
  My point was that Apache::DBI already gives you persistent connections,
  and when people say they want actual pooled connections instead they
  usually don't have a good reason for it.
 
 Let's say that I have 20 customers, each of whom has a database schema
 for their data. I have one Apache web server serving all of those
 customers. Say that Apache has forked off 20 children. Each of the
 customers who connects has to use their own authentication to their own
 schema. That means that Apache::DBI is caching 20 different connections
 - one per customer. Not only that, but Apache::DBI is caching 20
 different connections in each of the 20 processes. Suddenly you've got
 400 connections to your database at once! And only 20 can actually be in
 use at any one time (one for each Apache child).
 
 Start adding new customers and new database schemas, and you'll soon
 find yourself with more connections than you can handle.

Wouldn't this be handled just as well by running an Apache
per customer and letting each manage its own pool of children
which will only connect to its own database?

 And that's why connection pooling makes sense in some cases.

I think you could make a better case for it in a situation where
the reusability of the connection isn't known ahead of time,
as would be the case if the end user provided a name/password
for the connection.

  Les Mikesell
 [EMAIL PROTECTED]



RE: Wild Proposal :)

2000-10-11 Thread Stephen Anderson



 -Original Message-
 From: Perrin Harkins [mailto:[EMAIL PROTECTED]]
 Sent: 11 October 2000 04:45
 To: Ajit Deshpande
 Cc: [EMAIL PROTECTED]
 Subject: Re: Wild Proposal :)
 
 
 Hi Ajit,
 
 It's not entirely clear to me what problem you're trying to 
 solve here. 
 I'll comment on some of the specifics you've written down here, but I
 may be missing your larger point.

Ajit's examples aren't perfect, but the problem is a real one. The problem
is one of generalisation. Logically, you don't want to put an application
that is 10% web-related into mod_perl. So, you can take the other 90% out
and stick it into an RPC server, but wouldn't it be nice if there was an
application server framework that handled connections, load balancing and
resource management for you?

 There's DBI::Proxy already.  Before jumping on the "we need pooled
 connections" bandwagon, you should read Jeffrey Baker's post on the
 subject here:

http://forum.swarthmore.edu/epigone/modperl/breetalwox/38B4DB3F.612476CE@acm.org

People always manage to miss the point on this one. It's not about saving
the cycles required to open the connection, as they're minimal at worst.
It's about saving the _time_ to open the connection. In a networked
application, opening a connection is quite possibly going to be your largest
source of latency. In a large application doing a lot of transactions per
second, the overhead involved in building connections and tearing them down
can lose you serious time. It also complicates scaling the database server.
It's far better to pay your overhead once and just re-use the connection.
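
As a back-of-the-envelope sketch of that argument (Python, for illustration only): the function name is invented here, and the 50 ms connect and 5 ms query figures are made-up assumptions, not measurements.

```python
# Toy latency model. The timings used below are assumed numbers, not measurements.

def total_time_ms(requests: int, connect_ms: float, query_ms: float, reuse: bool) -> float:
    """Total wall-clock time for a run of sequential requests.

    With reuse, the connection cost is paid once; without it, every
    request pays the connection cost again.
    """
    connects = min(requests, 1) if reuse else requests
    return connects * connect_ms + requests * query_ms


if __name__ == "__main__":
    # Assumed figures: 1000 requests, 50 ms to connect, 5 ms per query.
    print("reconnect each time:", total_time_ms(1000, 50, 5, reuse=False), "ms")
    print("persistent connection:", total_time_ms(1000, 50, 5, reuse=True), "ms")
```

Under those assumed figures, reconnecting on every request costs 55000 ms in total against 5050 ms with one persistent connection. The ratio depends entirely on how expensive the connect is relative to the query.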

Stephen.



RE: Wild Proposal :)

2000-10-11 Thread Perrin Harkins

On Wed, 11 Oct 2000, Stephen Anderson wrote:
  There's DBI::Proxy already.  Before jumping on the "we need pooled
  connections" bandwagon, you should read Jeffrey Baker's post on the
  subject here:
 
 http://forum.swarthmore.edu/epigone/modperl/breetalwox/38B4DB3F.612476CE@acm.org
 
 People always manage to miss the point on this one. It's not about saving
 the cycles required to open the connection, as they're minimal at worst.
 It's about saving the _time_ to open the connection.

My point was that Apache::DBI already gives you persistent connections,
and when people say they want actual pooled connections instead they
usually don't have a good reason for it.

- Perrin




Re: Wild Proposal :)

2000-10-10 Thread Perrin Harkins

Hi Ajit,

It's not entirely clear to me what problem you're trying to solve here. 
I'll comment on some of the specifics you've written down here, but I
may be missing your larger point.

 OBJECTIVE
 
 Provide a perl server that can execute miscellaneous perl jobs that
 will communicate with mod_perl enabled Apache kids using IPC. This
 can be considered something similar to a master "Servlet" but in perl;
 call it Perlet. The master Perlet can manage a pool of kid Perlets that
 will be governed by the load.

You can do this fairly easily using RPC::PlServer or one of the other
RPC modules on CPAN.  This is how DBI::Proxy works.

 MOTIVATIONS
 
 - Modperl in Apache 1.x does not provide a good way of sharing data
   and operations on that data in an efficient manner between Apache kids.

There are a number of modules that solve this problem pretty well, though
they may not perform quite as well as multi-threading.  Apache::Session
is one example, and there are many shared cache modules out there. 
Using the file system for this is a pretty good solution on systems that
do aggressive memory buffering of the file system, like Linux.  (We were
just talking about this stuff on the Mason list.  Seems like almost as
popular a topic as templating systems.)
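
A minimal sketch of the file-system approach, in Python for illustration only. The `cache_set` and `cache_get` names and the one-JSON-file-per-key layout are assumptions made for this example; they are not taken from Apache::Session or any of the cache modules mentioned.

```python
import json
import os
import tempfile


def cache_set(directory: str, key: str, value) -> None:
    """Write the value to a temp file, then rename it into place.

    The rename is atomic within one file system, so a reader in another
    process sees either the old file or the new one, never a partial write.
    """
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(value, handle)
    os.replace(tmp_path, os.path.join(directory, key + ".json"))


def cache_get(directory: str, key: str, default=None):
    """Read a value written by any process; return default if it is absent."""
    try:
        with open(os.path.join(directory, key + ".json")) as handle:
            return json.load(handle)
    except FileNotFoundError:
        return default


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as cache_dir:
        cache_set(cache_dir, "session42", {"user": "ajit", "hits": 3})
        print(cache_get(cache_dir, "session42"))
        print(cache_get(cache_dir, "no-such-key"))
```

Any number of server processes can call these against the same directory. On a system that buffers the file system aggressively, reads of recently written keys are mostly served from memory.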

One thing to keep in mind is that any perl server is likely to use a
multi-process approach and thus will have the same issues with data
sharing that mod_perl does.  You'd have to get production quality
multi-threading support in Perl to avoid this, or write a server that
multiplexes using select calls and non-blocking I/O.
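
For readers unfamiliar with the select-based approach, here is a minimal single-process sketch in Python (for illustration only, not a Perl implementation). It uses the standard `selectors` and `socket` modules; `serve_once` and `demo` are names invented for this example, and the "job" it performs is just upper-casing the request.

```python
import selectors
import socket


def serve_once(pairs):
    """Answer one request per client from a single process.

    Each server-side socket is non-blocking and registered with a selector,
    so the loop only touches sockets that already have data waiting.
    """
    sel = selectors.DefaultSelector()
    for server_side, _client in pairs:
        server_side.setblocking(False)
        sel.register(server_side, selectors.EVENT_READ)

    pending = len(pairs)
    while pending:
        events = sel.select(timeout=1.0)
        if not events:
            break  # nothing became readable in time; give up
        for key, _mask in events:
            conn = key.fileobj
            data = conn.recv(1024)
            if data:
                # Replies here are tiny, so a direct send is fine; a real
                # server would also watch for writability before sending.
                conn.sendall(data.upper())
            sel.unregister(conn)
            pending -= 1
    sel.close()


def demo():
    """Three clients send a job each; one process serves them all."""
    pairs = [socket.socketpair() for _ in range(3)]
    for i, (_server_side, client) in enumerate(pairs):
        client.sendall(f"job {i}".encode())
    serve_once(pairs)
    replies = [client.recv(1024) for _server_side, client in pairs]
    for server_side, client in pairs:
        server_side.close()
        client.close()
    return replies


if __name__ == "__main__":
    print(demo())
```

Because everything runs in one process, any data the handlers touch is shared without IPC. The trade-off is that a slow handler stalls every other client.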

 EXAMPLE USES
 
 The following are probably a bit ambitious. #2 and #3 are something that
 I can see implementing fairly easily. The most important thing would of
 course be the design of the API between mod_perl Apache and a Perlet.
 
 - Perlet::DB that will provide a pool of database connections and
   miscellaneous DB querying etc.

There's DBI::Proxy already.  Before jumping on the "we need pooled
connections" bandwagon, you should read Jeffrey Baker's post on the
subject here:
http:[EMAIL PROTECTED]

 - Perlet::Mail that will provide asynchronous Mail handoffs

qmail-inject will cover this.

The other examples (HTML/XML parsers) don't make sense to me, since
these work fine with mod_perl and are generally synchronous
applications.

- Perrin