Re: *Occasional* PostgreSQL Error
Hi Doug --

On 1/24/08, Doug Van Horn <[EMAIL PROTECTED]> wrote:
> OperationalError: could not connect to server: No such file or directory
> Is the server running locally and accepting
> connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?

This means that, for some reason, a connection to the database couldn't be established. You'll get this error if the database isn't running, but since you only get it under load I'd guess that you're hitting PostgreSQL's max connection limit (see the max_connections setting in postgresql.conf). You can tell for sure by checking your Postgres log; it'll have a message about reaching the connection limit.

Jacob

--
You received this message because you are subscribed to the Google Groups "Django users" group.
To post to this group, send email to django-users@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/django-users?hl=en
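[Editor's note: to confirm this diagnosis you can run `SHOW max_connections;` and `SELECT count(*) FROM pg_stat_activity;` in psql and compare the two. As a small standalone illustration of where the limit lives (this helper is hypothetical, not part of Django or PostgreSQL), here is a sketch that reads the active setting out of a postgresql.conf body:]

```python
import re

def get_max_connections(conf_text):
    """Return the active max_connections value from the body of a
    postgresql.conf file. Commented-out lines are ignored and, as in
    PostgreSQL itself, a later setting overrides an earlier one.
    Falls back to PostgreSQL's shipped default of 100 if the setting
    never appears."""
    value = 100  # PostgreSQL's default
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        m = re.match(r"max_connections\s*=\s*(\d+)", line)
        if m:
            value = int(m.group(1))  # last active setting wins
    return value

sample = """
shared_buffers = 24MB
#max_connections = 100
max_connections = 40   # lowered for a small box
"""
print(get_max_connections(sample))  # 40
```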
Re: *Occasional* PostgreSQL Error
On Fri, 2008-01-25 at 15:14 -0800, Jacob Kaplan-Moss wrote:
> On 1/24/08, Doug Van Horn <[EMAIL PROTECTED]> wrote:
> > OperationalError: could not connect to server: No such file or directory
> > Is the server running locally and accepting
> > connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?
>
> This means that, for some reason, a connection to the database
> couldn't be established. You'll get this error if the database isn't
> running, but since you only get it under load I'd guess that you're
> hitting PostgreSQL's max connection limit (see the max_connections
> setting in postgresql.conf). You can tell for sure by checking your
> Postgres log; it'll have a message about reaching the connection
> limit.

Just curious, what's the state of connection pooling in Django? And what does the user see when such an error occurs? An error 500 message, I guess?

-mark
Re: *Occasional* PostgreSQL Error
On Jan 29, 2008 10:04 PM, Mark Green <[EMAIL PROTECTED]> wrote:
> Just curious, what's the state of connection pooling in django?

My personal opinion is that the application level (e.g., Django) is the wrong place for connection pooling, just as it is for the equivalent "front end" concern of load-balancing your web servers: the less the application layer has to know about what's in front of and behind it, the more flexible it will be, since you can make changes without having to alter your application-layer code.

So, for example, connection pooling for Postgres is best handled by a dedicated pooling connection manager like pgpool; Django can connect to pgpool as if it were simply a Postgres database, which means you don't have to specify pooling parameters at the application level.

--
"Bureaucrat Conrad, you are technically correct -- the best kind of correct."
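[Editor's note: concretely, the "drop-in" property means the pooler shows up only in the connection settings. A hypothetical fragment, using the Django 0.96/1.0-era settings names; the host, credentials, and port are invented for illustration (9999 is pgpool's customary default listen port):]

```python
# settings.py -- point Django at a local pgpool instance instead of
# Postgres directly. Django neither knows nor cares that a pooler sits
# in between, because pgpool speaks the Postgres wire protocol.
DATABASE_ENGINE = 'postgresql_psycopg2'
DATABASE_NAME = 'mydb'          # assumption: your database name
DATABASE_USER = 'myuser'        # assumption: your database user
DATABASE_PASSWORD = 'secret'
DATABASE_HOST = '127.0.0.1'     # pgpool, not Postgres itself
DATABASE_PORT = '9999'          # pgpool's listen port; Postgres stays on 5432
```

Swapping the pooler out (or removing it) then means changing only `DATABASE_HOST`/`DATABASE_PORT`, with no application code touched.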
Re: *Occasional* PostgreSQL Error
On Tue, 2008-01-29 at 22:07 -0600, James Bennett wrote:
> My personal opinion is that the application level (e.g., Django) is
> the wrong place for connection pooling [...] Django can connect to
> pgpool as if it's simply a Postgres database, which means you don't
> have to go specifying pooling parameters at the application level.

Hm, that doesn't sit so well with me. I agree on the load-balancer front, but the overhead of all those TCP connections (and of pgpool managing them) worries me a bit.

Furthermore, and much more seriously, I see no way to ensure graceful degradation in case of overload. Let's assume we run a local pgpool instance alongside Django on each machine, and Django goes through the local pgpool for database access. Now what happens when the database becomes too slow to keep up with requests, for whatever reason? I see two options:

a) pgpool is configured without a limit on inbound connections. The hanging connections between Django and pgpool will eventually exhaust the total number of allowed TCP connections for the Django user, or even system-wide. Django will not be able to open new database connections and will display nasty error pages to the users. Worse yet, if Django and the webserver are running under the same uid, then the webserver will likely no longer be able to accept new inbound connections, and users get funny error messages from their browsers.

b) pgpool is configured with a limit on inbound connections. pgpool will hit the limit and refuse subsequent attempts from Django, which in turn displays nasty error pages to users.

In order to achieve the desired behaviour of Django slowing down gracefully instead of spitting out error pages, I think we'd have to teach Django to retry database connections. But this would open a whole new can of worms, such as risking duplicated requests when users hit reload, etc.

So, long story short, I see no way out of this without proper connection pooling built right into Django. Or am I missing something?

-mark
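[Editor's note: the retry behaviour Mark wishes Django had can be sketched in a few lines. This wrapper is hypothetical, not anything Django provided; note that it only converts "fail fast" into "fail slower", and if every attempt fails the user still gets an error page:]

```python
import time

def connect_with_retry(connect, attempts=3, delay=0.5):
    """Call `connect` (any zero-argument factory that raises on
    failure, e.g. psycopg2.connect bound to its DSN) up to `attempts`
    times, sleeping `delay` seconds between tries. Re-raises the last
    error if every attempt fails."""
    last_error = None
    for attempt in range(attempts):
        try:
            return connect()
        except Exception as e:  # in practice: OperationalError
            last_error = e
            if attempt < attempts - 1:
                time.sleep(delay)
    raise last_error
```

The can of worms Mark mentions is visible here too: while the wrapper sleeps, the browser may give up or the user may hit reload, re-submitting the request on top of the one still waiting.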
Re: *Occasional* PostgreSQL Error
On Jan 29, 2008 11:18 PM, Mark Green <[EMAIL PROTECTED]> wrote:
> I agree on the load-balancer front but the overhead of all
> those TCP connections (and pgpool managing them) worries me a bit.

I've used pgpool in production with great success, so I'm not really sure what overhead you're talking about.

> Furthermore, and much more seriously, I see no way to ensure
> graceful degradation in case of overload.

And here you completely change the topic of discussion, from persistent pooling of connections to failover when a database reaches its maximum connection level, so I'm not really sure what it has to do with anything...

> So, long story short, I see no way out of this without
> proper connection pooling built right into Django.
> Or am I missing something?

You're missing the fact that you've switched from asking about pooling to asking about failover. Also, your solution would mean that:

1. Django must have its own configuration for the number of connections it's allowed to use, how long to keep them alive, and how often to retry them in case of failure, and this must be updated if and when use patterns change.

2. Django must have its own configuration for being notified of what every other client application of the same database is doing, and this must be updated if and when use patterns change.

3. Every other client application of the same database must have similar dual configuration, to know both what it's allowed to do and what everybody else is doing, and these must be updated if and when use patterns change.

Or you could just use a single external utility to manage database connections, thus keeping all that essentially infrastructural cruft out of the application layer while giving you a single place to configure it and a single place to make changes when you need them.

--
"Bureaucrat Conrad, you are technically correct -- the best kind of correct."
Re: *Occasional* PostgreSQL Error
On Tue, 2008-01-29 at 23:33 -0600, James Bennett wrote:
> I've used pgpool in production with great success, so I'm not really
> sure what overhead you're talking about.
>
> And here you completely change the topic of discussion from persistent
> pooling of connections to failover when a database reaches its maximum
> connection level, so I'm not really sure what it has to do with
> anything...

Well, "build for failure". Temporary overload can happen at any time, and I'd expect Django to behave exceptionally badly in that case as it is. The problem is that under load it will start displaying errors to users instead of just slowing down. Not nice, imho. There is no "buffer" (i.e. timeout, connection retries) between Django and the [next hop to the] database.

Disclaimer: I haven't actually tested this behaviour, but I've seen it in JDBC apps before we added pooling, and I don't know why Django should be different. These apps would basically "chop off" (i.e. return errors for) the excess percentile of requests. Naturally the affected users would hit their reload button, and there we have a nice death spiral...

> You're missing the fact that you've switched from asking about pooling
> to asking about failover.

Hm, I wouldn't say this is about failover. It's about behaving gracefully under load.

> Also, your solution would mean that:
>
> 1. Django must have its own configuration for the number of
> connections it's allowed to use, how long to keep them alive and how
> often to retry them in case of failure, and this must be updated if
> and when use patterns change.

Yup, connection pooling.

> 2. Django must have its own configuration for being notified of what
> every other client application of the same database is doing [...]
>
> 3. Every other client application of the same database must have
> similar dual configuration [...]

Not really. My desire is to make each individual Django instance play well when things get crowded. Making them aware of each other, or even making all database clients aware of each other, sounds like an interesting project but is not what I'm after here.

> Or you could just use a single external utility to manage database
> connections, thus keeping all that essentially infrastructural cruft
> out of the application layer while giving you a single place to
> configure it and a single place to make changes when you need them.

Well, there is a point where a single instance of the external utility doesn't cut it anymore. The only way to go then seems to be one pgpool instance per Django instance (for performance, and to avoid the single point of failure). So there you have your "n configurations" again, only outside of Django, and without really solving the overload problem.

Maybe I'm blowing all this out of proportion, but I wonder if any of the high-traffic, multi-server Django sites have ever run into it?

-mark
Re: *Occasional* PostgreSQL Error
On Jan 30, 2008 8:57 AM, Mark Green <[EMAIL PROTECTED]> wrote:
> Well, "build for failure". Temporary overload can happen at any
> time and I'd expect Django to behave exceptionally badly in that
> case as it is.

Running out of resources is never a good thing for any system.

> Disclaimer: I haven't actually tested this behaviour but I've seen it
> in JDBC apps before we added pooling and don't know why Django should
> be different. These apps would basically "chop off" (i.e. return errors
> for) the excess percentile of requests. Naturally the affected users
> would hit their reload button and there we have a nice death spiral...

And if it just slows down, you don't think they'll do the same thing?

> Not really. My desire is to make each individual Django instance
> play well when things get crowded. Making them aware of each other,
> or even making all database clients aware of each other, sounds
> like an interesting project but is not what I'm after here.

But in order to know that things are "crowded", each one has to know what all the others are doing. And any non-Django application using the same database *also* has to include its own copy of all that configuration.

> Well, there is a point where a single instance of the external
> utility doesn't cut it anymore. The only way to go seems to be
> one pgpool instance per Django instance (for performance and
> to avoid the single point of failure).

Again: you're repeatedly changing the topic from connection pooling to failover. When you decide you want to talk about one or the other for more than a few sentences at a time, let me know.

> Maybe I'm blowing all this out of proportion

Almost certainly.

> but I wonder if any of the high-traffic, multi-server Django sites
> ever ran into it?

Not really. If you're hitting the max on your DB, you have more immediate problems than whether your users see an error page or an eternal "Loading..." bar.

--
"Bureaucrat Conrad, you are technically correct -- the best kind of correct."
Re: *Occasional* PostgreSQL Error
On Wed, 2008-01-30 at 11:03 -0600, James Bennett wrote:
> Running out of resources is never a good thing for any system.

Obviously.

> And if it just slows down, you don't think they'll do the same thing?

Ahem, there's a huge difference between being confronted with a spinner/progress bar and an error page. The former says "please wait", the latter says "try again".

> But in order to know that things are "crowded", each one has to know
> what all the others are doing. And any non-Django application using
> the same database *also* has to include its own copy of all that
> configuration.

I guess I still haven't made clear what I mean. I expect *each individual instance* of Django to behave gracefully when it can't get through to the database. No group knowledge is needed for a plain old connection retry or (better) connection pooling.

> Again: you're repeatedly changing the topic from connection pooling to
> failover. When you decide you want to talk about one or the other for
> more than a few sentences at a time, let me know.

Erm. I have not mentioned failover a single time. I'm just trying to point out where your "let pgpool handle it" strategy seems to fall down.

> > Maybe I'm blowing all this out of proportion
>
> Almost certainly.

> Not really. If you're hitting the max on your DB you have more
> immediate problems than whether your users see an error page or an
> eternal "Loading..." bar.

I'm not talking about maxing out the DB constantly. I'm talking about scratching the limits during peak hours, which is something I'm pretty sure almost every bigger site has experienced at least once (cf. "growing pains"). During these peak hours there's a huge difference between users randomly getting an error page and users randomly having to wait a little longer.

-mark
Re: *Occasional* PostgreSQL Error
On Jan 30, 2008 6:01 PM, Mark Green <[EMAIL PROTECTED]> wrote:
> Ahem, there's a huge difference between being confronted with
> a spinner/progress bar and an error page. The former says
> "please wait", the latter says "try again".

OK, so let's break this down.

There are two potential cases where you run up against your database's concurrent connection limit:

1. Your average traffic level involves more concurrent connections than your database permits.
2. Your average traffic level is within the number of concurrent connections your database permits, but you are experiencing a temporary spike above that level.

In case (1), the odds of a timeout/retry scheme yielding success within the time the average user is prepared to wait are low; your database is simply swamped, and the only options are to increase available resources (in the form of database connections) or refuse to serve some number of requests.

So case (2) is the only one worth discussing as a target for a timeout/retry scheme. In this case, you are (apparently) asking for Django to provide three things:

1. A mechanism for specifying a set number of database connections which Django will persistently maintain across request/response boundaries.
2. A mechanism for specifying how long Django should wait, when attempting to obtain a connection, before timing out and concluding that no connection will be obtained.
3. A mechanism for specifying how many times, after failing to obtain a connection, Django should re-try the attempt.

How, then, would we apply these hypothetical configuration directives to the situation of a traffic spike? There are two possibilities:

1. Set these directives in advance and have them be a permanent part of the running application's configuration.
2. Avoid setting these directives until a spike is imminent (difficult to do) or in progress, and leave them in place only so long as the spike persists.

In case (1) you are flat-out wasting resources and complicating Django's operation, by holding resources which are not used and mandating more complex logic for obtaining those resources. In nearly all cases this is a bad trade-off to make.

In case (2) the first option really isn't possible, because you can't reliably predict traffic spikes in advance. This leaves the second option, which requires you to be constantly watching the number of database connections in use, and involves shutting down your application temporarily in order to insert the necessary configuration directives. It is also unlikely that you will be able to do so before at least some users have received error pages.

So you must either waste resources, or accept increased monitoring overhead and the inevitability that some requests will not receive successful responses. Add to this the following disadvantages:

* More complex configuration of Django (and hence more potential for configuration error).
* More complex code base for Django (and hence more operating overhead and more potential bugs).
* Loss of flexibility, in that the application layer must now possess more information about the database layer.
* Loss of flexibility, in that Django must now maintain persistent resources between requests, and have some mechanism for equitably distributing them across arbitrarily many server processes on one or more instances of the HTTP daemon on one or more virtual or physical machines.

Given all of this, I don't see this level of configuration in the application layer as a worthwhile trade-off to make.

You've mentioned JDBC, which points to one potential conflict between your expectations and the reality of Django's design: Java-based web applications typically run threaded within a single Java-based server instance, which is able to persist resources across requests and manage them equitably across all threads. Django, on the other hand, typically runs multi-*process* behind or embedded in an HTTP daemon, shares no information across the boundaries of those processes, and deliberately seeks to minimize the number of things which must be persisted across the boundaries of request/response cycles. This is a fundamentally different architecture, and so decreases the portability of assumptions from a Java-style background. In particular, there is no master Django process which can create and maintain a resource pool and allocate resources from it to other Django processes (especially since processes may be running behind/embedded in other instances of the HTTP daemon on the same machine, or on remote machines).

Which brings up yet another disadvantage:

* The configuration of the number of connections Django should maintain, and the timeout/retry directives, must be calculated with the assumption that they will be applied by each and every process independently and without knowledge of any of the others. At that point even the calculations involved in determining the correct settings are probably impossible, since common server arrangements can and w[...]
Re: *Occasional* PostgreSQL Error
On Wed, 2008-01-30 at 19:34 -0600, James Bennett wrote:
> OK, so let's break this down.

Yay, thanks for that exhaustive response. :-) I guess we'll eventually have to agree to disagree, but I'll add my counterpoints for completeness.

> There are two potential cases where you run up against your database's
> concurrent connection limit:
> [...]
> So case (2) is the only one worth discussing as a target for a
> timeout/retry scheme.

Yes.

> In this case, you are (apparently) asking for Django to provide three things:
>
> 1. A mechanism for specifying a set number of database connections
> which Django will persistently maintain across request/response
> boundaries.
> 2. A mechanism for specifying how long Django should wait, when
> attempting to obtain a connection, before timing out and concluding
> that no connection will be obtained.
> 3. A mechanism for specifying how many times, after failing to obtain
> a connection, Django should re-try the attempt.

Yes. Basically a bog-standard connection pool.

> How, then, would we apply these hypothetical configuration directives
> to the situation of a traffic spike? There are two possibilities:
>
> 1. Set these directives in advance and have them be a permanent part
> of the running application's configuration.
> 2. Avoid setting these directives until a spike is imminent (difficult
> to do) or in progress, and leave them in place only so long as the
> spike persists.
>
> In case (1) you are flat-out wasting resources and complicating
> Django's operation, by holding resources which are not used and
> mandating more complex logic for obtaining those resources. In nearly
> all cases this is a bad trade-off to make.

What resources are held and wasted, exactly? Maintaining a number of open TCP connections is much cheaper than creating/discarding them at a high rate. I agree that Django's CGI-style mode of operation might make implementation tricky (separate thread?), but you can't seriously suggest that creating/discarding n connections per second would be cheaper than maintaining, say, n*10 long-lived connections?

Predictability is the keyword here. From the perspective of my database (or pgpool instance), I want to be sure that the configured maximum number of inbound connections can never be exceeded, because clients (such as Django) should never get into the uncomfortable situation of having to deal with a "connection refused". "Fail fast", as Django does it, just doesn't work so well on the frontend. Users don't like error pages at all; they're a surefire way to damage your reputation. Yes, slow load times are bad too, but still a much more comfortable position to be in during temporary rush hours. (Ever had some marketing droid yell at you in his highest pitch because 10% of their expensive click-throughs went to HTTP-500 hell? ;-) )

> In case (2) the first option really isn't possible, because you can't
> reliably predict traffic spikes in advance. This leaves the second
> option, which requires you to be constantly watching the number of
> database connections in use and involves shutting down your
> application temporarily in order to insert the necessary configuration
> directives. It is also unlikely that you will be able to do so before
> at least some users have received error pages.

Hmm. Dynamic adaptation of the pool config is an interesting subject (Java's c3p0 can actually do it at runtime, within limits, I think), but totally out of my scope here. I think a fixed pool config would suffice to achieve my goal of "graceful behaviour under load".

> So you must either waste resources, or accept increased monitoring
> overhead and the inevitability that some requests will not receive
> successful responses.
>
> Add to this the following disadvantages:
>
> * More complex configuration of Django (and hence more potential for
> configuration error).

Oh c'mon. A connection pool is not so complicated.

> * More complex code base for Django (and hence more operating overhead
> and more potential bugs).

Well, I'd think that the constant flux of connections causes more overhead than even the sloppiest implementation of a pool ever could. Creating a TCP connection is not free, and then there's the DB handshake on top.
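[Editor's note: a minimal fixed-size connection pool of the kind being discussed can indeed be sketched briefly in Python. This is a standalone illustration, not Django code; `factory` stands in for any real connection constructor, e.g. a lambda around psycopg2.connect. The hard part, as James argues, is not this data structure but where to keep it in a multi-process deployment:]

```python
import queue

class ConnectionPool:
    """A minimal fixed-size pool: hand out idle connections, and block
    (up to `timeout` seconds) when all of them are checked out, so
    callers slow down instead of hammering the database for new
    connections."""

    def __init__(self, factory, size=5, timeout=2.0):
        self._idle = queue.Queue(maxsize=size)
        self._timeout = timeout
        for _ in range(size):
            self._idle.put(factory())  # open all connections up front

    def acquire(self):
        # Raises queue.Empty on timeout -- the "still overloaded"
        # signal, deferred rather than an instant "connection refused".
        return self._idle.get(timeout=self._timeout)

    def release(self, conn):
        self._idle.put(conn)
```

Note what the sketch leaves out, which is the substance of the thread: per-process pools in a multi-process deployment multiply the total connection count, and nothing here handles dead connections or coordination between processes.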
Re: *Occasional* PostgreSQL Error
On Jan 30, 2008 8:55 PM, Mark Green <[EMAIL PROTECTED]> wrote:
> What resources are held and wasted, exactly?
> Maintaining a number of open TCP connections is much cheaper
> than creating/discarding them at a high rate.

Every connection that one Django application holds on to is a connection that something else can't use.

> I agree that Django's CGI-style mode of operation might make
> implementation tricky (separate thread?), but you can't seriously
> suggest that creating/discarding n connections per second would
> be cheaper than maintaining, say, n*10 long-lived connections?

If you use an external pooling manager, you're not creating/discarding *database* connections, which would really hurt; you're creating/discarding connections to the connection manager.

> Predictability is the keyword here. From the perspective of my database
> (or pgpool instance), I want to be sure that the configured maximum
> number of inbound connections can never be exceeded, because clients
> (such as Django) should never get into the uncomfortable situation
> of having to deal with a "connection refused".

It's *your* keyword. The question from Django's perspective is whether it is of such all-consuming importance that Django must be completely re-architected to accommodate your use case. The answer from Django's perspective is "no".

> "Fail fast", as Django does it, just doesn't work so well on the
> frontend. Users don't like error pages at all; they're a surefire way
> to damage your reputation. Yes, slow load times are bad too, but still
> a much more comfortable position to be in during temporary rush hours.

A slow-loading page is an error, to a user. Numerous usability studies demonstrate this.

> Hmm. Dynamic adaptation of the pool config is an interesting subject
> (Java's c3p0 can actually do it at runtime, within limits, I think), but
> totally out of my scope here. I think a fixed pool config would suffice
> to achieve my goal of "graceful behaviour under load".

No, it wouldn't. And you'd see that if you'd actually read what I'm saying here.

> Oh c'mon. A connection pool is not so complicated.

Compared to not having it? Yes, yes it is.

> Well, I'd think that the constant flux of connections
> causes more overhead than even the sloppiest implementation
> of a pool ever could. Creating a TCP connection is not free,
> and then there's the DB handshake on top.

Again, see above about connecting to an external pooling manager.

> Out of curiosity: does Django create a new connection for each query or
> for each http request?

For each request/response cycle.

> And about code complexity... Yes, tying the thing to the Django
> execution model might take a bit of thinking. But the pool itself
> should be a 20-liner, we're talking Python, right? ;-)

20 lines of code to implement the pool, and complete re-architecting of a 32,000-line codebase so you'll have a place to put it. A winning trade this is not.

> Oh c'mon again...

You're saying it *wouldn't* tie the application layer to knowing about the database layer?

> Sounds like a win to me, what kind of flexibility would be lost here?

Suppose I have my Django site running on one physical web server, and I configure a connection pool within Django and do all the math to make it fail gracefully. Now suppose my traffic grows to where I need a second physical web server. Whoops, time to re-calculate the Django configuration. If I instead use a pooling connection manager external to Django, I just drop another web node into place and keep right on trucking, with zero reconfiguration and zero downtime. That, to my mind, qualifies as "flexibility".

> Well, doing it *this* way is obviously a bad idea. If Django were to
> persist anything across requests (such as a connection pool), then
> of course a long-running thread/process seems like the logical choice.

Yes, and once you're there, why not keep the application layer simple by having that long-running process not be part of it?

> Although I still wonder if the mod_python stuff really doesn't allow for
> any kind of persistence or if Django just chose not to use it?

You *can* persist things quite easily on a per-process basis, and Django does persist some things, just not the database connection.

> As said, the particular problem of "users getting error pages instead of
> delays" can unfortunately not be solved externally.

Yes, it can. We do it already in production. Here's how:

1. Set up your web nodes.
2. Set up Perlbal in front of them (even if you have only one web node, you'll want a load balancer for any sort of serious traffic, and Perlbal is a best-of-breed tool for this sort of stack).
3. When you hit max connections and a web node errors out, Perlbal detects this and routes to another, effectively re-trying the request.

And voila: graceful retry without cluttering up the application layer. And in the time you've spent arguing for Django to be fundamentally re-architected to deal with this already-solved problem [...]
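[Editor's note: the Perlbal arrangement James describes might look roughly like the config below. This is a hypothetical sketch, not taken from the thread; the IPs and ports are invented, and `verify_backend` is the Perlbal option that checks a backend will actually accept a request before forwarding it, which is what makes the transparent retry possible.]

```
# perlbal.conf -- two web nodes behind one balancer (illustrative only)

CREATE POOL webnodes
  POOL webnodes ADD 10.0.0.10:8080
  POOL webnodes ADD 10.0.0.11:8080

CREATE SERVICE balancer
  SET listen         = 0.0.0.0:80
  SET role           = reverse_proxy
  SET pool           = webnodes
  # Probe the backend before handing it the request, so a node that is
  # erroring out under load is skipped rather than shown to users.
  SET verify_backend = on
ENABLE balancer
```

With this in front, a web node that hits its database connection limit and starts refusing requests is simply routed around, which is the "graceful retry without cluttering up the application layer" James refers to.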