Re: synchronization problem

2008-03-30 Thread Mark Green

On Sun, 2008-03-30 at 17:49 -0700, meppum wrote:
> The easiest way to deal with this is to create a "version" column that
> gets updated with the current version number of the row in the
> database. Increment this column value each time a save is performed on
> that row. Override the save method on the model you want to perform
> this check on (in your example the poll model), and on commit
> check that the value of the "version" column in the database is LESS
> THAN the value of the "version" column you are about to commit. If it
> is not, then throw an error and roll back the transaction.

Ehm, and how would you "check on commit"?
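
One common answer is to fold the check into the UPDATE itself, so the
database does the comparison atomically instead of the application
reading the row just before commit (often called optimistic locking).
A minimal sketch, assuming a hypothetical poll table with votes and
version columns; sqlite3 is used only to keep the example
self-contained, and save_poll and StaleWriteError are illustrative
names, not Django APIs.

```python
import sqlite3


class StaleWriteError(Exception):
    """Raised when the row changed since it was read."""


def save_poll(conn, poll_id, new_votes, expected_version):
    """Write new_votes only if the row still carries expected_version."""
    cur = conn.execute(
        "UPDATE poll SET votes = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_votes, poll_id, expected_version),
    )
    if cur.rowcount != 1:
        # Someone else bumped the version first: undo and report it.
        conn.rollback()
        raise StaleWriteError("poll %d was modified concurrently" % poll_id)
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE poll (id INTEGER PRIMARY KEY, votes INTEGER, version INTEGER)"
)
conn.execute("INSERT INTO poll VALUES (1, 0, 1)")
conn.commit()

save_poll(conn, 1, 5, expected_version=1)       # succeeds, version becomes 2
try:
    save_poll(conn, 1, 9, expected_version=1)   # stale copy of the row
    stale_rejected = False
except StaleWriteError:
    stale_rejected = True
```

The WHERE clause only matches if nobody else has bumped the version in
the meantime, so a rowcount of zero is the signal to roll back.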


-mark



--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-users@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en
-~--~~~~--~~--~--~---



Re: *Occasional* PostgreSQL Error

2008-01-30 Thread Mark Green

On Wed, 2008-01-30 at 19:34 -0600, James Bennett wrote:
> On Jan 30, 2008 6:01 PM, Mark Green <[EMAIL PROTECTED]> wrote:
> > Ahem, there's a huge difference between being confronted with
> > a spinner/progress bar or an error page. The former speaks
> > "Please wait", the latter speaks "Try again".
> 
> OK, so let's break this down.

Yay, thanks for that exhaustive response. :-)
I guess we'll eventually have to agree to disagree, but
I'll add my counterpoints for completeness.

> There are two potential cases where you run up against your database's
> concurrent connection limit:
> 
> 1. Your average traffic level involves more concurrent connections
> than your database permits.
> 2. Your average traffic level is within the number of concurrent
> connections your database permits, but you are experiencing a
> temporary spike above that level.
> 
> In case (1), the odds of a timeout/retry scheme yielding success
> within the time the average user is prepared to wait are low; your
> database is simply swamped, and the only options are to increase
> available resources (in the form of database connections) or refuse to
> serve some number of requests.
> 
> So case (2) is the only one worth discussing as a target for a
> timeout/retry scheme.

Yes.

> In this case, you are (apparently) asking for Django to provide three things:
> 
> 1. A mechanism for specifying a set number of database connections
> which Django will persistently maintain across request/response
> boundaries.
> 2. A mechanism for specifying how long Django should wait, when
> attempting to obtain a connection, before timing out and concluding
> that no connection will be obtained.
> 3. A mechanism for specifying how many times, after failing to obtain
> a connection, Django should re-try the attempt.

Yes. Basically a bog-standard connection pool.
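
The three mechanisms listed above (a fixed set of persistent
connections, a wait timeout, a retry count) can be sketched in a few
lines. This is only an illustration under stated assumptions:
ConnectionPool, PoolTimeout and the connect callable are hypothetical
names, and nothing here is Django's actual behaviour.

```python
import queue


class PoolTimeout(Exception):
    """No connection became free within the configured wait."""


class ConnectionPool:
    def __init__(self, connect, size, timeout, retries):
        self._timeout = timeout      # seconds to wait per attempt
        self._retries = retries      # extra attempts after the first
        self._idle = queue.Queue()
        for _ in range(size):        # open the persistent connections once
            self._idle.put(connect())

    def acquire(self):
        """Block until a connection is free or all attempts time out."""
        for _ in range(self._retries + 1):
            try:
                return self._idle.get(timeout=self._timeout)
            except queue.Empty:
                continue
        raise PoolTimeout("no free connection")

    def release(self, conn):
        """Hand the connection back for the next request."""
        self._idle.put(conn)


# object() stands in for a real database connection here.
pool = ConnectionPool(connect=object, size=2, timeout=0.01, retries=1)
a = pool.acquire()
b = pool.acquire()
try:
    pool.acquire()                   # pool is empty, so this waits and fails
    exhausted = False
except PoolTimeout:
    exhausted = True
pool.release(a)
c = pool.acquire()                   # the released connection is reused
```

A request that finds the pool empty waits up to timeout * (retries + 1)
seconds before it fails, which is the "slow down first, error later"
behaviour being asked for.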

> How, then, would we apply these hypothetical configuration directives
> to the situation of a traffic spike? There are two possibilities:
> 
> 1. Set these directives in advance and have them be a permanent part
> of the running application's configuration.
> 2. Avoid setting these directives until a spike is imminent (difficult
> to do) or in progress, and leave them only so long as the spike
> persists.
> 
> In case (1) you are flat-out wasting resources and complicating
> Django's operation, by holding resources which are not used and
> mandating more complex logic for obtaining those resources. In nearly
> all cases this is a bad trade-off to make.

What resources are held and wasted exactly?
Maintaining a number of open TCP connections is much cheaper
than creating/discarding them at a high rate.

I agree that django's CGI-style mode of operation might make
implementation tricky (separate thread?) but you can't seriously
suggest that creating/discarding n connections per second would
be cheaper than maintaining, say, n*10 long-lived connections?

Predictability is the keyword here. From the perspective of my database
(or pgpool instance) I want to be sure that the configured maximum
number of inbound connections can never be exceeded because clients
(such as django) should never get into the uncomfortable situation
of having to deal with a "connection refused".

"Fail-fast" as done by django just doesn't work so well on the frontend.
Users don't like error pages at all, they're a surefire way to damage
your reputation. Yes, slow load times are bad too, but still a much more
comfortable position to be in during temporary rush hours.
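
The difference can be made concrete with a toy simulation. This is a
sketch under simplifying assumptions (discrete ticks, a fixed number of
requests served per tick, no wait timeout); fail_fast and queued are
illustrative names and the traffic numbers are made up.

```python
from collections import deque


def fail_fast(arrivals, capacity):
    """Requests beyond capacity in a tick are rejected with an error."""
    return sum(max(0, n - capacity) for n in arrivals)


def queued(arrivals, capacity):
    """Requests beyond capacity wait; returns the longest wait in ticks."""
    waiting = deque()            # arrival tick of each pending request
    longest = 0
    tick = 0
    while tick < len(arrivals) or waiting:
        if tick < len(arrivals):
            waiting.extend([tick] * arrivals[tick])
        for _ in range(min(capacity, len(waiting))):
            longest = max(longest, tick - waiting.popleft())
        tick += 1
    return longest


spike = [5, 5, 20, 5, 5]         # one tick of traffic at double the capacity
errors = fail_fast(spike, capacity=10)
longest_wait = queued(spike, capacity=10)
```

With these numbers the fail-fast variant turns the spike into ten error
pages, while the queued variant serves everyone and the unluckiest
request waits one extra tick. A real setup would still cap the wait.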

(ever had some marketing droid yell at you in his highest pitch because 
 10% of their expensive click-throughs went to http-500 hell? ;-) )

> In case (2) the first option really isn't possible, because you can't
> reliably predict traffic spikes in advance. This leaves the second
> option, which requires you to be constantly watching the number of
> database connections in use and involves shutting down your
> application temporarily in order to insert the necessary configuration
> directives. It is also unlikely that you will be able to do so before
> at least some users have received error pages.

Hmm. Dynamic adaptation of the pool config is an interesting subject
(java c3p0 can actually do it at runtime, within limits, i think) but
totally out of my scope here. I think a fixed pool config would suffice
to achieve my goal of "graceful behaviour under load".

> So you must either waste resources, or accept increased monitoring
> overhead and the inevitability that some requests will not receive
> successful responses.
> 
> Add to this the following disadvantages:
> 
> * More complex configuration of Django (and hence more potential for
> configuration error).

Oh c'mon. A connection pool is not so complicated.

>

Re: *Occasional* PostgreSQL Error

2008-01-30 Thread Mark Green

On Wed, 2008-01-30 at 11:03 -0600, James Bennett wrote:
> On Jan 30, 2008 8:57 AM, Mark Green <[EMAIL PROTECTED]> wrote:
> > Well, "Build for failure". Temporary overload can happen at any
> > time and I'd expect django to behave exceptionally badly in that
> > case as it is.
> 
> Running out of resources is never a good thing for any system.

Obviously.

> > Disclaimer: I haven't actually tested this behaviour but I've seen it
> > in JDBC apps before we added pooling and don't know why django should
> > be different. These apps would basically "chop off" (i.e. return errors
> > for) the excess percentile of requests. Naturally the affected users
> > would use their "reload"-button and there we have a nice death spiral...
> 
> And if it just slows down you don't think they'll do the same thing?

Ahem, there's a huge difference between being confronted with
a spinner/progress bar or an error page. The former speaks
"Please wait", the latter speaks "Try again".

> > Not really. My desire is to make each individual django instance
> > play well when things get crowded. Making them aware of each other,
> > or even making all database clients aware of each other, sounds
> > like an interesting project but is not what I'm after here.
> 
> But in order to know that things are "crowded", each one has to know
> what all the others are doing. And any non-Django application using
> the same database *also* has to include its own copy of all that
> configuration.

I guess I still haven't made clear what I mean.
I expect *each individual instance* of django to behave gracefully
when it can't get through to the database. No group-knowledge is
needed for a plain old connection retry or (better) connection pooling.
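
A plain connection retry really is local to one process. A minimal
sketch, assuming a connect callable that raises ConnectionError when
the database (or pooler) refuses; connect_with_retry and flaky_connect
are hypothetical names used only for this illustration.

```python
import time


def connect_with_retry(connect, attempts=3, delay=0.05, sleep=time.sleep):
    """Call connect() until it succeeds or attempts are used up."""
    last_error = None
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError as exc:
            last_error = exc
            if attempt < attempts - 1:
                sleep(delay * (2 ** attempt))   # back off: delay, 2*delay, ...
    raise last_error


calls = []


def flaky_connect():
    """Stand-in for a database connect call: refuses twice, then succeeds."""
    calls.append(1)
    if len(calls) < 3:
        raise ConnectionRefusedError("too many connections")
    return "connection"


# The sleep is stubbed out so the example runs instantly.
result = connect_with_retry(flaky_connect, sleep=lambda seconds: None)
```

Nothing in the loop needs to know what other instances or other
database clients are doing; it only reacts to its own refused attempts.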

> > Well, there is a point where a single instance of the external
> > utility doesn't cut it anymore. The only way to go seems to be
> > one pgpool instance per django instance (for performance and
> > to avoid the single point of failure).
> 
> Again: you're repeatedly changing the topic from connection pooling to
> failover. When you decide you want to talk about one or the other for
> more than a few sentences at a time, let me know.

Erm. I have not mentioned failover a single time. I'm just trying
to point out where your "let pgpool handle it"-strategy seems to
fall down.

> > Maybe I'm blowing all this out of proportion
> 
> Almost certainly.
> 
> > but I wonder
> > if any of the high-traffic, multi-server django sites ever
> > ran into it?
> 
> Not really. If you're hitting the max on your DB you have more
> immediate problems than whether your users see an error page or an
> eternal "Loading..." bar.

I'm not talking about maxing out the db constantly. I'm talking about
scratching the limits during peak hours which is something that I'm
pretty sure almost every bigger site has experienced at least once
(cf. "growing pains").

During these peak hours there's a huge difference between users randomly
getting an error page and users randomly having to wait a little longer.


-mark






Re: *Occasional* PostgreSQL Error

2008-01-30 Thread Mark Green

On Tue, 2008-01-29 at 23:33 -0600, James Bennett wrote:
> On Jan 29, 2008 11:18 PM, Mark Green <[EMAIL PROTECTED]> wrote:
> > I agree on the loadbalancer front but the overhead for all
> > those TCP connections (and pgpool managing them) worries me a bit.
> 
> I've used pgpool in production with great success, so I'm not really
> sure what overhead you're talking about.
> 
> > Furthermore, and much more serious, I see no way to ensure
> > graceful degradation in case of overload.
> 
> And here you completely change the topic of discussion from persistent
> pooling of connections to failover when a database reaches its maximum
> connection level, so I'm not really sure what it has to do with
> anything...

Well, "Build for failure". Temporary overload can happen at any
time and I'd expect django to behave exceptionally badly in that
case as it is.

The problem is that under load it will start displaying errors to
users instead of just slowing down. Not nice, imho. There is no
"buffer" (i.e. timeout, connection retries) between django
and the [next hop to the] database.

Disclaimer: I haven't actually tested this behaviour but I've seen it
in JDBC apps before we added pooling and don't know why django should
be different. These apps would basically "chop off" (i.e. return errors
for) the excess percentile of requests. Naturally the affected users
would use their "reload"-button and there we have a nice death spiral...

> > So, long story short, I see no way out of this without
> > proper connection pooling built right into django.
> > Or am I missing something?
> 
> You're missing the fact that you've switched from asking about pooling
> to asking about failover.

Hm, I wouldn't say this is about failover. It's about behaving
gracefully under load.

> Also, your solution would mean that:
> 
> 1. Django must have its own configuration for the number of
> connections it's allowed to use, how long to keep them alive and how
> often to retry them in case of failure, and this must be updated if
> and when use patterns change.

Yup, connection pooling.

> 2. Django must have its own configuration for being notified of what
> every other client application of the same database is doing, and this
> must be updated if and when use patterns change.
>
> 3. Every other client application of the same database must have
> similar dual configuration to know what it's allowed to do and what
> everybody else is doing, and these must be updated if and when use
> patterns change.

Not really. My desire is to make each individual django instance
play well when things get crowded. Making them aware of each other,
or even making all database clients aware of each other, sounds
like an interesting project but is not what I'm after here.

> Or you could just use a single external utility to manage database
> connections, thus keeping all that essentially infrastructural cruft
> out of the application layer while giving you a single place to
> configure it and a single place to make changes when you need them.

Well, there is a point where a single instance of the external
utility doesn't cut it anymore. The only way to go seems to be
one pgpool instance per django instance (for performance and
to avoid the single point of failure).

So there you have your "n configurations" again, only outside
of django and without really solving the overload-problem.

Maybe I'm blowing all this out of proportion but I wonder
if any of the high-traffic, multi-server django sites ever
ran into it?


-mark






Re: *Occasional* PostgreSQL Error

2008-01-29 Thread Mark Green


On Tue, 2008-01-29 at 22:07 -0600, James Bennett wrote:
> On Jan 29, 2008 10:04 PM, Mark Green <[EMAIL PROTECTED]> wrote:
> > Just curious, what's the state of connection pooling in django?
> 
> My personal opinion is that the application level (e.g., Django) is
> the wrong place for connection pooling and for the equivalent "front
> end" solution of load balancing your web servers: the less the
> application layer has to know about what's in front of and behind it,
> the more flexible it will be (since you can make changes without
> having to alter your application-layer code).
> 
> So, for example, connection pooling for Postgres would best be handled
> by a dedicated pooling connection manager like pgpool; Django can
> connect to pgpool as if it's simply a Postgres database, which means
> you don't have to go specifying pooling parameters at the application
> level.
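
As a rough illustration of "connect to pgpool as if it's simply a
Postgres database", the settings only need to point at the pooler
instead of the database server. The fragment below uses the DATABASES
form of current Django releases (Django of this thread's era used flat
DATABASE_* settings instead); the database name, role, host and port
are assumptions for the sketch, not values from this thread.

```python
# settings.py fragment; every value below is a placeholder assumption.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "mysite",          # hypothetical database name
        "USER": "mysite",          # hypothetical role
        "HOST": "127.0.0.1",       # where pgpool listens, not Postgres itself
        "PORT": "9999",            # pgpool's listen port in this sketch
    }
}
```

Pool size and timeouts then live in the pooler's own configuration
rather than in the application.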

Hm, that doesn't sit so well with me.
I agree on the loadbalancer front but the overhead for all
those TCP connections (and pgpool managing them) worries me a bit.

Furthermore, and much more serious, I see no way to ensure
graceful degradation in case of overload.

Let's assume we run a local pgpool instance along with django on each
machine. Django goes through the local pgpool for database access.

Now what happens when the database becomes too slow to
keep up with requests for any reason?

I see two options:

a) pgpool is configured without a limit on inbound connections;
   the hanging connections between django and pgpool will
   eventually exhaust the total number of allowed tcp-
   connections for the django-user or even systemwide.

   django will not be able to open new database connections and
   display nasty error pages to the users. Worse yet, if django
   and webserver are running under the same uid then the webserver
   will likely no longer be able to accept new inbound connections
   and the users get funny error messages from their browsers.

b) pgpool is configured with a limit on inbound connections;
   pgpool will hit the limit and refuse subsequent attempts from
   django, which in turn displays nasty error pages to users.

In order to achieve the desired behaviour of django slowing down
gracefully instead of spitting error pages I think we'd have to
teach django to retry database connections. But this would
open a whole new can of worms, such as risking duplicated
requests when users hit reload, etc...

So, long story short, I see no way out of this without
proper connection pooling built right into django.
Or am I missing something?


-mark






Re: *Occasional* PostgreSQL Error

2008-01-29 Thread Mark Green


On Fri, 2008-01-25 at 15:14 -0800, Jacob Kaplan-Moss wrote:
> Hi Doug --
> 
> On 1/24/08, Doug Van Horn <[EMAIL PROTECTED]> wrote:
> > OperationalError: could not connect to server: No such file or
> > directory
> >     Is the server running locally and accepting
> >     connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?
> 
> This means that, for some reason, a connection to the database
> couldn't be established. You'll get this error if the database isn't
> running, but since you only get it under load I'd guess that you're
> hitting PostgreSQL's max connection limit (see the max_connections
> setting in postgresql.conf). You can tell for sure by checking your
> Postgres log; it'll have a message about reaching the connection
> limit.

Just curious, what's the state of connection pooling in django?
And what does the user see when such an error occurs, I guess
an error 500 message?


-mark






hierarchical constraints (country/city) and form validation?

2007-12-01 Thread Mark Green

Hello djangoics,

I'd like to ask for the opinion of some django veterans
on a task that I imagine to be a fairly common one.

My site allows the user to set a country and city
as part of their UserProfile. Obviously the city-field
should only allow values that are valid within the context
of the selected country ("France / Chicago" doesn't work).

I have hacked up a fairly naive implementation that involves
two CharFields, country and city, where the former refers to
a lengthy list of "choices" and the latter just relies on all forms
to be equipped with a custom constructor (to populate the options)
and validator (to deny invalid combinations). In the templates I'm
using the usual bit of a ajax trickery to populate the "city"
select-widget according to and after the country has been selected.

Now this works ok but it's a lot of ugly boilerplate code and
I'm convinced there must be a much more elegant way to solve
this all at the model level.

I have the country/city relation in the database already, so
it's just a matter of instrumenting it properly within django.
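
To make the goal concrete: ideally the country/city rule lives in one
place, and both the ajax lookup and the server-side validation call
into it. A framework-agnostic sketch, assuming an in-memory mapping in
place of the real country/city tables; CITIES_BY_COUNTRY, city_choices
and validate_location are hypothetical names.

```python
# Stand-in for the country/city relation that lives in the database.
CITIES_BY_COUNTRY = {
    "France": {"Paris", "Lyon"},
    "USA": {"Chicago", "Boston"},
}


def city_choices(country):
    """Cities that may be offered once a country has been picked."""
    return sorted(CITIES_BY_COUNTRY.get(country, ()))


def validate_location(country, city):
    """Return a list of error messages; empty means the pair is valid."""
    if country not in CITIES_BY_COUNTRY:
        return ["Unknown country: %s" % country]
    if city not in CITIES_BY_COUNTRY[country]:
        return ["%s is not a city in %s" % (city, country)]
    return []
```

For example, validate_location("France", "Chicago") reports an error
while validate_location("USA", "Chicago") returns an empty list, and
city_choices("France") is what the select-widget would be filled with.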

Can someone set me on the right track? :)


-mark






Re: Passing the max_length from the Model into the Form

2007-11-23 Thread Mark Green

On Fri, 2007-11-23 at 14:59 -0800, leotr wrote:
> Yes, not only is it dirty but it's inefficient as well!

And your idea violates DRY.

> Declare a constant somewhere and use it in both model and form.

And where would that "somewhere" be?
Custom constants on the model?

I see no point in adding indirection here.
The model definition itself is supposed to
be canonical (dynamic model tricks aside).

No, I think Wolfram's approach is conceptually correct.
If it looks dirty then that's a matter of API cosmetics.
If it's inefficient then that's a matter of optimization.

Rule of thumb: If you want information about the
model then ask the model.
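
The rule of thumb can be shown without any framework. The classes
below are tiny stand-ins, not Django's; CharField, Link, LinkForm and
max_length_of are names made up for this sketch. In Django itself the
same lookup is, to my knowledge, usually spelled
Link._meta.get_field('url').max_length.

```python
class CharField:
    """Tiny stand-in for a model field that records its own limit."""

    def __init__(self, max_length):
        self.max_length = max_length


class Link:
    """Stand-in model: the field definition is the single source of truth."""
    url = CharField(max_length=80)


def max_length_of(model, field_name):
    """Ask the model for the limit instead of duplicating the number."""
    return getattr(model, field_name).max_length


class LinkForm:
    """Stand-in form that derives its limit from the model definition."""
    url_max_length = max_length_of(Link, "url")
```

The number 80 appears exactly once, on the model, and the form picks it
up from there.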

> max_link_url_len = 80
> 
> class Link(models.Model):
>     url = bla_bla_field(max_length=max_link_url_len)
> 
> class myForm(forms.Form):
>     url = forms.URLField(max_length=max_link_url_len)
> 
> :)






Re: image/file uploads, custom filenames, security

2007-10-17 Thread Mark Green

On Tue, 2007-10-16 at 10:18 -0400, Marty Alchin wrote:
> I've done some work on FileField lately that address some of your concerns.
> 
> On 10/16/07, Mark Green <[EMAIL PROTECTED]> wrote:
> > * does django properly sanitize the filename or rather, use
> >   safe temp files?  i wonder what would happen if i tried to
> >   upload a file called "../../traverse.txt"?
> 
> I haven't done any testing on that particular situation, so I can't
> speak to that one.

well, i guess i'll give it a shot and report to the list
if there are problems.

> > * how can i enforce a filename on the uploaded file?
> >   i want to completely ignore the remote name of the file
> >   and instead store it as, for example, {{username}}.jpg
> 
> There's a ticket[1] in Trac to revamp the way file storage is defined,
> which would allow you to override some of how Django selects a
> filename. Currently, it won't allow you to use the username, or any
> other details of the model the image is attached to, but that's
> becoming a common request, so I'll see about adding it before it hits
> trunk.

interesting!
i can only second that common request. ;)
any idea when it will be done?

> > * anyone know if the PIL stuff is hardened against image bombs?
> >   (small images that expand to gigabytes when expanded to bitmap)
> >   would it be feasible to subclass ImageFile and replace the PIL
> >   calls with some paranoid homegrown stuff (e.g. ImageMagick),
> >   anyone know a starting point for this?
> 
> The ticket I mentioned above also makes it much easier to subclass
> FileField and ImageField to add or change whatever functionality you
> like. I don't know whether PIL already does what you need, but if
> you're paranoid, this patch should help you out.

awesome. i know it's probably a fairly exotic request but
since my site deals heavily with images i can imagine some
customization might pay off (security- or performancewise).


thanks for the info!

-mark






image/file uploads, custom filenames, security

2007-10-16 Thread Mark Green

hi all,

i've been playing with ImageField and FileField recently and so
far they work like a charm.

some questions remain, though:

* does django properly sanitize the filename or rather, use
  safe temp files?  i wonder what would happen if i tried to
  upload a file called "../../traverse.txt"?

* how can i enforce a filename on the uploaded file?
  i want to completely ignore the remote name of the file
  and instead store it as, for example, {{username}}.jpg

* anyone know if the PIL stuff is hardened against image bombs?
  (small images that expand to gigabytes when expanded to bitmap)
  would it be feasible to subclass ImageFile and replace the PIL
  calls with some paranoid homegrown stuff (e.g. ImageMagick),
  anyone know a starting point for this?
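
For the first two questions, the kind of handling being asked for looks
roughly like this. It is a sketch of the idea only, not a description
of what Django's FileField does; storage_name and the allowed
extensions are assumptions made up for the example.

```python
import posixpath
import re


def storage_name(username, client_filename, allowed=(".jpg", ".png", ".gif")):
    """Build the stored filename from the username, not the client's name."""
    # Normalise Windows separators, then drop any directory part.
    base = posixpath.basename(client_filename.replace("\\", "/"))
    ext = posixpath.splitext(base)[1].lower()
    if ext not in allowed:
        raise ValueError("unsupported file type: %r" % ext)
    # Keep only characters that are harmless in a path component.
    safe_user = re.sub(r"[^A-Za-z0-9_-]", "_", username)
    if not safe_user:
        raise ValueError("empty username")
    return safe_user + ext
```

Only the extension of the remote name survives, so an upload called
"../../traverse.jpg" by user mark would be stored as mark.jpg, and
"../../traverse.txt" would be rejected outright.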


-mark






choices/, how to get rid of the dashes?

2007-10-02 Thread Mark Green

Hi all,

sorry for repeating my question but I haven't gotten
a solution on first attempt and this really nags me.

This is my model:

from django.db import models

class Person(models.Model):
    GENDER_CHOICES = (
        ('m', 'Male'),
        ('f', 'Female'),
    )
    # The verbose name is positional, so it must precede the keyword arguments.
    gender = models.CharField("gender", maxlength=1, choices=GENDER_CHOICES,
                              default='m', blank=False, null=False)



Using form_for_model() on the above model results in a select widget
with these options:

    ---------
    Male
    Female



How do I get rid of the first option (the dashes)?
Creating my form with initial={ 'gender':'m' } at least sets the default option
correctly (default='m' on the model is happily ignored btw, why?) but the dashes
are still there.

It obviously doesn't make sense to offer an option to the user
that they aren't allowed to select, so can anyone confirm that
this is a bug or is my code just acting weird?


-mark






Re: Should Django have a road map?

2007-10-01 Thread Mark Green

On Mon, 2007-10-01 at 17:07 +0200, Stefan Matthias Aust wrote: 
> Joe,
> 
> 2007/10/1, Joe <[EMAIL PROTECTED]>:
> > [...]
> > And this is the biggest disconnect between Django's team and the
> > business world.  If I went to my bosses and told them "It's done when
> > it's done" about our upcoming product releases, I would get fired.
> > Your response should be, "It's really hard to estimate, but here is my
> > best guess and a target for us to shoot for."  And, you know what?
> > Most of the time our estimates are pretty close.  And by tracking how
> > we do on our estimates, we can make them even more accurate.
> > [...]
> 
> You explained my points much better than I could (in plain English at least ;)

Throwing in my 2 cents here...

I basically agree with all that has been said, as in: it would be great
to have a stable, predictable timeline - but it's unrealistic to achieve
that without at least one full-time person to do the housekeeping.

Since that full-time person doesn't exist I'd like to propose
something in between that I have seen implemented successfully
even in very small projects:

0. Realize that creating a roadmap takes zero time and only very
   little effort. You get it for free, from your ticket-system.

1. Get sorted. Take advantage of the trac milestones and, more 
   importantly, ticket relations (does trac have them nowadays?).
   The future will become *much* clearer once you have added hierarchy
   to the ticket swarm. It forces you to decide which things to do
   first (Milestone 1) and which to delay for a later date.
   At the same time a roadmap emerges naturally because the
   "almost-done things" bubble up and become more visible.

2. Don't bother with actual calendar dates. An occasional rough 
   estimate "could be ready at" never hurts but "70% done, ticket-wise"
   gives a much better indication of progress anyways. Better yet, 
   instead of picking randomly, people can then specifically 
   choose to work on tickets that are relevant to the next
   milestone or to the particular feature that they're after.

3. Maybe look into a better ticket system.
   The trac ticketing sucks very hard in all regards
   and is beaten hands down by http://mantisbt.org or
   http://redmine.org. I hate to say it but even the dreaded
   JIRA does it better. Well, long story short, you want
   custom workflow/ticket states, so your tickets can't be
   "ready for checkin" and "new" at the same time. You want
   a clean UI and a working search that doesn't hurt every time
   you use it. You want to draw clear parent/child relationships
   all the way up to the milestones.
   Of course a new ticket-system is not a must but having fought
   my own share of uphill battles against trac I can tell from 
   experience that many trac-users don't realize that they're missing
   a fair share of essential features.

In summary, I think this whole discussion should really be
about transparency, not about fixed dates.

Programming schedules don't work with fixed dates.
"It's done when it's done" is not an excuse, it's an
honest summary of the situation.

So all we need to do is add more transparency to the state of affairs
and that's simply a matter of using better tools or using the existing
tools better.

I don't think the call for a roadmap would have appeared if
somewhere on the django site it read like:

---snip---

Milestone 7  (16%, 40 of 240 tickets closed, aggregate eta: 01-Apr-2009)
   - Sub-Goal 1: old admin to new admin
     (66%, 14 of 21 tickets closed, eta: 01-12-2007)
   - Sub-Goal 2: enforce winnie-pooh.css for all sites
     (0%, 0 of 1 tickets closed)
   - etc.

---snip---

Of course the ETA doesn't really mean anything (just some numbers
magic that a good ticket system can do) but the above view is imho
the closest to a "roadmap" that an OSS project can get.
And, believe it or not, towards the end of a Sub-Goal those figures
often converge to surprisingly accurate estimates.

Last but not least, don't underestimate the motivational factor
of more transparency. People like the feeling of actually causing
an impact on something (the fame-factor). With the current trac
there's just an anonymous sea of tickets, not really inviting
to give or take yet another drop. A little more structure
and clustering would provide much better positive feedback
to the contributor here. It just *feels* better to close ticket
6 of 10 on a Sub-Goal, pushing it from 50% to 60%, than to close
one random ticket out of hundreds without knowing what other tickets
may be interacting with or even blocking the issue at hand.
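
The arithmetic behind such a view is trivial once tickets are grouped.
A small sketch, assuming each ticket is just the string "open" or
"closed" and that sub-goals are plain lists; progress and report are
hypothetical helpers, not something trac provides.

```python
def progress(tickets):
    """Return (closed, total, percent closed) for a list of ticket states."""
    total = len(tickets)
    closed = sum(1 for state in tickets if state == "closed")
    percent = 100 * closed // total if total else 0
    return closed, total, percent


def report(milestone, sub_goals):
    """Render a milestone and its sub-goals as indented progress lines."""
    everything = [state for tickets in sub_goals.values() for state in tickets]
    line = "%s (%d of %d tickets closed, %d%%)"
    lines = [line % ((milestone,) + progress(everything))]
    for name, tickets in sub_goals.items():
        lines.append("   - " + line % ((name,) + progress(tickets)))
    return "\n".join(lines)


sub_goals = {
    "Sub-Goal 1": ["closed"] * 5 + ["open"] * 5,
    "Sub-Goal 2": ["open"],
}
before = progress(sub_goals["Sub-Goal 1"])
sub_goals["Sub-Goal 1"][5] = "closed"          # close ticket 6 of 10
after = progress(sub_goals["Sub-Goal 1"])
summary = report("Milestone 7", sub_goals)
```

Closing that one ticket moves the sub-goal from 50% to 60%, and the
milestone line is simply the same calculation over all tickets below it.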

Well, this is getting too long but I hope some of my ideas
don't sound too far fetched to some of you. :)

-mark




Re: how to scale (was: how to do something at startup)

2007-09-30 Thread Mark Green

On Sun, 2007-09-30 at 20:37 -0500, James Bennett wrote:
> On 9/30/07, Mark Green <[EMAIL PROTECTED]> wrote:
> > I'm not sure what drove me to call it "fragment caching".
> > What I really meant to point at are the little things (such as
> > form_for_model()) that would likely benefit from some object
> > caching instead of burning cycles for each request.
> 
> You can do this, by the way, and in fact quite a few people do it
> accidentally when taking their first steps with Django: if you
> instantiate an object at the module level, it'll remain resident in
> memory and will be re-used instead of re-instantiated on every
> request.
> 
> The canonical example is people who evaluate a QuerySet in their
> URLConf module, and then are surprised at how it seems to be "cached"
> forever ;)
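
The effect described above is ordinary Python module behaviour and can
be sketched without Django. The names below (build_form_class,
PERSON_FORM, view) are made up for the illustration; the expensive call
stands in for something like form_for_model().

```python
build_count = 0


def build_form_class():
    """Stand-in for an expensive call such as building a form class."""
    global build_count
    build_count += 1
    return type("PersonForm", (), {"fields": ("name", "gender")})


# Evaluated once, when the module is first imported by the server process.
PERSON_FORM = build_form_class()


def view(request):
    """Every request reuses the class created at import time."""
    return PERSON_FORM()


first = view("request 1")
second = view("request 2")
```

The class is built once per process, not once per request. The flip
side is the QuerySet surprise mentioned above: whatever is computed at
module level stays as it was until the process restarts.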

Interesting!
Time to review my little custom thread. Think I jumped through some
very unnecessary hoops...






Re: how to scale (was: how to do something at startup)

2007-09-30 Thread Mark Green

On Sun, 2007-09-30 at 20:29 -0500, James Bennett wrote:
> On 9/30/07, Mark Green <[EMAIL PROTECTED]> wrote:
> > My question was really only about the former, a much simpler problem:
> > How to keep a tcp connection persistent and re-use it across requests?
> 
> By using a pooling connection manager external to Django. Again,
> complicating the application layer with too many details of the other
> layers in the stack seems -- to me, at least -- like a premature
> optimization that costs flexibility in the long run.
> 
> > While this overhead may be constant in most (not all!) scenarios
> > it's still a waste of resources that doesn't sit well with me.
> 
> You say "waste", I say "trade-off" ;)
> 
> And that's what web development is, really: a series of trades. The
> ability to "hot swap" front-end or back-end nodes by using pooling
> and load balancing external to Django is -- again, to me -- worth the
> trade of a slight increase in overhead, because it means you can bring
> additional nodes into or out of the pool without having to reconfigure
> the application layer.

I fully agree with your picture of the trade-off but I think connection
pooling neither contradicts hot-swapping nor does it introduce any kind
of application layer configuration.

What it does introduce would be additional code complexity, so imho the
argument should be about whether it's worth that or not.

In my opinion it should be, simply because if the protocol was meant to
be used like it is by django it would probably be using UDP transport
instead of TCP. The overhead of creating a new TCP connection,
potentially for each request, should imho not be underestimated.
Just try it on a host receiving 50 hits/sec and up...

> > I do understand (and endorse very much) that django is a shared nothing
> > architecture but imho that doesn't imply "zero internal persistence
> > across requests".
> 
> Keep in mind also that Django deliberately runs a bit closer to the
> bare HTTP than some of the heavyweight frameworks and that HTTP -- by
> design -- is utterly stateless. Again, it's a trade: inherently
> stateless architectures are tremendously easy to scale across virtual
> or physical machines, and I'd argue that's worth the use of external
> persistence mechanisms when that sort of thing is needed.

I'm not arguing that at all. I don't question the decision to not
internalize certain things, I only propose to make the interface to
these external things as efficient as possible.

There are a few good reasons why connection pooling towards the
database is common practice nowadays. I think one is because juggling
with an intense flux of tcp connections is quite expensive on some
architectures and/or databases, another would be the scenario I
mentioned: better control in overload situations.
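
To make the "better control in overload situations" point concrete, here is a rough sketch of a fixed-size pool built on the standard library's queue module. It assumes a zero-argument `connect` callable supplied by the caller; ConnectionPool, acquire and release are illustrative names, not an existing Django API.

```python
import queue


class ConnectionPool:
    """Hand out a fixed number of reusable connections."""

    def __init__(self, connect, size):
        # `connect` is any zero-argument callable that opens a connection.
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    def acquire(self, timeout=None):
        # Blocks while every connection is in use. With a timeout it raises
        # queue.Empty instead, which gives the caller one clear place to
        # handle overload rather than piling up new connection attempts.
        return self._idle.get(timeout=timeout)

    def release(self, conn):
        # Return the connection so the next request can re-use it.
        self._idle.put(conn)
```

The pool size acts as a hard cap on concurrent connections, so a slow backend sees waiting callers on the application side instead of a growing number of fresh TCP connections.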

> > Further problems arise when you need to integrate with a remote peer
> > that simply depends on persistent connections. My current candidate is
> > the spread toolkit (http://www.spread.org) but it's certainly not the
> > only piece of "environmental software" working that way.
> 
> There have been a couple people lately arguing about cases where
> Django isn't an appropriate solution, and so far I haven't really
> agreed with the examples put forth. But in this thread I think you've
> hit a genuine use case where Django probably isn't what you want: if
> you need high-performance networking with external services, I'd
> highly recommend Twisted[1] as the best Python option I'm aware of.

Yes and no. Yes, I agree django isn't for everything and No, I don't
really think we're trying to abuse it to that extent.

To elaborate a bit, spread is a messaging framework, like
activemq in the java world, only less broken. ;)

While it can indeed serve as the backbone for grid-style
applications we're only using it for light internode messaging
and a small "common tuplespace", to realize, for example:

- Synchronized list of logged in users across all nodes

- AJAX Chat

- Announcement of asynchronous events (e.g. backend processing in a 
  non-django process finishes) to the user

As you can see, our webapp is at heart really a webapp,
we're not trying to shoehorn django into being a computing
cluster or bittorrent client.

Messaging as opposed to say, polling a memcached or even the database,
means very real performance advantages to us.

So, I stand by my point; I think it would be nice if django spawned a
set of worker threads on startup, used them for ORM connection pooling
and offered a small API for the developer to take advantage of them.

I'll also happily contribute my little custom-thread.py for
inclusion if that helps but I somehow doubt the django gurus
couldn't do better in less time. ;)


-mark




Re: how to scale (was: how to do something at startup)

2007-09-30 Thread Mark Green

On Sun, 2007-09-30 at 16:16 -0500, James Bennett wrote:
> On 9/30/07, Mark Green <[EMAIL PROTECTED]> wrote:
> > Hm, this raises some serious scalability questions for me.
> > From your description it sounds like there is no template
> > fragment caching, not even db connection pooling possible
> > with django?
> 
> You can cache anything you want to cache; read the caching
> documentation (the whole thing) before jumping to conclusions about
> that. At work we use a custom template Node class which caches its
> output, for example.

Sorry, I was indeed jumping too quickly on the caching issue or
rather wording my concerns poorly.

I'm not sure what drove me to call it "fragment caching".
What I really meant to point at are the little things (such as
form_for_model()) that would likely benefit from some object
caching instead of burning cycles for each request.

I do admit though that this may be scratching the realm of
micro-optimization and I realize I shouldn't have brought it up
without at least measuring it first. Let's just skip this point for now
(my bad, sorry again) and instead focus on the (imho) more glaring
issue of "no persistent connections", see below.

> As for database clustering, there's a philosophical issue here: Django
> shouldn't need to know whether there's one database server behind it,
> or five, or a hundred. We've had success using pgpool, for example,
> which -- from Django's point of view -- looks the same as any
> PostgreSQL database, but in reality is pooling connections and
> supports multiple actual databases running behind it.
> 
> Think of it the same way you'd do load-balancing in front of your
> application: just as users shouldn't need to know that you have, say,
> ten web nodes running Django, and just as they shouldn't have to stop
> and ask, "which one of the site's web nodes do I want to request a
> page from?", Django shouldn't need to know how many database nodes you
> have, or which one it should talk to on each query. The less the
> various layers of your stack have to know about each other, the easier
> it'll be to make changes.
> 
> I'd suggest reading the deployment chapter of the Django book for more 
> details:
> 
> http://www.djangobook.com/en/beta/chapter21/
> 
> > And what about integration with a messaging framework
> > (spread or somesuch) for efficient cluster communications?
> 
> So long as there's an interface you can talk to from Python, or over
> standard networking protocols, what's the holdup? Django does not have
> "out of the box" support for interoperating with every single
> component someone might want to use, but then neither would an
> "enterprise" Java framework; that's why you have programmers ;)

First off, thanks for all the insight. Unfortunately I think
you misread my "db connection pooling" as "db clustering".

My question was really only about the former, a much simpler problem:
How to keep a tcp connection persistent and re-use it across requests?

Creating and discarding tcp connections at a high rate imposes a
measurable overhead for both the initiator (django) and the
receiving end (e.g. RDBMS or even a pgpool on localhost).
While this overhead may be constant in most (not all!) scenarios
it's still a waste of resources that doesn't sit well with me.
In particular, if and when the receiving end slows down under load,
the last thing you want is incoming connection attempts to pile up.

I do understand (and endorse very much) that django is a shared nothing
architecture but imho that doesn't imply "zero internal persistence
across requests".

Further problems arise when you need to integrate with a remote peer
that simply depends on persistent connections. My current candidate is
the spread toolkit (http://www.spread.org) but it's certainly not the
only piece of "environmental software" working that way.

I'm currently approaching the problem by spawning a custom thread on
first request (thus my inquiry about "how to do something at startup"),
but I think django would benefit from providing standard infrastructure
for that - which comes for free when proper connection pooling for the
ORM is implemented.


-mark


PS: Sorry if django actually *does* proper pooling already and I'm 
beating a dead horse here. My assumption that it doesn't do it
comes from the fact that it doesn't seem to pull up a
persistent thread and because my grep for "pool" over the
svn sources didn't hit anything. If murder is the case
you can just ignore my whole ranting...






Re: how to do something at startup

2007-09-30 Thread Mark Green

On Fri, 2007-09-28 at 22:34 -0600, staff-gmail wrote:
> James Bennett wrote:
> > On 9/28/07, Mark Green <[EMAIL PROTECTED]> wrote:
> >   
> >> i'm looking for a way to perform a bunch of initialization tasks
> >> right after django startup.
> >> 
> >
> > There really is no such thing as "Django startup"; remember that
> > Django is hosted inside a web server, and that server processes will
> > come and go over time with no real concept of anything persisting
> > beyond the life of a process, unless you serialize out to an external
> > store (such as your database, or a file, or memcached). And then
> > you'll want to be very careful in how you "initialize", because that's
> > probably going to happen every time a server process is started, and
> > you'll need to take care that you're not unnecessarily regenerating or
> > recalculating something when you could load it from something
> > external.
> >   
> 
> Not sure what you are trying to initialize, but you could call an 
> initialization method from your initial view method.  If it is like a 
> new game or new project or new whatever - I have the user click on a 
> start button which then calls a function (in the project or game or 
> whatever model) that retrieves or resets a bunch of variables and 
> creates a new object from that model.  Likewise I have some initialize 
> functions in models that create new "seed" data for a new account and 
> are called and executed from the on-submit method that is specified in 
> the login form.  Is that what you are trying to do?

Thanks for the idea but my needs are a bit more low-level I think.
In particular I need to start up (and maintain over the lifetime
of the django instance) a tcp connection to our messaging broker for
later use from views-code.


-mark






Re: how to scale (was: how to do something at startup)

2007-09-30 Thread Mark Green

On Fri, 2007-09-28 at 22:29 -0500, James Bennett wrote:
> On 9/28/07, Mark Green <[EMAIL PROTECTED]> wrote:
> > i'm looking for a way to perform a bunch of initialization tasks
> > right after django startup.
> 
> There really is no such thing as "Django startup"; remember that
> Django is hosted inside a web server, and that server processes will
> come and go over time with no real concept of anything persisting
> beyond the life of a process, unless you serialize out to an external
> store (such as your database, or a file, or memcached). And then
> you'll want to be very careful in how you "initialize", because that's
> probably going to happen every time a server process is started, and
> you'll need to take care that you're not unnecessarily regenerating or
> recalculating something when you could load it from something
> external.

Oops!  Guess I missed that concept because I've only been
playing with the development server so far.

Hm, this raises some serious scalability questions for me.
From your description it sounds like there is no template
fragment caching, not even db connection pooling possible
with django?

And what about integration with a messaging framework
(spread or somesuch) for efficient cluster communications?

These all seem to be basic requirements for scalability
and integration with existing infrastructure.

Any thoughts on that?


-mark






how to do something at startup

2007-09-28 Thread Mark Green

hi all,

i'm looking for a way to perform a bunch of initialization tasks
right after django startup.

where would i put such things and how/when are they called?


-mark






Re: Two newform best practice questions

2007-09-26 Thread Mark Green

Hi Joseph,

I say thanks for the pointer, too.
A quick question (since you seem to be involved with this): is there any
reason not to have django prefix the form fields by default with, say,
the model name (so that prefix='' or prefix='somethingelse' can still be
used if someone doesn't want it that way)?

I would definitely vote for making it default unless someone can come
up with potential problems it may cause. There'd be just one thing less
to worry about if it was that way...

And if there are worries about backwards compatibility I'd propose a
toggle-option in settings.py defaulting to False.
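
To show why a prefix removes the name clash, here is a small plain-Python sketch of the naming scheme. It assumes the "prefix-fieldname" pattern for the HTML field names; add_prefix and extract below are illustrative helpers written for this example, not calls into Django.

```python
def add_prefix(prefix, field_name):
    # "form1" + "date" -> "form1-date"; an empty prefix leaves the name alone.
    return "%s-%s" % (prefix, field_name) if prefix else field_name


def extract(post, prefix, field_names):
    """Pick one form's values out of a POST dict shared by several forms."""
    return {name: post.get(add_prefix(prefix, name)) for name in field_names}


# Two forms, both with a 'date' field, submitted in a single HTML form.
post = {"form1-date": "2007-09-26", "form2-date": "2007-10-01"}
form1_data = extract(post, "form1", ["date"])
form2_data = extract(post, "form2", ["date"])
```

With a default prefix taken from the model name, the same separation would happen without the developer having to ask for it.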


-mark

On Wed, 2007-09-26 at 01:29 -0500, Joseph Kocherhans wrote:
> On 9/26/07, Przemek Gawronski <[EMAIL PROTECTED]> wrote:
> >
> > I'm using several forms (newforms) to build one html form. One thing to
> > watch out for is common field names in your django form classes. So if
> > you have two django forms and they both have a field 'date' for example,
> > then handling in your view method in request.POST there will be only one
> > key 'date' that will go in to both django forms via
> > f1=Form1(request.POST) and f2=Form2(request.POST). I'm still thinking on
> > how to solve this in a elegant way instead of changing the field names
> > in django forms to say date1 and date2.
> 
> Try this:
> 
>f1 = Form1(request.POST, prefix='form1')
>f2 = Form2(request.POST, prefix='form2')
> 
> That should solve your problem. The prefix argument is prepended to
> every field name in the html, thus preventing name clashes between
> forms. That should be in the 0.96 release. No docs on it yet though.
> Sorry :(
> 
> Here's the changeset that added it, and the tests there should show
> you some examples if needed
> http://code.djangoproject.com/changeset/4194
> 
> Joseph






Re: how to implement "stay logged on this computer until i log out"?

2007-09-24 Thread Mark Green

Hi Joe,

thx for the pointer, looks workable.
Will try it soon.

-moe

On Sun, 2007-09-23 at 23:08 -0700, Joseph Heck wrote:
> You can certainly access cookies manually and do with them as you
> like. That's how we've implemented a "remember who it was that last
> logged in from this computer" kind of feature.
> 
> There's a set_cookie() method on the response object that can do this
> work for you - but it's not thoroughly documented. I'd recommend
> looking at the code directly and maybe checking out this thread:
> 
> http://groups.google.com/group/django-users/browse_thread/thread/7b65ff5783f71b9c/a4079aa60e7dfa37
> 
> -joe
> 
> On 9/23/07, Mark Green <[EMAIL PROTECTED]> wrote:
> >
> > Hi list,
> >
> > I would like to have sessions normally timeout after
> > 8 hours, that is easily achieved by setting
> > SESSION_COOKIE_AGE in settings.py.
> >
> > But additionally I'd like to provide a checkbox to "stay logged
> > in on this computer until i log out" which shall make the
> > session immortal (remove expiry).
> >
> > Is there anything in the API to implement that or can I get
> > at the session-cookie meat to do it manually?
> >
> > I basically need to override the default from
> > SESSION_COOKIE_AGE for individual sessions.
> >
> >
> > -mark
> >
> >





how to implement "stay logged on this computer until i log out"?

2007-09-23 Thread Mark Green

Hi list,

I would like to have sessions normally timeout after
8 hours, that is easily achieved by setting
SESSION_COOKIE_AGE in settings.py.

But additionally I'd like to provide a checkbox to "stay logged
in on this computer until i log out" which shall make the
session immortal (remove expiry).

Is there anything in the API to implement that or can I get
at the session-cookie meat to do it manually?

I basically need to override the default from
SESSION_COOKIE_AGE for individual sessions.
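
For readers on later Django releases: the session object there offers a set_expiry() method for exactly this kind of per-session override. The sketch below assumes such a method and uses a small fake session so it runs without Django installed; apply_remember_me, FakeSession and the two constants are hypothetical names chosen for the example.

```python
EIGHT_HOURS = 8 * 60 * 60
TEN_YEARS = 10 * 365 * 24 * 60 * 60  # "until I log out", in practice


def apply_remember_me(session, remember):
    # `session` is expected to offer set_expiry(seconds), as
    # request.session does in later Django releases.
    session.set_expiry(TEN_YEARS if remember else EIGHT_HOURS)


class FakeSession:
    """Test double so the sketch runs without Django installed."""

    def __init__(self):
        self.expiry = None

    def set_expiry(self, seconds):
        self.expiry = seconds
```

In a login view this would be called once, right after authentication, with the value of the checkbox.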


-mark





Re: choices/ and getting rid of the dashes?

2007-09-23 Thread Mark Green

Thanks for the pointer.
This does indeed change the default value to 'Male'
but the select-box still offers the dashes...

I want to eliminate the dashes-option completely,
why give the user a choice that will never be accepted?
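
One possible approach, sketched in plain Python: filter the empty entry out of the choices list before it reaches the widget. How the filtered list is assigned back to the form field or widget depends on the Django version and is left out here; without_blank and BLANK_CHOICE are hypothetical names for this example.

```python
GENDER_CHOICES = (("m", "Male"), ("f", "Female"))
BLANK_CHOICE = ("", "---------")  # the "dashes" entry with an empty value


def without_blank(choices):
    # Keep only entries whose submitted value is non-empty.
    return [(value, label) for value, label in choices if value != ""]


offered = [BLANK_CHOICE] + list(GENDER_CHOICES)
cleaned = without_blank(offered)
```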

-mark

On Mon, 2007-09-10 at 06:47 +, Ryan wrote:
> Use initial when calling your form class.
> 
> formClass = forms.form_for_model(Person)
> form = formClass(initial={'gender': 'm'})
> 
> On Sep 6, 4:31 pm, Mark Green <[EMAIL PROTECTED]> wrote:
> > Hi all,
> >
> > This is my model:
> >
> > class Person(models.Model):
> >     GENDER_CHOICES = (
> >         ('m', 'Male'),
> >         ('f', 'Female'),
> >     )
> >     # The verbose name must come before the keyword arguments.
> >     gender = models.CharField("gender", blank=False, maxlength=1,
> >                               choices=GENDER_CHOICES, default='m')
> >
> > Using form_for_model() on the above model results in HTML like this:
> >
> > <select>
> > <option>---------</option>
> > <option>Male</option>
> > <option>Female</option>
> > </select>
> >
> > How do I get rid of the first option (the dashes)?
> > I would prefer to have the <select>-widget default to my default-value.
> >
> > Any help appreciated!
> >
> > -mark





Re: Form field deletion

2007-09-07 Thread Mark Green

On Fri, Sep 07, 2007 at 03:24:46PM -0500, jake elliott wrote:
> 
> Oleg Korsak wrote:
> >> Sure - use the 'fields' option to specify the subset of model fields
> >> you want to use on the form.
> >>
> >   form_for_instance() got an unexpected keyword argument 'fields'
> > 
> 
> this argument to form_for_instance() and form_for_model() is only
> available in the SVN version of django.
> 
> one quick way to get this behavior without that argument is to set the
> 'user' field 'editable=False'
> 
> user = models.ForeignKey(User, editable=False)
> 
> remember this will affect the change form in contrib.admin also, which
> may or may not be desirable for your project.
> 
> -jake

i have a related question, is there a way to change the
order of the fields, too?

my form adds a second password field like this:


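class JoinForm(forms.BaseForm):
    def __init__(self, *args, **kwargs):
        # Add a second password field before the base class sets up the form.
        self.base_fields['pw2'] = forms.CharField(label='again', min_length=3)
        self.base_fields['pw2'].widget = forms.PasswordInput()
        super(JoinForm, self).__init__(*args, **kwargs)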


unfortunately this new field always appears at the end
of my form and fields=(..) doesn't affect the order.
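
The re-ordering itself can be sketched independently of Django, assuming the form keeps its fields in an ordered mapping. The helper move_after below is hypothetical, and how the re-ordered mapping is handed back to the form depends on the Django version.

```python
from collections import OrderedDict


def move_after(fields, name, anchor):
    """Return a copy of `fields` with `name` placed right after `anchor`."""
    moved = fields[name]
    reordered = OrderedDict()
    for key, value in fields.items():
        if key == name:
            continue
        reordered[key] = value
        if key == anchor:
            reordered[name] = moved
    # If `anchor` was not found, keep the field at the end.
    reordered.setdefault(name, moved)
    return reordered


fields = OrderedDict(
    [("username", 1), ("pw", 2), ("email", 3), ("pw2", 4)]
)
ordered = move_after(fields, "pw2", "pw")
```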


-mark





choices/ and getting rid of the dashes?

2007-09-06 Thread Mark Green

Hi all,

This is my model:

class Person(models.Model):
    GENDER_CHOICES = (
        ('m', 'Male'),
        ('f', 'Female'),
    )
    # The verbose name must come before the keyword arguments.
    gender = models.CharField("gender", blank=False, maxlength=1,
                              choices=GENDER_CHOICES, default='m')



Using form_for_model() on the above model results in HTML like this:


<select>
<option>---------</option>
<option>Male</option>
<option>Female</option>
</select>



How do I get rid of the first option (the dashes)?
I would prefer to have the <select>-widget default to my default-value.

Any help appreciated!


-mark

