Re: synchronization problem
On Sun, 2008-03-30 at 17:49 -0700, meppum wrote: > The easiest way to deal with this is to create a "version" column that > gets updated with the current version number of the row in the > database. Increment this column value each time a save is performed on > that row. Override the save method on the model you want to perform > this check on (in your example the poll model), and on commit > check that the value of the "version" column in the database is LESS > THAN the value of the "version" column you are about to commit. If it > is not, then throw an error and roll back the transaction. Ehm, and how would you "check on commit"? -mark --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups "Django users" group. To post to this group, send email to django-users@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/django-users?hl=en -~--~~~~--~~--~--~---
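One common way to get the effect of a "check on commit" is to fold the version comparison into the UPDATE statement itself, so the check and the write are a single atomic step. The following is only a minimal sketch of that variant, not meppum's exact proposal and not Django code: it uses the standard-library sqlite3 module, and the `poll` table, the `save_poll` helper and the `StaleRowError` exception are hypothetical names chosen for illustration.

```python
import sqlite3


class StaleRowError(Exception):
    """Raised when the row changed since the caller read it (hypothetical name)."""


def save_poll(conn, poll_id, question, expected_version):
    # The WHERE clause only matches if nobody has bumped the version since
    # we read the row, so the comparison and the write cannot be interleaved.
    cur = conn.execute(
        "UPDATE poll SET question = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (question, poll_id, expected_version),
    )
    if cur.rowcount != 1:
        # Zero rows matched: someone else saved first. Undo and report.
        conn.rollback()
        raise StaleRowError(
            f"poll {poll_id} is no longer at version {expected_version}"
        )
    conn.commit()
    return expected_version + 1


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE poll (id INTEGER PRIMARY KEY, question TEXT, "
        "version INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO poll VALUES (1, 'original', 1)")
    conn.commit()

    # First writer read version 1 and saves successfully.
    new_version = save_poll(conn, 1, "first edit", expected_version=1)

    # Second writer also read version 1, but the row has moved on.
    try:
        save_poll(conn, 1, "second edit", expected_version=1)
        conflict = False
    except StaleRowError:
        conflict = True

    question = conn.execute("SELECT question FROM poll WHERE id = 1").fetchone()[0]
    return new_version, conflict, question
```

The caller has to carry the version it originally read along with the data it wants to save; what to do on a conflict (re-read and merge, or show an error) is left to the application.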
Re: *Occasional* PostgreSQL Error
On Wed, 2008-01-30 at 19:34 -0600, James Bennett wrote: > On Jan 30, 2008 6:01 PM, Mark Green <[EMAIL PROTECTED]> wrote: > > Ahem, there's a huge difference between being confronted with > > a spinner/progress bar or an error page. The former speaks > > "Please wait", the latter speaks "Try again". > > OK, so let's break this down. Yay, thanks for that exhaustive response. :-) I guess we'll eventually have to agree on disagreement but I'll add my counterpoints for completeness. > There are two potential cases where you run up against your database's > concurrent connection limit: > > 1. Your average traffic level involves more concurrent connections > than your database permits. > 2. Your average traffic level is within the number of concurrent > connections your database permits, but you are experiencing a > temporary spike above that level. > > In case (1), the odds of a timeout/retry scheme yielding success > within the time the average user is prepared to wait are low; your > database is simply swamped, and the only options are to increase > available resources (in the form of database connections) or refuse to > serve some number of requests. > > So case (2) is the only one worth discussing as a target for a > timeout/retry scheme. Yes. > In this case, you are (apparently) asking for Django to provide three things: > > 1. A mechanism for specifying a set number of database connections > which Django will persistently maintain across request/response > boundaries. > 2. A mechanism for specifying how long Django should wait, when > attempting to obtain a connection, before timing out and concluding > that no connection will be obtained. > 3. A mechanism for specifying how many times, after failing to obtain > a connection, Django should re-try the attempt. Yes. Basically a bog-standard connection pool. > How, then, would we apply these hypothetical configuration directives > to the situation of a traffic spike? There are two possibilities: > > 1. 
Set these directives in advance and have them be a permanent part > of the running application's configuration. > 2. Avoid setting these directives until a spike is imminent (difficult > to do) or in progress, and leave them only so long as the spike > persists. > > In case (1) you are flat-out wasting resources and complicating > Django's operation, by holding resources which are not used and > mandating more complex logic for obtaining those resources. In nearly > all cases this is a bad trade-off to make. What resources are held and wasted exactly? Maintaining a number of open TCP connections is much cheaper than creating/discarding them at a high rate. I agree that django's CGI-style mode of operation might make implementation tricky (separate thread?) but you can't seriously suggest that creating/discarding n connections per second would be cheaper than maintaining, say, n*10 long-lived connections? Predictability is the keyword here. From the perspective of my database (or pgpool instance) I want to be sure that the configured maximum number of inbound connections can never be exceeded because clients (such as django) should never get into the uncomfortable situation of having to deal with a "connection refused". "Fail-fast" as practiced by django just doesn't work so well on the frontend. Users don't like error pages at all, they're a surefire way to damage your reputation. Yes, slow load times are bad too, but still a much more comfortable position to be in during temporary rush hours. (ever had some marketing droid yell at you in his highest pitch because 10% of their expensive click-throughs went to http-500 hell? ;-) ) > In case (2) the first option really isn't possible, because you can't > reliably predict traffic spikes in advance.
This leaves the second > option, which requires you to be constantly watching the number of > database connections in use and involves shutting down your > application temporarily in order to insert the necessary configuration > directives. It is also unlikely that you will be able to do so before > at least some users have received error pages. Hmm. Dynamic adaptation of the pool config is an interesting subject (java c3p0 can actually do it at runtime, within limits, I think) but totally out of my scope here. I think a fixed pool config would suffice to achieve my goal of "graceful behaviour under load". > So you must either waste resources, or accept increased monitoring > overhead and the inevitability that some requests will not receive > successful responses. > > Add to this the following disadvantages: > > * More complex configuration of Django (and hence more potential for > configuration error). Oh c'mon. A connection pool is not so complicated.
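For readers who have not met one, a "bog-standard" fixed-size pool with an acquire timeout can be sketched in a few lines. This is only an illustration under stated assumptions, not anything Django ships: `FixedPool` and `PoolTimeout` are hypothetical names, and sqlite3 stands in for a real database driver.

```python
import queue
import sqlite3


class PoolTimeout(Exception):
    """No connection became free within the configured wait (hypothetical name)."""


class FixedPool:
    """Holds `size` long-lived connections and hands them out one at a time."""

    def __init__(self, connect, size, timeout):
        self._timeout = timeout
        self._idle = queue.Queue(maxsize=size)
        # Open all connections up front; the pool never grows beyond `size`,
        # so the database never sees more than `size` inbound connections.
        for _ in range(size):
            self._idle.put(connect())

    def acquire(self):
        # Block for up to `timeout` seconds: callers slow down under load
        # instead of failing immediately.
        try:
            return self._idle.get(timeout=self._timeout)
        except queue.Empty:
            raise PoolTimeout(
                f"no free connection after {self._timeout}s"
            ) from None

    def release(self, conn):
        self._idle.put(conn)


def demo():
    pool = FixedPool(lambda: sqlite3.connect(":memory:"), size=2, timeout=0.05)
    a = pool.acquire()
    b = pool.acquire()
    try:
        pool.acquire()  # both connections are checked out
        timed_out = False
    except PoolTimeout:
        timed_out = True
    pool.release(a)
    c = pool.acquire()  # the released connection is reused
    return timed_out, c is a
```

The two knobs (`size` and `timeout`) correspond to the first two mechanisms listed in the quoted message; the timeout is what turns an immediate "connection refused" into a bounded wait.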
Re: *Occasional* PostgreSQL Error
On Wed, 2008-01-30 at 11:03 -0600, James Bennett wrote: > On Jan 30, 2008 8:57 AM, Mark Green <[EMAIL PROTECTED]> wrote: > > Well, "Build for failure". Temporary overload can happen at any > > time and I'd expect django to behave exceptionally badly in that > > case as it is. > > Running out of resources is never a good thing for any system. Obviously. > > Disclaimer: I haven't actually tested this behaviour but I've seen it > > in JDBC apps before we added pooling and don't know why django should > > be different. These apps would basically "chop off" (i.e. return errors > > for) the excess percentile of requests. Naturally the affected users > > would use their "reload"-button and there we have a nice death spiral... > > And if it just slows down you don't think they'll do the same thing? Ahem, there's a huge difference between being confronted with a spinner/progress bar or an error page. The former speaks "Please wait", the latter speaks "Try again". > > Not really. My desire is to make each individual django instance > > play well when things get crowded. Making them aware of each other, > > or even making all database clients aware of each other, sounds > > like an interesting project but is not what I'm after here. > > But in order to know that things are "crowded", each one has to know > what all the others are doing. And any non-Django application using > the same database *also* has to include its own copy of all that > configuration. I guess I still haven't made clear what I mean. I expect *each individual instance* of django to behave gracefully when it can't get through to the database. No group-knowledge is needed for a plain old connection retry or (better) connection pooling. > > Well, there is a point where a single instance of the external > > utility doesn't cut it anymore. The only way to go seems to be > > one pgpool instance per django instance (for performance and > > to avoid the single point of failure). > > Again: you're repeatedly changing the topic from connection pooling to > failover. When you decide you want to talk about one or the other for > more than a few sentences at a time, let me know. Erm. I have not mentioned failover a single time. I'm just trying to point out where your "let pgpool handle it"-strategy seems to fall down. > > Maybe I'm blowing all this out of proportion > > Almost certainly. > > > but I wonder > > if any of the high-traffic, multi-server django sites ever > > ran into it? > > Not really. If you're hitting the max on your DB you have more > immediate problems than whether your users see an error page or an > eternal "Loading..." bar. I'm not talking about maxing out the db constantly. I'm talking about scratching the limits during peak hours which is something that I'm pretty sure almost every bigger site has experienced at least once (cf. "growing pains"). During these peak hours there's a huge difference between users randomly getting an error page and users randomly having to wait a little longer. -mark
Re: *Occasional* PostgreSQL Error
On Tue, 2008-01-29 at 23:33 -0600, James Bennett wrote: > On Jan 29, 2008 11:18 PM, Mark Green <[EMAIL PROTECTED]> wrote: > > I agree on the loadbalancer front but the overhead for all > > those TCP connections (and pgpool managing them) worries me a bit. > > I've used pgpool in production with great success, so I'm not really > sure what overhead you're talking about. > > > Furthermore, and much more serious, I see no way to ensure > > graceful degradation in case of overload. > > And here you completely change the topic of discussion from persistent > pooling of connections to failover when a database reaches its maximum > connection level, so I'm not really sure what it has to do with > anything... Well, "Build for failure". Temporary overload can happen at any time and I'd expect django to behave exceptionally badly in that case as it is. The problem is that under load it will start displaying errors to users instead of just slowing down. Not nice, imho. There is no "buffer" (i.e. timeout, connection retries) between django and the [next hop to the] database. Disclaimer: I haven't actually tested this behaviour but I've seen it in JDBC apps before we added pooling and don't know why django should be different. These apps would basically "chop off" (i.e. return errors for) the excess percentile of requests. Naturally the affected users would use their "reload"-button and there we have a nice death spiral... > > So, long story short, I see no way out of this without > > proper connection pooling built right into django. > > Or am I missing something? > > You're missing the fact that you've switched from asking about pooling > to asking about failover. Hm, I wouldn't say this is about failover. It's about behaving gracefully under load. > Also, your solution would mean that: > > 1.
Django must have its own configuration for the number of > connections it's allowed to use, how long to keep them alive and how > often to retry them in case of failure, and this must be updated if > and when use patterns change. Yup, connection pooling. > 2. Django must have its own configuration for being notified of what > every other client application of the same database is doing, and this > must be updated if and when use patterns change. > > 3. Every other client application of the same database must have > similar dual configuration to know what it's allowed to do and what > everybody else is doing, and these must be updated if and when use > patterns change. Not really. My desire is to make each individual django instance play well when things get crowded. Making them aware of each other, or even making all database clients aware of each other, sounds like an interesting project but is not what I'm after here. > Or you could just use a single external utility to manage database > connections, thus keeping all that essentially infrastructural cruft > out of the application layer while giving you a single place to > configure it and a single place to make changes when you need them. Well, there is a point where a single instance of the external utility doesn't cut it anymore. The only way to go seems to be one pgpool instance per django instance (for performance and to avoid the single point of failure). So there you have your "n configurations" again, only outside of django and without really solving the overload-problem. Maybe I'm blowing all this out of proportion but I wonder if any of the high-traffic, multi-server django sites ever ran into it? -mark
Re: *Occasional* PostgreSQL Error
On Tue, 2008-01-29 at 22:07 -0600, James Bennett wrote: > On Jan 29, 2008 10:04 PM, Mark Green <[EMAIL PROTECTED]> wrote: > > Just curious, what's the state of connection pooling in django? > > My personal opinion is that the application level (e.g., Django) is > the wrong place for connection pooling and for the equivalent "front > end" solution of load balancing your web servers: the less the > application layer has to know about what's in front of and behind it, > the more flexible it will be (since you can make changes without > having to alter your application-layer code). > > So, for example, connection pooling for Postgres would best be handled > by a dedicated pooling connection manager like pgpool; Django can > connect to pgpool as if it's simply a Postgres database, which means > you don't have to go specifying pooling parameters at the application > level. Hm, that doesn't sit so well with me. I agree on the loadbalancer front but the overhead for all those TCP connections (and pgpool managing them) worries me a bit. Furthermore, and much more serious, I see no way to ensure graceful degradation in case of overload. Let's assume we run a local pgpool instance along with django on each machine. Django goes through the local pgpool for database access. Now what happens when the database becomes too slow to keep up with requests for any reason? I see two options: a) pgpool is configured without a limit on inbound connections; the hanging connections between django and pgpool will eventually exhaust the total number of allowed tcp-connections for the django-user or even systemwide. django will not be able to open new database connections and display nasty error pages to the users. Worse yet, if django and webserver are running under the same uid then the webserver will likely no longer be able to accept new inbound connections and the users get funny error messages from their browsers.
b) pgpool is configured with a limit on inbound connections; pgpool will hit the limit and refuse subsequent attempts from django, which in turn displays nasty error pages to users. In order to achieve the desired behaviour of django slowing down gracefully instead of spitting error pages I think we'd have to teach django to retry database connections. But this would open a whole new can of worms, such as risking duplicated requests when users hit reload, etc... So, long story short, I see no way out of this without proper connection pooling built right into django. Or am I missing something? -mark
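The "retry database connections" idea can be sketched generically. This is a hedged illustration only, not Django behaviour: `connect_with_retry` is a hypothetical helper, `connect` is any zero-argument callable that opens a connection, and the linear backoff is an arbitrary choice. It retries only the connect step, before any statement has been sent, so nothing is replayed against the database.

```python
import time


def connect_with_retry(connect, attempts=3, delay=0.1, sleep=time.sleep):
    """Call `connect()` until it succeeds or `attempts` tries are used up."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except ConnectionError as exc:
            last_error = exc
            if attempt < attempts:
                # Wait a little longer after each refusal (linear backoff).
                sleep(delay * attempt)
    # Every attempt was refused: surface the last error to the caller.
    raise last_error


def demo():
    calls = []

    def flaky_connect():
        # Simulates a database that refuses the first two attempts.
        calls.append(1)
        if len(calls) < 3:
            raise ConnectionRefusedError("too many connections")
        return "connection"

    result = connect_with_retry(flaky_connect, attempts=5, delay=0)
    return result, len(calls)
```

The total wait is bounded by `attempts` and `delay`, so a request slows down for a while and then still fails cleanly if the database stays unreachable.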
Re: *Occasional* PostgreSQL Error
On Fri, 2008-01-25 at 15:14 -0800, Jacob Kaplan-Moss wrote: > Hi Doug -- > > On 1/24/08, Doug Van Horn <[EMAIL PROTECTED]> wrote: > > OperationalError: could not connect to server: No such file or > > directory > > Is the server running locally and accepting > > connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"? > > This means that, for some reason, a connection to the database > couldn't be established. You'll get this error if the database isn't > running, but since you only get it under load I'd guess that you're > hitting PostgreSQL's max connection limit (see the max_connections > setting in postgresql.conf). You can tell for sure by checking your > Postgres log; it'll have a message about reaching the connection > limit. Just curious, what's the state of connection pooling in django? And what does the user see when such an error occurs, I guess an error 500 message? -mark
hierarchical constraints (country/city) and form validation?
Hello djangoics, I'd like to ask for the opinion of some django veterans on a task that I imagine to be a fairly common one. My site allows the user to set a country and city as part of their UserProfile. Obviously the city-field should only allow values that are valid within the context of the selected country ("France / Chicago" doesn't work). I have hacked up a fairly naive implementation that involves two CharFields, country and city, where the former refers to a lengthy list of "choices" and the latter just relies on all forms to be equipped with a custom constructor (to populate the options) and validator (to deny invalid combinations). In the templates I'm using the usual bit of AJAX trickery to populate the "city" select-widget according to and after the country has been selected. Now this works ok but it's a lot of ugly boilerplate code and I'm convinced there must be a much more elegant way to solve this all at the model level. I have the country/city relation in the database already, so it's just a matter of instrumenting it properly within django. Can someone set me on the right track? :) -mark
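Independent of how it is wired into Django, the rule being asked for is small: the set of valid cities is a function of the country, and a pair is valid only if the city belongs to that set. A plain-Python sketch of that rule follows; the mapping, `city_choices`, `validate_location` and `InvalidLocation` are all hypothetical names, and in the poster's setup the mapping would come from the existing country/city relation in the database rather than a literal dict.

```python
# Illustrative data only; a real site would load this from the database.
CITIES_BY_COUNTRY = {
    "France": {"Paris", "Lyon"},
    "United States": {"Chicago", "Boston"},
}


class InvalidLocation(ValueError):
    """The city does not belong to the selected country (hypothetical name)."""


def city_choices(country):
    # What the "city" select widget should offer once a country is chosen.
    return sorted(CITIES_BY_COUNTRY.get(country, ()))


def validate_location(country, city):
    # One shared check that every form (or the model) can call.
    if country not in CITIES_BY_COUNTRY:
        raise InvalidLocation(f"unknown country: {country}")
    if city not in CITIES_BY_COUNTRY[country]:
        raise InvalidLocation(f"{city} is not a city in {country}")
    return country, city
```

Keeping both the option list and the validator in one place is what removes the per-form boilerplate; the AJAX endpoint and each form's validation can both call the same two functions.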
Re: Passing the max_length from the Model into the Form
On Fri, 2007-11-23 at 14:59 -0800, leotr wrote:
> Yes, not only it's dirty but it's ineffective as well! And your idea violates DRY.
> Declare a constant somewhere and use it in both model and form.

And where would that "somewhere" be? Custom constants on the model? I see no point in adding indirection here. The model definition itself is supposed to be canonical (dynamic model tricks aside). No, I think Wolfram's approach is conceptually correct. If it looks dirty then that's a matter of API cosmetics. If it's inefficient then that's a matter of optimization. Rule of thumb: If you want information about the model then ask the model.

> max_link_url_len = 80
>
> class Link(models.Model):
>     url = bla_bla_field(max_length = max_link_url_len)
>
> class myForm(forms.Form):
>     url = forms.URLField( max_length=max_link_url_len)
>
> :)
Re: image/file uploads, custom filenames, security
On Tue, 2007-10-16 at 10:18 -0400, Marty Alchin wrote: > I've done some work on FileField lately that address some of your concerns. > > On 10/16/07, Mark Green <[EMAIL PROTECTED]> wrote: > > * does django properly sanitize the filename or rather, use > > safe temp files? i wonder what would happen if i tried to > > upload a file called "../../traverse.txt"? > > I haven't done any testing on that particular situation, so I can't > speak to that one. well, i guess i'll give it a shot and report to the list if there are problems. > > * how can i enforce a filename on the uploaded file? > > i want to completely ignore the remote name of the file > > and instead store it as, for example, {{username}}.jpg > > There's a ticket[1] in Trac to revamp the way file storage is defined, > which would allow you to override some of how Django selects a > filename. Currently, it won't allow you to use the username, or any > other details of the model the image is attached to, but that's > becoming a common request, so I'll see about adding it before it hits > trunk. interesting! i can only second that common request. ;) any idea when it will be done? > > * anyone know if the PIL stuff is hardened against image bombs? > > (small images that expand to gigabytes when expanded to bitmap) > > would it be feasible to subclass ImageFile and replace the PIL > > calls with some paranoid homegrown stuff (i.e. ImageMagick), > > anyone know a starting point for this? > > The ticket I mentioned above also makes it much easier to subclass > FileField and ImageField to add or change whatever functionality you > like. I don't know whether PIL already does what you need, but if > you're paranoid, this patch should help you out. awesome. i know it's probably a fairly exotic request but since my site deals heavily with images i can imagine some customization might pay off (security- or performancewise). thanks for the info! 
-mark
image/file uploads, custom filenames, security
hi all,

i've been playing with ImageField and FileField recently and so far they work like a charm. some questions remain, though:

* does django properly sanitize the filename or rather, use safe temp files? i wonder what would happen if i tried to upload a file called "../../traverse.txt"?
* how can i enforce a filename on the uploaded file? i want to completely ignore the remote name of the file and instead store it as, for example, {{username}}.jpg
* anyone know if the PIL stuff is hardened against image bombs? (small images that expand to gigabytes when expanded to bitmap) would it be feasible to subclass ImageFile and replace the PIL calls with some paranoid homegrown stuff (i.e. ImageMagick), anyone know a starting point for this?

-mark
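The first two questions can be illustrated without any framework: ignore the client-supplied name except for its extension, and build the stored name from data the server controls. The sketch below is an assumption-laden illustration, not what Django does internally; `stored_name` and `ALLOWED_EXTENSIONS` are hypothetical names, and the allowed-extension list is arbitrary.

```python
import os
import re

# Illustrative whitelist; adjust to whatever the site accepts.
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif"}


def stored_name(username, client_filename):
    """Return "<username><ext>", using only the extension of the upload."""
    # Normalise Windows separators, then drop every directory component,
    # so "../../traverse.jpg" is reduced to "traverse.jpg".
    base = os.path.basename(client_filename.replace("\\", "/"))
    ext = os.path.splitext(base)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"extension not allowed: {ext!r}")
    # The username is also user-controlled data: keep a conservative subset.
    safe_user = re.sub(r"[^A-Za-z0-9_-]", "_", username)
    if not safe_user:
        raise ValueError("empty username")
    return safe_user + ext
```

For example, `stored_name("mark", "../../traverse.jpg")` yields `"mark.jpg"`, while the `"../../traverse.txt"` upload from the question is rejected because of its extension. This says nothing about the image-bomb question, which needs checks on the decoded image size rather than on the name.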
choices/, how to get rid of the dashes?
Hi all, sorry for repeating my question but I haven't gotten a solution on first attempt and this really nags me. This is my model:

class Person(models.Model):
    GENDER_CHOICES = (
        ('m', 'Male'),
        ('f', 'Female'),
    )
    # The verbose name is positional, so it has to come before the keyword arguments.
    gender = models.CharField("gender", maxlength=1, choices=GENDER_CHOICES, default='m', blank=False, null=False)

Using form_for_model() on the above model results in HTML like this:

- Male Female

How do I get rid of the first option (the dashes)? Creating my form with initial={ 'gender':'m' } at least sets the default option correctly (default='m' on the model is happily ignored btw, why?) but the dashes are still there. It obviously doesn't make sense to offer an option to the user that they aren't allowed to select, so can anyone confirm that this is a bug or is my code just acting weird? -mark
Re: Should Django have a road map?
On Mon, 2007-10-01 at 17:07 +0200, Stefan Matthias Aust wrote: > Joe, > > 2007/10/1, Joe <[EMAIL PROTECTED]>: > > [...] > > And this is the biggest disconnect between Django's team and the > > business world. If I went to my bosses and told them "It's done when > > it's done" about our upcoming product releases, I would get fired. > > Your response should be, "It's really hard to estimate, but here is my > > best guess and a target for us to shoot for." And, you know what? > > Most of the time our estimates are pretty close. And by tracking how > > we do on our estimates, we can make them even more accurate. > > [...] > > You explained my points much better than I could (in plain English at least ;) Throwing in my 2 cents here... I basically agree with all that has been said, as in: it would be great to have a stable, predictable timeline - but it's unrealistic to achieve that without at least one full-time person to do the housekeeping. Since that full-time person doesn't exist I'd like to propose something in between that I have seen implemented successfully even in very small projects: 0. Realize that creating a roadmap takes zero time and only very little effort. You get it for free, from your ticket-system. 1. Get sorted. Take advantage of the trac milestones and, more importantly, ticket relations (does trac have them nowadays?). The future will become *much* clearer once you have added hierarchy to the ticket swarm. It forces you to decide which things to do first (Milestone 1) and which to delay for a later date. At the same time a roadmap emerges naturally because the "almost-done things" bubble up and become more visible. 2. Don't bother with actual calendar dates. An occasional rough estimate "could be ready at" never hurts but "70% done, ticket-wise" gives a much better indication of progress anyways.
Better yet, instead of picking randomly, people can then specifically choose to work on tickets that are relevant to the next milestone or to the particular feature that they're after. 3. Maybe investigate a better ticket system. The trac ticketing sucks very hard in all regards and is beaten hands down by http://mantisbt.org or http://redmine.org. I hate to say it but even the dreaded JIRA does it better. Well, long story short, you want custom workflow/ticket states, so your tickets can't be "ready for checkin" and "new" at the same time. You want a clean UI and a working search that doesn't hurt every time you use it. You want to draw clear parent/child relationships all the way up to the milestones. Of course a new ticket-system is not a must but having fought my own share of uphill battles against trac I can tell from experience that many trac-users don't realize that they're missing a fair share of essential features. In summary, I think this whole discussion should really be about transparency, not about fixed dates. Programming schedules don't work with fixed dates. "It's done when it's done" is not an excuse, it's an honest summary of the situation. So all we need to do is add more transparency to the state of affairs and that's simply a matter of using better tools or using the existing tools better. I don't think the call for a roadmap would have appeared if somewhere on the django site it read like: ---snip--- Milestone 7 (16%, 40 of 240 tickets open, aggregate eta: 01-Apr-2009) - Sub-Goal 1: old admin to new admin (66%, 14 of 21 tickets open, eta: 01-12-2007) - Sub-Goal 2: enforce winnie-pooh.css for all sites (0%, 1 of 1 tickets open) - etc. ---snip--- Of course the ETA doesn't really mean anything (just some numbers magic that a good ticket system can do) but the above view is imho the closest to a "roadmap" that an OSS project can get. And, believe it or not, towards the end of a Sub-Goal those figures often converge to surprisingly accurate estimates.
Last but not least, don't underestimate the motivational factor of more transparency. People like the feeling of actually causing an impact on something (the fame-factor). With the current trac there's just an anonymous sea of tickets, not really inviting to give or take yet another drop. A little more structure and clustering would provide much better positive feedback to the contributor here. It just *feels* better to close ticket 6 of 10 on a Sub-Goal, pushing it from 50% to 60%, than to close one random ticket out of hundreds without knowing what other tickets may be interacting with or even blocking the issue at hand. Well, this is getting too long but I hope some of my ideas don't sound too far-fetched to some of you. :) -mark
Re: how to scale (was: how to do something at startup)
On Sun, 2007-09-30 at 20:37 -0500, James Bennett wrote: > On 9/30/07, Mark Green <[EMAIL PROTECTED]> wrote: > > I'm not sure what drove me to call it "fragment caching". > > What I really meant to point at are the little things (such as > > form_for_model()) that would likely benefit from some object > > caching instead of burning cycles for each request. > > You can do this, by the way, and in fact quite a few people do it > accidentally when taking their first steps with Django: if you > instantiate an object at the module level, it'll remain resident in > memory and will be re-used instead of re-instantiated on every > request. > > The canonical example is people who evaluate a QuerySet in their > URLConf module, and then are surprised at how it seems to be "cached" > forever ;) Interesting! Time to review my little custom thread. Think I jumped through some very unnecessary hoops...
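The module-level behaviour James describes is plain Python rather than anything Django-specific, and can be seen in a few lines. This is a small sketch with hypothetical names (`ExpensiveForm`, `SHARED_FORM`, `view_shared`, `view_fresh`); the counter only exists to make the number of constructions visible.

```python
import itertools

# Counts how many times the "expensive" object has been built.
_build_counter = itertools.count(1)


class ExpensiveForm:
    """Stand-in for something costly to construct, e.g. a generated form class."""

    def __init__(self):
        self.build_number = next(_build_counter)


# Module level: this line runs once, when the module is first imported,
# and the object then lives for as long as the process does.
SHARED_FORM = ExpensiveForm()


def view_shared():
    # Every "request" reuses the same instance.
    return SHARED_FORM.build_number


def view_fresh():
    # Every "request" pays for a new instance.
    return ExpensiveForm().build_number
```

The flip side is the QuerySet surprise mentioned in the quoted message: whatever is computed at import time stays frozen until the process restarts, and each server process holds its own copy.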
Re: how to scale (was: how to do something at startup)
On Sun, 2007-09-30 at 20:29 -0500, James Bennett wrote:
> On 9/30/07, Mark Green <[EMAIL PROTECTED]> wrote:
> > My question was really only about the former, a much simpler problem:
> > How to keep a tcp connection persistent and re-use it across requests?
>
> By using a pooling connection manager external to Django. Again,
> complicating the application layer with too many details of the other
> layers in the stack seems -- to me, at least -- like a premature
> optimization that costs flexibility in the long run.
>
> > While this overhead may be constant in most (not all!) scenarios
> > it's still a waste of resources that doesn't sit well with me.
>
> You say "waste", I say "trade-off" ;)
>
> And that's what web development is, really: a series of trades. The
> ability to "hot swap" front-end or back-end nodes by using pooling
> and load balancing external to Django is -- again, to me -- worth the
> trade of a slight increase in overhead, because it means you can bring
> additional nodes into or out of the pool without having to reconfigure
> the application layer.

I fully agree with your picture of the trade-off, but I think
connection pooling neither contradicts hot-swapping nor introduces
any kind of application layer configuration. What it does introduce
is additional code complexity, so imho the argument should be about
whether it's worth that or not.

In my opinion it should be, simply because if the protocol was meant
to be used the way django uses it, it would probably be using UDP
transport instead of TCP. The overhead of creating a new TCP
connection, potentially for each request, should imho not be
underestimated. Just try it on a host receiving 50 hits/sec and up...

> > I do understand (and endorse very much) that django is a shared nothing
> > architecture but imho that doesn't imply "zero internal persistence
> > across requests".
>
> Keep in mind also that Django deliberately runs a bit closer to the
> bare HTTP than some of the heavyweight frameworks and that HTTP -- by
> design -- is utterly stateless. Again, it's a trade: inherently
> stateless architectures are tremendously easy to scale across virtual
> or physical machines, and I'd argue that's worth the use of external
> persistence mechanisms when that sort of thing is needed.

I'm not arguing that at all. I don't question the decision not to
internalize certain things, I only propose to make the interface
to these external things as efficient as possible. There are a few
good reasons why connection pooling towards the database is common
practice nowadays. I think one is that juggling an intense flux of
tcp connections is quite expensive on some architectures and/or
databases; another would be the scenario I mentioned: better control
in overload situations.

> > Further problems arise when you need to integrate with a remote peer
> > that simply depends on persistent connections. My current candidate is
> > the spread toolkit (http://www.spread.org) but it's certainly not the
> > only piece of "environmental software" working that way.
>
> There have been a couple people lately arguing about cases where
> Django isn't an appropriate solution, and so far I haven't really
> agreed with the examples put forth. But in this thread I think you've
> hit a genuine use case where Django probably isn't what you want: if
> you need high-performance networking with external services, I'd
> highly recommend Twisted[1] as the best Python option I'm aware of.

Yes and no. Yes, I agree django isn't for everything, and no, I don't
really think we're trying to abuse it to that extent.

To elaborate a bit, spread is a messaging framework, like activemq
in the java world, only less broken. ;)
While it can indeed serve as the backbone for grid-style applications,
we're only using it for light internode messaging and a small
"common tuplespace", to realize, for example:

- Synchronized list of logged in users across all nodes
- AJAX Chat
- Announcement of asynchronous events (e.g. backend processing in a
  non-django process finishes) to the user

As you can see, our webapp is at heart really a webapp; we're not
trying to shoehorn django into being a computing cluster or bittorrent
client. Messaging, as opposed to, say, polling a memcached or even the
database, means very real performance advantages for us.

So, I stand by my point; I think it would be nice if django spawned
a set of worker threads on startup, used them for ORM connection
pooling and offered a small API for the developer to take advantage
of them. I'll also happily contribute my little custom-thread.py for
inclusion if that helps, but I somehow doubt the django gurus couldn't
do better in less time. ;)

-mark
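To make the "bog-standard connection pool" idea concrete, here is a minimal sketch in plain Python. It is not Django code and not a proposal for its API: `ConnectionPool`, `PoolTimeout` and the `connect` factory are hypothetical names, and a real pool would also need health checks and reconnects. It only shows the two properties argued for above: connections are re-used across requests, and a caller waits a bounded time under overload instead of piling up new connection attempts.

```python
import queue


class PoolTimeout(Exception):
    """Raised when no connection becomes free within the timeout."""


class ConnectionPool(object):
    """Tiny fixed-size pool; `connect` is any zero-argument factory (hypothetical)."""

    def __init__(self, connect, size):
        # queue.Queue is thread-safe, so worker threads can share the pool.
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())  # open all connections up front

    def acquire(self, timeout=1.0):
        """Borrow a connection, waiting at most `timeout` seconds."""
        try:
            return self._idle.get(timeout=timeout)
        except queue.Empty:
            raise PoolTimeout("no free connection after %.2fs" % timeout)

    def release(self, conn):
        """Hand a borrowed connection back for re-use."""
        self._idle.put(conn)
```

A request handler would call `acquire()` at the start and `release()` in a `finally` block, so the number of open connections never exceeds `size` no matter how many requests arrive.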
Re: how to scale (was: how to do something at startup)
On Sun, 2007-09-30 at 16:16 -0500, James Bennett wrote:
> On 9/30/07, Mark Green <[EMAIL PROTECTED]> wrote:
> > Hm, this raises some serious scalability questions for me.
> > From your description it sounds like there is no template
> > fragment caching, not even db connection pooling possible
> > with django?
>
> You can cache anything you want to cache; read the caching
> documentation (the whole thing) before jumping to conclusions about
> that. At work we use a custom template Node class which caches its
> output, for example.

Sorry, I was indeed jumping too quickly on the caching issue, or
rather wording my concerns poorly.

I'm not sure what drove me to call it "fragment caching".
What I really meant to point at are the little things (such as
form_for_model()) that would likely benefit from some object
caching instead of burning cycles for each request.

I do admit though that this may be scratching the realm of
micro-optimization and I realize I shouldn't have brought it up
without at least measuring it first.

Let's just skip this point for now (my bad, sorry again) and instead
focus on the (imho) more glaring issue of "no persistent connections",
see below.

> As for database clustering, there's a philosophical issue here: Django
> shouldn't need to know whether there's one database server behind it,
> or five, or a hundred. We've had success using pgpool, for example,
> which -- from Django's point of view -- looks the same as any
> PostgreSQL database, but in reality is pooling connections and
> supports multiple actual databases running behind it.
>
> Think of it the same way you'd do load-balancing in front of your
> application: just as users shouldn't need to know that you have, say,
> ten web nodes running Django, and just as they shouldn't have to stop
> and ask, "which one of the site's web nodes do I want to request a
> page from?", Django shouldn't need to know how many database nodes you
> have, or which one it should talk to on each query. The less the
> various layers of your stack have to know about each other, the easier
> it'll be to make changes.
>
> I'd suggest reading the deployment chapter of the Django book for more
> details:
>
> http://www.djangobook.com/en/beta/chapter21/
>
> > And what about integration with a messaging framework
> > (spread or somesuch) for efficient cluster communications?
>
> So long as there's an interface you can talk to from Python, or over
> standard networking protocols, what's the holdup? Django does not have
> "out of the box" support for interoperating with every single
> component someone might want to use, but then neither would an
> "enterprise" Java framework; that's why you have programmers ;)

First off, thanks for all the insight.

Unfortunately I think you misread my "db connection pooling" as
"db clustering". My question was really only about the former, a much
simpler problem: How to keep a tcp connection persistent and re-use
it across requests?

Creating and discarding tcp connections at a high rate imposes a
measurable overhead for both the initiator (django) and the receiving
end (e.g. RDBMS or even a pgpool on localhost).

While this overhead may be constant in most (not all!) scenarios
it's still a waste of resources that doesn't sit well with me.
In particular, if and when the receiving end slows down under load,
the last thing you want is incoming connection attempts to pile up.

I do understand (and endorse very much) that django is a shared nothing
architecture but imho that doesn't imply "zero internal persistence
across requests".

Further problems arise when you need to integrate with a remote peer
that simply depends on persistent connections. My current candidate is
the spread toolkit (http://www.spread.org) but it's certainly not the
only piece of "environmental software" working that way.

I'm currently approaching the problem by spawning a custom thread on
first request (thus my inquiry about "how to do something at startup"),
but I think django would benefit from providing standard infrastructure
for that - which comes for free when proper connection pooling for the
ORM is implemented.

-mark

PS: Sorry if django actually *does* proper pooling already and I'm
beating a dead horse here. My assumption that it doesn't comes from
the fact that it doesn't seem to pull up a persistent thread and
because my grep for "pool" over the svn sources didn't hit anything.
If murder is the case you can just ignore my whole ranting...
Re: how to do something at startup
On Fri, 2007-09-28 at 22:34 -0600, staff-gmail wrote:
> James Bennett wrote:
> > On 9/28/07, Mark Green <[EMAIL PROTECTED]> wrote:
> >
> >> i'm looking for a way to perform a bunch of initialization tasks
> >> right after django startup.
> >>
> >
> > There really is no such thing as "Django startup"; remember that
> > Django is hosted inside a web server, and that server processes will
> > come and go over time with no real concept of anything persisting
> > beyond the life of a process, unless you serialize out to an external
> > store (such as your database, or a file, or memcached). And then
> > you'll want to be very careful in how you "initialize", because that's
> > probably going to happen every time a server process is started, and
> > you'll need to take care that you're not unnecessarily regenerating or
> > recalculating something when you could load it from something
> > external.
> >
>
> Not sure what you are trying to initialize, but you could call an
> initialization method from your initial view method. If it is like a
> new game or new project or new whatever - I have the user click on a
> start button which then calls a function (in the project or game or
> whatever model) that retrieves or resets a bunch of variables and
> creates a new object from that model. Likewise I have some initialize
> functions in models that create new "seed" data for a new account and
> are called and executed from the on-submit method that is specified in
> the login form. Is that what you are trying to do?

Thanks for the idea but my needs are a bit more low-level I think.

In particular I need to start up (and maintain over the lifetime of
the django instance) a tcp connection to our messaging broker for
later use from views-code.

-mark
Re: how to scale (was: how to do something at startup)
On Fri, 2007-09-28 at 22:29 -0500, James Bennett wrote:
> On 9/28/07, Mark Green <[EMAIL PROTECTED]> wrote:
> > i'm looking for a way to perform a bunch of initialization tasks
> > right after django startup.
>
> There really is no such thing as "Django startup"; remember that
> Django is hosted inside a web server, and that server processes will
> come and go over time with no real concept of anything persisting
> beyond the life of a process, unless you serialize out to an external
> store (such as your database, or a file, or memcached). And then
> you'll want to be very careful in how you "initialize", because that's
> probably going to happen every time a server process is started, and
> you'll need to take care that you're not unnecessarily regenerating or
> recalculating something when you could load it from something
> external.

Oops! Guess I missed that concept because I've only been playing
with the development server so far.

Hm, this raises some serious scalability questions for me.
From your description it sounds like there is no template
fragment caching, not even db connection pooling possible
with django?

And what about integration with a messaging framework
(spread or somesuch) for efficient cluster communications?

These all seem to be basic requirements for scalability and
integration with existing infrastructure.

Any thoughts on that?

-mark
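James's advice to load from something external rather than recalculate in every new server process can be sketched as follows. This is a plain-Python illustration with a JSON file as the external store; `load_or_build` and its arguments are hypothetical names, and a database or memcached would serve the same role.

```python
import json
import os


def load_or_build(path, build):
    """Return data stored at `path`; call `build()` only if the file is missing."""
    if os.path.exists(path):
        with open(path) as fh:
            return json.load(fh)
    data = build()
    # Write to a temp file first so other processes never see a partial file.
    tmp_path = "%s.%d.tmp" % (path, os.getpid())
    with open(tmp_path, "w") as fh:
        json.dump(data, fh)
    os.replace(tmp_path, path)
    return data
```

Each process that starts up calls `load_or_build()`; only the first one pays for `build()`, later ones just read the file.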
how to do something at startup
hi all,

i'm looking for a way to perform a bunch of initialization tasks
right after django startup.

where would i put such things and how/when are they called?

-mark
Re: Two newform best practice questions
Hi Joseph,

I say thanks for the pointer, too.

A quick question (since you seem to be involved with this): is there
any reason to have django not prefix the form fields by default with,
say, the model-name (so prefix='' or prefix='somethingelse' can still
be used if someone doesn't want it that way)?

I would definitely vote for making it the default unless someone can
come up with potential problems it may cause. There'd be just one
thing less to worry about if it was that way...

And if there are worries about backwards compatibility I'd propose
a toggle-option in settings.py defaulting to False.

-mark

On Wed, 2007-09-26 at 01:29 -0500, Joseph Kocherhans wrote:
> On 9/26/07, Przemek Gawronski <[EMAIL PROTECTED]> wrote:
> >
> > I'm using several forms (newforms) to build one html form. One thing to
> > watch out for is common field names in your django form classes. So if
> > you have two django forms and they both have a field 'date' for example,
> > then handling in your view method in request.POST there will be only one
> > key 'date' that will go in to both django forms via
> > f1=Form1(request.POST) and f2=Form2(request.POST). I'm still thinking on
> > how to solve this in a elegant way instead of changing the field names
> > in django forms to say date1 and date2.
>
> Try this:
>
>    f1 = Form1(request.POST, prefix='form1')
>    f2 = Form2(request.POST, prefix='form2')
>
> That should solve your problem. The prefix argument is prepended to
> every field name in the html, thus preventing name clashes between
> forms. That should be in the 0.96 release. No docs on it yet though.
> Sorry :(
>
> Here's the changeset that added it, and the tests there should show
> you some examples if needed
> http://code.djangoproject.com/changeset/4194
>
> Joseph
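Why a prefix removes the clash between two forms that share a field name can be shown with plain dictionaries. This is only an illustration, not newforms code: `add_prefix` and `extract` are hypothetical helpers, and the `prefix-fieldname` naming scheme is an assumption about how the HTML names are built.

```python
def add_prefix(prefix, field_name):
    """Build the HTML field name; the 'prefix-name' scheme is an assumption here."""
    return "%s-%s" % (prefix, field_name) if prefix else field_name


def extract(post, field_names, prefix=None):
    """Pick one form's values out of a shared POST dictionary."""
    return {name: post.get(add_prefix(prefix, name)) for name in field_names}


# Two forms that both define a 'date' field, submitted in one HTML form.
post = {"form1-date": "2007-09-26", "form2-date": "2007-10-01"}
form1_data = extract(post, ["date"], prefix="form1")
form2_data = extract(post, ["date"], prefix="form2")
```

Without prefixes both forms would read the same `date` key; with them, each form only sees its own submitted value.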
Re: how to implement "stay logged on this computer until i log out"?
Hi Joe,

thx for the pointer, looks workable.
Will try it soon.

-moe

On Sun, 2007-09-23 at 23:08 -0700, Joseph Heck wrote:
> You can certainly access cookies manually and do with them as you
> like. That's how we've implemented a "remember who it was that last
> logged in from this computer" kind of feature.
>
> There's a set_cookie() method on the request object that can do this
> work for you - but it's not thoroughly documented. I'd recommend
> looking at the code directly and maybe checking out this thread:
>
> http://groups.google.com/group/django-users/browse_thread/thread/7b65ff5783f71b9c/a4079aa60e7dfa37
>
> -joe
>
> On 9/23/07, Mark Green <[EMAIL PROTECTED]> wrote:
> >
> > Hi list,
> >
> > I would like to have sessions normally timeout after
> > 8 hours, that is easily achieved by setting
> > SESSION_COOKIE_AGE in settings.py.
> >
> > But additionally I'd like to provide a checkbox to "stay logged
> > in on this computer until i log out" which shall make the
> > session immortal (remove expiry).
> >
> > Is there anything in the API to implement that or can I get
> > at the session-cookie meat to do it manually?
> >
> > I basically need to override the default from
> > SESSION_COOKIE_AGE for individual sessions.
> >
> >
> > -mark
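What "doing it manually" amounts to at the HTTP level is choosing a different cookie lifetime per login. The sketch below uses only the standard library and is not Django's session code; `session_cookie_header` is a hypothetical helper, and `sessionid` is used because it is Django's default session cookie name. In Django itself the cookie is written through `set_cookie()` on the response object (`HttpResponse`), and the server-side session record would need a matching expiry, which this sketch does not cover.

```python
from http.cookies import SimpleCookie

EIGHT_HOURS = 8 * 60 * 60
TEN_YEARS = 10 * 365 * 24 * 60 * 60  # "until I log out", in practice


def session_cookie_header(session_key, stay_logged_in):
    """Build a Set-Cookie header value whose lifetime depends on the checkbox."""
    cookie = SimpleCookie()
    cookie["sessionid"] = session_key
    cookie["sessionid"]["path"] = "/"
    cookie["sessionid"]["max-age"] = TEN_YEARS if stay_logged_in else EIGHT_HOURS
    return cookie["sessionid"].OutputString()
```

A truly "immortal" cookie does not exist, so a very long `max-age` is the usual approximation; logging out then simply deletes the cookie and the session.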
how to implement "stay logged on this computer until i log out"?
Hi list,

I would like to have sessions normally timeout after
8 hours, that is easily achieved by setting
SESSION_COOKIE_AGE in settings.py.

But additionally I'd like to provide a checkbox to "stay logged
in on this computer until i log out" which shall make the
session immortal (remove expiry).

Is there anything in the API to implement that or can I get
at the session-cookie meat to do it manually?

I basically need to override the default from
SESSION_COOKIE_AGE for individual sessions.

-mark
Re: choices/ and getting rid of the dashes?
Thanks for the pointer.

This does indeed change the default value to 'Male' but the
select-box still offers the dashes...

I want to eliminate the dashes-option completely; why give
the user a choice that will never be accepted?

-mark

On Mon, 2007-09-10 at 06:47 +, Ryan wrote:
> Use initial when calling your form class.
>
> formClass = forms.form_for_model(Person)
> form = formClass(initial={'gender': 'm'})
>
> On Sep 6, 4:31 pm, Mark Green <[EMAIL PROTECTED]> wrote:
> > Hi all,
> >
> > This is my model:
> >
> > class Person(models.Model):
> >     GENDER_CHOICES = (
> >         ('m', 'Male'),
> >         ('f', 'Female'),
> >     )
> >     gender = models.CharField("gender", blank=False, maxlength=1,
> >                               choices=GENDER_CHOICES, default='m')
> >
> > Using form_for_model() on the above model results in HTML like this:
> >
> > <select>
> > <option>---------</option>
> > <option>Male</option>
> > <option>Female</option>
> > </select>
> >
> > How do I get rid of the first option (the dashes)?
> > I would prefer to have the <select>-widget default to my default-value.
> >
> > Any help appreciated!
> >
> > -mark
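One way to think about removing the dashes is to filter the empty-valued entry out of the field's choices before the select box is rendered. The sketch below is plain Python; `without_blank` is a hypothetical helper, and where to hook it in (for example by assigning the result to the form field's `choices` after the form class is built) depends on the Django version and is an assumption here.

```python
GENDER_CHOICES = (
    ("m", "Male"),
    ("f", "Female"),
)

# The dashes entry as Django adds it for choice fields: empty value, nine dashes.
BLANK_CHOICE = ("", "---------")


def without_blank(choices):
    """Return the choices with any empty-valued entry removed."""
    return [(value, label) for value, label in choices if value != ""]


# What the select box is offered by default, and what it should offer instead.
rendered_choices = [BLANK_CHOICE] + list(GENDER_CHOICES)
clean_choices = without_blank(rendered_choices)
```

With only valid entries left, the browser pre-selects the first option unless an initial value says otherwise, so combining this with `initial={'gender': 'm'}` gives the desired default.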
Re: Form field deletion
On Fri, Sep 07, 2007 at 03:24:46PM -0500, jake elliott wrote:
>
> Oleg Korsak wrote:
> >> Sure - use the 'fields' option to specify the subset of model fields
> >> you want to use on the form.
> >>
> > form_for_instance() got an unexpected keyword argument 'fields'
> >
>
> this argument to form_for_instance() and form_for_model() is only
> available in the SVN version of django.
>
> one quick way to get this behavior without that argument is to set the
> 'user' field 'editable=False'
>
> user = models.ForeignKey(User, editable=False)
>
> remember this will affect the change form in contrib.admin also, which
> may or may not be desirable for your project.
>
> -jake

i have a related question, is there a way to change the order
of the fields, too?

my form adds a second password field like this:

    class JoinForm(forms.BaseForm):
        def __init__(self, *args, **kwargs):
            self.base_fields['pw2'] = forms.CharField(
                label = 'again', min_length = 3 )
            self.base_fields['pw2'].widget = forms.PasswordInput()

unfortunately this new field always appears at the end of my
form and fields=(..) doesn't affect the order.

-mark
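If the form keeps its fields in an insertion-ordered mapping (which newforms appears to do, though that is an assumption here), moving `pw2` next to the first password field comes down to rebuilding that mapping in the wanted order. A sketch with the standard library's `OrderedDict` standing in for the form's field container; `reorder` and the field names are hypothetical.

```python
from collections import OrderedDict


def reorder(fields, order):
    """Return a new ordered mapping with `order` first, remaining fields after."""
    result = OrderedDict((name, fields[name]) for name in order)
    for name, field in fields.items():
        result.setdefault(name, field)  # keep fields not named in `order`
    return result


# Hypothetical field names; the values would be form field objects.
fields = OrderedDict(
    [("username", "f1"), ("pw", "f2"), ("email", "f3"), ("pw2", "f4")]
)
fields = reorder(fields, ["username", "pw", "pw2"])
```

Fields left out of `order` keep their relative position at the end, so the helper never silently drops a field.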
choices/ and getting rid of the dashes?
Hi all,

This is my model:

    class Person(models.Model):
        GENDER_CHOICES = (
            ('m', 'Male'),
            ('f', 'Female'),
        )
        gender = models.CharField("gender", blank=False, maxlength=1,
                                  choices=GENDER_CHOICES, default='m')

Using form_for_model() on the above model results in HTML like this:

    <select>
    <option>---------</option>
    <option>Male</option>
    <option>Female</option>
    </select>

How do I get rid of the first option (the dashes)?
I would prefer to have the <select>-widget default to my default-value.

Any help appreciated!

-mark