Re: Improvements to better support implementing optimistic concurrency control
On Aug 9, 1:17 am, Steven Cummings wrote:
> I don't think we're talking about new or specific fields as part of the base
> implementation here. Just enhanced behavior around updates to:
>
> 1) Provide more information about the actual rows modified
> 2) Check preconditions with the actual DB stored values; and
> 3) Avoid firing post-update/delete signals if nothing was changed
>
> From there you could implement fields as you see fit for your app, e.g.,
> version=IntegerField() that you use in a precondition.

That would be useful, especially if it can be done without too much code duplication.

I had another idea for optimistic locking: why not use the pre_save signal for this? There is a proof of concept of how to do this at https://github.com/akaariai/django_optimistic_lock

The idea is basically that if you add an OptimisticLockField to your model, the pre_save (and pre_delete) signal will check that there have been no concurrent modifications. That's it.

The code is quickly written and downright ugly. It is a proof of concept and nothing more. I have tested it briefly using PostgreSQL and it seems to work for simple usage. However, it will probably eat your data.

- Anssi
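The sketch below shows the general shape of such a signal-based check; the field, model, and exception names are illustrative, not the actual API of the linked proof of concept. Note that the SELECT-then-save sequence is not atomic on its own, which is presumably part of why the code above is hedged as likely to eat your data.

    from django.db import models
    from django.db.models.signals import pre_save

    class ConcurrentModificationError(Exception):
        pass

    def check_optimistic_lock(sender, instance, **kwargs):
        # Only instances that are already persisted can conflict.
        if instance.pk is None:
            return
        stored = sender._default_manager.filter(
            pk=instance.pk).values_list('version', flat=True)
        # Compare the stored version with the one this instance loaded.
        if stored and stored[0] != instance.version:
            raise ConcurrentModificationError(
                "%s(pk=%s) was modified concurrently"
                % (sender.__name__, instance.pk))
        instance.version += 1

    class Article(models.Model):
        name = models.TextField()
        version = models.IntegerField(default=1)

    pre_save.connect(check_optimistic_lock, sender=Article)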
Re: Improvements to better support implementing optimistic concurrency control
On Aug 8, 6:30 pm, Steven Cummings wrote:
> For backward compatibility, there may be a Model sub-class that would leave
> Model alone altogether (this was suggested on the ticket). This seems fair
> since many seem to be getting by without better optimistic concurrency
> control from Django's ORM today.

Would the subclass-based method automatically append a field to the model, or would you also need to create the field used for version control yourself? How does the subclass know which field to use?

Yet another option is models.OptimisticLockField(). If there is one present in the model, and a save will result in an update, the save method will check for conflicts and set the version to version + 1 if there are none. There is some precedent for a somewhat similar field, AutoField: it also changes how save behaves.

I wonder what to do if the save does not result in an update and the version is set to something other than 1. This could happen if another user deleted the row and you are now saving it, which would result in a reinsert. Should that also be an error?

- Anssi
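The conflict check and the version bump could be collapsed into one conditional UPDATE, so the comparison and the increment happen atomically. A minimal sketch, assuming a plain IntegerField rather than a real OptimisticLockField:

    from django.db import models

    class ConcurrentModificationError(Exception):
        pass

    class Document(models.Model):
        body = models.TextField()
        version = models.IntegerField(default=1)

        def save(self, *args, **kwargs):
            if self.pk is None:
                return super(Document, self).save(*args, **kwargs)
            # Compare-and-swap: the UPDATE matches only if the stored
            # version is still the one this instance loaded.
            updated = Document.objects.filter(
                pk=self.pk, version=self.version,
            ).update(body=self.body, version=self.version + 1)
            if updated == 0:
                # Concurrent update, or the row was deleted (the
                # reinsert case discussed above).
                raise ConcurrentModificationError(
                    "Document(pk=%s) changed or vanished" % self.pk)
            self.version += 1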
Re: Improvements to better support implementing optimistic concurrency control
On Aug 8, 4:54 pm, Steven Cummings wrote:
> Interesting feature I hadn't noticed in memcached. That does seem like it
> would do the trick where memcached is being used. I think the ability to
> control it in Django would generally still be desirable though, as that is
> where the data ultimately lives and I'd be hesitant to assume to control the
> DB's concurrency from memcached. Ideally it should be the other way around.

I assume the memcached implementation would be a version value stored in memcached. Can you really trust that memcached keeps the version value and doesn't discard it at will once it has been unused long enough?

There are a couple of other things in model saving which could be handled better. If composite primary keys are included in Django, one would need the ability to update the primary key. If you have a model with a (first_name, last_name) primary key, and you change the first_name and save, the current implementation (and definition) of model save() would insert a new row into the DB instead of doing an update. Another thing that could be handled better is updating just the changed fields.

I wonder how to implement these things with backwards compatibility. Maybe a method update(condition=None, only_fields=None) which returns True if something was actually updated (or raises an exception if nothing was updated). The method would use the old pk and the condition (if given) in the WHERE clause. If only_fields is None, it would update only the changed fields... Seems ugly, but I can't think of anything better.

- Anssi
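Expressed with the existing queryset API, such a method could look roughly like this; the name, signature, and behavior are just the ones floated above, not an existing Django API:

    def update_instance(instance, condition=None, only_fields=None):
        # Target the row by the (old) primary key, plus any extra
        # precondition given as a Q object.
        qs = type(instance)._default_manager.filter(pk=instance.pk)
        if condition is not None:
            qs = qs.filter(condition)
        # Without change tracking we cannot know which fields changed,
        # so fall back to all non-pk fields here.
        names = only_fields or [
            f.attname for f in instance._meta.fields if not f.primary_key]
        values = dict((name, getattr(instance, name)) for name in names)
        # update() returns the number of matched rows.
        return qs.update(**values) == 1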
Re: Weekly check-in #1
Just a quick note: it could be a good idea to have concrete_fields in addition to virtual_fields in the Meta class. Concrete fields would be the fields having a database column directly attached to them; the rest would be virtual fields. The fields attribute would be these two joined together. This way it should be relatively easy to do Model __init__(), django.db.models.query.QuerySet iterator() etc.

I haven't looked much into this, so this might be a silly idea. I just wanted to mention it so that it will no longer bother me during the last days of my holiday :)

- Anssi
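Sketched as code, the proposed split might look like the following; the attribute names are the ones suggested above, and nothing like this existed in Django's Options class at the time:

    class Options(object):
        def __init__(self):
            # Fields backed directly by a database column.
            self.concrete_fields = []
            # Fields without a column of their own, e.g. a composite
            # key spanning several concrete columns.
            self.virtual_fields = []

        @property
        def fields(self):
            # The public list stays backwards compatible: both kinds,
            # joined together.
            return self.concrete_fields + self.virtual_fields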
Re: Weekly check-in #1
On Jul 20, 4:18 pm, Michal Petrucha wrote:
> The last week I've been looking into related fields and all that
> stuff. As it turns out, the issue is much more complex than I
> originally anticipated and at the moment I have some doubts whether
> this can be achieved as part of this GSoC.
>
> Basically, there are two approaches to the task:
>
> 1) Make ForeignKey and friends the field that manages all its rows
>    (i.e. it will return multiple columns when asked for them and so on).
>
> 2) ForeignKey will be just a virtual field and it will create all
>    required auxiliary fields in the local model. ForeignKey would then
>    just handle the relationship stuff and leave everything else to the
>    aux fields.
>
> Some notes about both of them (I spent a few days trying to make
> either support at least some basic features and both seem bloody
> complex to me):
>
> 1) The changes required to ForeignKey itself can be kept to a minimum
>    but this would require drastic changes in many parts of the
>    internal code:
>
>    * a complete rewrite of the database creation code
>
>    * probably also a rewrite of the parts that match rows fetched
>      from the database to model fields
>
>    * many internal changes in query code to support multi-column
>      fields
>
>    Basically, the problem is that practically all parts of the code
>    rely heavily on the fact that each local field is backed by exactly
>    one database column (except for M2M which is often special-cased).
>
>    Now, all of this code would need to be rewritten to also work with
>    fields spanning several database columns. I got completely lost
>    somewhere around SQLCompiler.resolve_columns and
>    DatabaseOperations.convert_values, though this is all just the tip
>    of the iceberg that I encountered while looking into raw querysets;
>    there is much more to it for regular ones.
>
> 2) This would require an extensive refactor of related fields. I can
>    imagine making the aux field sit at ForeignKey's attname to manage
>    the actual value. This would give us creation and row matching
>    practically for free, but again, some internal query changes would
>    still be necessary (multi-column joins, for one).
>
>    The change could be made backwards-compatible if we made the
>    default aux field use the ForeignKey's db_column.
>
> Of course, it might be possible to make a half-assed hacky solution,
> i.e. ForeignKey would be a full-featured field in some cases and a
> virtual one otherwise but this would make a total mess out of
> everything and it would require a serious amount of nasty hacks and
> workarounds.
>
> At any rate, I don't feel competent to make the decision in this
> matter and I honestly believe there ought to be some discussion about
> which route we'll take.
>
> My personal favorite is the second option but I can imagine people not
> liking code that adds local fields automagically. On the other hand,
> there is already one such case (the id AutoField).
>
> Anyway, now is the time that I'd like to see some comments and
> opinions of other people who know the ORM code.

It might be a little late to comment on this, but here are some opinions from somebody knowing something (but not much) about the ORM.

My first feeling is that from the django/db/models/sql/query.py point of view it doesn't matter that much which choice you make. Either way, when filtering through related models, the lookup needs to be resolved to a field, and if that field is a foreign key, the columns and tables needed in the join need to be resolved. After that the code doesn't care how the ForeignKey is represented in the model's Meta class.

I would say that the second option is much better. There are a couple of reasons I believe this to be so:

- This way, when there is a concrete field, there would always be a matching concrete database column.

- If multi-column primary keys are virtual fields, then it makes sense that the related field is also virtual, and represented in as similar a way as possible. It could make sense to try to make the pk a virtual field also.

- Currently, foreign keys kind of create a new field in the model, but not really: foo = ForeignKey(Foo) will create something that is almost like a field (foo_id), but if I am not mistaken, this is not a field in the model's Meta class. It would IMHO be cleaner if foo_id were a concrete field, and foo a virtual field.

- The most important thing is that a single field can be part of multiple foreign keys. It seems really hard to make this work using approach 1).

In my opinion the biggest worries about this approach are:

- Can this be made totally backwards compatible?

- Even if it is backwards compatible, this will still break a ton of code. The model's Meta class (and especially its fields attribute) isn't part of the documented API, but it is used heavily in many projects. While multi-column primary keys will break the fields attribute
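The foo/foo_id point above is easy to see in a shell: the descriptor and the underlying attribute already behave like two different things, but only one field object exists. Foo and Bar here are hypothetical models used for illustration:

    from django.db import models

    class Foo(models.Model):
        name = models.TextField()

    class Bar(models.Model):
        foo = models.ForeignKey(Foo)

    # The ForeignKey itself is the only field object; there is no
    # separate entry for the foo_id column, even though the column is
    # what actually exists in the database.
    [f.name for f in Bar._meta.fields]     # ['id', 'foo']
    [f.attname for f in Bar._meta.fields]  # ['id', 'foo_id']

    bar = Bar(foo_id=1)  # set the raw column value...
    bar.foo              # ...and the descriptor fetches the Foo instance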
Re: localization in python code (views, models etc)
On Jul 12, 12:28 pm, Jannis Leidel wrote:
> Yeah, django.utils.formats.localize is the main function to localize
> a value using the format localization engine from Python. The missing
> documentation is a bug, IMO.

Just a minor correction: localize does not use the localization engine from Python, it uses Django's inbuilt localization. Python's localization can't be trusted to be thread-safe (although on some platforms it probably is). This is not a big point, except that the inbuilt localization engine is slower than Python's: one is written in Python, the other uses system libraries written in C.

- Anssi
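For reference, django.utils.formats.localize formats a value according to the active Django language rather than the process-wide C locale. A small usage sketch, assuming USE_L10N is enabled and the locale data is available:

    from django.utils import formats, translation

    translation.activate('de')
    formats.localize(1234.5)   # u'1234,5' -- German decimal comma
    translation.activate('en')
    formats.localize(1234.5)   # u'1234.5'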
Re: Weekly check-in (this should be #5, right...?)
On Jul 6, 1:11 pm, Michal Petrucha wrote:
> Hmm, this is exactly what I had in mind when thinking about this
> problem. I see that even despite keeping the changes to a minimum, the
> patch looks quite big. I'll definitely review this once I start
> working on relationship fields.

No wonder the patch is quite big. I accidentally branched from the conditional aggregation branch, so it has everything in that patch included, and that patch is much larger than the multicolumn_join patch.

I pushed a new branch to GitHub (https://github.com/akaariai/django/tree/multicolumn_join); this time the patch is much smaller: 4 files changed, 77 insertions(+), 63 deletions(-).

I will destroy the composite_join branch. I didn't like the composite_join name anyway; multicolumn_join is a much better name... :)

- Anssi
Re: Weekly check-in (this should be #5, right...?)
On Jul 6, 1:39 am, akaariai wrote:
> Having said all this, for this project "extend the connection tuple"
> approach seems to be the only sane choice.

I implemented this in my GitHub branch https://github.com/akaariai/django/tree/composite_join

With this you can do:

    a = Article.objects.all()
    a.query.get_initial_alias()
    a.query.join(('basic_article', 'foo',
                  ('pk_part1', 'pk_part2'),
                  ('foo_pk_part1', 'foo_pk_part2')),
                 promote=True)
    print a.query

    SELECT "basic_article"."id", "basic_article"."headline",
           "basic_article"."pub_date"
    FROM "basic_article"
    LEFT OUTER JOIN "foo"
      ON ("basic_article"."pk_part1" = "foo"."foo_pk_part1"
          AND "basic_article"."pk_part2" = "foo"."foo_pk_part2")

The connection parameter is now a tuple (lhs, table, (lhs_col1, lhs_col2, ...), (col1, col2, ...)). This seemed to be the way of least pain. All current tests pass on SQLite3. There will probably be problems when more complex queries are tried with multi-column join conditions. I hope this gives at least an idea of how to approach the multi-column outer join problem.

- Anssi
Re: Weekly check-in (this should be #5, right...?)
On Jun 27, 4:22 am, Michal Petrucha wrote:
> some visible progress on my project at long last. I spent most of the
> last week digging deep inside the ORM's entrails to make composite
> field lookups possible and finally it looks promising.
>
> While working on this I found out the extra_filters approach I
> intended to use was a dead end (which reminded me of what Russ wrote
> in response to my proposal: "I'm almost completely certain you'll
> find some gremlin lurking underneath some dark corner of the code").

I did a glance-over of your GitHub branch. I was especially looking for how you will handle LEFT OUTER JOINs involving composite primary keys / foreign keys. If I am not missing something, I think this hasn't been done yet. I have been thinking about this issue myself, and I thought it would be good to share what I have found out.

The problematic part for multi-column join conditions is in django.db.models.sql.query:

    def join(self, connection, ...):
        """
        Returns an alias for the join in 'connection', either reusing an
        existing alias for that join or creating a new one. 'connection'
        is a tuple (lhs, table, lhs_col, col) where 'lhs' is either an
        existing table alias or a table name. The join corresponds to
        the SQL equivalent of::

            lhs.lhs_col = table.col
        """

Obviously this can not work for creating multi-column joins. The connection information is stored in alias_map, join_map and rev_join_map. In particular, alias_map stores (table, alias, join_type, lhs, lhs_col, col, nullable). Currently the contents of alias_map are turned into SQL (sql/compiler.py, get_from_clause()) as:

    result.append('%s %s%s ON (%s.%s = %s.%s)'
                  % (join_type, qn(name), alias_str,
                     qn(lhs), qn2(lhs_col), qn(alias), qn2(col)))

The simplest way to extend this to more columns would probably be the following:

- connection is defined as (lhs, table, lhs_col1, col1, lhs_col2, col2, ...)

- the alias_map format needs to change a bit so that the extra columns can be stored in there. One could store the extra columns after the nullable flag; cleaner would be to have the columns in one tuple: (table, alias, join_type, lhs, (cols), nullable)

- a limited number of places needs to be fixed, most notably get_from_clause() in compiler.py

The downside of the above is that it does not support any join conditions other than ones involving two tables and a list of ANDed columns. For composite fields this is enough. For future usage it would be nice if one could pass Where nodes in as the connection; this would allow arbitrary join conditions. The Where node knows how to turn itself into SQL, how to relabel aliases and so on. This approach has some problems, however:

- How to generate the Where node?

- How to match existing joins to new joins? Currently this is done by checking that the connection four-tuple is equivalent to the existing join's four-tuple. I don't think Where nodes know how to check equivalence with another node. And even if Where nodes knew how to do that, all the leaf nodes would also need to know how.

- Performance issues: cloning a Where node is more expensive than cloning a tuple. Construction, equivalence checking and other operations are also somewhat more expensive than with tuples.

- Overkill for composite fields.

Of course, the approaches could be combined: you pass in the join condition as a tuple, and you can pass extra_filters (default None) as a Where node. This would keep the normal case efficient but allow more complex join conditions if really needed. A join having extra_filters could not be reused, except when explicitly stated.

Having said all this, for this project the "extend the connection tuple" approach seems to be the only sane choice. The work you have done looks very promising. I hope this post has been at least somewhat useful to you.

- Anssi
Django ORM enhancements
I have implemented proof-of-concept versions of conditional aggregation, F() lookups in aggregates, and annotating fields to a model (qs.field_annotate(age_x2=F('age')*2) -- note: no aggregation here). See ticket #11305 for more details.

I would also like to implement a patch which would allow annotating reverse related models. The idea would be most useful for fetching translation models. Given the models:

    Article(
        id = IntegerField()
        default_lang = CharField()
    )

    ArticleTranslation(
        article = ForeignKey(Article, related_name='translations')
        name = TextField()
        abstract = TextField()
        content = TextField()
        lang = CharField()

        class Meta:
            unique_together = ('article', 'lang')
    )

and the queryset:

    Article.objects.annotate(
        translation_def=ModelAnnotation(
            'translations',
            only=Q(translations__lang=F('default_lang'))),
        translation_fi=ModelAnnotation(
            'translations',
            only=Q(translations__lang='fi')),
    )

the above query would generate something like this:

    SELECT article.id, article.default_lang, t1.name, ..., t3.name
    FROM article
    LEFT JOIN article_translation t1
        ON article.id = t1.article_id AND t1.lang = 'fi'
    LEFT JOIN article_translation t3
        ON article.id = t3.article_id AND t3.lang = article.default_lang

and the objects returned would have (possibly None-valued) translation_fi and translation_def instances attached to them.

These features require a lot of work before anything commit-quality is ready. I would like to ask whether the community would consider these ideas before I do too much work. These patches would also require some attention from somebody with more ORM knowledge than I have; the ModelAnnotation idea is probably too hard for me to implement in even near commit quality.

These features would naturally make the ORM more powerful, but I see some objections to them:

1. They will make the already complex ORM even more complex. This will result in new bugs and make it harder to add new features to the ORM.

2. The Django ORM has the philosophy of "80/20", meaning that the ORM should make it possible to run 80% of your queries; the rest can be done using raw SQL. Are these features beyond the 80% threshold?

3. As the queries become more complex, it is likely that the ORM will not be able to generate efficient SQL. If that is the case, raw SQL is needed. Back to square one.

4. The ORM is nearing the point where the API is too complex. Instead of writing complicated SQL, you will be writing complicated ORM queries.

On the other hand, combined with custom F() expressions, the need for .extra() would be smaller and maybe it could even be deprecated in the future.

- Anssi
Re: Idea for i18n fields
On Jul 2, 12:59 am, Ric wrote:
> Hi there,
>
> i have got a simple approach to make all django fields with a full
> i18n support
>
> the django.models.fields.Field class can be subclassed like this
>
>     from django.db import models
>     from django.utils.translation import get_language
>
>     class i18nField(models.Field):
>
>         def __init__(self, i18n=False, *args, **kwargs):
>             self.i18n = i18n
>             models.Field.__init__(self, *args, **kwargs)
>
>         def db_column_get(self):
>             if not self.i18n:
>                 return self._db_column or self.name
>             return "%s_%s" % (
>                 self._db_column or self.name,
>                 get_language().lower()
>             )
>
>         def db_column_set(self, value):
>             self._db_column = value
>
>         def _column_set(self, value):
>             pass
>
>         db_column = property(db_column_get, db_column_set)
>         column = property(db_column_get, _column_set)
>
> then you can declare all other subfields as usual
>
> this works in that way: you need a separate db column for every
> language installed. a field called "name" needs to create
> ("name_%s" % code for code in languages) columns
>
> so the framework automatically selects the right column in every query.
>
> problems:
> - serializing objects, you need to serialize all fields, not just
>   the current language
> - many to many fields, to work they need to create an extra column in
>   every through table, a column to store the language code.
> - during syncdb you need to create a column for every language
>   installed
>
> after two years on an i18n django site, i found this simple solution.
> there are some small problems, that can be fixed if we put an i18n
> option when you init a field, and solve some issues during the syncdb
> command and serialization of objects.
>
> for me it is a very simple approach,
> it automatically filters, sorts and outputs the right queryset for your
> language, and when you access a field you get the current language; it
> works for every field, ForeignKeys too.
>
> and it works in admin (with no changes at all)
>
> let me know what you think.

From my point of view there are a couple of problems with this approach:

- The idea of putting a column for every translated language for every field directly in the base table is not feasible for many use cases. If you have 10 translated fields in your model, and you have 10 languages, that is already 100 columns. If you happen to need an index on one field, you need 10 indexes. For example, in the EU you might need the possibility of having the model translated into all official EU languages; that is over 20 languages you need to support right there. For the EU use case it is better to have a translations table containing the translations for one language in one row.

- This approach makes it hard to fetch all the translations of the model. And how does this work for NOT NULL fields?

- There are ways to have proper fields for every translation in the DB table. It is better that there is a 'name' field which fetches the default language according to the currently active language, and then there are the translated fields ('name_fi', 'name_en', ...) if you need those. See django-transmeta for one such solution (http://code.google.com/p/django-transmeta/).

For some use cases your solution definitely can be handy. But my feeling is that this does not belong in core (not that I have any power in deciding that). The biggest reason is that the more I have worked on different multilingual projects, the more certain I am that there is no single solution to model translations. At least no single solution for how to handle the database layout part of it.

- Anssi
Re: Conditional aggregations.
On Jun 28, 5:46 pm, Javier Guerra Giraldez wrote:
> i might be totally wrong (wouldn't be first time..) but i've found
> myself having to adapt to local dialects almost every time i see some
> SQL inside a function, especially on mysql and sqlite. maybe it's
> because of the bad quality of code i tend to see (typically
> originating from hand-coded mssql accesses deep within an excel
> sheet), but seeing CASE also rings my "i'll need an extra week just
> for this" alarm.

I really do hope that the CASE WHEN construction can be used on all supported databases. I have high hopes that it can, because CASE WHEN is one of the most standard constructions in SQL, and because I have tested it on MySQL 5.0, PostgreSQL 8.4, SQLite3 and Oracle 10g.

I also attached a proof-of-concept patch to #11305. It is somewhat ugly, but I hope it is a good start. It should support aggregate() and annotate() for all the standard aggregates, and F() lookups should be usable with it. The restriction is that the Q object used in the only condition can not add additional joins to the query. The patch is just a proof of concept, and it is safe to assume it will fail under more complicated queries.

- Anssi
Re: Thoughts on solution to forward references in MySQL (#3615)
On Jun 28, 12:24 am, "Jim D." wrote:
> I spent some time last week and over the weekend nailing down a
> solution for https://code.djangoproject.com/ticket/3615. This is the
> ticket about allowing forward references when loading data on the
> MySQL InnoDB backend. My patch implements the proposed change
> (disabling foreign key checks when the data is loaded) as well as a
> straightforward SQL SELECT check for integrity after the data is
> loaded, which if I understand it is the missing piece that has
> prevented this ticket from moving forward for the last 4 years...

This is probably not concurrency-safe if the tables are not locked for the duration of the fixture loading. I don't know if this will ever be used in situations where concurrency is an issue. Test fixture loading is certainly not a problematic use-case.

- Anssi
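On MySQL/InnoDB the mechanics would presumably look like the sketch below. The table and column names (child, parent) are hypothetical, and the verification query is one straightforward way to find orphaned rows after loading, not necessarily the exact query from the patch:

    from django.db import connection

    def load_with_deferred_fk_checks(load_fixture):
        cursor = connection.cursor()
        # InnoDB honors this per-session flag; rows with forward
        # references can then be inserted in any order.
        cursor.execute("SET FOREIGN_KEY_CHECKS = 0")
        try:
            load_fixture()
        finally:
            cursor.execute("SET FOREIGN_KEY_CHECKS = 1")
        # Re-enabling the flag does NOT re-validate existing rows, so
        # check each FK by hand, e.g. child.parent_id -> parent.id:
        cursor.execute("""
            SELECT child.id FROM child
            LEFT JOIN parent ON child.parent_id = parent.id
            WHERE child.parent_id IS NOT NULL AND parent.id IS NULL
        """)
        orphans = cursor.fetchall()
        if orphans:
            raise Exception("Broken foreign keys: %r" % (orphans,))

As the reply notes, nothing here locks the tables, so a concurrent writer could still create a broken reference between the load and the check.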
Re: Conditional aggregations.
On Jun 28, 5:18 pm, Javier Guerra Giraldez wrote:
> On Tue, Jun 28, 2011 at 8:41 AM, akaariai wrote:
> > This should translate to the following SQL:
> > SELECT sum(case when house.price > 41000 and house.price < 43000 then
> > 1 else 0 end) as expensive_house,
> > sum(case when house.price > 43000 then 1 else 0 end) as
> > really_expensive_house, ...
> > FROM house
> > JOIN something on something.id = house.something_id
>
> this looks quite non-portable

How so? The CASE statement is specified in the SQL standard, and it is implemented in every database I have used.

- Anssi
Re: Conditional aggregations.
On Jun 27, 4:54 pm, Russell Keith-Magee wrote:
> > queryset.aggregate(
> >     expensive_house=Count(house__price,
> >         only=(Q(house__price__gt=41000), Q(house__price__lt=43000))),
> >     ...
> > )
>
> Ok, so that's your syntax proposal. Now show me the SQL that this
> translates into. In particular, keep in mind that you're doing joins
> in your Q clauses -- how does that get rolled out into SQL?

This should translate to the following SQL:

    SELECT sum(case when house.price > 41000 and house.price < 43000
               then 1 else 0 end) as expensive_house,
           sum(case when house.price > 43000
               then 1 else 0 end) as really_expensive_house,
           ...
    FROM house
    JOIN something ON something.id = house.something_id
    -- the given example queryset is clearly missing that something :)

I think it might be good to restrict the only clauses to the fields of the same model the aggregated field is in. This way there is already a usable join generated by the aggregate.

The only clause affects the "case when" structure alone; the only clauses should never restrict the queryset. It gets really complicated to do that restriction when you have multiple aggregates using only. You can use filter to restrict the queryset instead. And you probably don't want the filter there in any case: in the above example, you would not get any results for the rows aggregating to 0.

If the only clauses never restrict the queryset, then translating conditional aggregates to SQL isn't really _that_ complicated. Where you normally generate "avg(table.column) as column_avg", you now generate "avg(case when table.some_column matches the only condition then table.column else null end) as column_avg".

- Anssi
Re: Removal of DictCursor from raw query.. why??
On Jun 17, 8:02 pm, Ian Kelly wrote:
> The thing is, this is a DB API snippet, not a Django snippet
> specifically. If Django were a DB API toolbox, then it might make
> sense to include it in some form or other. But it's not, so in the
> interest of keeping things relatively tidy I'm a -0 on this.

It is often said here that the Django ORM is designed to do 80% of the stuff, and the rest can be done using raw SQL. So, giving users pointers on how to perform that raw SQL as painlessly as possible is something the Django documentation should do.

- Anssi
Re: Removal of DictCursor from raw query.. why??
On Jun 17, 2:54 pm, "Cal Leeming [Simplicity Media Ltd]" wrote:
> Because I feel this is just something that should work (or be available) out
> of the box. There are plenty of other places where Django docs has included
> code snippets to give the user a heads up, and I think this is the perfect
> case for one.
>
> If anyone has any objections to this, please let me know, if not ill put in
> a ticket for consideration.

I just wanted to say I support having something documented about this. Without documentation, new users will most likely use index-based cursors. I know I used to do that.

The problem with missing documentation is not so much that it would be hard to find a snippet with a dict cursor implementation. It is more that new users don't know that using index-based cursors might not be the best of ideas.

- Anssi
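For context, the kind of snippet under discussion is small; this is the commonly circulated recipe for dict-style rows on top of a plain DB API cursor (essentially the form that later appeared in the Django docs):

    def dictfetchall(cursor):
        # Return all rows from a cursor as a list of dicts,
        # keyed by column name instead of index.
        columns = [col[0] for col in cursor.description]
        return [dict(zip(columns, row)) for row in cursor.fetchall()]

    # Usage:
    # cursor.execute("SELECT id, name FROM myapp_person")
    # rows = dictfetchall(cursor)  # [{'id': 1, 'name': u'...'}, ...]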
Re: RFC: Composite fields API
On May 17, 5:32 pm, Michal Petrucha wrote:
> Proper subquery support is something that can be addressed once the
> rest of the implementation is stable.

To me the plan looks very reasonable (both disallowing subqueries and converting to disjunction form), unless there is some part of the internals which expects pk__in=qs to work. In that case it could just be converted to something like:

    if pk is multipart_pk:
        qs = list(qs.values_list('pk_part1', 'pk_part2'))
        # and continue as now

In any case, in my opinion pushing as much of this work as possible to later patches is the way to go. The only question is how much can be pushed to later patches. I do not know the answer to that, unfortunately...

- Anssi
Re: RFC: Composite fields API
On May 12, 2:41 pm, Michal Petrucha wrote:
> Due to the nature of this field type, other lookup filters than
> ``exact`` and ``in`` would have unclear semantics and won't be
> supported. The original plan was to also exclude support for ``in``
> but as it turns out, ``in`` is used in several places under the
> assumption that primary keys support it, for example DeleteQuery
> or UpdateQuery. Therefore both filters will be implemented.

I wonder how to implement __in lookups on SQLite3. SQLite3 doesn't support

    WHERE (col1, col2) IN ((val3, val4), (val5, val6))

but other DBs do (at least MySQL, Oracle and PostgreSQL). I do not know what would be the best way to write something equivalent for SQLite3. The obvious choice is to rewrite it as an OR lookup (as mentioned in the full proposal). Maybe write it as an OR lookup for every DB in the initial patch; later on this can be improved to per-database handling.

__in lookups with subselects are a harder problem. Those would need to be rewritten as joined subselects with a distinct clause. [1] NOT IN lookups could be still harder due to weird NULL handling (1 NOT IN (NULL) -> Unknown). [2]

I hope there will be an easy solution to this problem, as this feature is something which would be really, really valuable for Django (no more telling DBAs: by the way, no composite foreign keys...). One simple solution would be to disallow __in lookups with subselects (or run the subselects separately) and use OR lookups when given a list of values. This should be relatively easy to implement and could be improved later on.

- Anssi

[1] http://asktom.oracle.com/pls/asktom/f?p=100:11:0P11_QUESTION_ID:953229842074
[2] http://asktom.oracle.com/pls/asktom/f?p=100:11:1089369944141559P11_QUESTION_ID:442029737684
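The OR-lookup rewrite is mechanical; here is a sketch that turns a list of composite-key value tuples into a single disjunction of Q objects (the helper and field names are illustrative, not part of any patch):

    import operator
    from django.db.models import Q

    def composite_in_q(field_names, value_tuples):
        # (a, b) IN ((1, 2), (3, 4)) becomes
        # (a=1 AND b=2) OR (a=3 AND b=4).
        # Assumes value_tuples is non-empty.
        return reduce(operator.or_, [
            Q(**dict(zip(field_names, values)))
            for values in value_tuples])

    # Usage against a model with a two-part key:
    # Foo.objects.filter(
    #     composite_in_q(('pk_part1', 'pk_part2'), [(1, 2), (3, 4)]))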
Re: Resetting settings under test
On May 13, 3:41 pm, Jeremy Dunck wrote:
> In general, the TestCase does a good job of cleaning up the
> environment (resetting the DB, etc.), but I've run across an edge case
> that might be worth putting upstream.
>
> I have a large codebase running multi-tenant -- lots of sites per WSGI
> process, running process-per-request, and it serves those sites by
> mutating settings per request (via middleware), including poking an
> urlconf onto the request.
>
> Under test, this leaves these fairly weird, since the settings
> mutations can affect other test cases.
>
> I realize that in general, settings are intended to be immutable,
> but... what do you think of TestCase tearDown restoring the settings
> as they were before the test runs?

The tearDown should also handle resetting cached settings. There are a few; at least translation's __init__.py and localization have caches that need resetting. It would be good if settings had a method "reset" which would restore the original settings and clear the caches. A complete API of change_setting, restore_setting and reset_all would be even better.

- Anssi
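A sketch of what such an API could look like; nothing like this exists on Django's settings object, and the cache-clearing hook is deliberately left abstract:

    class _Missing(object):
        pass
    MISSING = _Missing()

    class SettingsPatcher(object):
        def __init__(self, settings):
            self.settings = settings
            self.old_values = {}

        def change_setting(self, name, value):
            # Remember only the first original value per setting.
            if name not in self.old_values:
                self.old_values[name] = getattr(self.settings, name, MISSING)
            setattr(self.settings, name, value)
            self._clear_caches(name)
            return self.old_values[name]

        def reset_all(self):
            for name, old in self.old_values.items():
                if old is MISSING:
                    delattr(self.settings, name)
                else:
                    setattr(self.settings, name, old)
                self._clear_caches(name)
            self.old_values = {}

        def _clear_caches(self, name):
            # A real implementation would flush setting-dependent
            # caches here, e.g. the translation machinery's cached
            # provider when USE_I18N changes.
            pass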
Re: Accidental logging disabling
On Apr 15, 7:34 am, Ivan Sagalaev wrote:
>     import logging
>
>     import settings
>     from django.core.management import setup_environ
>     setup_environ(settings)
>
>     from django.test.client import RequestFactory
>
>     factory = RequestFactory()
>     factory.get('/')
>     logger = logging.getLogger('django.request')
>     logger.error('Error')
>
> The message doesn't show up in the console. Here's what's happening here:
>
> 1. setup_environ creates a lazy settings object
>
> 2. importing RequestFactory triggers import of django.core.handlers.wsgi
>    that creates a logger 'django.request'
>
> 3. after some time the settings object is accessed for the first time and
>    gets initialized

I have been using setup_environ in my projects, and the lazy initialization can cause some weird problems. For example, if you do manual timing using:

    start = datetime.now()
    # ... access settings ...
    print 'Used %s' % (datetime.now() - start)

you might get weird results, as accessing settings can change your timezone. Would it be wise for setup_environ() to access the settings so that they are no longer lazy? Or does this cause other problems?

- Anssi
Re: Read-only forms and DataBrowse
About the read-only forms part of the proposal: read-only forms will be easy to implement if the template-based widget rendering idea is included in core. For example, for the SelectMultiple widget the base template is probably something like this:

    {% for choice in choices %}
      {% if choice.selected %}
        <option selected="selected">{{ choice }}</option>
      {% else %}
        <option>{{ choice }}</option>
      {% endif %}
    {% endfor %}

If you want a read-only widget, just give it a custom template:

    {% for choice in choices %}
      {% if choice.selected %}
        {{ choice }}
      {% endif %}
    {% endfor %}

Now, that was easy :) Combined with template-based form rendering, it would be relatively easy to implement read-only forms. Another reason why template-based form/widget rendering would be nice to have in core.

By the way, it would be nice to see how template-based rendering compares to Python-based rendering performance-wise when rendering a larger list of choices. On the other hand, if you have a large Select/SelectMultiple list there are bound to be some usability issues, so maybe the performance isn't that important... Sorry for bringing performance up here again, but I am a speed freak :)

For what it's worth, I implemented somewhat working read-only fields / widgets some time ago. I have used them a little on some sites, and the ability to render any model / form easily in display-only mode is a really nice feature. The work can be found at [1], but it is incomplete and based on an old version of Django (the last commit is from August 23, 2010).

[1] https://github.com/akaariai/django/tree/ticket10427

- Anssi
Re: Django Template Compilation rev.2
On Mar 30, 6:18 am, xtrqt wrote:
>     def templ(context, divisibleby=divisibleby):
>         my_list = context.get("my_list")
>         _loop_len = len(my_list)
>         result = []
>         for forloop, i in enumerate(my_list):
>             forloop = {
>                 "counter0": forloop,
>                 "counter": forloop + 1,
>                 "revcounter": _loop_len - i,
>                 "revcounter0": _loop_len - i - 1,
>                 "first": i == 0,
>                 "last": (i == _loop_len - 1),
>             }
>             if divisibleby(i, 2) == 0:
>                 result.append(force_unicode(i))
>         return "".join(result)
>
> For comparison here is the performance of these two:
>
>     >>> %timeit t.render(Context({"my_list": range(1000)}))
>     10 loops, best of 3: 38.2 ms per loop
>     >>> %timeit templ(Context({"my_list": range(1000)}))
>     100 loops, best of 3: 3.63 ms per loop
>
> That's a 10-fold improvement!

I did a little test by adding localize(i) in there. On my computer the time went to around 25 ms. For datetimes the time needed is somewhere around 100 ms. If you could inline the localize(i) call for the integer case you would get back to around 4 ms, as it doesn't actually do anything other than return force_unicode(i)...

So, when designing template compilation it is essential to see how the localization stuff could be made faster, or much of the benefit will be lost. It seems that at least for this test case localization uses over 50% of the time, so there would be a bigger gain in making localization faster than in compiling templates.

- Anssi
Re: Template Compilation
On Mar 27, 5:48 am, "G.Boutsioukis" wrote:
> Hi, I'm thinking about submitting a proposal for template compilation
> and I'm posting this as a request for more info.
>
> In particular, I remember this project being discussed last year and I
> pretty much assumed that Alex Gaynor's proposal would have been
> accepted (I see he's listed as a mentor this year BTW). What was the
> rationale behind the decision to reject it? Unless, of course, it was
> made on his part.
>
> In any case, any other comment around compatibility, speed or other
> concerns would also be helpful.

In the other-concerns department: for many workloads, template compilation itself won't be that big of a benefit. There is a relatively big speed bottleneck in L10N-related code. If you are rendering a big table of integers, if I recall correctly about 30-40% of the time is spent localizing the representation of those integers. If you are rendering floats it will be more, and for dates/datetimes it will probably be 90%+. So it would be important to see how to reduce the impact of L10N when trying to make template rendering faster.

The L10N rendering was made faster in tickets #14290 and #14306. There is some low-hanging fruit in #14297 which never got applied.

I don't mean to say that this is a reason not to implement template compilation, just that for some workloads the gain is not going to be _that_ big. In the integer-rendering case, it would still be reasonable to expect a speedup of nearly 50%.

- Anssi
Re: Composite primary keys
On Mar 21, 1:20 pm, Michal Petrucha wrote:
> > My suggestion is to create an Index type that can be included in a
> > class just like a field can. The example we've been using would
> > then look like:
> >
> >     class Foo(Model):
> >         x = models.FloatField()
> >         y = models.FloatField()
> >         a = models.ForeignKey(A)
> >         b = models.ForeignKey(B)
> >
> >         coords = models.CompositeIndex((x, y))
> >         pair = models.CompositeIndex((a, b), primary_key=True)
> >
> > We could have FieldIndex (the equivalent of the current
> > db_index=True), CompositeIndex, and RawIndex, for things like
> > expression indexes and other things that can be specified just as a
> > raw SQL string.
> >
> > I think this is a much better contract to offer in the API than one
> > based on field which would have to throw exceptions left and right
> > for most of the common field operations.
>
> I don't see how ForeignKeys would be possible this way.

In much the same way:

    class FooBar(Model):
        a = models.ForeignKey(A)
        b = models.ForeignKey(B)
        pair = models.ForeignKey(Foo, fields=(a, b))

Note that this is very close to what SQL does: if you have a composite unique index or a composite foreign key, you define the fields and then the index / foreign key. Though I don't know how much weight that argument carries in this discussion. You could add some DRY and allow a shortcut:

    class FooBar(Model):
        pair = models.ForeignKey(Foo)  # a and b are created automatically

Now, to make things work consistently, pair should be a field. But on the other hand, when using a ModelForm, pair should probably not be a field of that form. This is clearer in an example having a (city, state, country) primary key: these should clearly be separate fields in a form.

In my opinion, whether the composite structures are called fields or something else isn't that important. There are cases where composite structures behave like a field and cases where they do not. The main problems are how the composite structures should behave in ModelForms and serialization, whether they should be assignable, how they relate to the model __init__ method, whether they should appear in model field iterators, how they are used in QuerySets, and so on. When these questions are answered it is probably easier to decide whether the composite structures should be called fields or something else.

- Anssi
Re: Expensive queryset cloning
On Mar 17, 3:11 am, Alexander Schepanovski wrote:
> Can you find that patch and post it somewhere?
> If not, still thanks for this idea.

Unfortunately, no. Gone with my old laptop.

- Anssi
Re: Expensive queryset cloning
On Mar 16, 10:14 am, Thomas Guettler wrote:
> Hi Alexander,
>
> I have seen this in my app, too. It still runs fast enough. But
> I guess the django code could be optimized.

I had a patch for this problem somewhere, but can't find it now. Basically it added an inplace() method to the queryset; after calling it, no cloning of the inner query class would happen. The outer QuerySet would still be cloned, but that is relatively cheap. To prevent accidental use of an old reference to the QuerySet(*), a "usage" count was kept both in the inner query instance and in the outer QuerySet instance.

- Anssi

(*) That is:

    qs.filter(pk=1)
    qs.filter(foo=bar)

would be an error, but:

    qs = qs.filter(pk=1)
    qs.filter(foo=bar)

would be OK.
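A sketch of the core of the idea, ignoring the usage-count staleness check; inplace() and _inplace are invented names, not Django API:

    from django.db.models.query import QuerySet

    class InplaceQuerySet(QuerySet):
        def inplace(self):
            # After this call, filter()/exclude() etc. mutate this
            # queryset instead of returning a fresh clone.
            self._inplace = True
            return self

        def _clone(self, *args, **kwargs):
            if getattr(self, '_inplace', False):
                # Skip the expensive inner Query.clone(); callers then
                # build their filters directly onto self.query.
                return self
            return super(InplaceQuerySet, self)._clone(*args, **kwargs)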
Re: the new SELECT DISTINCT query in 1.3 rc1, and how to turn it off
On Mar 5, 7:29 am, Karen Tracey wrote:
> It's probably best if you open a ticket in trac
> (http://code.djangoproject.com/newticket) for this. I can't think offhand how
> to solve both the problem that changeset fixed and the one you are
> encountering.

If the Django ORM were able to perform the query:

    SELECT DISTINCT ON (primary_key) id, val1, ...
    FROM table
    ORDER BY primary_key

this would solve the problem. But making the ORM capable of that will probably take some time...

- Anssi
Forms: display_value for BoundFields and widgets
Hello,

This is a proposal related to ticket #10427: "Bound field needs an easy way to get form value" [1]. Ticket #10427 is already closed as fixed, but the fix is only partial: it is now possible to get the value from a BoundField, but the value might be a DB primary key or something else not usable for display to the user.

To fix this, so that it is possible to render a form in a show-only-data state, I propose two additions:

1. Form widgets should receive an additional method, display_value(name, value, attrs=None). This method returns the given value in human-preferred format. For TextInput this would simply be %(value)s. For more complex fields, like SelectMultiple, display_value would know how to render the values correctly: if the values are (1, 3), the output could be something like text_representing_choice_1, text_representing_choice_3.

2. BoundFields would get an additional property display_value, which would return the output of the BoundField's widget's display_value.

I am already using a version of Django patched to do this in production. My own feeling is that this is really useful. So, now I am asking:

1) Is this something that would be included in Django core (#10427 suggests so)?
2) Does the approach seem valid?
3) What to do with FileFields?
4) Is the wrapping tag sensible, or should it be something else?

The code for my own version can be found at [2], but that code is not core-ready, and it is a bit outdated, especially the tests part of it.

A few pictures can never hurt my case, so here are some pictures of actual usage of display_value. Sorry, the site is implemented in Finnish, but the actual values are not the point...

The first one is a form rendered simply by using a for loop and field.label and field.display_value. This is used for viewing data; the link "Muokkaa" in the upper left corner is an edit link. http://users.tkk.fi/akaariai/Screenshot-7.png

The second picture is what you get when clicking the edit link. This is rendered with {{ form.as_table }}. http://users.tkk.fi/akaariai/Screenshot-8.png

It is notable that select multiple fields are handled correctly and that the "Omistaja" field is actually an autocomplete field which has a totally different widget than usual select fields. Yet it is really easy to define a display_value method for that widget, too.

In general, this proposal will make it easy to create previews for forms, to implement a list - view - edit workflow for websites, and in general to show the contents of any model: just define a ModelForm and display it.

I hope to get some feedback before I start to write a patch for current Django trunk. The most work will be writing tests for this, so I would like to get point 4) above (the wrapping tag) correct on the first try.

- Anssi

[1] http://code.djangoproject.com/ticket/10427
[2] https://github.com/akaariai/django/tree/ticket10427
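A sketch of the two additions; the wrapping markup the real patch uses has been lost from this archived post, so plain strings are returned here, and the widget subclass names are invented for illustration:

    from django import forms
    from django.utils.encoding import force_unicode

    class DisplayTextInput(forms.TextInput):
        def display_value(self, name, value, attrs=None):
            # Human-preferred rendering of a text value: the value itself.
            return force_unicode(value)

    class DisplaySelectMultiple(forms.SelectMultiple):
        def display_value(self, name, value, attrs=None):
            # Map the stored values (e.g. primary keys) back to their
            # human-readable choice labels.
            selected = set(force_unicode(v) for v in value or [])
            return u', '.join(
                force_unicode(label) for val, label in self.choices
                if force_unicode(val) in selected)

    # And on BoundField, the matching property:
    #     @property
    #     def display_value(self):
    #         return self.field.widget.display_value(
    #             self.name, self.value())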
Re: Changing settings per test
On Nov 5, 2:18 am, Alex Gaynor wrote:
>     def setUp(self):
>         self.old_SETTING = getattr(settings, "SETTING", _missing)
>
>     def tearDown(self):
>         if self.old_SETTING is _missing:
>             del settings.SETTING
>         else:
>             settings.SETTING = self.old_SETTING

How about introducing a new function in settings: change_setting(name, new_value), which returns the old setting (or a marker when there is nothing configured for that name)? This function would clear the caches if the setting is cached somewhere, and also handle the env updates for the timezone or other settings needing that. Clearing caches is a problem that is not handled well at all in the Django test suite.

You could then revert the setting using the same function. This would again handle clearing the caches and the env handling. If the passed-in new_value is the marker for nothing, the setting would be deleted. Using this function, setting changes would be done like this:

    def setUp(self):
        self.old_SETTING = settings.change_setting("SETTING", new_val)
        # Or, if you want just to store the setting and change it later
        # in the actual tests:
        # self.old_SETTING = settings.change_setting("SETTING")
        # This will just fetch the old setting, or the marker for a
        # missing setting.

    def tearDown(self):
        settings.change_setting("SETTING", self.old_SETTING)

And you would not need to care whether the setting is cached somewhere; change_setting will take care of that.

I don't know if it would also be good to have settings.load_defaults (returns a dict containing all the old settings, loads global_settings). This would need a reverse function; I can't think of a good name for it, settings.revert_load_defaults(old_settings_dict) or something...

The append case could have still another function, append_setting(name, value), returning the old list (or the marker if nothing) and inserting the new value into the list. Reverting would be just change_setting(name, append_setting_ret_val).

Handling what needs to be done when changing a setting could be signal based (register_setting_change_listener); this would allow using the same mechanism for settings used by apps outside core. Of course, there could also be decorators which would use these functions...

- Anssi
Re: AutoFields, legacy databases and non-standard sequence names.
On Oct 8, 10:41, Hanne Moa wrote:
> You can't necessarily do this with a legacy database, as other systems
> also using that database expect the existing names.

ALTER SEQUENCE ... OWNED BY does not change the sequence name, just what pg_get_serial_sequence will return for a given table, column combination. But as said, if there are multiple tables using the same sequence, then ALTER SEQUENCE ... OWNED BY does not work. In those cases, manually settable sequence names for models are likely the best solution.

> I need to use my own backend because of PostgreSQL's own
> table-inheritance. Most tables in the db inherit from the same table
> and inherit its primary key and the sequence for that primary key.
> Then there are a few tables that inherit from a table that inherits
> from the grandfather table that defines the primary key and its
> sequence. So, I need to recursively discover the oldest ancestor of
> each table and use the sequence of that ancestor.
>
> HM
Re: AutoFields, legacy databases and non-standard sequence names.
> Sorry, I should have been more clear.
>
> What I'm trying to do is solicit suggestions from django developers as
> to how I *can* move ticket #1946 forward. I can find a way to work
> around it in my own project, but it would be ideal to solve it on the
> Django side, for everybody.
>
> I mentioned there were three possible suggestions in the ticket
> discussion as to how to solve the problem. If a Django developer can
> give me some guidance as to what approach seems to be the best long-term
> solution, I'm happy to try my hand at writing a patch that can hopefully
> be incorporated into the codebase.

Django doesn't expect the sequence name to be tablename_columnname_seq, at least not in trunk. The last_insert_id method in backends/postgresql/operations.py uses:

    SELECT currval(pg_get_serial_sequence(table_name, column_name))

pg_get_serial_sequence will return the correct sequence only if the sequence is owned by the table_name, column_name combination. If you happen to have just one table per sequence, then issuing

    ALTER SEQUENCE sequence_name OWNED BY table_name.column_name;

should fix the problem.

There is one more proposal in ticket #13295. The proposed solution should allow using the same sequence for multiple tables, though management of the manually defined sequence is a bit hard when building the schema and when resetting the sequence. Sorry if the proposal is a bit hard to follow...

I don't know well enough how databases other than PostgreSQL work, so I don't know if the solution is valid for other databases.

- Anssi
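For reference, this is how the ownership trick plays out against a hypothetical legacy schema (all names made up):

    -- A legacy sequence that was not created via a serial column:
    CREATE SEQUENCE legacy_seq;
    CREATE TABLE legacy_table (
        id integer PRIMARY KEY DEFAULT nextval('legacy_seq')
    );

    -- Returns NULL here: the sequence is not owned by the column,
    -- so Django's last_insert_id() cannot locate it.
    SELECT pg_get_serial_sequence('legacy_table', 'id');

    -- Attach the sequence to the column...
    ALTER SEQUENCE legacy_seq OWNED BY legacy_table.id;

    -- ...and the same call now returns 'public.legacy_seq'.
    SELECT pg_get_serial_sequence('legacy_table', 'id');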
Re: Changes to LazySettings
On Sep 28, 3:42 am, Luke Plant wrote:
> But can anyone else think of any gotchas with this change before I
> commit it? It produces a 30% improvement for the benchmark that relates
> to the ticket [3].

Not directly related to this change, but there is at least one part of Django that will cache a setting when first used [1]. The code in django/utils/translation/__init__.py caches the real translation provider for performance reasons. If the USE_I18N setting is changed when testing, it will have no effect on which translation provider is used. There might be other parts which do the same thing.

I wonder if __setattr__ of LazySettings should automatically flush this cache when USE_I18N is changed?

[1] http://code.djangoproject.com/changeset/13899

- Anssi
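A rough sketch of the flush, written as a subclass for clarity (the _trans reset is an assumption: the attribute layout of the translation cache is version-specific, see [1]):

    from django.conf import LazySettings

    class CacheAwareLazySettings(LazySettings):
        def __setattr__(self, name, value):
            super(CacheAwareLazySettings, self).__setattr__(name, value)
            if name == 'USE_I18N':
                from django.utils import translation
                # Drop the cached provider so the next translation call
                # re-selects trans_real vs. trans_null under the new
                # setting value. The attribute name is an assumption.
                translation._trans.__dict__.clear()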
Re: Speed testing Django
On Sep 16, 10:43 am, Russell Keith-Magee wrote:
> > Do we have a continuous performance benchmark at present?
>
> No.
>
> > Would it be worth having one?
>
> Certainly.
>
> There's a long standing ticket proposing just such a change:
>
> http://code.djangoproject.com/ticket/8949
>
> And there was a sprint earlier this year that started developing a
> benchmark set that could be used; results so far can be found here:
>
> http://github.com/jacobian/djangobench
>
> However, there's still a lot of work to do to build up the benchmark
> set and deploy it in a continuous integration environment. I'm not
> aware of anyone specifically coordinating this benchmark work at the
> moment, so if you want to step up and make this your contribution,
> feel free to do so!
>
> Yours,
> Russ Magee %-)

That looks to be a good base to start the work from. I hope I will have time to do this, but it might be a bigger problem than I can handle. I will check if it is possible to integrate Codespeed with the benchmarks in djangobench. I will post status updates to ticket #8949 once I have a better understanding of the problem.

- Anssi
Speed testing Django
Is there any continuous speed testing done for Django? It would be nice to see how the performance of Django is evolving. For example, while working on ticket #14290, seeing how my changes to utils/translation/__init__.py affect other parts of the framework would be useful. In general, speed testing should expose performance problems early, so that the design of a feature can be changed before the feature goes into a release.

If there is no continuous benchmarking done at the moment, maybe using Codespeed (http://wiki.github.com/tobami/codespeed/) could be considered. It is the framework used for http://speed.pypy.org, and it is built using Django.

If there is a need for this kind of work, I am volunteering to do some of it. That is, set up Codespeed and do some simple tests. I haven't tested integrating Codespeed with Django yet, so it is possible that Codespeed is not useful in this setup. I would like to integrate it with git, so that it would be easy to do speed testing of your own development branches. In SVN this is hard to do, if I am not mistaken...

I am thinking of implementing the following tests (as time permits):

  - Template rendering, simple case: build a table of, say, 1000 x 10 cells using for loops (a sketch of this one is at the end of this message).
  - Template rendering, forms: build a large form (possibly with FormSets) and use that in a template. Maybe both with {{ form.as_table }} and iterating through the form. This would test form rendering, not initialization.
  - Template rendering, real world case: use {% extends %}, {% include %}, {% trans %}... blocks; that is, build a simple template that reflects real world usage.
  - QuerySet building: simple QuerySet building (Foo.objects.filter(pk=id)), and complex QuerySet building (a couple of filter operations with some joins, maybe order by etc.).
  - Object creation: just create a bunch of Model objects. A few cases here: simple object (only id) creation, complex object (many fields) creation, and creation of objects with foreign keys, inheritance and deferred fields.
  - Realistic use case: make a request to the framework and go through all the parts of building the response, that is, make a request to runserver. However, do not use database queries.
  - Regression tests: go through Trac and look for speed tests attached to tickets.

This is probably a lot of work, but I would like to give it a try. Comments?

- Anssi
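A minimal sketch of the first test, the simple template rendering case (standalone timing only: it assumes a Django of this era where bare settings.configure() is enough for the template engine, and djangobench would supply its own harness instead of the print):

    import time

    from django.conf import settings
    from django.template import Context, Template

    settings.configure()  # minimal settings for the template engine

    # Build a 1000 x 10 table using nested for loops, as described above.
    template = Template(
        "<table>{% for row in rows %}<tr>"
        "{% for cell in row %}<td>{{ cell }}</td>{% endfor %}"
        "</tr>{% endfor %}</table>")
    rows = [range(10) for _ in range(1000)]

    start = time.time()
    template.render(Context({'rows': rows}))
    print('render took %.3f seconds' % (time.time() - start))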
Re: I need a pony: Model translations (aka "my proposal")
When using a related-table based approach for the translations, Django could use pivoting to fetch all the translations with just one normal join. Some pseudo SQL, skipping constraints and indexes...

    create table a (
        id integer pkey
    );

    create table b (
        id integer pkey,
        a_id integer references a.id,
        lang varchar(2),
        val1 integer,
        val2 integer
        ...
    );

Now when fetching multiple languages, instead of outer joining the b table multiple times, one can fetch the results with just one normal join:

    select
        a.id,
        max(case when b.lang = 'fi' then b.val1 else null end) as val_fi,
        max(case when b.lang = 'en' then b.val1 else null end) as val_en,
        max(case when b.lang = 'de' then b.val1 else null end) as val_de,
        ...
    from a, b
    where a.id = b.a_id
    group by a.id

This results in some serious performance gains. I inserted 1 rows into a, and 7 translations for each into table b. I am using PostgreSQL version 8.3.5 on a MacBook, OS X 10.5.

    Fetching one translation from table b, all rows: ~110ms
    Fetching all translations from table b, using pivot: ~270ms; using join: ~550ms

OK, not too big of a performance gain. Let's try filtering:

    one translation from b, 1000 < id < 1300: ~60ms
    all using join, 1000 < id < 1300: ~350ms
    all using pivot, 1000 < id < 1300: ~70ms
    just for comparison, fetching only from a, 1000 < id < 1300: ~30ms

Fetching only from a is similar to having the translations directly in the base table.

The point of this post is that when fetching small amounts of data (less than 500 rows), fetching all of the translations at one time using pivoting costs almost the same as fetching just one translation. Using columns in the base table is of course the best performing solution. And I believe there is something fishy in my setup in the case of fetching all translations using the pivot...

Implementing something like this in Django is surely non-trivial. But the topic of this thread encouraged me to post this :)