Re: Improvements to better support implementing optimistic concurrency control

2011-08-09 Thread akaariai
On Aug 9, 1:17 am, Steven Cummings  wrote:
> I don't think we're talking about new or specific fields as part of the base
> implementation here. Just enhanced behavior around updates to:
>
> 1) Provide more information about the actual rows modified
> 2) Check preconditions with the actual DB stored values; and
> 3) Avoid firing post-update/delete signals if nothing was changed
>
> From there you could implement fields as you see fit for your app, e.g.,
> version=IntegerField() that you use in a precondition.

That would be useful, especially if it can be done without too much
code duplication.

I had another idea for optimistic locking: why not use the pre_save
signal for this? There is a proof of concept of how to do this at
https://github.com/akaariai/django_optimistic_lock

The idea is basically that if you add a OptimisticLockField to your
model, the pre_save (and pre_delete) signal will check that there have
been no concurrent modifications. That's it.

The code is really quickly written and downright ugly. It is a proof
of concept and nothing more. I have tested it quickly using PostgreSQL
and it seems to work for simple usage. However, it will probably eat
your data.
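
In outline, the check could look something like this (hypothetical
model and exception names, not the actual proof-of-concept code, which
also hooks pre_delete; note that plain check-then-save is not atomic
without row locking):

from django.db import models
from django.db.models.signals import pre_save

class ConcurrentModified(Exception):
    pass

class MyModel(models.Model):
    name = models.CharField(max_length=50)
    # Stand-in for the proposed OptimisticLockField.
    version = models.IntegerField(default=0)

def check_version(sender, instance, **kwargs):
    # On updates, verify nobody bumped the version since the row was loaded.
    if instance.pk is None:
        return
    stored = sender._default_manager.values_list(
        'version', flat=True).get(pk=instance.pk)
    if stored != instance.version:
        raise ConcurrentModified('concurrent modification of pk=%s' % instance.pk)
    instance.version += 1

pre_save.connect(check_version, sender=MyModel)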

 - Anssi




Re: Improvements to better support implementing optimistic concurrency control

2011-08-08 Thread akaariai
On Aug 8, 6:30 pm, Steven Cummings  wrote:
> For backward compatibility, there may be a Model sub-class that would leave
> Model alone altogether (this was suggested on the ticket). This seems fair
> since many seem to be getting by without better optimistic concurrency
> control from Django's ORM today.

Would the subclass-based method automatically append a field to the
model, or would one also need to create the field used for version
control? How does the subclass know which field to use?

Yet another option is models.OptimisticLockField(). If there is one
present in the model, and a save will result in an update, the save
method will check for conflicts and set the version to version + 1 if
there are no conflicts. There is some precedent for a somewhat similar
field, AutoField, which also changes how save behaves.
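
A hedged sketch of the conflict check such a save could run (the table
and column names are made up); filtering on the old version makes the
update atomic:

from django.db import connection

class Conflict(Exception):
    pass

def save_existing(instance):
    cursor = connection.cursor()
    # Zero affected rows means another transaction already bumped the
    # version, or deleted the row.
    cursor.execute(
        "UPDATE myapp_mymodel SET name = %s, version = version + 1 "
        "WHERE id = %s AND version = %s",
        [instance.name, instance.pk, instance.version])
    if cursor.rowcount != 1:
        raise Conflict('concurrent modification or delete detected')
    instance.version += 1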

I wonder what to do if the save does not result in an update and the
version is set to something other than 1. This could happen if another
user deleted the row and you are now saving it, which would result in
a reinsert. Should this also be an error?

 - Anssi




Re: Improvements to better support implementing optimistic concurrency control

2011-08-08 Thread akaariai
On Aug 8, 4:54 pm, Steven Cummings  wrote:
> Interesting feature I hadn't noticed in memcached. That does seem like it
> would do the trick where memcached is being used. I think the ability to
> control it in Django would generally still be desirable though, as that is
> where the data ultimately lives and I'd be hesitant to assume to control the
> DB's concurrency from memcached. Ideally it should be the other way around.

I assume the memcached implementation would be a version value stored
in memcached. Can you really trust that memcached keeps the version
value and doesn't discard it at will when it has been unused long
enough?

There are a couple of other things in model saving which could be
better handled. If composite primary keys are included in Django, one
would need the ability to update the primary key. If you have a model
with a (first_name, last_name) primary key, and you change the
first_name and save, the current implementation (and definition) of
model save() would insert a new row into the DB instead of doing an
update. Another thing that could be handled better is updating only
the changed fields.

I wonder how to implement these things in a backwards compatible way.
Maybe a method update(condition=None, only_fields=None) which returns
True if something was actually updated (or raises an exception if
nothing was updated). The method would use the old pk and the
condition (if given) in the WHERE clause. If only_fields=None, it
would update only the changed fields... Seems ugly, but I can't think
of anything better.
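
A sketch of what that could look like (hypothetical API;
get_changed_fields() is an assumed helper):

class NothingUpdated(Exception):
    pass

def update(self, condition=None, only_fields=None):
    # Update via the old pk plus an optional extra condition, e.g.
    # instance.update(condition=Q(version=instance.version)).
    qs = self.__class__._default_manager.filter(pk=self.pk)
    if condition is not None:
        qs = qs.filter(condition)
    fields = only_fields if only_fields is not None else self.get_changed_fields()
    values = dict((f, getattr(self, f)) for f in fields)
    if qs.update(**values) != 1:
        raise NothingUpdated('precondition failed or row deleted')
    return True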

 - Anssi




Re: Weekly check-in #1

2011-07-27 Thread akaariai
Just a quick note: It could be a good idea to have concrete_fields in
addition to virtual_fields in the Meta class. Concrete fields would be
the fields having a database column directly attached to them; the
rest would be virtual fields. The fields attribute would be these two
joined together. This way it should be relatively easy to implement
Model.__init__(), the django.db.models.query.QuerySet iterator() etc.
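
A rough sketch of the split (assumed attribute names):

class Options(object):
    def __init__(self):
        self.concrete_fields = []  # fields backed directly by a DB column
        self.virtual_fields = []   # e.g. composite fields with no own column

    @property
    def fields(self):
        # The existing attribute is simply the two lists joined together.
        return self.concrete_fields + self.virtual_fields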

I haven't looked much into this, so this might be a silly idea. I just
wanted to mention it so that this will no longer bother me on the last
days of my holiday :)

 - Anssi




Re: Weekly check-in #1

2011-07-27 Thread akaariai
On Jul 20, 4:18 pm, Michal Petrucha  wrote:
> The last week I've been looking into related fields and all that
> stuff. As it turns out, the issue is much more complex than I
> originally anticipated and at the moment I have some doubts whether
> this can be achieved as part of this GSoC.
>
> Basically, there are two approaches to the task:
>
> 1) Make ForeignKey and friends the field that manages all its rows
>    (i. e. it will return multiple columns when asked for them and so
>    on).
>
> 2) ForeignKey will be just a virtual field and it will create all
>    required auxiliary fields in the local model. ForeignKey would then
>    just handle the relationship stuff and leave everything else to the
>    aux fields.
>
> Some notes about both of them (I spent a few days trying to make
> either support at least some basic features and both seem bloody
> complex to me):
>
> 1) The changes required to ForeignKey itself can be kept to a minimum
>    but this would require drastic changes in many parts of the
>    internal code:
>
>     * a complete rewrite of the database creation code
>
>     * probably also a rewrite of the parts that match rows fetched
>       from the database to model fields
>
>     * many internal changes in query code to support multi-column
>       fields
>
>    Basically, the problem is that practically all parts of the code
>    rely heavily on the fact that each local field is backed by exactly
>    one database column (except for M2M which is often special-cased).
>
>    Now, all of this code would need to be rewritten to also work with
>    fields spanning several database columns. I got completely lost
>    somewhere around SQLCompiler.resolve_columns and
>    DatabaseOperations.convert_values, though this is all just the tip
>    of the iceberg that I encountered while looking into raw querysets;
>    there is much more to it for regular ones.
>
> 2) This would require an extensive refactor of related fields. I can
>    imagine making the aux field sit at ForeignKey's attname to manage
>    the actual value. This would give us creation and row matching
>    practically for free, but again, some internal query changes would
>    still be necessary (multi-column joins, for one).
>
>    The change could be made backwards-compatible if we made the
>    default aux field use the ForeignKey's db_column.
>
> Of course, it might be possible to make a half-assed hacky solution,
> i. e. ForeignKey would be a full-featured field in some cases and a
> virtual one otherwise but this would make a total mess out of
> everything and it would require a serious amount of nasty hacks and
> workarounds.
>
> At any rate, I don't feel competent to make the decision in this
> matter and I honestly believe there ought to be some discussion about
> which route we'll take.
>
> My personal favorite is the second option but I can imagine people not
> liking code that adds local fields automagically. On the other hand,
> there is already one such case (the id AutoField).
>
> Anyway, now is the time that I'd like to see some comments and
> opinions of other people who know the ORM code.

It might be a little late to comment on this, but here are some
opinions from somebody who knows something (but not much) about the
ORM. My first feeling is that from the django/db/models/sql/query.py
point of view it doesn't matter much which choice you make. Either
way, when filtering through related models, the lookup needs to be
resolved to a field, and if that field is a foreign key, the columns
and tables needed in the join need to be resolved. After that the code
doesn't care how the ForeignKey is represented in the model's Meta
class.

I would say that the second option is much better. There are a couple
of reasons I believe this to be so:
  - This way, when there is a concrete field, there would always be a
matching concrete database column.
  - If multi-column primary keys are virtual fields, then it makes
sense that the related field is also virtual, and represented in as
similar a way as possible. It could also make sense to try to make the
pk a virtual field.
  - Currently, foreign keys kind of create a new field in the model,
but not really: foo = ForeignKey(Foo) will create something that is
almost like a field (foo_id), but if I am not mistaken, this is not a
field in the model's Meta class. It would IMHO be cleaner if foo_id
were a concrete field, and foo a virtual field.
  - The most important thing is that a single field can be part of
multiple foreign keys, as in the sketch below. It seems really hard to
make this work using approach 1).
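
For illustration, a sketch of how approach 2 could look at the model
level (the fields= syntax and the models are made up), including one
concrete column taking part in two foreign keys:

from django.db import models

class Country(models.Model):
    code = models.CharField(max_length=2, primary_key=True)

class Customer(models.Model):
    # Assume a hypothetical composite pk on (customer_no, country_code).
    customer_no = models.IntegerField()
    country_code = models.CharField(max_length=2)

class Order(models.Model):
    # Concrete fields, each backed by exactly one column.
    customer_no = models.IntegerField()
    country_code = models.CharField(max_length=2)

    # Virtual relation fields; country_code takes part in both.
    customer = models.ForeignKey(Customer,
                                 fields=('customer_no', 'country_code'))
    country = models.ForeignKey(Country, fields=('country_code',))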

In my opinion the biggest worries about this approach are:
  - Can this be made totally backwards compatible?
  - Even if this is backwards compatible, this will still break a ton
of code. The model's Meta class (and especially its fields attribute)
isn't part of the documented API, but it is used heavily in many
projects. While multi-column primary keys will break the fields
attribute

Re: localization in python code (views, models etc)

2011-07-12 Thread akaariai
On Jul 12, 12:28 pm, Jannis Leidel  wrote:
> Yeah, django.utils.formats.localize is the main function to localize
> a value using the format localization engine from Python. The missing
> documentation is a bug, IMO.

Just a minor correction: localize does not use the localization engine
from Python, it uses Django's built-in localization. Python's
localization can't be trusted to be thread-safe (although on some
platforms it probably is). This is not a big point, except that the
built-in localization engine is slower than Python's: one is written
in Python, the other uses system libraries written in C.

 - Anssi




Re: Weekly check-in (this should be #5, right...?)

2011-07-07 Thread akaariai
On Jul 6, 1:11 pm, Michal Petrucha  wrote:

> Hmm, this is exactly what I had in mind when thinking about this
> problem. I see that even despite keeping the changes to a minimum, the
> patch looks quite big. I'll definitely review this once I start
> working on relationship fields.

No wonder the patch is quite big. I accidentally branched from the
conditional aggregation branch, so it has all the things in that patch
included. And that patch is much larger than the multicolumn_join
patch.

I pushed a new branch to github
(https://github.com/akaariai/django/tree/multicolumn_join), this time
the patch is much smaller: 4 files changed, 77 insertions(+), 63
deletions(-).

I will destroy the composite_join branch. I didn't like the
composite_join name anyway; multicolumn_join is a much better
name... :)

 - Anssi




Re: Weekly check-in (this should be #5, right...?)

2011-07-05 Thread akaariai
On Jul 6, 1:39 am, akaariai  wrote:
> Having said all this, for this project "extend the connection tuple"
> approach seems to be the only sane choice.

I implemented this in my github branch 
https://github.com/akaariai/django/tree/composite_join

With this you can do:
a = Article.objects.all()
a.query.get_initial_alias()
a.query.join(('basic_article', 'foo', ('pk_part1', 'pk_part2'),
('foo_pk_part1', 'foo_pk_part2')), promote=True)
print a.query
SELECT "basic_article"."id", "basic_article"."headline",
"basic_article"."pub_date" FROM "basic_article" LEFT OUTER JOIN "foo"
ON ("basic_article"."pk_part1" = "foo"."foo_pk_part1" AND
"basic_article"."pk_part2" = "foo"."foo_pk_part2")

The connection parameter is now a tuple (lhs, table, (lhs_col1,
lhs_col2, ...), (col1, col2, ...)). This seemed to be the way of least
pain.

All current tests pass on sqlite3. There will probably be problems
when more complex queries are tried with multi-column join conditions.
I hope this gives at least an idea of how to approach the multi-column
outer joins problem.

 - Anssi




Re: Weekly check-in (this should be #5, right...?)

2011-07-05 Thread akaariai
On Jun 27, 4:22 am, Michal Petrucha  wrote:
> some visible progress on my project at long last. I spent most of the
> last week digging deep inside the ORM's entrails to make composite
> field lookups possible and finally it looks promising.
>
> While working on this I found out the extra_filters approach I
> intended to use was a dead end (which reminded me of what Russ wrote
> in response to my proposal: "I'm almost completely certain you'll
> find some gremlin lurking underneath some dark corner of the code").

I did a glance-over of your github branch. I was especially looking
for how you will handle LEFT OUTER JOINs involving composite primary
keys / foreign keys. If I am not missing something, I think this
hasn't been done yet. I have been thinking about this issue myself,
and I thought it would be good to share what I have found out.

The problematic part for multi-column join conditions is in
django.db.models.sql.query:

def join(self, connection, ...):
    """
    Returns an alias for the join in 'connection', either reusing an
    existing alias for that join or creating a new one. 'connection'
    is a tuple (lhs, table, lhs_col, col) where 'lhs' is either an
    existing table alias or a table name. The join corresponds to the
    SQL equivalent of::

        lhs.lhs_col = table.col
    """

Obviously this cannot work for creating multi-column joins.

The connection information is stored in alias_map, join_map and
rev_join_map. In particular, alias_map stores (table, alias,
join_type, lhs, lhs_col, col, nullable). Currently the contents of the
alias_map are turned into SQL (sql/compiler.py, get_from_clause()) as:

result.append('%s %s%s ON (%s.%s = %s.%s)'
              % (join_type, qn(name), alias_str, qn(lhs),
                 qn2(lhs_col), qn(alias), qn2(col)))

The simplest way to extend this to more columns would probably be the
following:
 - connection is defined as (lhs, table, lhs_col1, col1, lhs_col2,
col2, ...)
 - the alias_map format needs to change a bit so that the extra
columns can be stored in there. One could store the extra columns
after nullable; cleaner would be to have the columns in one tuple:
(table, alias, join_type, lhs, (cols), nullable)
 - a limited number of places needs to be fixed, most notably
get_from_clause() in compiler.py, as sketched below

The downside of the above is that it does not support any join
conditions other than ones involving two tables and a list of ANDed
columns. For composite fields this is enough.
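
A sketch of what get_from_clause() could then emit, reusing the local
names from the snippet above and assuming cols is a tuple of
(lhs_col, col) pairs:

# Build "lhs.a = alias.a AND lhs.b = alias.b ..." from the column
# pairs stored in the alias_map entry.
on_clause = ' AND '.join(
    '%s.%s = %s.%s' % (qn(lhs), qn2(lhs_col), qn(alias), qn2(col))
    for lhs_col, col in cols)
result.append('%s %s%s ON (%s)'
              % (join_type, qn(name), alias_str, on_clause))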

For future usage it would be nice if one could pass in Where nodes as
the connection. This would allow for arbitrary join conditions. The
Where node knows how to turn itself into SQL, how to relabel aliases
and so on. This approach has some problems, however:
  - How to generate the Where node?
  - How to match existing joins to new joins? Currently this is done
by checking that the connection four-tuple is equivalent to the
existing join's four-tuple. I don't think Where nodes know how to
check equivalence with another node. And even if Where nodes knew how
to do that, all the leaf nodes would also need to know how to do it.
  - Performance issues: cloning a Where node is more expensive than
cloning a tuple. Construction, equivalence checking and other
operations are also somewhat more expensive than with tuples.
  - Overkill for composite fields

Of course, the approaches could be combined: you pass in the join
condition as a tuple, and you can pass extra_filters (default None) as
a Where node. This would keep the normal case efficient but allow for
more complex join conditions if really needed. A join having
extra_filters could not be reused, except when explicitly stated.

Having said all this, for this project "extend the connection tuple"
approach seems to be the only sane choice.

The work you have done looks very promising. I hope this post has been
at least somewhat useful to you.

 - Anssi




Django ORM enhancements

2011-07-04 Thread akaariai
I have implemented proof of concept versions of conditional
aggregation, F-lookups in aggregates and annotating fields to a model
(qs.field_annotate(age_x2=F('age')*2), note: no aggregation here). See
ticket #11305 for more details.

I would also hope to implement a patch which would allow to annotate
reverse related models. The idea would be most useful for fetching
translation models.

Given models:
Article(
id=IntegerField()
default_lang=CharField()
)
ArticleTranslation(
article=ForeignKey(Article, related_name='translations')
name=TextField()
abstract=TextField()
content=TextField()
lang=CharField()
class Meta:
unique_together = ('article', 'lang')
)

And queryset:
Article.objects.annotate(
    translation_def=ModelAnnotation(
        'translations',
        only=Q(translations__lang=F('default_lang'))),
    translation_fi=ModelAnnotation(
        'translations',
        only=Q(translations__lang='fi')),
)

The above query would generate something like this:

select article.id, article.default_lang, t1.name, ..., t3.name
  from article
  left join article_translation t1
    on article.id = t1.article_id and t1.lang = 'fi'
  left join article_translation t3
    on article.id = t3.article_id and t3.lang = article.default_lang

And the objects returned would have (possibly None-valued)
translation_fi and translation_def instances attached to them.
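
Hypothetical usage, assuming each annotation attaches a model instance
(or None when the LEFT JOIN finds no row):

for article in qs:
    # Prefer the Finnish translation, fall back to the default language.
    translation = article.translation_fi or article.translation_def
    if translation is not None:
        print translation.name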

These features require a lot of work before anything commit-quality is
ready. I would ask the community to consider these ideas before I do
too much work. These patches would also require some attention from
somebody with more ORM knowledge than I have. The ModelAnnotation idea
is probably too hard for me to implement to even near commit quality.

These features would naturally make the ORM more powerful, but I see
some objections to these features:
 1. They will make the already complex ORM even more complex. This
will result in new bugs and it will be harder to add new features to
the ORM.
 2. Django ORM has the philosophy of "80/20", meaning that the ORM
should make it possible to run 80% of your queries; the rest can be
done using raw SQL. Are these features beyond the 80% threshold?
 3. As the queries become more complex, it is likely that the ORM will
not be able to generate efficient SQL. If this is the case raw SQL is
needed. Back to square 1.
 4. The ORM is nearing the point where the API is too complex. Instead
of writing complicated SQL, you will be writing complicated ORM
queries.

On the other hand, combined with custom F() expressions, the need
for .extra() would be smaller and maybe it could even be deprecated in
the future.

 - Anssi




Re: Idea for i18n fields

2011-07-02 Thread akaariai
On Jul 2, 12:59 am, Ric  wrote:
> Hi there,
>
> i have got a simple approach to make all django fields with a full
> i18n support
>
> the django.models.fields.Field class can be subclassed like this
>
> from django.db import models
> from django.utils.translation import get_language
>
> class i18nField(models.Field):
>
>     def __init__(self, i18n=False, *args, **kwargs):
>         self.i18n = i18n
>         models.Field.__init__(self, *args, **kwargs)
>
>     def db_column_get(self):
>         if not self.i18n:
>             return self._db_column or self.name
>         return "%s_%s" % (
>             self._db_column or self.name,
>             get_language().lower()
>             )
>
>     def db_column_set(self, value):
>         self._db_column = value
>
>     def _column_set(self, value):
>         pass
>
>     db_column = property(db_column_get, db_column_set)
>     column = property(db_column_get, _column_set)
>
> then you can declare all other subfields as usual
>
> this work in that way: you need a separate db column for every
> language installed.
> a field called "name" needs to create ("name_%s" % code for code in
> languages) columns
>
> so the framework automatically select the right column in every query.
>
> problems:
>  - serializing objects: you need to serialize all fields, not just
> the current language
>  - many-to-many fields: to work they need an extra column in
> every through table, a column to store the language code.
>  - during sync db you need to create a column for every language
> installed
>
> after two years on an i18n django site, i found this simple solution.
> there are some small problems, which can be fixed if we put an i18n
> option when you init a field, and solve some issues during the syncdb
> command and serialization of objects.
>
> for me it is a very simple approach:
> it automatically filters, sorts and outputs the right queryset for
> your language, and when you access a field you get the current
> language; it works for every field, ForeignKeys too.
>
>  and it works in admin (with no changes at all)
>
> let me know what you think.

From my point of view there are a couple of problems with this
approach:
  - The idea of putting a column for every translated language for
every field directly into the base table is not feasible for many use
cases. If you have 10 translated fields in your model, and you have 10
languages, that is already 100 columns. If you happen to need an index
on one field, you need 10 indexes. For example in the EU you might
need the possibility of translating the model into all official EU
languages. That is over 20 languages you need to support right there.
For the EU use case it is better to have a translations table
containing the translations for one language in one row.
  - This approach makes it hard to fetch all the translations of the
model. How does this work for NOT NULL fields?
  - There are ways to have proper fields for every translation in the
DB table. It is better to have a 'name' field which fetches the
default according to the currently active language, and then the
translated fields ('name_fi', 'name_en', ...) if you need those; a
sketch follows below. See django-transmeta for one such solution
(http://code.google.com/p/django-transmeta/).
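
A minimal sketch of that layout (illustrative names; transmeta
generates something similar automatically):

from django.db import models
from django.utils.translation import get_language

class Page(models.Model):
    name_en = models.CharField(max_length=100)
    name_fi = models.CharField(max_length=100, blank=True)

    @property
    def name(self):
        # Pick the column for the active language, falling back to English.
        value = getattr(self, 'name_%s' % get_language()[:2], u'')
        return value or self.name_en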

For some use cases your solution can definitely be handy. But my
feeling is that this does not belong in core (not that I have any
power in deciding that). The biggest reason is that the more I have
worked in different multilingual projects, the more certain I am that
there is no single solution for model translations. At least no single
solution for the database layout part of it.

 - Anssi




Re: Conditional aggregations.

2011-06-29 Thread akaariai
On Jun 28, 5:46 pm, Javier Guerra Giraldez  wrote:
> i might be totally wrong (wouldn't be first time..) but i've found
> myself having to adapt to local dialects almost every time i see some
> SQL inside a function, especially on mysql and sqlite.   maybe it's
> because of the bad quality of code i tend to see (typically
> originating from hand-coded mssql accesses deep within an excel
> sheet), but seeing CASE also rings my "i'll need an extra week just
> for this" alarm.
>

I really do hope that the CASE WHEN construction can be used in all
supported databases. I have high hopes that it can be used, because
the CASE WHEN construction is one of the most standard constructions
in SQL, and because I have tested it on MySQL 5.0, PostgreSQL 8.4,
SQLite3 and Oracle 10g.

I also attached a proof of concept patch to #11305. It is somewhat
ugly, but I hope it is a good start. It should support aggregate() and
annotate() for all the standard aggregates, and F() lookups should be
usable with it. The restriction is that the Q-object used in the only
condition cannot add additional joins to the query. The patch is just
a proof of concept, and it is safe to assume it will fail under more
complicated queries.

 - Anssi




Re: Thoughts on solution to forward references in MySQL (#3615)

2011-06-28 Thread akaariai
On Jun 28, 12:24 am, "Jim D."  wrote:
> I spent some time last week and over the weekend nailing down a
> solution for https://code.djangoproject.com/ticket/3615. This is the
> ticket about allowing forward references when loading data on the
> MySQL InnoDB backend. My patch implements the proposed change
> (disabling foreign key checks when the data is loaded) as well as a
> straightforward SQL SELECT check for integrity after the data is
> loaded, which if I understand it is the missing piece that has
> prevented this ticket from moving forward for the last 4 years...

This is probably not concurrency-safe if the tables are not locked for
the duration of the fixture loading. I don't know if this will ever be
used in situations where concurrency is an issue. Test fixture loading
is certainly not a problematic use-case.

 - Anssi




Re: Conditional aggregations.

2011-06-28 Thread akaariai
On Jun 28, 5:18 pm, Javier Guerra Giraldez  wrote:
> On Tue, Jun 28, 2011 at 8:41 AM, akaariai  wrote:
> > This should translate to the following SQL:
> > SELECT sum(case when house.price > 41000 and house.price < 43000 then
> > 1 else 0 end) as expensive_house,
> >       sum(case when house.price > 43000 then 1 else 0 end) as
> > really_expensive_house, ...
> >  FROM house
> >  JOIN something on something.id = house.something_id
>
> this looks quite non-portable

How? The CASE statement is specified in the SQL standard, and it is
implemented in every database I have used.

 - Anssi




Re: Conditional aggregations.

2011-06-28 Thread akaariai
On Jun 27, 4:54 pm, Russell Keith-Magee 
wrote:
> > queryset.aggregate(
> >    expensive_house=Count(house__price,
> > only=(Q(house__price__gt=41000), Q(house__price__lt=43000))),
> >    ...
> >    )
>
> Ok, so that's you're syntax proposal. Now show me the SQL that this
> translates into. In particular, keep in mind that you're doing joins
> in your Q clauses -- how does that get rolled out into SQL?

This should translate to the following SQL:

SELECT sum(case when house.price > 41000 and house.price < 43000
           then 1 else 0 end) as expensive_house,
       sum(case when house.price > 43000 then 1 else 0 end)
           as really_expensive_house, ...
  FROM house
  JOIN something ON something.id = house.something_id
-- The given example queryset is clearly missing that something :)

I think it might be good to restrict the only clauses to the fields of
the same model the aggregated field is in. This way there is already a
usable join generated by the aggregate. The only clause affects the
"case when" structure only.

The only clauses should never restrict the queryset. It gets really
complicated to do that restriction when you have multiple aggregates
using only; you can use filter to restrict the queryset instead. And
you probably don't want the filter there in any case: in the above
example, you would not get any results for the rows aggregating to 0.

If the only clauses never restrict the queryset, then translating
conditional aggregates to SQL isn't really _that_ complicated. Where
you normally generate "avg(table.column) as column_avg", you now
generate "avg(case when table.some_column matches the only condition
then table.column else null end) as column_avg".

 - Anssi









Re: Removal of DictCursor from raw query.. why??

2011-06-17 Thread akaariai
On Jun 17, 8:02 pm, Ian Kelly  wrote:
> The thing is, this is a DB API snippet, not a Django snippet
> specifically.  If Django were a DB API toolbox, then it might make
> sense to include it in some form or other.  But it's not, so in the
> interest of keeping things relatively tidy I'm a -0 on this.

It is often said here that the Django ORM is designed to do 80% of the
stuff; the rest can be done using raw SQL. So, giving users pointers
on how to perform that raw SQL as painlessly as possible is something
the Django documentation should do.

- Anssi




Re: Removal of DictCursor from raw query.. why??

2011-06-17 Thread akaariai
On Jun 17, 2:54 pm, "Cal Leeming [Simplicity Media Ltd]"
 wrote:
> Because I feel this is just something that should work (or be available) out
> of the box. There are plenty of other places where Django docs has included
> code snippets to give the user a heads up, and I think this is the perfect
> case for one.
>
> If anyone has any objections to this, please let me know, if not ill put in
> a ticket for consideration.

I just wanted to say I support having something documented about this.
Without documentation new users will most likely use index-based
cursors. I know I used to do that.

The problem with no documentation is not so much that it would be hard
to find a snippet with a dict cursor implementation. It is more that
new users don't know that using index-based cursors might not be the
best of ideas.
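
For reference, the kind of snippet that could be documented (one
common way to wrap a DB API cursor; the column names come from
cursor.description):

def dictfetchall(cursor):
    # Return all rows from a cursor as a list of dicts keyed by column name.
    desc = cursor.description
    return [
        dict(zip([col[0] for col in desc], row))
        for row in cursor.fetchall()
    ]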

 - Anssi




Re: RFC: Composite fields API

2011-05-17 Thread akaariai
On May 17, 5:32 pm, Michal Petrucha  wrote:
> Proper subquery support is something that can be addressed once the
> rest of the implementation is stable.

To me the plan looks very reasonable (both disallowing subqueries and
converting to disjunction form), unless there is some part of the
internals which expects pk__in=qs to work. In that case it could just
be converted to something like:

if pk is a multipart pk:
    qs = list(qs.values_list('pk_part1', 'pk_part2'))
    # continue as now

In any case, in my opinion pushing as much of this work as possible to
later patches is the way to go. The only question is how much can be
pushed to later patches. I do not know the answer to that,
unfortunately...

 - Anssi




Re: RFC: Composite fields API

2011-05-17 Thread akaariai
On May 12, 2:41 pm, Michal Petrucha  wrote:
> Due to the nature of this field type, other lookup filters than
> ``exact`` and ``in`` would have unclear semantics and won't be
> supported. The original plan was to also exclude support for ``in``
> but as it turns out, ``in`` is used in several places under the
> assumption that primary keys support it, for example DeleteQuery
> or UpdateQuery. Therefore both filters will be implemented.

I wonder how to implement __in lookups in SQLite3. SQLite3 doesn't
support where (col1, col2) in ((val3, val4), (val5, val6)), but other
DBs do (at least MySQL, Oracle and PostgreSQL). I do not know what
would be the best way to write something equivalent in SQLite3. The
obvious choice is to rewrite it as an OR lookup (as mentioned in the
full proposal). Maybe write it as an OR lookup for every DB for the
initial patch; later on this can be improved to have per-database
handling. __in lookups with subselects are a harder problem. Those
would need to be rewritten as joined subselects with a distinct
clause. [1] NOT IN lookups could be harder still due to weird NULL
handling (1 NOT IN (NULL) -> UNKNOWN). [2]

I hope there will be an easy solution to this problem, as this feature
is something that would be really, really valuable for Django (no more
telling DBAs: by the way, no composite foreign keys...). One simple
solution would be to disallow __in lookups with subselects (or run the
subselects separately) and use OR lookups when given a list of values,
as sketched below. This should be relatively easy to implement and
could be improved later on.
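
For example, a composite __in against a list of value pairs could be
rewritten roughly like this (the model and field names are made up):

import operator
from functools import reduce
from django.db.models import Q

pairs = [(1, 2), (3, 4)]
# (pk_part1=1 AND pk_part2=2) OR (pk_part1=3 AND pk_part2=4)
# -- a plain disjunction, so it works on SQLite3 as well.
condition = reduce(operator.or_,
                   (Q(pk_part1=a, pk_part2=b) for a, b in pairs))
qs = SomeModel.objects.filter(condition)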

 - Anssi

[1] 
http://asktom.oracle.com/pls/asktom/f?p=100:11:0P11_QUESTION_ID:953229842074
[2] 
http://asktom.oracle.com/pls/asktom/f?p=100:11:1089369944141559P11_QUESTION_ID:442029737684




Re: Resetting settings under test

2011-05-13 Thread akaariai
On May 13, 3:41 pm, Jeremy Dunck  wrote:
> In general, the TestCase does a good job of cleaning up the
> environment (resetting the DB, etc.), but I've run across an edge case
> that might be worth putting upstream.
>
> I have a large codebase running multi-tenant -- lots of sites per WSGI
> process, running process-per-request, and it serves those sites by
> mutating settings per request (via middleware), including poking an
> urlconf onto the request.
>
> Under test, this leaves these fairly weird, since the settings
> mutations can affect other test cases.
>
> I realize that in general, settings are intended to be immutable,
> but... what do you think of TestCase tearDown restoring the settings
> as they were before the test runs?

The tearDown should also handle reset of cached settings. There are a
few, at least translations __init__.py and localization have caches
that need resetting. It would be good if settings had a method "reset"
which would restore the original settings and clear the caches. A
complete API of change_setting, restore_setting and reset_all would be
even better.
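
Sketched usage of such an API (hypothetical functions):

def setUp(self):
    # change_setting would return the old value (or a marker for a
    # missing setting) and clear any caches derived from the setting.
    self.old_use_l10n = settings.change_setting('USE_L10N', True)

def tearDown(self):
    settings.restore_setting('USE_L10N', self.old_use_l10n)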

 - Anssi




Re: Accidental logging disabling

2011-04-15 Thread akaariai
On Apr 15, 7:34 am, Ivan Sagalaev  wrote:
>      import logging
>
>      import settings
>      from django.core.management import setup_environ
>      setup_environ(settings)
>
>      from django.test.client import RequestFactory
>
>      factory = RequestFactory()
>      factory.get('/')
>      logger = logging.getLogger('django.request')
>      logger.error('Error')
>
> The message doesn't show up in the console. Here's what's happening here:
>
> 1. setup_environ creates a lazy setting object
>
> 2. importing RequestFactory triggers import of django.core.handlers.wsgi
> that creates a logger 'django.request'
>
> 3. after some time settings object is accessed for the first time and
> gets initialized

I have been using setup_environ in my projects, and the lazy
initialization can cause some weird problems, for example if you do
manual timing using:

start = datetime.now()
settings.DEBUG  # first access triggers initialization
print 'Used %s' % (datetime.now() - start)

You might get weird results, as accessing settings can change your
timezone. Would it be wise for setup_environ() to access the settings
so that they are no longer lazy? Or does this cause other problems?

 - Anssi




Re: Read-only forms and DataBrowse

2011-04-07 Thread akaariai
About the read-only forms part of the proposal: read-only forms will
be easy to implement if the template-based widget rendering idea is
included in core.

For example for the SelectMultiple widget the base template is
probably something like this:

<select multiple="multiple">
{% for choice in choices %}
    {% if choice.selected %}
    <option selected="selected">{{ choice }}</option>
    {% else %}
    <option>{{ choice }}</option>
    {% endif %}
{% endfor %}
</select>

If you want a read-only widget, just give it a custom template that
renders only the selected choices:

<ul>
{% for choice in choices %}
{% if choice.selected %}
<li>{{ choice }}</li>
{% endif %}
{% endfor %}
</ul>

Now, that was easy :) Combined with template-based form rendering, it
would be relatively easy to implement read-only forms. Another reason
why template-based form/widget rendering would be nice to have in
core.

By the way, it would be nice to see how the template-based rendering
compares to Python-based rendering in performance terms when rendering
a larger list of choices. But on the other hand, if you have a large
Select/SelectMultiple list there are bound to be some usability
issues, so maybe the performance isn't that important... Sorry for
bringing performance up here again, but I am a speed freak :)

For what it's worth, I implemented somewhat-working read-only fields /
widgets some time ago. I have used them a little on some sites, and
the ability to render any model / form easily in display-only mode is
a really nice feature. The work can be found at [1], but it is
incomplete and based on an old version of Django (the last commit is
from August 23, 2010).

[1] https://github.com/akaariai/django/tree/ticket10427

 - Anssi





Re: Django Template Compilation rev.2

2011-03-30 Thread akaariai

On Mar 30, 6:18 am, xtrqt  wrote:
>     def templ(context, divisibleby=divisibleby):
>         my_list = context.get("my_list")
>         _loop_len = len(my_list)
>         result = []
>         for i, item in enumerate(my_list):
>             forloop = {
>                 "counter0": i,
>                 "counter": i + 1,
>                 "revcounter": _loop_len - i,
>                 "revcounter0": _loop_len - i - 1,
>                 "first": i == 0,
>                 "last": (i == _loop_len - 1),
>             }
>             if divisibleby(item, 2) == 0:
>                 result.append(force_unicode(item))
>         return "".join(result)
> For comparison here is the performance of these two::
>     >>> %timeit t.render(Context({"my_list": range(1000)}))
>     10 loops, best of 3: 38.2 ms per loop
>     >>> %timeit templ(Context({"my_list": range(1000)}))
>     100 loops, best of 3: 3.63 ms per loop
> That's a 10-fold improvement!

I did a little test by adding localize(i) in there. On my computer the
time went to around 25ms. For datetimes the time needed is somewhere
around 100ms. If you could inline the localize(i) call for the integer
case you would get back to around 4ms, as it doesn't actually do
anything other than return force_unicode(i)... So, when designing
template compilation it is essential to see how the localization stuff
could be made faster, or much of the benefit will be lost. It seems
that at least for this test case localization uses over 50% of the
time, so there would be a bigger gain in making localization faster
than in making compiled templates.

 - Anssi




Re: Template Compilation

2011-03-28 Thread akaariai
On Mar 27, 5:48 am, "G.Boutsioukis"  wrote:
> Hi, I'm thinking about submitting a proposal for template compilation
> and I'm posting this as a request for more info.
>
> In particular, I remember this project being discussed last year and I
> pretty much assumed that Alex Gaynor's proposal would have been
> accepted(I see he's listed as a mentor this year BTW). What was the
> rationale behind the decision to reject it? Unless, of course, it was
> made on his part.
>
> In any case, any other comment around compatibility, speed or other
> concerns would also be helpful.

In the other-concerns department: for many workloads template
compilation itself won't be that big of a benefit. There is a
relatively big speed bottleneck in the L10N-related stuff. If you are
rendering a big table of integers, if I recall correctly about 30-40%
of the time is used in localizing the representation of those
integers. If you are rendering floats it will be more, and for
dates/datetimes it will probably be 90%+. So, it would be important to
see how to reduce the impact of L10N when trying to make template
rendering faster.

The L10N rendering was made faster in tickets #14290 and #14306. There
is some low-hanging fruit in #14297 which never got applied.

I don't mean to say that this is a reason not to implement template
compilation, just to say that for some workloads the gain of template
compilation is not going to be _that_ big. And in the case of integer
rendering, it would be reasonable to still expect a speedup of nearly
50%.

 - Anssi




Re: Composite primary keys

2011-03-21 Thread akaariai
On Mar 21, 1:20 pm, Michal Petrucha  wrote:
> > My suggestion is to create an Index type that can be included in a
> > class just like a field can.  The example we've been using would
> > then look like:
>
> > class Foo(Model):
> >    x = models.FloatField()
> >    y = models.FloatField()
> >    a = models.ForeignKey(A)
> >    b = models.ForeignKey(B)
>
> >    coords = models.CompositeIndex((x, y))
> >    pair = models.CompositeIndex((a, b), primary_key=True)
>
> > We could have FieldIndex (the equivalent of the current
> > db_index=True), CompositeIndex, and RawIndex, for things like
> > expression indexes and other things that can be specified just as a
> > raw SQL string.
>
> > I think this is a much better contract to offer in the API than one
> > based on field which would have to throw exceptions left and right
> > for most of the common field operations.
>
> I don't see how ForeignKeys would be possible this way.
>

In much the same way:

class FooBar(Model):
    a = models.ForeignKey(A)
    b = models.ForeignKey(B)
    pair = models.ForeignKey(Foo, fields=(a, b))

Note that this is very close to what SQL does. If you have a composite
unique index or composite foreign key you define the fields and then
the index / foreign key. Though I don't know how much value that
argument has in this discussion.

You could add some DRY and allow a shortcut:

class FooBar(Model):
    pair = models.ForeignKey(Foo)
    # a and b are created automatically.

Now, to make things work consistently, pair should be a field. But on
the other hand, when using a ModelForm, pair should probably not be a
field of that form. This is clearer in an example with a (city, state,
country) primary key: these should clearly be separate fields in a
form.

In my opinion, whether the composite structures are called fields or
something else isn't that important. There are cases where composite
structures behave like a field and cases where they do not. The main
problem is how the composite structures should behave in ModelForms
and serialization, whether they should be assignable, how they relate
to the model __init__ method, whether they should appear in model
field iterators, how they are used in QuerySets and so on. When these
questions are answered it is probably easier to decide if the
composite structures should be called fields or something else.

 - Anssi




Re: Expensive queryset cloning

2011-03-16 Thread akaariai
On Mar 17, 3:11 am, Alexander Schepanovski  wrote:
> Can you find that patch and post somewhere?
> If not still thanks for this idea.

Unfortunately, no. Gone with my old laptop.

 - Anssi




Re: Expensive queryset cloning

2011-03-16 Thread akaariai


On Mar 16, 10:14 am, Thomas Guettler  wrote:
> Hi Alexander,
>
> I have seen this in my app, too. It still runs fast enough. But
> I guess the django code could be optimized.
>

I had a patch for this problem somewhere, but can't find it now.
Basically it added an inplace() method to the queryset, and after that
no cloning of the inner query class would happen. The outer QuerySet
would still be cloned, but that is relatively cheap. This was to
prevent accidental use of an old reference to the QuerySet.* It was
done by keeping a "usage" count both in the inner query instance and
in the outer QuerySet instance.

 - Anssi

(*)
 qs.filter(pk=1)
 qs.filter(foo=bar) would be an error
but
 qs = qs.filter(pk=1)
 qs.filter(foo=bar) would be ok.




Re: the new SELECT DISTINCT query in 1.3 rc1, and how to turn it off

2011-03-05 Thread akaariai


On Mar 5, 7:29 am, Karen Tracey  wrote:
> It's probably best if you open a ticket in trac 
> (http://code.djangoproject.com/newticket) for this. I can't think offhand how
> to solve both the problem that changeset fixed and the one you are
> encountering
>

If the Django ORM were able to perform the query select distinct
on (primary_key) id, val1, ... from table order by primary_key, this
would solve the problem. But making the ORM capable of that will
probably take some time...

 - Anssi




Forms: display_value for BoundFields and widgets

2010-11-30 Thread akaariai
Hello,

This is a proposal related to ticket 10427: Bound field needs an easy
way to get form value [1].

Ticket #10427 is already closed as fixed, but the fix is only partial.
It is now possible to get the value from a BoundField, but the value
might be a DB primary key or something else not usable for display to
the user.

To fix this issue, so that it is possible to render a form in a show-
only-data state, I propose two additions:

1. Form widgets should receive an additional method,
display_value(name, value, attrs=None). This method returns the given
value in a human-preferred format. For TextInput this would be simply:
<span>%(value)s</span>
For more complex fields, like SelectMultiple, display_value would know
how to render the values correctly: if the values are (1, 3) the
output could be something like:
<ul><li>text_representing_choice_1</li>
<li>text_representing_choice_3</li></ul>

2. BoundFields would get an additional property display_value, which
would return the output of the BoundField's widget's display_value.
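
A minimal sketch of the widget method (hypothetical subclass; the
wrapping element is an assumption, see point 4 below):

from django.forms.widgets import TextInput
from django.utils.html import escape
from django.utils.safestring import mark_safe

class DisplayTextInput(TextInput):
    def display_value(self, name, value, attrs=None):
        # Render the value read-only; widgets with choices would map
        # stored values to their human-readable representations here.
        if value is None:
            value = u''
        return mark_safe(u'<span>%s</span>' % escape(value))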

I am already using a version of Django patched to do this in
production. My own feeling is that this is really useful. So, now I am
asking:
1) Is this something that would be included in Django core (#10427
suggests so)?
2) Does the approach seem valid?
3) What to do with FileFields?
4) Is the wrapping in a <span> tag sensible, or should it be something
else?

The code for my own version can be found at [2], but that code is not
core-ready, and it is a bit outdated, especially the tests part of it.

A few pictures can never hurt my case, so here are some pictures of
actual usage of display_value. Sorry, the site is implemented in
Finnish, but the actual values are not the point... The first one is a
form rendered simply by using a for loop with field.label and
field.display_value. This is used for viewing data; the link "Muokkaa"
in the upper left corner is an edit link.
http://users.tkk.fi/akaariai/Screenshot-7.png

The second picture is what you get when clicking the edit link. This
is rendered with {{ form.as_table }}.
http://users.tkk.fi/akaariai/Screenshot-8.png

It is notable that select-multiple fields are handled correctly and
that the "Omistaja" field is actually an autocomplete field which has
a totally different widget than the usual select fields. Yet it is
really easy to define a display_value method for that widget, too.

In general, this proposal will make it easy to build previews for
forms, to implement a list - view - edit workflow for websites and in
general to show the contents of any model - just define a ModelForm
and display it.

I hope to get some feedback before I start to write a patch for
current Django trunk. Most of the work will be writing tests for this,
so I would like to get point 4) above (wrapping in <span>) correct on
the first try.

- Anssi

[1] http://code.djangoproject.com/ticket/10427
[2] https://github.com/akaariai/django/tree/ticket10427




Re: Changing settings per test

2010-11-05 Thread akaariai
On Nov 5, 2:18 am, Alex Gaynor  wrote:
> def setUp(self):
>     self.old_SETTING = getattr(settings, "SETTING", _missing)
>
> def tearDown(self):
>     if self.old_SETTING is _missing:
>         del settings.SETTING
>     else:
>         settings.SETTING = self.old_SETTING

How about introducing a new function in settings:
change_setting(name, new_value)
which returns the old setting, or a marker when there is nothing
configured for that value. This function would clear the caches if the
setting is cached somewhere, and also handle the env updates for the
timezone or other settings needing that. Clearing caches is a problem
that is not handled well at all in the Django test suite.

You could then revert the setting using the same function, which would
again handle clearing the caches and the env handling. If the passed-in
new_value is the marker for nothing, the setting would be deleted.

Using these functions, setting changes would be done like this:

def setUp(self):
    self.old_SETTING = settings.change_setting("SETTING", new_val)
    # Or, to just store the old setting and change it later in the
    # actual tests (fetches the old setting, or the marker for a
    # missing setting):
    # self.old_SETTING = settings.change_setting("SETTING")

def tearDown(self):
    settings.change_setting("SETTING", self.old_SETTING)
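
A rough sketch of how change_setting could work (the marker objects and
the setting_changed signal are made-up names here, just to show the
idea):

_missing = object()   # returned when the setting was not configured
_unset = object()     # internal default: no new value supplied

def change_setting(name, new_value=_unset):
    old = getattr(settings, name, _missing)
    if new_value is _unset:
        return old                    # fetch-only call
    if new_value is _missing:
        delattr(settings, name)       # revert to "not configured"
    else:
        setattr(settings, name, new_value)
    # Let interested parties (cached translation provider, env/timezone
    # handling, apps outside core) react and flush their caches.
    setting_changed.send(sender=None, setting=name, value=new_value)
    return old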

And you would not need to care whether the setting was cached
somewhere; the change_setting function would take care of that. I don't
know if it would also be good to have settings.load_defaults (returns a
dict containing all the old settings, then loads global_settings). That
would need a reverse function - I can't think of a good name for it,
settings.revert_load_defaults(old_settings_dict) or something...

The append case could have still another function,
append_setting(name, value), returning the old list (or the marker if
nothing was configured) and appending the new value to the list.
Reverting would be just change_setting(name, append_setting_ret_val).
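
Sketched in the same spirit (again, only illustrative):

def append_setting(name, value):
    old = getattr(settings, name, _missing)
    as_list = [] if old is _missing else list(old)
    setattr(settings, name, as_list + [value])
    return old

# Reverting:
# settings.change_setting("SETTING", old)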

Handling what needs to be done when changing a setting could be signal
based (register_setting_change_listener); this would allow using the
same mechanism for settings used by apps outside core.

Of course, there could also be decorators which would use these
functions...

 - Anssi

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-develop...@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.



Re: AutoFields, legacy databases and non-standard sequence names.

2010-10-08 Thread akaariai


On 8 loka, 10:41, Hanne Moa  wrote:

> You can't necessarily do this with a legacy database, as other systems
> also using that database expect the existing names.

ALTER SEQUENCE ... OWNED BY does not change the sequence name, just
what pg_get_serial_sequence will return for a given table, column
combination. But as said, if there are multiple tables using the same
sequence, then ALTER SEQUENCE ... OWNED BY does not work. In these
cases manually settable sequence names for models are likely the best
solution.

>
> I need to use my own backend because of postgresql's own
> table inheritance. Most tables in the db inherit from the same table
> and inherits its primary key and the sequence for that primary key.
> Then there are a few tables that inherits from a table that inherits
> from the grandfather table that defines the primary key and its
> sequence. So, I need to recursively discover the oldest ancestor of
> each table and use the sequence of that ancestor.
>
> HM

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-develop...@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.



Re: AutoFields, legacy databases and non-standard sequence names.

2010-10-07 Thread akaariai
> Sorry, I should have been more clear.
>
> What I'm trying to do is solicit suggestions from django developers as
> to how I *can* move ticket #1946 forward. I can find a way to work
> around it in my own project, but it would be ideal to solve it on the
> Django side, for everybody.
>
> I mentioned there were three possible suggestions in the ticket
> discussion as to how to solve the problem.  If a Django developer can
> give me some guidance as to what approach seems to be the best long-term
> solution, I'm happy to try my hand at writing a patch that can hopefully
> be incorporated into the codebase.

Django doesn't expect the sequence name to be
tablename_columnname_seq, at least not in trunk. The last_insert_id
method in backends/postgresql/operations.py uses
select currval(pg_get_serial_sequence(tablename, columnname)).
pg_get_serial_sequence will return the correct sequence only if the
sequence is owned by the tablename, columnname combination. If you
happen to have just one table per sequence, then issuing
ALTER SEQUENCE sequencename OWNED BY tablename.columnname;
should fix the problem.
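
You can quickly check from a Django shell what PostgreSQL currently
reports for a column (the table and column names below are just
placeholders):

from django.db import connection

cursor = connection.cursor()
cursor.execute("select pg_get_serial_sequence('myapp_mytable', 'id')")
print cursor.fetchone()[0]  # None until a sequence is OWNED BY the column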

There is one more proposal in ticket #13295. The proposed solution
should allow using the same sequence for multiple tables, though
managing the manually defined sequence is a bit hard when building
the schema and when resetting the sequence. Sorry if the
proposal is a bit hard to follow...

I don't know well enough how databases other than postgresql work, so
I don't know if the solution is valid for other databases.

 - Anssi

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-develop...@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.



Re: Changes to LazySettings

2010-09-28 Thread akaariai
On Sep 28, 3:42 am, Luke Plant  wrote:
> But can anyone else think of any gotchas with this change before I
> commit it? It produces a 30% improvement for the benchmark that relates
> to the ticket [3].

Not directly related to this change, but there is at least one part
of Django that will cache a setting when first used [1]. The code in
django/utils/translation/__init__.py will cache the real translation
provider for performance reasons. If the USE_I18N setting is changed
when testing, it will not have any effect on which translation
provider is used. There might be other parts that do the same
thing. I wonder if __setattr__ of LazySettings should
automatically flush this cache when USE_I18N is changed?
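
Purely as an illustration, something along these lines - the cache
attribute at the end is an assumption about how [1] stores the
provider, not a real API:

from django.utils.functional import LazyObject

class LazySettings(LazyObject):
    def __setattr__(self, name, value):
        super(LazySettings, self).__setattr__(name, value)
        if name == 'USE_I18N':
            from django.utils import translation
            # Drop whatever provider was cached on first use so the
            # next translation call re-reads USE_I18N (assumed cache
            # layout).
            translation._trans.__dict__.clear()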

[1] http://code.djangoproject.com/changeset/13899

 - Anssi

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-develop...@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.



Re: Speed testing Django

2010-09-16 Thread akaariai
On Sep 16, 10:43 am, Russell Keith-Magee 
wrote:

> Do we have a continuous performance benchmark at present? No.
>
> Would it be worth having one? Certainly.
>
> There's a long standing ticket proposing just such a change:
>
> http://code.djangoproject.com/ticket/8949
>
> And there was a sprint earlier this year that started developing a
> benchmark set that could be used; results so far can be found here:
>
> http://github.com/jacobian/djangobench
>
> However, there's still a lot of work to do to build up the benchmark
> set and deploy it in a continuous integration environment. I'm not
> aware of anyone specifically coordinating this benchmark work at the
> moment, so if you want to step up and make this your contribution,
> feel free to do so!
>
> Yours,
> Russ Magee %-)

That looks to be a good base to start the work from. I hope I will
have time to do this, but it might be a bigger problem than I can
handle.

I will check if it is possible to integrate Codespeed with the
benchmarks in djangobench. I will post status updates to ticket #8949
once I have a better understanding of the problem.

 - Anssi

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-develop...@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.



Speed testing Django

2010-09-16 Thread akaariai
Is there any continuous speed testing done for Django? It would be
nice to see how Django's performance is evolving. For example, while
working on ticket #14290, seeing how my changes to
utils/translation/__init__.py affect other parts of the framework
would be useful. In general, speed testing should expose performance
problems early, so that the design of a feature can be changed before
the feature goes into a release.

If there is no continuous benchmarking done at the moment, maybe using
Codespeed (http://wiki.github.com/tobami/codespeed/) could be
considered. It is the framework used for http://speed.pypy.org, and it
is built using Django.

If there is need for this kind of work, I am volunteering to do some
of it. That is, set up Codespeed and do some simple tests. I haven't
tested integrating Codespeed with Django yet, so it is possible that
Codespeed is not useful in this setup. I would like to integrate it
with git, so that it would be easy to do speed testing of your own
development branches. In SVN this is hard to do, if I am not
mistaken...

I am thinking of implementing the following tests (as time permits):

Template rendering, simple case: build a table of, say, 1000 x 10
cells using for loops (a minimal sketch follows this list).

Template rendering, forms: build a large form (possibly with FormSets)
and use that in a template. Maybe both with {{ form.as_table }} and
iterating through the form. This would test form rendering, not
initialization.

Template rendering, real world case: Use {% extends %}, {% include %},
{% trans %}... blocks, that is, build a simple template that reflects
real world usage.

QuerySet building: simple QuerySet building
(Foo.objects.filter(pk=id)), and complex QuerySet building (a couple
of filter operations with some joins, maybe order by, etc.).

Object creation: just create a bunch of Model objects. A few cases
here: simple object (only id) creation, complex object (many fields)
creation, and creation of objects with foreign keys, inheritance and
deferred fields.

Realistic use case: Make a request to the framework and go through all
the parts of building the response, that is, make a request to
runserver. However, do not use database queries.

Regression tests: Go through trac and look for speed tests attached to
tickets.
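
As a concrete starting point, the simple template rendering case could
be as small as this (plain timeit here, not tied to any particular
benchmarking harness):

import timeit

from django.conf import settings
settings.configure()  # allow standalone template use

from django.template import Template, Context

template = Template(
    '<table>{% for row in rows %}<tr>'
    '{% for cell in row %}<td>{{ cell }}</td>{% endfor %}'
    '</tr>{% endfor %}</table>')
rows = [range(10) for _ in range(1000)]  # 1000 x 10 cells

print timeit.timeit(
    lambda: template.render(Context({'rows': rows})), number=10)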

This is probably a lot of work, but I would like to give it a try.
Comments?

 - Anssi

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-develop...@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.



Re: I need a pony: Model translations (aka "my proposal")

2009-09-14 Thread akaariai

When using a related table based approach for the translations, Django
could use pivoting to fetch all the translations with just one normal
join.

Some pseudo-SQL, skipping constraints and indexes...
create table a (
  id integer pkey
);
create table b (
  id integer pkey,
  a_id integer references a (id),
  lang varchar(2),
  val1 integer,
  val2 integer
  ...
);

Now, when fetching multiple languages, instead of outer joining the b
table multiple times, one can fetch the results with just one normal
join:

select a.id,
max(case when b.lang = 'fi' then b.val1 else null end) as val1_fi,
max(case when b.lang = 'en' then b.val1 else null end) as val1_en,
max(case when b.lang = 'de' then b.val1 else null end) as val1_de,
...
from a, b
where a.id = b.a_id group by a.id

This results in some serious performance gains.

I inserted 1 rows into a, and 7 translations for each in table b.
I am using postgresql version 8.3.5 on macbook, OS X 10.5.

Fetching one translation from table b, all rows: ~110ms
Fetching all translations from table b, using pivot: ~270ms,
using join: ~550ms

Ok, not too big of a performance gain.

Let's try filtering:

one translation from b, 1000 < id < 1300: ~60ms
all using join, 1000 < id < 1300: ~350ms
all using pivot, 1000 < id < 1300: ~70ms
just for comparison, fetching only from a, 1000 < id < 1300: ~30ms
Fetching only from a is similar to having the translations directly in
the base table.

The point of this post is that when fetching small amounts of data
(less than 500 rows), fetching all of the translations at once using
pivoting costs almost the same as fetching just one translation. Using
columns in the base table is of course the best performing solution.
And I believe there is something fishy in my setup in the case of
fetching all translations using the pivot...
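
Generating the pivot from Django is mechanical; a throwaway sketch
(table and column names as in the pseudo-SQL above; it assumes the
language codes come from a trusted list, since they end up in column
aliases):

from django.db import connection

def fetch_pivoted(langs):
    cols = ', '.join(
        "max(case when b.lang = %%s then b.val1 end) as val1_%s" % lang
        for lang in langs)
    cursor = connection.cursor()
    cursor.execute(
        "select a.id, " + cols +
        " from a join b on a.id = b.a_id group by a.id",
        langs)
    return cursor.fetchall()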

Implementing something like this in Django is surely non-trivial. But,
the topic of this thread encouraged me to post this :)


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en
-~--~~~~--~~--~--~---