On Jun 27, 4:22 am, Michal Petrucha <michal.petru...@ksp.sk> wrote:
> some visible progress on my project at long last. I spent most of the
> last week digging deep inside the ORM's entrails to make composite
> field lookups possible and finally it looks promising.
>
> While working on this I found out the extra_filters approach I
> intended to use was a dead end (which reminded me of what Russ wrote
> in response to my proposal: "I'm almost completely certain you'll
> find some gremlin lurking underneath some dark corner of the code").

I glanced over your GitHub branch. I was especially looking for how
you plan to handle LEFT OUTER JOINs involving composite primary
keys / foreign keys. Unless I am missing something, I think this
hasn't been done yet. I have been thinking about this issue myself,
and I thought it would be good to share what I have found out.

The problematic part for multi-column join conditions is in
django.db.models.sql.query:

def join(self, connection, ...):
    """
    Returns an alias for the join in 'connection', either reusing an
    existing alias for that join or creating a new one. 'connection' is a
    tuple (lhs, table, lhs_col, col) where 'lhs' is either an existing
    table alias or a table name. The join corresponds to the SQL equivalent
    of::

        lhs.lhs_col = table.col
    """

Obviously this cannot work for creating multi-column joins.

The connection information is stored in alias_map, join_map and
rev_join_map. In particular, alias_map stores (table, alias,
join_type, lhs, lhs_col, col, nullable). Currently the contents of
alias_map are turned into SQL (sql/compiler.py, get_from_clause()) as:

result.append('%s %s%s ON (%s.%s = %s.%s)'
    % (join_type, qn(name), alias_str, qn(lhs),
         qn2(lhs_col), qn(alias), qn2(col)))

The simplest way to extend this to contain more columns would
probably be the following:
 - connection is defined as
   (lhs, table, lhs_col1, col1, lhs_col2, col2, ...)
 - The alias_map format needs to change a bit so that the extra columns
   can be stored in there. One could store the extra columns after
   nullable. Cleaner would be to have the columns in one tuple:
   (table, alias, join_type, lhs, (cols), nullable)
 - Only a limited number of places need to be fixed, most notably
   get_from_clause() in compiler.py
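
To make the tuple-of-columns variant concrete, here is a minimal
standalone sketch. It is not Django code: render_join and qn are
hypothetical stand-ins for what get_from_clause() and the backend's
quoting function do, I am assuming (cols) holds (lhs_col, col) pairs,
and the table and column names are made up.

```python
def qn(name):
    """Hypothetical stand-in for the backend's quote_name function."""
    return '"%s"' % name


def render_join(entry):
    """Render one join entry as SQL.

    entry is (table, alias, join_type, lhs, cols, nullable), where cols
    is a tuple of (lhs_col, col) pairs that are ANDed together.
    """
    table, alias, join_type, lhs, cols, nullable = entry
    # Only emit the alias when it differs from the table name.
    alias_str = '' if alias == table else ' %s' % qn(alias)
    conditions = ' AND '.join(
        '%s.%s = %s.%s' % (qn(lhs), qn(lhs_col), qn(alias), qn(col))
        for lhs_col, col in cols
    )
    return '%s %s%s ON (%s)' % (join_type, qn(table), alias_str, conditions)


# A two-column join between made-up tables.
entry = ('order_line', 'T2', 'LEFT OUTER JOIN', 'orders',
         (('shop_id', 'shop_id'), ('order_no', 'order_no')), True)
print(render_join(entry))
```

A single-column join is just the case where cols has one pair, so the
existing behaviour would fall out of the same code path.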

The downside of the above is that it does not support any join
conditions other than ones involving two tables and a list of ANDed
columns. For composite fields this is enough.

For future use it would be nice if one could pass in Where nodes as
the connection. This would allow for arbitrary join conditions. The
Where node knows how to turn itself into SQL, how to relabel aliases
and so on. This approach has some problems, however:
  - How to generate the Where node?
  - How to match existing joins to new joins? Currently this is done
    by checking that the connection four-tuple is equal to the
    existing join's four-tuple. I don't think Where nodes know how to
    check equivalence to another node. And even if Where nodes knew
    how to do that, all the leaf nodes would need to know it as well.
  - Performance issues: cloning a Where node is more expensive than
    cloning a tuple. Construction, equivalence checking and other
    operations are also somewhat more expensive than with tuples.
  - Overkill for composite fields
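
The join-matching point is easy to show with tuples. Because tuples
are hashable and compare by value, finding an existing join is a plain
dictionary lookup, and that keeps working when the tuple grows extra
column pairs. The sketch below is a hypothetical miniature, not the
real Query class: JoinRegistry and its attributes only loosely mirror
the join_map bookkeeping, and the table names are invented.

```python
class JoinRegistry:
    """Hypothetical, simplified stand-in for the join_map bookkeeping."""

    def __init__(self):
        self.join_map = {}   # connection tuple -> list of aliases
        self.counter = 0

    def join(self, connection, always_create=False):
        """Return an alias for connection, reusing one when possible."""
        aliases = self.join_map.setdefault(connection, [])
        if aliases and not always_create:
            return aliases[0]
        self.counter += 1
        alias = 'T%d' % self.counter
        aliases.append(alias)
        return alias


registry = JoinRegistry()
# (lhs, table, lhs_col1, col1, lhs_col2, col2): a two-column connection.
conn = ('orders', 'order_line', 'shop_id', 'shop_id', 'order_no', 'order_no')
# A separately built but equal tuple, to show matching is by value.
same_conn = ('orders', 'order_line') + ('shop_id', 'shop_id',
                                        'order_no', 'order_no')

first = registry.join(conn)
second = registry.join(same_conn)                # reuses the first alias
third = registry.join(conn, always_create=True)  # forced new alias
print(first, second, third)
```

Doing the same with Where nodes would need an equality (and ideally a
hash) defined on every node type, which is the extra work described in
the list above.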

Of course, the approaches could be combined: you pass in the join
condition as a tuple, and you can additionally pass extra_filters
(default None) as a Where node. This would keep the normal case
efficient but allow for more complex join conditions if really needed.
A join having extra_filters could not be reused, except when
explicitly requested.
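
A rough sketch of that reuse rule, under my own assumptions: MiniQuery
is a hypothetical miniature rather than Django's Query, the reuse
argument (a collection of aliases the caller explicitly allows) is
invented for illustration, and extra_filters is treated as an opaque
object where the real thing would be a Where node.

```python
class MiniQuery:
    """Hypothetical miniature of the join bookkeeping, not Django's Query."""

    def __init__(self):
        self.joins = []   # list of (alias, connection, extra_filters)

    def join(self, connection, extra_filters=None, reuse=()):
        """Return an alias for the join described by connection.

        Plain joins are matched by tuple equality. Whenever
        extra_filters is involved on either side, an existing join is
        only reused if its alias is explicitly listed in reuse.
        """
        for alias, existing, filters in self.joins:
            if existing != connection:
                continue
            if filters is None and extra_filters is None:
                return alias      # the cheap, common case
            if alias in reuse:
                return alias      # the caller vouches for this alias
        alias = 'T%d' % (len(self.joins) + 1)
        self.joins.append((alias, connection, extra_filters))
        return alias


q = MiniQuery()
conn = ('orders', 'order_line', 'shop_id', 'shop_id', 'order_no', 'order_no')
plain_1 = q.join(conn)
plain_2 = q.join(conn)                                  # reused
filtered_1 = q.join(conn, extra_filters='status = 1')   # new alias
filtered_2 = q.join(conn, extra_filters='status = 1')   # new alias again
explicit = q.join(conn, extra_filters='status = 1', reuse=(filtered_1,))
print(plain_1, plain_2, filtered_1, filtered_2, explicit)
```

The point of the design is that the tuple comparison stays the only
check on the hot path; nothing ever has to compare two filter objects.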

Having said all this, for this project the "extend the connection
tuple" approach seems to be the only sane choice.

The work you have done looks very promising. I hope this post has been
at least somewhat useful to you.

 - Anssi
