Re: [sqlalchemy] Query and compiled_cache

Michael Bayer Thu, 30 May 2013 13:48:24 -0700

A very brief example of this (and if you keep digging in, you can see how 
quickly it gets tricky) looks like:


from sqlalchemy.sql import column, table

t1 = table('t1', column('x'), column('y'))

t2 = table('t1', column('x'), column('y'))

t3 = table('t2', column('p'), column('r'))

t4 = table('t2', column('r'), column('p'))

assert t1._hash == t2._hash
assert t3._hash != t4._hash
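
The intent of those two assertions can be tried out without patching anything. 
What follows is only a toy sketch: ToyColumn, ToyTable, structural_key() and 
structural_hash() are made-up names, not SQLAlchemy APIs, and the column type 
is left out for brevity.  It assumes a table's hash is derived from its name 
plus the ordered keys of its columns:

```python
class ToyColumn:
    """Hypothetical stand-in for column(); not a SQLAlchemy class."""

    def __init__(self, key, is_literal=False):
        self.key = key
        self.is_literal = is_literal
        self.table_name = None  # filled in when attached to a ToyTable

    def structural_key(self):
        # Use the owning table's *name*, not the table's own key: the table's
        # key is built from its columns, so that would recurse forever.
        return (self.key, self.table_name, self.is_literal)


class ToyTable:
    """Hypothetical stand-in for table(); not a SQLAlchemy class."""

    def __init__(self, name, *columns):
        self.name = name
        self.columns = columns
        for col in columns:
            col.table_name = name

    def structural_key(self):
        # Column order is part of the key, so reordering changes the result.
        return (self.name, tuple(col.structural_key() for col in self.columns))

    def structural_hash(self):
        return hash(self.structural_key())


t1 = ToyTable('t1', ToyColumn('x'), ToyColumn('y'))
t2 = ToyTable('t1', ToyColumn('x'), ToyColumn('y'))
t3 = ToyTable('t2', ToyColumn('p'), ToyColumn('r'))
t4 = ToyTable('t2', ToyColumn('r'), ToyColumn('p'))

# Distinct objects with the same structure hash the same.
assert t1 is not t2
assert t1.structural_hash() == t2.structural_hash()
# Same table name, different column order: the keys differ.
assert t3.structural_key() != t4.structural_key()
```

Comparing the key tuples rather than the integer hashes for the inequality 
keeps the check exact, since two different keys could in principle collide on 
the same hash value.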


The patch that makes the assertions above pass is below.  Note that 
Table/Column are easier to hash than table()/column(), since we treat the 
uppercase versions as singletons.  There is a lot more state that needs to be 
taken into account, though, like the _annotations dictionary on every 
ClauseElement.  Where an element doesn't define a fixed _hash, using a new 
instance of that element in an ad-hoc Query means the whole Query can't be 
cached, because the element would have a different id() each time.  More 
dangerously, an id() value can be reused once the original object is garbage 
collected, so a stale cache entry could match an unrelated element; we might 
need to use a counter for that case instead.
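
As a rough illustration of the counter idea (not part of the patch), here is a 
minimal sketch.  AdHocElement, identity_token and _token_source are all 
hypothetical names, and the assumption is that a process-wide counter is an 
acceptable source of tokens.  Unlike id(), a counter value is never handed out 
twice, even after the original object has been garbage collected:

```python
import itertools

# Hypothetical process-wide token source; each value is issued exactly once.
_token_source = itertools.count(1)


class AdHocElement:
    """Hypothetical stand-in for an element with no structural hash."""

    @property
    def identity_token(self):
        # Assign lazily and memoize in the instance dict, in the same spirit
        # as the memoized _hash property in the patch.
        token = self.__dict__.get('_identity_token')
        if token is None:
            token = self.__dict__['_identity_token'] = next(_token_source)
        return token

    def clone(self):
        # Mirror the patch's _clone(): copy the state, but drop the memoized
        # token so the copy is assigned a fresh one on first access.
        c = self.__class__.__new__(self.__class__)
        c.__dict__ = self.__dict__.copy()
        c.__dict__.pop('_identity_token', None)
        return c


a = AdHocElement()
first = a.identity_token
b = a.clone()
second = b.identity_token
del a  # the memory behind id(a) may now be reused; the token cannot be
c = AdHocElement()
assert c.identity_token not in (first, second)
```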


diff --git a/lib/sqlalchemy/sql/expression.py b/lib/sqlalchemy/sql/expression.py
index 5820cb1..d5de299 100644
--- a/lib/sqlalchemy/sql/expression.py
+++ b/lib/sqlalchemy/sql/expression.py
@@ -1669,6 +1669,7 @@ class ClauseElement(Visitable):
         """
         c = self.__class__.__new__(self.__class__)
         c.__dict__ = self.__dict__.copy()
+        c.__dict__.pop('_hash', None)
         ClauseElement._cloned_set._reset(c)
         ColumnElement.comparator._reset(c)
 
@@ -1681,6 +1682,10 @@ class ClauseElement(Visitable):
 
         return c
 
+    @util.memoized_property
+    def _hash(self):
+        return id(self)
+
     @property
     def _constructor(self):
         """return the 'constructor' for this ClauseElement.
@@ -2421,6 +2426,10 @@ class ColumnCollection(util.OrderedProperties):
         self._data.update((c.key, c) for c in cols)
         self.__dict__['_all_cols'] = util.column_set(self)
 
+    @util.memoized_property
+    def _hash(self):
+        return hash(tuple(c._hash for c in self))
+
     def __str__(self):
         return repr([str(c) for c in self])
 
@@ -4432,6 +4441,17 @@ class ColumnClause(Immutable, ColumnElement):
         self.type = sqltypes.to_instance(type_)
         self.is_literal = is_literal
 
+    @util.memoized_property
+    def _hash(self):
+        return hash(
+                    (
+                    hash(self.key),
+                    hash(self.table.name),  # note using "self.table" here causes an endless loop
+                    self.type._hash,
+                    hash(self.is_literal)
+                    )
+                )
+
     def _compare_name_for_result(self, other):
         if self.is_literal or \
             self.table is None or \
@@ -4586,6 +4606,15 @@ class TableClause(Immutable, FromClause):
         for c in columns:
             self.append_column(c)
 
+    @util.memoized_property
+    def _hash(self):
+        return hash(
+                    (
+                    hash(self.name),
+                    self._columns._hash,
+                    )
+                )
+
     def _init_collections(self):
         pass
 
diff --git a/lib/sqlalchemy/types.py b/lib/sqlalchemy/types.py
index bfff053..16834d1 100644
--- a/lib/sqlalchemy/types.py
+++ b/lib/sqlalchemy/types.py
@@ -59,6 +59,10 @@ class TypeEngine(AbstractType):
         def __reduce__(self):
             return _reconstitute_comparator, (self.expr, )
 
+    @property
+    def _hash(self):
+        return id(self)  # default to the same value as __hash__() if a specific hash is not defined
+
     hashable = True
     """Flag, if False, means values from this type aren't hashable.
 



On May 30, 2013, at 3:10 PM, Michael Bayer <mike...@zzzcomputing.com> wrote:

> my first 25 seconds of looking at this reveals that if you want to be able to 
> generate a hash, this has to go all the way down to everything.   
> query.filter(X == Y) means you need a hash for X == Y too.    These hashes 
> are definitely going to be determined using a traversal scheme:
> 
> q = X == Y
> 
> q._magic_hash_value_()
> 
> will ask "X", operator.eq, "Y", for their hash values ("X" and "Y" assuming 
> they are Column objects are considered to be "immutable", even though they 
> can be copies of "X" and "Y" sometimes with different semantics), and combine 
> them together.
> 
> So some_select_statement._magic_hash_value_() would traverse all the way down 
> as well.
> 
> This is why object identity was a lot easier to work with.
> 
> 
> 
> 
> On May 30, 2013, at 3:05 PM, Michael Bayer <mike...@zzzcomputing.com> wrote:
> 
>> 
>> On May 30, 2013, at 2:28 PM, Claudio Freire <klaussfre...@gmail.com> wrote:
>> 
>>> On Thu, May 30, 2013 at 2:25 PM, Michael Bayer <mike...@zzzcomputing.com> 
>>> wrote:
>>> 
>>>> If you want to work on a feature that is actually going to change 
>>>> SQLAlchemy, (and would that be before or after you finish #2720? :) ), it 
>>>> would be:
>>> 
>>> After, I didn't forget, just real life real work priorities made me
>>> veer away from it. Since it was for 0.9, I judged I could safely delay
>>> 2720 a bit while I take care of work related priorities ;-)
>> 
>> also, I find an overhaul to Query such that it's self-hashing a lot more 
>> interesting than #2720.  It would be a much bigger performance savings and 
>> it would apply to other interpreters like pypy too.    Replacements of tiny 
>> sections of code with C, not that interesting :) (redoing all the C in pyrex 
>> is more interesting but not necessarily a priority).
>> 
>> 
>> 
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "sqlalchemy" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to sqlalchemy+unsubscr...@googlegroups.com.
>> To post to this group, send email to sqlalchemy@googlegroups.com.
>> Visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
>> For more options, visit https://groups.google.com/groups/opt_out.
>> 
>> 
> 

