Re: [sqlalchemy] Query and compiled_cache

Michael Bayer Thu, 30 May 2013 15:05:37 -0700

My next thought is: if something isn't distinctly hashable, then it should 
cancel being hashable entirely.  This patch shows it using a symbol 
"unhashable": https://gist.github.com/zzzeek/5681612 .  If any construct has 
an unhashable inside of it, then that construct is unhashable too.


The hashing thing really has to start as a core concept first.   It's a big job 
but would be very helpful for caching scenarios and would allow us to build 
this feature on Query without too much difficulty.  The nice thing about 
"unhashable" is that simple queries will be hashable, but as soon as complexity 
increases you'd start seeing unhashables come in, preventing us from caching 
something that isn't actually easy to cache.
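
A minimal sketch of how an "unhashable" symbol might propagate upward. This 
is not the code from the gist; Node, combine_hashes, cache_key and UNHASHABLE 
are hypothetical names used only for illustration, and nothing here is 
SQLAlchemy API:

```python
# Hypothetical sketch; none of these names come from SQLAlchemy itself.
class _Unhashable:
    """Sentinel meaning 'no stable structural hash is available'."""

    def __repr__(self):
        return "UNHASHABLE"


UNHASHABLE = _Unhashable()


def combine_hashes(*parts):
    """Combine child keys; a single UNHASHABLE child poisons the result."""
    if any(part is UNHASHABLE for part in parts):
        return UNHASHABLE
    return hash(parts)


class Node:
    """Toy expression node: a label plus child nodes."""

    def __init__(self, label, *children, hashable=True):
        self.label = label
        self.children = children
        self.hashable = hashable

    @property
    def cache_key(self):
        if not self.hashable:
            return UNHASHABLE
        return combine_hashes(
            self.label, *(child.cache_key for child in self.children)
        )


simple = Node("eq", Node("x"), Node("y"))
same = Node("eq", Node("x"), Node("y"))
# One opaque child makes the whole enclosing construct unhashable.
opaque = Node("eq", Node("x"), Node("udf", hashable=False))
```

A cache built on this would look up cache_key first and skip caching 
whenever the key is UNHASHABLE, so only the simple cases get cached.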

This could be really nice, and could be a good focus for 0.9, as I haven't 
found 0.9's big change yet (other than the 2to3 removal).



On May 30, 2013, at 4:48 PM, Michael Bayer <mike...@zzzcomputing.com> wrote:

> a very brief example of this, which if you keep digging in you can see how 
> tricky it gets (fast), is like:
> 
> from sqlalchemy.sql import column, table
> 
> t1 = table('t1', column('x'), column('y'))
> 
> t2 = table('t1', column('x'), column('y'))
> 
> t3 = table('t2', column('p'), column('r'))
> 
> t4 = table('t2', column('r'), column('p'))
> 
> assert t1._hash == t2._hash
> assert t3._hash != t4._hash
> 
> 
> the patch to produce the above is below.  Note that Table/Column are easier 
> to hash than table()/column(), since we treat the upper class versions as 
> singletons.  There is a lot more state that needs to be taken into account 
> though, like the _annotations dictionary on every ClauseElement.  In the case 
> where an element doesn't define a fixed _hash, the usage of a new instance of 
> that element in an ad-hoc Query means that whole Query can't be cached, 
> because the element would have a different "id" each time (though 
> dangerously, that id() can be reused when the original is garbage 
> collected...that's an issue actually, we might instead need to use a counter 
> for that case).
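
A rough sketch of the counter idea mentioned above, as an alternative to 
id(). The Element class and _identity_counter are made up for illustration 
and are not part of the patch; the only point is that a counter value is 
never handed out twice, whereas an id() may be reused once the original 
object has been collected:

```python
import itertools

# Hypothetical process-wide counter; not part of the quoted patch.
_identity_counter = itertools.count(1)


class Element:
    """Toy element that defines no structural hash of its own."""

    def __init__(self):
        # Assumption: the fallback token is assigned once, at construction,
        # and is never reused even after this object is garbage collected.
        self._hash = next(_identity_counter)


a = Element()
token_a = a._hash
del a  # the memory address (and so id()) may now be reused

b = Element()
many_tokens = {Element()._hash for _ in range(1000)}
```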
> 
> 
> diff --git a/lib/sqlalchemy/sql/expression.py b/lib/sqlalchemy/sql/expression.py
> index 5820cb1..d5de299 100644
> --- a/lib/sqlalchemy/sql/expression.py
> +++ b/lib/sqlalchemy/sql/expression.py
> @@ -1669,6 +1669,7 @@ class ClauseElement(Visitable):
>         """
>         c = self.__class__.__new__(self.__class__)
>         c.__dict__ = self.__dict__.copy()
> +        c.__dict__.pop('_hash', None)
>         ClauseElement._cloned_set._reset(c)
>         ColumnElement.comparator._reset(c)
> 
> @@ -1681,6 +1682,10 @@ class ClauseElement(Visitable):
> 
>         return c
> 
> +    @util.memoized_property
> +    def _hash(self):
> +        return id(self)
> +
>     @property
>     def _constructor(self):
>         """return the 'constructor' for this ClauseElement.
> @@ -2421,6 +2426,10 @@ class ColumnCollection(util.OrderedProperties):
>         self._data.update((c.key, c) for c in cols)
>         self.__dict__['_all_cols'] = util.column_set(self)
> 
> +    @util.memoized_property
> +    def _hash(self):
> +        return hash(tuple(c._hash for c in self))
> +
>     def __str__(self):
>         return repr([str(c) for c in self])
> 
> @@ -4432,6 +4441,17 @@ class ColumnClause(Immutable, ColumnElement):
>         self.type = sqltypes.to_instance(type_)
>         self.is_literal = is_literal
> 
> +    @util.memoized_property
> +    def _hash(self):
> +        return hash(
> +                    (
> +                    hash(self.key),
> +                    hash(self.table.name),  # note using "self.table" here causes an endless loop
> +                    self.type._hash,
> +                    hash(self.is_literal)
> +                    )
> +                )
> +
>     def _compare_name_for_result(self, other):
>         if self.is_literal or \
>             self.table is None or \
> @@ -4586,6 +4606,15 @@ class TableClause(Immutable, FromClause):
>         for c in columns:
>             self.append_column(c)
> 
> +    @util.memoized_property
> +    def _hash(self):
> +        return hash(
> +                    (
> +                    hash(self.name),
> +                    self._columns._hash,
> +                    )
> +                )
> +
>     def _init_collections(self):
>         pass
> 
> diff --git a/lib/sqlalchemy/types.py b/lib/sqlalchemy/types.py
> index bfff053..16834d1 100644
> --- a/lib/sqlalchemy/types.py
> +++ b/lib/sqlalchemy/types.py
> @@ -59,6 +59,10 @@ class TypeEngine(AbstractType):
>         def __reduce__(self):
>             return _reconstitute_comparator, (self.expr, )
> 
> +    @property
> +    def _hash(self):
> +        return id(self)  # default to the same value as __hash__() if a specific hash is not defined
> +
>     hashable = True
>     """Flag, if False, means values from this type aren't hashable.
> 
> 
> 
> 
> On May 30, 2013, at 3:10 PM, Michael Bayer <mike...@zzzcomputing.com> wrote:
> 
>> my first 25 seconds of looking at this reveals that if you want to be able 
>> to generate a hash, this has to go all the way down to everything.   
>> query.filter(X == Y) means you need a hash for X == Y too.    These hashes 
>> are definitely going to be determined using a traversal scheme for sure:
>> 
>> q = X == Y
>> 
>> q._magic_hash_value_()
>> 
>> will ask "X", operator.eq, "Y", for their hash values ("X" and "Y" assuming 
>> they are Column objects are considered to be "immutable", even though they 
>> can be copies of "X" and "Y" sometimes with different semantics), and 
>> combine them together.
>> 
>> So some_select_statement._magic_hash_value_() would traverse all the way 
>> down as well.
>> 
>> This is why object identity was a lot easier to work with.
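
The traversal described above could be sketched roughly like this. Col, 
Binary and structural_hash are hypothetical stand-ins (the message only 
names a placeholder _magic_hash_value_()), so treat this as an illustration 
of the idea rather than how SQLAlchemy does or would do it:

```python
import operator


class Col:
    """Toy stand-in for a column, treated as immutable."""

    def __init__(self, name):
        self.name = name

    def structural_hash(self):
        return hash(("col", self.name))


class Binary:
    """Toy stand-in for an expression such as X == Y."""

    def __init__(self, left, op, right):
        self.left = left
        self.op = op
        self.right = right

    def structural_hash(self):
        # Ask the left side, the operator, and the right side for their
        # values and combine them; nested expressions recurse the same way.
        return hash((
            self.left.structural_hash(),
            self.op.__name__,
            self.right.structural_hash(),
        ))


q1 = Binary(Col("x"), operator.eq, Col("y"))
q2 = Binary(Col("x"), operator.eq, Col("y"))
q3 = Binary(Col("x"), operator.ne, Col("y"))
# A larger statement hashes by traversing all the way down.
nested = Binary(q1, operator.and_, q3)
```

Here q1 and q2 are distinct objects but produce the same value, which is 
what object identity alone could not give.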
>> 
>> 
>> 
>> 
>> On May 30, 2013, at 3:05 PM, Michael Bayer <mike...@zzzcomputing.com> wrote:
>> 
>>> 
>>> On May 30, 2013, at 2:28 PM, Claudio Freire <klaussfre...@gmail.com> wrote:
>>> 
>>>> On Thu, May 30, 2013 at 2:25 PM, Michael Bayer <mike...@zzzcomputing.com> 
>>>> wrote:
>>>> 
>>>>> If you want to work on a feature that is actually going to change 
>>>>> SQLAlchemy, (and would that be before or after you finish #2720? :) ), it 
>>>>> would be:
>>>> 
>>>> After, I didn't forget, just real life real work priorities made me
>>>> veer away from it. Since it was for 0.9, I judged I could safely delay
>>>> 2720 a bit while I take care of work related priorities ;-)
>>> 
>>> also, I find an overhaul to Query such that it's self-hashing a lot more 
>>> interesting than #2720.  It would be a much bigger performance savings and 
>>> it would apply to other interpreters like pypy too.    Replacements of tiny 
>>> sections of code with C, not that interesting :) (redoing all the C in 
>>> pyrex is more interesting but not necessarily a priority).
>>> 
>>> 
>>> 
>>> -- 
>>> You received this message because you are subscribed to the Google Groups 
>>> "sqlalchemy" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an 
>>> email to sqlalchemy+unsubscr...@googlegroups.com.
>>> To post to this group, send email to sqlalchemy@googlegroups.com.
>>> Visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
>>> For more options, visit https://groups.google.com/groups/opt_out.
>>> 

