[sqlalchemy] [Q][0.7.9] DetachedInstanceError

2013-05-30 Thread Ladislav Lenart
Hello.

I occasionally see DetachedInstanceError in apache error log of my web app.

According to

http://docs.sqlalchemy.org/en/rel_0_8/orm/session.html#expunging

an object is detached when:

1. It is expunged.
2. The session is closed.

I use expunge() only in some of my unit tests, never in the production code.
Thus the first option is out. The error occurs in various places, but I think
the session *should* still be open when it happens. Can an object become
detached any other way?

Do you have any idea how I can diagnose this problem, given that:
* It happens sporadically and in random places?
* I am unable to reproduce it at will?

The web app runs CherryPy under Apache.


Thank you in advance,

Ladislav Lenart





[sqlalchemy] Query and compiled_cache

2013-05-30 Thread Claudio Freire
I know this has been discussed a great deal already, but I've been
noticing this:

SomeClass.query()
    .filter_by(id = blah)
    .execution_options(compiled_cache = _orm_query_cache)
    .options(
        orm.joinedload(blah),
        orm.joinedload(blah, bleh),
        orm.joinedload(blah, bleh, blih),
        orm.joinedload(blah, bleh, blih, bloh),
    ).first()

It adds one entry in _orm_query_cache each time I run it.

The cache is a dict-like LRUCache object I intend to have in the
running application to avoid recompiling queries all the time. I've
also tried it like this:

_some_query = SomeClass.query()
    .filter_by(id = bindparam(bleh))
    .execution_options(compiled_cache = _orm_query_cache)
    .options(
        orm.joinedload(blah),
        orm.joinedload(blah, bleh),
        orm.joinedload(blah, bleh, blih),
        orm.joinedload(blah, bleh, blih, bloh),
    )

globally, and in my hot function:

_some_query.with_session(session).params(bleh = bleh).first()
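
(For reference, a minimal sketch of the dict-like cache assumed in these
examples, using SQLAlchemy's bundled LRU container; the capacity value is
illustrative:)

    from sqlalchemy import util

    # dict-like LRU cache handed to execution_options(compiled_cache=...)
    _orm_query_cache = util.LRUCache(capacity=1000)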

Both ways behave the same with regard to the cache. It will be in a
very hot path. I know compilation CPU usage pales in comparison with
the roundtrip, but since this is such a hot path, I expect compiling
repeatedly to be troublesome:

I expect to have an unholy amount of concurrency (on the order of 10k
rps). I don't want a few of these compilations bogging down, even
minutely, other concurrent transactions; the application has very
stringent latency requirements, and even a small hiccup would be
unacceptable.

I'm on the latest 0.7, and I also tried 0.8. I tried the baked query
recipe, and it raises exceptions everywhere. I tried to patch it up, and
it didn't work (it would cache results as well, which was really
unintended).

But never mind baked query's brokenness. From what I've seen, and read
in other threads in the archive, the problem is that queries are appended
to the cache by object identity. I'm thinking it wouldn't be hard to
make the Query object hashable, or at least to add a bake()
method that freezes it for hashing (prohibiting hash-changing
alterations?).

That way, one could use the second form up there and benefit from
query hashing, because session/param binding wouldn't change the hash,
and it would be a cache hit. Has it been explored already? Or maybe
there's something wrong on how I'm using the compiled_cache thing?
Should I start patching? ;-)





Re: [sqlalchemy] alembic questions/comments

2013-05-30 Thread Colleen Ross



 What would be great would be to have .sql files and .sqli (mako 
 templates with some context provided by the env.py) in addition to .py 
 files. How hard could that be? ;-)


UHHH Alembic *doesn't* support this?! Are you kidding me? Fuckit, I'm 
sticking to sqlalchemy-migrate. 





Re: [sqlalchemy] [Q][0.7.9] How to issue apply_labels() on an ORM query?

2013-05-30 Thread Ladislav Lenart
Hello.

Sorry for the long delay. I finally had enough time to produce a minimal
self-contained regression. The attached file produces the following SQL:

WITH RECURSIVE
q_cte(partner_id, max_depth) AS (
SELECT
partner.id AS partner_id,
1 AS max_depth
FROM partner
WHERE partner.sponsor_id IS NULL
UNION ALL
SELECT
partner_alias.id AS partner_id,
max_depth + 1 AS max_depth
FROM
partner AS partner_alias,
q_cte AS q_cte_alias
WHERE partner_alias.sponsor_id = q_cte_alias.partner_id
)
SELECT
q_cte.partner_id AS q_cte_partner_id, -- suspicious
partner.sponsor_id AS partner_sponsor_id
FROM
q_cte
JOIN partner ON q_cte.partner_id = q_cte.partner_id -- BUG


I would expect this (q_cte part is correct):

WITH RECURSIVE
q_cte(partner_id, max_depth) AS (
...
)
SELECT
partner.id AS partner_id,
partner.sponsor_id AS partner_sponsor_id
FROM
q_cte
JOIN partner ON q_cte.partner_id = partner.id


The bug is somehow related to the use of select_from().


Hope this helps,

Ladislav Lenart


On 10.5.2013 22:04, Michael Bayer wrote:
 
 On May 10, 2013, at 3:03 PM, Ladislav Lenart lenart...@volny.cz wrote:
 
 Hello.

 My main concern was that the query creates a cartesian product and I thought
 the warning might have something to do with it. It hasn't. The problem is
 related to the use of select_from():

    q = session.query(cls, PersonalContact).select_from(q_cte_union)
    q = q.join(cls, cls.id == q_cte_union.c.partner_id)

 This part renders the following SQL:

 SELECT *
 FROM
q_cte
JOIN partner ON q_cte.partner_id = q_cte.partner_id

 As you can see, the JOIN condition is wrong, though I have no idea why. The
 following works, however:

    q = session.query(cls)
    q = q.add_entity(PersonalContact)
    q = q.join(q_cte_union, cls.id == q_cte_union.c.partner_id)

 Is my usage of select_from() bad for some reason? Do you know what's going
 on? I can provide you a fully runnable test case if you still need it (on Monday).
 Note also that I use SA 0.7.9. Is this fixed in later versions?
 
 I've no idea why you're getting it without more context - if everything is 
 generated from a query(), typically the labels are already applied, with 
 the exception of some operations such as subquery() (maybe cte? not sure, I'd 
 need to try it, hence an already existing example saves me lots of time 
 tinkering around trying to guess what you're doing).
 
 Things have been improved in 0.8 regarding same-named columns but that only 
 applies when the labels are already present.



# coding=utf-8
from sqlalchemy import Column, ForeignKey, Integer
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.sql.expression import literal_column
from sqlalchemy.orm.util import aliased
from sqlalchemy.engine import create_engine
from sqlalchemy.orm.session import sessionmaker
from sqlalchemy.orm import relationship


Base = declarative_base()
session = None


class Partner(Base):
    __tablename__ = 'partner'
    id = Column(Integer(), primary_key=True)
    sponsor_id = Column(Integer(), ForeignKey('partner.id', ondelete='SET NULL'))
    sponsor = relationship('Partner', primaryjoin='Partner.sponsor_id == Partner.id', remote_side='Partner.id', uselist=False)

    def find_subtree(self, max_depth=None):
        cls = self.__class__
        i0 = literal_column('1').label('max_depth')
        q_base = session.query(cls.id.label('partner_id'), i0).filter(cls.sponsor_id == self.id)
        q_cte = q_base.cte(name='q_cte', recursive=True)
        q_cte_alias = aliased(q_cte, name='q_cte_alias')
        partner_alias = aliased(cls, name='partner_alias')
        i1 = literal_column('max_depth + 1').label('max_depth')
        q_rec = session.query(partner_alias.id.label('partner_id'), i1)
        q_rec = q_rec.filter(partner_alias.sponsor_id == q_cte_alias.c.partner_id)
        if max_depth is not None:
            q_rec = q_rec.filter(q_cte_alias.c.max_depth < max_depth)
        q_cte_union = q_cte.union_all(q_rec)
        q = session.query(cls).select_from(q_cte_union)
        q = q.join(cls, cls.id == q_cte_union.c.partner_id)
        # q = q.join(q_cte_union, cls.id == q_cte_union.c.partner_id)
        return q


def main():
    global session
    conn_string = 'postgresql+psycopg2://lada:heslo@localhost:5433/zfp_xxx'
    engine = create_engine(conn_string, echo=True)
    session = sessionmaker(bind=engine, autoflush=False)()
    Base.metadata.bind = engine
    Base.metadata.create_all()
    root = 

Re: [sqlalchemy] [Q][0.7.9] DetachedInstanceError

2013-05-30 Thread Michael Bayer

On May 30, 2013, at 7:16 AM, Ladislav Lenart lenart...@volny.cz wrote:

 Hello.
 
 I occasionally see DetachedInstanceError in apache error log of my web app.
 
 According to
 
http://docs.sqlalchemy.org/en/rel_0_8/orm/session.html#expunging
 
 an object is detached when:
 
1. It is expunged.
2. The session is closed.
 
 I use expunge() only in some of my unit tests, never in the production code.
 Thus the first option is out. The error occurs in various places, but I think
 the session *should* still be open when it happens. Can an object become
 detached any other way?

if you mark it as deleted and commit the session, or if it is pending in the 
session and the session is rolled back.

 
 Do you have any idea how I can diagnose this problem, given that:
 * It happens sporadically and in random places?
 * I am unable to reproduce it at will?
 
 The web app runs CherryPy under Apache.

get some more diagnostics around that exception, and take a look at other 
aspects of the object's state:

- does it have a primary key, or was it pending?

- was it deleted ?

- where is the code that is accessing this object, and is the owning session
still in effect at that point, or if not, why?
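
As a sketch of such diagnostics, assuming 0.8's inspection API (on 0.7 the
same state object is available via sqlalchemy.orm.attributes.instance_state(obj)):

    from sqlalchemy import inspect

    state = inspect(suspect_obj)            # the object's InstanceState
    print state.key                         # identity key, None if never flushed
    print state.transient, state.pending    # not in a session / waiting for flush
    print state.persistent, state.detached  # in a session / removed from one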






Re: [sqlalchemy] Query and compiled_cache

2013-05-30 Thread Michael Bayer

On May 30, 2013, at 11:25 AM, Claudio Freire klaussfre...@gmail.com wrote:

 
 That way, one could use the second form up there and benefit from
 query hashing, because session/param binding wouldn't change the hash,
 and it would be a cache hit. Has it been explored already? Or maybe
 there's something wrong on how I'm using the compiled_cache thing?
 Should I start patching? ;-)

there is a very wide gap between working with the baked query recipe, for which
I'm not aware of the errors you refer to, so more detail would help, and
that of patching SQLAlchemy with new features.   In particular I can't imagine
how it would cache results, there is nothing like that in the recipe or
compiled cache system.

If you want to work on a feature that is actually going to change SQLAlchemy, 
(and would that be before or after you finish #2720? :) ), it would be:

1. there is no bake() method, the entire thing is transparent

2. as one calls session.query(X).filter(Y).join(Z), the hashing scheme is
working the entire time in the background against a single LRU cache, that is
perhaps per sessionmaker(), or otherwise, to produce cached versions of the
Query.  Everyone would get the performance gains for free in all cases.   A
flag or other system would exist to turn the feature off for those who are
heavily subclassing Query and having issues, or other edge-case backwards
compatibility issues.

3. it would just be on by default in a new major release like 0.9 or 1.0

4. it would have a super crapload of very complete and clean unit tests.

Otherwise, the bake() recipe as it is can be enhanced or augmented with 
__hash__() methods and all that but I'm not aware of anything regarding it that 
would require changes to SQLAlchemy itself, since it uses a Query subclass.







Re: [sqlalchemy] alembic questions/comments

2013-05-30 Thread Mauricio de Abreu Antunes
I think I did not get it.

2013/5/30 Colleen Ross cr...@yapta.com:

 What would be great would be to have .sql files and .sqli (mako
 templates with some context provided by the env.py) in addition to .py
 files. How hard could that be? ;-)


 UHHH Alembic *doesn't* support this?! Are you kidding me? Fuckit, I'm
 sticking to sqlalchemy-migrate.






-- 
Mauricio de Abreu Antunes
Mobile: (51)930-74-525
Skype: mauricio.abreua





Re: [sqlalchemy] Query and compiled_cache

2013-05-30 Thread Claudio Freire
On Thu, May 30, 2013 at 2:25 PM, Michael Bayer mike...@zzzcomputing.com wrote:

 On May 30, 2013, at 11:25 AM, Claudio Freire klaussfre...@gmail.com wrote:


 That way, one could use the second form up there and benefit from
 query hashing, because session/param binding wouldn't change the hash,
 and it would be a cache hit. Has it been explored already? Or maybe
 there's something wrong on how I'm using the compiled_cache thing?
 Should I start patching? ;-)

 there is a very wide gap between working with the baked query recipe, for 
 which I'm not aware of all the errors you refer to so more detail would help

Um... I don't remember the exact problems, will have to try it again.
I just assumed it was targeted at an older SA release and gave up on
it rather quickly.

 and that of patching SQLAlchemy with new features.   In particular I can't 
 imagine how it would cache results, there is nothing like that in the recipe 
 or compiled cache system.

Something with sharing the context among query instances, since query
instances cache results (or so I heard). Not sure about the
specifics of how that happened, but I did notice, after patching up
the recipe to not throw exceptions, that it would always return the
results for the first call. The queries themselves had no bind
parameters or anything, they would just return different results each
time due to concurrent updates to the database. Maybe that wasn't
expected for that recipe? In any case, I just gave up on it without
looking into it much.

 If you want to work on a feature that is actually going to change SQLAlchemy, 
 (and would that be before or after you finish #2720? :) ), it would be:

After. I didn't forget; just real-life, real-work priorities made me
veer away from it. Since it was for 0.9, I judged I could safely delay
2720 a bit while I take care of work-related priorities ;-)

 1. there is no bake() method, the entire thing is transparent

 2. as one calls session.query(X).filter(Y).join(Z), the hashing scheme is 
 working the entire time in the background against a single LRU cache, that is 
 perhaps per sessionmaker(), or otherwise, to produce cached versions of the 
 Query.  Everyone would get the performance gains for free in all cases.   A 
 flag or other system would exist to turn the feature off for those who are 
 heavily subclassing Query and having issues or other edge backwards 
 compatible issues.

That would mean Query objects would by default take the compiled_cache
from the session, and not only during flushes but always. If that's
alright, sure.

 3. it would just be on by default in a new major release like 0.9 or 1.0

I had assumed that much.

 4. it would have a super crapload of very complete and clean unit tests.

Ehm... I would imagine all the current tests involving Query would
cover most of it. A few more cache-specific tests could be added
surely, but only to check caching is indeed happening, correctness
should be checked by existing tests already.

 Otherwise, the bake() recipe as it is can be enhanced or augmented with 
 __hash__() methods and all that but I'm not aware of anything regarding it 
 that would require changes to SQLAlchemy itself, since it uses a Query 
 subclass.

Well, yeah, I guess so. But that subclass would have to step on all of
Query's methods to be able to compute, cache and update the hash
(computing it always would be almost as costly as compiling, so it has
to be cached in an instance attribute). That'd be a chore, and it
would break every other release.





Re: [sqlalchemy] Query and compiled_cache

2013-05-30 Thread Michael Bayer

On May 30, 2013, at 2:28 PM, Claudio Freire klaussfre...@gmail.com wrote:

 
 4. it would have a super crapload of very complete and clean unit tests.
 
 Ehm... I would imagine all the current tests involving Query would
 cover most of it. A few more cache-specific tests could be added
 surely, but only to check caching is indeed happening, correctness
 should be checked by existing tests already.

well the current tests suffer very much from years of being integration tests 
and not unit tests.   These tests would actually do a terrible job of testing 
this cache, as almost all tests use a brand new mapping, a brand new session, 
and emit just one query.  none of the things that can go wrong with caching, 
such as the multiple result issue you're describing with the bake() recipe, 
would be exercised by current tests.Memory leaks, subtle changes in 
queries, all that stuff.It also sort of depends on how the feature comes 
out, how hard it will be to verify its correctness.

 
 Otherwise, the bake() recipe as it is can be enhanced or augmented with 
 __hash__() methods and all that but I'm not aware of anything regarding it 
 that would require changes to SQLAlchemy itself, since it uses a Query 
 subclass.
 
 Well, yeah, I guess so. But that subclass would have to step on all of
 Query methods to be able to compute, cache and update the hash
 (computing it always would be almost as costly as compiling, so it has
 to be cached in an instance attribute). That'd be a chore, and it
 would break every other release.

Can't a Query generate its hash from its current state, without generative
methods being called? Otherwise, the generative methods do run through a
common system, which is the @_generative decorator.

I might guess that the way Query stores its state could be changed to make this
whole job easier.    That is, if it stored state using a whole system of
objects with a known combined hashing scheme, rather than ad-hoc things like
_whereclause, _order_by, etc.

I'm not sure how to do this but it would certainly be some kind of system that 
is largely transparent even to the internals - the internals might need to 
change how they represent state changes, but there wouldn't be any explicit 
mention of hashes inside of say filter_by(), for example.
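
A purely hypothetical illustration of that common choke point (this is not
SQLAlchemy's actual decorator, whose real form lives in orm/query.py; the
_cached_hash attribute is invented):

    def _generative(fn):
        # every builder method clones the Query; the same spot could drop a
        # memoized hash so it gets recomputed lazily on the altered clone
        def decorated(self, *args, **kw):
            new = self._clone()
            new.__dict__.pop('_cached_hash', None)  # state changed
            fn(new, *args, **kw)
            return new
        return decorated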








Re: [sqlalchemy] Query and compiled_cache

2013-05-30 Thread Michael Bayer
my first 25 seconds of looking at this reveals that if you want to be able to 
generate a hash, this has to go all the way down to everything.   
query.filter(X == Y) means you need a hash for X == Y too.These hashes are 
definitely going to be determined using a traversal scheme for sure:

q = X == Y

q._magic_hash_value_()

will ask X, operator.eq, and Y for their hash values (X and Y, assuming
they are Column objects, are considered to be immutable, even though they can
be copies of X and Y, sometimes with different semantics), and combine them
together.

So some_select_statement._magic_hash_value_() would traverse all the way down 
as well.

This is why object identity was a lot easier to work with.









Re: [sqlalchemy] Query and compiled_cache

2013-05-30 Thread Claudio Freire
On Thu, May 30, 2013 at 3:45 PM, Michael Bayer mike...@zzzcomputing.com wrote:

 On May 30, 2013, at 2:28 PM, Claudio Freire klaussfre...@gmail.com wrote:


 4. it would have a super crapload of very complete and clean unit tests.

 Ehm... I would imagine all the current tests involving Query would
 cover most of it. A few more cache-specific tests could be added
 surely, but only to check caching is indeed happening, correctness
 should be checked by existing tests already.

 well the current tests suffer very much from years of being integration tests 
 and not unit tests.   These tests would actually do a terrible job of testing 
 this cache, as almost all tests use a brand new mapping, a brand new session, 
 and emit just one query.  none of the things that can go wrong with caching, 
 such as the multiple result issue you're describing with the bake() recipe, 
 would be exercised by current tests.Memory leaks, subtle changes in 
 queries, all that stuff.It also sort of depends on how the feature comes 
 out, how hard it will be to verify its correctness.

Um... that might make the task a lot bigger than it should be. I'll
have to look into it.

 Otherwise, the bake() recipe as it is can be enhanced or augmented with 
 __hash__() methods and all that but I'm not aware of anything regarding it 
 that would require changes to SQLAlchemy itself, since it uses a Query 
 subclass.

 Well, yeah, I guess so. But that subclass would have to step on all of
 Query methods to be able to compute, cache and update the hash
 (computing it always would be almost as costly as compiling, so it has
 to be cached in an instance attribute). That'd be a chore, and it
 would break every other release.

 Can't a Query generate its hash from its current state, without generative 
 methods being called ? Otherwise, the generative methods do run through a 
 common system, which is the @_generative decorator.

I'm not familiar enough with Query internals yet to answer this. But I'll
look into it. I've been thinking that if caching is conditional on
nothing structural about the query changing, and if we just want to
support that pattern I mentioned above (where you have a global query
object from which you build session-bound ones with with_session),
it could be as cheap as taking the internals' identity as the hash. That
wouldn't work for the usual query-building patterns, but then again,
when you build a new object, you're already paying a cost similar to
compiling, so caching would only really benefit the case where you
cache the expression externally.

In case I'm not clear, this would not be cached if I were to take id(internals):

query(Blah).filter(blah).join(blah).first()

But I don't care, because that's expensive on its own. This would:

q = query(Blah).filter(blah).join(blah)

...
q2 = q.with_session(S).params(blah).first()


On Thu, May 30, 2013 at 4:05 PM, Michael Bayer mike...@zzzcomputing.com wrote:
 On May 30, 2013, at 2:28 PM, Claudio Freire klaussfre...@gmail.com wrote:

 On Thu, May 30, 2013 at 2:25 PM, Michael Bayer mike...@zzzcomputing.com 
 wrote:

 If you want to work on a feature that is actually going to change 
 SQLAlchemy, (and would that be before or after you finish #2720? :) ), it 
 would be:

 After, I didn't forget, just real life real work priorities made me
 veer away from it. Since it was for 0.9, I judged I could safely delay
 2720 a bit while I take care of work related priorities ;-)

 also, I find an overhaul to Query such that it's self-hashing a lot more 
 interesting than #2720.  It would be a much bigger performance savings and it 
 would apply to other interpreters like pypy too.Replacements of tiny 
 sections of code with C, not that interesting :) (redoing all the C in pyrex 
 is more interesting but not necessarily a priority).

The C extension is already done, and I think I sent the latest
version, haven't I?

The only thing remaining of 2720 is turning it all into pyrex code.





Re: [sqlalchemy] Query and compiled_cache

2013-05-30 Thread Michael Bayer

On May 30, 2013, at 2:28 PM, Claudio Freire klaussfre...@gmail.com wrote:

 On Thu, May 30, 2013 at 2:25 PM, Michael Bayer mike...@zzzcomputing.com 
 wrote:
 
 If you want to work on a feature that is actually going to change 
 SQLAlchemy, (and would that be before or after you finish #2720? :) ), it 
 would be:
 
 After, I didn't forget, just real life real work priorities made me
 veer away from it. Since it was for 0.9, I judged I could safely delay
 2720 a bit while I take care of work related priorities ;-)

also, I find an overhaul to Query such that it's self-hashing a lot more 
interesting than #2720.  It would be a much bigger performance savings and it 
would apply to other interpreters like pypy too.Replacements of tiny 
sections of code with C, not that interesting :) (redoing all the C in pyrex is 
more interesting but not necessarily a priority).







Re: [sqlalchemy] Query and compiled_cache

2013-05-30 Thread Michael Bayer

On May 30, 2013, at 3:10 PM, Claudio Freire klaussfre...@gmail.com wrote:

 On Thu, May 30, 2013 at 3:45 PM, Michael Bayer mike...@zzzcomputing.com 
 wrote:
 
 On May 30, 2013, at 2:28 PM, Claudio Freire klaussfre...@gmail.com wrote:
 
 
 4. it would have a super crapload of very complete and clean unit tests.
 
 Ehm... I would imagine all the current tests involving Query would
 cover most of it. A few more cache-specific tests could be added
 surely, but only to check caching is indeed happening, correctness
 should be checked by existing tests already.
 
 well the current tests suffer very much from years of being integration 
 tests and not unit tests.   These tests would actually do a terrible job of 
 testing this cache, as almost all tests use a brand new mapping, a brand new 
 session, and emit just one query.  none of the things that can go wrong with 
 caching, such as the multiple result issue you're describing with the bake() 
 recipe, would be exercised by current tests.Memory leaks, subtle changes 
 in queries, all that stuff.It also sort of depends on how the feature 
 comes out, how hard it will be to verify its correctness.
 
 Um... that might make the task a lot bigger than it should be. I'll
 have to look into it.

if you look at the history of major feature adds, the actual cleverness to
implement the feature is really like 10% of the effort.   It's about the tests,
the documentation, the upgrade paths, all of that.   If all I did was throw
balls of clever at SQLAlchemy all day, and I have plenty, it would be a huge
pile of colorful mud.    All the support work is the price we pay for the joy
of clever new features.

 
 
 Can't a Query generate its hash from its current state, without generative 
 methods being called ? Otherwise, the generative methods do run through 
 a common system, which is the @_generative decorator.
 
 I'm not so familiar with Query internals yet to answer this. But I'll
 look into it. I've been thinking, that if caching is conditional on
 nothing structural about the query changing, and if we just want to
 support that pattern I mentioned above (where you have a global query
 object from which you build session-bound ones with with_session),
 it could be as cheap as taking the internals' identity as hash. That
 wouldn't work for the usual query building patterns, but then again,
 when you build a new object, you're already paying a cost similar to
 compiling, so caching would only really benefit the case where you
 cache the expression externally.
 
 In case I'm not clear, this would not be cached if I were to take 
 id(internals)
 
 query(Blah).filter(blah).join(blah).first()
 
 But I don't care, because that's expensive on its own.

result caching (because we're calling first()) is a whole different thing.
The dogpile.cache recipe addresses that - it's a widely used recipe and sure,
that would make a great feature too; I keep it as a recipe so that people who
use it have spent time understanding it.

Recipes are for features that I don't have time to make 300% bulletproof,
basically.








Re: [sqlalchemy] Query and compiled_cache

2013-05-30 Thread Claudio Freire
On Thu, May 30, 2013 at 4:19 PM, Michael Bayer mike...@zzzcomputing.com wrote:
 In case I'm not clear, this would not be cached if I were to take 
 id(internals)

 query(Blah).filter(blah).join(blah).first()

 But I don't care, because that's expensive on its own.

 result caching (because we're calling first()) is a whole different thing.   
 the dogpile.cache recipe addresses that - its a widely used recipe and sure, 
 that would make a great feature too, I keep it as a recipe so that people who 
 use it have spent time understanding it.


Oh no, I was referring to the query, not the result. Results will never
be cached with what I propose.





Re: [sqlalchemy] Query and compiled_cache

2013-05-30 Thread Michael Bayer
a very brief example of this, which if you keep digging in you can see how 
tricky it gets (fast), is like:

from sqlalchemy.sql import column, table

t1 = table('t1', column('x'), column('y'))

t2 = table('t1', column('x'), column('y'))

t3 = table('t2', column('p'), column('r'))

t4 = table('t2', column('r'), column('p'))

assert t1._hash == t2._hash
assert t3._hash != t4._hash


the patch to produce the above is below.  Note that Table/Column are easier to
hash than table()/column(), since we treat the upper case versions as
singletons.  There is a lot more state that needs to be taken into account
though, like the _annotations dictionary on every ClauseElement.  In the case
where an element doesn't define a fixed _hash, the usage of a new instance of
that element in an ad-hoc Query means that whole Query can't be cached, because
the element would have a different id each time (though dangerously, that
id() can be reused when the original is garbage collected... that's an issue
actually, we might instead need to use a counter for that case).


diff --git a/lib/sqlalchemy/sql/expression.py b/lib/sqlalchemy/sql/expression.py
index 5820cb1..d5de299 100644
--- a/lib/sqlalchemy/sql/expression.py
+++ b/lib/sqlalchemy/sql/expression.py
@@ -1669,6 +1669,7 @@ class ClauseElement(Visitable):
 
         c = self.__class__.__new__(self.__class__)
         c.__dict__ = self.__dict__.copy()
+        c.__dict__.pop('_hash', None)
         ClauseElement._cloned_set._reset(c)
         ColumnElement.comparator._reset(c)
 
@@ -1681,6 +1682,10 @@ class ClauseElement(Visitable):
 
         return c
 
+    @util.memoized_property
+    def _hash(self):
+        return id(self)
+
     @property
     def _constructor(self):
         """return the 'constructor' for this ClauseElement."""
@@ -2421,6 +2426,10 @@ class ColumnCollection(util.OrderedProperties):
         self._data.update((c.key, c) for c in cols)
         self.__dict__['_all_cols'] = util.column_set(self)
 
+    @util.memoized_property
+    def _hash(self):
+        return hash(tuple(c._hash for c in self))
+
     def __str__(self):
         return repr([str(c) for c in self])
 
@@ -4432,6 +4441,17 @@ class ColumnClause(Immutable, ColumnElement):
         self.type = sqltypes.to_instance(type_)
         self.is_literal = is_literal
 
+    @util.memoized_property
+    def _hash(self):
+        return hash(
+            (
+                hash(self.key),
+                hash(self.table.name),  # note: using self.table here causes an endless loop
+                self.type._hash,
+                hash(self.is_literal)
+            )
+        )
+
     def _compare_name_for_result(self, other):
         if self.is_literal or \
             self.table is None or \
@@ -4586,6 +4606,15 @@ class TableClause(Immutable, FromClause):
         for c in columns:
             self.append_column(c)
 
+    @util.memoized_property
+    def _hash(self):
+        return hash(
+            (
+                hash(self.name),
+                self._columns._hash,
+            )
+        )
+
     def _init_collections(self):
         pass
 
diff --git a/lib/sqlalchemy/types.py b/lib/sqlalchemy/types.py
index bfff053..16834d1 100644
--- a/lib/sqlalchemy/types.py
+++ b/lib/sqlalchemy/types.py
@@ -59,6 +59,10 @@ class TypeEngine(AbstractType):
     def __reduce__(self):
         return _reconstitute_comparator, (self.expr, )
 
+    @property
+    def _hash(self):
+        return id(self)  # default: same value as __hash__() if no specific hash is defined
+
     hashable = True
     """Flag, if False, means values from this type aren't hashable."""
 




[sqlalchemy] DISTINCT with LIMIT problem

2013-05-30 Thread Kent
We use func.count().over() in order to help support result pagination.   
When attempting to limit the result set, I have found that if other tables 
are being joined (for the where clause, but not selected), then I need to 
add DISTINCT to the query or else the Cartesian result of my query messes 
up LIMIT.  (There are, say, 72 rows returned, where this only represents 17 
distinct records, for example.)

For example, a query may look like this:

select distinct
  count(*) over () as recordcount, tablea.colx, tableb.coly
from tablea, tableb
where ...
limit 100

This doesn't *quite* work because the analytical window function count(*) 
over() is applied *before* the distinct, so the count returns the wrong 
number (a Cartesian effect, returning 72 instead of 17, in this example).  

I have two potential solutions: 

Solution A:

Group by all columns (yielding the same effect as distinct), but which 
makes the window analytical function process *after* the group by and 
yields the correct count (17 instead of 72):

select count(*) over () as recordcount, tablea.colx, tableb.coly
from tablea, tableb
where ...
group by tablea.colx, tableb.coly *[all columns]*
limit 100

OR Solution B:

Put the count(*) over () in an outer select, like this:

select count(*) over () as recordcount, anon.* 
from (
  select distinct tablea.colx, tableb.coly
  from tablea, tableb
) as anon
limit 100

Either solution yields the correct answer I believe, but I'm having 
difficulty translating the SQL that I know will work into sqlalchemy land.

For Solution A, in the case of wanting to group by, I don't know how to get
the full list of all selected columns to add to the group_by in such a way
that even joinedload() will be included in the group by:

q = Session.query(class).join(joins).filter(...).options(joinedload(...))

q = q.group_by(???)  # How to tell sqlalchemy to group by all selected
                     # columns, even those which will be join loaded?

q = q.add_column(func.count().over().label('recordcount'))
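
(A partial sketch of one way to reach the mapped columns, assuming 0.8's
inspect() and a hypothetical mapped class SomeClass; note it still misses the
joinedload() columns, which is exactly the difficulty:)

    from sqlalchemy import inspect

    # group by every column mapped directly on the class; columns pulled in
    # by eager loads are NOT captured here
    q = q.group_by(*inspect(SomeClass).columns)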

For Solution B, I don't know how to wrap my query in an outer select 
(similar to the LIMIT implementation for Oracle) in a way that will still 
allow sqlalchemy to extract rows into instances:

This renders the correct SQL, I think:

qry = Session.query(qry.with_labels().statement,
func.count().over().label('recordcount'))

But since I'm using a SQL statement here, sqlalchemy won't translate result
rows into object instances.

Can you point me in the right direction for one of these 2 solutions, 
please?

Many thanks,
Kent





Re: [sqlalchemy] DISTINCT with LIMIT problem

2013-05-30 Thread Michael Bayer

On May 30, 2013, at 5:19 PM, Kent jkentbo...@gmail.com wrote:

 
 Solution A:
 
 Group by all columns (yielding the same effect as distinct), but which 
 makes the window analytical function process after the group by and yields 
 the correct count (17 instead of 72):

are all those columns indexed?  even if they are, crappy query...

 
 OR Solution B:
 
 Put the count(*) over () in an outer select, like this:
 
 select count(*) over () as recordcount, anon.* 
 from (
   select distinct tablea.colx, tableb.coly
   from tablea, tableb
 ) as anon
 limit 100

this is very much how SQLAlchemy wants you to do it.

 
 Either solution yields the correct answer I believe, but I'm having 
 difficulty translating the SQL that I know will work into sqlalchemy land.
 
 
 For Solution B, I don't know how to wrap my query in an outer select (similar 
 to the LIMIT implementation for Oracle) in a way that will still allow 
 sqlalchemy to extract rows into instances:

from_self() can do this (probably use add_column() for the window function), or
subquery() should work very well:

    subq = q.subquery()
    q = query(func.count('*').over().label(..), subq)
    ...

send me a quick example if that's not working and I'll work out the query.
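
For the archives, a rough sketch of that suggestion applied to the pagination
case above (TableA/TableB are hypothetical mapped classes standing in for
tablea/tableb, and the join/filter are illustrative):

    from sqlalchemy import func
    from sqlalchemy.orm import aliased

    # inner SELECT DISTINCT, as in Solution B
    inner = (Session.query(TableA)
             .join(TableB)
             .filter(TableB.coly != None)   # placeholder WHERE clause
             .distinct()
             .subquery())

    # count(*) over () applied outside the DISTINCT, so it counts 17, not 72
    TableASub = aliased(TableA, inner)
    q = (Session.query(func.count('*').over().label('recordcount'), TableASub)
         .limit(100))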





Re: [sqlalchemy] Query and compiled_cache

2013-05-30 Thread Michael Bayer
my next thought is, if something isn't distinctly hashable, then it should
cancel being hashable entirely. this patch shows it using a symbol
"unhashable": https://gist.github.com/zzzeek/5681612 .   If any construct has
an unhashable inside of it, then that construct is unhashable too.

The hashing thing really has to start as a core concept first.   It's a big job 
but would be very helpful for caching scenarios and would allow us to build 
this feature on Query without too much difficulty.  The nice thing about 
unhashable is that simple queries will be hashable, but as soon as complexity 
increases you'd start seeing unhashables come in, preventing us from caching 
something that isn't actually easy to cache.

this could be really nice, could be a nice 0.9 focus, as I haven't found 0.9's 
big change yet (other than the 2to3 removal).
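
A stripped-down illustration of that propagation rule (the real patch is in
the gist above; the names here are invented):

    unhashable = object()   # sentinel standing in for the symbol in the gist

    def combined_hash(children):
        parts = []
        for child in children:
            h = getattr(child, '_hash', unhashable)
            if h is unhashable:
                return unhashable   # one unhashable child poisons the parent
            parts.append(h)
        return hash(tuple(parts))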



Re: [sqlalchemy] DISTINCT with LIMIT problem

2013-05-30 Thread Kent
Thank you, I'll try that, but quick concern: I specifically skipped
trying to use .subquery() because the docs say "Eager JOIN generation
within the query is disabled."

Doesn't that mean I won't get my joinedload() results from the inner query?

Or does that refer to the outer query having eager join disabled?










Re: [sqlalchemy] DISTINCT with LIMIT problem

2013-05-30 Thread Michael Bayer

On May 30, 2013, at 6:06 PM, Kent jkentbo...@gmail.com wrote:

 Thank you, I'll try that, but quick concern:  I specifically skipped trying 
 to use .subquery() because the docs say Eager JOIN generation within the 
 query is disabled.
 
 Doesn't that mean I won't get my joinedload() results from the inner query?
 
 Or does that refer to the outer query having eager join disabled?


if you want to eager load also from that subquery, you need to stick it into an
aliased():


MySubqClass = aliased(MyClass, subq)

query(x, MySubqClass).options(joinedload(MySubqClass.foobar))
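
Tied back to the pagination sketch earlier in the thread (names hypothetical;
'items' is an assumed relationship on TableA):

    from sqlalchemy import func
    from sqlalchemy.orm import aliased, joinedload

    subq = Session.query(TableA).join(TableB).distinct().subquery()
    TableASub = aliased(TableA, subq)
    q = (Session.query(func.count('*').over().label('recordcount'), TableASub)
         .options(joinedload(TableASub.items))
         .limit(100))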




 
 




Re: [sqlalchemy] [Q][0.7.9] How to issue apply_labels() on an ORM query?

2013-05-30 Thread Michael Bayer
this is very helpful because you are here running into an older feature that I 
think is not very applicable to modern usage, not to mention not terrifically 
documented, so I've added 
http://www.sqlalchemy.org/trac/attachment/ticket/2736/ to address changing the 
role of select_from() to be more what it says, and to offload a particular 
magic aspect of it into a new method called select_entity_from().   

if you look at the docstrings added to the patch there, you may get a sense for 
what's been going on.   I'll probably commit this soon and those docs will be 
up on the site.  There will be an upgrade path for users in this case.
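
In terms of the CTE example from this thread, the split described in that
patch would look roughly like this (sketch only; select_entity_from() is the
new method, and Partner/q_cte_union come from the test case above):

    # legacy behavior, made explicit: Partner entity rows come from the CTE
    q = session.query(Partner).select_entity_from(q_cte_union)

    # plain select_from(): only establishes the FROM list; join explicitly
    q = session.query(Partner).select_from(q_cte_union).join(
        Partner, Partner.id == q_cte_union.c.partner_id)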




