Re: [sqlalchemy] [SA0.7] Possible feature request for subqueryload?

2013-09-26 Thread Ladislav Lenart
Hello.


On 25.9.2013 17:15, Michael Bayer wrote:
 
 On Sep 25, 2013, at 10:11 AM, Ladislav Lenart lenart...@volny.cz wrote:
 
 Hello.

 Would it be possible to make these two forms

session.query(cls).options(
subqueryload(cls.foos),
subqueryload(cls.foos, Foo.bar),
)

 and

session.query(cls).options(
subqueryload_all(cls.foos, Foo.bar)
)

 completely equivalent, i.e. subqueryload_all being just a shortcut of the 
 former
 if the paths have the above pattern (i.e. if one subqueryload is subsumed by
 another)?
 
 unless there's some bug I'm unaware of, those two forms are exactly 
 equivalent.   XYZload_all(a.b.c) is a shortcut to XYZload(a), 
 XYZLoad(a.b), XYZLoad(a.b.c).In 0.9 I'm working out a new system 
 backing the loader options that will allow more intuitive patterns, e.g. 
 load(cls).subqueryload(foos).subqueryload(bar).

Hmm, that's not what I see here. I captured SQL Alchemy debug log (echo=True) of
one such query and modified it to be easier to read, added corresponding python
code and also my remarks/questions. Please look at it, though it's rather 
long...


First, a query that returns ids of ALL Partner instances of the result (this is
how I do windowed queries).

[python]
session.query(distinct(Partner.id)).filter(
Partner.visible == True,
Partner.sponsor_id == self.id,
)

[sql]
SELECT DISTINCT partner.id AS anon_1
FROM partner
WHERE partner.visible = true AND partner.sponsor_id = %(sponsor_id_1)s

This result list of (partner) ids is then divided to windows of size 100. Each
window is queried individually. Following are the queries to completely populate
one such window.


Main query to populate data of max 100 partners (one window).

[python]
# win is a list of 100 ids.
session.query(Partner).filter(
Partner.id.in_(win)
).options(
joinedload(Partner.subject, innerjoin=True),
joinedload(Partner.subject, Subject.subject_personal, innerjoin=True),
joinedload(Partner.subject, Subject.subject_personal, 
SubjectPersonal.address),
subqueryload(Partner.subject, Subject.contacts),
subqueryload(Partner.subject, Subject.contacts, Contact.contact_personal),
subqueryload(Partner.subject, Subject.contacts, Contact.contact_personal,
ContactPersonal.contact_address),
subqueryload(Partner.subject, Subject.contacts, Contact.contact_personal,
ContactPersonal.permanent_address),
subqueryload_all(Partner.subject, Subject.tag_subjects, TagSubject.tag),
subqueryload_all(Partner.partner_regions),
)

[sql]
SELECT
partner.state AS partner_state,
partner.position AS partner_position,
partner.own_points AS partner_own_points,
partner.net_points AS partner_net_points,
partner.net_admin AS partner_net_admin,
partner.team_leader AS partner_team_leader,
partner.bazaar_owner AS partner_bazaar_owner,
partner.bazaar_extractor AS partner_bazaar_extractor,
partner.desired_team_leader AS partner_desired_team_leader,
partner.desired_bazaar_owner AS partner_desired_bazaar_owner,
partner.desired_bazaar_extractor AS partner_desired_bazaar_extractor,
partner.visible AS partner_visible,
partner.sms_credits AS partner_sms_credits,
partner.id AS partner_id,
partner.external_id AS partner_external_id,
partner.subject_id AS partner_subject_id,
partner.sponsor_id AS partner_sponsor_id,
partner.team_id AS partner_team_id,
partner.bazaar_id AS partner_bazaar_id,
partner.higher_bazaar_id AS partner_higher_bazaar_id
FROM partner
WHERE partner.id IN (...)


[python]
joinedload(Partner.subject, innerjoin=True),
joinedload(Partner.subject, Subject.subject_personal, innerjoin=True),
joinedload(Partner.subject, Subject.subject_personal, SubjectPersonal.address,
innerjoin=True),

corresponds to

[sql]
SELECT
subject.created_at AS subject_created_at,
subject.id AS subject_id,
subject.subject_personal_id AS subject_subject_personal_id,
subject.subject_corporate_id AS subject_subject_corporate_id,
subject.creator_id AS subject_creator_id,
anon_1.partner_subject_id AS anon_1_partner_subject_id,
address_1.street AS address_1_street,
address_1.postal_code AS address_1_postal_code,
address_1.city AS address_1_city,
address_1.country AS address_1_country,
address_1.id AS address_1_id,
subject_personal_1.title_prefix AS subject_personal_1_title_prefix,
subject_personal_1.title_suffix AS subject_personal_1_title_suffix,
subject_personal_1.first_name AS subject_personal_1_first_name,
subject_personal_1.last_name AS subject_personal_1_last_name,
subject_personal_1.identification_number AS
subject_personal_1_identification_number,
subject_personal_1.birth_date AS subject_personal_1_birth_date,
subject_personal_1.mobile_personal AS subject_personal_1_mobile_personal,
subject_personal_1.phone_personal AS subject_personal_1_phone_personal,
subject_personal_1.phone_home AS subject_personal_1_phone_home,

Re: [sqlalchemy] [SA0.7] Possible feature request for subqueryload?

2013-09-26 Thread Ladislav Lenart
Thank you for pointing this out!

I did not realize that it has such a consequencies, though it is perfectly
logical. My bad. All joinedloads are part of the main query now, as they should.

Also, ignore my remark about Tag info being not loaded. When there actually are
any tags, they get loaded as expected (though I stil don't understand why the
SQL is different in both cases).

The only thing remaining is that the following subqueryloads are loaded in
isolation:

 subqry - /subject/contacts/
 subqry - /subject/contacts/contact_personal/
 subqry - /subject/contacts/contact_personal/contact_address/
 subqry - /subject/contacts/contact_personal/permanent_address/

And I just realized / understood why! Because subqueryload uses JOIN and all
above relations can be NULL. Is this correct?

I have two types of contacts, personal and corporate. The above loads:
* ALL Subject.contacts (i.e. personal AND CORPORATE)
* The rest only work with the personal.


Thank you very much! It has been enlightening as always,

Ladislav Lenart


On 26.9.2013 16:13, Michael Bayer wrote:
 
 On Sep 26, 2013, at 9:30 AM, Ladislav Lenart lenart...@volny.cz wrote:
 

 Your thoughts on this? Do you see anything suspicious?

 BTW the new system for 0.9 looks very nice!
 
 OK I only started reading the first few queries, but immediately what I think 
 is causing confusion here is that your loading options are conflicting with 
 each other, so the joinedload() you're looking for at the top is being 
 partially cancelled out:
 
 session.query(Partner).filter(
Partner.id.in_(win)
 ).options(
joinedload(Partner.subject, innerjoin=True),
joinedload(Partner.subject, Subject.subject_personal, innerjoin=True),
joinedload(Partner.subject, Subject.subject_personal, 
 SubjectPersonal.address),
subqueryload(Partner.subject, Subject.contacts),
subqueryload(Partner.subject, Subject.contacts, Contact.contact_personal),
subqueryload(Partner.subject, Subject.contacts, Contact.contact_personal, 
 ContactPersonal.contact_address),
subqueryload(Partner.subject, Subject.contacts, Contact.contact_personal, 
 ContactPersonal.permanent_address),
subqueryload_all(Partner.subject, Subject.tag_subjects, TagSubject.tag),
subqueryload_all(Partner.partner_regions),
 )
 
 
 the directive subqueryload_all(Partner.subject, Subject.tag_subjects, 
 TagSubject.tag) is after the joinedloads you have above, so the 
 joinedload(Partner.subject) is cancelled out by that, hence all the 
 additional joinedload() that build on top of that is pushed out to be part of 
 the subqueryloads.   
 
 you can read these directives just like paths.  The path is a unique key in 
 a dictionary.  as the loading traverses along each relationship, it looks up 
 the path, and the kind of loader called for.  the above is the equivalent of:
 
 joined - /subject/
 joined - /subject/subject_personal/
 joined - /subject/subject_personal/address/
 subqry - /subject/contacts/
 subqry - /subject/contacts/contact_personal/
 subqry - /subject/contacts/contact_personal/contact_address/
 subqry - /subject/contacts/contact_personal/permanent_address/
 subqry - /subject/   (overrides the joined)
 subqry - /subject/tag_subjects/
 subqry - /subject/tag_subjects/tag/
 subqry - /partner_regions/
 
 


-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/groups/opt_out.


Re: [sqlalchemy] [SA0.7] Possible feature request for subqueryload?

2013-09-25 Thread Michael Bayer

On Sep 25, 2013, at 10:11 AM, Ladislav Lenart lenart...@volny.cz wrote:

 Hello.
 
 Would it be possible to make these two forms
 
session.query(cls).options(
subqueryload(cls.foos),
subqueryload(cls.foos, Foo.bar),
)
 
 and
 
session.query(cls).options(
subqueryload_all(cls.foos, Foo.bar)
)
 
 completely equivalent, i.e. subqueryload_all being just a shortcut of the 
 former
 if the paths have the above pattern (i.e. if one subqueryload is subsumed by
 another)?

unless there's some bug I'm unaware of, those two forms are exactly equivalent. 
  XYZload_all(a.b.c) is a shortcut to XYZload(a), XYZLoad(a.b), 
XYZLoad(a.b.c).In 0.9 I'm working out a new system backing the loader 
options that will allow more intuitive patterns, e.g. 
load(cls).subqueryload(foos).subqueryload(bar).



 
 My motivation: I have a rather complex function (see below) used in several
 contexts that returns Query.options for class Subject. The code is written to
 eliminate duplication, but the current 'suboptimal' behaviour of subqueryload
 (when compared to subqueryload_all) forces me to implement (and maintain)
 several independent versions of this function.
 
 Or am I missing something that could help me (perhaps rephrase the loads
 differently)?

im not really sure.   the API of this function seems a little complex to me, 
it's exposing details of the persistence mechanism while at the same time 
trying to hide part of it.




 
 The function (on class Subject) currently looks like this:
 
@classmethod
def fetch_options(
cls,
prefix=None, alias=None,
fetch_name=True,
fetch_personal=True, fetch_corporate=True,
use_joinedload=True, use_innerjoin=True,
fetch_address=False,
fetch_tags=False,
fetch_all=False
):
Fetch subject info. Basic usage (note the '*'!):
session.query(Subject).options(*Subject.fetch_options())
Arguments:
* prefix - Use if Subject is not queried directly, e.g.:
session.query(Partner).options(
*Subject.fetch_options(prefix=[Partner.subject])
)
* alias - Specify if necessary, e.g.:
subj_alias = aliased(Subject)
session.query(subj_alias).options(
*Subject.fetch_options(alias=subj_alias)
)
* fetch_name - Fetch everything for Subject.display_name(partner). This
is the default.
* fetch_address - Fetch addresses-related info. Implies fetch_name.
* fetch_tags - Fetch tag-related info. Can be used on its own.
* fetch_personal - Set to False to supress fetching of any info about
persons.
* fetch_corporate - Set to False to supress fetching of any info about
corporations.
* fetch_all - Shortcut that implies all above.
* use_joinedload - joinedload() is used by default to fetch all 1:1
relationships. If prefix contains 1:N relationship(s), set this to
False and subqueryload() will be used instead.
* use_innerjoin - One of fetch_personal or fetch_corporate MUST be
True. If only one is set, all joinedload() will be INNER by default.
Set this to False to force the use of OUTER.

from zfp.model.contact import Contact, ContactPersonal, 
 ContactCorporate
from zfp.model.tag import TagSubject
if fetch_all:
fetch_name = True
fetch_personal = True
fetch_corporate=True
fetch_address=True
fetch_tags=True
elif fetch_address:
fetch_name = True
assert fetch_personal or fetch_corporate
use_innerjoin = use_innerjoin and not(fetch_personal and 
 fetch_corporate)
if use_joinedload:
def load_op(*args):
return joinedload(*args, innerjoin=use_innerjoin)
else:
def load_op(*args):
return subqueryload(*args)
if prefix is None:
prefix = []
if alias is None:
alias = cls
options = []
if fetch_name:
options.extend([
subqueryload(*prefix + [alias.contacts]),
])
if fetch_personal:
options.extend([
load_op(*prefix + [alias.subject_personal]),
subqueryload(*prefix + [alias.contacts,
 Contact.contact_personal]),
])
if fetch_corporate:
options.extend([
load_op(*prefix + [alias.subject_corporate]),
subqueryload(*prefix + [alias.contacts,
 Contact.contact_corporate]),
])
if fetch_address:
if fetch_personal:
options.extend([
load_op(*prefix + [alias.subject_personal,
 SubjectPersonal.address]),