Re: [sqlalchemy] [Q][SA 0.7.9] yield_per() vs. joinedload() and subqueryload()

Michael Bayer Wed, 05 Jun 2013 07:28:43 -0700


the Query emitted by subqueryload does not use yield_per.  so if your total 
result set is 1000 rows, and the total rows represented by all the collections 
is 10000,  the first time that query is emitted, 10000 rows will be fully 
loaded and processed into memory at once.  This would occur typically somewhere 
in the first few rows of your 50 -row yield_per batch.   So the intent of 
yield_per, which is to conserve memory and upfront loading overhead, would be 
defeated entirely by this.


The subqueryload is emitting once if I recall correctly because I probably at 
some point have the query result being memoized in the query context to prevent 
it from being emitted many times in a yield_per scenario.


On Jun 5, 2013, at 10:20 AM, Ladislav Lenart <lenart...@volny.cz> wrote:

> Hello.
> 
> One more note. I've just tried the below experiment with joinedload() instead 
> of
> subqueryload() and that does NOT work just as you expected. One contact is
> returned several times and the first occurrences have incomplete phones.
> 
> However my experiments with subqueryload() suggest that it works just fine 
> with
> yield_per(). Can you elaborate on that one please?
> 
> 
> Thank you,
> 
> Ladislav Lenart
> 
> 
> On 5.6.2013 14:03, Ladislav Lenart wrote:
>> Hello.
>> 
>> I've tried an experiment to verify that yield_per() with subqueryload() 
>> behaves
>> as badly as you described, but according to my practical observation, it 
>> issues
>> ONE subqueryload() and everything behaves as I would expect. It emits two
>> SELECTs, one to fetch the objects and the second one to fetch the related 
>> data
>> and that's about it, see below. So back to square one, what is wrong with it?
>> 
>> 
>> Base = declarative_base()
>> 
>> class Contact(Base):
>>    # Just a sketch, I hope you get the picture.
>>    id,
>>    name,
>>    phones = relationship(Phone) # many
>> 
>> class Phone(Base):
>>    # Just a sketch, I hope you get the picture.
>>    id,
>>    number,
>>    contact = relationship(Contact) # one
>> 
>> # Setup engine with echo set to True.
>> 
>> phones = ['123456789', '987654321', '555777999']
>> for i in range(1, 11):
>>    c = Contact(name=u' '.join([u'Contact', unicode(i)]))
>>    session.add(c)
>>    session.add_all(Phone(contact=c, number=e) for e in phones)
>> session.flush()
>> session.expunge_all()
>> 
>> q = session.query(Contact).options(subqueryload(Contact.phones))
>> for each in q.yield_per(2):
>>    print each.last_name
>>    for e in each.phones:
>>        print e
>> 
>> The output is like this:
>> SA info about all the inserts after session.flush().
>> SA info about select for contacts.
>> SA info about select for their phones.
>> Contact 10
>> 123456789
>> 987654321
>> 555777999
>> Contact 9
>> 123456789
>> 987654321
>> 555777999
>> Contact 8
>> 123456789
>> 987654321
>> 555777999
>> Contact 7
>> 123456789
>> 987654321
>> 555777999
>> Contact 6
>> 123456789
>> 987654321
>> 555777999
>> Contact 5
>> 123456789
>> 987654321
>> 555777999
>> Contact 4
>> 123456789
>> 987654321
>> 555777999
>> Contact 3
>> 123456789
>> 987654321
>> 555777999
>> Contact 2
>> 123456789
>> 987654321
>> 555777999
>> Contact 1
>> 123456789
>> 987654321
>> 555777999
>> 
>> 
>> Thank you,
>> 
>> Ladislav Lenart
>> 
>> 
>> On 5.6.2013 11:26, Ladislav Lenart wrote:
>>> Hello.
>>> 
>>> I have one more question about my approach to WindowedRangeQuery:
>>> 
>>>> * Suppose a complex query q with join(...), filter(...) and options(...).
>>>> * I need to create q2 from q such that:
>>>> * It has no options.
>>>>   * Can I reset the options with q = q.options(None)?
>>>> * It has select with the over by magic taken from WindowedRangeQuery 
>>>> recipe.
>>>>   * I know I can use with_entities() for this.
>>>> * I will use q2 to obtain the window ranges.
>>>> * I will iterate over the window ranges and apply each to the original q 
>>>> and
>>>> execute it in a loop.
>>> 
>>> Suppose I have a complex query (without options() / order_by()) to obtain 
>>> the
>>> objects I want. I use this query to calculate window intervals. I don't 
>>> have to
>>> use it again to fetch the objects, because I already have their id 
>>> intervals. Am
>>> I right?
>>> 
>>> 
>>> Thank you,
>>> 
>>> Ladislav Lenart
>>> 
>>> 
>>> On 4.6.2013 19:15, Ladislav Lenart wrote:
>>>> On 4.6.2013 18:49, Michael Bayer wrote:
>>>>> 
>>>>> On Jun 4, 2013, at 12:28 PM, Ladislav Lenart <lenart...@volny.cz> wrote:
>>>>> 
>>>>>> 
>>>>>>> Absolutely, you should do whatever you have to in order to get the range
>>>>>>> you want, in fact the recipe even says this...
>>>>>> 
>>>>>> Ok. What I want to do is basically the following:
>>>>>> * Suppose a complex query q with join(...), filter(...) and options(...).
>>>>>> * I need to create q2 from q such that:
>>>>>> * It has no options.
>>>>>>   * Can I reset the options with q = q.options(None)?
>>>>>> * It has select with the over by magic taken from WindowedRangeQuery 
>>>>>> recipe.
>>>>>>   * I know I can use with_entities() for this.
>>>>>> * I will use q2 to obtain the window ranges.
>>>>>> * I will iterate over the window ranges and apply each to the original q 
>>>>>> and
>>>>>> execute it in a loop.
>>>>>> 
>>>>>> Can this strategy work?
>>>>> 
>>>>> there's not a publicly supported feature to "reset" the options right now 
>>>>> so you'd probably need to apply them after you get your window ranging 
>>>>> query.      You can probably remove the effect of eager loaders if you 
>>>>> were to just assign a blank _attributes dictionary to the new query 
>>>>> though (just no long term guarantees).
>>>>> 
>>>>> other than that, I use the windowing concept extensively and it works 
>>>>> very well.
>>>> 
>>>> Ok, I will try it.
>>>> 
>>>> Thank you very much for your invaluable insights,
>>>> 
>>>> Ladislav Lenart
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to sqlalchemy+unsubscr...@googlegroups.com.
> To post to this group, send email to sqlalchemy@googlegroups.com.
> Visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
> For more options, visit https://groups.google.com/groups/opt_out.
> 
> 

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.

Re: [sqlalchemy] [Q][SA 0.7.9] yield_per() vs. joinedload() and subqueryload()

Reply via email to