Re: [sqlalchemy] Problem with eagerload and lazy='joined'

Ben Chess Sat, 09 Jul 2011 14:07:34 -0700

On Jul 8, 2011, at 8:24 PM, Michael Bayer wrote:

> On Jul 8, 2011, at 6:17 PM, Ben Chess wrote:
> 
>> Hmm, bummer.
>> 
>> We're pretty dependent on detached mode.  We eagerload a bunch of
>> things for the purposes of storing them in memcache, and things don't
>> go well when they're not fully traversable.  
> 
> OK so you're caching.    Actually when I worked with caching a lot I found 
> that I was looking for the objects to exist without their related 
> collections, then I'd use the Beaker recipe so that the collections etc. were 
> cached separately as they were lazy loaded.    It allowed the cache to 
> contain individual collections of things more discretely, rather than several 
> large trees tailored towards some specific loading scenario.    The more 
> granular level collections allowed the retrieval over memcached to not spend 
> time loading things that weren't needed for a particular view.


Okay, but let's not get off topic.  How the results are cached has nothing to 
do with the fact that eagerload just isn't reliable.  If I eagerload a few 
objects, I'd expect that no further SQL queries will be necessary to 
use those objects.  That's currently not the case.

I experimented a bit with trying to intelligently recurse down the paths 
defined by the eagers to see if there are any unloaded objects.  I didn't 
quite know what to do in the case of lists.  It felt like I was going to have 
to rewrite a lot of what the populators already do.
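
For illustration only, here is a minimal sketch of that kind of recursion.  It
is not the code I experimented with and not anything from SQLAlchemy itself.
It assumes "loaded" means the attribute name is present in the instance
__dict__ (the check suggested earlier in this thread), and it assumes a list
should be handled by checking every element.  The names unloaded_paths and
Node are hypothetical.

```python
def unloaded_paths(obj, path):
    """Return the dotted prefixes of `path` that are not loaded on `obj`.

    Assumption: an attribute counts as loaded when its name is a key in
    the instance __dict__.  Collections are checked element by element.
    """
    missing = set()

    def walk(node, names, prefix):
        # Nothing left to check, or a NULL many-to-one: stop here.
        if not names or node is None:
            return
        name, rest = names[0], names[1:]
        dotted = prefix + name
        if name not in vars(node):
            missing.add(dotted)
            return
        value = vars(node)[name]
        # Treat a collection as "every element must be traversable".
        children = value if isinstance(value, (list, tuple, set)) else [value]
        for child in children:
            walk(child, rest, dotted + ".")

    walk(obj, path.split("."), "")
    return sorted(missing)


class Node(object):
    """Plain stand-in for a mapped class (hypothetical, not SQLAlchemy)."""

    def __init__(self, **attrs):
        self.__dict__.update(attrs)


a = Node()            # b_row was never loaded on this A
c = Node(a_row=a)     # a_row is present on C
print(unloaded_paths(c, "a_row.b_row"))
```

A check like this only reports what is missing.  It does not load anything,
which is the part that overlaps with what the populators already do.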

My proposal: If there are eagers with paths longer than 1 attribute, always 
re-populate that attribute.  I'd rather have correctness over performance.  Or 
at least have the option.  Here's my patch: http://pastebin.com/j8pakaLP
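
To make the idea concrete, here is a toy model in plain Python.  It is not the
patch from the pastebin link and not mapper.py code.  The function populate,
its repopulate_deep flag and the Obj class are all hypothetical, and the model
ignores the identity map (in a real Session the A would be the same object
rather than a replacement).

```python
class Obj(object):
    """Plain stand-in for a mapped instance (hypothetical)."""


def populate(instance, loaded_values, eager_paths, repopulate_deep=False):
    """Toy model of populating an instance that is already loaded.

    loaded_values: attribute name -> value built from the current row.
    eager_paths:   dotted eager paths requested for this query.
    """
    # First attribute of every eager path longer than one attribute.
    deep_heads = set(p.split(".")[0] for p in eager_paths if "." in p)
    for name, value in loaded_values.items():
        already_loaded = name in instance.__dict__
        if already_loaded and not (repopulate_deep and name in deep_heads):
            # Behavior as described in this thread: keep what is there.
            continue
        # Proposed option: overwrite, trading performance for correctness.
        instance.__dict__[name] = value
    return instance


old_a = Obj()              # A loaded earlier, without b_row
c = Obj()
c.a_row = old_a

new_a = Obj()              # A as built from the row of the second query
new_a.b_row = Obj()

populate(c, {"a_row": new_a}, ["a_row.b_row"])
print(hasattr(c.a_row, "b_row"))    # default: a_row is left alone

populate(c, {"a_row": new_a}, ["a_row.b_row"], repopulate_deep=True)
print(hasattr(c.a_row, "b_row"))    # option on: a_row is re-populated
```

Single-attribute eager paths are left alone even with the option on, so the
extra work is limited to the multi-attribute case.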

>> I've been delving into
>> the mapper.py code site you mention.  It's a lot of new code to me,
>> but I agree it doesn't seem trivial.  Also seems like a potential
>> performance hit, but could be something that'd be enabled with an
>> option.
>> 
>> One thing I still don't understand though is why the problem only
>> happens when we first eagerly load the completely unrelated "d_row"?
> 
> It's the "a_obj.c_rows" that loads all the C's, which then eagerload their A. 
> The A of course is already in the Session, but it does the join and populates 
> each C.a_row.

I understand that a_obj.c_rows loads in the Cs.  That doesn't address my 
question.  My test passes if you don't eagerload d_row, and I'm not sure why.

>> 
>> On Thu, Jul 7, 2011 at 6:39 PM, Michael Bayer <mike...@zzzcomputing.com> 
>> wrote:
>>> 
>>> On Jul 7, 2011, at 7:04 PM, Ben Chess wrote:
>>> 
>>>> I've hit a problem where eagerload() fails to load in a relation of a
>>>> relation when lazy='joined' is involved.  It's easiest just to show
>>>> the test.  It fails in 0.7.1, and an equivalent test also fails in
>>>> 0.6.8.
>>>> 
>>>> http://pastebin.com/ruq6SM1z
>>>> 
>>>> Basically, A has relations to B, C, and D.  C's relationship to A is a
>>>> lazy='joined'.
>>>> 
>>>> First load A, eagerloading 'd_row'
>>>> Then reference A.c_row, causing it to load C.
>>>> 
>>>> Then, separately, load C, eagerloading 'a_row.b_row'.
>>>> At this point, I expunge_all() and demonstrate that b_row was not
>>>> attached to C.a_row.
>>>> 
>>>> This does not occur if C's relationship to A is lazy='select'.
>>>> Weirdly, this also does not occur if the initial load of A does not
>>>> eagerload 'd_row'.   I'm not sure why that should affect anything.
>>> 
>>> This will make it pass:
>>> 
>>> assert 'a_row' in a_obj.c_rows[0].__dict__
>>> session.expire_all()
>>> 
>>> c_obj = session.query(C).options(eagerload_all('a_row.b_row')).filter_by(id=1).one()
>>> session.expunge_all()
>>> 
>>> assert c_obj.a_row.b_row
>>> 
>>> Note that after load #1, c_obj is already in the Session, and c_obj.a_row 
>>> is already populated (looking in __dict__ is always the way to see if 
>>> something is already loaded).  This is because of the lazy=False 
>>> (equivalent to lazy='joined') on C.a_row.
>>> 
>>> Then, during the second load (this occurs on line 2587 of mapper.py in 
>>> the current tip), we get the C object that is already in the identity map. 
>>> We say, "OK C, do you have any attributes that aren't populated which we 
>>> can pull from this row?"  C says, "nope".  C.a_row is already there.  This 
>>> process currently doesn't descend further into the objects attached to 
>>> C.a_row, so the rest of the columns are thrown away.
>>> 
>>> It was actually somewhat of an innovation around 0.5 or so when I 
>>> got the thing to populate "unloaded" attributes on objects that were 
>>> otherwise loaded and might even have pending changes.  That was a big step 
>>> forward at the time; I didn't take on trying to figure out if eagers could 
>>> keep on going into the graph and find deeper attributes that aren't loaded.
>>> 
>>> If you have an opinion on this, let me know.  Right now I feel like it's in 
>>> an OK place considering the tradeoff of digging way down into a graph, which 
>>> may be unnecessary for those rows that were already loaded; many-to-ones 
>>> are usually not an issue since they pull from the identity map.  If the 
>>> issue is you're going for "detached" behavior, I generally don't recommend 
>>> relying heavily on object graphs that are fully traversable in the detached 
>>> state unless you're doing some kind of offline caching.   Of course, if 
>>> there were a patch to that area of code that successfully kept the 
>>> traversal going deeper into already loaded nodes based on the current 
>>> eagers present, I'm open to evaluating it, though it doesn't seem like a 
>>> quick tweak at the moment.
>>> 
>>> Nice test though, if you're interested in helping with tests/patches we're 
>>> always looking for help.
>>> 
>>> 
>>> 
>>> 
>>> 
>>>> 
>>>> --
>>>> You received this message because you are subscribed to the Google Groups 
>>>> "sqlalchemy" group.
>>>> To post to this group, send email to sqlalchemy@googlegroups.com.
>>>> To unsubscribe from this group, send email to 
>>>> sqlalchemy+unsubscr...@googlegroups.com.
>>>> For more options, visit this group at 
>>>> http://groups.google.com/group/sqlalchemy?hl=en.
>>>> 