Re: [sqlalchemy] (Hopefully) simple problem with backrefs not being loaded when eagerloading.
On 13/09/10 18:21, Michael Bayer wrote:

On Sep 13, 2010, at 12:26 PM, Jon Siddle wrote:

This relationship is satisfied as you request it, and it works by looking in the current Session's identity map for the primary key stored by the many-to-one. The operation falls under the realm of lazyloading even though no SQL is emitted. If you consider that Child may have many related many-to-ones, all of which may already be in the Session, it would be quite wasteful for the ORM to assume that you're going to be working with the object in a detached state and that you need all of them.

I'm not sure I see what you're saying here. I've explicitly asked for all children relating to parent and these are correctly queried and loaded. While they are being added to the parent.children list, why not also set each child.parent since this is known?

because you didn't specify it, and it takes a palpable amount of additional overhead to do so

I don't see why it's more overhead than an assignment child.parent = ... at the same time as the list append parent.children.append(...). There's obviously something more complex going on behind the scenes.

as well as a palpable amount of complexity to decide if it should do so based on the logic you'd apply here, when in 99% of the cases it is not needed.

I just don't see the complexity of the logic here. I've specified I want to join parent to each child, and it's already doing so in one direction. I realise this is only a problem for detached objects, but it leads to quite confusing behaviour, I think.

I don't see how this is wasteful, but I may be missing something.

Child may have parent, foo, bar, bat attached to it, all many-to-ones. Which ones should it assume the user wants to load?

parent. Because I have explicitly asked it to using joinedload or eagerload.
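To make the identity-map explanation above concrete, here is a toy model in plain Python. It is only a sketch of the idea and not SQLAlchemy's implementation: ToySession, DetachedError and the hand-written Parent and Child classes are all hypothetical names invented for illustration. It shows a many-to-one being resolved lazily from an identity map without any "SQL", and the same access failing once the object is detached if it was never touched while attached.

```python
# Toy model only: this is NOT how SQLAlchemy is implemented.
# Every class and attribute name here is hypothetical.

class DetachedError(Exception):
    """Raised when an unloaded lazy attribute is touched without a session."""


class ToySession:
    def __init__(self):
        self.identity_map = {}  # (class name, primary key) -> object
        self.sql_count = 0      # number of times we would have hit the database

    def add(self, obj):
        self.identity_map[(type(obj).__name__, obj.id)] = obj
        obj._session = self

    def get(self, cls_name, pk):
        key = (cls_name, pk)
        if key in self.identity_map:
            return self.identity_map[key]  # satisfied in memory, no SQL
        self.sql_count += 1                # a real ORM would emit a SELECT here
        raise LookupError(key)

    def close(self):
        # Detach everything that was attached to this session.
        for obj in self.identity_map.values():
            obj._session = None
        self.identity_map = {}


class Parent:
    def __init__(self, id):
        self.id = id
        self.children = []
        self._session = None


class Child:
    def __init__(self, id, parent_id):
        self.id = id
        self.parent_id = parent_id
        self._session = None

    @property
    def parent(self):
        # Resolved on first access, then cached on the instance.
        if "_parent" not in self.__dict__:
            if self._session is None:
                raise DetachedError("parent is not loaded and there is no session")
            self.__dict__["_parent"] = self._session.get("Parent", self.parent_id)
        return self.__dict__["_parent"]


session = ToySession()
p = Parent(1)
c1, c2 = Child(10, 1), Child(11, 1)
p.children.extend([c1, c2])  # the "eager" side: only the collection is filled
for obj in (p, c1, c2):
    session.add(obj)

touched = c1.parent          # lazy access, answered from the identity map
session.close()

print(touched is p)          # True
print(session.sql_count)     # 0
print(c1.parent is p)        # True: cached before the session was closed
try:
    c2.parent                # never touched while the session was open
except DetachedError as exc:
    print("error:", exc)
```

In this toy, c1.parent keeps working after close() only because it was accessed while the session was open, which mirrors the effect of the do-nothing loop from the original post.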
If you are loading 10000 rows, and each Child object has three many-to-ones on it, and suppose it takes 120 function calls to look at a relationship, determine the values to send to query._get(), look in the identity map, etc., that is 3 x 10000 x 120 = 3.6 million function calls

But you don't have to look in the identity map at all, since you've just set the parent-child association in the other direction and thus have both entities to hand, right?

, by default, almost never needed since normally they are all just there in the current session, without the user asking to do so. There is nothing special about Child.parent just because Parent.children is present up the chain. While Hibernate may have decided that the complexity and overhead of adding this decisionmaking was worth it, they have many millions more function calls to burn with the Java VM in any case than we do in Python, and they also have a much higher bar to implement lazyloading since their native class instrumentation is itself a huge beast. In our case it is simply unnecessary. Any such automatic decisionmaking you can be sure quickly leads to many uncomfortable edge cases and thorny situations, causing needless surprise issues for users who don't really need such a behavior in the first place.

I would agree with all of this if I understood why a) it takes an appreciable number of function calls or b) automatic decisionmaking is necessary. I don't think there's any ambiguity here, but again; perhaps I'm missing something fundamental.

As I've mentioned, you will have an option to tell it which many-to-ones you'd like it to spend time pre-populating using the upcoming immediateload option.

I still think this can be done with negligible overhead if it's done at the same time as the other side of the relation (parent-child). Perhaps I'll have to dig around in the code to see why this is such a problem.
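The overhead estimate quoted above can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: the row count of 10,000 is the value consistent with the 3.6 million total, and the 120 calls per relationship is the supposed figure from the message, not a measurement. The function name estimated_calls is hypothetical.

```python
def estimated_calls(rows, many_to_ones_per_row, calls_per_relationship):
    """Rough count of function calls spent pre-populating many-to-ones."""
    return rows * many_to_ones_per_row * calls_per_relationship


rows = 10_000                 # assumed: 3,600,000 / (3 * 120)
many_to_ones_per_row = 3      # three many-to-ones on each Child
calls_per_relationship = 120  # supposed per-relationship cost from the thread

total_calls = estimated_calls(rows, many_to_ones_per_row, calls_per_relationship)
print(total_calls)            # 3600000
```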
The Session's default assumption is that you're going to be leaving it around while you work with the objects contained, and in that way you interact with the database for as long as you deal with its objects, which represent proxies to the underlying transaction. When objects are detached, for reasons like caching and serialization to other places, normally you'd merge() them back when you want to use them again. So if it were me I'd normally be looking to not be closing the session.

I'm closing the session before I forward the objects to the view template in a web application. The template has no business doing database operations,

I disagree with this interpretation of abstraction. That's like saying that pushing the button in an elevator means you're now in charge of starting up the elevator motors and instructing them how many feet to travel.

Huh? I didn't use the word abstraction.

The template is not doing database operations, it is working with high level objects that you've sent it, and knows nothing of a database. That these objects may be doing database calls behind the scenes to lazily fetch additional data is known as the proxy pattern. It is one of the most
Re: [sqlalchemy] Re: (Hopefully) simple problem with backrefs not being loaded when eagerloading.
to ensure it works acceptably, do we get increased performance? No - performance now decreases for all users, unconditionally. All collection loads now get additional overhead whether the user ever needed to load the other side or not.

Do we gain ease of use? For the vast majority of users who work with the Session opened as is recommended, no. They continue to get Child.parent for free no matter how the Parent and Child happened to be loaded, if they are both in memory. The new feature adds absolutely no enhancement to usage for them.

Why did we do it then? Well, some people want to work with detached objects, because they have special needs like serialization, caching, or they have an issue with templates working with proxy objects, and they want the ORM to support their use case without any further intervention or specification. Even though detached objects are by design more tedious to work with - they are not associated with a database connection, other attributes that didn't come in the load aren't available, the attributes attached to their loaded child objects aren't available, and in the real world, a Child object that wants to go into a cache or something would likely need many of its attributes pre-loaded, not just the one that happens to be associated with its owning collection. The feature then makes it one less step, *if* the one attribute they care about having pre-loaded is that one. Otherwise, even their experience is not enhanced at all by the new feature.

How could I possibly justify such a design decision?

--
Jon Siddle, CoreFiling Limited
Software Tools Developer
http://www.corefiling.com
Phone: +44-1865-203192

--
You received this message because you are subscribed to the Google Groups sqlalchemy group.
To post to this group, send email to sqlalch...@googlegroups.com.
To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
[sqlalchemy] (Hopefully) simple problem with backrefs not being loaded when eagerloading.
I'm sure I'm missing something simple here, and any pointers in the right direction would be greatly appreciated.

Take for instance the following code:

    session = Session()
    parents = session.query(Parent).options(joinedload(Parent.children)).all()
    session.close()

    print parents[0].children  # This works
    print parents[0].children[0].parent  # This gives a lazy loading error

Adding the following loop before closing the session works (and doesn't hit the DB):

    for p in parents:
        for c in p.children:
            c.parent

As far as I can tell, the mapping is correct since:
* It all works fine if I leave the session open
* If I don't use joinedload, and leave the session open it lazyloads correctly

I'm surprised that:
* It doesn't set both sides of the relation, considering it apparently knows about them
* It complains that the session is closed despite not actually requiring an open session (no SQL is sent to the DB for c.parent)

These apparent do-nothing loops are starting to clutter the code. There must be a better way.

Thanks

Jon
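One way to keep such loops from cluttering the calling code is to move them into a single helper. The sketch below is a hypothetical utility, not part of SQLAlchemy: preload is an invented name, and it assumes relationship collections are list-like. It only reads attributes, so with ORM objects it would have to be called before the session is closed. The demo uses plain stand-in classes so that it runs without a database.

```python
def preload(objects, *attr_path):
    """Read each attribute along attr_path so that lazy values get populated.

    Hypothetical helper: it relies only on getattr(), so reading the
    attribute is what would trigger any lazy load on a mapped object.
    """
    if not attr_path:
        return
    name, rest = attr_path[0], attr_path[1:]
    for obj in objects:
        value = getattr(obj, name)
        if value is None:
            continue
        # Descend into collections; wrap scalar values in a list.
        related = value if isinstance(value, (list, tuple, set)) else [value]
        preload(related, *rest)


# Plain-Python stand-ins so the helper can be exercised without a database.
accessed = []


class Child:
    def __init__(self, name):
        self.name = name

    @property
    def parent(self):
        accessed.append(self.name)  # stands in for the lazy load firing
        return None


class Parent:
    def __init__(self, children):
        self.children = children


parents = [Parent([Child("a"), Child("b")]), Parent([Child("c")])]
preload(parents, "children", "parent")
print(accessed)  # ['a', 'b', 'c']
```

With mapped classes the equivalent call would be preload(parents, "children", "parent") placed just before session.close(), replacing the nested loops.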
Re: [sqlalchemy] (Hopefully) simple problem with backrefs not being loaded when eagerloading.
On 13/09/10 16:45, Michael Bayer wrote:

On Sep 13, 2010, at 8:48 AM, Jon Siddle wrote:

I'm sure I'm missing something simple here, and any pointers in the right direction would be greatly appreciated.

Take for instance the following code:

    session = Session()
    parents = session.query(Parent).options(joinedload(Parent.children)).all()
    session.close()

    print parents[0].children  # This works
    print parents[0].children[0].parent  # This gives a lazy loading error

Adding the following loop before closing the session works (and doesn't hit the DB):

    for p in parents:
        for c in p.children:
            c.parent

As far as I can tell, the mapping is correct since:
* It all works fine if I leave the session open
* If I don't use joinedload, and leave the session open it lazyloads correctly

I'm surprised that:
* It doesn't set both sides of the relation, considering it apparently knows about them

This relationship is satisfied as you request it, and it works by looking in the current Session's identity map for the primary key stored by the many-to-one. The operation falls under the realm of lazyloading even though no SQL is emitted. If you consider that Child may have many related many-to-ones, all of which may already be in the Session, it would be quite wasteful for the ORM to assume that you're going to be working with the object in a detached state and that you need all of them.

I'm not sure I see what you're saying here. I've explicitly asked for all children relating to parent and these are correctly queried and loaded. While they are being added to the parent.children list, why not also set each child.parent since this is known? I don't see how this is wasteful, but I may be missing something. I'm not suggesting it should touch relations that I haven't explicitly told it to eagerly load. The likes of Hibernate (yes, it's a very different beast) load both sides of the relation at once.
The Session's default assumption is that you're going to be leaving it around while you work with the objects contained, and in that way you interact with the database for as long as you deal with its objects, which represent proxies to the underlying transaction. When objects are detached, for reasons like caching and serialization to other places, normally you'd merge() them back when you want to use them again. So if it were me I'd normally be looking to not be closing the session.

I'm closing the session before I forward the objects to the view template in a web application. The template has no business doing database operations, and the controller *should* make sure all DB work has been done. In my case, I know I'll never need to write back to the DB.

However, when working with detached objects is necessary, there are two approaches you can use. One is a general approach that can load anything related, which is to load them in a @reconstructor. This is illustrated at http://www.sqlalchemy.org/trac/wiki/UsageRecipes/ImmediateLoading . It won't issue any extra SQL for the many-to-ones that are present in the session already. In the specific case you have above, you can also use a trick which is to use contains_eager():

    parents = session.query(Parent).options(joinedload(Parent.children), contains_eager(Parent.children, Child.parent)).all()

This seems to address my problem directly. It's still a bit redundant, but from my initial tests it seems to solve my problem.

The above approach requires that Parent is one of the entities that you're requesting explicitly - i.e. if you were saying joinedload(foo, bar, bat), it would be kind of impossible to target bat.hohos with contains_eager() due to the aliasing.

I'm only interested in making sure both sides of the same relation are loaded; so this isn't a problem at all.
So let me also back that up, that we've always planned on adding an immediateload option that would just fire off any lazyloader as the query fetches results. A really short patch that adds immediateload() is at http://www.sqlalchemy.org/trac/ticket/1914 and hopefully will be in 0.6.5 pending further testing.

We'll have to support 0.5 for some time, but it's good to know a shortcut is coming.

Thanks a lot for your help.

Jon