Re: [sqlalchemy] (Hopefully) simple problem with backrefs not being loaded when eagerloading.
On 13/09/10 18:21, Michael Bayer wrote:
> On Sep 13, 2010, at 12:26 PM, Jon Siddle wrote:
>
>>> This relationship is satisfied as you request it, and it works by looking in the current Session's identity map for the primary key stored by the many-to-one. The operation falls under the realm of lazyloading even though no SQL is emitted. If you consider that Child may have many related many-to-ones, all of which may already be in the Session, it would be quite wasteful for the ORM to assume that you're going to be working with the object in a detached state and that you need all of them.
>>
>> I'm not sure I see what you're saying here. I've explicitly asked for all children relating to parent and these are correctly queried and loaded. While they are being added to the parent.children list, why not also set each child.parent since this is known?
>
> because you didn't specify it, and it takes a palpable amount of additional overhead to do so

I don't see why it's more overhead than an assignment child.parent = ... at the same time as the list append parent.children.append(...). There's obviously something more complex going on behind the scenes.

> as well as a palpable amount of complexity to decide if it should do so based on the logic you'd apply here, when in 99% of the cases it is not needed.

I just don't see the complexity of the logic here. I've specified I want to join parent to each child, and it's already doing so in one direction. I realise this is only a problem for detached objects, but it leads to quite confusing behaviour, I think.

>> I don't see how this is wasteful, but I may be missing something.
>
> Child may have parent, foo, bar, bat attached to it, all many-to-ones. Which ones should it assume the user wants to load?

parent. Because I have explicitly asked it to using joinedload or eagerload.

> If you are loading 10000 rows, and each Child object has three many-to-ones on it, and suppose it takes 120 function calls to look at a relationship, determine the values to send to query._get(), look in the identity map, etc., that is 3 x 10000 x 120 = 3.6 million function calls,

But you don't have to look in the identity map at all, since you've just set the parent-child association in the other direction and thus have both entities to hand, right?

> by default, almost never needed since normally they are all just there in the current session, without the user asking to do so.
>
> There is nothing special about Child.parent just because Parent.children is present up the chain. While Hibernate may have decided that the complexity and overhead of adding this decisionmaking was worth it, they have many millions more function calls to burn with the Java VM in any case than we do in Python, and they also have a much higher bar to implement lazyloading since their native class instrumentation is itself a huge beast. In our case it is simply unnecessary. Any such automatic decisionmaking you can be sure quickly leads to many uncomfortable edge cases and thorny situations, causing needless surprise issues for users who don't really need such a behavior in the first place.

I would agree with all of this if I understood why a) it takes an appreciable number of function calls or b) automatic decisionmaking is necessary. I don't think there's any ambiguity here, but again, perhaps I'm missing something fundamental.

> As I've mentioned, you will have an option to tell it which many-to-ones you'd like it to spend time pre-populating using the upcoming immediateload option.

I still think this can be done with negligible overhead if it's done at the same time as the other side of the relation (parent-child). Perhaps I'll have to dig around in the code to see why this is such a problem.

>>> The Session's default assumption is that you're going to be leaving it around while you work with the objects contained, and in that way you interact with the database for as long as you deal with its objects, which represent proxies to the underlying transaction. When objects are detached, for reasons like caching and serialization to other places, normally you'd merge() them back when you want to use them again. So if it were me I'd normally be looking to not be closing the session.
>>
>> I'm closing the session before I forward the objects to the view template in a web application. The template has no business doing database operations,
>
> I disagree with this interpretation of abstraction. That's like saying that pushing the button in an elevator means you're now in charge of starting up the elevator motors and instructing them how many feet to travel.

Huh? I didn't use the word abstraction.

> The template is not doing database operations, it is working with high level objects that you've sent it, and knows nothing of a database. That these objects may be doing database calls behind the scenes to lazily fetch additional data is known as the proxy pattern. It is one of the most
[sqlalchemy] (Hopefully) simple problem with backrefs not being loaded when eagerloading.
I'm sure I'm missing something simple here, and any pointers in the right direction would be greatly appreciated.

Take for instance the following code:

    session = Session()
    parents = session.query(Parent).options(joinedload(Parent.children)).all()
    session.close()

    print parents[0].children  # This works
    print parents[0].children[0].parent  # This gives a lazy loading error

Adding the following loop before closing the session works (and doesn't hit the DB):

    for p in parents:
        for c in p.children:
            c.parent

As far as I can tell, the mapping is correct since:
* It all works fine if I leave the session open
* If I don't use joinedload, and leave the session open it lazyloads correctly

I'm surprised that:
* It doesn't set both sides of the relation, considering it apparently knows about them
* It complains that the session is closed despite not actually requiring an open session (no SQL is sent to the DB for c.parent)

These apparent do-nothing loops are starting to clutter the code. There must be a better way.

Thanks

Jon

--
Jon Siddle, CoreFiling Limited
Software Tools Developer
http://www.corefiling.com
Phone: +44-1865-203192

--
You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalch...@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
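One way to keep such loops out of the calling code is to fold them into a small helper. The sketch below is plain Python with no SQLAlchemy dependency; `touch_path` is a hypothetical name, not a library function, and the `Parent` and `Child` classes are stand-ins rather than mapped classes. It only reads each attribute along a path, which is the same thing the nested loops do, so any lazy loader attached to those attributes would fire while the session is still open.

```python
def touch_path(objects, *attr_names):
    """Read each attribute in attr_names, in order, on every object.

    Hypothetical helper (not part of SQLAlchemy). Reading an attribute is
    what triggers a lazy loader, so this replaces hand-written nested loops.
    Returns the number of attribute reads performed.
    """
    reads = 0
    current = list(objects)
    for name in attr_names:
        next_level = []
        for obj in current:
            value = getattr(obj, name)
            reads += 1
            if value is None:
                continue
            # Collections fan out; scalar attributes continue as one object.
            if isinstance(value, (list, tuple, set)):
                next_level.extend(value)
            else:
                next_level.append(value)
        current = next_level
    return reads


# Stand-in classes so the sketch runs without a database.
class Parent:
    def __init__(self):
        self.children = []


class Child:
    def __init__(self, parent):
        self.parent = parent
        parent.children.append(self)


p = Parent()
Child(p)
Child(p)
reads = touch_path([p], "children", "parent")  # one read of children, two of parent
```

With the objects from the message above, the equivalent call would be `touch_path(parents, "children", "parent")`, placed before `session.close()`.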
Re: [sqlalchemy] (Hopefully) simple problem with backrefs not being loaded when eagerloading.
On Sep 13, 2010, at 8:48 AM, Jon Siddle wrote:

> I'm sure I'm missing something simple here, and any pointers in the right direction would be greatly appreciated.
>
> Take for instance the following code:
>
>     session = Session()
>     parents = session.query(Parent).options(joinedload(Parent.children)).all()
>     session.close()
>
>     print parents[0].children  # This works
>     print parents[0].children[0].parent  # This gives a lazy loading error
>
> Adding the following loop before closing the session works (and doesn't hit the DB):
>
>     for p in parents:
>         for c in p.children:
>             c.parent
>
> As far as I can tell, the mapping is correct since:
> * It all works fine if I leave the session open
> * If I don't use joinedload, and leave the session open it lazyloads correctly
>
> I'm surprised that:
> * It doesn't set both sides of the relation, considering it apparently knows about them

This relationship is satisfied as you request it, and it works by looking in the current Session's identity map for the primary key stored by the many-to-one. The operation falls under the realm of lazyloading even though no SQL is emitted. If you consider that Child may have many related many-to-ones, all of which may already be in the Session, it would be quite wasteful for the ORM to assume that you're going to be working with the object in a detached state and that you need all of them.

The Session's default assumption is that you're going to be leaving it around while you work with the objects contained, and in that way you interact with the database for as long as you deal with its objects, which represent proxies to the underlying transaction. When objects are detached, for reasons like caching and serialization to other places, normally you'd merge() them back when you want to use them again. So if it were me I'd normally be looking to not be closing the session.

However, when working with detached objects is necessary, there are two approaches you can use here.

One is a general approach that can load anything related, which is to load them in a @reconstructor. This is illustrated at http://www.sqlalchemy.org/trac/wiki/UsageRecipes/ImmediateLoading . It won't issue any extra SQL for the many-to-ones that are present in the session already.

In the specific case you have above, you can also use a trick which is to use contains_eager():

    parents = session.query(Parent).options(
        joinedload(Parent.children),
        contains_eager(Parent.children, Child.parent)).all()

the above approach requires that Parent is one of the entities that you're requesting explicitly - i.e. if you were saying joinedload(foo, bar, bat), it would be kind of impossible to target bat.hohos with contains_eager() due to the aliasing.

this will do the get() of the Parent as you run through.
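The identity-map lookup described in this reply can be pictured with a toy model. This is a sketch in plain Python, not SQLAlchemy's implementation: `ToySession`, `load_parent`, `sql_count`, and `DetachedError` are names invented for the illustration. It shows why resolving a many-to-one can cost no SQL and still need a live session: the lookup goes through the session's map from primary keys to already-loaded objects.

```python
class DetachedError(Exception):
    """Raised when a lookup is attempted without a session (toy stand-in)."""


class ToySession:
    def __init__(self, rows):
        self._rows = rows          # pretend database: {(kind, pk): row dict}
        self.identity_map = {}     # objects already loaded, keyed the same way
        self.sql_count = 0         # number of pretend SELECTs issued

    def get(self, kind, pk):
        key = (kind, pk)
        if key in self.identity_map:   # hit: no SQL needed
            return self.identity_map[key]
        self.sql_count += 1            # miss: pretend to emit a SELECT
        obj = dict(self._rows[key])
        self.identity_map[key] = obj
        return obj


def load_parent(session, child):
    """Resolve child -> parent from the foreign key stored on the child."""
    if session is None:
        raise DetachedError("no session to look the parent up in")
    return session.get("parent", child["parent_id"])


session = ToySession({
    ("parent", 1): {"id": 1},
    ("child", 10): {"id": 10, "parent_id": 1},
})
parent = session.get("parent", 1)           # one pretend SELECT
child = session.get("child", 10)            # one pretend SELECT
same_parent = load_parent(session, child)   # identity-map hit, no SELECT
```

Calling `load_parent(None, child)` raises `DetachedError` even though the parent was already loaded, which mirrors the lazy loading error reported for the closed session.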
Re: [sqlalchemy] (Hopefully) simple problem with backrefs not being loaded when eagerloading.
On Sep 13, 2010, at 11:45 AM, Michael Bayer wrote:

> In the specific case you have above, you can also use a trick which is to use contains_eager():
>
>     parents = session.query(Parent).options(
>         joinedload(Parent.children),
>         contains_eager(Parent.children, Child.parent)).all()
>
> the above approach requires that Parent is one of the entities that you're requesting explicitly - i.e. if you were saying joinedload(foo, bar, bat), it would be kind of impossible to target bat.hohos with contains_eager() due to the aliasing.

so let me also back that up, that we've always planned on adding an immediateload option that would just fire off any lazyloader as the query fetches results. A really short patch that adds immediateload() is at http://www.sqlalchemy.org/trac/ticket/1914 and hopefully will be in 0.6.5 pending further testing.
Re: [sqlalchemy] (Hopefully) simple problem with backrefs not being loaded when eagerloading.
On 13/09/10 16:45, Michael Bayer wrote:
> On Sep 13, 2010, at 8:48 AM, Jon Siddle wrote:
>
>> I'm sure I'm missing something simple here, and any pointers in the right direction would be greatly appreciated.
>>
>> Take for instance the following code:
>>
>>     session = Session()
>>     parents = session.query(Parent).options(joinedload(Parent.children)).all()
>>     session.close()
>>
>>     print parents[0].children  # This works
>>     print parents[0].children[0].parent  # This gives a lazy loading error
>>
>> Adding the following loop before closing the session works (and doesn't hit the DB):
>>
>>     for p in parents:
>>         for c in p.children:
>>             c.parent
>>
>> As far as I can tell, the mapping is correct since:
>> * It all works fine if I leave the session open
>> * If I don't use joinedload, and leave the session open it lazyloads correctly
>>
>> I'm surprised that:
>> * It doesn't set both sides of the relation, considering it apparently knows about them
>
> This relationship is satisfied as you request it, and it works by looking in the current Session's identity map for the primary key stored by the many-to-one. The operation falls under the realm of lazyloading even though no SQL is emitted. If you consider that Child may have many related many-to-ones, all of which may already be in the Session, it would be quite wasteful for the ORM to assume that you're going to be working with the object in a detached state and that you need all of them.

I'm not sure I see what you're saying here. I've explicitly asked for all children relating to parent and these are correctly queried and loaded. While they are being added to the parent.children list, why not also set each child.parent since this is known? I don't see how this is wasteful, but I may be missing something. I'm not suggesting it should touch relations that I haven't explicitly told it to eagerly load. The likes of Hibernate (yes, it's a very different beast) load both sides of the relation at once.

> The Session's default assumption is that you're going to be leaving it around while you work with the objects contained, and in that way you interact with the database for as long as you deal with its objects, which represent proxies to the underlying transaction. When objects are detached, for reasons like caching and serialization to other places, normally you'd merge() them back when you want to use them again. So if it were me I'd normally be looking to not be closing the session.

I'm closing the session before I forward the objects to the view template in a web application. The template has no business doing database operations, and the controller *should* make sure all DB work has been done. In my case, I know I'll never need to write back to the DB.

> However, when working with detached objects is necessary, there are two approaches you can use here.
>
> One is a general approach that can load anything related, which is to load them in a @reconstructor. This is illustrated at http://www.sqlalchemy.org/trac/wiki/UsageRecipes/ImmediateLoading . It won't issue any extra SQL for the many-to-ones that are present in the session already.
>
> In the specific case you have above, you can also use a trick which is to use contains_eager():
>
>     parents = session.query(Parent).options(
>         joinedload(Parent.children),
>         contains_eager(Parent.children, Child.parent)).all()

This seems to address my problem directly. It's still a bit redundant, but from my initial tests it seems to solve my problem.

> the above approach requires that Parent is one of the entities that you're requesting explicitly - i.e. if you were saying joinedload(foo, bar, bat), it would be kind of impossible to target bat.hohos with contains_eager() due to the aliasing.

I'm only interested in making sure both sides of the same relation are loaded; so this isn't a problem at all.

> so let me also back that up, that we've always planned on adding an immediateload option that would just fire off any lazyloader as the query fetches results. A really short patch that adds immediateload() is at http://www.sqlalchemy.org/trac/ticket/1914 and hopefully will be in 0.6.5 pending further testing.

We'll have to support 0.5 for some time, but it's good to know a shortcut is coming.

Thanks a lot for your help.

Jon

--
Jon Siddle, CoreFiling Limited
Software Tools Developer
http://www.corefiling.com
Phone: +44-1865-203192
Re: [sqlalchemy] (Hopefully) simple problem with backrefs not being loaded when eagerloading.
On Sep 13, 2010, at 12:26 PM, Jon Siddle wrote:

>> This relationship is satisfied as you request it, and it works by looking in the current Session's identity map for the primary key stored by the many-to-one. The operation falls under the realm of lazyloading even though no SQL is emitted. If you consider that Child may have many related many-to-ones, all of which may already be in the Session, it would be quite wasteful for the ORM to assume that you're going to be working with the object in a detached state and that you need all of them.
>
> I'm not sure I see what you're saying here. I've explicitly asked for all children relating to parent and these are correctly queried and loaded. While they are being added to the parent.children list, why not also set each child.parent since this is known?

because you didn't specify it, and it takes a palpable amount of additional overhead to do so as well as a palpable amount of complexity to decide if it should do so based on the logic you'd apply here, when in 99% of the cases it is not needed.

> I don't see how this is wasteful, but I may be missing something.

Child may have parent, foo, bar, bat attached to it, all many-to-ones. Which ones should it assume the user wants to load?

If you are loading 10000 rows, and each Child object has three many-to-ones on it, and suppose it takes 120 function calls to look at a relationship, determine the values to send to query._get(), look in the identity map, etc., that is 3 x 10000 x 120 = 3.6 million function calls, by default, almost never needed since normally they are all just there in the current session, without the user asking to do so.
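The estimate can be checked directly. The stated total of 3.6 million only follows for 10,000 rows, so that row count is assumed here; the figure of 120 calls per relationship is the hypothetical cost given in the message, not a measurement.

```python
rows = 10_000                  # assumed row count; implied by the 3.6 million total
many_to_ones_per_row = 3       # three many-to-ones per Child, as in the message
calls_per_relationship = 120   # hypothetical per-relationship cost from the message

total_calls = many_to_ones_per_row * rows * calls_per_relationship
print(total_calls)  # 3600000
```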
There is nothing special about Child.parent just because Parent.children is present up the chain. While Hibernate may have decided that the complexity and overhead of adding this decisionmaking was worth it, they have many millions more function calls to burn with the Java VM in any case than we do in Python, and they also have a much higher bar to implement lazyloading since their native class instrumentation is itself a huge beast. In our case it is simply unnecessary. Any such automatic decisionmaking you can be sure quickly leads to many uncomfortable edge cases and thorny situations, causing needless surprise issues for users who don't really need such a behavior in the first place.

As I've mentioned, you will have an option to tell it which many-to-ones you'd like it to spend time pre-populating using the upcoming immediateload option.

>> The Session's default assumption is that you're going to be leaving it around while you work with the objects contained, and in that way you interact with the database for as long as you deal with its objects, which represent proxies to the underlying transaction. When objects are detached, for reasons like caching and serialization to other places, normally you'd merge() them back when you want to use them again. So if it were me I'd normally be looking to not be closing the session.
>
> I'm closing the session before I forward the objects to the view template in a web application. The template has no business doing database operations,

I disagree with this interpretation of abstraction. That's like saying that pushing the button in an elevator means you're now in charge of starting up the elevator motors and instructing them how many feet to travel. The template is not doing database operations, it is working with high level objects that you've sent it, and knows nothing of a database. That these objects may be doing database calls behind the scenes to lazily fetch additional data is known as the proxy pattern. It is one of the most fundamental patterns in object oriented software design. Separation of concerns is about what kinds of source code and awareness of systems live in various places - it has nothing to do with operational timing or initiation.

The pre-load scenario is certainly valid if you're trying to render from an object graph that loads from a cache and doesn't want to do any additional database calls. But this is strictly an issue of optimization, not correct software design.
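The proxy pattern mentioned in this reply can be sketched in a few lines of plain Python. This is an illustration only, not how SQLAlchemy instruments attributes; `LazyAttribute`, `Article`, `render`, and the `fetch_author` loader are all invented for the example. The rendering function reads ordinary attributes and never sees the loader that runs behind one of them.

```python
class LazyAttribute:
    """Descriptor that calls a loader on first access, then caches the value."""

    def __init__(self, loader):
        self.loader = loader

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        value = self.loader(obj)
        # Store on the instance; later reads find this and skip the descriptor.
        obj.__dict__[self.name] = value
        return value


calls = []


def fetch_author(article):
    calls.append(article.title)   # stands in for a database round trip
    return "author of " + article.title


class Article:
    author = LazyAttribute(fetch_author)

    def __init__(self, title):
        self.title = title


def render(article):
    # "Template" code: plain attribute access, no knowledge of any loader.
    return "{} by {}".format(article.title, article.author)


article = Article("Backrefs")
first = render(article)
second = render(article)   # served from the cached value; loader is not called again
```

The loader runs once, on the first `render` call; whether that timing is acceptable is the optimization question raised above, separate from what `render` needs to know.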