Re: [sqlalchemy] Re: Loading attributes for Transient objects

Michael Bayer Mon, 06 Sep 2010 10:04:10 -0700

On Sep 6, 2010, at 12:01 PM, Kent Bower wrote:

> Fantastic, I would like to look into this change. 
> 
> Since you asked, consider a use case similar to this: we have a RESTfulish 
> web service that accepts a serialized version of a "transfer object" which is 
> passed to the server when a database save should take place.  In this case, 
> an "order" with select relations is serialized and passed.
> 
> For a database save, this will be added to the session after it is "cast" 
> into a sqlalchemy object. Nothing special there. 
> 
> Now for the use case:  the webservice needs to also be invoked while the user 
> is still working on the order. For example, taxes and delivery charges are 
> calculated by the server.  Again, the serialized version of the transfer 
> object is sent to the server and cast into a sqlalchemy object. In this case, 
> however, we have no intention of ever saving the object during this service 
> request. Rather, the sqlalchemy object is transient. Still, to calculate 
> taxes, for example, a handful of relations need to be loaded, 
> such as zipcode objects, tax authorities, tax tables, etc.

So if it were me, I'd not be using HTTP in that way, i.e. the "big serialized 
bag of all kinds of stuff".  I'd have made it such that a new session key can 
be established with the web service, which uses proper relational storage for 
the pending state.  In fact, in my current project I am doing just that: 
I have a base "OrderData" object that has two subclasses, "PendingOrder" and 
"Order", each of which is stored in a distinct table.  It's not concrete 
inheritance either - "OrderData" is a declarative mixin that uses 
@classproperty to establish the same relationship() objects on each subclass.
However, I've certainly used HTTP sessions with disk state and such in the 
past, and while I don't prefer heavy reliance on serialization these days, I 
know that serialization patterns are very common.
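
A rough sketch of that mixin arrangement, for illustration only. It assumes SQLAlchemy 1.4 or later, where the decorator is spelled declared_attr (in the 0.6 series it was available as sqlalchemy.util.classproperty). The Customer class, the table names, and the columns are made up here; only the OrderData/PendingOrder/Order names come from the description above.

```python
from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.orm import (
    configure_mappers,
    declarative_base,
    declared_attr,
    relationship,
)

Base = declarative_base()


class Customer(Base):
    # Hypothetical related class, invented for this sketch.
    __tablename__ = "customers"
    id = Column(Integer, primary_key=True)
    name = Column(String(50))


class OrderData:
    """Mixin holding what both order tables have in common."""

    id = Column(Integer, primary_key=True)

    # Columns with a ForeignKey and relationship() objects must be built
    # per subclass, so they are declared through declared_attr.
    @declared_attr
    def customer_id(cls):
        return Column(Integer, ForeignKey("customers.id"))

    @declared_attr
    def customer(cls):
        return relationship("Customer")


class PendingOrder(OrderData, Base):
    __tablename__ = "pending_orders"


class Order(OrderData, Base):
    __tablename__ = "orders"


# Resolve the string-based relationship targets now rather than lazily.
configure_mappers()
```

Each subclass gets its own table and its own relationship() object, so this is neither single-table nor concrete inheritance.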

> 
> The reason for not wanting to disable autoflush is that this same code is 
> (appropriately) invoked whether this object is persistent (from merge()) and 
> part of the save web service or transient and part of the calculate web 
> service (where the object is going to be thrown away).  In the case of being 
> the save, it is important for database consistency that autoflush remain 
> enabled. 

So you have this transient object, and you want to use it to load various 
information about zipcodes and stuff, but it's not database state.  *But*, you 
*have* populated individual foreign key and maybe primary key attributes on it, 
which most certainly represent persistent-centric concepts.  So there has 
been, at some point, some awareness of either this object's future primary key, 
or something has loaded up related many-to-ones and figured out their foreign 
keys and assigned them.  There are definitely no many-to-many collections at 
play since those aren't possible without persistence of the transient object's 
primary key information (unless you're working with totally unconstrained 
tables, in which case, good luck).

So the persistence information is already there.  If you are setting 
order.foreign_key_of_something, why would you not instead set order.something, 
so that order.something is already present in the transient state?  The rule 
here being, "how do I get foo.bar to be 'x'?"  Answer: "set foo.bar = 'x'" - 
simple, right?  The ORM would be left to do its normal job of worrying about 
foreign keys.
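
To make the "set foo.bar" rule concrete, here is a minimal sketch. The Zipcode and Order classes and their columns are hypothetical, and SQLAlchemy 1.4 or later is assumed. No engine or session is involved at all.

```python
from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()


class Zipcode(Base):
    # Hypothetical lookup table for this sketch.
    __tablename__ = "zipcodes"
    code = Column(String(10), primary_key=True)
    tax_rate = Column(Integer)  # made-up column: rate in basis points


class Order(Base):
    __tablename__ = "orders"
    id = Column(Integer, primary_key=True)
    zipcode_code = Column(String(10), ForeignKey("zipcodes.code"))
    zipcode = relationship(Zipcode)


zc = Zipcode(code="10001", tax_rate=875)

order = Order()       # transient: not attached to any session
order.zipcode = zc    # set the relationship, not the foreign key

related = order.zipcode        # available immediately, no SQL needed
fk_value = order.zipcode_code  # still None; the ORM fills it in at flush
```

The related object is usable on the transient order right away, and the foreign key column is left for the ORM to populate if the order is ever flushed.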

But instead, you're working in reverse.  The ORM has an opinion that if you 
work in reverse like that, it won't block you, but it also isn't going to make 
the guesses and assumptions that would be required for it to act "the same" 
(see the FAQ entry about "foo_id 7" for some rantage on this, you've probably 
already seen it).

I know the answer already to why you're populating 
order.foreign_key_of_something rather than order.something.  You're trying to 
reduce the serialization size, and/or the overhead of merging all that 
serialized data back into a session.   So you're trying to rig the ORM into a 
custom, optimized serialization scheme, a use case that is outside the scope of 
the very simple, single purpose that relationship() is designed for out of the 
box, which is to represent a linkage between classes and persist/restore a 
corresponding linkage between related database rows (since if one side is 
transient, there is only one database row in play).    

But there is good news, if we look at this for what it seems to be, which is an 
optimized serialization scheme.  You should build it that way.  Write custom 
serialization (__getstate__/__setstate__) for your class - if the object is 
transient, expire all relationship()-based attributes upon __getstate__ - upon 
__setstate__, iterate through all relationship-based attributes (using 
mapper.iterate_properties) and plug them all into query.with_parent() to 
again produce the "pending" attributes.  Basically, take advantage of the 
foreign key/primary key attributes already present to reduce the size of the 
serialization, and load the data back on the other end, transparently.   It's 
up to __setstate__ to figure out transactional context.   You could use 
scoped_session which is the easy way out, or write a custom pickler that 
accepts "session" (I'd go with the latter, more explicit).   Such a recipe 
could even look at the "local_remote_pairs" of each relationship to decide 
which related attributes should be expired, and which should not, based on 
whether or not the necessary FK attributes are present.   It would make a great 
recipe for the wiki.
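
A minimal sketch of the idea, under several assumptions. It uses two plain helper functions with an explicit session argument rather than __getstate__/__setstate__, which is a simplification and not the recipe itself. The names dump_columns and restore, and the Zipcode/Order classes, are invented for illustration. SQLAlchemy 1.4 or later is assumed, with the standalone with_parent() function standing in for query.with_parent().

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, inspect
from sqlalchemy.orm import Session, declarative_base, relationship, with_parent
from sqlalchemy.orm.attributes import set_committed_value

Base = declarative_base()


class Zipcode(Base):
    __tablename__ = "zipcodes"
    code = Column(String(10), primary_key=True)
    city = Column(String(50))


class Order(Base):
    __tablename__ = "orders"
    id = Column(Integer, primary_key=True)
    zipcode_code = Column(String(10), ForeignKey("zipcodes.code"))
    zipcode = relationship(Zipcode)


def dump_columns(obj):
    """Serialize column attributes only; relationship() attributes are dropped."""
    mapper = inspect(obj).mapper
    return {attr.key: getattr(obj, attr.key) for attr in mapper.column_attrs}


def restore(cls, data, session):
    """Rebuild a transient object, reloading relationships via the given session."""
    mapper = inspect(cls)
    obj = cls(**data)  # transient: never added to the session
    for rel in mapper.relationships:
        # Attribute names of the local columns this relationship depends on.
        local_keys = [
            mapper.get_property_by_column(local).key
            for local, _remote in rel.local_remote_pairs
        ]
        if any(data.get(key) is None for key in local_keys):
            continue  # required FK/PK values are missing; leave it unloaded
        query = session.query(rel.mapper.class_).filter(
            with_parent(obj, getattr(cls, rel.key))
        )
        value = query.all() if rel.uselist else query.one_or_none()
        # Populate the attribute without firing events or cascades.
        set_committed_value(obj, rel.key, value)
    return obj


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = Session(engine)
session.add(Zipcode(code="10001", city="New York"))
session.commit()

payload = dump_columns(Order(id=1, zipcode_code="10001"))  # small payload
order = restore(Order, payload, session)
```

Using set_committed_value() here is a design choice: it fills in the attribute without the attribute events that an ordinary assignment would fire, so the order stays out of the session.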

> 
> Since sqla won't load that for me in the case of transient, I need to load 
> the relation manually (unless you feel like enhancing that as well). 

It's not an enhancement - it was a broken behavior that was specifically 
removed.   The transient object has no session, so therefore no SQL can be 
emitted - there's no context established.  
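
A small sketch of the difference, assuming SQLAlchemy 1.4 or later and made-up Zipcode/Order classes. The relationship attribute on a transient object emits no SQL, while with_parent() builds the same lazy-load criterion against a session that is supplied explicitly.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship, with_parent

Base = declarative_base()


class Zipcode(Base):
    __tablename__ = "zipcodes"
    code = Column(String(10), primary_key=True)
    city = Column(String(50))


class Order(Base):
    __tablename__ = "orders"
    id = Column(Integer, primary_key=True)
    zipcode_code = Column(String(10), ForeignKey("zipcodes.code"))
    zipcode = relationship(Zipcode)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = Session(engine)
session.add(Zipcode(code="10001", city="New York"))
session.commit()

# Transient object with only the foreign key populated.
order = Order(zipcode_code="10001")

# No session is attached, so the attribute cannot lazy load.
implicit = order.zipcode

# The session is given explicitly; the criterion comes from the
# relationship's lazy-load clause, filled in from order's attributes.
query = session.query(Zipcode).filter(with_parent(order, Order.zipcode))
sql = str(query)
loaded = query.one()
```

The rendered SQL selects from zipcodes with a simple WHERE criterion; the orders table is not joined in, and the order itself is never added to the session.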

> 
> Now I can manually emulate the obj being persistent with your changes for 
> 
> On Sep 6, 2010, at 10:58 AM, Michael Bayer <mike...@zzzcomputing.com> wrote:
> 
>> 
>> On Sep 6, 2010, at 9:06 AM, Kent wrote:
>> 
>>> with_parent seems to add a join condition.  
>> 
>> OK, so I guess you read the docs which is why you thought it joined and why 
>> you didn't realize it doesn't work for transient.  r20b6ce05f194 changes all 
>> that so that with_parent() accepts transient objects and will do the "look 
>> at the attributes" thing.   The docs are updated as this method does use the 
>> lazy loader SQL mechanism, not a join.
>> 
>> 
>> 
>>> Is there a way to get at
>>> the query object that would be rendered from a lazy load (or what
>>> "subqueryload" would render on the subsequent load), but on a
>>> transient object, if i supply the session?
>>> 
>>> even though not "recommended", can I make sqla believe my transient
>>> object is detached by setting its state key?
>>> 
>>> There are reasons i do not want to add this to the session and
>>> disabling autoflush would also cause problems.
>>> 
>>> 
>>> 
>>> On Sep 3, 9:58 am, Michael Bayer <mike...@zzzcomputing.com> wrote:
>>>> On Sep 3, 2010, at 9:36 AM, Kent wrote:
>>>> 
>>>>> For the case of customerid = '7', that is a simple problem, but when
>>>>> it is a more complex join condition, we only wanted to define this
>>>>> condition in one single place in our application (namely, the orm).
>>>>> That way, if or when that changes, developers don't need to search for
>>>>> other places in the app that needed to manually duplicate the logic of
>>>>> the orm join condition.
>>>> 
>>>>> If I supplied the DBSession to sqla, it would know how to create the
>>>>> proper Query object for this lazyload.  Can you point me in the right
>>>>> direction (even if where you point me is not currently part of the
>>>>> public API)?
>>>> 
>>>> Query has the with_parent() method for this use case.  
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>>> Thanks again,
>>>>> Kent
>>>> 
>>>>> --
>>>>> You received this message because you are subscribed to the Google Groups 
>>>>> "sqlalchemy" group.
>>>>> To post to this group, send email to sqlalch...@googlegroups.com.
>>>>> To unsubscribe from this group, send email to 
>>>>> sqlalchemy+unsubscr...@googlegroups.com.
>>>>> For more options, visit this group at 
>>>>> http://groups.google.com/group/sqlalchemy?hl=en.
