Re: [sqlalchemy] Properly handling sessions in rollback

Michael Bayer Sun, 13 Nov 2011 08:15:36 -0800

On Nov 13, 2011, at 7:59 AM, Vlad K. wrote:

> 
> Hi.
> 
> Imagine the following scenario:
> 
> 
> 
> session = DBSession()
> readonly_model = session.query(ReadOnlyModel).get(id)
> 
> # Readonly means the model will NOT have its data changed in the life of the 
> transaction(s).
> 
> method_one(readonly_model.readonly_data, param_1, param_2, ...)
> method_two(readonly_model.readonly_data, param_3, param_4, ...)
> 
> session.commit()
> 
> 
> 
> Now, the code here is the "caller" and the methods one and two are logically 
> separate, meaning they do not know anything about the caller, nor the caller 
> knows anything about the internal states and processing involved in the 
> methods.
> 
> Suppose a transaction fails in one or both methods, say an IntegrityError. 
> The methods know how to handle this, catch the exceptions and adapt 
> accordingly. They manually do session.rollback() and proceed within new 
> transaction.
> 
> The problem is, if this happens in method_one, then readonly_model is expired 
> and has no readonly_model.readonly_data when method two is called.


> From what I've read in the docs, I am supposed to do session.refresh(),

No that's not required at all.  All the objects that are still referenced 
outside the session, stay present in the session and will reload their data 
when accessed.  

What you will lose is whatever pending changes were present in method one, 
whatever data was flushed but not committed.

But when you say method one "proceeds within a new transaction" - doesn't that 
imply that it re-tried its full series of steps, so that the required state is 
apparent at the end of method one ?    if method one is where the data is being 
set up for method two, it of course would have to re-establish that state for 
which method two is dependent.

Alternatively, if the data upon which method two is dependent on is set up by 
the caller, then the caller should be catching the failure of method one, and 
retrying.  A transaction by definition is an atomic unit - all steps must 
proceed, or none.  if the caller were calling methods 1, 2, 3, 4 , and then 3 
failed, the rollback would require that all of 1, 2, 3 be called again.

> but that means the caller must know what happened within the methods one and 
> two (rollback happened).

I would think that's unavoidable.  If the caller is controlling the scope of 
the transaction, it's best not to mix that concept within the callees.    If 
this were method 1, 2, 3, 4, then 3 rolled back, the caller would have to be 
responsible for calling those methods again...not clear here if method one 
calls method two, or the caller is calling method one which then calls method 
two ? in any case you need to start from the beginning no matter what.

> 
> One way would be to extract data from the readonly_model:
> 
> readonly_data = readonly_model.readonly_data
> 
> method_one(readonly_data, param_1, param_2, ...)

is the issue here attempting to avoid redundant SELECT of the data ?   
readonly_model.readonly_data will otherwise refresh itself when you access it.

> ...
> 
> Now, I am certain that this readonly data will NOT change between two 
> transactions, 100% sure it will never happen in the life of entire request / 
> all transactions involved in the process, so basically extracting it from the 
> session/model is safe, within the logic of the code.

if you're just looking for "high performance when the rollback happens", then 
yes, perhaps you can read this readonly data into a separate session, then 
merge() it into the possibly-failing session using merge(..., load=False), like 
a cache, then the caller can re-load that information in when the transaction 
fails.    This is not to say the caller needs to "know" about method one, or 
method two, but you're building an elaborate system here where transactions can 
be "tried again" (already tricky) and then on top of that can re-install some 
data that was loaded in the failed transaction, basically an ad-hoc caching 
system - so there really is an operational framework occurring here and its the 
caller who would know that, or some system you would inject between caller and 
those methods which can mediate things like this.

> 
> Any suggestions/advices about this? Am I doing something completely wrong?

if it were me, I'd never be even trying to "redo" an operation that rolled 
back.    99% of the time if my app has to rollback a transaction in the middle 
of an operation, that's a bug in the application no different from a crash.

If I absolutely needed to use the pattern of "try to insert a record with a 
key, it might fail because it's unique", I'd use Session.begin_nested() so that 
the method in question in effect uses SAVEPOINT, gets its own transactional 
state that doesn't affect things outside of it.    


> What if the readonly_data was not 100% sure not to change? Is my only option 
> to have the caller somehow know there was rollback involved and refresh the 
> session?

it depends on who is retrying what and if your goal is just so that the correct 
state is still present (if it wasn't mutated in the transaction, no action 
should be needed) or if its also so that SELECTs don't get emitted twice (in 
which case, ad-hoc caching solution, why do you have the need for transactions 
to fail yet not be slightly slower on recovery).




-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com.
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.

Re: [sqlalchemy] Properly handling sessions in rollback

Reply via email to