[sqlalchemy] Re: Lock table, do things to table, unlock table: Best way?
Unique constraints have worked well. Thanks!

On May 29, 1:44 pm, Michael Bayer mike...@zzzcomputing.com wrote:
yup

On May 29, 2012, at 1:01 PM, Jeff wrote:
Thanks Michael, just to make clear what exactly begin_nested() is contributing:

Normal case: session.rollback() goes back to the last session.commit().
session.begin_nested() case: session.rollback() goes back to the last session.begin_nested() or session.commit(), whichever occurred last.

Correct?

On May 28, 11:54 am, Michael Bayer mike...@zzzcomputing.com wrote:
An option to add along to the unique constraint, if you expect to get collisions often, is to use a SAVEPOINT so that a process can roll back partially if this particular INSERT fails, then use the row. The Session offers SAVEPOINT via begin_nested():

    session.begin_nested()
    try:
        session.add(thing_that_may_exist_already)
        session.commit()  # flushes, and commits only the savepoint
    except exc.IntegrityError:
        session.rollback()
        thing_that_may_exist_already = session.query(Thing).filter_by(criterion).one()

The difference between using locks to prevent concurrent dupes versus using constraints and expecting dupes to fail is known as pessimistic versus optimistic locking.

On May 28, 2012, at 10:38 AM, Jeff wrote:
The unique constraint sounds like a workable solution! I'll implement that with a try/except and report back if that was effective. Thanks!

On May 28, 5:43 am, Simon King si...@simonking.org.uk wrote:
Could you put unique constraints on the table so that the second process will get an error when it tries to insert the duplicate rows? It won't prevent you from performing the calculations twice, but at least you won't get the duplicates. Another option would be to write some sort of pending marker into the table, so that subsequent processes know that the result is already being calculated.

On Sun, May 27, 2012 at 6:18 PM, Jeff jeffalst...@gmail.com wrote:
Thanks, I have indeed spent a lot of time looking at SELECT FOR UPDATE, but as far as I can tell that locks rows that have been selected. That is not helpful in this use case, in which the issue is rows not existing, and then later existing. Am I misunderstanding?

On May 27, 11:48 am, A.M. age...@themactionfaction.com wrote:
You should look at SELECT FOR UPDATE. http://docs.sqlalchemy.org/en/rel_0_7/orm/query.html?highlight=lockmo...
Cheers, M

On May 27, 2012, at 1:07 AM, Jeff wrote:
I have multiple processes accessing a table. All of these processes want to read a set of rows from the table, and if the rows are not present they will make a calculation and insert the rows themselves. The issue comes where process A does a query to see if the target set of rows is present in the table, and they're not, and so it starts calculating. While it's calculating, process B inserts the rows. Then process A inserts the rows, and now we have two copies of these sets of rows. Bad.

--
You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
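The optimistic pattern the thread converged on — declare a unique constraint and treat a rejected duplicate INSERT as the signal that another process got there first — can be sketched with the stdlib sqlite3 driver rather than SQLAlchemy, to keep it self-contained. The table, column names, and `expensive_calculation` stand-in are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# UNIQUE is what makes the race safe: a second insert of the same key fails.
conn.execute("CREATE TABLE results (key TEXT UNIQUE, value REAL)")

def expensive_calculation(key):
    return float(len(key))  # stand-in for the real computation

def get_or_compute(conn, key):
    # Cheap path first: the row may already exist.
    row = conn.execute("SELECT value FROM results WHERE key = ?", (key,)).fetchone()
    if row is not None:
        return row[0]
    value = expensive_calculation(key)
    try:
        conn.execute("INSERT INTO results (key, value) VALUES (?, ?)", (key, value))
        conn.commit()
        return value
    except sqlite3.IntegrityError:
        # Another process inserted the row between our SELECT and INSERT;
        # the unique constraint rejected our duplicate, so use theirs.
        conn.rollback()
        return conn.execute(
            "SELECT value FROM results WHERE key = ?", (key,)
        ).fetchone()[0]

print(get_or_compute(conn, "run-1"))  # prints 5.0: computes and inserts
print(get_or_compute(conn, "run-1"))  # prints 5.0: finds the existing row
```

As Simon notes above, this does not prevent computing the value twice; it only guarantees the table never holds two copies.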
[sqlalchemy] Re: Lock table, do things to table, unlock table: Best way?
Thanks Michael,

Just to make clear what exactly begin_nested() is contributing:

Normal case: session.rollback() goes back to the last session.commit().
session.begin_nested() case: session.rollback() goes back to the last session.begin_nested() or session.commit(), whichever occurred last.

Correct?

On May 28, 11:54 am, Michael Bayer mike...@zzzcomputing.com wrote:
An option to add along to the unique constraint, if you expect to get collisions often, is to use a SAVEPOINT so that a process can roll back partially if this particular INSERT fails, then use the row. The Session offers SAVEPOINT via begin_nested().
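Simon's other idea in the thread — a pending marker so later processes know the calculation is already underway — leans on the same unique constraint: whichever process manages the marker INSERT owns the computation. A minimal sketch with stdlib sqlite3 (the `status` column and helper names are assumptions, not the poster's schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A UNIQUE key means only one process can successfully claim a pending slot.
conn.execute("CREATE TABLE results (key TEXT UNIQUE, status TEXT, value REAL)")

def claim(conn, key):
    """Return True if we claimed the calculation, False if someone else has it."""
    try:
        conn.execute(
            "INSERT INTO results (key, status, value) VALUES (?, 'pending', NULL)",
            (key,),
        )
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        # Someone else already inserted this key (pending or done).
        conn.rollback()
        return False

def finish(conn, key, value):
    conn.execute(
        "UPDATE results SET status = 'done', value = ? WHERE key = ?", (value, key)
    )
    conn.commit()

if claim(conn, "job-1"):
    finish(conn, "job-1", 42.0)  # we won the race: compute and store
```

A loser of the race would then poll or wait until the row's status flips to 'done' before reading the value; that waiting logic is omitted here.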
[sqlalchemy] Re: Lock table, do things to table, unlock table: Best way?
The unique constraint sounds like a workable solution! I'll implement that with a try/except and report back if that was effective. Thanks!

On May 28, 5:43 am, Simon King si...@simonking.org.uk wrote:
Could you put unique constraints on the table so that the second process will get an error when it tries to insert the duplicate rows? It won't prevent you from performing the calculations twice, but at least you won't get the duplicates. Another option would be to write some sort of pending marker into the table, so that subsequent processes know that the result is already being calculated.
Re: [sqlalchemy] Re: Lock table, do things to table, unlock table: Best way?
An option to add along to the unique constraint, if you expect to get collisions often, is to use a SAVEPOINT so that a process can roll back partially if this particular INSERT fails, then use the row. The Session offers SAVEPOINT via begin_nested():

    session.begin_nested()
    try:
        session.add(thing_that_may_exist_already)
        session.commit()  # flushes, and commits only the savepoint
    except exc.IntegrityError:
        session.rollback()
        thing_that_may_exist_already = session.query(Thing).filter_by(criterion).one()

The difference between using locks to prevent concurrent dupes versus using constraints and expecting dupes to fail is known as pessimistic versus optimistic locking.

On May 28, 2012, at 10:38 AM, Jeff wrote:
The unique constraint sounds like a workable solution! I'll implement that with a try/except and report back if that was effective. Thanks!
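Under the hood, begin_nested() issues a SQL SAVEPOINT, so a rollback undoes only the failed INSERT while earlier work in the transaction survives. The raw mechanics can be seen with stdlib sqlite3, no SQLAlchemy required (table and savepoint names are made up for the sketch):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.isolation_level = None  # manage transactions explicitly
conn.execute("CREATE TABLE thing (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
conn.execute("INSERT INTO thing (name) VALUES ('already-there')")

conn.execute("BEGIN")
conn.execute("INSERT INTO thing (name) VALUES ('other-work')")  # work to keep
conn.execute("SAVEPOINT sp")  # roughly what begin_nested() issues
try:
    conn.execute("INSERT INTO thing (name) VALUES ('already-there')")  # duplicate
except sqlite3.IntegrityError:
    # Roll back only to the savepoint: 'other-work' survives.
    conn.execute("ROLLBACK TO sp")
conn.execute("COMMIT")

names = [r[0] for r in conn.execute("SELECT name FROM thing ORDER BY name")]
print(names)  # prints ['already-there', 'other-work']
```

Which answers Jeff's question above: with a plain transaction the rollback would have discarded 'other-work' too; with the savepoint it only rewinds to the last begin_nested().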
[sqlalchemy] Re: Lock table, do things to table, unlock table: Best way?
Thanks, I have indeed spent a lot of time looking at SELECT FOR UPDATE, but as far as I can tell that locks rows that have been selected. That is not helpful in this use case, in which the issue is rows not existing, and then later existing. Am I misunderstanding?

On May 27, 11:48 am, A.M. age...@themactionfaction.com wrote:
You should look at SELECT FOR UPDATE. http://docs.sqlalchemy.org/en/rel_0_7/orm/query.html?highlight=lockmo...
Cheers, M
[sqlalchemy] Re: Lock table, do things to table, unlock table: Best way?
Hmmm. It also appears that the 4-step solution I gave above doesn't work consistently. Any ideas as to why not?

On May 27, 1:07 am, Jeff jeffalst...@gmail.com wrote:
I have multiple processes accessing a table. All of these processes want to read a set of rows from the table, and if the rows are not present they will make a calculation and insert the rows themselves. The issue comes where process A does a query to see if the target set of rows is present in the table, and they're not, and so it starts calculating. While it's calculating, process B inserts the rows. Then process A inserts the rows, and now we have two copies of these sets of rows. Bad.

The current solution is to have a process:

1. Lock the table with session.execute("LOCK TABLES table WRITE")
2. Query
3. If need be, calculate and commit
4. Unlock with session.execute("UNLOCK TABLES")

Then if another process wants to query while the calculation is happening, it will have to wait until the table unlocks, and it will see the result of the commit. The wait time of the locked table is acceptable.

That should work, but is ugly and, worse, requires explicit MySQL code. I understand this could lead to dependency on the backend implementation (i.e. what if this code were ever used on an sqlite database?). Regardless, I'd like to do this completely within sqlalchemy. Is there a way to do this? It seems like there might be a way to do this with transactions, but I can't parse it. Thoughts? Thanks!
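For comparison, the lock-query-compute-unlock flow in the four steps above can be approximated without MySQL-specific SQL. This sketch uses SQLite's BEGIN IMMEDIATE, which takes the write lock up front so the whole check-then-insert runs serialized — a rough analogue of LOCK TABLES ... WRITE, not the statement the thread actually used, and the schema is invented:

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "results.db")
init = sqlite3.connect(db_path)
init.execute("CREATE TABLE results (key TEXT PRIMARY KEY, value REAL)")
init.commit()
init.close()

def ensure_rows(key, compute):
    conn = sqlite3.connect(db_path, timeout=10)
    conn.isolation_level = None  # explicit transaction control
    # BEGIN IMMEDIATE acquires the write lock now, so between here and
    # COMMIT no other process can run the same check-then-insert.
    conn.execute("BEGIN IMMEDIATE")
    row = conn.execute("SELECT value FROM results WHERE key = ?", (key,)).fetchone()
    if row is None:
        conn.execute(
            "INSERT INTO results (key, value) VALUES (?, ?)", (key, compute(key))
        )
    conn.execute("COMMIT")
    conn.close()

ensure_rows("a", lambda k: 1.0)   # computes and inserts
ensure_rows("a", lambda k: 99.0)  # sees the existing row; no recompute
```

Like the LOCK TABLES version, this is pessimistic locking: other writers block for the duration of the calculation, which is only acceptable when, as Jeff says, the wait time is tolerable.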