Re: [sqlalchemy] sqlalchemy failing to commit somewhere (but fails silently) when trying to update data
One thing to note is that deepcopy() is not going to work. It will copy SQLAlchemy's own accounting information on the object as well and generally cause confusion. The easiest way to insert a lot of data while detecting dupes efficiently is to sort the data, then chunk through it, and for each chunk, pre-load from the database all those records which reside within the range of that chunk of pending data. You put those pre-loaded records in a dictionary and check it for each record. A simple system I use very often is (this isn't chunked, but could be):

    recs = dict(session.query(MyThing.id, MyThing))

    for i, newrec in enumerate(my_incoming_stuff):
        if newrec['id'] in recs:
            rec = recs[newrec['id']]
            rec.data = newrec['data']
        else:
            rec = MyThing(id=newrec['id'], data=newrec['data'])
            session.add(rec)
        if i % 1000 == 0:
            session.flush()

    session.commit()

On Mar 18, 2013, at 1:54 AM, alonn alonis...@gmail.com wrote:

Like all CRUD goes, I need to write some data to a table. When I write new data to the table, everything works like a charm. The problem starts when I need to write data already existing in the table (actually updating some rows with the same primary key): the data just doesn't seem to be written to the table! I started by trying to update the data with session.merge(), but later tried a more brute-force approach of querying for the same primary key in the table, deleting it, and then adding and flushing the changed objects. Somewhere, if the basic add-and-flush fails, the rest doesn't work. I'll be glad for a clue here.
The code:

    def flush(obj_Instance, id):
        """taking care of the sqlalchemy flushing
        params:
            obj_Instance: an object instance to flush into
            id: the unique object instance id
        """
        DBSession2.add(obj_Instance)
        try:
            try:
                DBSession2.flush()
                print("flushed:", str(obj_Instance))
            except (FlushError, IntegrityError) as err:
                DBSession2.rollback()
                if ('conflicts with persistent instance' in str(err)) or \
                   ('Duplicate key was ignored' in str(err)):
                    transaction.begin()
                    # my original slick take: DBSession2.merge(obj_Instance)
                    # but after it failed to update correctly I changed to a
                    # more brute-force approach
                    # DBSession2.flush()  # to save the merge
                    # from here on, trying to brute-force it
                    # saving for further reference - another try
                    newInstance = deepcopy(obj_Instance)
                    print("deleting: %s" % id)
                    DBSession2.query(type(obj_Instance)).filter_by(ids=id).delete()
                    DBSession2.flush()
                    # at this point I was so desperate for this to work that
                    # I littered the code with flush calls
                    DBSession2.add(newInstance)
                    DBSession2.flush()
                    return
                else:
                    raise  # the same-key problem isn't the source of conflicts
        except Exception as err:
            # supposed to find out the error type and message.
            # the code doesn't get here; only in real exceptions it was
            # planned to catch - 3 rows in 10,000 uploaded to the db
            # TODO: make this less general and more specific
            print(str(err))
            write_log(num=id,
                      msg="some sql or sqlalchemy error; use num %s as id "
                          "identifier with object: %s" % (id, obj_Instance.name),
                      timestamp=datetime.now(),
                      errtype="sql error",
                      log=str(err))
            DBSession2.rollback()
            transaction.begin()

Maybe this try/fail/rollback/merge or delete/insert-new pattern is wrong (though I think it's pythonic - try and ask forgiveness - but that would be for Mike to judge).

Using sqlalchemy 0.7.3 vs mssql 2005 with pyodbc 2.1.11 and tg 2.1 (the transaction manager comes with tg and I think is based on transaction).

--
You received this message because you are subscribed to the Google Groups sqlalchemy group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at http://groups.google.com/group/sqlalchemy?hl=en. For more options, visit https://groups.google.com/groups/opt_out.
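The loop above isn't chunked; a chunked variant, which sorts the incoming data and pre-loads only the rows in each chunk's key range, might look like the following sketch. MyThing, my_incoming_stuff, and the in-memory SQLite setup are hypothetical stand-ins, written against a modern SQLAlchemy API rather than the 0.7 series discussed in the thread:

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class MyThing(Base):
    __tablename__ = "my_thing"
    id = Column(Integer, primary_key=True)
    data = Column(String(50))

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

# seed an existing row so one incoming record is a "dupe"
session.add(MyThing(id=1, data="old"))
session.commit()

my_incoming_stuff = [{"id": 1, "data": "updated"}, {"id": 2, "data": "new"}]
CHUNK = 1000

# sort the incoming data by key, then work through it in chunks;
# for each chunk, pre-load only the rows whose keys fall in that
# chunk's key range, so dupe checks are dict hits, not per-row SELECTs
incoming = sorted(my_incoming_stuff, key=lambda r: r["id"])
for start in range(0, len(incoming), CHUNK):
    chunk = incoming[start:start + CHUNK]
    lo, hi = chunk[0]["id"], chunk[-1]["id"]
    recs = dict(
        session.query(MyThing.id, MyThing)
        .filter(MyThing.id.between(lo, hi))
    )
    for newrec in chunk:
        if newrec["id"] in recs:
            recs[newrec["id"]].data = newrec["data"]   # update existing
        else:
            session.add(MyThing(id=newrec["id"], data=newrec["data"]))
    session.flush()
session.commit()
```

The key-range filter is what keeps memory bounded: only one chunk's worth of existing rows is held in the dictionary at a time.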
Re: [sqlalchemy] sqlalchemy failing to commit somewhere (but fails silently) when trying to update data
Thanks Michael for the good advice. Since I don't think this chunking solution will work for this specific use case (the keys would be hard to sort), wouldn't it be an easier solution just to move transaction.commit() after each flush, so that DBSession.rollback() wouldn't lose existing data in the session?

And another thing - is there a simple update construct, or would session.merge() do?
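For reference, session.merge() is the closest thing to a simple save-or-update construct: given a detached instance carrying a primary key, it loads the existing row if one exists and copies the new state onto it, otherwise it inserts. A minimal sketch, again with a hypothetical MyThing model against in-memory SQLite on a modern SQLAlchemy API:

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class MyThing(Base):
    __tablename__ = "my_thing"
    id = Column(Integer, primary_key=True)
    data = Column(String(50))

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

session.add(MyThing(id=1, data="original"))
session.commit()

# merge() looks each instance up by primary key: id=1 becomes a
# pending UPDATE of the existing row, id=2 a pending INSERT
session.merge(MyThing(id=1, data="changed"))
session.merge(MyThing(id=2, data="brand new"))
session.commit()
```

The caveat is that each merge() emits its own SELECT, which is why the pre-load-into-a-dict pattern above is cheaper for bulk loads.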
Re: [sqlalchemy] sqlalchemy failing to commit somewhere (but fails silently) when trying to update data
If you want to go that approach, I suggest you use begin_nested(), which will produce a SAVEPOINT, local to a certain scope within the transaction. You'll have better results with 0.8 using this approach.
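To illustrate the begin_nested() suggestion: a failed flush rolls back only to the SAVEPOINT, so earlier pending work in the session survives and the loop can fall back to an UPDATE. A rough sketch with a hypothetical model, run against in-memory SQLite on a modern SQLAlchemy API; the two event listeners are the workaround from SQLAlchemy's SQLite dialect docs, needed because pysqlite's default transaction handling breaks SAVEPOINT:

```python
from sqlalchemy import Column, Integer, String, create_engine, event
from sqlalchemy.exc import IntegrityError
from sqlalchemy.orm import declarative_base, sessionmaker
from sqlalchemy.orm.exc import FlushError

Base = declarative_base()

class MyThing(Base):
    __tablename__ = "my_thing"
    id = Column(Integer, primary_key=True)
    data = Column(String(50))

engine = create_engine("sqlite://")

@event.listens_for(engine, "connect")
def _sqlite_connect(dbapi_connection, connection_record):
    # disable pysqlite's implicit transaction management so that
    # SAVEPOINT / ROLLBACK TO SAVEPOINT work correctly
    dbapi_connection.isolation_level = None

@event.listens_for(engine, "begin")
def _sqlite_begin(conn):
    conn.exec_driver_sql("BEGIN")

Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

session.add(MyThing(id=1, data="existing"))
session.commit()
session.expunge_all()  # detach the seed so the dupe hits the DB constraint

def save_or_update(session, rec):
    """Try the INSERT inside a SAVEPOINT; on a duplicate key, roll
    back only the SAVEPOINT and fall back to an UPDATE."""
    try:
        with session.begin_nested():   # emits SAVEPOINT
            session.add(MyThing(**rec))
            session.flush()
    except (IntegrityError, FlushError):
        # only the SAVEPOINT was rolled back; the outer transaction
        # and any earlier pending work in the session survive
        session.query(MyThing).filter_by(id=rec["id"]).update(
            {"data": rec["data"]})

save_or_update(session, {"id": 1, "data": "replaced"})
save_or_update(session, {"id": 2, "data": "fresh"})
session.commit()
```

Compared with the thread's rollback-and-retry pattern, the SAVEPOINT confines the damage of the failed flush, so there is no need for deepcopy(), delete-then-reinsert, or moving transaction.commit() inside the loop.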