I do a lot of processing on large amounts of data. The common pattern we follow is:
1. Iterate through a large data set
2. Do some sort of processing (e.g. NLP work like tokenization, capitalization, regex parsing, ...)
3. Insert the new result into another table

Right now we are doing something like this:

```python
for x in session.query(Foo).yield_per(10000):
    bar = Bar()
    bar.hello = x.world.lower()
    session.add(bar)
session.flush()
session.commit()
```

This works, though not well: typically we have to wait 30 minutes to an hour before any `bar`s are committed. My question is: is there a way to commit as we iterate without breaking `yield_per`? If not, what is the recommended way of doing this?

**NOTE:** There are a lot of answers on Stack Overflow that involve writing custom pagination functions for `session.query`. Their efficiency and effectiveness have not been benchmarked.
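One pattern worth trying, as a point of comparison, is to read with one session and write with a second one, committing the write session in batches; the commit then never touches the read session that owns the `yield_per` cursor. The sketch below is an assumption-laden illustration, not a benchmarked answer: the `Foo`/`Bar` models, the in-memory SQLite URL, and the batch size are all made up for the example.

```python
from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.orm import sessionmaker, declarative_base

Base = declarative_base()

class Foo(Base):
    __tablename__ = "foo"
    id = Column(Integer, primary_key=True)
    world = Column(String)

class Bar(Base):
    __tablename__ = "bar"
    id = Column(Integer, primary_key=True)
    hello = Column(String)

# Hypothetical setup so the sketch is self-contained.
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

# Seed a couple of Foo rows for illustration.
seed = Session()
seed.add_all([Foo(world="HELLO"), Foo(world="World")])
seed.commit()
seed.close()

BATCH = 1000
read_session = Session()    # streams Foo rows
write_session = Session()   # accumulates and commits Bar rows

pending = 0
for x in read_session.query(Foo).yield_per(BATCH):
    write_session.add(Bar(hello=x.world.lower()))
    pending += 1
    if pending >= BATCH:
        # Committing here only affects write_session; the read
        # session keeps iterating undisturbed.
        write_session.commit()
        pending = 0

write_session.commit()  # flush the final partial batch
read_session.close()
write_session.close()
```

With this split, `Bar` rows become visible every `BATCH` inserts instead of only at the end of the whole run; whether that beats a custom pagination helper would still need to be measured.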