Hi all, I just discovered something weird while doing concurrency testing with my program. Before writing a simplified test case and really figuring out whether it's a bug in SQLAlchemy (I am using 0.5.5), I wanted to describe the scenario here. Basically I was getting the infamous "Set changed size during iteration" error during commit. Multiple threads are not accessing the same session (as was the case in some of the posts I have read online). Here's what I am doing:
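For reference, the error itself comes straight from CPython and fires whenever a set changes size while it is being iterated; here is a tiny illustration with no SQLAlchemy involved at all:

```python
s = {1, 2, 3}
seen_error = None
try:
    for item in s:
        s.add(item + 10)  # growing the set mid-iteration
except RuntimeError as exc:
    seen_error = str(exc)

print(seen_error)  # -> Set changed size during iteration
```

So somewhere during my commit, a set (presumably internal to the session) is being mutated while something else iterates over it.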
There is a task dispatcher that queries the database, gets a list of tasks to be done, and dispatches them to separate threads. Each thread creates its own session using the sessionmaker and updates the status of a task. There are three kinds of objects in question: Task, TaskSchedule and TaskResult. There is a one-to-many relation between Task and TaskSchedule, and another one-to-many relation between Task and TaskResult.

Pseudo code:

task_dispatcher:
    for task_schedule in sess.query(TaskSchedule).filter(next_execution_time <= current_time):
        do_task(task_schedule.task)    # uses the relation; runs in a new thread
        task_schedule.next_execution_time = some_value
        sess.commit()

(in a new thread)
do_task(task):
    sess = get_session()
    sess.add(TaskResult(task.id, task_started))
    sess.commit()
    task.perform()
    sess.merge(TaskResult(task.id, task_finished))
    sess.commit()
    sess.close()

Basically, I get the mysterious error on the commit in task_dispatcher intermittently (roughly 10 times out of 1000 executions). This brings me to my first question: is it bad practice to commit a session from within the sess.query() loop? I wanted each task's execution-time update to be independent of the other tasks. If this is bad practice, should I use sub-transactions for this purpose?

After realizing this possible pitfall, I changed the code to commit only after the loop had finished. This led to another interesting result: none of my TaskResult objects got committed. Instead I saw a lot of ROLLBACKs in the log with no errors associated with them.

I then tried yet another variation, and this is the one that puzzled me the most.
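Here is a stripped-down, runnable sketch of the setup. The mappings are simplified stand-ins for my real schema, it uses the modern declarative/sessionmaker spelling rather than 0.5's, and — unlike my real code — it hands each worker the task's primary key instead of the live Task object, so no ORM state ever crosses a thread boundary:

```python
import threading
from sqlalchemy import Column, Integer, String, ForeignKey, create_engine
from sqlalchemy.orm import declarative_base, relationship, sessionmaker
from sqlalchemy.pool import StaticPool

Base = declarative_base()

class Task(Base):
    __tablename__ = 'task'
    id = Column(Integer, primary_key=True)
    schedules = relationship('TaskSchedule', backref='task')
    results = relationship('TaskResult', backref='task')

class TaskSchedule(Base):
    __tablename__ = 'task_schedule'
    id = Column(Integer, primary_key=True)
    task_id = Column(Integer, ForeignKey('task.id'))
    next_execution_time = Column(Integer)

class TaskResult(Base):
    __tablename__ = 'task_result'
    id = Column(Integer, primary_key=True)
    task_id = Column(Integer, ForeignKey('task.id'))
    status = Column(String)

# one shared in-memory SQLite connection so every thread sees the same data
engine = create_engine('sqlite://',
                       connect_args={'check_same_thread': False},
                       poolclass=StaticPool)
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

def do_task(task_id):
    # each worker opens (and closes) its own session
    sess = Session()
    sess.add(TaskResult(task_id=task_id, status='started'))
    sess.commit()
    # ... the real task.perform() would run here ...
    sess.add(TaskResult(task_id=task_id, status='finished'))
    sess.commit()
    sess.close()

def task_dispatcher(now=100):
    sess = Session()
    due = sess.query(TaskSchedule).filter(
        TaskSchedule.next_execution_time <= now).all()
    workers = [threading.Thread(target=do_task, args=(s.task_id,))
               for s in due]
    for sched in due:
        sched.next_execution_time = now + 60
    sess.commit()        # reschedule before the workers touch the connection
    sess.close()
    for t in workers:    # run one at a time here: a single SQLite connection
        t.start()        # cannot be used concurrently; with a real database
        t.join()         # server these would overlap

# seed two tasks, each with one due schedule, then run the dispatcher
boot = Session()
boot.add_all([Task(schedules=[TaskSchedule(next_execution_time=0)]),
              Task(schedules=[TaskSchedule(next_execution_time=50)])])
boot.commit()
boot.close()
task_dispatcher()
```

Note the sketch commits the reschedule once, after the loop, and passes only ids to the workers — the two points my questions below are about.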
I tried eager loading the task relation on TaskSchedule, and it worked like a charm:

task_dispatcher:
    for task_schedule in sess.query(TaskSchedule).options(eagerload('task')).filter(next_execution_time <= current_time):
        do_task(task_schedule.task)    # uses the relation; runs in a new thread
        task_schedule.next_execution_time = some_value
        sess.commit()

This time there were no errors, and the TaskResults appeared in the database with each commit in the do_task function. Does anyone have an insight into this?

Thanks,
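In modern SQLAlchemy, 0.5's eagerload() is spelled joinedload(). Here is a small self-contained demonstration (again with simplified, hypothetical mappings) of what the option changes: with it, the relation is populated by the original SELECT, so reading it later never has to call back into the session that issued the query — which is exactly the access my worker threads perform:

```python
from sqlalchemy import Column, Integer, ForeignKey, create_engine
from sqlalchemy.orm import (declarative_base, relationship,
                            sessionmaker, joinedload)

Base = declarative_base()

class Task(Base):
    __tablename__ = 'task'
    id = Column(Integer, primary_key=True)

class TaskSchedule(Base):
    __tablename__ = 'task_schedule'
    id = Column(Integer, primary_key=True)
    task_id = Column(Integer, ForeignKey('task.id'))
    next_execution_time = Column(Integer)
    task = relationship('Task')

engine = create_engine('sqlite://')
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

setup = Session()
setup.add(TaskSchedule(task=Task(), next_execution_time=0))
setup.commit()
setup.close()

# eager variant: 'task' is pulled in by the same SELECT, so touching
# schedule.task later never has to go back to the session
sess = Session()
eager = (sess.query(TaskSchedule)
             .options(joinedload(TaskSchedule.task))
             .filter(TaskSchedule.next_execution_time <= 100)
             .one())
sess.expunge(eager)      # detach it: a lazy load is now impossible
task_id = eager.task.id  # still readable -- it was loaded up front

# lazy variant for contrast: the same access on a detached object fails
lazy = sess.query(TaskSchedule).one()
sess.expunge(lazy)
try:
    lazy.task
    lazy_load_failed = False
except Exception:        # raises DetachedInstanceError
    lazy_load_failed = True
```

So my working guess is that without the eager load, touching task_schedule.task from the worker thread forces a lazy load through the dispatcher's session — but I'd appreciate confirmation.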