[sqlalchemy] Strange behaviour in multi-process environment
Hi,

Strange behaviour with SQLAlchemy in a multi-process environment... I already posted this on StackOverflow for a web app, but I'm still missing some understanding, so I'm posting here.

I've created an application where my SQLAlchemy calls are encapsulated. My API's methods always do the same kind of thing:
1- request a session
2- do my stuff - mainly querying the db through session.query(...)
3- eventually: modify some mapped object attributes and write to the db (session.commit())

Several consecutive calls to my API's methods return the same session object as long as I'm in the same thread, which is the required behaviour. Some commits happen, but the session is always the same (expire_on_commit=False):

    # This factory is thread safe: a session object is returned (always the
    # same) to the caller. If called from another thread, the returned
    # session object will be different.
    session_factory = sessionmaker(bind=engine, expire_on_commit=False)
    session_maker = scoped_session(session_factory)

    # to get a session:
    session = session_maker()

If I spawn another thread, using the API is still safe, since the first step (getting a session) will return a new session instance (thanks to scoped_session).

Here's my problem: I have two processes, each with such an architecture (based on this API). While P1 is writing to the DB (session.commit()), P2 is reading from it (session.query(...)). The problem is that P2 does not see the DB change and has to poll several times before getting it. I'm actually hitting the DB through session.query(...) calls from P2, but nothing happens. *Sometimes it works, though! :-(* I don't know why (I did not change the code) - over 10 runs, it will work about 3 times. If I change my API to close the session after every commit, P2 sees the change, as if committing were not actually writing to disk and closing the session did it.
I do not understand this behaviour, as the SQLAlchemy documentation encourages creating a global session per thread and working with it (which is what the scoped_session mechanism does). My configuration: Linux, SQLAlchemy 0.8.4, and a RAM-based db (/dev/shm/my_db.sqlite - but the problem is still there with a regular disk-based db at ~/my_db.sqlite).

Thanks a lot for your time,
Pierre

--
You received this message because you are subscribed to the Google Groups sqlalchemy group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at http://groups.google.com/group/sqlalchemy. For more options, visit https://groups.google.com/groups/opt_out.
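The per-thread behaviour the post relies on (same session within a thread, a distinct one per thread) can be sketched in a few lines. This is an illustrative stand-alone example, using an in-memory database rather than the file from the thread:

```python
import threading

from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker

# In-memory database purely for illustration; the thread uses a file-based db.
engine = create_engine("sqlite://")
session_factory = sessionmaker(bind=engine, expire_on_commit=False)
session_maker = scoped_session(session_factory)

# Same thread: the registry hands back the same Session object every time.
s1 = session_maker()
s2 = session_maker()
same_thread = s1 is s2

# Another thread: the registry creates a distinct Session.
seen = {}

def worker():
    seen["session"] = session_maker()

t = threading.Thread(target=worker)
t.start()
t.join()
other_thread_differs = seen["session"] is not s1
```

Here `same_thread` and `other_thread_differs` both come out True, which is exactly the thread-local behaviour described above.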
Re: [sqlalchemy] Strange behaviour in multi-process environment
I was having this kind of problem in a multi-threaded app, but with a Postgres backend. In Postgres, under high concurrency, I was expecting this kind of behaviour, so I had to implement some simple semaphores to make it work properly -- SQLAlchemy is not to blame if something changes in your database while one end of the software creates a record that the other end doesn't know about when it tries to operate on it.

Strangely, though, this is happening with SQLite. Are you using it in synchronous mode? Is journaling enabled? Are you using the default pool for connections?

Best regards,
Richard.

On 01/20/2014 02:12 PM, pr64 wrote: [...]
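Richard's semaphore workaround amounts to serialising database access with a cross-process lock. A minimal sketch - all names here are hypothetical, not from the thread; the lock must be created before forking so the children inherit it:

```python
import multiprocessing

# Created in the parent before any fork, so child processes share it.
db_lock = multiprocessing.Lock()

def guarded_commit(session, lock=db_lock):
    # Hypothetical helper: only one process may commit at a time.
    with lock:
        session.commit()

def guarded_read(run_query, lock=db_lock):
    # Hypothetical helper: serialise reads against concurrent commits.
    with lock:
        return run_query()
```

This trades concurrency for predictability; as Richard says, SQLAlchemy itself is not at fault when another process changes the database underneath you.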
Re: [sqlalchemy] Strange behaviour in multi-process environment
On Mon, Jan 20, 2014 at 1:51 PM, Richard Gerd Kuesters rich...@humantech.com.br wrote: [...]

I think it may have something to do with the implicit transaction the session opens when manipulated. P2 may be holding on to an older snapshot. I'd suggest committing right before checking whether the records are there in P2, just as an experiment. The right behaviour would be to commit after any kind of session manipulation, though that's trickier (lots of things count as implicit session manipulations).
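The suggested experiment can be sketched end to end. Assumptions in this sketch: a throwaway `Item` model, a temp-file database, and two engines in one script standing in for the two processes; whether the first read is actually stale depends on the journal mode and on when the driver emits BEGIN, so only the post-commit read is asserted:

```python
import os
import tempfile

from sqlalchemy import Column, Integer, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class Item(Base):
    __tablename__ = "items"
    id = Column(Integer, primary_key=True)

path = os.path.join(tempfile.mkdtemp(), "demo.sqlite")
engine_p1 = create_engine("sqlite:///" + path)  # stands in for process P1
engine_p2 = create_engine("sqlite:///" + path)  # stands in for process P2
Base.metadata.create_all(engine_p1)

writer = sessionmaker(bind=engine_p1, expire_on_commit=False)()
reader = sessionmaker(bind=engine_p2, expire_on_commit=False)()

reader.query(Item).count()                # the reader starts using its session

writer.add(Item())
writer.commit()                           # P1 commits a new row

maybe_stale = reader.query(Item).count()  # may or may not see the row yet
reader.commit()                           # the experiment: end the reader's transaction
fresh = reader.query(Item).count()        # after committing, the new row is visible
```

If `maybe_stale` comes back short of `fresh`, the reader was indeed holding an older snapshot, which is what the implicit-transaction explanation predicts.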
Re: [sqlalchemy] Strange behaviour in multi-process environment
Thanks for your comments. I'm using SQLite in default mode (which is synchronous). Here's my engine configuration:

    engine = create_engine('sqlite:///my_db.sqlite',
                           connect_args={'detect_types': sqlite3.PARSE_DECLTYPES |
                                                         sqlite3.PARSE_COLNAMES},
                           native_datetime=True,
                           poolclass=NullPool,
                           convert_unicode=True)
    metadata.create_all(engine)

    # This factory is thread safe: a session object is returned (always the
    # same) to the caller. If called from another thread, the returned
    # session object will be different.
    session_factory = sessionmaker(bind=engine, expire_on_commit=False)
    self.session_maker = scoped_session(session_factory)

What I cannot understand is that session.commit() in P1 does not seem to actually write to disk, given that my reader process (P2) queries the db; i.e. P2 does not rely on a value stored in its own session, it queries the db to get something up to date (via session.query(...)). When explicitly closing the session just after the commit (session.close() right after session.commit()), everything works. If a session.close() is needed to actually write to the db, I cannot understand the need to commit... I may be missing something...

On Monday, January 20, 2014 5:56:31 PM UTC+1, Klauss wrote: [...]
Re: [sqlalchemy] Strange behaviour in multi-process environment
A heavily awkward question: you said that the error occurs both in memory and on disk. Is your disk an SSD? TRIM enabled, noatime in fstab, deadline as the I/O scheduler - any of these?

On 01/20/2014 03:05 PM, pr64 wrote: [...]
Re: [sqlalchemy] Strange behaviour in multi-process environment
On Jan 20, 2014, at 11:12 AM, pr64 pierrerot...@gmail.com wrote: [...]

First of all, by "fork another thread" I'm assuming you mean "fork a child process". In that case, you should be aware that no DBAPI I'm familiar with is consistently safe when a connection object is moved over a process boundary. If a new process is created, the engine used in that process must have an empty connection pool, meaning you're either using NullPool or you're creating a new engine in the new process. Any Session object that has begun a transaction in the parent process is absolutely not a candidate for continued use within the child process, as it is already holding onto a connection.
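One common way to follow this advice is to build the engine inside each process rather than before the fork. A sketch - the `make_session` helper and the temp path are illustrative, not from the thread:

```python
import os
import tempfile

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy.pool import NullPool

DB_PATH = os.path.join(tempfile.mkdtemp(), "my_db.sqlite")  # stand-in path

def make_session():
    # Intended to be called at the top of each process (e.g. inside a
    # multiprocessing.Process target), so no pooled DBAPI connection ever
    # crosses a fork boundary.
    engine = create_engine("sqlite:///" + DB_PATH, poolclass=NullPool)
    return sessionmaker(bind=engine, expire_on_commit=False)()

parent_session = make_session()
child_session = make_session()  # in real code this call would run in the child

# Each call produced its own engine, hence its own (empty) pool.
independent = parent_session.get_bind() is not child_session.get_bind()
```

Because each process gets a fresh engine and NullPool opens a new DBAPI connection per checkout, nothing connection-shaped survives the fork.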
Re: [sqlalchemy] Strange behaviour in multi-process environment
On Jan 20, 2014, at 3:59 PM, pr64 pierrerot...@gmail.com wrote:

> OK, sorry for my explanation, which was not right. I launch two separate
> processes from the command line. Each imports my API and therefore creates
> its own connection to the SQLite database. Committing in process 1 should
> be visible from process 2. The problem I have is that the change is only
> seen seconds later in process 2. No session is shared; each process has
> its own. What I can observe is that if I close the session just after
> committing in process 1, then process 2 sees the change as expected. If
> the session is not closed in process 1 (just committed), then the change
> is not seen in process 2... :-(

this sounds like basic isolation behavior. If process 2 is still running in a transaction that was started before process 1 committed, then the database's isolation behavior determines what can be seen in process 2 within that transaction. SQLite's concurrency behavior is not something I have a deep understanding of; as it is file-based and more simplistic than a modern MVCC-style database, it has some complicated locking and journaling behavior, described at http://sqlite.org/lang_transaction.html. The answer to your question is likely described within that documentation.

> I've got an OrmManager class:
>
>     class OrmManager:
>         def __init__(self, database, metadata, echo=False):
>             self.database = database
>             engine = create_engine('sqlite:///' + database,
>                                    echo=echo,
>                                    connect_args={'detect_types': sqlite3.PARSE_DECLTYPES |
>                                                                  sqlite3.PARSE_COLNAMES},
>                                    native_datetime=True,
>                                    poolclass=NullPool,
>                                    convert_unicode=True)
>             metadata.create_all(engine)
>             # This factory is thread safe: a session object is returned
>             # (always the same) to the caller. If called from another
>             # thread, the returned session object will be different.
>             session_factory = sessionmaker(bind=engine, expire_on_commit=False)
>             self.session = scoped_session(session_factory)
>
>         def get_session(self):
>             session = self.session()
>             return session
>
> and in P1 and P2, I instantiate it:
>
>     orm_mgr = OrmManager(database=path/to/my/.sqlite/file, metadata=METADATA)
>     session = orm_mgr.get_session()
>     # do some stuff here
>     session.commit()
>
> [...]
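The isolation behaviour described in this reply can be reproduced with the raw sqlite3 module alone, no SQLAlchemy involved. One assumption in this sketch: WAL journal mode is enabled to make the snapshot behaviour deterministic (the thread's database runs in the default rollback-journal mode, where an open reader can instead block the writer's commit):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.sqlite")

# isolation_level=None puts the driver in autocommit mode, so transactions
# are controlled explicitly with BEGIN / ROLLBACK below.
writer = sqlite3.connect(path, isolation_level=None)
writer.execute("PRAGMA journal_mode=WAL")   # assumption: WAL, for determinism
writer.execute("CREATE TABLE t (x INTEGER)")

reader = sqlite3.connect(path, isolation_level=None)
reader.execute("BEGIN")                     # reader opens its own transaction...
before = reader.execute("SELECT COUNT(*) FROM t").fetchone()[0]  # ...snapshot taken here

writer.execute("INSERT INTO t VALUES (1)")  # autocommitted: durable now

during = reader.execute("SELECT COUNT(*) FROM t").fetchone()[0]  # still the old snapshot
reader.execute("ROLLBACK")                  # end the reader's transaction
after = reader.execute("SELECT COUNT(*) FROM t").fetchone()[0]   # fresh snapshot
```

Here `before` and `during` are both 0 even though the writer has committed, and `after` is 1: the reader's open transaction pins an old snapshot, and ending it (ROLLBACK here; a commit, or session.close() as Pierre observed) is what lets the next query see the other process's committed data.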
Re: [sqlalchemy] Strange behaviour in multi-process environment
Well, that's why I asked about his FS config earlier today. I got some strange behaviours from SQLite using SSD drives with a specific set of filesystem options; but I'm also not using the latest version of SQLite - 3.7.9 here, as of Ubuntu 12.04 LTS. I never really narrowed it down, because I was on a dev machine, not production - which is read-only and doesn't have concurrency.

Best regards,
Richard.

On 2014-01-20 19:33, Michael Bayer wrote: [...]