I'll probably ask on the pgbouncer mailing list whether I can use it with advisory locks per session. If not, raw sessions will be enough. Some comments inline. Thanks.
On Sun, Apr 27, 2014 at 10:07 PM, David G Johnston <david.g.johns...@gmail.com> wrote:

> Dorian Hoxha wrote
> > Hi list,
> >
> > I am trying to use postgresql as a queue for long jobs (max ~4 hours) using
> > advisory_locks. I can't separate the long job into sub-jobs.
> >
> > 1. In the ultimate-best-case scenario there will be ~100 workers, so no
> > web-scale performance required.
> > Is there a problem with 100 open sessions (behind a connection pooler?)
> > for hours, like in my case?
> > The session will commit the progress of the job, but will stay open
> > and hold the advisory_lock till the job is done or it expires (see 2).
> > 2. Is it possible to set a time limit to auto-close an open session
> > that hasn't made a query in a while?
> > So if a worker fails and the session times out, postgresql/pgbouncer
> > will close the session and release the lock?
>
> Do you understand the difference between a session and a transaction? Your
> subject and body indicate either you don't or have a thinko/typo somewhere?
> Long-running transactions (idle or otherwise) are a major problem, while
> long-running idle sessions are simply marginally wasteful (in moderation).

I'm talking about session advisory locks. That's why I chose advisory instead
of "SELECT FOR SHARE" (a shared row lock): that looks like it must be inside a
transaction, from reading the docs. I'll keep the session open for the
duration of the whole job, and create/commit transactions to update the
progress. At the end I'll close the connection and the lock will be released.
(Rough sketch of this pattern below.)

> Why not just update the job as dispatched when it is being worked on and
> then completed when finished? You still would need to handle partial
> completion somehow, but this way you don't waste live resources during the
> long waiting period simply monitoring a lock. Though probably at your
> volume this is not that big an issue.
>
> PostgreSQL proper does not have session timeouts that I am aware of. If a
> worker fails it should release its connection and all advisory locks would
> be released...

I thought that if a worker failed the session was left open. So maybe I don't
need the session-timeout thing.

> Your reference to a connection pooler in the above doesn't make sense to me;
> you'd need persistent connections for this to work (they can be from a pool,
> but the pool size would have to be 100+).
>
> The main decision is whether job process state is part of your data model or
> simply an artifact. I'd suggest that it should be part of the model, so
> state should be managed directly through row-level locking and status
> fields instead of indirectly through temporary system advisory locks.

I maintain job-status and job-progress in the same row. But I also want users
to be able to cancel the job by updating a 'canceled' column. So every time I
report progress, I also "RETURNING canceled" to see whether the job has been
canceled by the user, and abort it if so. (Second sketch below.)

> If you want to keep track of active workers you should set up some kind of
> heartbeat update query; and maybe, depending on how important this is, attach
> advisory lock info to that heartbeat record so a monitoring process can
> check both the pulse table and the system advisory lock for a two-data-point
> confirmation of activity.

Will probably do it like this. (Third sketch below.)

> David J.
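For concreteness, the advisory-lock pattern I have in mind looks roughly like
this (a minimal sketch; the job id 12345 is a made-up value):

    -- Claim a job by taking a session-level advisory lock on its id.
    -- pg_try_advisory_lock() returns true/false immediately instead of
    -- blocking, so a worker can skip over jobs that are already taken.
    SELECT pg_try_advisory_lock(12345);

    -- ...do the work, committing progress in ordinary short transactions...

    -- Release explicitly when done; the lock is also released automatically
    -- if the session/connection dies, which is the crash-recovery property
    -- I'm relying on.
    SELECT pg_advisory_unlock(12345);

The per-session behaviour is also why pgbouncer would have to run in session
pooling mode (not transaction pooling) for this to work.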
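And the progress/cancel check, assuming a jobs table along these lines (table
and column names are mine, just for illustration):

    -- jobs(id bigint primary key, status text, progress int, canceled boolean)

    UPDATE jobs
       SET progress = 40            -- e.g. percent complete
     WHERE id = 12345
    RETURNING canceled;             -- if this comes back true, the worker aborts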
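For the heartbeat, something like this, with a hypothetical pulse table
(worker_heartbeat is my own name, nothing built in):

    -- Worker side: touch the pulse row on every progress report.
    UPDATE worker_heartbeat
       SET last_seen = now(),
           lock_key  = 12345        -- the advisory lock key, for cross-checking
     WHERE worker_id = 7;

    -- Monitor side: a stale last_seen plus a missing advisory lock in pg_locks
    -- is the two-data-point confirmation that the worker is gone.
    -- (For a single bigint key that fits in 32 bits, the key shows up in objid.)
    SELECT pid, granted
      FROM pg_locks
     WHERE locktype = 'advisory' AND objid = 12345;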