[sqlalchemy] Re: INSERT IGNORE
I've added ticket #804 for this.

On Oct 1, 2007, at 5:35 PM, Jim Musil wrote:
> I'm not seeing the ability to use the INSERT IGNORE ... syntax. Is this available anywhere?
> Jim Musil

--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to [EMAIL PROTECTED]. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en
[sqlalchemy] Re: limit=bindparam(result_limit)
I've added ticket #805 for the LIMIT/OFFSET bind parameter feature.
[sqlalchemy] Re: INSERT IGNORE
We could potentially support a mysql_ignore keyword argument to the Insert() construct in order to achieve this (we've been adding a few Postgres flags as of late). In the interim you'd have to use textual clauses for this.

On Oct 1, 2007, at 5:35 PM, Jim Musil wrote:
> I'm not seeing the ability to use the INSERT IGNORE ... syntax. Is this available anywhere?
> Jim Musil
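[Editor's note: a minimal sketch of the textual-clause route. The users table is made up for illustration, and the stdlib sqlite3 driver is used so the sketch runs anywhere; SQLite spells the construct INSERT OR IGNORE, whereas on MySQL the same raw-SQL string would read INSERT IGNORE and be handed to the engine's execute() as text.]

```python
import sqlite3

# Hypothetical table for illustration; on MySQL the statement below
# would read "INSERT IGNORE INTO ...", passed as a textual clause.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'jim')")

# SQLite's spelling of MySQL's INSERT IGNORE: the duplicate primary
# key is silently skipped instead of raising an integrity error.
conn.execute("INSERT OR IGNORE INTO users (id, name) VALUES (1, 'jim')")
conn.execute("INSERT OR IGNORE INTO users (id, name) VALUES (2, 'bob')")

rows = conn.execute("SELECT id, name FROM users ORDER BY id").fetchall()
print(rows)  # [(1, 'jim'), (2, 'bob')]
```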
[sqlalchemy] non-blocking row locks?
Hi, I am attempting to implement a job queue in a postgres database. There is a simple job table in which each row represents a job to be run. There are multiple dispatcher threads that pull jobs off the queue and run them. I need concurrency control to prevent multiple threads from dispatching the same job. I think the most elegant place to do this is in the database itself using row locking (rather than with OS or application concurrency control mechanisms, which would restrict the code to a single process or host).

I can get and lock the next job using:

    job = session.query(Job).with_lockmode('update').first()

However, another thread running the same query would presumably block until the first releases the lock. Is there a non-blocking update mode, such that the second thread returns immediately (and I can look for a different job), or some way for the query to exclude locked jobs?

Apologies if this is a sqlalchemy 101 (or SQL 101) question; I'm new to it all and I've not been able to find answers via FAQs/google.
[sqlalchemy] Re: non-blocking row locks?
On Oct 3, 2007, at 10:40 AM, sacha wrote:

> Hi, I am attempting to implement a job queue in a postgres database. There is a simple job table in which each row represents a job to be run. There are multiple dispatcher threads that pull jobs off the queue and run them. I need concurrency control to prevent multiple threads from dispatching the same job. I can get and lock the next job using job = session.query(Job).with_lockmode('update').first() However, another thread running the same query would presumably block until the first releases the lock. Is there a non-blocking update mode, such that the second thread returns immediately (and I can look for a different job), or some way for the query to exclude locked jobs?

I usually take a low-tech approach to this problem and update the rows which I want to process with a status flag, such as IN PROGRESS. Subsequent queries for job rows by other threads query for rows which have QUEUED as their status flag, thereby ignoring the IN PROGRESS rows. That way nothing is locked outside the span of a single short-running transaction, i.e.:

    BEGIN
    SELECT * FROM jobs WHERE status='QUEUED' FOR UPDATE
    UPDATE jobs SET status='IN PROGRESS' WHERE status='QUEUED'
    COMMIT

If you want just one job, adjust the WHERE criterion of the UPDATE statement accordingly to match the job(s) you are actually going to process (or just use an ORM flush if you're using the ORM).
With postgres 8.2 (and the latest trunk of 0.4) you can even do a RETURNING and get the whole thing in one query:

    UPDATE jobs SET status='IN PROGRESS' WHERE status='QUEUED' RETURNING *

When jobs are finished I usually mark them as COMPLETE; that way you get a log of job history as a bonus.
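[Editor's note: the status-flag checkout above can be sketched end-to-end. The jobs table is made up, and the stdlib sqlite3 driver is used so the sketch is self-contained; SQLite has no SELECT ... FOR UPDATE, it simply serializes writers, so the FOR UPDATE step is omitted here. On Postgres the SELECT inside the transaction would carry FOR UPDATE as shown in the reply.]

```python
import sqlite3

# Hypothetical jobs table for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany("INSERT INTO jobs (status) VALUES (?)",
                 [("QUEUED",), ("QUEUED",), ("COMPLETE",)])

def check_out_one_job(conn):
    """Claim one QUEUED job by flipping its status to IN PROGRESS."""
    with conn:  # one short transaction, nothing stays locked afterwards
        row = conn.execute(
            "SELECT id FROM jobs WHERE status='QUEUED' "
            "ORDER BY id LIMIT 1").fetchone()
        if row is None:
            return None
        conn.execute(
            "UPDATE jobs SET status='IN PROGRESS' "
            "WHERE id=? AND status='QUEUED'", (row[0],))
        return row[0]

first = check_out_one_job(conn)
second = check_out_one_job(conn)
print(first, second)  # two different job ids
```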
[sqlalchemy] 04ormtutorial: IntegrityError: (IntegrityError) 1062, Duplicate entry
Hi! I'm using the tutorial at http://www.sqlalchemy.org/docs/04/ormtutorial.html#datamapping_manytomany

When I create and save the first object, all works fine. But when I save a second post object with the SAME KEYWORD:

    wendy = session.query(User).filter_by(name='wendy').one()
    post = BlogPost("Wendy's Blog Post #2", "This is a test #2", wendy)
    post.keywords.append(Keyword('wendy'))  # 'wendy' already exists
    post.keywords.append(Keyword('2ndpost'))

I get an exception:

    sqlalchemy.exceptions.IntegrityError: (IntegrityError) (1062, "Duplicate entry 'wendy' for key 2") u'INSERT INTO keywords (keyword) VALUES (%s)' ['wendy']

How can I avoid that? How can I reuse the old keyword as an object?
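[Editor's note: the usual fix is a get-or-create step: look the keyword up first and construct a new Keyword only when the lookup comes back empty; in ORM terms, roughly session.query(Keyword).filter_by(keyword='wendy').first() or Keyword('wendy'). Below is a self-contained sketch of the same idea at the SQL level, using the stdlib sqlite3 driver and a made-up keywords table.]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE keywords (id INTEGER PRIMARY KEY, keyword TEXT UNIQUE)")

def get_or_create_keyword(conn, word):
    """Return the id of an existing keyword, inserting only if it is new."""
    row = conn.execute(
        "SELECT id FROM keywords WHERE keyword=?", (word,)).fetchone()
    if row is not None:
        return row[0]  # reuse the existing row instead of re-inserting
    cur = conn.execute("INSERT INTO keywords (keyword) VALUES (?)", (word,))
    return cur.lastrowid

first = get_or_create_keyword(conn, "wendy")
second = get_or_create_keyword(conn, "wendy")  # reused, no IntegrityError
print(first == second)  # True
```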
[sqlalchemy] Re: non-blocking row locks?
Coincidentally I had to implement almost exactly the same thing today. I used separate job checkout/checkin transactions, a serializable isolation level to find out when there is a collision, and a job checkout time to see which jobs are currently running. By default the checkout time is a special date way in the past instead of NULL, to make it work better with indexing. This implements a kind of optimistic locking that throws serialization errors when two workers check out concurrently. I'm hoping this won't hit performance problems, because the checkout process is quite fast compared to the processing, although there will be around 40 workers running concurrently in the cluster.

The code I used is basically this (slightly edited to omit confidential/superfluous stuff):

    def check_out_dataset_for_processing():
        for retries in xrange(MAX_RETRIES):
            try:
                session = Session()
                # This is necessary to avoid duplicate checkouts
                session.connection(Dataset).execute(
                    "SET TRANSACTION ISOLATION LEVEL SERIALIZABLE")
                # A database-side CURRENT_TIMESTAMP would be a lot better
                now = datetime.now()
                timeout_horizon = now - timedelta(seconds=TIMEOUT)
                dataset = session.query(Dataset).filter_by(is_processed=False)\
                    .filter(Dataset.last_checkout < timeout_horizon).first()
                if dataset:
                    # If we found something, mark it checked out
                    dataset.last_checkout = now
                    result = dataset.id, dataset.data_for_processing
                else:
                    result = None
                session.commit()
                return result
            except sqlalchemy.exceptions.ProgrammingError, e:
                if e.orig.pgcode != '40001':  # ignore serialization conflicts
                    raise
        logger.error('Failed to checkout a dataset')

    def store_processing_result(dataset_id, processing_result):
        session = Session()
        dataset = session.query(Dataset).filter_by(id=dataset_id).first()
        dataset.result = processing_result
        dataset.is_processed = True
        session.commit()

In my case duplicate execution is only a performance issue, so when a worker times out due to crashing or just being slow, that dataset is handed to another worker.
Though this code can easily be modified to do something different in case of timeout. I don't have a separate job table, and the datasets table has millions of rows, so to get good performance I used a partial index (currently only in trunk):

    Index('unprocessed_datasets_idx', datasets.c.last_checkout,
          postgres_where=datasets.c.is_processed == False)

This reduces the job lookup to a simple index lookup.
[sqlalchemy] SELECT executemany
Is there some other way I can execute a query using a list as a bound parameter?

    res = db.db_con.text('SELECT module_extra_key_name FROM module_extra_keys '
                         'WHERE module_id IN :module_ids').execute({'module_ids': [1, 2]})

The above is giving me a "You cannot execute SELECT statements in executemany()" error. This is while I'm running 0.3.8.
[sqlalchemy] Re: mySQL and timedelta
Noufal wrote:

> Hello everyone, I'm having some trouble with a query involving a timedelta object with a MySQL backend (MyISAM). I have a table called Run that has two fields like so:
>
>     sa.Column('starttime', sa.TIMESTAMP),
>     sa.Column('endtime', sa.TIMESTAMP)
>
> I'm trying to find all rows that have endtime - starttime equal to some timedelta obtained from a user. The query is constructed like this:
>
>     l = datetime.timedelta(seconds=907)
>     cc = ((Run.c.endtime - Run.c.starttime) == l)
>     t = q.select(cc)
>
> The echoed SQL is:
>
>     2007-10-01 14:10:12,572 INFO sqlalchemy.engine.base.Engine.0x..4c SELECT runs.status AS runs_status, runs.workdir AS runs_workdir, runs.machinetype AS runs_machinetype, runs.endtime AS runs_endtime, runs.hostname AS runs_hostname, runs.cmdline AS runs_cmdline, runs.incremental AS runs_incremental, runs.`user` AS runs_user, runs.starttime AS runs_starttime, runs.rid AS runs_rid FROM runs WHERE (runs.endtime - runs.starttime) = %s ORDER BY runs.rid
>     2007-10-01 14:10:12,572 INFO sqlalchemy.engine.base.Engine.0x..4c [datetime.timedelta(0, 907)]
>
> The %s looks funny there. This returns the wrong row, and when I run this kind of query via my script (which is web based), I get messages in the server error log that look like this:
>
>     /usr/lib/python2.5/site-packages/sqlalchemy/databases/mysql.py:313: Warning: Truncated incorrect DOUBLE value: '00:08:40'
>       cursor.execute(statement, parameters)
>
> I expect the users to input values like HH:MM:SS, which I split and use to create the timedelta object. I'd appreciate any help.

MySQL doesn't support first-class intervals or arithmetic operators for date types that I'm aware of. The date math functions can be used for this sort of calculation, such as:

    cc = (func.timestampdiff(text('second'), Run.c.starttime, Run.c.endtime) == 907)
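[Editor's note: since the poster builds the timedelta from a user-supplied HH:MM:SS string, a small helper (the name hms_to_seconds is made up) shows the split-and-convert step that yields the integer seconds to compare against TIMESTAMPDIFF's result.]

```python
from datetime import timedelta

def hms_to_seconds(text):
    """Parse a user-supplied 'HH:MM:SS' string into whole seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    delta = timedelta(hours=hours, minutes=minutes, seconds=seconds)
    return int(delta.total_seconds())

print(hms_to_seconds("00:15:07"))  # 907, the value compared in the thread
```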
[sqlalchemy] Re: INSERT IGNORE
Thanks!

On Oct 3, 9:13 am, Michael Bayer [EMAIL PROTECTED] wrote:
> ive added ticket #804 for this.
>
> On Oct 1, 2007, at 5:35 PM, Jim Musil wrote:
>> I'm not seeing the ability to use the INSERT IGNORE ... syntax. Is this available anywhere?
>> Jim Musil
[sqlalchemy] Re: SELECT executemany
On Oct 3, 2007, at 2:46 PM, Paul Kippes wrote:

> Is there some other way I can execute a query using a list as a bound parameter? res = db.db_con.text('SELECT module_extra_key_name FROM module_extra_keys WHERE module_id IN :module_ids').execute({'module_ids': [1,2]}) The above is giving me a "You cannot execute SELECT statements in executemany()" error. This is while I'm running 0.3.8.

It's not really possible in the way you describe, since the IN clause requires explicit bind parameters spelled out in a list, i.e. IN (?, ?, ?, ?). Using SQL expression constructs will generate the bind params for you, i.e.:

    select([sometable.c.extra_key_name],
           sometable.c.module_id.in_(1, 2)).execute()
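[Editor's note: the expansion described above, one bind parameter per list element, can be sketched by hand with the stdlib sqlite3 driver; the module_extra_keys table and its contents are made up for illustration. This is essentially what in_() generates for you.]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE module_extra_keys (module_id INTEGER, key_name TEXT)")
conn.executemany("INSERT INTO module_extra_keys VALUES (?, ?)",
                 [(1, "alpha"), (2, "beta"), (3, "gamma")])

module_ids = [1, 2]
# Build "IN (?, ?)" with one placeholder per element, then pass the
# list elements as individual bind parameters.
placeholders = ", ".join("?" for _ in module_ids)
sql = ("SELECT key_name FROM module_extra_keys "
       "WHERE module_id IN (%s) ORDER BY module_id" % placeholders)
names = [row[0] for row in conn.execute(sql, module_ids)]
print(names)  # ['alpha', 'beta']
```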