[sqlalchemy] mapping postgres sequences?

2012-05-29 Thread Chris Withers

Hi All,

How do I create a postgres sequence independent of a table using sqlalchemy?

How do I select the next value from a sequence that forms part of a 
postgres table, starting with the SQLAlchemy Table object?


cheers,

Chris

--
Simplistix - Content Management, Batch Processing & Python Consulting
- http://www.simplistix.co.uk




Re: [sqlalchemy] mapping postgres sequences?

2012-05-29 Thread Robert Forkel
ad 1)
from http://docs.sqlalchemy.org/en/rel_0_7/core/schema.html#defining-sequences :

The Sequence object also has the ability to be executed standalone
like a SQL expression, which has the effect of calling its “next
value” function:

seq = Sequence('some_sequence')
nextid = connection.execute(seq)
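
(On PostgreSQL this emits SELECT nextval('some_sequence') and returns the new value.)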


ad 2)
I determine sequences from models by inspection:

from sqlalchemy import Sequence
...

if len(table.primary_key.columns) == 1:
    pk = list(table.primary_key.columns)[0]
    if isinstance(pk.default, Sequence):
        ...
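
Putting that together, a sketch going from a Table object to the next value of
its primary-key sequence (assuming a bound engine; names are illustrative):

from sqlalchemy import Sequence

def next_pk_value(table, engine):
    # single-column primary key whose default is a Sequence?
    if len(table.primary_key.columns) == 1:
        pk = list(table.primary_key.columns)[0]
        if isinstance(pk.default, Sequence):
            # equivalent to list(engine.execute(...))[0][0] below
            return engine.execute(pk.default.next_value()).scalar()
    return None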




Re: [sqlalchemy] mapping postgres sequences?

2012-05-29 Thread Robert Forkel
working example for the first question:

from sqlalchemy import Sequence, create_engine, MetaData

if __name__ == '__main__':
    md = MetaData()
    e = create_engine('postgresql://rforkel@/test1', echo=True)
    md.bind = e
    s = Sequence('name', metadata=md)
    md.create_all()
    print list(e.execute(s.next_value()))[0][0]
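
(Because the Sequence is constructed with metadata=md, md.create_all() emits the
CREATE SEQUENCE for it; on PostgreSQL, s.next_value() renders as nextval('name').)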







Re: [sqlalchemy] mapping postgres sequences?

2012-05-29 Thread Robert Forkel
example now includes your second question:


from sqlalchemy import Sequence, create_engine, MetaData, Column, Integer
from sqlalchemy.ext.declarative import declarative_base


Base = declarative_base()

class Model(Base):
    __tablename__ = 'model'
    col = Column(Integer, Sequence('seq'), primary_key=True)


if __name__ == '__main__':
    e = create_engine('postgresql://rforkel@/test1', echo=True)
    Base.metadata.bind = e
    s = Sequence('name', metadata=Base.metadata)
    Base.metadata.create_all()
    print list(e.execute(s.next_value()))[0][0]

    for col in Model.__table__.columns:
        if col.name == 'col':
            print list(e.execute(col.default.next_value()))[0][0]
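
The loop isn't strictly necessary, by the way; the Sequence attached to the
column can be reached directly through the table's column collection:

seq = Model.__table__.c.col.default
print e.execute(seq.next_value()).scalar()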







[sqlalchemy] Re: Lock table, do things to table, unlock table: Best way?

2012-05-29 Thread Jeff
Thanks, Michael,

Just to make clear what exactly begin_nested() is contributing:

Normal case:
session.rollback() goes back to the last session.commit()

session.begin_nested() case:
session.rollback() goes back to the last session.begin_nested() or
session.commit(), whichever occurred last.

Correct?
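
A minimal sketch of the behavior I'm describing (names are placeholders):

session.add(a)           # part of the outer transaction
session.begin_nested()   # emits SAVEPOINT
session.add(b)
session.rollback()       # rolls back to the SAVEPOINT: b is discarded, a remains
session.commit()         # commits the outer transaction: a is persisted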


On May 28, 11:54 am, Michael Bayer mike...@zzzcomputing.com wrote:
 An option to add along to the unique constraint, if you expect to get 
 collisions often, is to use a SAVEPOINT so that a process can roll back 
 partially if this particular INSERT fails, then use the row.  The Session 
 offers SAVEPOINT via begin_nested():

 session.begin_nested()
 try:
     session.add(thing_that_may_exist_already)
     session.commit()  # flushes, and commits only the savepoint
 except exc.IntegrityError:
     session.rollback()
     thing_that_may_exist_already = session.query(Thing).filter_by(criterion).one()

 the difference between using locks to prevent concurrent dupes versus using 
 constraints and expecting dupes to fail is known as pessimistic versus 
 optimistic locking.
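
(In the optimistic pattern above, once the INSERT fails with IntegrityError the
row is known to exist, since a concurrent transaction inserted and committed it,
so the follow-up query retrieves that existing row.)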

 On May 28, 2012, at 10:38 AM, Jeff wrote:

  The unique constraint sounds like a workable solution! I'll implement
  that with a try/except and report back if that was effective. Thanks!

  On May 28, 5:43 am, Simon King si...@simonking.org.uk wrote:
  On Sun, May 27, 2012 at 6:18 PM, Jeff jeffalst...@gmail.com wrote:
  Thanks,

  I have indeed spent a lot of time looking at SELECT FOR UPDATE, but as
  far as I can tell that locks rows that have been selected. That is not
  helpful in this use case, in which the issue is rows not existing, and
  then later existing. Am I misunderstanding?

  On May 27, 11:48 am, A.M. age...@themactionfaction.com wrote:
  On May 27, 2012, at 1:07 AM, Jeff wrote:

  I have multiple processes accessing a table. All of these processes
  want to read a set of rows from the table, and if the rows are not
  present they will make a calculation and insert the rows themselves.
  The issue comes when process A does a query to see if the target set
  of rows is present in the table, finds that they're not, and then
  starts calculating. While it's calculating, process B inserts the
  rows. Then process A inserts the rows, and now we have two copies of
  these sets of rows. Bad.

  You should look at SELECT FOR UPDATE.

 http://docs.sqlalchemy.org/en/rel_0_7/orm/query.html?highlight=lockmo...

  Cheers,
  M

  Could you put unique constraints on the table so that the second
  process will get an error when it tries to insert the duplicate rows?
  It won't prevent you from performing the calculations twice, but at
  least you won't get the duplicates.

  Another option would be to write some sort of pending marker into
  the table, so that subsequent processes know that the result is
  already being calculated.

  Simon
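
For reference, a sketch of Simon's unique-constraint suggestion (table and
column names are made up for illustration):

from sqlalchemy import Table, Column, Integer, MetaData, UniqueConstraint

metadata = MetaData()
results = Table('results', metadata,
    Column('id', Integer, primary_key=True),
    Column('set_id', Integer),
    Column('row_no', Integer),
    # duplicate (set_id, row_no) pairs now raise IntegrityError on INSERT
    UniqueConstraint('set_id', 'row_no', name='uq_results_set_row'),
)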





[sqlalchemy] Re: Multiple inserts with .append()'d associations

2012-05-29 Thread Jeff
 engine.dispose() is an inefficient operation, as the Engine is a
 factory for connections, not a connection itself.   If you'd like
 there to be zero actual database connections open when the application
 is idle, you can disable pooling using NullPool.

Very good point. Will do!
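
For reference, switching to NullPool would look something like this (reusing
the connection URL from earlier in the thread):

from sqlalchemy import create_engine
from sqlalchemy.pool import NullPool

# NullPool closes each connection when it is released instead of pooling it,
# so an idle application holds no database connections
e = create_engine('postgresql://rforkel@/test1', poolclass=NullPool)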

To be more clear, I was actually doing this:

all_events = list(session.query(Event).filter().order_by(...).all())
session.close()
session.bind.dispose()

for a in analyses_parameters:
    analysis = Analysis()
    events_in_this_analysis = all_events[...]
    analysis.events = events_in_this_analysis
    session.add(analysis)

session.commit()
session.close()
session.bind.dispose()

Thus, there is no connection to the database during the for loop. And
that loop takes a long time, so the lack of a connection is much
appreciated. The issue comes during the session.commit(), which has
thousands of Analyses with hundreds of thousands of associated Events.
I can watch the database and the open connections, and I see that:

1. All these Analyses are inserted (which takes a while).
2. THEN the Analysis-Event associations are inserted, which also takes
a while. It's during this time that I see the connection for this
session go idle for a few cycles, then insert, then go idle, then
insert, etc. This is the behavior that seemed like it could be
improved. It's unclear to me from your comments whether or not it
actually can be.

 SQLAlchemy 0.7 will also perform the INSERT of all the above records
 as a single prepared statement if you assign a primary key value to
 the objects ahead of time

Good idea, but unfortunately not doable in this case; lots of
independent processes are doing this at the same time, so I don't see
a coherent way to effectively determine primary keys ahead of time.

 Or, you can get the same effect by inserting the records using a non-
 ORM executemany, where in this case you wouldn't need to pre-assign
 primary key values:

 Session.execute(Event.__table__.insert(), params=[{'a': evt.a,
 'b': evt.b, ...} for evt in events])
 ...

I'd actually be inserting into Analysis_Event_Association. That
association table needs the primary keys of the Analyses and the
Events. Doing that manually might look like:

(using NullPool)
all_events = session.query(Event).filter().order_by(...).values('id')
session.close()

for a in analyses_parameters:
    analysis = Analysis()
    session.add(analysis)
session.commit()
session.close()

all_analyses = session.query(Analysis).filter().order_by(...).values('id')
session.close()

for a in all_analyses:
    events_in_this_analysis_ids = all_events[...]
    for e in events_in_this_analysis_ids:
        session.add(Analysis_Event_Association(analysis_id=a,
                                               event_id=e))
session.commit()
session.close()
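
That last loop could also collapse into a single non-ORM executemany along the
lines Michael suggested (the [...] slicing is a placeholder as above):

assoc_rows = [{'analysis_id': a, 'event_id': e}
              for a in all_analyses
              for e in all_events[...]]
session.execute(Analysis_Event_Association.__table__.insert(), params=assoc_rows)
session.commit()
session.close()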

That seems like it would be the most efficient in terms of connections
only being open during the session.commit(), and each of those having
the most efficient INSERTing. The Python end of things, however, looks
a lot less elegant/simple, making me think I'm missing something.

Thanks!

On May 28, 12:11 pm, Michael Bayer mike...@zzzcomputing.com wrote:
 On May 28, 2012, at 10:52 AM, Jeff wrote:

  There are Analyses, which have Events. There's an
  Analysis_Event_Association table. Events are written before the
  Analyses. Later, I have a for loop calculating multiple Analyses. For
  each Analysis, we identify the Events it includes and append them,
  using sqlalchemy's relationship plumbing. This just writes the
  appropriate row to the association table. It works great:

  all_events = list(session.query(Event).filter().order_by(...).all())

  for a in analyses_parameters:
      analysis = Analysis()
      events_in_this_analysis = all_events[...]
      analysis.events = events_in_this_analysis
      session.add(analysis)

  session.commit()
  session.close()
  session.bind.dispose()

  The issue is that I'm in a situation where there are many, many
  processes writing to the same database, and the database only allows
  us a limited number of connections. So I'm trying to have connections
  open only when they're actually being used, hence the
  session.bind.dispose().

 engine.dispose() is an inefficient operation, as the Engine is a factory for 
 connections, not a connection itself.   If you'd like there to be zero actual 
 database connections open when the application is idle, you can disable 
 pooling using NullPool.



  Writing all the analyses is a single insert, and thus is efficient
  with regard to having a connection open. HOWEVER, it appears that
  each of the Event associations we appended is being inserted
  individually, which is bad. The connection sits there, then is used,
  then sits there, then is used, etc. This contributes to the maximum
  connections open, which is a limited resource.

 Just to make sure we're on the same page, the Session procures a single DBAPI 
 connection, and uses it for the lifespan of that transaction.