[sqlalchemy] Another query on many to many relation
Hello,

I have two classes, Host and Reservation, in a many-to-many relation, and three tables: hosts, reservation, and reservation_hosts (the association table). I need to get a table of (host, reservation) pairs *where each host is unique and is paired with the reservation that has the earliest date for that host (or None in place of a reservation)*. If I do:

    s = session.query(Host, Reservation).outerjoin(Host.reservations) \
         .order_by(Host.id, Reservation.start_date)

...that works, but produces more than one (not just the earliest) reservation for a given host:

    2009-04-08 14:46:47,375 INFO sqlalchemy.engine.base.Engine.0x...3c94
    SELECT hosts.id AS hosts_id, hosts.ip AS hosts_ip, hosts.hostname AS hosts_hostname,
      hosts.location AS hosts_location, hosts.architecture_id AS hosts_architecture_id,
      hosts.os_kind_id AS hosts_os_kind_id, hosts.os_version_id AS hosts_os_version_id,
      hosts.additional_info AS hosts_additional_info, hosts.column_12 AS hosts_column_12,
      hosts.column_13 AS hosts_column_13, hosts.username AS hosts_username,
      hosts.password AS hosts_password, hosts.alias AS hosts_alias,
      hosts.virtualization_id AS hosts_virtualization_id, hosts.shareable AS hosts_shareable,
      hosts.shareable_between_projects AS hosts_shareable_between_projects,
      hosts.notes AS hosts_notes, hosts.cpu AS hosts_cpu, hosts.ram AS hosts_ram,
      hosts.column_24 AS hosts_column_24, hosts.batch AS hosts_batch,
      hosts.asset AS hosts_asset, hosts.owner AS hosts_owner,
      hosts.ssh_key_present AS hosts_ssh_key_present,
      hosts.machine_type_model AS hosts_machine_type_model,
      hosts.mac_address_eth_0 AS hosts_mac_address_eth_0,
      hosts.physical_box AS hosts_physical_box, hosts.up_n_running AS hosts_up_n_running,
      hosts.available AS hosts_available,
      hosts.earliest_reservation_id AS hosts_earliest_reservation_id,
      hosts.project_id AS hosts_project_id,
      reservation.id AS reservation_id, reservation.start_date AS reservation_start_date,
      reservation.end_date AS reservation_end_date, reservation.status AS reservation_status,
      reservation.businessneed AS reservation_businessneed,
      reservation.notetohwrep AS reservation_notetohwrep,
      reservation.email_id AS reservation_email_id,
      reservation.project_id AS reservation_project_id
    FROM hosts
    LEFT OUTER JOIN reservation_hosts AS reservation_hosts_1
      ON hosts.id = reservation_hosts_1.host_id
    LEFT OUTER JOIN reservation
      ON reservation.id = reservation_hosts_1.reservation_id
    ORDER BY hosts.id, reservation.start_date

    (host.id, reservation.start_date)
    344  2009-04-13
    344  2009-04-20
    354  2009-04-13
    354  2009-04-20
    355  2009-04-13
    355  2009-04-20
    ...

I need to get a table like the following:

    (host.id, reservation.start_date)
    344  2009-04-13
    354  2009-04-13
    355  2009-04-13

I managed to write a plain SQL (Postgres) query producing the result I need, but I have no idea how to translate it into SQLA:

    SELECT hosts.id AS hosts_id, hosts.ip AS hosts_ip, hosts.hostname AS hosts_hostname,
      hosts.location AS hosts_location, hosts.architecture_id AS hosts_architecture_id,
      hosts.os_kind_id AS hosts_os_kind_id, hosts.os_version_id AS hosts_os_version_id,
      hosts.additional_info AS hosts_additional_info, hosts.column_12 AS hosts_column_12,
      hosts.column_13 AS hosts_column_13, hosts.username AS hosts_username,
      hosts.password AS hosts_password, hosts.alias AS hosts_alias,
      hosts.virtualization_id AS hosts_virtualization_id, hosts.shareable AS hosts_shareable,
      hosts.shareable_between_projects AS hosts_shareable_between_projects,
      hosts.notes AS hosts_notes, hosts.cpu AS hosts_cpu, hosts.ram AS hosts_ram,
      hosts.column_24 AS hosts_column_24, hosts.batch AS hosts_batch,
      hosts.asset AS hosts_asset, hosts.owner AS hosts_owner,
      hosts.ssh_key_present AS hosts_ssh_key_present,
      hosts.machine_type_model AS hosts_machine_type_model,
      hosts.mac_address_eth_0 AS hosts_mac_address_eth_0,
      hosts.physical_box AS hosts_physical_box, hosts.up_n_running AS hosts_up_n_running,
      hosts.available AS hosts_available,
      hosts.earliest_reservation_id AS hosts_earliest_reservation_id,
      hosts.project_id AS hosts_project_id,
      reservation.id AS reservation_id, reservation.start_date AS reservation_start_date,
      reservation.end_date AS reservation_end_date, reservation.status AS reservation_status,
      reservation.businessneed AS reservation_businessneed,
      reservation.notetohwrep AS reservation_notetohwrep,
      reservation.email_id AS reservation_email_id,
      reservation.project_id AS reservation_project_id
    FROM hosts
    LEFT OUTER JOIN reservation_hosts AS reservation_hosts_1
      ON hosts.id = reservation_hosts_1.host_id
    LEFT OUTER JOIN reservation
      ON reservation.id = reservation_hosts_1.reservation_id
    INNER JOIN (SELECT rh.host_id, MIN(r.start_date)
                FROM reservation AS r
                INNER JOIN reservation_hosts AS rh ON r.id = rh.reservation_id
                GROUP BY rh.host_id) AS min_date(host_id, start_date)
      ON hosts.id = min_date.host_id
     AND reservation.start_date = min_date.start_date
    WHERE reservation.id IS NOT NULL
    ORDER BY hosts.id, reservation.start_date

Does anybody have an idea?

Regards, mk
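[Editor's sketch, not part of the thread.] The min_date derived-table pattern in the Postgres query is plain SQL, so it can be reproduced runnably with the stdlib sqlite3 module and a toy version of the schema (table contents here are invented for illustration). In 0.5-era SQLA the derived table would presumably be built with select([...]).group_by(...).alias() and joined into the query(Host, Reservation) the same way:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hosts (id INTEGER PRIMARY KEY);
    CREATE TABLE reservation (id INTEGER PRIMARY KEY, start_date TEXT);
    CREATE TABLE reservation_hosts (reservation_id INTEGER, host_id INTEGER);
    INSERT INTO hosts VALUES (344), (354);
    INSERT INTO reservation VALUES (1, '2009-04-13'), (2, '2009-04-20');
    INSERT INTO reservation_hosts VALUES (1, 344), (2, 344), (1, 354), (2, 354);
""")

rows = conn.execute("""
    SELECT hosts.id, r.start_date
    FROM hosts
    JOIN reservation_hosts rh ON hosts.id = rh.host_id
    JOIN reservation r        ON r.id = rh.reservation_id
    -- derived table: earliest start_date per host
    JOIN (SELECT rh.host_id AS host_id, MIN(r.start_date) AS start_date
          FROM reservation r
          JOIN reservation_hosts rh ON r.id = rh.reservation_id
          GROUP BY rh.host_id) AS min_date
      ON hosts.id = min_date.host_id
     AND r.start_date = min_date.start_date
    ORDER BY hosts.id
""").fetchall()

print(rows)  # [(344, '2009-04-13'), (354, '2009-04-13')]
```

Each host now appears exactly once, paired with its earliest reservation, because the join against the grouped subquery filters out every later start_date.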
[sqlalchemy] Many to Many using self-reference in Declarative
Hi,

To implement a many-to-many relationship, I created a second plain table which contains two columns referencing the same user table:

    user_target_table = Table('user_target', metadata,
        Column('user_id', Integer, ForeignKey('tg_user.user_id')),
        Column('target_id', Integer, ForeignKey('tg_user.user_id')),
        Column('relation_created', DateTime,
               onupdate=datetime.datetime.now, default=datetime.datetime.now)
    )

Yet I don't know how to declare the attribute in the User class:

    class User(Base):
        ...
        target = relation('User', secondary=user_target_table, backref='user')
        ...

Is that correct, or does it need to be declared twice within it?

You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en
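[Editor's sketch, not part of the thread.] A self-referential many-to-many needs explicit primaryjoin/secondaryjoin conditions, since both columns of the association table point at tg_user. The sketch below uses the modern spelling (relationship and declarative_base from sqlalchemy.orm, available in 1.4+; the 0.5-era code would use relation and sqlalchemy.ext.declarative instead). The backref name "targeted_by" is invented here:

```python
import datetime

from sqlalchemy import Column, DateTime, ForeignKey, Integer, Table, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

user_target_table = Table(
    "user_target", Base.metadata,
    Column("user_id", Integer, ForeignKey("tg_user.user_id")),
    Column("target_id", Integer, ForeignKey("tg_user.user_id")),
    Column("relation_created", DateTime,
           onupdate=datetime.datetime.now, default=datetime.datetime.now),
)

class User(Base):
    __tablename__ = "tg_user"
    user_id = Column(Integer, primary_key=True)
    # Both FK columns reference tg_user, so spell out which one joins which
    # side; this is declared once, and the backref provides the reverse view.
    targets = relationship(
        "User",
        secondary=user_target_table,
        primaryjoin=user_id == user_target_table.c.user_id,
        secondaryjoin=user_id == user_target_table.c.target_id,
        backref="targeted_by",
    )

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = Session(engine)

a, b = User(), User()
a.targets.append(b)
session.add(a)
session.commit()

print(a in b.targeted_by)  # True
```

So the attribute is declared once; the reverse direction comes for free via the backref rather than a second declaration.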
[sqlalchemy] Re: Mapper with declarative
Got it! Problem solved!

On 7 Apr., 19:16, J. Cliff Dyer j...@sdf.lonestar.org wrote:

On Tue, 2009-04-07 at 06:28 -0700, Scripper wrote:

Hello everyone, I have a concrete question about using the declarative mapper in my project. When I tried it, some errors appeared. I wonder whether the declarative mapper supports two relations in a single class. Thanks!

Well, so far, that's an abstract question. The answer (as Michael told you) is yes. But what errors are you getting, specifically (a full traceback would be helpful)?

Cheers, Cliff
[sqlalchemy] Re: When to create new sessions?
On Wednesday 08 April 2009 05:53:12 Michael Bayer wrote:

On Apr 7, 2009, at 6:07 PM, rintin...@googlemail.com wrote:

Hey everyone, I'm new to SQLAlchemy (used to Django's ORM but need to switch) and there's just one thing I'm struggling with, which is: when am I supposed to create Sessions? I am of course creating scoped sessions. I feel like a real dunce for not being able to get my head around it. Do I create one per request and pass it around? That just doesn't feel quite right to me. Or can I create them at module level when I need them?

per-request is the most natural approach. The point of the scoped session is that you can use it as a global object; there's no need to pass it around. It automatically routes operations to a thread-local session. I'm sure Django does something similar. The chapter on sessions includes a discussion of integrating the scoped session within a web application; you should check it out.

Also, is it okay to call commit() more than once on the same session?

absolutely.

On a per-function basis even (seems like an awful lot of boilerplate code in each function though... surely not?!)

depending on what you're doing, this may or may not be appropriate. Boilerplate can be cut down using a decorator, such as:

    @commits
    def do_some_stuff(...):

the decorator:

    def commits(fn):
        def go(*args, **kw):
            try:
                return fn(*args, **kw)
                Session.commit()

Not to be too nitpicky... but this commit is never reached. And dangling transactions can be very irritating. I'd go for this (untested):

    def transaction(f):
        def _wrapper(*args, **kwargs):
            commit = True
            try:
                return f(*args, **kwargs)
            except:
                commit = False
            finally:
                (session.commit if commit else session.rollback)()

Diez
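[Editor's sketch, not part of the thread.] The decorator idea can be fleshed out so it actually commits on success, rolls back on failure, and re-raises the exception. This database-free version takes the session explicitly (with a scoped session it could just as well be a module-level global); FakeSession is a stand-in invented so the sketch runs anywhere:

```python
import functools

def transaction(session):
    """Decorator factory: commit `session` if fn returns, roll back and
    re-raise if it throws."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            try:
                result = fn(*args, **kwargs)
            except Exception:
                session.rollback()
                raise            # don't swallow the error
            session.commit()     # reached only on success
            return result
        return wrapper
    return decorator

class FakeSession:
    """Stand-in for a SQLAlchemy Session, recording what was called."""
    def __init__(self):
        self.calls = []
    def commit(self):
        self.calls.append("commit")
    def rollback(self):
        self.calls.append("rollback")

session = FakeSession()

@transaction(session)
def do_some_stuff():
    return 42

print(do_some_stuff())  # 42, and session.calls is now ["commit"]
```

Unlike the try/finally variant, an exception here still propagates to the caller after the rollback, so failures are not silently swallowed.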
[sqlalchemy] Re: 0.5.3 and mssql
On Tue, Apr 7, 2009 at 2:58 PM, Lukasz Szybalski szybal...@gmail.com wrote:

Hello, is this still the proper connection string for mssql, where I specify the TDS version and a connection driver name?

    e = sqlalchemy.create_engine("mssql://user:p...@server:1433/db_name?driver=TDS&odbc_options='TDS_Version=8.0'")

1. Just to let you know... the dsn connection works as long as TDS_Version = 8.0 is provided in the dsn settings of FreeTDS for mssql 2000.

2. I went over your samples at http://www.sqlalchemy.org/trac/browser/sqlalchemy/trunk/lib/sqlalchemy/databases/mssql.py and using

    mssql:///?odbc_connect=dsn%3Dmydsn%3BDatabase%3Ddb

I was able to get a proper dsn string. The syntax is weird because the code needs to look like this and needs to use %s to substitute the string. I don't know if you run into unicode problems or escaping issues. Is there a better way to do this?

    import urllib
    url = urllib.quote_plus('DRIVER={TDS};Server=servername;Database=dbname;UID=user;PWD=pass;port=123;TDS_Version=8.0;')
    e = sqlalchemy.create_engine("mssql:///?odbc_connect=%s" % url)

I wish you guys did this in code so I could only provide something like this:

    e = sqlalchemy.create_engine("mssql:///?odbc_connect='DRIVER={TDS};Server=servername;Database=dbname;UID=user;PWD=pass;port=123;TDS_Version=8.0;'")

Notice the single quote after odbc_connect='. If not, could you copy the above three lines of code I've put as an example and add them to your docs for mssql? I would never have known how to get here without being subscribed to this list, knowing about the changes, and knowing to look at the source code file.

3. With a dsn-less connection there seems to be some kind of issue with pyodbc, but I can't figure out what the problem is.

    >>> import pyodbc
    >>> pyodbc.connect('DRIVER={TDS};Server=servername;Database=dbname;UID=user;PWD=pass;port=123;')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    pyodbc.Error: ('08S01', '[08S01] [unixODBC][FreeTDS][SQL Server]Unable to connect: Adaptive Server is unavailable or does not exist (20009) (SQLDriverConnectW)')

Does anybody know how to solve this?

Thanks, Lucas

What is the syntax for the driver, and how can I pass TDS_Version=8.0 at the end of the connection string?

Thanks, Lucas

    File "/home/lucas/tmp/saENV/lib/python2.5/site-packages/SQLAlchemy-0.5.3-py2.5.egg/sqlalchemy/pool.py", line 198, in __init__
      self.connection = self.__connect()
    File "/home/lucas/tmp/saENV/lib/python2.5/site-packages/SQLAlchemy-0.5.3-py2.5.egg/sqlalchemy/pool.py", line 261, in __connect
      connection = self.__pool._creator()
    File "/home/lucas/tmp/saENV/lib/python2.5/site-packages/SQLAlchemy-0.5.3-py2.5.egg/sqlalchemy/engine/strategies.py", line 80, in connect
      raise exc.DBAPIError.instance(None, None, e)
    sqlalchemy.exc.DBAPIError: (Error) ('08001', '[08001] [unixODBC][FreeTDS][SQL Server]Unable to connect to data source (0) (SQLDriverConnectW)') None None

-- How to create python package? http://lucasmanual.com/mywiki/PythonPaste DataHub - create a package that gets, parses, loads, visualizes data http://lucasmanual.com/mywiki/DataHub
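[Editor's sketch, not part of the thread.] The quoting recipe from point 2 can be checked with nothing but the stdlib. This uses the Python 3 spelling (urllib.parse.quote_plus) with a fallback to the Python 2 one used in the thread; the DSN values are placeholders, and create_engine is left commented out since it needs a live server:

```python
try:
    from urllib.parse import quote_plus   # Python 3
except ImportError:
    from urllib import quote_plus         # Python 2, as in the thread

# Placeholder DSN values -- substitute a real server and credentials.
raw = "DRIVER={TDS};Server=servername;Database=dbname;UID=user;PWD=pass;port=123;TDS_Version=8.0;"

# Percent-encode the whole ODBC string so it survives as one URL parameter.
url = "mssql:///?odbc_connect=%s" % quote_plus(raw)
print(url)

# e = sqlalchemy.create_engine(url)   # needs a live MSSQL server to test
```

The encoding matters because the raw ODBC string is full of characters (; = {) that would otherwise be parsed as part of the URL itself.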
[sqlalchemy] Relative speed of ORM vs. direct insertions (on simple tables)
Good morning! I hope I'm not tipping any sacred cows here, but for simple SqlA objects, it seems to be a heckuva lot faster to just insert into the table directly (using SqlA expression language) than to insert the objects via session.flush(). In the attached tests, I'm observing a 10x-20x speedup. I'm still new to SqlA, so I was hoping the list here would be kind enough to verify my observation. My questions:

1. If so, why? I assume it's because session.flush() does separate insert statements (as verified when echo=True is on).

2. In test 3, is this a reasonable way to convert from session to direct table insert? Is there a simpler way than the Thing.to_dict method I hacked together?

3. Are these valid tests? I don't want to have all the embarrassment of some others who have 'slammed' SqlA without a proper grounding. I'm no expert, and I want to make sure what I have is something approximating idiomatic SqlA. I tried to be generous about what to include in the timed section of each test. I do have autoflush off, and I'm using Sqlite (in memory), which might affect things.

4. If there is a faster way to flush out a session, I'm all ears! I understand the Big Win (tm) of the ORM is programmer simplicity and power, but if I can get that without major hits to performance, I'd like to be able to Be Greedy (tm) and have it all.
-----

    #!/usr/bin/env python2
    import sys
    import itertools
    import time

    from sqlalchemy.ext.declarative import declarative_base
    from sqlalchemy import Table, Column, Integer, String, MetaData, ForeignKey, Boolean
    from sqlalchemy import create_engine
    from sqlalchemy.orm import sessionmaker

    Base = declarative_base()

    class Thing(Base):
        __tablename__ = 'asn_change'
        id = Column(Integer, primary_key=True, nullable=False)
        datum = Column(String, primary_key=True, nullable=False)
        start = Column(Integer, primary_key=True, nullable=False)
        stop = Column(Integer, primary_key=True, nullable=False)

        def to_dict(self):
            # note: originally this returned stop=self.start, which looks like a typo
            return dict(id=self.id, datum=self.datum, start=self.start, stop=self.stop)

    def dummy_setup(connstring='sqlite:///:memory:'):
        engine = create_engine(connstring, echo=False)
        Session = sessionmaker(bind=engine, autoflush=False, autocommit=False)
        session = Session()
        Base.metadata.bind = engine
        Base.metadata.create_all()
        return session, engine

    def clean_up(session):
        session.expunge_all()
        session.query(Thing).delete()
        session.commit()

    ## Tests

    class Tests(object):
        def setUp(self):
            self.session, self.engine = dummy_setup()
            self.R = 1

        def test_01_orm(self):
            session = self.session
            clean_up(session)
            for ii in xrange(self.R):
                session.add(Thing(id=ii, datum="some data %i" % ii, start=0, stop=9))
            now = time.time()
            session.flush()
            session.commit()
            t = time.time() - now
            print "timing: %f2.2" % t
            assert True

        def test_02_direct_insert(self):
            session = self.session
            clean_up(session)
            stm = Thing.__table__.insert().compile()
            ipfx = [dict(id=ii, datum="some data %i" % ii, start=0, stop=9)
                    for ii in xrange(self.R)]
            now = time.time()
            stm.execute(ipfx)
            t = time.time() - now
            print "timing: %f2.2" % t
            assert True

        def test_03_convert_from_session(self):
            session = self.session
            clean_up(session)
            stm = Thing.__table__.insert().compile()
            for ii in xrange(self.R):
                session.add(Thing(id=ii, datum="some data %i" % ii, start=0, stop=9))
            # count the conversion time as part of the test
            now = time.time()
            ipfx = [x.to_dict() for x in session.new]
            stm.execute(ipfx)
            session.new.clear()
            t = time.time() - now
            print "timing: %f2.2" % t
            assert True

-----

    $ nosetests -v -s insertion_test.py
    insertion_test.Tests.test_01_orm ... timing: 8.2513552.2
    ok
    insertion_test.Tests.test_02_direct_insert ... timing: 0.5210562.2
    ok
    insertion_test.Tests.test_03_convert_from_session ... timing: 0.8730292.2
    ok
    ----------------------------------------------------------------------
    Ran 3 tests in 14.247s

    OK
[sqlalchemy] Another query on many to many relation
Hello,

P.S. Definitions:

    class Reservation(object):
        def __init__(self, startdate, enddate, status, bneed, nhwrep):
            self.start_date = startdate
            self.end_date = enddate
            self.status = status
            self.businessneed = bneed
            self.notetohwrep = nhwrep

    class Host(object):
        def __init__(self, IP, HostName, Location, Additional_info, Column_11,
                     Column_12, Username, Password, Alias, Shareable,
                     Shareable_between_projects, Notes, CPU, RAM, Column_23,
                     Batch, ASSET, Owner, SSH_KEY_PRESENT, Machine_Type_Model,
                     MAC_ADDRESS_ETH_0, Physical_Box, Up_n_running, Available):
            self.ip = IP
            self.hostname = HostName
            self.location = Location
            self.additional_info = Additional_info
            #self.End_Date = End_Date
            self.column_11 = Column_11
            self.column_12 = Column_12
            self.username = Username
            self.password = Password
            self.alias = Alias
            self.shareable = Shareable
            self.shareable_between_projects = Shareable_between_projects
            self.notes = Notes
            self.cpu = CPU
            self.ram = RAM
            self.batch = Batch
            self.asset = ASSET
            self.owner = Owner
            self.ssh_key_present = SSH_KEY_PRESENT
            self.machine_type_model = Machine_Type_Model
            self.mac_address_eth_0 = MAC_ADDRESS_ETH_0
            self.physical_box = Physical_Box
            self.up_n_running = Up_n_running
            self.available = Available

    hosts_table = Table('hosts', md,
        Column('id', Integer, primary_key=True),
        Column('ip', String),
        Column('hostname', String),
        Column('location', String),
        Column('architecture_id', Integer, ForeignKey('architecture.id')),
        Column('os_kind_id', Integer, ForeignKey('os_kind.id')),
        Column('os_version_id', Integer, ForeignKey('os_version.id')),
        Column('additional_info', String),
        #Column('End_Date', SLDate),
        Column('column_12', String),
        Column('column_13', String),
        #Column('email_id', Integer, ForeignKey('email.id')),
        Column('username', String),
        Column('password', String),
        Column('alias', String),
        Column('virtualization_id', Integer, ForeignKey('virtualization.id')),
        Column('shareable', SLBoolean),
        Column('shareable_between_projects', SLBoolean),
        Column('notes', String),
        Column('cpu', String),
        Column('ram', String),
        Column('column_24', String),
        Column('batch', String),
        Column('asset', String),
        Column('owner', String),
        Column('ssh_key_present', String),
        Column('machine_type_model', String),
        Column('mac_address_eth_0', String),
        Column('physical_box', SLBoolean),
        Column('up_n_running', SLBoolean),
        Column('available', SLBoolean),
        Column('project_id', Integer, ForeignKey('project.id')))

    reservation_table = Table('reservation', md,
        Column('id', Integer, primary_key=True),
        Column('start_date', SLDate),
        Column('end_date', SLDate),
        Column('status', String),
        Column('businessneed', String),
        Column('notetohwrep', String),
        Column('email_id', Integer, ForeignKey('email.id')),
        Column('project_id', Integer, ForeignKey('project.id')))

    reservation_hosts_assoc_table = Table('reservation_hosts', md,
        Column('reservation_id', Integer, ForeignKey('reservation.id')),
        Column('host_id', Integer, ForeignKey('hosts.id')))

    mapper(Reservation, reservation_table, properties={
        'email': relation(Email, order_by=Email.id),
        'project': relation(Project, order_by=Project.id),
        'hosts': relation(Host, secondary=reservation_hosts_assoc_table,
                          backref='reservation')})

    mapper(Host, hosts_table, properties={
        'architecture': relation(Architecture, order_by=Architecture.id, backref='hosts'),
        'os_kind': relation(OS_Kind, order_by=OS_Kind.id, backref='hosts'),
        'os_version': relation(OS_version, order_by=OS_version.id, backref='hosts'),
        'virtualization': relation(Virtualization, order_by=Virtualization.id, backref='hosts'),
        'project': relation(Project, order_by=Project.id, backref='hosts'),
        'reservations': relation(Reservation, secondary=reservation_hosts_assoc_table,
                                 backref='host_reservations')})

Regards, mk
[sqlalchemy] Re: Calling Oracle Stored Procedures Having Out Cursor Parameter
On Apr 7, 10:49 pm, Michael Bayer mike...@zzzcomputing.com wrote:

we have an outparam construct which makes use of cx_oracle's built-in API for this:

    result = testing.db.execute(
        text("begin foo(:x_in, :x_out, :y_out, :z_out); end;",
             bindparams=[bindparam('x_in', Numeric),
                         outparam('x_out', Numeric),
                         outparam('y_out', Numeric),
                         outparam('z_out', String)]),
        x_in=5)
    assert result.out_parameters == {'x_out': 10, 'y_out': 75, 'z_out': None}

Thanks Michael. I tried something like this, but I could not find any generic or vendor-specific type to specify for the cursor parameter (e.g. something like outparam(':p_out', OracleCursor)). Does this make sense? It's not likely I'll be able to change these stored procedures, unfortunately, so I may be stuck with a cursor out parameter... :(
[sqlalchemy] Re: Relative speed of ORM vs. direct insertions (on simple tables)
Some followups: Python 2.4.3 on 64-bit Linux. Timings are near identical in SA 0.5.2 and 0.5.3.

On Apr 8, 9:57 am, Gregg Lind gregg.l...@gmail.com wrote:

Good morning! I hope I'm not tipping any sacred cows here, but for simple SqlA objects, it seems to be a heckuva lot faster to just insert into the table directly (using SqlA expression language) than to insert the objects via session.flush(). In the attached tests, I'm observing a 10x-20x speedup. I'm still new to SqlA, so I was hoping the list here would be kind enough to verify my observation.
[sqlalchemy] Association Object Pattern
Hi guys, I've got a question regarding the association object pattern. Though perhaps it's not what I should be using... Let's say I've got a Parent class that I'd like to add tags with data to:

    class Parent(Base):
        id = Str

    # The association object
    class Tag(Base):
        parent_id = Str
        child_id = Int
        value = Float

    # The child object, comes with some data too
    class Child(Base):
        id = Int
        name = Str
        date = Date

The relation setup is just like in the example in the docs. So now I'd like to create a set of tags and pass the info the tag needs to create its child to its constructor:

    tags = [Tag(name='name1', date=dt.date.today()), ...]

Then do something like:

    parent = Parent()
    for tag in tags:
        parent.tags.append(tag)  # can I use a set here instead of a list?

The bit that I'm struggling with is duplicate tag objects... In the tag's constructor I have:

    child = sess.query(Child).filter(Child.name == name).filter(Child.date == date).first()
    if child is None:
        child = Child(name=name, date=date)
    self.child = child

So if there's a child object in the db with the same name and date, it uses that as its child field. Though I'm getting:

    raise exc.FlushError("New instance %s with identity key %s conflicts with persistent instance %s"
                         % (state_str(state), str(instance_key), state_str(existing)))

which means the tag I'm adding is already in the db. Which is true. Normally I'd use merge() to get around that, but I don't know how to do that since I'm 'adding' the new instance through parent.tags.append(). Any ideas how to solve this? Am I simply using the wrong approach?

Regards, Christian
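[Editor's sketch, not part of the thread.] The usual way out is a "unique object" (get-or-create) lookup against a single canonical source. Here the idea is reduced to plain sqlite3 so it is runnable anywhere; in the ORM version the same lookup would have to go through the one Session that later flushes the Tag, so the Child it returns is the already-persistent instance rather than a fresh, conflicting one. Table and values are invented for the demo:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE child (
    id INTEGER PRIMARY KEY,
    name TEXT,
    date TEXT,
    UNIQUE (name, date))""")

def get_or_create_child(conn, name, date):
    """Return the id of the (name, date) child, inserting it only if absent."""
    row = conn.execute(
        "SELECT id FROM child WHERE name = ? AND date = ?", (name, date)
    ).fetchone()
    if row is not None:
        return row[0]          # reuse the existing row -- no duplicate
    cur = conn.execute(
        "INSERT INTO child (name, date) VALUES (?, ?)", (name, date))
    return cur.lastrowid

first = get_or_create_child(conn, "name1", "2009-04-08")
second = get_or_create_child(conn, "name1", "2009-04-08")
print(first == second)  # True -- the duplicate resolves to the same row
```

The UNIQUE constraint is the safety net: even if the lookup races, the database refuses a second (name, date) row instead of silently storing a duplicate.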
[sqlalchemy] Re: Relative speed of ORM vs. direct insertions (on simple tables)
Gregg Lind wrote:

Good morning! I hope I'm not tipping any sacred cows here, but for simple SqlA objects, it seems to be a heckuva lot faster to just insert into the table directly (using SqlA expression language) than to insert the objects via session.flush(). In the attached tests, I'm observing a 10x-20x speedup. I'm still new to SqlA, so I was hoping the list here would be kind enough to verify my observation.

verified

1. If so, why? I assume it's because session.flush() does separate insert statements (as verified when echo=True is on).

session.flush() performs a topological sort of all dirty/pending/deleted objects based on foreign key dependencies between tables as well as between rows, checks all modified attributes and collections (the collections part sometimes requires a load of the collection, unless certain options are set) for a net change in value, issues INSERT/UPDATE/DELETE statements in an exact order based on dependencies, and tailors individual INSERT and UPDATE statements based on the values which are present in memory vs. server-side defaults (for inserts) or the values which have a net change (for updates). It then issues all of these statements individually (by necessity, since they all have different argument lists and sometimes inter-row dependencies; you also cannot fetch the last inserted id from an executemany()), which from a DBAPI point of view is slower in any case, since you are calling execute() many times. Newly inserted rows often require extra statements to fetch newly generated primary keys, which are then distributed to all the foreign-key-holding attributes which require them (which are then potentially inserted or updated in subsequent statements). After all SQL is emitted, it then refreshes the bookkeeping status on all entities which were changed, and expires attributes whose values were generated within the DB but don't need to be fetched until needed.

OTOH, an executemany() call receives a pre-made list of parameters for any number of bind parameter sets; the DBAPI then prepares a single statement and runs it N times, usually within C code, and you can modify or insert tens of thousands of rows in a few seconds (the trick is that you've generated this huge list of dicts beforehand, and that your parameters are all of identical structure).

2. In test 3, is this a reasonable way to convert from session to direct table insert? Is there a simpler way than the Thing.to_dict method I hacked together?

for simple table mappings, it's easy enough to deal with your rows as dicts and use execute() to change things. For more complexity, with relations to other tables in various ways, it becomes less trivial. There are always tradeoffs to be navigated according to your specific needs.

3. Are these valid tests? I don't want to have all the embarrassment of some others who have 'slammed' SqlA without a proper grounding. I'm no expert, and I want to make sure what I have is something approximating idiomatic SqlA. I tried to be generous about what to include in the timed section of each test. I do have autoflush off, and I'm using Sqlite (in memory), which might affect things.

I didn't look closely, but the general observation that expressions are faster than the ORM is valid. Your orders of magnitude might be off.

4. If there is a faster way to flush out a session, I'm all ears! I understand the Big Win (tm) of the ORM is programmer simplicity and power, but if I can get that without major hits to performance, I'd like to be able to Be Greedy (tm) and have it all.

try keeping the size of the session small, and look into options like passive_deletes and passive_updates, which prevent rows from being loaded in order to accommodate cascades that can be established in the database directly. In any case, flushing tens of thousands of objects is unlikely to be performant.
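[Editor's sketch, not part of the thread.] The executemany() path Michael describes can be seen with the stdlib sqlite3 module alone: one statement prepared once, one Python call, all parameter sets handled in the driver. Row count and schema are invented for the demo:

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE asn_change (id INTEGER, datum TEXT, start INTEGER, stop INTEGER)")

# Pre-built list of identically structured parameter sets, as in the thread.
rows = [(i, "some data %d" % i, 0, 9) for i in range(10000)]

t0 = time.time()
# One prepared statement executed for every parameter set -- a single
# Python-level call instead of 10000 individual execute() calls.
conn.executemany("INSERT INTO asn_change VALUES (?, ?, ?, ?)", rows)
conn.commit()
print("executemany of %d rows: %.3fs" % (len(rows), time.time() - t0))
```

This is the fast path the expression-language tests above end up on, and why they beat a flush that must emit each INSERT separately.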
[sqlalchemy] Re: Calling Oracle Stored Procedures Having Out Cursor Parameter
cbaron wrote:

On Apr 7, 10:49 pm, Michael Bayer mike...@zzzcomputing.com wrote:

we have an outparam construct which makes use of cx_oracle's built-in API for this:

    result = testing.db.execute(
        text("begin foo(:x_in, :x_out, :y_out, :z_out); end;",
             bindparams=[bindparam('x_in', Numeric),
                         outparam('x_out', Numeric),
                         outparam('y_out', Numeric),
                         outparam('z_out', String)]),
        x_in=5)
    assert result.out_parameters == {'x_out': 10, 'y_out': 75, 'z_out': None}

Thanks Michael. I tried something like this, but I could not find any generic or vendor-specific type to specify for the cursor parameter (e.g. something like outparam(':p_out', OracleCursor)). Does this make sense?

I'm missing that part. The out parameter receives a cursor? strange. You should at least modify your code to use cx_oracle's API directly; check out their site for information on that. I didn't realize a cursor was a datatype (seems very strange).
[sqlalchemy] availability of related obj
Hi, I have a definition similar to this:

    class Ticket(Base):
        __tablename__ = 'ticket'
        id = Column(Integer, primary_key=True)
        assigned_to_id = Column(ForeignKey(User.id))
        assigned_to = relation(User, primaryjoin=assigned_to_id == User.id, lazy=True)

If I instantiate:

    ticket = Ticket()
    ticket.assigned_to_id = user.id

I can commit, and after that I can 'print ticket.assigned_to'. Is there a way to have ticket.assigned_to available *before* committing? I'd like to have it available in the after-flush phase of a SessionExtension. SQLA knows how to retrieve it, so I wondered if it can be instructed to make it available on demand.

thanks
sandro

--
Sandro Dentella *:-)
http://sqlkit.argolinux.org - SQLkit home page - PyGTK/python/sqlalchemy
[sqlalchemy] Re: availability of related obj
Alessandro Dentella wrote: Hi, I have a definition similar to this:: class Ticket(Base): __tablename__ = 'ticket' id = Column(Integer, primary_key=True) assigned_to_id = Column(ForeignKey(User.id)) assigned_to = relation(User, primaryjoin = assigned_to_id == User.id, lazy=True) If I instantiate ticket = Ticket() ticket.assigned_to_id = user.id I can commit and after that I can 'print ticket.assigned_to' Is there a way to have ticket.assigned_to available *before* committing? I'd like to have it available in after-flush phase of sessionExtension. as soon as Ticket is persistent within the flush, the ticket.assigned_to relation will be live and will lazy load when accessed. no commit is needed.
[sqlalchemy] Re: availability of related obj
Alessandro Dentella wrote: Hi, I have a definition similar to this:: class Ticket(Base): __tablename__ = 'ticket' id = Column(Integer, primary_key=True) assigned_to_id = Column(ForeignKey(User.id)) assigned_to = relation(User, primaryjoin = assigned_to_id == User.id, lazy=True) If I instantiate ticket = Ticket() ticket.assigned_to_id = user.id I can commit and after that I can 'print ticket.assigned_to' specifically the lazy loader will work in the after_flush_postexec() phase of the sessionextension. during after_flush(), the post-flush bookkeeping has not been established yet on assigned_to_id, and the lazy loader always looks for the persisted version of the attribute.
[sqlalchemy] Re: availability of related obj
as soon as Ticket is persistent within the flush, the ticket.assigned_to relation will be live and will lazy load when accessed. no commit is needed. mmh, in the following example, I can't use assigned_to within after_flush. Am I doing something wrong?

from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Table, Column, ForeignKey, text, func
from sqlalchemy.types import *
from sqlalchemy.orm import relation, sessionmaker
from sqlalchemy.orm.interfaces import SessionExtension

Base = declarative_base()
Base.metadata.bind = 'sqlite://'

class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    username = Column(String(20))

class Ticket(Base):
    __tablename__ = 'ticket'
    id = Column(Integer, primary_key=True)
    assigned_to_id = Column(ForeignKey(User.id))
    assigned_to = relation(User, primaryjoin = assigned_to_id == User.id, lazy=True)

class SKSessionExtension(SessionExtension):
    def after_flush(self, session, flush_context):
        """implement the after-flush signal"""
        for new in session.new:
            if isinstance(new, Ticket):
                print "NEW: %s - assigned_to: %s" % (new, new.assigned_to)

Session = sessionmaker(bind=Base.metadata.bind)
session = Session(extension=SKSessionExtension())
Base.metadata.create_all()
#Base.metadata.bind.echo = True

user = User()
user.username = 'aaa'
session.add(user)
session.commit()

ticket = Ticket()
ticket.assigned_to_id = user.id
session.add(ticket)
session.flush()
print "AFTER FLUSH", ticket.assigned_to
session.commit()
print "AFTER COMMIT", ticket.assigned_to

-- that leads to this output:

NEW: <__main__.Ticket object at 0x84f51cc> - assigned_to: None
AFTER FLUSH None
AFTER COMMIT <__main__.User object at 0x84f060c>

sandro
[sqlalchemy] Re: availability of related obj
specifically the lazy loader will work in the after_flush_postexec() phase of the sessionextension. during after_flush(), the post-flush bookkeeping has not been established yet on assigned_to_id, and the lazy loader always looks for the persisted version of the attribute. ok, this explains my example. In fact, that really means I cannot use that relation in after_flush. after_flush_postexec is too late, as I need session.dirty/session.new to be able to understand what happened and take action accordingly. thanks sandro
[sqlalchemy] Re: Relative speed of ORM vs. direct insertions (on simple tables)
Thank you for the excellent and comprehensive answer! I didn't realize exactly how much work the session object does. As per your advice, I have taken steps to reduce the size of my session object, and things run much more quickly now. On Wed, Apr 8, 2009 at 12:50 PM, Michael Bayer mike...@zzzcomputing.com wrote: Gregg Lind wrote: Good morning! I hope I'm not tipping any sacred cows here, but for simple SqlA objects, it seems to be a heckuva lot faster to just insert into the table directly (using SqlA expression language) than to insert the objects via session.flush(). In the attached tests, I'm observing a 10x-20x speedup. I'm still new to SqlA, so I was hoping the list here would be kind enough to verify my observation. verified. My questions: 1. If so, why? I assume it's because session.flush() does separate insert statements (as verified when echo=True is on). session.flush() performs a topological sort of all dirty/pending/deleted objects based on foreign key dependencies between tables as well as between rows, checks all modified attributes and collections (the collections part sometimes requires a load of the collection, unless certain options are set) for a net change in value, issues INSERT/UPDATE/DELETE statements in an exact order based on dependencies, tailors individual INSERT and UPDATE statements based on the values which are present in memory vs. server side defaults (for inserts) or the values which have a net change (for updates). It then issues all of these statements individually (by necessity, since they all have different argument lists and sometimes inter-row dependencies; you also cannot fetch the last inserted id from an executemany()), which from a DBAPI point of view is slower in any case, since you are calling execute() many times.
Newly inserted rows often require extra statements to fetch newly generated primary keys, which are then distributed to all the foreign-key-holding attributes which require it (which are then potentially inserted or updated in subsequent statements). After all SQL is emitted, it then refreshes the bookkeeping status on all entities which were changed, and expires attributes whose values were generated within the DB but don't need to be fetched until needed. OTOH an executemany() call receives a pre-made list of parameters for any number of bind parameter sets; the DBAPI then prepares a single statement and runs it N times, usually within C code, and you can modify or insert tens of thousands of rows in a few seconds (the trick is that you've generated this huge dict of data beforehand, and that your parameters are all of identical structure). 2. In test 3, is this a reasonable way to convert from session to direct table insert? Is there a simpler way than the Thing.to_dict method I hacked together? for simple table mappings, it's easy enough to deal with your rows as dicts and use execute() to change things. For more complexity with relations to other tables in various ways, it becomes less trivial. There are always tradeoffs to be navigated according to your specific needs. 3. Are these valid tests? I don't want to have all the embarrassment of some others who have 'slammed' SqlA without a proper grounding. I'm no expert, and I want to make sure what I have is something approximating idiomatic SqlA. I tried to be generous about what to include in the timed section of each test. I do have autoflush off, and I'm using Sqlite (in memory), which might affect things. I didn't look closely, but the general observation that expressions are faster than the ORM is valid. Your orders of magnitude might be off. 4. If there is a faster way to flush out a session, I'm all ears!
I understand the Big Win (tm) of the ORM is programmer simplicity and power, but if I can get that without major hits to performance, I'd like to be able to Be Greedy (tm) and have it all. try keeping the size of the session small, and look into options like passive_deletes and passive_updates, which prevent rows from being loaded in order to accommodate cascades that can be established in the database directly. In any case, flushing tens of thousands of objects is unlikely to be performant.
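The execute()-per-row vs. executemany() distinction above can be seen with nothing but the stdlib DBAPI driver. This is an illustrative sketch, not a benchmark; the table, column names, and row counts are invented:

import sqlite3

# Stdlib-only sketch of the difference discussed above: row-at-a-time
# execute() calls versus a single executemany(), which lets the driver
# prepare one statement and bind each parameter set internally.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE thing (id INTEGER PRIMARY KEY, name TEXT)")

rows = [(i, "thing-%d" % i) for i in range(1000)]

# ORM-flush style: one statement per row (the real ORM also layers
# dependency sorting and bookkeeping on top of this).
for row in rows[:500]:
    conn.execute("INSERT INTO thing (id, name) VALUES (?, ?)", row)

# Expression-language style: one prepared statement, many parameter sets.
conn.executemany("INSERT INTO thing (id, name) VALUES (?, ?)", rows[500:])
conn.commit()

count = conn.execute("SELECT count(*) FROM thing").fetchone()[0]

Either path inserts the same 1000 rows; the executemany() path simply makes far fewer round trips through Python-level execute() calls.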
[sqlalchemy] declarative and __table_args__ I must be missing something simple.
I assume I am overlooking some simple thing, but I just can't seem to find it. Thanks for the assist, I have palms open ready for face planting. Using a class and table with orm.mapper() class Child(object): pass child_table = Table('child', meta.metadata, Column('parent_id', Integer, primary_key=True), Column('parent_ref', Integer, nullable=False), Column('content', String(10)), ForeignKeyConstraint(['parent_id', 'parent_ref'], ['parent.id', 'parent.ref']) ) orm.mapper(Child, child_table) class Parent(object): pass parent_table = Table('parent', meta.metadata, Column('id', Integer, primary_key=True), Column('ref', Integer, primary_key=True) ) orm.mapper(Parent, parent_table, properties={ 'children':relation(Child, lazy=False) }) Produces the following create 2009-04-08 16:36:54,319 INFO sqlalchemy.engine.base.Engine.0x...a6b0 CREATE TABLE parent ( id INTEGER NOT NULL, ref INTEGER NOT NULL, PRIMARY KEY (id, ref) ) 2009-04-08 16:36:54,319 INFO sqlalchemy.engine.base.Engine.0x...a6b0 {} 2009-04-08 16:36:54,454 INFO sqlalchemy.engine.base.Engine.0x...a6b0 COMMIT 2009-04-08 16:36:54,456 INFO sqlalchemy.engine.base.Engine.0x...a6b0 CREATE TABLE child ( parent_id INTEGER NOT NULL, parent_ref INTEGER NOT NULL, content VARCHAR(10), PRIMARY KEY (parent_id), FOREIGN KEY(parent_id, parent_ref) REFERENCES parent (id, ref) ) Using what I believe is the exact same thing with declarative produces the creates minus the composite foreign key and then of course is unable to establish the relation.
class Parent(Base): __tablename__ = 'parent' id = Column(Integer, primary_key=True) ref = Column(Integer, primary_key=True) children = relation(Child, lazy=False) class Child(Base): __tablename__ = 'child' __table_args__ = ForeignKeyConstraint(['parent_id','parent_ref'], ['parent.id', 'parent.ref']) parent_id = Column(Integer, primary_key=True) parent_ref = Column(Integer, nullable=False) content = Column(String(10)) The create output is 2009-04-08 16:47:08,331 INFO sqlalchemy.engine.base.Engine.0x...a710 CREATE TABLE child ( parent_id INTEGER NOT NULL, parent_ref INTEGER NOT NULL, content VARCHAR(10), PRIMARY KEY (parent_id) ) 2009-04-08 16:47:08,331 INFO sqlalchemy.engine.base.Engine.0x...a710 {} 2009-04-08 16:47:08,464 INFO sqlalchemy.engine.base.Engine.0x...a710 COMMIT 2009-04-08 16:47:08,466 INFO sqlalchemy.engine.base.Engine.0x...a710 CREATE TABLE parent ( id INTEGER NOT NULL, ref INTEGER NOT NULL, PRIMARY KEY (id, ref) )
[sqlalchemy] table reflection: include_columns does not include a component of the primary key?
I am trudging through the unit tests for the Sybase backend and found an interesting one in test_nonreflected_fk_raises in engine/reflection.py. A couple of the drivers look like they skip over the column if it is not contained within table.c but go ahead and create the primary_key in any case! This seems a bit odd to me. Is this the intended behavior? pjjH

# I think we have to raise some kind of exception here if
# we try and reflect on an index when the column is
# omitted from include_columns?
if include_columns and column_name not in include_columns:
    raise exc.NoReferencedColumnError(
        "Could not create PrimaryKey/Index '%s' on table '%s': "
        "table '%s' has column named '%s' but it is not present "
        "in include_columns: %s" % (
            index_name, table.name, table.name, column_name,
            ','.join(include_columns)))
if r.status & 0x800 == 0x800:
    table.primary_key.add(table.c[row[0]])
    if not index_name in PK.keys():
        PK[index_name] = PrimaryKeyConstraint(name=index_name)
    PK[index_name].add(table.c[column_name])
else:
    if not index_name in INDEXES.keys():
        INDEXES[index_name] = Index(index_name, unique=(r.status & 0x2 == 0x2))
    INDEXES[index_name].append_column(table.c[column_name])
[sqlalchemy] Re: declarative and __table_args__ I must be missing something simple.
Yep, there it is. Stupidly simple. Dug up some old known-working source; a diff and grep later I found the cause. __table_args__ needs to be given a tuple with an empty dictionary, like so. __table_args__ = (ForeignKeyConstraint(['parent_id', 'parent_ref'], ['parent.id', 'parent.ref']), {}) Now all is well, sorry for the ML clutter. I am face palming in 3, 2, 1 On Apr 8, 4:49 pm, Wayne Witzel wwitz...@gmail.com wrote: I assume I am overlooking some simple thing, but I just can't seem to find it. Thanks for the assist, I have palms open ready for face planting. Using a class and table with orm.mapper() class Child(object): pass child_table = Table('child', meta.metadata, Column('parent_id', Integer, primary_key=True), Column('parent_ref', Integer, nullable=False), Column('content', String(10)), ForeignKeyConstraint(['parent_id', 'parent_ref'], ['parent.id', 'parent.ref']) ) orm.mapper(Child, child_table) class Parent(object): pass parent_table = Table('parent', meta.metadata, Column('id', Integer, primary_key=True), Column('ref', Integer, primary_key=True) ) orm.mapper(Parent, parent_table, properties={ 'children':relation(Child, lazy=False) }) Produces the following create 2009-04-08 16:36:54,319 INFO sqlalchemy.engine.base.Engine.0x...a6b0 CREATE TABLE parent ( id INTEGER NOT NULL, ref INTEGER NOT NULL, PRIMARY KEY (id, ref) ) 2009-04-08 16:36:54,319 INFO sqlalchemy.engine.base.Engine.0x...a6b0 {} 2009-04-08 16:36:54,454 INFO sqlalchemy.engine.base.Engine.0x...a6b0 COMMIT 2009-04-08 16:36:54,456 INFO sqlalchemy.engine.base.Engine.0x...a6b0 CREATE TABLE child ( parent_id INTEGER NOT NULL, parent_ref INTEGER NOT NULL, content VARCHAR(10), PRIMARY KEY (parent_id), FOREIGN KEY(parent_id, parent_ref) REFERENCES parent (id, ref) ) Using what I believe is the exact same thing with declarative produces the creates minus the composite foreign key and then of course is unable to establish the relation.
class Parent(Base): __tablename__ = 'parent' id = Column(Integer, primary_key=True) ref = Column(Integer, primary_key=True) children = relation(Child, lazy=False) class Child(Base): __tablename__ = 'child' __table_args__ = ForeignKeyConstraint(['parent_id','parent_ref'], ['parent.id', 'parent.ref']) parent_id = Column(Integer, primary_key=True) parent_ref = Column(Integer, nullable=False) content = Column(String(10)) The create output is 2009-04-08 16:47:08,331 INFO sqlalchemy.engine.base.Engine.0x...a710 CREATE TABLE child ( parent_id INTEGER NOT NULL, parent_ref INTEGER NOT NULL, content VARCHAR(10), PRIMARY KEY (parent_id) ) 2009-04-08 16:47:08,331 INFO sqlalchemy.engine.base.Engine.0x...a710 {} 2009-04-08 16:47:08,464 INFO sqlalchemy.engine.base.Engine.0x...a710 COMMIT 2009-04-08 16:47:08,466 INFO sqlalchemy.engine.base.Engine.0x...a710 CREATE TABLE parent ( id INTEGER NOT NULL, ref INTEGER NOT NULL, PRIMARY KEY (id, ref) )
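The trailing {} matters because declarative treats __table_args__ as a tuple of positional Table() arguments whose optional last element is a dict of keyword arguments. The helper below is a hypothetical plain-Python sketch of that convention; split_table_args is my own illustration, not SQLAlchemy's actual code:

# Hypothetical sketch of the __table_args__ convention: a tuple of
# positional Table() arguments, optionally ending in a dict of keyword
# arguments. A bare (un-tupled) ForeignKeyConstraint does not fit this
# shape, which is why the original class silently lost its constraint.
def split_table_args(table_args):
    if isinstance(table_args, dict):
        # dict-only form: keyword arguments alone
        return (), table_args
    args = tuple(table_args)
    if args and isinstance(args[-1], dict):
        return args[:-1], args[-1]
    return args, {}

fkc = object()  # stands in for a ForeignKeyConstraint instance
args, kwargs = split_table_args((fkc, {}))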
[sqlalchemy] Re: table reflection: include_columns does not include a component of the primary key?
phrrn...@googlemail.com wrote: I am trudging through the unit tests for the Sybase backend and found an interesting one in test_nonreflected_fk_raises in engine/reflection.py. A couple of the drivers look like they skip over the column if it is not contained within table.c but go ahead and create the primary_key in any case! This seems a bit odd to me. Is this the intended behavior? if by "create" you mean "reflect", probably not.
[sqlalchemy] My sqlalchemy.orm.identity.IdentityManagedState is leaking
Hi everyone! I have a threaded application which deals with a lot of records (it generates hundreds of thousands in an hour). I have a Database module that my program imports, and use scoped_session to make the session thread-safe. When a record is to be inserted, my program passes the object to be saved to the Database insert() function. Here's the code: ==

engine = sa.create_engine(engine_url, echo=echo)
Session = sa.orm.scoped_session(sa.orm.sessionmaker(bind=engine, autocommit=False))

def insert(obj):
    try:
        session().merge(obj)
    except Exception, e:
        log.warning("Database problem: " + str(e))
        session().rollback()
        raise
    else:
        log.debug("Saved to database")
        session().commit()
        Session.remove()

def session():
    return Session()

== Even though I call Session.remove(), it seems that I can't stop sqlalchemy.orm.identity.IdentityManagedState growing. Unit testing which inserts a couple thousand records shows the growth with Heapy. The dict of sqlalchemy.orm.identity.IdentityManagedState starts at 334 objects, ending with 11210 objects. I thought Session.remove() would cause SQLAlchemy to release those resources, but this doesn't seem to be the case. As the process is going to need to be long-running (weeks hopefully), I'm far happier with performing expensive CPU operations than exhausting my memory. I am certain this is my own error, but I am not sure what it is. Any help would be appreciated! Thanks in advance, Chris Lewis
[sqlalchemy] Re: My sqlalchemy.orm.identity.IdentityManagedState is leaking
I believe you want session.expunge() or session.expunge_all(). GL On Wed, Apr 8, 2009 at 6:05 PM, Chris Lewis cfle...@gmail.com wrote: Hi everyone! I have a threaded application which deals with a lot of records (it generates hundreds of thousands in an hour). I have a Database module that my program imports, and use scoped_session to make the session thread-safe. When a record is to be inserted, my program passes the object to be saved to the Database insert() function. Here's the code: == engine = sa.create_engine(engine_url, echo=echo) Session = sa.orm.scoped_session(sa.orm.sessionmaker(bind=engine, autocommit=False)) def insert(obj): try: session().merge(obj) except Exception, e: log.warning("Database problem: " + str(e)) session().rollback() raise else: log.debug("Saved to database") session().commit() Session.remove() def session(): return Session() == Even though I call Session.remove(), it seems that I can't stop sqlalchemy.orm.identity.IdentityManagedState growing. Unit testing which inserts a couple thousand records shows the growth with Heapy. The dict of sqlalchemy.orm.identity.IdentityManagedState starts at 334 objects, ending with 11210 objects. I thought Session.remove() would cause SQLAlchemy to release those resources, but this doesn't seem to be the case. As the process is going to need to be long-running (weeks hopefully), I'm far happier with performing expensive CPU operations than exhausting my memory. I am certain this is my own error, but I am not sure what it is. Any help would be appreciated! Thanks in advance, Chris Lewis
[sqlalchemy] Re: My sqlalchemy.orm.identity.IdentityManagedState is leaking
Something I've realized is that all objects are cascading from a large one. There's a large staff object, which then has a collection of people objects, which then have a collection of people-statistics objects. This might well be the problem. If so, how can I tell SQLA to commit to the database and wipe its memory clean? Is Session.remove() supposed to do this? Chris
[sqlalchemy] Re: My sqlalchemy.orm.identity.IdentityManagedState is leaking
Hi Gregg, Changing the Session.remove() to an expunge_all doesn't have any effect; the growth continues. Chris On Apr 8, 4:10 pm, Gregg Lind gregg.l...@gmail.com wrote: I believe you want session.expunge() or session.expunge_all(). GL On Wed, Apr 8, 2009 at 6:05 PM, Chris Lewis cfle...@gmail.com wrote: Hi everyone! I have a threaded application which deals with a lot of records (it generates hundreds of thousands in an hour). I have a Database module that my program imports, and use scoped_session to make the session thread-safe. When a record is to be inserted, my program passes the object to be saved to the Database insert() function. Here's the code: == engine = sa.create_engine(engine_url, echo=echo) Session = sa.orm.scoped_session(sa.orm.sessionmaker(bind=engine, autocommit=False)) def insert(obj): try: session().merge(obj) except Exception, e: log.warning("Database problem: " + str(e)) session().rollback() raise else: log.debug("Saved to database") session().commit() Session.remove() def session(): return Session() == Even though I call Session.remove(), it seems that I can't stop sqlalchemy.orm.identity.IdentityManagedState growing. Unit testing which inserts a couple thousand records shows the growth with Heapy. The dict of sqlalchemy.orm.identity.IdentityManagedState starts at 334 objects, ending with 11210 objects. I thought Session.remove() would cause SQLAlchemy to release those resources, but this doesn't seem to be the case. As the process is going to need to be long-running (weeks hopefully), I'm far happier with performing expensive CPU operations than exhausting my memory. I am certain this is my own error, but I am not sure what it is. Any help would be appreciated! Thanks in advance, Chris Lewis
[sqlalchemy] Re: My sqlalchemy.orm.identity.IdentityManagedState is leaking
Breakthrough: Using Gregg's expunge_all, then commenting out the addition to collections, SQLA is not leaking anymore. Inside my team object, I go in and create the people and their statistics, but I don't add the people to the team collection anymore. The links are never used again, but it's a many-to-many relationship (a person can be on several teams), so I thought I needed to do this in order to get the many-to-many table working. Not putting these people in the team collections allows SQLA to free the resources. My question then becomes: how do I free up collections and have SQLA requery for those objects lazily? Chris On Apr 8, 4:18 pm, Chris Lewis cfle...@gmail.com wrote: Hi Gregg, Changing the Session.remove() to an expunge_all doesn't have any effect; the growth continues. Chris On Apr 8, 4:10 pm, Gregg Lind gregg.l...@gmail.com wrote: I believe you want session.expunge() or session.expunge_all(). GL On Wed, Apr 8, 2009 at 6:05 PM, Chris Lewis cfle...@gmail.com wrote: Hi everyone! I have a threaded application which deals with a lot of records (it generates hundreds of thousands in an hour). I have a Database module that my program imports, and use scoped_session to make the session thread-safe. When a record is to be inserted, my program passes the object to be saved to the Database insert() function. Here's the code: == engine = sa.create_engine(engine_url, echo=echo) Session = sa.orm.scoped_session(sa.orm.sessionmaker(bind=engine, autocommit=False)) def insert(obj): try: session().merge(obj) except Exception, e: log.warning("Database problem: " + str(e)) session().rollback() raise else: log.debug("Saved to database") session().commit() Session.remove() def session(): return Session() == Even though I call Session.remove(), it seems that I can't stop sqlalchemy.orm.identity.IdentityManagedState growing. Unit testing which inserts a couple thousand records shows the growth with Heapy.
The dict of sqlalchemy.orm.identity.IdentityManagedState starts at 334 objects, ending with 11210 objects. I thought Session.remove() would cause SQLAlchemy to release those resources, but this doesn't seem to be the case. As the process is going to need to be long-running (weeks hopefully), I'm far happier with performing expensive CPU operations than exhausting my memory. I am certain this is my own error, but I am not sure what it is. Any help would be appreciated! Thanks in advance, Chris Lewis
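The behavior described in this thread is consistent with a weak-referencing identity map being pinned by strong references held elsewhere. Here is a stdlib-only sketch of that idea; the Person class, the list standing in for the team collection, and the counts are all invented for illustration:

import gc
import weakref

# Stdlib-only illustration of the leak pattern above: a weak-valued
# identity map releases entries once nothing else references them, but
# a parent collection holding strong references (like the team's people
# collection in the thread) keeps every entry alive regardless of how
# the map or session is managed.
class Person(object):
    def __init__(self, pid):
        self.id = pid

identity_map = weakref.WeakValueDictionary()
team = []  # strong references, like an ORM relationship collection

def populate(n):
    for pid in range(n):
        p = Person(pid)
        identity_map[pid] = p
        team.append(p)

populate(100)
gc.collect()
pinned = len(identity_map)    # still 100: the list pins every Person

del team[:]                   # drop the strong references
gc.collect()
released = len(identity_map)  # the weak entries have been released

Dropping (or never populating) the collection is exactly what let the resources go in Chris's case; the identity map itself was never the owner of the memory.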
[sqlalchemy] relation that works in .48 but fails in .53
I am porting some code from SQLA .48 to .53. I have a relation that works in .48, but not in .53. At the end of this post is a test program demonstrating the problem. The program works in .48 but fails in .53 trying to understand the child1_object relation. The error message says to add foreign_keys to the relation, but that doesn't seem to actually help. It does however work if I change the relation to child1_object = relation('Child1', backref='others', primaryjoin = child1_id == Child1.__table__.c.id) Is making this change the recommended solution? Is the behavior I am seeing here expected? Is the message telling me to use foreign_keys bogus? Here is the code:

#!/usr/bin/python -u
from sqlalchemy import Column, Integer, create_engine, String, ForeignKey
from sqlalchemy.orm import sessionmaker, relation
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    _cls = Column('cls', String(50))
    __mapper_args__ = dict(polymorphic_on = _cls)

class Child1(Parent):
    __tablename__ = 'child1'
    __mapper_args__ = dict(polymorphic_identity = 'child1')
    id = Column(Integer, ForeignKey('parent.id'), primary_key=True)

class Child2(Parent):
    __tablename__ = 'child2'
    __mapper_args__ = dict(polymorphic_identity = 'child2')
    id = Column(Integer, ForeignKey('parent.id'), primary_key=True)

class Other(Base):
    __tablename__ = 'other'
    id = Column(Integer, primary_key=True)
    child1_id = Column(Integer, ForeignKey('child1.id'))
    child1_object = relation('Child1', backref='others', primaryjoin = child1_id == Child1.id)

engine = create_engine('sqlite://')
Base.metadata.create_all(engine)
session = sessionmaker(engine)()

def main():
    child1 = Child1()
    child2 = Child2()
    other = Other()
    child1.others = [other]
    session.add(child1)
    session.add(child2)
    session.add(other)
    session.flush()
    assert 2 == session.query(Parent).count()
    assert child1 == session.query(Other).one().child1_object

if __name__ == '__main__':
    main()
[sqlalchemy] Re: relation that works in .48 but fails in .53
This is a side effect of declarative which I covered in my PyCon tutorial. Child1.id is mapped to both child1.id and parent.id, since they are mapped under the same name; you can see this if you look at Child1.id.property.columns. The non-foreign-key parent.id takes precedence. The easy solution is to split them up:

    class Child1(Parent):
        __tablename__ = 'child1'
        __mapper_args__ = dict(polymorphic_identity='child1')
        child_id = Column(Integer, ForeignKey('parent.id'), primary_key=True)

    class Other(Base):
        __tablename__ = 'other'
        id = Column(Integer, primary_key=True)
        child1_id = Column(Integer, ForeignKey('child1.child_id'))
        child1_object = relation('Child1', backref='others',
                                 primaryjoin=child1_id == Child1.child_id)

Bobby Impollonia wrote: [original post quoted in full, trimmed]
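The split-column fix above can be sketched as a self-contained program. This is a hedged example, not the exact code from the thread: it assumes SQLAlchemy 1.4 or later, where relation() is spelled relationship() and declarative_base lives in sqlalchemy.orm, and it drops Child2 to keep the sketch short.

```python
from sqlalchemy import Column, Integer, String, ForeignKey, create_engine
from sqlalchemy.orm import declarative_base, relationship, sessionmaker

Base = declarative_base()

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    _cls = Column('cls', String(50))
    __mapper_args__ = dict(polymorphic_on=_cls)

class Child1(Parent):
    __tablename__ = 'child1'
    __mapper_args__ = dict(polymorphic_identity='child1')
    # Renamed attribute: no longer collides with Parent.id.
    child_id = Column(Integer, ForeignKey('parent.id'), primary_key=True)

class Other(Base):
    __tablename__ = 'other'
    id = Column(Integer, primary_key=True)
    # The FK points at the renamed column, child1.child_id.
    child1_id = Column(Integer, ForeignKey('child1.child_id'))
    child1_object = relationship('Child1', backref='others',
                                 primaryjoin=child1_id == Child1.child_id)

engine = create_engine('sqlite://')
Base.metadata.create_all(engine)
session = sessionmaker(engine)()

child1 = Child1()
child1.others = [Other()]   # backref cascades the Other into the session
session.add(child1)
session.flush()
assert session.query(Other).one().child1_object is child1
```

With the foreign key targeting the renamed column, the primaryjoin is unambiguous and the relation resolves without a foreign_keys argument.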
[sqlalchemy] Re: relation that works in .48 but fails in .53
Is there a video or slides from that PyCon talk available online? I see the overview on the PyCon site (http://us.pycon.org/2009/tutorials/schedule/2PM4/) and it looks very interesting.

I looked more at what you said about the parent id column taking precedence, and that does seem to be what happens. In particular, I see that the SQL generated for

    session.query(Child1).filter(Child1.id.in_(range(5))).count()

applies the IN condition to parent.id in 0.5.3, whereas it was applied to child1.id in 0.4.8. The 0.4.8 behavior seems better here: I can't think of a situation where I would say Child1.id and want it to use the column from the parent table instead of the child. Regardless, now that I understand what is happening, I will be able to get my application working on 0.5.3. Thanks for your help.

On Wed, Apr 8, 2009 at 6:04 PM, Michael Bayer mike...@zzzcomputing.com wrote: [reply quoted in full, trimmed]
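The inspection Michael mentions can be shown in a few lines. This sketch assumes a modern SQLAlchemy (1.4+) purely for the import locations; the point is that under joined-table inheritance with a shared attribute name, .property.columns reveals that one attribute is mapped to two columns.

```python
from sqlalchemy import Column, Integer, String, ForeignKey
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    _cls = Column('cls', String(50))
    __mapper_args__ = dict(polymorphic_on=_cls)

class Child1(Parent):
    __tablename__ = 'child1'
    __mapper_args__ = dict(polymorphic_identity='child1')
    # Same attribute name as Parent.id: both columns end up under one property.
    id = Column(Integer, ForeignKey('parent.id'), primary_key=True)

# Both the child and the parent column are mapped under the single name 'id'.
cols = {str(c) for c in Child1.id.property.columns}
print(cols)  # {'child1.id', 'parent.id'}
```

Which of the two columns "wins" in rendered SQL is exactly the precedence question discussed in the thread, and it changed between releases.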
[sqlalchemy] Re: My sqlalchemy.orm.identity.IdentityManagedState is leaking
Inside my team object I create the people and their statistics, but I no longer add the people to the team collection. The links are never used again, but it's a many-to-many relationship (a person can be on several teams), so I thought I needed to do this in order to get the many-to-many table populated. Not putting these people in the team collections allows SQLA to free the resources. My question then becomes: how do I free up collections and have SQLA requery for those objects lazily?

While waiting to see if this question has an answer, I tried circumventing the collection by inserting directly into the already defined many-to-many table:

    # This has been simplified, so don't be concerned with the odd use
    # of ForeignKeyConstraint :)
    team_staff = Table('TEAM_STAFF', Base.metadata,
        Column('team_name', Unicode(100)),
        Column('staff_name', Unicode(100)),
        ForeignKeyConstraint(['team_name'], ['TEAM.name']),
        ForeignKeyConstraint(['staff_name', 'realm', 'site'], ['STAFF.name']),
    )

    Database.engine.execute(team_staff.insert(),
                            team_name=unicode(team_name),
                            staff_name=unicode(staff_name))

Unfortunately, for Unicode names this isn't working, and I can't seem to coerce it to do so *sigh*; it just ends up as NULL in the DB. I'm willing to accept any and all workarounds. Inserting into TEAM_STAFF without having to append to a collection, so that the memory gets freed when expunge_all() is called, is all I need to do. I'd then be able to sleep soundly!

Thanks all :)

Chris
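A direct insert into the association table can be sketched as below. This is a hedged example, not a diagnosis of Chris's exact problem: the table and column names follow the post, the ForeignKeyConstraints are dropped so the sketch runs standalone against SQLite, and the values are bound as a dict keyed by column name, since parameter names that don't match the statement's columns are a classic way to end up with NULLs in the row.

```python
from sqlalchemy import MetaData, Table, Column, Unicode, create_engine

metadata = MetaData()
team_staff = Table(
    'TEAM_STAFF', metadata,
    Column('team_name', Unicode(100)),
    Column('staff_name', Unicode(100)),
)

engine = create_engine('sqlite://')
metadata.create_all(engine)

# Bind the values as a dict keyed by the exact column names; the
# with-block commits the transaction on success.
with engine.begin() as conn:
    conn.execute(team_staff.insert(),
                 [{'team_name': u'red team', 'staff_name': u'chris'}])

with engine.connect() as conn:
    row = conn.execute(team_staff.select()).fetchone()
```

Nothing here touches an ORM collection, so no Team or Person instances are held in the identity map for the sake of the link row.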
[sqlalchemy] Re: When to create new sessions?
Now the decorator swallows exceptions silently. You have to re-raise the exception after rolling back, like Michael did. I believe the correct form is:

    def transaction(f):
        def wrapper(*args, **kwargs):
            try:
                value = f(*args, **kwargs)
            except:
                session.rollback()
                raise
            else:
                session.commit()
                return value
        return wrapper

On Wed, Apr 8, 2009 at 7:12 AM, Diez B. Roggisch de...@web.de wrote:

On Wednesday 08 April 2009 05:53:12 Michael Bayer wrote:

On Apr 7, 2009, at 6:07 PM, rintin...@googlemail.com wrote: Hey everyone, I'm new to SQLAlchemy (used to Django's ORM but need to switch) and there's just one thing I'm struggling with, which is: when am I supposed to create Sessions? I am of course creating scoped sessions. I feel like a real dunce for not being able to get my head around it. Do I create one per request and pass it around? That just doesn't feel quite right to me. Or can I create them at module level when I need them?

Per-request is the most natural approach. The point of the scoped session is that you can use it as a global object; there's no need to pass it around. It automatically routes operations to a thread-local session. I'm sure Django does something similar. The chapter on sessions includes a discussion of integrating the scoped session within a web application; you should check it out.

Also, is it okay to call commit() more than once on the same session?

Absolutely.

On a per-function basis even (seems like an awful lot of boilerplate code in each function though... surely not?!)

Depending on what you're doing, this may or may not be appropriate. Boilerplate can be cut down using a decorator, such as:

    @commits
    def do_some_stuff(...):
        ...

the decorator:

    def commits(fn):
        def go(*args, **kw):
            try:
                return fn(*args, **kw)
                Session.commit()

Not to be too nitpicky... but this commit is never reached. And dangling transactions can be very irritating. I'd go for this (untested):

    def transaction(f):
        def _wrapper(*args, **kwargs):
            commit = True
            try:
                return f(*args, **kwargs)
            except:
                commit = False
            finally:
                (session.commit if commit else session.rollback)()
        return _wrapper

Diez

You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en
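The corrected decorator's semantics (commit on success, rollback and re-raise on failure) can be verified without a database at all. This stdlib-only sketch substitutes a hypothetical StubSession that merely records calls; in real code the decorator would close over the SQLAlchemy scoped session instead.

```python
import functools

class StubSession:
    """Stand-in for a scoped session; records commit/rollback calls."""
    def __init__(self):
        self.log = []
    def commit(self):
        self.log.append('commit')
    def rollback(self):
        self.log.append('rollback')

session = StubSession()

def transaction(f):
    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        try:
            value = f(*args, **kwargs)
        except Exception:
            session.rollback()
            raise           # re-raise so callers still see the failure
        else:
            session.commit()
            return value
    return wrapper

@transaction
def succeeds():
    return 42

@transaction
def fails():
    raise ValueError('boom')

assert succeeds() == 42
assert session.log == ['commit']

try:
    fails()
except ValueError:
    pass                    # the exception propagated, as intended
assert session.log == ['commit', 'rollback']
```

The bare `except:` in the thread's versions also catches KeyboardInterrupt and SystemExit; catching Exception, as here, is the more common choice, and the rollback still runs before the error propagates.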