[sqlalchemy] filtered or computed version of an existing relation?
OK, next question. Well... two related questions. :)

1) In general, inside an object's method def, where I am doing arbitrary calculations, how can I get access to the session the object is bound to in order to run other queries?

2) More specifically, I want to make a filtered and computed version of an existing relation. How should I do this? See the code below.

Thank you,
AF

Code:

    users_table = Table('users', metadata,
        Column('id', Integer, primary_key=True),
        Column('name', String(50)),
        Column('room', ForeignKey('rooms.id')),
        Column('height', Numeric),
        Column('gender', String(1))
    )

    rooms_table = Table('rooms', metadata,
        Column('id', Integer, primary_key=True),
        Column('name', String(50)),
        Column('size', Integer)
    )

    metadata.drop_all(engine)
    metadata.create_all(engine)

    class user(object):
        pass

    class room(object):
        def percent_men(self):
            # How to code these?
            # Must return the % of men vs. women in this room.
            pass

        def percent_of_all_users(self):
            # % of users here vs. count of all users
            pass

        def men(self):
            # just the male users in this room
            pass

    mapper(user, users_table, properties={
        'room': relation(room, backref='users')
    })
    mapper(room, rooms_table)

    # and here are the methods I wish to have (a_room being a room instance):
    percent_men = a_room.percent_men()
    percent_of_population = a_room.percent_of_all_users()
[sqlalchemy] Re: Two enhancement proposals for orm.Query
On Wednesday 17 June 2009 19.08:10 klaus wrote:

> ... whether a query will yield any result ... The best approximation seems to be "query.first() is not None", which can select a lot of columns. I often see "query.count() > 0", which can become quite expensive on a DBMS like PostgreSQL.

Just a side note that pg tends to compute first() efficiently if (and I assume SA does this) the implementation uses LIMIT to tell the db that really only the first row is interesting. I don't know about other databases.

cheers
-- vbi
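A minimal sketch of the two existence checks under discussion (a hypothetical mapped User class is assumed); first() lets the database stop after one row via LIMIT, while count() must tally every matching row:

    # cheap existence check: renders with LIMIT 1
    exists = session.query(User).filter(User.name == u'fred').first() is not None

    # expensive existence check: counts all matching rows first
    exists = session.query(User).filter(User.name == u'fred').count() > 0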
[sqlalchemy] safe (or paranoid) deletion
Hi,
I'm using sqlalchemy 0.4.8 and I'm looking for a safe deletion extension (if one exists). By safe deletion I mean marking a record as deleted rather than physically deleting it; for example, there is an extension for RoR called acts_as_paranoid [1]. This extension adds a deleted_at field to the model and:

1. when you delete a record, deleted_at is set to the current timestamp;
2. it adds "and (deleted_at is NULL)" to every query related to the model.

Is there something similar for SA, or alternative methods to achieve safe deletion?

Thanks!

[1] http://rubyforge.org/projects/ar-paranoid
[sqlalchemy] SA 0.4 with SQLServer and PyODBC Unicode data into nvarchar column
Hello,

I have a question regarding SQLAlchemy (0.4) with PyODBC and SQL Server 2008. I want to insert Unicode data, and query values should be prefixed with 'N', like INSERT INTO test VALUES(N'Сърцето и черният й дроб'). Is this possible with SA 0.4?

The main problem with SQL Server is that it needs an explicit 'N' prefix on string values containing Unicode, and I cannot find a way to insert Unicode data other than modifying SA queries. Before digging into this I want to make sure there is no other way.

Best regards,
Cristian.
[sqlalchemy] Re: relation error?
On Jun 18, 2009, at 1:58 AM, allen.fowler wrote:

>> you can use a validator that rejects all changes, or if you're brave you can create a custom __setattr__ method that brokers all attribute setter access. the former is in the SQLA mapping docs, the latter is part of Python.
>
> Thank you. Interesting, though it seems a bit roundabout. :)

it's an object relational mapper, which deals with persistence. the behavior of your business objects beyond that is your own affair.
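A minimal sketch of the validator approach mentioned above (assumes SQLAlchemy 0.5's validates decorator; the User class, its name attribute, and the allow-once policy are illustrative):

    from sqlalchemy.orm import validates

    class User(object):
        @validates('name')
        def _refuse_changes(self, key, value):
            # allow the initial assignment, reject any later change
            if getattr(self, key, None) is not None:
                raise ValueError("%s is read-only once set" % key)
            return value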
[sqlalchemy] Re: safe (or paranoid) deletion
On Jun 18, 2009, at 6:03 AM, Alberto Granzotto wrote:

> 1. when you delete a record, deleted_at is set to the current timestamp;

you can do this via myobject.deleted_at = func.now(), then flush

> 2. it adds "and (deleted_at is NULL)" to every query related to the model.

we have had users who subclassed Query to provide this logic. there's a couple of kinks to that which will be resolved by ticket #1424.

> Is there something similar for SA, or alternative methods to achieve safe deletion?

there's a recipe on the wiki which instead uses history tables to log an audit trail - that way your primary tables aren't cluttered up with dead rows.
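Pending ticket #1424, a minimal sketch of the filtering idea written as a plain helper rather than a full Query subclass (the mapped class and its deleted_at attribute are assumptions):

    def query_live(session, cls):
        # exclude soft-deleted rows; "cls.deleted_at == None" renders as
        # "deleted_at IS NULL"
        return session.query(cls).filter(cls.deleted_at == None)

    # usage, assuming a mapped Article class with a deleted_at column:
    # query_live(session, Article).filter(Article.title == u'foo').all()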
[sqlalchemy] Re: session flush
On Jun 18, 2009, at 1:02 AM, Michael Mileusnich wrote:

> I know I have asked this before but I would like some clarification. If I do something like:
>
>     i = query(table).all()
>     for instance in i:
>         thread_class(i.ID)
>         thread_class.start()
>     del thread_class
>
> and do an additional query in the __init__ of the thread_class to pull up the entire record for i using the ID passed into variable fetch_i, can I do sess.flush() and have fetch_i be updated? I am using a scoped session and sometimes I get the following error:
>
>     Traceback (most recent call last):
>       File "D:\Python26\lib\threading.py", line 522, in __bootstrap_inner
>         self.run()
>       File "C:\Dev\jlaunch.py", line 253, in run
>         self.sess.flush()
>       File "d:\python26\lib\site-packages\sqlalchemy-0.5.3-py2.6.egg\sqlalchemy\orm\session.py", line 1347, in flush
>         raise sa_exc.InvalidRequestError("Session is already flushing")
>     InvalidRequestError: Session is already flushing
>
> Am I going about this wrong?

it's not at all clear what you're doing here, but Sessions are not threadsafe in any case, so you cannot access them via multiple threads without applying locking to all operations. Using the ScopedSession, each thread should call Session() itself (or use the class-level accessors) so that individual sessions are not shared among threads.
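A minimal sketch of the one-session-per-thread pattern being described (the engine and the mapped Record class are assumptions):

    import threading
    from sqlalchemy.orm import scoped_session, sessionmaker

    Session = scoped_session(sessionmaker(bind=engine))

    class Worker(threading.Thread):
        def __init__(self, record_id):
            threading.Thread.__init__(self)
            self.record_id = record_id

        def run(self):
            # Session() returns a session local to this thread; nothing is
            # shared with the thread that spawned the Worker
            session = Session()
            record = session.query(Record).get(self.record_id)
            # ... work with record, then commit within this thread ...
            session.commit()
            Session.remove()  # discard this thread's session when finished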
[sqlalchemy] Re: filtered or computed version of an existing relation?
On Jun 18, 2009, at 2:27 AM, AF wrote:

> 1) In general, inside an object's method def, where I am doing arbitrary calculations, how can I get access to the session the object is bound to in order to run other queries?

object_session(self)

> 2) More specifically, I want to make a filtered and computed version of an existing relation, how should I do this?

I would use object_session(self).query(Class)... to create the appropriate query.
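Applying this to the original post, a sketch of what two of the room methods might look like (builds on the user/room mapping and 'users' backref from the question; the gender encoding and the zero-user handling are assumptions):

    from sqlalchemy.orm import object_session

    class room(object):
        def percent_men(self):
            # in-room counts can come straight off the 'users' backref
            total = len(self.users)
            if not total:
                return 0.0
            men = sum(1 for person in self.users if person.gender == u'M')
            return 100.0 * men / total

        def percent_of_all_users(self):
            # object_session() gives the session this instance is bound to,
            # so we can query across all users, not just this room's
            session = object_session(self)
            all_users = session.query(user).count()
            if not all_users:
                return 0.0
            return 100.0 * len(self.users) / all_users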
[sqlalchemy] Re: SA 0.4 with SQLServer and PyODBC Unicode data into nvarchar column
PyODBC accepts unicode strings (and by that I mean Python unicode, u'some string') directly on most platforms. No 'N' character is needed (unless PyODBC or the ODBC driver does this behind the scenes). Our MSSQL dialect does detect certain builds where this is not possible, and instead encodes to utf-8, or whatever encoding is configured on the engine, before passing in unicode strings.

On Jun 18, 2009, at 8:30 AM, cristiroma wrote:

> I want to insert Unicode data, and query values should be prefixed with 'N', like INSERT INTO test VALUES(N'Сърцето и черният й дроб'). Is this possible with SA 0.4?
[sqlalchemy] Optimizing joined entity loads
Hi list! It's been a long time since I've posted here. SA is still the best Python ORM; thanks for all your hard work Mike!

Now, the question. I've got a set of related entities:

Order (has items)
Item (has attributes)
Attribute

I am working on optimizing the loading and have run into a situation that I can't figure out how to make SA optimize. Here's the scenario:

First I load the order (1 query)
Next I load the related items (1 query)
Next I load the related attributes for each item (1 query for each item)

I have tried eager loading (i.e. setting lazy=False on the item.attributes relationship), but that generates a HUGE result set that takes MUCH longer to load than loading the attributes individually for each item as listed above.

What I'd like to have SA do is issue a single query to load all attributes for all items of the order when the first item's attributes are requested. Here's the revised scenario from above:

First I load the order (1 query)
Next I load the related items (1 query)
Next I load the related attributes for all items (1 query loads all attributes for the entire order)

Is this possible, or do I have to roll my own extension to orm.relation (actually orm.properties.PropertyLoader)? -- that doesn't look very fun.

~ Daniel
[sqlalchemy] Re: SA 0.4 with SQLServer and PyODBC Unicode data into nvarchar column
You should put this into your .freetds.conf file to ensure that FreeTDS will tell iconv to do the right thing (my understanding is that all unicode data is encoded to UCS-2 by FreeTDS):

    tds version = 8.0
    client charset = UTF-8

SQLAlchemy's create_engine has an encoding kwarg:

    encoding='utf-8' - the encoding to use for all Unicode translations, both
    by engine-wide unicode conversion as well as the Unicode type object

This should be set to the same value as you have for 'client charset' in the .freetds.conf file.

You can run into problems when bogus data has been stuffed into the nvarchar field at the dataserver, as it will cause the Python codec to blow up when retrieving the data, so Don't Do That. (I discovered this the hard way via a server-side job that was populating the data without ensuring that the encoding was well-formed.)

pjjH
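A minimal sketch tying the engine encoding to the FreeTDS client charset (the DSN name and credentials are placeholders):

    from sqlalchemy import create_engine

    # 'utf-8' here should match 'client charset = UTF-8' in .freetds.conf
    engine = create_engine('mssql://user:password@/?dsn=MyDSN', encoding='utf-8')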
[sqlalchemy] Re: Optimizing joined entity loads
On Jun 18, 2009, at 11:09 AM, Daniel wrote:

> What I'd like to have SA do is issue a single query to load all attributes for all items of the order when the first item's attributes are requested. [...] Is this possible, or do I have to roll my own extension to orm.relation (actually orm.properties.PropertyLoader)?

I don't really understand the case here. A query that "loads all attributes for the entire order" would necessarily return just as many rows as the original query using an eager load. If the order had 5 related items, and each item had 10 attributes, it's 50 rows. The phrase "1 query loads all attributes for the entire order" would be loading the same 50 rows. So I don't see how the result set is HUGE in one case and not the other (assuming HUGE means number of rows; if number of columns, SQLA ignores columns for entities which it already has during a load).

Normally, if you wanted the attributes to eagerload off the related items, but not from the order, you would specify eagerloading on only those attributes which you want eagerloaded:

    query(Order).options(eagerload('items.attributes'))

the above will set up order.items.attributes as an eagerload, but nothing else. It sounds like you already did this, so I don't really understand the problem.
[sqlalchemy] Re: SA 0.4 with SQLServer and PyODBC Unicode data into nvarchar column
Well, out of the box it didn't work; I pasted a sample code here: http://pastebin.com/m104b32e0 (note that strings are Bulgarian characters, not #1040 as rendered by pastebin). My FreeTDS/ODBC config is:

freetds.conf:

    [Observations_TDS]
    host = 10.0.0.50
    port = 1433
    tds version = 8.0
    client charset = UTF-8

odbc.ini:

    [Observations]
    Driver = MSSQLunixODBC
    Description = SQL Server
    Servername = Observations_TDS
    User = sa
    Password = vmuser
    Language =
    Database = 20090610OBS
    Logging = 1
    LogFile = /tmp/log_observations
    QuotedId = Yes
    AnsiNPW = Yes
    Mars_Connection = No

Sample code is:

    #!/var/local/eoe/python245/bin/python
    # -*- coding: UTF-8 -*-
    # vim: set fileencoding=UTF-8 :

    from sqlalchemy import create_engine
    from sqlalchemy.orm import sessionmaker
    from sqlalchemy import Table, Column, String, MetaData, Unicode, Integer
    from sqlalchemy.orm import mapper

    # Table definition in SQL
    # CREATE TABLE [dbo].[test2] (
    #     [ID] int IDENTITY(1, 1) NOT NULL,
    #     [comment] nvarchar(100) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
    #     CONSTRAINT [PK_test2] PRIMARY KEY CLUSTERED ([ID])
    # )
    # ON [PRIMARY]
    # GO

    # sqlalchemy 0.4.8
    # python 2.4
    # freetds-0.8.2 officially patched
    # pyodbc 2.1.6

    metadata = MetaData()
    test2_table = Table('test2', metadata,
        Column('comment', Unicode, primary_key=True)
    )

    class Test2(object):
        def __init__(self, comment):
            self.comment = comment

    mapper(Test2, test2_table)

    engine = create_engine('mssql://sa:vmuser@/?dsn=Observations', echo=True)
    Session = sessionmaker(bind=engine)
    session = Session()

    ob = Test2(u'товарни автомобили')
    session.save(ob)
    session.commit()

    ob_list = session.query(Test2).all()
    print ob_list

Output is:

    [cor...@localhost bin]$ ./python test2.py
    2009-06-18 16:55:22,754 INFO sqlalchemy.engine.base.Engine.0x..94 BEGIN
    2009-06-18 16:55:22,755 INFO sqlalchemy.engine.base.Engine.0x..94 INSERT INTO test2 (comment) VALUES (?)
    2009-06-18 16:55:22,755 INFO sqlalchemy.engine.base.Engine.0x..94 ['\xd1\x82\xd0\xbe\xd0\xb2\xd0\xb0\xd1\x80\xd0\xbd\xd0\xb8 \xd0\xb0\xd0\xb2\xd1\x82\xd0\xbe\xd0\xbc\xd0\xbe\xd0\xb1\xd0\xb8\xd0\xbb\xd0\xb8']
    2009-06-18 16:55:22,787 INFO sqlalchemy.engine.base.Engine.0x..94 COMMIT
    2009-06-18 16:55:22,791 INFO sqlalchemy.engine.base.Engine.0x..94 BEGIN
    2009-06-18 16:55:22,791 INFO sqlalchemy.engine.base.Engine.0x..94 SELECT test2.comment AS test2_comment FROM test2 ORDER BY test2.comment
    2009-06-18 16:55:22,791 INFO sqlalchemy.engine.base.Engine.0x..94 []
    [<__main__.Test2 object at 0xb7ab5d0c>]

2009/6/18 Michael Bayer mike...@zzzcomputing.com:

> PyODBC accepts unicode strings (and by that I mean Python unicode, u'some string') directly on most platforms. No 'N' character is needed [...]

-- 
The brain is a wonderful organ. It starts working the moment you get up in the morning, and does not stop until you get into the office. Robert Frost (1874-1963)
[sqlalchemy] Re: safe (or paranoid) deletion
On 18 Giu, 15:38, Michael Bayer mike...@zzzcomputing.com wrote:

>> 1. when you delete a record, deleted_at is set to the current timestamp;
>
> you can do this via myobject.deleted_at = func.now(), then flush

but this doesn't propagate to all the children objects; I'd like to mark them as deleted too. Do I need to manually iterate through them?

>> 2. it adds "and (deleted_at is NULL)" to every query related to the model.
>
> we have had users who subclassed Query to provide this logic. there's a couple of kinks to that which will be resolved by ticket #1424.

great, thank you.

>> Is there something similar for SA, or alternative methods to achieve safe deletion?
>
> there's a recipe on the wiki which instead uses history tables to log an audit trail - that way your primary tables aren't cluttered up with dead rows.

OK, I've got it :)
[sqlalchemy] Re: SA 0.4 with SQLServer and PyODBC Unicode data into nvarchar column
The weirdest thing is that if I write:

    ob = Test2('just a test')
    session.save(ob)
    session.commit()

the output is:

    2009-06-18 17:08:15,251 INFO sqlalchemy.engine.base.Engine.0x..74 INSERT INTO test2 (comment) VALUES (?)
    2009-06-18 17:08:15,251 INFO sqlalchemy.engine.base.Engine.0x..74 ['just a test']
    2009-06-18 17:08:15,262 INFO sqlalchemy.engine.base.Engine.0x..74 COMMIT

and everything works fine. But if I write:

    ob = Test2(u'товарни автомобили')
    session.save(ob)
    session.commit()

the output is:

    [cor...@localhost bin]$ ./python test2.py
    2009-06-18 16:55:22,754 INFO sqlalchemy.engine.base.Engine.0x..94 BEGIN
    2009-06-18 16:55:22,755 INFO sqlalchemy.engine.base.Engine.0x..94 INSERT INTO test2 (comment) VALUES (?)
    2009-06-18 16:55:22,755 INFO sqlalchemy.engine.base.Engine.0x..94 ['\xd1\x82\xd0\xbe\xd0\xb2\xd0\xb0\xd1\x80\xd0\xbd\xd0\xb8 \xd0\xb0\xd0\xb2\xd1\x82\xd0\xbe\xd0\xbc\xd0\xbe\xd0\xb1\xd0\xb8\xd0\xbb\xd0\xb8']
    2009-06-18 16:55:22,787 INFO sqlalchemy.engine.base.Engine.0x..94 COMMIT

and in the database, the column is empty! No data, just blank.

On Thu, Jun 18, 2009 at 6:43 PM, phrrn...@googlemail.com wrote:

> You should put this into your .freetds.conf file to ensure that FreeTDS will tell iconv to do the right thing [...]
[sqlalchemy] Re: safe (or paranoid) deletion
On Jun 18, 2009, at 1:47 PM, Alberto Granzotto wrote:

> but this doesn't propagate to all the children objects; I'd like to mark them as deleted too.

that is true. feel free to use the cascade_iterator mapper function (requires that "delete" cascade is configured on the relation()s you want to have this sort of cascade):

    from sqlalchemy import func
    from sqlalchemy.orm.attributes import instance_state
    from sqlalchemy.orm import class_mapper

    def delete_obj(myobject):
        myobject.deleted_at = func.now()
        mapper = class_mapper(type(myobject))
        for obj, m in mapper.cascade_iterator("delete", instance_state(myobject)):
            obj.deleted_at = func.now()
[sqlalchemy] Re: Optimizing joined entity loads
> I dont really understand the case here.

My first example wasn't very good. In an attempt to keep it simple I actually made it too simple. Here's another example:

Order (has items)
Item (has attributes, has tags)
Attribute
Tag

If I set both Item.attributes and Item.tags to eager-load, then my result set size is the product of len(attributes) * len(tags), which is where the result set becomes HUGE.

This is a description of the queries before the optimization:

select orders (1 query)
select items (1 query)
select attributes (1 query per item)
select tags (1 query per item)

I'd like to combine all attribute queries into a single query, and likewise for tags. So instead of having 2 + len(items) * 2 queries (assuming 10 items, that's 22 queries), I'd have exactly 4 queries, like this (see the sketch at the end of this message):

select orders ... where order_id = ? (1 query)
select items ... where order_id = ? (1 query)
select attributes ... join items where order_id = ? (1 query)
select tags ... join items where order_id = ? (1 query)

This would be done by the loader strategy (probably a variant of LazyLoader), which would issue a single query. The result of that query would be used to populate the attributes collection of each item on the order.

> ... So i dont see how the result set is HUGE in one case and not the other (assuming HUGE means, number of rows. if number of columns, SQLA ignores columns for entities which it already has during a load).

I think my new example above should clear up the confusion. However, the old example (using eager loading) would return duplicate copies of the item data for each attribute. If there are a lot of columns in the items table, the size of the result set can get quite large using this type of eager load, and it's pretty inefficient since it's returning a duplicate copy of the item with each attribute.

The strategy I'm looking for eliminates all that duplicate data at the expense of a single extra query. In the case of having multiple relations (e.g. attributes and tags) the eager-load result set grows exponentially, while the strategy I'm looking for only requires a single query per relation and loads no duplicate data. Theoretically this is the most efficient solution possible, assuming that all data must be loaded (i.e. every item, attribute and tag).

> Normally, if you wanted the attributes to eagerload off the related items, but not from the order, you would specify eagerloading on only those attributes which you want eagerloaded.

Yes, I understand that. It's not what I'm asking for though.

Thanks.

~ Daniel
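A sketch of how the third query above might look in SQLAlchemy terms (Attribute.item_id and Item.order_id are assumed column names; the join is expressed implicitly through the WHERE clause):

    # one query loading every attribute for every item of a given order
    attrs = session.query(Attribute).\
        filter(Attribute.item_id == Item.id).\
        filter(Item.order_id == some_order_id).\
        all()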
[sqlalchemy] Re: Optimizing joined entity loads
Clarification:

> If I set both Item.attributes and Item.tags to eager-load, then my result set size is the product of len(attributes) * len(tags), which is where the result set becomes HUGE.

I jumped right from the eager-load to the completely non-optimized (no eager loading) scenario:

> This is a description of the queries before the optimization:
>
> select orders (1 query)
> select items (1 query)
> select attributes (1 query per item)
> select tags (1 query per item)

Here's the query list for the eager-load-attributes-and-tags scenario:

select orders (1 query)
select items eager-loading attributes and tags (1 query)

So 2 queries in that scenario, but the second query takes FOREVER to execute and returns a HUGE result set.

~ Daniel
[sqlalchemy] Re: Optimizing joined entity loads
On Jun 18, 2009, at 2:19 PM, millerdev wrote:

> This would be done by the loader strategy (probably a variant of LazyLoader), which would issue a single query. The result of that query would be used to populate the attributes collection of each item on the order.

hey Daniel -

Good to have you back on the list. So this is some variant of, "I have a bunch of objects and I'd like to issue a single SQL statement representing a set of collections across all of them." SQLA doesn't have a built-in loader function like this, although it's been discussed. IMO it's just too much to be going on automatically with too little understanding required up front, meaning end users might flip on "super-magic-loading" and then wonder why some massive query is being triggered by a single attribute load - if they had already discarded half the objects that would be the recipients, then the query would in fact be quite wasteful. The controversial part is that you ask for a single attribute on a single object, but then data is loaded into objects that are elsewhere as a result.

I think Hibernate might have this kind of loading available; I've seen it in their docs, but I doubt it's used very much. If it were a highly requested feature that would see a lot of use, that would help its case... otherwise it's a complicated feature that's hard to implement, maintain, and support to the degree that a core SQLA feature requires, mostly in terms of constructing the right SQL for a huge variety of cases, writing/maintaining all the unit tests for those cases, and supporting on the mailing list and IRC those users who think they want to use this feature but don't really understand it (or are hitting bugs with it). OTOH, rolling recipes for things like this yourself has no such burden of working perfectly in all cases at all times for everyone everywhere; it only needs to work for your immediate needs.

I think the infrastructure exists to construct this feature seamlessly without the need to hack into LoaderStrategy (though that would be where a core version of this feature would be created). You'd start using plain session.query() to get the rows you want, hash together the collections as desired, and then apply them to the already loaded objects using the attributes.set_committed_value() function which is used for this purpose (pseudocode follows; also use trunk rev 6066 since I had to fix set_committed_value()):

    import collections
    from sqlalchemy.orm import attributes

    loaded_collections = collections.defaultdict(list)
    for parent_pk, obj in session.query(Parent.id, DesiredClass).<figure out the correct criteria>:
        loaded_collections[parent_pk].append(obj)

    for parent_pk, collection in loaded_collections.iteritems():
        # probably via session.identity_map
        obj = lookup_the_parent_object(parent_pk)
        attributes.set_committed_value(obj, 'collectionname', collection)

note that set_committed_value() blows away any pending changes on the collection. If that's an issue, you can perform a conditional which checks for an already existing collection in object.__dict__, or use attributes.get_history(), etc.

Since you're loading for every object in the result, it seems reasonable that you'd just issue the above query immediately following the primary result (when we've considered this feature, that's how we considered it). But if you truly wanted to trigger the above by a lazy attribute, you can roll the above into a callable and apply it to any attribute via:

    def load_data():
        # the above loading logic
        return attributes.ATTR_WAS_SET

    state = attributes.instance_state(some_object)
    state.set_callable('collectionname', load_data)

which means, when you hit some_object.collectionname, load_data is called, and the return value of ATTR_WAS_SET indicates the callable populated what it needed; there's no need for the attribute system to use the return value as the collection.

So you could build a subclass of Query which applied all of the above using a function like .special_deferred_load('collectionname'), which would then install the loader callable on each element in the result when that option is selected. The hard part is what I've left out, i.e. constructing the query to load the child items and finding all the parent objects that are involved in the load.
[sqlalchemy] Re: Optimizing joined entity loads
> So this is some variant of, "I have a bunch of objects and I'd like to ...

<snip lots of reasons why this should not be a standard feature>

Yeah, I understand what I'm asking for here, and I would never expect this kind of optimization to kick in by default. Instead, it would only be used in those cases where profiling has shown that there is a bottleneck. I have profiled my code to find out where the bottlenecks are, so I think I fall into the category of individuals who actually know how to use a feature like this.

> I think Hibernate might have this kind of loading available...

I wouldn't be surprised. From my use of Hibernate (which was quite a while ago now) my inclination was that it had lots of options for complex and highly customized optimizations, like the one I'm trying to do.

> You'd start using plain session.query() to get the rows you want, hash together the collections as desired, and then apply them to the already loaded objects using the attributes.set_committed_value() function which is used for this purpose (pseudocode follows; also use trunk rev 6066 since I had to fix set_committed_value()):

I suppose r6066 is a 0.5 revision number? How well does set_committed_value() work in SA 0.4.7? I haven't upgraded to SA 0.5 yet, and I'd rather not do that right now if I can avoid it. However, if this can't be done with 0.4...

Thanks a lot for the tips on how to approach this problem. That's exactly what I needed.

~ Daniel
[sqlalchemy] preview an update?
I'm building a complex ETL tool with SQLAlchemy that will sometimes need to let the user preview a changed record - not actually carry out the update, just find out which fields would be changed, and to what values. I'm having trouble figuring out a good way to do it. Is there a way to get a sort of preview of an update - get information about what updates are pending for the next flush, or get a flush to report back its changes without actually performing it?

One approach would be to set echo to True, let the user see the echo, then roll back the transaction, but there are a couple of reasons I don't think that will work for me. I don't want to make the user mentally parse SQL statements; I'd like to get a dict of the pending changes and write my own code to display them attractively, instead. Anyway, I'm having trouble getting echo to work on this app. (The whole thing is part of a TurboGears project, and my echoes aren't showing up, probably because of something I don't understand about the SQLAlchemy logging configuration.)

Just getting to the bind variables that will ultimately be sent along with the UPDATE statement would be a great solution, and I'm trying to figure out where I could get them. Right now, it looks like the `params` dict is assembled in Mapper._save_obj in orm/mapper.py, used at line 1376 to issue connection.execute, then discarded. I'm considering overriding Mapper with my own version whose _save_obj can exit at this point, returning `params`, if it is invoked with a preview=True parameter... but that seems a little scary. _save_obj is a long method, and I'd have to keep my app's version of it carefully synched with the canonical sqlalchemy version indefinitely.

Thanks in advance for reading through, and for any suggestions!

-- 
Catherine
http://catherinedevlin.blogspot.com/
[sqlalchemy] Re: preview an update?
On Jun 18, 2009, at 4:27 PM, Catherine Devlin wrote:

> Is there a way to get a sort of preview of an update - get information about what updates are pending for the next flush, or get a flush to report back its changes without actually performing it?

you can get most of this stuff from the session without any flush occurring. At the object level are the "new", "dirty", and "deleted" collections on Session. At the attribute level, the attributes.get_history() method will illustrate the full changes made since the last flush on an individual object attribute. get_history() should be in the API docs.

what you can't get without flushing are newly generated primary key identifiers, server-side defaults that haven't fired off, and the results of cascades like delete or on-update cascades. However, if this does what you need, it is definitely the best way to go, as it doesn't require issuing a flush() which then has to be rolled back internally.

> One approach would be to set echo to True, let the user see the echo, then roll back the transaction [...] I'd like to get a dict of the pending changes and write my own code to display them attractively, instead.

If you went down this road, use a ConnectionProxy to capture all the SQL expressions as they are emitted. this is also in the API docs. The unit tests use a custom proxy to capture SQL expressions for various operations, using a decorator to enable/disable the capturing. you can capture the SQL expression constructs directly, where you'll have access to all the Table objects and bind params, and you can display it in any way suitable.

Depending on your needs I'd use one of the above approaches, and I would not recommend instrumenting any of the ORM internals.
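A minimal sketch of the session-inspection approach described above (the attribute names checked here are hypothetical; get_history() yields added, unchanged, and deleted values for an attribute):

    from sqlalchemy.orm import attributes

    # preview pending UPDATEs without flushing: walk the session's dirty objects
    for obj in session.dirty:
        for key in ('name', 'price'):  # hypothetical mapped attribute names
            added, unchanged, deleted = attributes.get_history(obj, key)
            if added or deleted:
                print '%s.%s: %r -> %r' % (type(obj).__name__, key, deleted, added)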