[sqlalchemy] Low performance when reflecting tables via pyodbc+mssql
For the following code:

    from sqlalchemy import create_engine, MetaData, Table

    dbengine = create_engine('mssql+pyodbc://MYDSN')
    dbmeta = MetaData()
    dbmeta.bind = dbengine

    def get_table(name):
        table = Table(name, dbmeta, autoload=True, autoload_with=dbengine)

it takes 50 seconds or so per call to `get_table`. Did I miss something? Where should I look? Thanks in advance. -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at http://groups.google.com/group/sqlalchemy?hl=en. For more options, visit https://groups.google.com/groups/opt_out.
Re: [sqlalchemy] Low performance when reflecting tables via pyodbc+mssql
On Tue, Feb 12, 2013 at 10:12 AM, shaung shaun.g...@gmail.com wrote:

> It takes 50 seconds or so per call to `get_table`. Did I miss something?

If you add echo='debug' to your create_engine call, SA will log all calls to the database and the rows returned, which might give you an idea of where all the time is being spent.

Hope that helps,

Simon
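Simon's suggestion looks like this as a minimal sketch (SQLite stands in for the poster's `mssql+pyodbc://MYDSN` URL so it runs anywhere):

```python
from sqlalchemy import create_engine

# echo='debug' makes SQLAlchemy log every SQL statement *and* the
# returned rows via the 'sqlalchemy.engine' logger, which is what
# reveals where reflection time is spent. Swap in the real MSSQL URL
# ('mssql+pyodbc://MYDSN') when diagnosing the actual problem.
engine = create_engine('sqlite://', echo='debug')
```

With this in place, each reflected table shows up in the log as a series of catalog queries, so slow steps are easy to spot by timestamp.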
Re: [sqlalchemy] Low performance when reflecting tables via pyodbc+mssql
On Tuesday, February 12, 2013 7:18:37 PM UTC+9, Simon King wrote:

> If you add echo='debug' to your create_engine call, SA will log all calls to the database and the rows returned, which might give you an idea of where all the time is being spent.

Thanks, Simon. I've looked through the debug log and found the reason. It turns out that the table has several foreign key constraints, and SA is inspecting all of the related tables, and then the tables related to those, transitively. There were 23 tables involved, which explains the long execution time.

So is there anything I can do about this? I'm considering two possibilities:

1. Ignore the constraints to speed things up
2. Cache all the metadata to a disk file so there's no wait when restarting the program

Either would be fine for me. Is it possible?
Re: [sqlalchemy] Low performance when reflecting tables via pyodbc+mssql
On 12-02-2013 13:29, shaung wrote:

> So is there anything I can do about this? I'm considering two possibilities:
> 1. Ignore the constraints to speed things up
> 2. Cache all the metadata to a disk file so there's no wait when restarting the program

I had a similar problem. Another application had created an MS SQL database and I needed to select data from it. There were many tables, so I tried reflection, but it was slow, and I decided to use SA's declarative method instead. Declaring all the tables again in Python was too much work, though, so I used sqlautocode to generate declarative table classes and used them in my models with some minor modifications. If the DB structure does not change too often, this will speed things up.
Re: [sqlalchemy] Low performance when reflecting tables via pyodbc+mssql
On Tue, Feb 12, 2013 at 11:29 AM, shaung shaun.g...@gmail.com wrote:

> 2. Cache all the metadata to a disk file so there's no wait when restarting the program

Caching the metadata should be fairly easy if you are happy with that approach. I think MetaData instances are picklable:

http://stackoverflow.com/questions/11785457/sqlalchemy-autoloaded-orm-persistence

Simon
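The pickle approach can be sketched as follows. SQLite and the `appointment` table name are illustrative stand-ins; with MSSQL you would reflect once (the slow step), pickle the MetaData to disk, and unpickle it on later runs:

```python
import pickle

from sqlalchemy import Column, Integer, MetaData, Table, create_engine

# Set up a throwaway database so the sketch is self-contained.
engine = create_engine('sqlite://')
meta = MetaData()
Table('appointment', meta, Column('id', Integer, primary_key=True))
meta.create_all(engine)

# First run: reflect everything (slow against a big MSSQL schema),
# then pickle the resulting MetaData.
reflected = MetaData()
reflected.reflect(bind=engine)
blob = pickle.dumps(reflected)  # in practice, write this to a file

# Later runs: unpickle instead of reflecting again.
cached = pickle.loads(blob)
print(sorted(cached.tables))  # ['appointment']
```

Tables unpickled this way are not bound to any engine, so you pass the engine (or connection) explicitly when executing against them. The cache must of course be regenerated whenever the schema changes.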
[sqlalchemy] Mixing declarative style with expression language
Greetings, I trust everyone is doing well. Our code base uses SQLAlchemy. Some of the old code uses expression-language style, e.g.

    appts = Table('Appointment', META, autoload=True, autoload_with=DB)
    statement = select([appts.c.appointmentId],
                       and_(appts.c.appointmentId == 212))
    results = select_all(statement)

whereas some of our code uses declarative-style classes, e.g.

    class Appointment(Alchemy_Base, QueryMixin):
        __tablename__ = 'Appointment'
        appointmentId = Column(Integer, primary_key=True)

        @classmethod
        def get_by_id(cls, appointment_id):
            query = cls.query.filter_by(appointmentId=appointment_id)
            return query.one()

Some of our scripts are going to use both of these files (both expression style and declarative style), so my question is: is mixing the two styles of code going to cause any sort of problems, or are we going to be okay? I am asking because some people in our company are suggesting that we convert all the code into one format (the devs want to convert the old expression-style code into declarative style). Kindly let me know your suggestions. BTW, we're using MySQL as the database and Python 2.6.4. Thanks in advance.
Re: [sqlalchemy] Low performance when reflecting tables via pyodbc+mssql
On Feb 12, 2013, at 6:29 AM, shaung shaun.g...@gmail.com wrote:

> 2. Cache all the metadata to a disk file so there's no wait when restarting the program

You can pickle the metadata for this purpose. Though I wonder if it's time to revisit that behavior of reflection; it would be easy enough to have it stop reflecting after one level deep. I'm actually not even sure why it's so critical that it even traverse the first level of foreign keys, since those ForeignKey objects could just remain unresolved until one ensured that the other tables were also pulled in explicitly. The Table object would still work with those FK objects unresolved.
Re: [sqlalchemy] Mixing declarative style with expression language
On Feb 12, 2013, at 9:21 AM, Oltmans rolf.oltm...@gmail.com wrote:

> Is mixing the two styles of code going to cause any sort of problems, or are we going to be okay?

OK, well, this answer got kind of long. This is the TLDR: make sure your select_all() function is making use of the same Session that your Appointment.get_by_id() method is using. If that makes sense, then you're good to go. Ultimately your code would probably be more succinct and cleaner if you moved to an all-ORM style, but as far as actual execution problems go, sharing the Session as the point of execution is enough.
With the TLDR given, here is the detailed explanation. There's one aspect of this style that is worth taking a look at, which is the scope of connection checkouts and transactions. It's not so critical that you're producing SQL queries using two different systems; what can cause problems is that those two sets of SQL queries are invoked using different connection contexts. Once the application wants to use both styles in an interactive way, but with many different connections in use simultaneously, the work of the two systems carries on in separate transactions which are isolated from each other and can produce conflicts.

So in the abstract, what we are talking about is whether you wrote a program like this (this is not SQLAlchemy code):

    connection = database.connect()
    connection.execute(statement 1)
    connection.execute(statement 2)
    connection.execute(statement 3)
    connection.commit()
    connection.close()

Above, that's pretty simple: three statements on one connection, within one transaction. When we invoke statement 2, the work of statement 1 will be available to us, and when invoking statement 3, we'll be able to see the work of statements 1 and 2 as well. Other parts of the program, or other applications using the same database, will in all likelihood not see much of the effects of these three statements until we commit the transaction, due to a behavior known as transaction isolation. The one case where you actually won't have any isolation is if your MySQL database is using MyISAM tables. If you're on MyISAM, most of what I'm writing about here probably doesn't matter, because there are no real transactions present. But assuming you're on InnoDB, the above style is the best way to go.
This is in contrast to this:

    conn1 = database.connect()
    conn1.execute(statement 1)
    conn2 = database.connect()
    conn2.execute(statement 2)
    conn2.commit()
    conn3 = database.connect()
    conn3.execute(statement 3)
    conn1.commit()
    conn3.commit()

So above, we illustrate using three separate connections to emit the three statements, in three separate transactions. Just to make it even more disastrous, I've interleaved the first and third transactions. If you're using InnoDB, the above system will be very prone to issues, assuming there's some degree of relationship between those three statements. The effects of statement 1 won't be visible to statement 2, nor will statement 3 have any context. The risk of conflicts and locks is greatly increased as well. We're mixing up transactions, and additionally putting a lot more burden on the database, with three separate connections in use as well as three individual commits.

So when using SQLAlchemy, we want to make sure the way a series of steps proceeds is closer to the first version. When mixing the Core and ORM as you're doing, this is doable, but you need to invoke the Core and ORM queries using the same context. Normally, if you invoke a Core statement like this:

    stmt = table.select()
    result = engine.execute(stmt)

or even like this:

    result = stmt.execute()

what we're using above is a feature called implicit execution, where the engine checks out a connection from the pool for just that one statement, so each statement can land on a different connection, i.e. the second pattern above. To stay with the first pattern, invoke Core statements through the ORM's Session, e.g. session.execute(stmt), so they ride along on the Session's connection and transaction.
[sqlalchemy] storing a large file into a LargeBinary question
Greetings everyone, I have a piece of code in a web app where I need to store a large binary file (an uploaded file stored on disk by the Apache server) in an object's LargeBinary attribute. That's pretty easy to do with syntax like:

    myobject.uploaded_file = xyz.file.read()

The problem is that I don't want to load the entire file into memory when I set the LargeBinary attribute. If my understanding is correct, the above call will first cause the entire content of the uploaded file to be loaded into memory, and then that is assigned to the myobject.uploaded_file LargeBinary attribute. Correct? (Then sqlalchemy eventually issues the INSERT statement to store the data in the database... but I don't really know how the data transfer is done.) I have tried to find another way of passing the data to the LargeBinary object that would not load the entire file into memory at once, but would stream it in chunks during the INSERT statement, and I was not able to find anything. :-( Has anyone managed to implement something like this before, or does anyone know where I can read more about possible ways of doing this with sqlalchemy?

Thanks a lot,
Andre

--
André Charbonneau
Research Computing Support Analyst
Shared Services Canada | National Research Council Canada
Services partagés Canada | Conseil national de recherches Canada
100 Sussex Drive | 100, promenade Sussex
Ottawa, Ontario K1A 0R6 Canada
andre.charbonn...@ssc-spc.gc.ca
Telephone | Téléphone: 613-993-3129
Re: [sqlalchemy] storing a large file into a LargeBinary question
One piece of extra information I forgot to mention in my original email: the myobject.uploaded_file LargeBinary column attribute maps to a Postgresql bytea column.

Best regards,
Andre

On 13-02-12 03:22 PM, Andre Charbonneau wrote:
> I have tried to find another way of passing the data to the LargeBinary object that would not load the entire file into memory at once, but would stream it in chunks during the INSERT statement, and I was not able to find anything.
Re: [sqlalchemy] storing a large file into a LargeBinary question
On Feb 12, 2013, at 3:22 PM, Andre Charbonneau andre.charbonn...@nrc-cnrc.gc.ca wrote:

> I have tried to find another way of passing the data to the LargeBinary object that would not load the entire file into memory at once, but would stream it in chunks during the INSERT statement, and I was not able to find anything. :-(

In the old days these streaming binary interfaces were common, but as memory has become plentiful you don't see them used anymore. Even in systems like Oracle, you see client libraries reliant upon having to set the allowed size of memory to be bigger than the largest value you need to store. psycopg2 does work with buffer() and memoryview() objects as the input to a bytea column, and you could send these in as arguments, where SQLAlchemy should pass them through (or if not, it's easy to make a custom type that passes them through). These objects don't appear to work around having to load the data into memory, though; they just make memory usage more efficient by removing the need for it to be copied internally. I'm not intimately familiar enough with them to know whether they support some way to stream from a file handle.
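Passing a memoryview into a LargeBinary column can be sketched as below. SQLite and the table name are stand-ins here; with psycopg2/bytea the same pattern avoids the driver copying the buffer internally, though as noted above the data is still read into memory once:

```python
from sqlalchemy import (Column, Integer, LargeBinary, MetaData, Table,
                        create_engine)

# Illustrative schema; 'files'/'uploaded_file' are hypothetical names.
engine = create_engine('sqlite://')
meta = MetaData()
files = Table('files', meta,
              Column('id', Integer, primary_key=True),
              Column('uploaded_file', LargeBinary))
meta.create_all(engine)

# A memoryview wraps the bytes without copying them; DBAPI drivers
# that accept the buffer protocol (sqlite3, psycopg2) take it as-is.
data = memoryview(b'pretend this is a large upload')

with engine.begin() as conn:
    conn.execute(files.insert().values(id=1, uploaded_file=data))
    stored = conn.execute(files.select()).fetchone()

print(bytes(stored.uploaded_file))  # b'pretend this is a large upload'
```

In a real app the memoryview would wrap an mmap of the uploaded file rather than a bytes literal, which keeps the kernel in charge of paging the data instead of holding a second Python-level copy.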
There's also a facility I'd not previously heard of in Postgresql and psycopg2 called the large object system, which stores the data in an entirely separate table, pg_largeobject. Dropping into psycopg2, you can store and retrieve these objects using the lobject interface:

http://initd.org/psycopg/docs/connection.html#connection.lobject

As far as getting that data into your table, it seems you'd need to link to the OID of your large object rather than using bytea:

http://www.postgresql.org/docs/current/static/lo-funcs.html

So you'd need to forego the usage of bytea. Again, SQLAlchemy types could be created which transparently perform these tasks against the OID. I'd ask on the psycopg2 list which feature they recommend, and I'm betting they will likely say that memory is very cheap and plentiful these days and you should just assume the data will fit into memory.
Re: [sqlalchemy] custom __init__ methods not being invoked
It doesn't appear that the method decorated by @orm.reconstructor is called on objects retrieved/loaded as relationships. Not my desired behavior, but I guess it is consistent with the docs: "When instances are loaded during a Query operation as in query(MyMappedClass).one(), init_on_load is called." So if I need it to be executed in a relationship-loading situation, what's the best way to go about it? Thanks.

— RM

On Mon, Jan 7, 2013 at 3:36 AM, Ryan McKillen ryan.mckil...@gmail.com wrote:
> Worked like a charm. Thanks.

On Mon, Jan 7, 2013 at 6:26 PM, Michael van Tellingen michaelvantellin...@gmail.com wrote:
> See http://docs.sqlalchemy.org/en/latest/orm/mapper_config.html#constructors-and-object-initialization

On Mon, Jan 7, 2013 at 4:47 AM, RM ryan.mckil...@gmail.com wrote:
> I have a class which inherits from Base. My class has a metaclass which inherits from DeclarativeMeta. Among other things, the metaclass adds an __init__ method to the class dictionary. When I instantiate an instance of my class directly, my __init__ method is invoked, but if I use the ORM to retrieve an instance, my __init__ method is not invoked. A metaclass serves better than a mixin for what I am trying to accomplish. However, I did experiment with a mixin and saw the same behavior. Any ideas? Many thanks.
Re: [sqlalchemy] custom __init__ methods not being invoked
It's called in all SQL loading scenarios, including that of relationships. A relationship load might not actually result in the object being loaded from the DB in these scenarios:

1. The relationship is a simple many-to-one, and the object could be located by primary key in the identity map without emitting a SQL load.
2. The relationship emitted the SQL, but as it loaded the rows, the objects matching those rows were already in the identity map, so they weren't reconstructed.

In both scenarios above, the objects could only be present in the identity map in three possible ways:

1. They were loaded at some point earlier, in which case your reconstructor was called.
2. They moved from pending to persistent, meaning you added them with add() and then they got inserted, so you'd want to make sure whatever your regular __init__ does is appropriate here.
3. The objects were detached and were add()ed back into the session, but this still implies that #1 or #2 was true for a previous Session.

On Feb 12, 2013, at 5:29 PM, Ryan McKillen ryan.mckil...@gmail.com wrote:

> It doesn't appear that the method decorated by @orm.reconstructor is called on objects retrieved/loaded as relationships. So if I need it to be executed in a relationship-loading situation, what's the best way to go about it?
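The __init__ vs. reconstructor distinction discussed in this thread can be sketched as follows (modern declarative_base import; SQLite and the `Thing` class are illustrative):

```python
from sqlalchemy import Column, Integer, create_engine
from sqlalchemy.orm import declarative_base, reconstructor, sessionmaker

Base = declarative_base()


class Thing(Base):
    __tablename__ = 'thing'
    id = Column(Integer, primary_key=True)

    def __init__(self, **kw):
        # Runs only when *you* construct the object.
        super().__init__(**kw)
        self.source = 'constructor'

    @reconstructor
    def init_on_load(self):
        # Runs when the ORM rebuilds an instance from a row, including
        # relationship loads (subject to the identity-map caveats above).
        self.source = 'database load'


engine = create_engine('sqlite://')
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

session.add(Thing(id=1))
session.commit()
session.expunge_all()  # force the next query to load fresh from the DB

loaded = session.query(Thing).one()
print(loaded.source)  # database load
```

Without the expunge_all(), the committed object would still be in the identity map and the query would hand back the same instance, with the reconstructor not re-invoked, which is exactly the subtlety described in the numbered scenarios above.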