Re: [sqlalchemy] prepared statements
On Feb 6, 2014, at 12:34 PM, Rick Otten rottenwindf...@gmail.com wrote:

> Hello, I'm using SQLAlchemy 0.8.4 with PostgreSQL 9.3. I have a situation where I need to run the same set of queries millions of times in one session. I think it would be very helpful from a performance perspective if I could prepare my query ahead of time. (The data under the queries is not changing significantly during a run.) Is this possible? Is SQLAlchemy already doing it behind the scenes for me, magically?

The Python DBAPI (see http://www.python.org/dev/peps/pep-0249/) doesn't have an explicit "prepared statements" construct. What it does have is the executemany() method, which passes a statement and a list of parameter sets to the DBAPI; the driver can then decide to use prepared statements, or whatever other means it prefers, to optimize the approach. SQLAlchemy documents the use of executemany() most fully in the Core tutorial: http://docs.sqlalchemy.org/en/rel_0_9/core/tutorial.html#executing-multiple-statements

> I was imagining/hoping I'd find something like this:
>
> # prepare the query:
> myPreparedQuery = mySession.query(stuff).filter(parameter definitions).prepare()
>
> # run the query whenever I need it during my session:
> myPreparedQuery.parameters(stuff).fetchall()

Prepared statements don't really apply much to SELECT statements; the performance gains from them are marginal, as you typically invoke a particular SELECT just once within a transaction, and prepared statements don't necessarily carry across transaction or connection boundaries on all backends. There are various performance concerns related to SELECT queries, and solving them depends very much on identifying where a particular performance issue resides.
There's overhead on the side of Python/SQLAlchemy formulating the string statement; there's overhead on the side of passing it to the database to be parsed (which is by far the most infinitesimal part); there's overhead in retrieving raw rows and columns; and there's Python/SQLAlchemy overhead in marshaling those rows into Python objects, particularly ORM objects, which are relatively very expensive. For each of these areas of overhead there are different strategies, some more exotic than others, for minimizing it, and you'd want to dig into each one individually after doing profiling. For background on profiling, see http://stackoverflow.com/questions/1171166/how-can-i-profile-a-sqlalchemy-powered-application/1175677#1175677 . I also have a writeup on the performance differences as one moves between Core and ORM, as well as between executemany() and non-executemany(), which is here: http://stackoverflow.com/questions/11769366/why-is-sqlalchemy-insert-with-sqlite-25-times-slower-than-using-sqlite3-directly/11769768#11769768 .

signature.asc Description: Message signed with OpenPGP using GPGMail
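The two suggestions above, batching parameter sets through executemany() and profiling before optimizing, can be sketched with the standard library alone. This is an illustrative sketch, not the thread author's code: sqlite3 stands in for any DBAPI driver, and the table and column names are invented.

```python
import cProfile
import io
import pstats
import sqlite3

def run():
    # executemany() hands one statement plus many parameter sets to the
    # driver, which is then free to prepare the statement once and reuse it.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO users (id, name) VALUES (?, ?)",
        [(i, "user%d" % i) for i in range(10000)],
    )
    return conn.execute("SELECT count(*) FROM users").fetchone()[0]

# Profile first, optimize second: sort by cumulative time to see which
# layer (statement formulation, driver, row fetching) dominates.
prof = cProfile.Profile()
count = prof.runcall(run)
buf = io.StringIO()
pstats.Stats(prof, stream=buf).sort_stats("cumulative").print_stats(5)
print(count)  # 10000
```

The profiler output in `buf` shows where the time actually goes, which is the prerequisite for choosing among the strategies described above.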
Re: [sqlalchemy] prepared statements
On Feb 6, 2014, at 12:34 PM, Rick Otten rottenwindf...@gmail.com wrote:

> Hello, I'm using SQLAlchemy 0.8.4 with PostgreSQL 9.3. I have a situation where I need to run the same set of queries millions of times in one session.

After sending that, I just read the words "same set of queries millions of times in one session". That raises a red flag for me. If it were me and I had to pull millions of rows from a database, I'd be looking for ways to SELECT lots of rows at once, in batches. Millions of individual queries with no option for batching suggests that each subsequent query somehow relies upon the results of the previous one, which would be a pretty exotic case in itself, but I still might see if a more advanced feature like window functions could be used to help with that (a window function lets you write criteria against the rows that have already been returned thus far). That said, if you truly need to run millions of queries, you might want to consider using psycopg2 directly, or at the very least use only SQLAlchemy Core (no ORM) and probably make use of the compiled_cache feature (see http://docs.sqlalchemy.org/en/rel_0_9/core/connections.html?highlight=execution_options#sqlalchemy.engine.Connection.execution_options.params.compiled_cache). It's going to be a time-consuming operation in any case.
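For readers unfamiliar with the compiled_cache feature mentioned above, here is a minimal sketch against an in-memory SQLite database. It is written against the later 1.4-era Core API rather than 0.8 (under 0.8 the select would be spelled `select([nums])`); the table is invented for illustration.

```python
from sqlalchemy import Column, Integer, MetaData, Table, create_engine, select

engine = create_engine("sqlite://")
meta = MetaData()
nums = Table("nums", meta, Column("n", Integer))
meta.create_all(engine)

# compiled_cache takes a dict; repeated executions of the same Core
# construct reuse the compiled SQL string instead of recompiling it.
cache = {}
with engine.connect() as conn:
    conn = conn.execution_options(compiled_cache=cache)
    conn.execute(nums.insert(), [{"n": i} for i in range(5)])
    for _ in range(3):
        rows = conn.execute(select(nums.c.n)).fetchall()

print(len(rows), len(cache) > 0)
```

This skips the per-execution string-compilation overhead; it does not make the database itself prepare anything.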
Re: [sqlalchemy] prepared statements
On Thu, Feb 6, 2014 at 3:08 PM, Michael Bayer mike...@zzzcomputing.com wrote:

> > I was imagining/hoping I'd find something like this:
> >
> > # prepare the query:
> > myPreparedQuery = mySession.query(stuff).filter(parameter definitions).prepare()
> >
> > # run the query whenever I need it during my session:
> > myPreparedQuery.parameters(stuff).fetchall()
>
> prepared statements don't really apply much to SELECT statements, the performance gains from such are marginal as you typically invoke a particular SELECT just once within a transaction, and prepared statements don't necessarily carry across transaction or connection boundaries on backends.

That's simply false, because of the following:

> There are various performance concerns related to SELECT queries and solving them depends much on identifying where a particular performance issue resides. There's overhead on the side of Python/SQLAlchemy formulating the string statement, there's overhead on the side of passing it to the database to be parsed (which is by far the most infinitesimal part)

That part is most definitely not always so infinitesimally small. I recall one case (granted, one among so many) of a complex query that was very fast to execute. The query had dozens of joins, so it took considerable planning time on the database side: planning took around 300ms, whereas execution took only 25ms. So, having spotted the bottleneck, I switched to using prepared statements (it was a query that was run several times in the session, not millions of times, but several) with different arguments. The benefit was considerable. To do that (I was using SQLA 0.5), I had to compile the select object and generate the query string; I used text in the bindparams to replace them with $1, $2, etc. (as postgres likes it), and then built a PREPARE statement and an EXECUTE one. It was rather simple, having SQLA generate the query string. The result was a 10-fold increase in performance.
DBAPI doesn't have direct support for prepared statements, but hacking them in isn't that difficult. You just have to make sure you send the PREPARE, the EXECUTE, and the DEALLOCATE (never forget the DEALLOCATE) on the same connection. Which is easy: just ask SQLAlchemy's session for its connection and use it.

-- You received this message because you are subscribed to the Google Groups sqlalchemy group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at http://groups.google.com/group/sqlalchemy. For more options, visit https://groups.google.com/groups/opt_out.
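A hedged sketch of that hack: the helper below shows the PREPARE/EXECUTE/DEALLOCATE lifecycle pinned to a single DBAPI connection. The stub connection merely records the SQL it is sent, standing in for a real psycopg2 connection (which you would get from the session, as described above); all names here are invented for illustration.

```python
class PreparedStatement:
    """PREPARE once, EXECUTE many times, DEALLOCATE at the end --
    all three on the same DBAPI connection."""

    def __init__(self, conn, name, sql):
        # sql uses postgres-style $1, $2 placeholders
        self.conn, self.name = conn, name
        conn.cursor().execute("PREPARE %s AS %s" % (name, sql))

    def execute(self, *args):
        cur = self.conn.cursor()
        marks = ", ".join("%s" for _ in args)
        cur.execute("EXECUTE %s (%s)" % (self.name, marks), args)
        return cur.fetchall()

    def close(self):
        # never forget the DEALLOCATE
        self.conn.cursor().execute("DEALLOCATE %s" % self.name)


class StubCursor:
    """Records SQL instead of talking to a server."""
    def __init__(self, log):
        self.log = log

    def execute(self, sql, params=None):
        self.log.append(sql)

    def fetchall(self):
        return []


class StubConnection:
    def __init__(self):
        self.log = []

    def cursor(self):
        return StubCursor(self.log)


conn = StubConnection()
ps = PreparedStatement(conn, "q1", "SELECT * FROM users WHERE id = $1")
ps.execute(42)
ps.close()
print([s.split()[0] for s in conn.log])  # ['PREPARE', 'EXECUTE', 'DEALLOCATE']
```

With a real connection, the EXECUTE's %s placeholders would be filled in by the driver; the key point is that all three statements go through the same connection object.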
Re: [sqlalchemy] Automap and naming of relationship attributes
Well, using the mapper event would be nicer, but in any case I was already iterating over Base.classes and adding them to my own module's namespace like so:

    globals()[cls.__name__] = cls

It works for the rest of my application being able to see the classes by importing the module, but apparently not for this. I'm not really expert at Python class and namespace innards, but from the error message, as well as the default str() output, it seems the automap-generated classes consider themselves to be in the sqlalchemy.ext.automap module but are not registered in that namespace. Is there a way to tell the classes to use a different namespace from an instrument_class handler? (And incidentally, I'm already using my own base class through automap_base(declarative_base(cls=...)), but that doesn't make any difference.)

On 2014.2.6, at 15:59, Michael Bayer mike...@zzzcomputing.com wrote:

> Python pickle can't pickle class instances where the class isn't locatable as a module-level import. As automap necessarily creates classes on the fly, these classes aren't part of any module. To have them be part of a module you'd want to use an event to place them in the namespace of one of your own modules, or you can implement a custom `__reduce__()` method on them (see the Python docs for __reduce__()). A good event to use here might be instrument_class: http://docs.sqlalchemy.org/en/rel_0_9/orm/events.html#sqlalchemy.orm.events.MapperEvents.instrument_class

On Feb 6, 2014, at 4:32 AM, Adrian Robert adrian.b.rob...@gmail.com wrote:

> One other point: I was trying out the dogpile cache example and ran into (after I stuck a .encode('utf-8') into the key mangler, since I'm using Python 3 and pylibmc):
>
>     _pickle.PicklingError: Can't pickle class 'sqlalchemy.ext.automap.Person': attribute lookup sqlalchemy.ext.automap.Person failed
>
> This was fixed by a hack, sqlalchemy.ext.automap.__dict__[cls.__name__] = cls, run over all the automap-created classes.
It might be I'm only having to do this because I'm doing something wrong elsewhere, but I just thought I'd mention it in case it comes up for someone.

On 2014.2.2, at 14:22, Adrian Robert adrian.b.rob...@gmail.com wrote:

> Thanks, that works beautifully. I had noticed the name_for_scalar_relationship parameter, but I guess I wasn't confident enough that I understood what was going on to try it. :-[
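The pickling failure and the globals() fix discussed in this thread come down to plain Python mechanics and can be shown without automap at all. The sketch below fabricates a class at runtime (as automap does) and then re-homes it so pickle can locate it; the class name is invented.

```python
import pickle
import sys

# Simulate an automap-style class: created at runtime, claiming to live
# in a module where it is not actually registered as an attribute.
Person = type("Person", (object,), {})
Person.__module__ = "sqlalchemy.ext.automap"  # not really registered there

try:
    pickle.dumps(Person())
    failed = False
except Exception:
    # "Can't pickle ...: attribute lookup ... failed"
    failed = True

# The fix: register the class in an importable namespace AND point
# __module__ at it -- the moral equivalent of globals()[cls.__name__] = cls
# done from an instrument_class handler.
Person.__module__ = __name__
setattr(sys.modules[__name__], "Person", Person)
restored = pickle.loads(pickle.dumps(Person()))
print(failed, type(restored).__name__)  # True Person
```

Note that both halves of the fix matter: pickle finds the class by importing `cls.__module__` and looking up `cls.__name__` there, so placing the class in a namespace without updating __module__ (or vice versa) still fails.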
Re: [sqlalchemy] prepared statements
Just wondering: would it be possible to mimic this behavior using a custom view for this select (in postgresql) and then querying that? I think the query planner might only run on the creation.
Re: [sqlalchemy] prepared statements
On Thu, Feb 6, 2014 at 4:00 PM, Jonathan Vanasco jonat...@findmeon.com wrote:

> just wondering -- would it be possible to mimic this behavior using a custom view for this select (in postgresql) and then querying that? i think the query planner might only run on the creation.

I think views don't cache the plan; they're handled as rules. What you'd need is a function (pg's version of stored procedures).
Re: [sqlalchemy] prepared statements
On Feb 6, 2014, at 1:31 PM, Claudio Freire klaussfre...@gmail.com wrote:

> That part is most definitely not always so infinitesimally small. I recall one case (granted, one among so many) of a complex query that was very fast to execute. The query had dozens of joins, so it took considerable planning time on the database side. Planning took around 300ms, where execution took only 25ms. [snip] The benefit was considerable.

Ideally the database would know how to cache plans based on the string form of the statement. Oracle does this. It appears that PostgreSQL does not, unless you specifically use a prepared statement.

> DBAPI doesn't have direct support for prepared statements, but hacking them isn't that difficult. You just have to make sure you send the PREPARE, EXECUTE and the DEALLOCATE (never forget the DEALLOCATE) on the same connection. Which is easy, just ask SQLAlchemy's session for its connection and use it.

psycopg2 is an extremely mature library, and I find it curious that such a feature has not been added. They've posted their rationale here: http://initd.org/psycopg/articles/2012/10/01/prepared-statements-psycopg/ In any case, prepared statements aren't part of the DBAPI nor built into psycopg2 yet, so this is out of SQLAlchemy's realm, thankfully. If you want to use the recipe at that link, it would have to be hand-rolled.
Re: [sqlalchemy] prepared statements
On Thu, Feb 6, 2014 at 4:29 PM, Rick Otten rottenwindf...@gmail.com wrote:

> Thanks Claudio - I'll mull over the pros and cons of explicitly managing the connections to prepare the statements, vs just being patient while the job runs, vs gains I might achieve elsewhere.

Remember, prepared statements will only help if planning time is a considerable fraction of execution time. The queries in need of this would be complex ones. Try EXPLAIN ANALYZE on the queries you repeatedly execute to see whether preparing them is worth the hassle. I don't remember which version of postgres introduced planning time in the explain output, but if yours doesn't have it, just take the difference between the reported execution time and what you time from your end; barring network latency, that difference should be roughly what you'd save by preparing.
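For reference, a hypothetical psql session showing the two numbers to compare (the query and timings are invented; recent postgres versions report planning and execution time as separate lines):

```sql
EXPLAIN ANALYZE SELECT ... ;  -- the complex query you run repeatedly
-- ...
-- Planning time: 300.000 ms    <- the part PREPARE would amortize
-- Execution time: 25.000 ms
```

If the planning line dwarfs the execution line, as in Claudio's earlier example, preparing is worth the hassle.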
Re: [sqlalchemy] prepared statements
On Thu, Feb 6, 2014 at 4:38 PM, Michael Bayer mike...@zzzcomputing.com wrote:

> [snip]
>
> ideally the database would know how to cache plans based on the string form of the statement. Oracle does this. It appears that PostgreSQL does not unless you specifically use a prepared statement.

Well, it cannot. At least not simplistically, because postgres uses the literals in the statement (the ones likely to change) to specialize plan cost, and different values can produce different plans (depending on stats). For instance, a query with a where clause deleted=false that selects 99% of a table would do a sequential scan, whereas deleted=true might use an index (if there were such an index). They thus don't cache plans unless specifically asked to (with PREPARE), because those plans may very well be inferior (they don't use value-specific knowledge).
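That value-specific behavior can be seen directly in psql; a hypothetical session (table, column, and index are invented):

```sql
-- Same statement text, different ideal plans per parameter value:
PREPARE q (boolean) AS SELECT * FROM tickets WHERE deleted = $1;
EXPLAIN EXECUTE q(false);  -- matches ~99% of rows: sequential scan
EXPLAIN EXECUTE q(true);   -- matches few rows: may use an index on deleted
DEALLOCATE q;
```

A plan cached purely by statement string would have to pick one of those plans for both values, which is exactly the tradeoff described above.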
Re: [sqlalchemy] prepared statements
On Feb 6, 2014, at 2:43 PM, Claudio Freire klaussfre...@gmail.com wrote:

> [snip]
>
> Well, it cannot. At least not simplistically, because postgres uses the literals in the statement (the ones likely to change) to specialize plan cost, and different values can produce different plans (depending on stats). For instance, a query with a where clause deleted=false that selects 99% of a table would do a sequential scan, whereas deleted=true might use an index (if there was such an index). They thus don't cache plans unless specifically asked for (with prepare), because those plans may very well be inferior (they don't use value-specific knowledge).

Right, but from an API point of view, it's more tedious, since we have to track the handle on the outside. This could be implemented as a cursor event, even in conjunction with a custom execution option, and possibly using compiled_cache as well.
Re: [sqlalchemy] Automap and naming of relationship attributes
On Feb 6, 2014, at 1:56 PM, Adrian Robert adrian.b.rob...@gmail.com wrote:

> Well, using the mapper event would be nicer, but in any case I was already iterating over Base.classes and adding them to my own module's namespace like so: globals()[cls.__name__] = cls
>
> [snip]
>
> Is there a way to tell the classes to use a different namespace from an instrument_class handler? (And incidentally I'm already using my own base class through automap_base(declarative_base(cls=...)) but that doesn't make any difference.)

I'm seeing that too, and working with a rudimentary example I'm not seeing a solution to it. You can, as automap suggests, create the classes explicitly.
Re: [sqlalchemy] prepared statements
On Thu, Feb 6, 2014 at 4:47 PM, Michael Bayer mike...@zzzcomputing.com wrote:

> [snip]
>
> right, but from an API POV, its more tedious since we have to track the handle on the outside. This could be implemented as a cursor event even in conjunction with a custom execution option and possibly using compiled_cache as well.

That would be tricky. You'd have to maintain a tally of which connection prepared which query (think of the complications of multi-engine setups), and make sure you deallocate the handle before returning the connection to the pool. Not to mention coming up with unique handles.
Re: [sqlalchemy] prepared statements
On Feb 6, 2014, at 2:58 PM, Claudio Freire klaussfre...@gmail.com wrote:

> [snip]
>
> That would be tricky. You'd have to maintain a tally of which connection prepared which query (think the complications of multi-engine setups), and make sure you deallocate the handle before returning the connection to the pool. Not to mention coming up with unique handles.

You can use a per-connection cache via connection.info, and you can also manage the lifecycle of the connection and the cache of items with connection and/or pool events as well (e.g., whether connection.rollback() clears out prepared statements). Lots of features already do things like this, such as keeping track of XA handles. It wouldn't really be hard at all, actually, as long as the DBAPI itself behaves nicely when lots of handles are allocated. The execution_options functionality would then be usable as a way to enable the feature on a per-connection or per-statement basis, like:

    stmt = select([some_table]).execution_options(pg_prepared_statement=True)
    conn.execute(stmt)

The string form of stmt could be linked to the generated handle, so that any number of stmt objects whose string form matches the target would work.
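A rough sketch of that bookkeeping, hedged: sqlite stands in for postgres here, so the "handle" is just a generated name rather than a real PREPARE, but the per-connection cache in connection.info and the event hookup are the real SQLAlchemy mechanisms (spelled against a later 1.4-era API; the pg_prepared_statement option itself remains hypothetical).

```python
from sqlalchemy import create_engine, event

engine = create_engine("sqlite://")

@event.listens_for(engine, "connect")
def init_cache(dbapi_conn, connection_record):
    # .info lives as long as the underlying DBAPI connection does,
    # surviving checkouts and checkins through the pool.
    connection_record.info["prepared"] = {}

def handle_for(conn, stmt_text):
    # Map statement text -> handle name, per DBAPI connection.
    cache = conn.connection.info["prepared"]
    if stmt_text not in cache:
        cache[stmt_text] = "stmt_%d" % len(cache)  # real code would PREPARE here
    return cache[stmt_text]

with engine.connect() as conn:
    h1 = handle_for(conn, "SELECT * FROM t WHERE id = $1")
    h2 = handle_for(conn, "SELECT * FROM t WHERE id = $1")  # cache hit

print(h1, h2)  # stmt_0 stmt_0
```

A pool "reset" or "close" event would be the place to send the DEALLOCATEs, addressing the return-to-pool concern raised above.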
Re: [sqlalchemy] SQLite: OperationalError when decoding 0x92 in TEXT column
Hmm, this one has me stumped. As best I can tell after poking at it using the column_reflect event, a custom dialect, etc., the issue here is that in pysqlite.py we (in my Python 3.3 install) are selecting `sqlite3.dbapi2` as the DBAPI interface, but we aren't telling sqlite3 anything about how to treat unicode errors. From what I am reading (but it seems inconsistent, maybe?), sqlite3 automatically decodes all database-retrieved values for text fields from their bytes, returning unicode strings. Except... that doesn't always seem to be true. I hex-edited a db file to change the utf-8 string 'hello' to 'hell' + 0x92, and sqlite3 switched from returning 'hello' to b'hell\x92', or something like that; I've been poking at this for so long I've lost track of that transcript. One can override sqlite3's text factory, apparently, with (for instance) `sqlite3.text_factory = lambda x: x.decode('utf-8', errors='ignore')`. Maybe the key is to try and find a way to trigger that from sqlalchemy? I tried and failed; maybe someone else can point me back to the path?

Just to re-summarize the problem: in Python 3, I'm getting errors trying to read a row from a sqlite database that has a TEXT column with an invalid utf-8 sequence (specifically, the singleton byte 0x92). I'd love to just have sqlalchemy move along and ignore the byte, but I'm not clear how to do that.

On Tuesday, February 4, 2014 4:33:46 AM UTC-8, Simon King wrote:

> I've not done much with reflection, but perhaps you could use the column_reflect event: snip
Re: [sqlalchemy] SQLite: OperationalError when decoding 0x92 in TEXT column
On Feb 6, 2014, at 6:59 PM, Erich Blume blume.er...@gmail.com wrote:

> [snip]
>
> Just to re-summarize the problem: in Python 3, I'm getting errors trying to read a row from a sqlite database that has a TEXT column with an invalid utf-8 sequence (specifically, the singleton byte 0x92). I'd love to just have sqlalchemy move along and ignore the byte, but I'm not clear how to do that.

Pysqlite (i.e. the sqlite3 module) returns TEXT as Python unicode out of the gate. That exception message is being raised by sqlite3 itself; SQLAlchemy is just a pass-through, as the string type knows that on sqlite the value is already unicode. You might need to CAST() the value as BINARY, perhaps; not sure. You'd first want to get a plain sqlite3 script to do what you want. Setting a "text_factory" at the module level of sqlite3 is certainly easy enough, but that seems way too broad. Ideally you'd want to be able to get the value on a per-column basis.
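A plain sqlite3 script along those lines, showing both the failure and a per-connection (rather than module-global) text_factory workaround; the table and the 0x92-contaminated value are fabricated to reproduce the report:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (s TEXT)")
# 'hell' followed by the invalid utf-8 byte 0x92, as in the hex-edited file
conn.execute("INSERT INTO t VALUES (CAST(X'68656c6c92' AS TEXT))")

# Default behavior: sqlite3 tries to decode TEXT as utf-8 and raises.
try:
    conn.execute("SELECT s FROM t").fetchone()
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)

# Per-connection workaround: decode leniently instead of raising.
conn.text_factory = lambda b: b.decode("utf-8", errors="ignore")
value = conn.execute("SELECT s FROM t").fetchone()[0]
print(error is not None, value)  # True hell
```

Since text_factory is a Connection attribute, this is narrower than patching the sqlite3 module, though still coarser than the per-column handling suggested above.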
Re: [sqlalchemy] prepared statements
On Thursday, February 6, 2014 2:18:51 PM UTC-5, Klauss wrote:

> I think views don't cache the plan, they're handled as rules. What you'd need is a function (pg's version of stored procedures).

I had time to look it up; this generally seems correct. Looking at some explain syntax, it seems like the plans for join conditions under the view are kept, but you still get hit with a plan onto the view. I could be interpreting this wrong, and this likely has minimal use.

Something that might be applicable to the original poster's usage is a parallel search. We have a lightweight read-through cache working with sqlalchemy for 'simple' queries (1-2 kv pairs). Cache fails are aggregated and then tossed into a single select. Dumb example:

    select * from table where id = 1;
    select * from table where id = 2;
    select * from table where id = 3;
    select * from table where id = 4;
    select * from table where id = 5;

becomes

    select * from table where id in (1,2,3,4,5);

The performance improvement is shockingly significant; I think it's a mix of the planning, fetching, and fewer roundtrips/connectivity. If that's an option, I'd suggest trying it.
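The cache-miss aggregation described above, as a minimal stdlib sketch (table and values invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)", [(i, "v%d" % i) for i in range(1, 101)]
)

# Five individual cache misses collapse into one round trip:
missed = [1, 2, 3, 4, 5]
marks = ", ".join("?" for _ in missed)
rows = conn.execute(
    "SELECT id, v FROM t WHERE id IN (%s)" % marks, missed
).fetchall()
print(len(rows))  # 5
```

Only the placeholder list is built from the missed keys; the values themselves still travel as bound parameters, so the batching doesn't reopen any quoting or injection questions.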