On Thursday, 25 October 2012 at 16:21:49 UTC+2, David Moore wrote:
>
> Hello,
>
Hi,

> In line with what is required for Python 3, would it not make sense to
> insist across the board that bind values to sqlalchemy.String should be
> bytestrings and that bind values to sqlalchemy.Unicode should be unicode
> strings, converting if necessary?

I am using the Unicode class and not String directly, but the code I was
referring to is in the String class.

> I don't think I understand why you would not want that ever.

I am using unicode strings, but I'd like SQLAlchemy to convert them to the
dialect encoding in order to avoid a problem with Oracle indexes not being
used when querying with unicode strings.

> regards,
> Dave Moore
>
> ------------------------------
>
> Hi,
>
> Having had the same problem as above, I would like to use the
> "convert_unicode='force'" flag at engine-configuration level.
> It seems that String tests the value of "convert_unicode" at its own level
> as well as at engine level, but that isn't the case for the 'force' value
> check:
>
>     def bind_processor(self, dialect):
>         if self.convert_unicode or dialect.convert_unicode:
>             if dialect.supports_unicode_binds and \
>                     self.convert_unicode != 'force':
>                 ...
>
>     def result_processor(self, dialect, coltype):
>         wants_unicode = self.convert_unicode or dialect.convert_unicode
>         needs_convert = wants_unicode and \
>             (dialect.returns_unicode_strings is not True or
>              self.convert_unicode == 'force')
>
> Thus I have to create a type decorator for all my Strings even though I
> have set the right flag at engine-configuration level...
>
> Yann

> On Thursday, 2 August 2012 at 23:22:22 UTC+2, Michael Bayer wrote:
>>
>> so the "convert_unicode='force'" flag is not ideal here, as it will
>> spend most of its time checking for decodes necessary on the result-set
>> side, which is a lot of wasted effort.
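A minimal, self-contained sketch of the check Yann quotes (the class names and flag wiring are illustrative stand-ins, not SQLAlchemy's real classes), showing that the `!= 'force'` comparison consults only the type-level flag, so a 'force' value set at the engine/dialect level alone never triggers the bind-side encode path:

```python
# FakeDialect / FakeString are stand-ins mirroring the quoted
# bind_processor logic; they are not SQLAlchemy classes.

class FakeDialect:
    convert_unicode = 'force'      # engine-level setting
    supports_unicode_binds = True

class FakeString:
    convert_unicode = False        # type-level flag left at its default

    def bind_side_encodes(self, dialect):
        # mirrors: "if dialect.supports_unicode_binds and
        #           self.convert_unicode != 'force'" -- note only the
        # type-level flag is consulted for the 'force' comparison
        if self.convert_unicode or dialect.convert_unicode:
            if dialect.supports_unicode_binds and \
                    self.convert_unicode != 'force':
                return False   # dialect binds unicode natively; no encode
            return True
        return False

# engine-level 'force' is ignored, so no encode happens on the bind side
print(FakeString().bind_side_encodes(FakeDialect()))  # False
```

Setting `convert_unicode = 'force'` on the type itself (rather than the dialect) is the only way the sketch returns True with a unicode-capable dialect, which is exactly why a per-type decorator is still needed.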
>> You can customize how strings are handled on the bind side, including
>> per-dialect behavior, using a custom type:
>>
>>     class MyStringType(TypeDecorator):
>>         impl = String
>>
>>         def process_bind_param(self, value, dialect):
>>             if value is not None and dialect.name == "oracle":
>>                 value = value.encode('utf-8')
>>             return value
>>
>> then replace usage of the String(length=XYZ) type with
>> MyStringType(length=XYZ).
>>
>> docs: http://docs.sqlalchemy.org/en/rel_0_7/core/types.html#types-custom
>>
>>
>> On Aug 2, 2012, at 5:41 AM, David Moore wrote:
>>
>> > Hi,
>> >
>> > Further searching seems to indicate this is an Oracle issue, not a
>> > cx_Oracle issue.
>> > http://www.digipedia.pl/usenet/thread/15912/2814/ - same problem with
>> > cx_Oracle
>> > http://support.unify.com/supportforum/viewtopic.php?f=40&t=3823 -
>> > happening with Perl drivers.
>> > http://tao.qshine.com/note/note_ora.htm - happening with ADO.NET
>> > drivers.
>> >
>> > There's also evidence this problem occurs with MS SQL Server as well:
>> > http://www.microsoftfaqs.com/Articles/1148439/Index_not_used_when_UNICODE_%3D_true
>> > - SQL Server 2005
>> > http://blog.brianhartsock.com/2008/12/14/nvarchar-vs-varchar-in-sql-server-beware/
>> >
>> > Although it seems that, in SQL Server at least, it still uses the
>> > index, just in a slower manner, which is probably why we haven't
>> > noticed this issue yet.
>> >
>> > It seems to me that the place to fix this is when the type of the
>> > column is known, which is when SQLAlchemy binds the parameter. There's
>> > some discussion in one of those links that, at the SQL statement
>> > preparation level, you don't know what the column datatype is, so you
>> > can't fix it there.
>> >
>> > Would it be possible for String datatypes to detect unicode values and
>> > encode them with the database charset?
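A quick self-contained sanity check of the bind-side behavior of a custom type like the MyStringType suggested above, using a stand-in dialect object so neither a database nor a SQLAlchemy install is needed; the function below simply reproduces the body of process_bind_param for illustration:

```python
from types import SimpleNamespace

def process_bind_param(value, dialect):
    # encode to the database charset only for Oracle; leave other
    # backends (and None values) untouched
    if value is not None and dialect.name == "oracle":
        value = value.encode('utf-8')
    return value

# stand-in dialect objects; only the .name attribute is consulted
oracle = SimpleNamespace(name="oracle")
postgres = SimpleNamespace(name="postgresql")

print(process_bind_param(u"h\xe9llo", oracle))    # b'h\xc3\xa9llo'
print(process_bind_param(u"h\xe9llo", postgres))  # unchanged
```

Because the conversion is keyed on dialect.name, the same application code keeps sending unicode to PostgreSQL, SQL Server and MySQL unchanged, which addresses the cross-database concern raised later in this thread.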
>> >
>> > As a lower-level temporary solution, you can get cx_Oracle to encode
>> > all unicode parameters to strings, but that's obviously the wrong
>> > thing to do if you have any Unicode columns. This snippet:
>> >
>> >     def InputTypeHandler(cursor, value, numElements):
>> >         if isinstance(value, unicode):
>> >             return cursor.var(str, arraysize=numElements,
>> >                               inconverter=lambda x:
>> >                                   x.encode(cursor.connection.nencoding))
>> >
>> > And then when creating the connection:
>> >
>> >     connection.inputtypehandler = InputTypeHandler
>> >
>> > thanks for the help,
>> >
>> > --
>> > David Moore
>> > Senior Software Engineer
>> > St. James Software
>> > Email: dav...@sjsoft.com
>> >
>> > ----- Original Message -----
>> >> And for anyone else experiencing this issue, there was a subtle
>> >> difference in the execution plans that's now apparent. The statement
>> >> which only selects colid runs a FAST FULL SCAN, successfully converts
>> >> the unicode parameter using SYS_OP_C2C, and uses the index. When you
>> >> select both colid and message, it runs a FULL SCAN, and even though
>> >> it seems to detect that the conversion can be done, it does not use
>> >> the index at all. Perhaps this is actually an Oracle issue?
>> >>
>> >> --
>> >> David Moore
>> >> Senior Software Engineer
>> >> St. James Software
>> >> Email: dav...@sjsoft.com
>> >>
>> >> ----- Original Message -----
>> >>> Hi Michael,
>> >>>
>> >>> Indeed, you are correct - adding the message column to the cx_Oracle
>> >>> query shows the same behaviour as the SQLAlchemy query. Sorry I
>> >>> missed that.
>> >>>
>> >>> I will take these results to the cx_Oracle list on this basis. On my
>> >>> second question, though, is there a way to enforce
>> >>> convert_unicode='force' only on Oracle databases across an
>> >>> application?
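The inconverter logic in the snippet above can be exercised without a live cx_Oracle connection by faking the cursor and connection objects; FakeCursor and the "UTF-8" nencoding value are stand-ins, and, since this sketch is Python 3, `unicode` is spelled `str`:

```python
from types import SimpleNamespace

class FakeCursor:
    # stand-in for a cx_Oracle cursor; only the pieces the handler
    # touches are provided
    connection = SimpleNamespace(nencoding="UTF-8")

    def var(self, typ, arraysize, inconverter):
        # real cx_Oracle returns a bind-variable object; here we just
        # capture the converter so it can be called directly
        return SimpleNamespace(inconverter=inconverter)

def InputTypeHandler(cursor, value, numElements):
    if isinstance(value, str):  # `unicode` in the original Python 2 code
        return cursor.var(str, arraysize=numElements,
                          inconverter=lambda x:
                              x.encode(cursor.connection.nencoding))

var = InputTypeHandler(FakeCursor(), u"h\xe9llo", 1)
print(var.inconverter(u"h\xe9llo"))  # b'h\xc3\xa9llo'
```

Returning None from the handler (as happens for non-string values here) tells cx_Oracle to fall back to its default bind handling.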
>> >>> I know the hooking syntax changed from SQLAlchemy 0.6 to 0.7, so
>> >>> I'd imagine that if there is a way, it would be different for those
>> >>> two versions.
>> >>>
>> >>> thanks,
>> >>>
>> >>> --
>> >>> David Moore
>> >>> Senior Software Engineer
>> >>> St. James Software
>> >>> Email: dav...@sjsoft.com
>> >>>
>> >>> ----- Original Message -----
>> >>>> Hi David -
>> >>>>
>> >>>> I've done some experiments with the script you gave me. Correct me
>> >>>> if I'm wrong (which is very possible), but the attached revised
>> >>>> version appears to demonstrate that the difference between the
>> >>>> cx_oracle and SQLAlchemy versions is really just that the
>> >>>> SQLAlchemy version is running a different SQL string:
>> >>>>
>> >>>>     SELECT test_table.colid, test_table.message
>> >>>>     FROM test_table
>> >>>>     WHERE test_table.colid = :colid_1
>> >>>>
>> >>>> whereas the cx_oracle version, critically, does not ask for the
>> >>>> "message" column:
>> >>>>
>> >>>>     SELECT test_table.colid FROM test_table
>> >>>>     WHERE test_table.colid = :colid
>> >>>>
>> >>>> it's not clear to me if the placement of the "message" column
>> >>>> impacts the Oracle planner, or otherwise if the latency is on the
>> >>>> cx_oracle side regarding buffering of columns or something similar.
>> >>>> But I am able to get the SQLAlchemy version to be as fast as the
>> >>>> cx_oracle version, and the cx_oracle version to be as slow as the
>> >>>> SQLAlchemy version, by just adding/removing the "message" column
>> >>>> from the query.
>> >>>>
>> >>>> The attached script also tries to eliminate the overhead of
>> >>>> SQLAlchemy compiling the SQL construct within the timed portion, as
>> >>>> I was testing this against only 40K rows and wanted to try to get
>> >>>> the results as close as possible.
>> >>>> It's only dealing with the unicode parameter, and it's true that
>> >>>> if you encode the parameter first, both the SQLA and cx_oracle
>> >>>> versions get faster, even to the point that the two-column version
>> >>>> minus unicode is faster than the one-column version with unicode.
>> >>>> My guess is that cx_oracle is detecting a unicode value in the
>> >>>> input and doing something different with the statement overall as a
>> >>>> result of it being present.
>> >>>>
>> >>>> Some typical sets of results are:
>> >>>>
>> >>>>     Setting up table
>> >>>>     sa_both_cols        1.20209717751
>> >>>>     sa_one_col          0.0182020664215
>> >>>>     sa_literal_sql      0.019690990448
>> >>>>     cx_oracle_one_col   0.0839619636536
>> >>>>     cx_oracle_both_cols 1.18438816071
>> >>>>
>> >>>>     Setting up table
>> >>>>     sa_both_cols        1.17753505707
>> >>>>     sa_one_col          0.0785720348358
>> >>>>     sa_literal_sql      0.016618013382
>> >>>>     cx_oracle_one_col   0.0194280147552
>> >>>>     cx_oracle_both_cols 1.15302705765
>> >>>>
>> >>>> While the cx_oracle/SQLA versions seem to compete for the "fast"
>> >>>> slot (which is because we're only running the statement once, and
>> >>>> I'm running on a low-performing Amazon small instance), it's clear
>> >>>> that the "two column" version is where the vast amount of latency
>> >>>> occurs for both systems.
>> >>>>
>> >>>> let me know if you can confirm similar results on your end. I
>> >>>> tested only with SQLAlchemy 0.8 in trunk, but I was able to
>> >>>> reproduce your initial results, so I assume the experience with 0.7
>> >>>> and 0.6 would be similar.
>> >>>>
>> >>>> On Aug 1, 2012, at 6:40 AM, David Moore wrote:
>> >>>>
>> >>>>> Hi all,
>> >>>>>
>> >>>>> This is a bug in a product with multiple versions in support.
>> >>>>> The issue can be reproduced with:
>> >>>>>     sqlalchemy: 0.6.1, 0.6.6, 0.7.8
>> >>>>>     cx_Oracle: 5.0.4, 5.1, 5.1.2
>> >>>>>     Oracle: 10g XE and 10g Enterprise
>> >>>>>
>> >>>>> We have tables created with sqlalchemy.String primary keys and
>> >>>>> other indexes. When querying through SQLAlchemy, when we pass in a
>> >>>>> unicode value for the key, Oracle does not use the index, and
>> >>>>> instead performs a full table scan. When executing the exact same
>> >>>>> query through cx_Oracle, we don't see this issue. The attached
>> >>>>> script is a small test case which demonstrates the issue. Sample
>> >>>>> output:
>> >>>>>
>> >>>>>     Running select
>> >>>>>     Unicode params took 3.455 seconds
>> >>>>>     Encoded params took 0 seconds
>> >>>>>     Running cx_Oracle select
>> >>>>>     Unicode params took 0.0619998 seconds
>> >>>>>     Encoded params took 0 seconds
>> >>>>>
>> >>>>> And running the cx_Oracle select first to prevent pollution by
>> >>>>> caching:
>> >>>>>
>> >>>>>     Running cx_Oracle select
>> >>>>>     Unicode params took 0.0469999 seconds
>> >>>>>     Encoded params took 0 seconds
>> >>>>>     Running select
>> >>>>>     Unicode params took 3.267 seconds
>> >>>>>     Encoded params took 0.0159998 seconds
>> >>>>>
>> >>>>> As can be seen, there's a slight slowdown with the unicode
>> >>>>> parameter on cx_Oracle, but nothing compared to the slowdown on
>> >>>>> SQLAlchemy. The SQLAlchemy unicode-parameter select is the only
>> >>>>> one which performs a full table scan, so that makes sense.
>> >>>>>
>> >>>>> This issue has been mentioned on this list before - 15 Feb with
>> >>>>> title "Full Table scan with Oracle due to Charset conversion" and
>> >>>>> 4 April with title "Problem with Oracle requests" - but in those
>> >>>>> cases, the issue was reproducible with the cx_Oracle driver as
>> >>>>> well, which it is not here.
>> >>>>>
>> >>>>> Using the convert_unicode='force' argument to the String types
>> >>>>> does solve the issue, but there are a few reasons I don't want to
>> >>>>> do that. This product also supports PostgreSQL, SQL Server and
>> >>>>> MySQL, and I do not want to make changes which will affect running
>> >>>>> under those databases. This is also happening in an otherwise very
>> >>>>> stable product under long-term support in a number of places, so
>> >>>>> whatever change is made needs to be minimal.
>> >>>>>
>> >>>>> Adding an additional function-based index on SYS_OP_C2C(column)
>> >>>>> for each of the affected columns also solves this issue, with a
>> >>>>> slight performance cost for the extra indexes which need to be
>> >>>>> kept up to date. This is the stopgap solution we've gone for in
>> >>>>> the meantime for those systems in production. It's not ideal,
>> >>>>> though, and a bit difficult to automate and then verify the
>> >>>>> results. We'd prefer to know why this was happening, and maybe fix
>> >>>>> a SQLAlchemy issue in the process.
>> >>>>>
>> >>>>> So, my questions are:
>> >>>>> * What is SQLAlchemy doing differently from straight cx_Oracle
>> >>>>>   that triggers this behaviour? How can I debug this further to
>> >>>>>   find out?
>> >>>>> * Is there a way, on SQLAlchemy 0.6.x, to set up a hook which will
>> >>>>>   use convert_unicode='force' only on an Oracle database and only
>> >>>>>   for String columns, automatically across an application? There's
>> >>>>>   no create_engine argument for convert_unicode='force', which is
>> >>>>>   a pity.
>> >>>>> * And a hook for SQLAlchemy 0.7.x to achieve the same result?
>> >>>>>
>> >>>>> Thanks in advance for any help,
>> >>>>> regards,
>> >>>>> --
>> >>>>> David Moore
>> >>>>> Senior Software Engineer
>> >>>>> St.
>> >>>>> James Software
>> >>>>> Email: dav...@sjsoft.com
>> >>>>>
>> >>>>> --
>> >>>>> You received this message because you are subscribed to the Google
>> >>>>> Groups "sqlalchemy" group.
>> >>>>> To post to this group, send email to sqlal...@googlegroups.com.
>> >>>>> To unsubscribe from this group, send email to
>> >>>>> sqlalchemy+...@googlegroups.com.
>> >>>>> For more options, visit this group at
>> >>>>> http://groups.google.com/group/sqlalchemy?hl=en.
>> >>>>>
>> >>>>> <oracle_sqlalchemy_encoding_error.py>
>
> --
> David Moore
> Senior Software Engineer
> St. James Software
> Email: dav...@sjsoft.com

--
You received this message because you are subscribed to the Google Groups
"sqlalchemy" group.
To view this discussion on the web visit
https://groups.google.com/d/msg/sqlalchemy/-/ZeC7Vq9TvwQJ.
To post to this group, send email to sqlalchemy@googlegroups.com.
To unsubscribe from this group, send email to
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at
http://groups.google.com/group/sqlalchemy?hl=en.