Re: [sqlalchemy] MSSQL String columns often incorrectly encoded as NVARCHAR

Mike Bayer Fri, 29 Mar 2019 06:57:06 -0700

On Fri, Mar 29, 2019 at 6:20 AM Ian Wagner <ianthetec...@gmail.com> wrote:
>
> Hello all,
>
> I'm trying to get to the bottom of an issue in which Python 3 (unicode by 
> definition) strings are always encoded as NVARCHAR for at least two backends 
> (pymssql and pyodbc). Using bytstrings as comparison arguments (for example 
> Table.column == value.encode('utf-8')) sends a regular string literal as 
> expected, but regular strings are encoded as NVARCHAR literals.
>
> This behavior is fairly logical at the underlying driver (pymssql or pyodbc) 
> level, which is why I'm posting here. I believe the the use of a String 
> column (as opposed to a Unicode column) type should not pass an NVARCHAR 
> literal. Doing so has disastrous performance implications, as SQL Server ends 
> up casting the whole column up. This will wreak havoc when regularly dealing 
> with large-ish tables (1.7 million rows or so in our case).
>
> I have previously posted with a LOT more details on StackOverflow 
> (https://stackoverflow.com/questions/55098426/strings-used-in-query-always-sent-with-nvarchar-syntax-even-if-the-underlying-c).
>  I also have an MCVE over on GitHub 
> (https://github.com/ianthetechie/pymssql_sqlalchemy_55098426).
>
> In my MCVE, I outline a possible approach for fixing this, but it appears to 
> have some problems. I'm posting here asking for feedback on what's wrong with 
> my approach, and what would be the best way to go about getting this fixed.


seems like we will need some documentation for this as it is confusing
a lot of people.   The issue that introduced this behavior is
https://github.com/sqlalchemy/sqlalchemy/issues/4442 and then that
same user felt it was happening too often in
https://github.com/sqlalchemy/sqlalchemy/issues/4561, however I
clarified that the N prefix only generates if there is no other
context to determine that this is not a non-unicode context.

the NVARCHAR thing should not happen if you are comparing to a
non-NVARCHAR column.   it only occurs when there is no other context
that SQLAlchemy can determine the correct datatype for the Unicode
object being passed.

However, the example case you have on github there seems to be using a
unicode in a VARCHAR comparison so should not see the N.  If it does,
it's a bug.  I will try your test case now.

In the future, please go straight to the SQLAlchemy github issue
tracker with a succinct test case, as this N thing is obviously still
ongoing.



>
> Thanks!
> Ian
>
> --
> SQLAlchemy -
> The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description.
> ---
> You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to sqlalchemy+unsubscr...@googlegroups.com.
> To post to this group, send email to sqlalchemy@googlegroups.com.
> Visit this group at https://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/d/optout.

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] MSSQL String columns often incorrectly encoded as NVARCHAR

Reply via email to