Re: [sqlalchemy] Cannot insert strings with a length greater than 2000 into columns with a datatype of Text, VARCHAR, or NVARCHAR using MS SQL Server 2017 and 2019

Nicolas Lykke Iversen Thu, 15 Oct 2020 04:25:17 -0700

I already tested the _UTF8 encoding using SQL Server 2019 in 
Docker: mcr.microsoft.com/mssql/server:2019-latest


The POC fails here too with the error, when the database is created with 
the LATIN1_GENERAL_100_CI_AS_SC_UTF8 collation:  

Traceback (most recent call last):
  File "test.py", line 56, in <module>
    cursor.execute(sql, (content,))
pyodbc.ProgrammingError: ('42000', "[42000] [Microsoft][ODBC Driver 17 for 
SQL Server][SQL Server]Cannot convert to text/ntext or collate to 
'Latin1_General_100_CI_AS_SC_UTF8' because these legacy LOB types do not 
support UTF-8 or UTF-16 encodings. Use types varchar(max), nvarchar(max) or 
a collation which does not have the _SC or _UTF8 flags. (4189) 
(SQLParamData)")

However, it works when using 
cursor.setinputsizes([(pyodbc.SQL_WVARCHAR,0,0),]).

You are not right about the collation being legacy - 
LATIN1_GENERAL_100_CI_AS_SC is used extensively, due to its support for 
supplementary characters. It's text datatype that's legacy:

*If you store character data that reflects multiple languages in SQL Server 
(SQL Server 2005 (9.x) and later), use Unicode data types (nchar, nvarchar, 
and ntext) instead of non-Unicode data types (char, varchar, and text).*

I'm not saying it's SQLAlchemy fault or that SQLAlchemy should fix this 
issue. But if you read my pyodbc issue on Github, you will see that they 
argue that's SQLAlchemy that's using pyodbc incorrectly :(






On Thursday, October 15, 2020 at 1:06:29 PM UTC+2 Simon King wrote:

> Do you know if there is a downside to calling setinputsizes like that?
> To put it another way, why doesn't pyodbc configure itself like that
> automatically? Why do you think this belongs in SQLAlchemy rather than
> pyodbc?
>
> I suspect the answer is that most applications don't need it and there
> is a downside (perhaps a performance implication?).
>
> I've never used SQL Server, but the fact that the error message refers
> to these collations as "legacy" suggests that an alternative collation
> might be better.
>
> https://docs.microsoft.com/en-us/sql/relational-databases/collations/collation-and-unicode-support?view=sql-server-ver15#Supplementary_Characters
> says:
>
> SQL Server 2019 (15.x) extends supplementary character support to
> the char and varchar data types with the new UTF-8 enabled collations
> (_UTF8). These data types are also capable of representing the full
> Unicode character range.
>
> If you can restrict yourself to SQL Server 2019, that might be a better 
> option.
>
> Simon
>
> On Thu, Oct 15, 2020 at 10:08 AM Nicolas Lykke Iversen
> <nly...@gmail.com> wrote:
> >
> > Thank you, Simon.
> >
> > I'm curious whether this is the way to do it in the future, or whether 
> SQLAlchemy should implement varchar(max)properly?
> >
> > What would the argument be for not implementing varchar(max)in the 
> pyodbc dialect?
> >
> > On Thursday, October 15, 2020 at 11:05:32 AM UTC+2 Simon King wrote:
> >>
> >> You could call 'setinputsizes' in a handler for the
> >> 'before_cursor_execute' event, something like this:
> >>
> >>
> >> from sqlalchemy import event
> >>
> >> @event.listens_for(SomeEngine, 'before_cursor_execute')
> >> def receive_before_cursor_execute(conn, cursor, statement,
> >> parameters, context, executemany):
> >> cursor.setinputsizes([(pyodbc.SQL_WVARCHAR,0,0),])
> >>
> >>
> >> 
> https://docs.sqlalchemy.org/en/13/core/events.html#sqlalchemy.events.ConnectionEvents
> >> 
> https://docs.sqlalchemy.org/en/13/core/events.html#sqlalchemy.events.ConnectionEvents.before_cursor_execute
> >>
> >> Hope that helps,
> >>
> >> Simon
> >>
> >>
> >> On Thu, Oct 15, 2020 at 8:27 AM Nicolas Lykke Iversen <nly...@gmail.com> 
> wrote:
> >> >
> >> > Hi Mike,
> >> >
> >> > I have created an issue for pyodbc: 
> https://github.com/mkleehammer/pyodbc/issues/835#issuecomment-708930870
> >> >
> >> > I've gotten really good feedback there from Microsoft, and a fix has 
> been proposed that works:
> >> >
> >> > "You can try to use setinputsizes on your parameter to tell it to 
> send as varchar(max): cursor.setinputsizes([(pyodbc.SQL_WVARCHAR,0,0),])"
> >> >
> >> > I'm by no means an SQLAlchemy expert, but shouldn't the pyodbc 
> dialect be updated to support varchar(max)using the proposed method? If 
> not, how can I execute cursor.setinputsizes([(pyodbc.SQL_WVARCHAR,0,0),]) 
> using SQLAlchemy, so that I can make use of varchar(max)in my application?
> >> >
> >> > Can you recommend a hotfix for using varchar(max)in current 
> SQLAlchemy applications that need to handle Unicode supplementary 
> characters (_SC)?
> >> >
> >> > I appreciate really appreciate your help.
> >> >
> >> > Best regards
> >> > Nicolas
> >> > On Wednesday, October 14, 2020 at 3:36:11 PM UTC+2 Mike Bayer wrote:
> >> >>
> >> >>
> >> >>
> >> >> On Wed, Oct 14, 2020, at 5:35 AM, Nicolas Lykke Iversen wrote:
> >> >>
> >> >> Hi Mike,
> >> >>
> >> >> I've now tested inserting strings with more than 2000 characters 
> using Azure Data Studio (SQL Server GUI) and everything works.
> >> >>
> >> >> Furthermore, I've tested pyodbc (DBAPI) directly, and it only fails 
> when inserting such strings using parameterised SQL queries (it succeeds 
> without using parametrised queries).
> >> >>
> >> >>
> >> >> that would be expected because all the datatype-related issues occur 
> when bound parameters are passed.
> >> >>
> >> >>
> >> >>
> >> >> You can see my POC below, if you have any interest.
> >> >>
> >> >> I guess it should be submitted as a bug to pyodbc... Do you know if 
> I can disable parametrisation for certain SQL queries in SQLAlchemy?
> >> >>
> >> >>
> >> >> there is not and this is definitely an issue that has to be solved 
> at the pyodbc level, either a bug on their end or something in your 
> configuration that has to be changed.
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> Best regards (and thanks for your help and support!!!)
> >> >> Nicolas
> >> >>
> >> >> System info:
> >> >> python v. 3.8.5
> >> >> pyodbc v. 4.0.30
> >> >> msodbcsql17 v. 17.6.1.1
> >> >>
> >> >> POC:
> >> >> import sys
> >> >> import pyodbc
> >> >>
> >> >> host = 'tcp:127.0.0.1,1433'
> >> >> db = 'pyodbc_test'
> >> >> user = 'sa'
> >> >> pwd = 'P@ssw0rd'
> >> >>
> >> >> print('started')
> >> >>
> >> >> cnxn = pyodbc.connect('DRIVER={ODBC Driver 17 for SQL 
> Server};SERVER='+host+';DATABASE='+'master'+';UID='+user+';PWD='+ pwd, 
> autocommit=True)
> >> >> cursor = cnxn.cursor()
> >> >>
> >> >> try:
> >> >> cursor.execute(f'CREATE DATABASE {db} COLLATE 
> Latin1_General_100_CI_AS_SC')
> >> >> except pyodbc.ProgrammingError as e:
> >> >> pass # database exists
> >> >>
> >> >> cursor.execute(f'USE {db}')
> >> >>
> >> >> try:
> >> >> cursor.execute("""
> >> >> CREATE TABLE msg (
> >> >> id int identity(1,1) not null,
> >> >> content varchar(max) not null
> >> >> );""")
> >> >> except pyodbc.ProgrammingError as exc:
> >> >> pass # table exists
> >> >>
> >> >> content = 2000 * 'A'
> >> >>
> >> >> cursor.execute(f"""
> >> >> INSERT INTO msg (content)
> >> >> VALUES ('{content}')""")
> >> >> print(f'non-param: {len(content)=}: success')
> >> >>
> >> >> sql = f"""
> >> >> INSERT INTO msg (content)
> >> >> VALUES (?)"""
> >> >> cursor.execute(sql, (content))
> >> >> print(f'param: {len(content)=}: success')
> >> >>
> >> >> content = 2001 * 'A'
> >> >>
> >> >> cursor.execute(f"""
> >> >> INSERT INTO msg (content)
> >> >> VALUES ('{content}')""")
> >> >> print(f'non-param: {len(content)=}: success')
> >> >>
> >> >> # this fails!
> >> >> sql = f"""
> >> >> INSERT INTO msg (content)
> >> >> VALUES (?)"""
> >> >> cursor.execute(sql, (content))
> >> >> print(f'param: {len(content)=}: success')
> >> >>
> >> >>
> >> >> #cursor.execute('SELECT * FROM msg')
> >> >> #rows = cursor.fetchall()
> >> >> #for r in rows:
> >> >> # print(r)
> >> >>
> >> >> print('finished')
> >> >>
> >> >>
> >> >>
> >> >> On Wednesday, October 14, 2020 at 12:43:25 AM UTC+2 Mike Bayer wrote:
> >> >>
> >> >>
> >> >>
> >> >> On Tue, Oct 13, 2020, at 4:57 PM, Nicolas Lykke Iversen wrote:
> >> >>
> >> >> Thank you, Mike - very much appreciated!
> >> >>
> >> >> Just to be clear, pyodbc is not a driver, it’s a ODBC-compliant 
> DBAPI, right? I separately downloaded a driver for SQL Server from 
> Microsoft, which pyodbc makes use of.
> >> >>
> >> >>
> >> >> right the pyodbc is the DBAPI in this case, which we usually refer 
> to as a "driver" but in the case of ODBC the "driver" is more specifically 
> the separate ODBC driver component.
> >> >>
> >> >>
> >> >> Do you suggest that changing pyodbc to another SQL Server DPAPI 
> would solve the problem?
> >> >>
> >> >>
> >> >> I suggest that if there is no issue with the query you're running 
> outside of the context of pyodbc that you submit an issue to pyodbc at 
> https://github.com/mkleehammer/pyodbc/issues . However I was able to find 
> a discussion thread about your error message that seemed to be independent 
> of ODBC.
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> If so, can you recommend another DBAPI for SQL Server? Or do you 
> think that the problem is caused by Microsoft’s driver?
> >> >>
> >> >>
> >> >> pyodbc is the only supported driver for SQL Server that exists now 
> for Python. You also definitely want to use Microsoft's ODBC drivers so 
> you're already there.
> >> >>
> >> >>
> >> >>
> >> >> I’m pretty sure SQL Server works fine when accessed using .NET, 
> otherwise the Internet would be full of complaints regarding not being able 
> to insert +2000 characters in a varchar(max).
> >> >>
> >> >>
> >> >> you'd want to see if the same ODBC driver and options are in use in 
> that scenario.
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> Best regards
> >> >> Nicolas
> >> >>
> >> >>
> >> >> On Tue, 13 Oct 2020 at 22.22, Mike Bayer <mik...@zzzcomputing.com> 
> wrote:
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> On Tue, Oct 13, 2020, at 10:50 AM, Nicolas Lykke Iversen wrote:
> >> >>
> >> >> Hi SQLAlchemy,
> >> >>
> >> >> System information:
> >> >>
> >> >> Mac OS X v. 10.15.7
> >> >> Python v. 3.8.5
> >> >> SQLAlchemy v. 1.3.19
> >> >> MS SQL Server 2017 and 2019 (both Enterprise and Docker images e.g. 
> mcr.microsoft.com/mssql/server:2019-latest)
> >> >>
> >> >> Problem:
> >> >> I have an issue with inserting strings with a length greater than 
> 2000 into columns with a datatype of Text, VARCHAR, or NVARCHAR using MS 
> SQL Server 2017 and 2019.
> >> >>
> >> >> I've checked the MS SQL Server and it creates it column properly 
> with e.g. a datatype of varchar(max)for Text and VARCHAR, which should be 
> able to store strings with a size up to 2 GB according to Microsoft 
> documentation.
> >> >>
> >> >> Furthermore, I've tried using other collations, but I need _SC 
> (supplementary character) support for my applications, so I cannot drop it, 
> and adding _UTF8 (UTF-8) doesn't solve the problem either.
> >> >>
> >> >> Why am I not allowed to store strings with a size greater than 2000?
> >> >>
> >> >>
> >> >> I'm not really sure, this has to do with a behavior of SQL Server 
> and/or your ODBC driver. a google search finds fairly scant results but 
> there is a long discussion regarding this error here: 
> https://www.sqlservercentral.com/forums/topic/collation-error-when-adding-distributer
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> And why is SQLAlchemy displaying that error message, when trying to 
> insert plain ASCII text ("AAAA...")?
> >> >>
> >> >>
> >> >> SQLAlchemy runs SQL statements using a method called 
> "cursor.execute()", which is a feature of the DBAPI (database driver) in 
> use. As your error message indicates you're using the pyodbc driver, this 
> method is documented here: 
> https://github.com/mkleehammer/pyodbc/wiki/Cursor#executesql-parameters
> >> >>
> >> >> The DBAPI execute() method, and most of the other DBAPI methods, can 
> throw exceptions if something goes wrong. SQLAlchemy has the policy that if 
> it encounters such an exception when it calls cursor.execute(), it wraps it 
> in a SQLAlchemy-specific version of that exception (named the same) and 
> then throws it.
> >> >>
> >> >> In other words you're using a driver called pyodbc that's creating 
> this error. SQLAlchemy just propagates it for you but otherwise has nothing 
> to do with how it's produced.
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> I would really appreciate some guidance on how to solve this 
> problem. What could be causing it?
> >> >>
> >> >> Error:
> >> >> sqlalchemy.exc.ProgrammingError: (pyodbc.ProgrammingError) ('42000', 
> "[42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Cannot 
> convert to text/ntext or collate to 'Latin1_General_100_CI_AS_SC' because 
> these legacy LOB types do not support UTF-8 or UTF-16 encodings. Use types 
> varchar(max), nvarchar(max) or a collation which does not have the _SC or 
> _UTF8 flags. (4189) (SQLParamData)")
> >> >> [SQL: INSERT INTO msg (content) OUTPUT inserted.id VALUES (?)]
> >> >> [parameters: 
> ('AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
>  
> ... (1703 characters truncated) ... 
> AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA',)]
> >> >> (Background on this error at: http://sqlalche.me/e/13/f405)
> >> >>
> >> >>
> >> >> Program (POC):
> >> >> import logging
> >> >> import sqlalchemy
> >> >> from sqlalchemy import Column, Text, Integer, NVARCHAR, VARCHAR
> >> >> from sqlalchemy.ext.declarative import declarative_base
> >> >> from sqlalchemy.orm import sessionmaker
> >> >>
> >> >> logging.root.setLevel(logging.DEBUG)
> >> >>
> >> >> Base = declarative_base()
> >> >>
> >> >>
> >> >> class Msg(Base):
> >> >> __tablename__ = 'msg'
> >> >>
> >> >> id = Column(Integer, primary_key=True, autoincrement=True)
> >> >> content = Column(VARCHAR, nullable=False)
> >> >>
> >> >>
> >> >> user = 'sa'
> >> >> pwd = 'P@ssw0rd'
> >> >> host = 'localhost'
> >> >> port = 1433
> >> >> db = 'test'
> >> >>
> >> >> logging.info('started.')
> >> >>
> >> >> engine = sqlalchemy.create_engine(
> >> >> 
> f'mssql+pyodbc://{user}:{pwd}@{host}:{port}/{db}?driver=ODBC+Driver+17+for+SQL+Server',
> >> >> pool_size=5,
> >> >> max_overflow=10,
> >> >> pool_pre_ping=True,
> >> >> isolation_level='READ_UNCOMMITTED',
> >> >> pool_recycle=900,
> >> >> echo=False,
> >> >> connect_args={'connect_timeout': 10})
> >> >>
> >> >> Base.metadata.create_all(engine)
> >> >>
> >> >> Session = sessionmaker(bind=engine, autoflush=False, 
> autocommit=False)
> >> >> session = Session()
> >> >>
> >> >> for i in range(10000):
> >> >> try:
> >> >> msg = Msg(content='A' * i)
> >> >> session.add(msg)
> >> >> session.commit()
> >> >> except Exception as exc:
> >> >> logging.exception(f'fail: {i=}: {exc}')
> >> >> break
> >> >> else:
> >> >> logging.info(f'success: {i=}')
> >> >>
> >> >> logging.info('done.')
> >> >>
> >> >>
> >> >> --
> >> >> SQLAlchemy -
> >> >> The Python SQL Toolkit and Object Relational Mapper
> >> >>
> >> >> http://www.sqlalchemy.org/
> >> >>
> >> >> To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description.
> >> >> ---
> >> >>
> >> >> You received this message because you are subscribed to the Google 
> Groups "sqlalchemy" group.
> >> >> To unsubscribe from this group and stop receiving emails from it, 
> send an email to sqlalchemy+...@googlegroups.com.
> >> >> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/sqlalchemy/ca549391-4360-480e-8c58-06577f6d92dan%40googlegroups.com
> .
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> SQLAlchemy -
> >> >> The Python SQL Toolkit and Object Relational Mapper
> >> >>
> >> >> http://www.sqlalchemy.org/
> >> >>
> >> >> To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description.
> >> >> ---
> >> >> You received this message because you are subscribed to a topic in 
> the Google Groups "sqlalchemy" group.
> >> >> To unsubscribe from this topic, visit 
> https://groups.google.com/d/topic/sqlalchemy/Kk6DkPNWlR4/unsubscribe.
> >> >> To unsubscribe from this group and all its topics, send an email to 
> sqlalchemy+...@googlegroups.com.
> >> >> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/sqlalchemy/b4a135ef-1c1e-476c-a3f0-60326dca76ea%40www.fastmail.com
> .
> >> >>
> >> >>
> >> >> --
> >> >> SQLAlchemy -
> >> >> The Python SQL Toolkit and Object Relational Mapper
> >> >>
> >> >> http://www.sqlalchemy.org/
> >> >>
> >> >> To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description.
> >> >> ---
> >> >> You received this message because you are subscribed to the Google 
> Groups "sqlalchemy" group.
> >> >> To unsubscribe from this group and stop receiving emails from it, 
> send an email to sqlalchemy+...@googlegroups.com.
> >> >>
> >> >> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/sqlalchemy/CAE%2B5CJABrO_459MHtDePMXah8mb67TFvQB8rUwgVgk6%2By4v-ow%40mail.gmail.com
> .
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> SQLAlchemy -
> >> >> The Python SQL Toolkit and Object Relational Mapper
> >> >>
> >> >> http://www.sqlalchemy.org/
> >> >>
> >> >> To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description.
> >> >> ---
> >> >> You received this message because you are subscribed to the Google 
> Groups "sqlalchemy" group.
> >> >> To unsubscribe from this group and stop receiving emails from it, 
> send an email to sqlalchemy+...@googlegroups.com.
> >> >>
> >> >> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/sqlalchemy/60f82633-887a-43a3-ac52-c5541058e0bcn%40googlegroups.com
> .
> >> >>
> >> >>
> >> > --
> >> > SQLAlchemy -
> >> > The Python SQL Toolkit and Object Relational Mapper
> >> >
> >> > http://www.sqlalchemy.org/
> >> >
> >> > To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description.
> >> > ---
> >> > You received this message because you are subscribed to the Google 
> Groups "sqlalchemy" group.
> >> > To unsubscribe from this group and stop receiving emails from it, 
> send an email to sqlalchemy+...@googlegroups.com.
> >> > To view this discussion on the web visit 
> https://groups.google.com/d/msgid/sqlalchemy/f3e2a277-8529-4fd5-83be-26445616f6c3n%40googlegroups.com
> .
> >
> > --
> > SQLAlchemy -
> > The Python SQL Toolkit and Object Relational Mapper
> >
> > http://www.sqlalchemy.org/
> >
> > To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description.
> > ---
> > You received this message because you are subscribed to the Google 
> Groups "sqlalchemy" group.
> > To unsubscribe from this group and stop receiving emails from it, send 
> an email to sqlalchemy+...@googlegroups.com.
> > To view this discussion on the web visit 
> https://groups.google.com/d/msgid/sqlalchemy/38a5e081-6e34-42fa-bf99-3d27f445f727n%40googlegroups.com
> .
>

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sqlalchemy/2124ac80-a81d-4dfd-9c66-7034c08a18efn%40googlegroups.com.

Re: [sqlalchemy] Cannot insert strings with a length greater than 2000 into columns with a datatype of Text, VARCHAR, or NVARCHAR using MS SQL Server 2017 and 2019

Reply via email to