[ https://issues.apache.org/jira/browse/ARROW-11390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17272271#comment-17272271 ]
Lance Dacey commented on ARROW-11390:
-------------------------------------

Actually, turbodbc would have been installed before pyarrow: pyarrow 3.0 was not on conda-forge, so I moved pyarrow down to the pip section. Do I need to reverse this installation order?

{code:java}
    && /opt/conda/bin/conda install -c conda-forge -yq \
        pandas \
        numpy \
        pyodbc \
        pybind11 \
        turbodbc \
        azure-storage-blob \
        azure-storage-common \
        xlrd \
        openpyxl \
        mysql-connector-python \
        zeep \
        xmltodict \
        dask \
        dask-labextension \
        pymssql=2.1 \
        sqlalchemy-redshift \
        python-snappy \
        seaborn \
        python-gitlab \
        pyxlsb \
        humanfriendly \
        jupyterlab \
        notebook=6.1.4 \
        pip \
    && /opt/conda/bin/pip install --no-cache-dir --upgrade pip \
        smartsheet-python-sdk \
        duo-client \
        adlfs \
        pyarrow \
        "apache-airflow[postgres,redis,celery,crypto,ssh,password]==$AIRFLOW_VERSION" \
{code}

I have not been able to get turbodbc to work with pip, which is why I am using conda right now. I was just trying to get it to work again with pip by passing the CFLAGS argument "-D_GLIBCXX_USE_CXX11_ABI=0", but had no luck. I will keep trying and perhaps raise an issue on the turbodbc project.

Let me know if there is a proper way to install these libraries! (Ideally with just plain pip, since my base image is from Airflow, which does not use conda by default.)

> [Python] pyarrow 3.0 issues with turbodbc
> -----------------------------------------
>
>                 Key: ARROW-11390
>                 URL: https://issues.apache.org/jira/browse/ARROW-11390
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 3.0.0
>         Environment: pyarrow 3.0.0
>                      fsspec 0.8.4
>                      adlfs v0.5.9
>                      pandas 1.2.1
>                      numpy 1.19.5
>                      turbodbc 4.1.1
>            Reporter: Lance Dacey
>            Priority: Major
>              Labels: python, turbodbc
>
> This is more of a turbodbc issue I think, but perhaps someone here would have
> some idea of what changed to cause potential issues.
> {code:java}
> cursor = connection.cursor()
> cursor.execute("select top 10 * from dbo.tickets")
> table = cursor.fetchallarrow(){code}
> I am able to run table.num_rows and it will print out 10.
> If I run table.to_pandas() or table.schema or try to write the table to a
> dataset, my kernel dies with no explanation. I reverted back to pyarrow 2.0
> and the same code works again.
> [https://github.com/blue-yonder/turbodbc/issues/289]
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)