Zan-L opened a new issue, #1997: URL: https://github.com/apache/arrow-adbc/issues/1997
### What happened?

Jobs calling `adbc_ingest()` failed with a memory error. On inspection, the data were split into {number of processors} Parquet files, instead of ~10 MB files as in 1.0.0.

### Stack Trace

```
adbc_driver_manager.InternalError: INTERNAL: unknown error type: cannot allocate memory
  cursor.adbc_ingest(table, data, mode)
  File "/usr/local/lib/python3.12/site-packages/adbc_driver_manager/dbapi.py", line 937, in adbc_ingest
    return _blocking_call(self._stmt.execute_update, (), {}, self._stmt.cancel)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "adbc_driver_manager/_lib.pyx", line 1569, in adbc_driver_manager._lib._blocking_call_impl
  File "adbc_driver_manager/_lib.pyx", line 1562, in adbc_driver_manager._lib._blocking_call_impl
  File "adbc_driver_manager/_lib.pyx", line 1295, in adbc_driver_manager._lib.AdbcStatement.execute_update
  File "adbc_driver_manager/_lib.pyx", line 260, in adbc_driver_manager._lib.check_error
```

### How can we reproduce the bug?

Unfortunately, I cannot share the data. However, it can be observed that on a four-core VM, a moderately sized dataset (e.g. ~500 MB as Parquet) is split into four ~125 MB files when `adbc_ingest()` is called to upload to Snowflake, instead of fifty ~10 MB files.

### Environment/Setup

- Packages: adbc-driver-manager==1.1.0, adbc-driver-snowflake==1.1.0
- Operating system: Windows/Linux
- Package manager: pip
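For illustration only (the 500 MB dataset size and 10 MB target are the figures from the report above, not measured values), the difference in batching can be sketched as:

```python
import math

def expected_file_count(dataset_mb: float, target_file_mb: float) -> int:
    """Number of Parquet files when a dataset is split into ~target-sized chunks."""
    return math.ceil(dataset_mb / target_file_mb)

# 1.0.0 behavior: split into ~10 MB files
print(expected_file_count(500, 10))  # 50 files

# 1.1.0 behavior observed on a four-core VM: one file per core
cores = 4
print(500 / cores)  # each file is ~125.0 MB
```

Loading a 500 MB dataset as four ~125 MB buffers (rather than streaming fifty ~10 MB chunks) plausibly explains the "cannot allocate memory" failure on a small VM.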