zeroshade commented on issue #2508:
URL: https://github.com/apache/arrow-adbc/issues/2508#issuecomment-2654744391
I created a table in Snowflake consisting of 75 columns and 1M rows, where
25 columns were `NUMBER(38,0)`, 25 were `FLOAT`, and 25 were `VARCHAR`, so the
results are guaranteed to go through the `integerToDecimal128` function (I
verified this in the debugger).
With a pure Go main program, I was able to download the entire million rows in
5 to 7 seconds (roughly 7x faster than the ODBC timing in your original
screenshot for the 1M rows).
Just to confirm things for myself, I also tested from Python (which consumes a
C stream of data just like R would). Rather than timing only the streaming of
the results, I also included creating a pyarrow Table from the streamed
records (i.e. materializing the entire result set in memory at once rather
than grabbing one batch at a time). The corresponding code looks like:
```python
import adbc_driver_snowflake.dbapi
with adbc_driver_snowflake.dbapi.connect("<snowflake URI>") as conn, \
        conn.cursor() as cur:
    cur.execute('SELECT * FROM "my_table"')
    tbl = cur.fetch_arrow_table()
    print(tbl.num_rows)
```
And even with the added cost of materializing the entire result set, the
Python script still only takes around 20s to run. So whatever is causing it to
take so long seems to be specific to R, not the Go side. @paleolimbot
any ideas?