hi there, responses inline

On Wed, Aug 28, 2019, at 11:39 PM, Varun Madiath wrote:
> Hi.
> 
> I'm working to extend the Turbodbc 
> <https://turbodbc.readthedocs.io/en/latest/> dialect of the sqlalchemy_exasol 
> <https://github.com/blue-yonder/sqlalchemy_exasol/> project to support 
> executing against a pandas dataframe.
> 
> I'm aware that SqlAlchemy can use the executemany method of the DBAPI, and 
> I've seen that the MSSql dialect can even use PyODBC's fast_executemany 
> feature to speed up execution. However these all rely on the conversion of 
> the data into a list of dictionaries that is then passed to the execute 
> method of sqlalchmey.
> 
> Turbodbc has the ability to execute against a PyArrow table directly 
> <https://turbodbc.readthedocs.io/en/latest/pages/advanced_usage.html#using-apache-arrow-tables-as-query-parameters>.
>  I was hoping to get some advice on the best way to be able to expose this 
> optimisation to users. Here are two approaches I'm considering

what I dont understand about this API is what the contents of the SQL string 
need to be. Is this method parsing the SQL string "INSERT INTO my_table VALUES 
(?, ?)" ? that's not exactly how a SQLAlchemy INSERT statement looks, for 
example, we include the column names. 



>  1. Expose an execute_pandas method on a custom Connection object.

SQLAlchemy's Connection class is not a subclassing target. The internal Dialect 
on the other hand is, and that's where you can extend all kinds of various 
execution hooks. 

One reason is makes no sense to subclass Connection is that custom DBAPI 
features that require special execution styles are much more simply exposed 
using the DBAPI directly, based on the patterns documented at 
https://docs.sqlalchemy.org/en/13/core/connections.html#working-with-raw-dbapi-connections
 . Another is that Connection accepts an execution_options() dictionary which 
is where dialect-specific directives should be placed, where they can be 
interpreted by the Dialect.



>  1. Extend the execute function in a custom Connection class to check if the 
> parameter is a pandas DataFrame and if it is, use executemanycolumns method 
> of turbodbc, and otherwise defer to Connection.execute.

The most straightforward approach is to support an execution_option, so that 
you only need to change your dialect's do_executemany() method to look in 
context._execution_options for the key/value pairs that you've documented as 
being meaningful to your dialect.

conn.execution_options(exasol_pyarrow_table=my_table).execute("insert into 
table foo bar")






> Do these approaches seem sane? Are there any other approaches that you'd 
> recommend?
> 
> I have a version of method one implemented in this commit 
> <https://github.com/vamega/sqlalchemy_exasol/commit/02f5b6ea93af09d128261f5e1c9f674ac433783e>.
>  It somewhat works, but relies on having a custom EngineStrategy to have 
> create_engine return a Connection subclass. It also doesn't have knowledge of 
> being in a transaction, and auto commit, and probably a lot of other things. 
> The branch fetch-pandas will be tracking that approach.
> 
> I like the second approach more, although from some initial investigation 
> it's going to be a lot more work to get that working seamlessly across the 
> execution and core levels of SqlAlchemy, and I'm not even sure if it's 
> possible.
> 
> I appreciate any feedback you might have.
> 
> Finally, I'd also like to thank everyone here for a wonderful product. I 
> don't use the ORM much, but I really love using SQLAlchemy!
> 
> Varun Madiath
> 
> 

> --
>  SQLAlchemy - 
>  The Python SQL Toolkit and Object Relational Mapper
> 
> http://www.sqlalchemy.org/
> 
>  To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description.
>  --- 
>  You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
>  To unsubscribe from this group and stop receiving emails from it, send an 
> email to sqlalchemy+unsubscr...@googlegroups.com.
>  To view this discussion on the web visit 
> https://groups.google.com/d/msgid/sqlalchemy/CADm-oGk6NO%3DciTMrRzQ-ruez5_bcC-qBxuPWL2k5F%3DTUbNDMWw%40mail.gmail.com
>  
> <https://groups.google.com/d/msgid/sqlalchemy/CADm-oGk6NO%3DciTMrRzQ-ruez5_bcC-qBxuPWL2k5F%3DTUbNDMWw%40mail.gmail.com?utm_medium=email&utm_source=footer>.

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sqlalchemy/de3bd0c2-d96b-42d3-955b-4ea4a493089f%40www.fastmail.com.

Reply via email to