Re: [sqlalchemy] [Question] Why not passing Connection URL query parameters to the dialect?

[email protected] Fri, 07 Jan 2022 02:26:29 -0800

Hi !

Ok. So if I understand you correctly, you want to keep query parameters 
solely for DBAPI drivers connection parameters and would hence not accept a 
PR that would implement something that changes that.


There are other reasons though for which I was looking into this. In 
particular, what I am mentioning is already sort of done by PyAthena. They 
use at least two query parameters that help tell where the data is stored.
One (`s3_staging_prefix`) tells where query results are stored and fits 
nicely amongst the connection parameters.
The second (`s3_prefix`) is used to tall where data should be stored when a 
table is created and does not fit so well.

It does not fit because you end-up relying on SchemaItem to be bound to a 
connection to get back those parameters 
<https://github.com/laughingman7743/PyAthena/blob/addbfe79e0dabbcf191bca8803e5e5f0e9e6cee4/pyathena/sqlalchemy_athena.py#L218-L225>,
 
but in many case this binding is not done.
In particular DDL statements compilation just blows in your face. A 
statement like:

      Table('name', MetaData(), Column('c', Integer)).create(bind=engine)

Fails with:

      File "~/pyathena/sqlalchemy_athena.py", line 313, in post_create_table
        raw_connection = table.bind.raw_connection()
     AttributeError: 'NoneType' object has no attribute 'raw_connection'
         Table('name', MetaData(), Column('c', Integer)).create(bind=engine)

I guess the storage location of a table does fit in the table dialect 
kwargs:

    Table('<name>', MetaData(), ..., awsathena_location='s3://...')

Initially I thought it could be useful, e.g. when building ETL pipelines 
that moves data around, to be able to bind a table with the actual storage 
location as late as possible (to reuse a Table object).

But generally other bits in the table definition needs to change too, like 
the name of the schema. So there is no real benefit and one has to create 
several Table objects anyway.
And the use of the connection is just an unfortunate hack... And this is an 
issue that should be addressed in PyAthena.

Thanks for your input, helps choosing the better fix for this.

Regards,
Nicolas

Le jeudi 6 janvier 2022 à 18:18:49 UTC+1, Mike Bayer a écrit :

> hey there -
>
> database URLs do support query string parameters, however they have a 
> specific meaning which is that they are consumed by the DBAPI in use, not 
> the dialect directly.  Please review the docs at 
> https://docs.sqlalchemy.org/en/14/core/engines.html#custom-dbapi-connect-arguments-on-connect-routines
>   
> for background on how these arguments are used.
>
>
>
> On Thu, Jan 6, 2022, at 8:48 AM, [email protected] wrote:
>
> Hi !
>
> While working on some improvements to PyAthena, I was looking into means 
> to pass some parameters to the dialect. Going through the code of the `
> create_engine()` function code, I saw that dialects `__init__()` where 
> given dialect kwargs passed as kwargs 
> <https://github.com/sqlalchemy/sqlalchemy/blob/db85d28a857945ce021e27a187a14999eeb5c89e/lib/sqlalchemy/engine/create.py#L480-L482>
>  
> to the create_engine() function. But the dialect does not have access to 
> the connection URL.
>
> E.g. you can do:
>
> e = create_engine('<url>', dialect_kwarg1=<value>, 
> dialect_kwarg2='<value>', ...)
>
> But not:
>
> e = create_engine('<url>?dialect_kwarg1=<value>&dialect_kwarg2=<value>', 
> ...)
> # or
> e = create_engine('<url>?kwarg1=<value>&kwarg2=<value>', ...)
> # though I guess cause you can pass other kind of args, like pool args, 
> you'd like to keep # the `<dialect>_` prefix
>
> I was wondering why? Particularly given that since the connection URL is 
> what determines the dialect, keeping dialect specific stuff in the URL does 
> not seem that far fetch. Or am I overlooking something?
>
> Why does it matters? I find that passing arguments through the URL very 
> handy. Allows to easily override certain configuration parameters, with 
> touching any code. It also makes it easy to exchange settings with other 
> people.
>
> If there are no particular reasons to not do this, would you accept a PR 
> to deal with this?
>
> Thanks,
> Nicolas.
>
>
> -- 
> SQLAlchemy - 
> The Python SQL Toolkit and Object Relational Mapper
>  
> http://www.sqlalchemy.org/
>  
> To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description.
> --- 
> You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected].
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/sqlalchemy/f185f29c-9ec2-48b8-a5de-fe6112ab25c5n%40googlegroups.com
>  
> <https://groups.google.com/d/msgid/sqlalchemy/f185f29c-9ec2-48b8-a5de-fe6112ab25c5n%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
>
>

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sqlalchemy/eedd1454-e953-4a6b-9f51-b270984c65cfn%40googlegroups.com.

Re: [sqlalchemy] [Question] Why not passing Connection URL query parameters to the dialect?

Reply via email to