Re: [sqlalchemy] [Question] Why not passing Connection URL query parameters to the dialect?

Mike Bayer Fri, 07 Jan 2022 08:01:13 -0800

The idea of Table objects being linked to a database is something I thought was 
a good idea in 2006, which is why for the last 15 years there's been this 
notion of "bound metadata" that associates a specific engine with Table 
objects.  However, probably by 2009 if not earlier, the limited and misleading 
nature of this idea was pretty apparent, not least because modern 
applications quite often need a certain Table object to apply to lots of 
different databases, different kinds of databases, etc.  People then ended up 
trying to hack around "bound metadata" not doing any of these things, even 
though by then "bound metadata" was fully optional.  But since the pattern was 
there, people were confused: "bound metadata" is present, so why aren't we 
using it, why is it there, etc.


Here we are, and when the SQLAlchemy 2.0 betas are released, hopefully in a few 
months, you'll see that the notion of linking a Table directly to anything to 
do with locating a specific database is gone.
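
As a rough illustration of what "not linked" looks like in practice, here is a 
minimal sketch, assuming SQLAlchemy 1.4 or later and using in-memory SQLite 
URLs purely as stand-ins for real databases.  The helper name `roundtrip` is 
made up for this example.  The same Table object is used against two separate 
engines, and the database is always named explicitly at execution time:

```python
from sqlalchemy import Column, Integer, MetaData, Table, create_engine, insert, select

# No engine is attached to the MetaData or to the Table.
metadata = MetaData()
t = Table("name", metadata, Column("c", Integer))


def roundtrip(url):
    """Create the table on the database at `url`, insert two rows, read them back."""
    engine = create_engine(url)
    with engine.begin() as conn:
        # The target database is chosen here, by the connection, not by the Table.
        metadata.create_all(conn)
        conn.execute(insert(t), [{"c": 1}, {"c": 2}])
        return [row[0] for row in conn.execute(select(t.c.c).order_by(t.c.c))]


# Two independent in-memory databases, one shared Table definition.
first = roundtrip("sqlite://")
second = roundtrip("sqlite://")
```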

If your Table has things associated with it that you need to know when you 
execute queries, and which are invariant no matter what the database URL is, 
you can put those things in the table.info dictionary.
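
A small sketch of the table.info approach follows.  The key name 
`storage_location`, the bucket path, and the helper function are assumptions 
made up for illustration; SQLAlchemy itself does not interpret anything stored 
in `info`:

```python
from sqlalchemy import Column, Integer, MetaData, Table

metadata = MetaData()

# `info` is a plain dictionary carried along with the Table; the key used
# here is hypothetical and only meaningful to the application.
events = Table(
    "events",
    metadata,
    Column("id", Integer, primary_key=True),
    info={"storage_location": "s3://example-bucket/events/"},
)

# A Table created without `info` simply has an empty dictionary.
plain = Table("plain", metadata, Column("id", Integer, primary_key=True))


def storage_location(table):
    """Return the application-defined storage location, or None if unset."""
    return table.info.get("storage_location")
```

Code that executes queries can then look the value up from the Table it is 
working with, without needing a connection or a URL to find it.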

Overall, URLs are meant to say "where is a particular database, get me in" 
and that's it.  Things that are *in* the database, tuning parameters, etc. 
all go in config.  An app will usually have config that is more than just a 
single URL argument.
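
To make that split concrete, here is one possible shape for such a config, 
as a sketch only.  The `AppConfig` class, the `APP_*` variable names, and the 
`table_locations` setting are all hypothetical; only `make_url` and the URL 
attributes it exposes come from SQLAlchemy:

```python
from dataclasses import dataclass, field
from typing import Dict

from sqlalchemy.engine import make_url


@dataclass(frozen=True)
class AppConfig:
    # Where the database is and how to get in; any query parameters on the
    # URL are left for the DBAPI driver.
    database_url: str
    # Hypothetical application-level settings describing things *in* the
    # database rather than how to reach it.
    table_locations: Dict[str, str] = field(default_factory=dict)
    pool_size: int = 5


def load_config(env):
    """Build an AppConfig from a mapping such as os.environ."""
    prefix = "APP_TABLE_LOCATION_"
    locations = {
        key[len(prefix):].lower(): value
        for key, value in env.items()
        if key.startswith(prefix)
    }
    return AppConfig(
        database_url=env["APP_DATABASE_URL"],
        table_locations=locations,
        pool_size=int(env.get("APP_POOL_SIZE", "5")),
    )


config = load_config(
    {
        "APP_DATABASE_URL": "postgresql://scott:tiger@localhost/test?sslmode=require",
        "APP_TABLE_LOCATION_EVENTS": "s3://example-bucket/events/",
        "APP_POOL_SIZE": "10",
    }
)

# The URL only answers "where is the database, get me in".
url = make_url(config.database_url)
```

Overriding a setting then means changing one config entry rather than 
rewriting the URL, and the URL stays free of anything the driver would not 
understand.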

On Fri, Jan 7, 2022, at 5:26 AM, [email protected] wrote:
> Hi !
> 
> Ok. So if I understand you correctly, you want to keep query parameters 
> solely for DBAPI drivers connection parameters and would hence not accept a 
> PR that would implement something that changes that.
> 
> There are other reasons though for which I was looking into this. In 
> particular, what I am mentioning is already sort of done by PyAthena. They 
> use at least two query parameters that help tell where the data is stored.
> One (`s3_staging_prefix`) tells where query results are stored and fits 
> nicely amongst the connection parameters.
> The second (`s3_prefix`) is used to tell where data should be stored when a 
> table is created and does not fit so well.
> 
> It does not fit because you end up relying on SchemaItem being bound to a 
> connection to get back those parameters 
> <https://github.com/laughingman7743/PyAthena/blob/addbfe79e0dabbcf191bca8803e5e5f0e9e6cee4/pyathena/sqlalchemy_athena.py#L218-L225>,
>  but in many cases this binding is not done.
> In particular, DDL statement compilation just blows up in your face. A 
> statement like:
> 
>       Table('name', MetaData(), Column('c', Integer)).create(bind=engine)
> 
> Fails with:
> 
>       File "~/pyathena/sqlalchemy_athena.py", line 313, in post_create_table
>         raw_connection = table.bind.raw_connection()
>      AttributeError: 'NoneType' object has no attribute 'raw_connection'
>          Table('name', MetaData(), Column('c', Integer)).create(bind=engine)
> 
> I guess the storage location of a table does fit in the table dialect kwargs:
> 
>     Table('<name>', MetaData(), ..., awsathena_location='s3://...')
> 
> Initially I thought it could be useful, e.g. when building ETL pipelines that 
> move data around, to be able to bind a table to the actual storage 
> location as late as possible (to reuse a Table object).
> 
> But generally other bits in the table definition need to change too, like 
> the name of the schema. So there is no real benefit and one has to create 
> several Table objects anyway.
> And the use of the connection is just an unfortunate hack... And this is an 
> issue that should be addressed in PyAthena.
> 
> Thanks for your input, it helps in choosing the better fix for this.
> 
> Regards,
> Nicolas
> 
> On Thursday, January 6, 2022 at 18:18:49 UTC+1, Mike Bayer wrote:
>> hey there -
>> 
>> Database URLs do support query string parameters; however, they have a 
>> specific meaning, which is that they are consumed by the DBAPI in use, not 
>> the dialect directly.  Please review the docs at 
>> https://docs.sqlalchemy.org/en/14/core/engines.html#custom-dbapi-connect-arguments-on-connect-routines
>>   for background on how these arguments are used.
>> 
>> 
>> 
>> On Thu, Jan 6, 2022, at 8:48 AM, [email protected] wrote:
>>> Hi !
>>> 
>>> While working on some improvements to PyAthena, I was looking into ways to 
>>> pass some parameters to the dialect. Going through the code of the 
>>> `create_engine()` function, I saw that the dialect's `__init__()` is 
>>> given the dialect kwargs passed as kwargs 
>>> <https://github.com/sqlalchemy/sqlalchemy/blob/db85d28a857945ce021e27a187a14999eeb5c89e/lib/sqlalchemy/engine/create.py#L480-L482>
>>>  to the create_engine() function. But the dialect does not have access to 
>>> the connection URL.
>>> 
>>> E.g. you can do:
>>> 
>>> e = create_engine('<url>', dialect_kwarg1=<value>, 
>>> dialect_kwarg2='<value>', ...)
>>> 
>>> But not:
>>> 
>>> e = create_engine('<url>?dialect_kwarg1=<value>&dialect_kwarg2=<value>', 
>>> ...)
>>> # or
>>> e = create_engine('<url>?kwarg1=<value>&kwarg2=<value>', ...)
>>> # though I guess, because you can pass other kinds of args, like pool args,
>>> # you'd want to keep the `<dialect>_` prefix
>>> 
>>> I was wondering why? Particularly since the connection URL is 
>>> what determines the dialect, keeping dialect-specific stuff in the URL does 
>>> not seem that far-fetched. Or am I overlooking something?
>>> 
>>> Why does it matter? I find passing arguments through the URL very 
>>> handy. It allows you to easily override certain configuration parameters 
>>> without touching any code. It also makes it easy to exchange settings with 
>>> other people.
>>> 
>>> If there are no particular reasons to not do this, would you accept a PR to 
>>> deal with this?
>>> 
>>> Thanks,
>>> Nicolas.
>>> 
>>> 
>>> -- 
>>> SQLAlchemy - 
>>> The Python SQL Toolkit and Object Relational Mapper
>>>  
>>> http://www.sqlalchemy.org/
>>>  
>>> To post example code, please provide an MCVE: Minimal, Complete, and 
>>> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
>>> description.
>>> --- 
>>> You received this message because you are subscribed to the Google Groups 
>>> "sqlalchemy" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an 
>>> email to [email protected].
>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/sqlalchemy/f185f29c-9ec2-48b8-a5de-fe6112ab25c5n%40googlegroups.com
>>>  
>>> <https://groups.google.com/d/msgid/sqlalchemy/f185f29c-9ec2-48b8-a5de-fe6112ab25c5n%40googlegroups.com?utm_medium=email&utm_source=footer>.
>> 
> 
> 

