Re: [sqlalchemy] [Question] Why not passing Connection URL query parameters to the dialect?

2022-01-07 Thread nicolas...@jobteaser.com
Don't get me wrong, I only have praises for the work currently being done 
on removing all the `bind`.

It was one of the things that had me confused with SQLAlchemy when I 
started working with it some years back and also caused me a few headaches.
And honestly after weighing the pros and cons, the advantages of having 
those parameters in the URL are gone (as mentioned earlier). It's just not 
the "right" place to put those.

Just two notes:

The need for the hack I am pointing out in PyAthena, arises from packages 
using SQLAlchemy (here Pandas, as pointed out by the dialect maintainer in 
the issue I opened , 
and the sligthly disappointing fix I provided 
). I haven't dug into 
pandas code to see if there are other, better ways to work the problem, but 
I have my doubts. I don't know about other dialects that could have 
resorted to the same kind of hacks. And I am surely not implying you should 
change course. It is just a warning about issues you might see coming in a 
not so far future, seeing that 2.0 is already in beta.

Detail, but in the docs (and in the code), the Table.bind 

 
attribute (for example) is not yet marked as deprecated, even in the latest 
1.4 
.
 
This is probably not encouraging people to get rid of its use. Though 
looking at the main branch, it is indeed gone. I could work on a PR to mark 
the remaining bind on 1.4 as deprecated if that would be helpful. I have 
been stuck with 1.3 because of lagging dependencies in my projects, so that 
could help me catch up with the coming changes.

Regards,
Nicolas

Le vendredi 7 janvier 2022 à 17:11:49 UTC+1, Mike Bayer a écrit :

> also if you really want your app to have just one URL with all kinds of 
> config in it, then just use that.  get the URL object using the make_url() 
> API, pull out the configuration you need from URL.query, make a new URL 
> from that one that is for your database, then connect.it's all public 
> API that's there:
>
>
> https://docs.sqlalchemy.org/en/14/core/engines.html?highlight=url#sqlalchemy.engine.URL
>
> as far as specific dialects, these tend not to have too much in the way of 
> dialect-specific options that are not part of what's needed to connect to 
> the DBAPI.  SQLAlchemy's dialects have a few which usually refers to some 
> SQL generation behaviors, but most of these are themselves derived from 
> inspecting the database itself, after the connection has been established.
>
>
>
> On Fri, Jan 7, 2022, at 11:00 AM, Mike Bayer wrote:
>
> the idea of Table objects being linked to a database is something I 
> thought was a good idea in 2006, which is why for the last 15 years there's 
> been this notion of "bound metadata" that associates a specific engine with 
> Table objects.   however, probably by 2009 if not earlier, the limited and 
> misleading nature of this idea was pretty apparent not the least of which 
> because modern applications quite often need a certain Table object to 
> apply to lots of different databases, different kinds of databases, etc, 
> and then people were trying to hack around "bound metadata" not doing any 
> of these things, even though by then "bound metadata" was fully optional.  
> but since the pattern was there, people were confused, "bound metadata" was 
> present, why aren't we using it then, why is it there, etc.
>
> Here we are and when SQLAlchemy 2.0 betas are released hopefully in a few 
> months you'll see the notion of linking a Table directly to anything 
> regarding anything to do with locating a specific database is gone.
>
> If your Table has things to do with it that you need to know when you 
> execute queries, which are invariant no matter what the database URL is, 
> you can put those things in the table.info dictionary.
>
> Overall, URLs are meant to refer to "where is a particular database, get 
> me in" and that's it.  things that are *in* the database, tuning 
> parameters, etc. that all goes in config.   An app will usually have config 
> that is more than just a single URL argument.
>
> On Fri, Jan 7, 2022, at 5:26 AM, nicolas...@jobteaser.com wrote:
>
> Hi !
>
> Ok. So if I understand you correctly, you want to keep query parameters 
> solely for DBAPI drivers connection parameters and would hence not accept a 
> PR that would implement something that changes that.
>
> There are other reasons though for which I was looking into this. In 
> particular, what I am mentioning is already sort of done by PyAthena. They 
> use at least two query parameters that help tell where the data is stored.
> One (`s3_staging_prefix`) tells where query results are stored and fits 
> nicely amongst the connection parameters.
> The second (`s3_prefix`) 

Re: [sqlalchemy] With loader criteria and baked query cache

2022-01-07 Thread Mike Bayer
i dont think with_loader_criteria ever expected the target entity to be part of 
"secondary" in a relationship().  I have no idea what that would do and I'm 
surprised it works at all.

I would try instead to map from A->A_rel_B->B explicitly and use two 
relationships for this join.



On Fri, Jan 7, 2022, at 12:51 PM, Tomas Pavlovsky wrote:
> Hello Mike, 
> 
> 
> class X()
>id: id: int = Column( Integer, primary_key=True)
>a_id:int = Column(Integer, ForeignKey("A.id"))
> 
> subq = select(A_rel_B).join(B).order(B.size).limit(10)
> 
> class A()
>id: id: int = Column( Integer, primary_key=True)
>bs = relationship(B, viewonly=True, secondary=subq, 
> uselist=False,primaryjoin=and_(A.id == subq.c.a_id))
> 
> class B()
>id: id: int = Column( Integer, primary_key=True)
>size= Column( Integer)
> 
> class A_rel_B()
>   a_id: :int = Column(Integer, ForeignKey("A.id"))
>   b_id: :int = Column(Integer, ForeignKey("B.id"))
> 
>  x: int 
> q = select(X).options(
>selectinload(A).options(joinedload(A.bs)),
> with_loader_criteria(A_rel_B, func.fn(A_rel_B.b_id, x).is_(True))
> )
> 
> When i add with_loader_criteria(A_rel_B, func.fn(A_rel_B.b_id, x).is_(True)), 
> then x value is cached and don't change. It is ok?
> 
> Thanks
> 
> 
> -- 
> SQLAlchemy - 
> The Python SQL Toolkit and Object Relational Mapper
>  
> http://www.sqlalchemy.org/
>  
> To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description.
> --- 
> You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to sqlalchemy+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/sqlalchemy/0fb2b5d7-4321-4349-add7-7256cba8aea2n%40googlegroups.com
>  
> .

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sqlalchemy/247cbd1e-9e4a-4554-bda3-dcfe61f2d486%40www.fastmail.com.


[sqlalchemy] With loader criteria and baked query cache

2022-01-07 Thread Tomas Pavlovsky
Hello Mike, 


class X()
   id: id: int = Column( Integer, primary_key=True)
   a_id:int = Column(Integer, ForeignKey("A.id"))

subq = select(A_rel_B).join(B).order(B.size).limit(10)

class A()
   id: id: int = Column( Integer, primary_key=True)
   bs = relationship(B, viewonly=True, secondary=subq, 
uselist=False,primaryjoin=and_(A.id == subq.c.a_id))

class B()
   id: id: int = Column( Integer, primary_key=True)
   size= Column( Integer)

class A_rel_B()
  a_id: :int = Column(Integer, ForeignKey("A.id"))
  b_id: :int = Column(Integer, ForeignKey("B.id"))

 x: int 
q = select(X).options(
   selectinload(A).options(joinedload(A.bs)),
with_loader_criteria(A_rel_B, func.fn(A_rel_B.b_id, x).is_(True))
)

When i add with_loader_criteria(A_rel_B, func.fn(A_rel_B.b_id, 
x).is_(True)), then x value is cached and don't change. It is ok?

Thanks

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sqlalchemy/0fb2b5d7-4321-4349-add7-7256cba8aea2n%40googlegroups.com.


Re: [sqlalchemy] [Question] Why not passing Connection URL query parameters to the dialect?

2022-01-07 Thread Jonathan Vanasco
> Ok. So if I understand you correctly, you want to keep query parameters 
solely for DBAPI drivers connection parameters and would hence not accept a 
PR that would implement something that changes that.

Just adding: the standard across programming languages and database 
products/projects is to utilize a database connection url as SQLAlchemy 
currently implements.  departing from this concept could create too many 
compatibility issues in the future, and has the potential for breaking 
existing integrations today.
On Friday, January 7, 2022 at 11:11:49 AM UTC-5 Mike Bayer wrote:

> also if you really want your app to have just one URL with all kinds of 
> config in it, then just use that.  get the URL object using the make_url() 
> API, pull out the configuration you need from URL.query, make a new URL 
> from that one that is for your database, then connect.it's all public 
> API that's there:
>
>
> https://docs.sqlalchemy.org/en/14/core/engines.html?highlight=url#sqlalchemy.engine.URL
>
> as far as specific dialects, these tend not to have too much in the way of 
> dialect-specific options that are not part of what's needed to connect to 
> the DBAPI.  SQLAlchemy's dialects have a few which usually refers to some 
> SQL generation behaviors, but most of these are themselves derived from 
> inspecting the database itself, after the connection has been established.
>
>
>
> On Fri, Jan 7, 2022, at 11:00 AM, Mike Bayer wrote:
>
> the idea of Table objects being linked to a database is something I 
> thought was a good idea in 2006, which is why for the last 15 years there's 
> been this notion of "bound metadata" that associates a specific engine with 
> Table objects.   however, probably by 2009 if not earlier, the limited and 
> misleading nature of this idea was pretty apparent not the least of which 
> because modern applications quite often need a certain Table object to 
> apply to lots of different databases, different kinds of databases, etc, 
> and then people were trying to hack around "bound metadata" not doing any 
> of these things, even though by then "bound metadata" was fully optional.  
> but since the pattern was there, people were confused, "bound metadata" was 
> present, why aren't we using it then, why is it there, etc.
>
> Here we are and when SQLAlchemy 2.0 betas are released hopefully in a few 
> months you'll see the notion of linking a Table directly to anything 
> regarding anything to do with locating a specific database is gone.
>
> If your Table has things to do with it that you need to know when you 
> execute queries, which are invariant no matter what the database URL is, 
> you can put those things in the table.info dictionary.
>
> Overall, URLs are meant to refer to "where is a particular database, get 
> me in" and that's it.  things that are *in* the database, tuning 
> parameters, etc. that all goes in config.   An app will usually have config 
> that is more than just a single URL argument.
>
> On Fri, Jan 7, 2022, at 5:26 AM, nicolas...@jobteaser.com wrote:
>
> Hi !
>
> Ok. So if I understand you correctly, you want to keep query parameters 
> solely for DBAPI drivers connection parameters and would hence not accept a 
> PR that would implement something that changes that.
>
> There are other reasons though for which I was looking into this. In 
> particular, what I am mentioning is already sort of done by PyAthena. They 
> use at least two query parameters that help tell where the data is stored.
> One (`s3_staging_prefix`) tells where query results are stored and fits 
> nicely amongst the connection parameters.
> The second (`s3_prefix`) is used to tall where data should be stored when 
> a table is created and does not fit so well.
>
> It does not fit because you end-up relying on SchemaItem to be bound to a 
> connection to get back those parameters 
> ,
>  
> but in many case this binding is not done.
> In particular DDL statements compilation just blows in your face. A 
> statement like:
>
>   Table('name', MetaData(), Column('c', Integer)).create(bind=engine)
>
> Fails with:
>
>   File "~/pyathena/sqlalchemy_athena.py", line 313, in 
> post_create_table
> raw_connection = table.bind.raw_connection()
>  AttributeError: 'NoneType' object has no attribute 'raw_connection'
>  Table('name', MetaData(), Column('c', 
> Integer)).create(bind=engine)
>
> I guess the storage location of a table does fit in the table dialect 
> kwargs:
>
> Table('', MetaData(), ..., awsathena_location='s3://...')
>
> Initially I thought it could be useful, e.g. when building ETL pipelines 
> that moves data around, to be able to bind a table with the actual storage 
> location as late as possible (to reuse a Table object).
>
> But generally other bits in the table definition needs to change too, like 
> the name of the 

Re: [sqlalchemy] [Question] Why not passing Connection URL query parameters to the dialect?

2022-01-07 Thread Mike Bayer
also if you really want your app to have just one URL with all kinds of config 
in it, then just use that.  get the URL object using the make_url() API, pull 
out the configuration you need from URL.query, make a new URL from that one 
that is for your database, then connect.it's all public API that's there:

https://docs.sqlalchemy.org/en/14/core/engines.html?highlight=url#sqlalchemy.engine.URL

as far as specific dialects, these tend not to have too much in the way of 
dialect-specific options that are not part of what's needed to connect to the 
DBAPI.  SQLAlchemy's dialects have a few which usually refers to some SQL 
generation behaviors, but most of these are themselves derived from inspecting 
the database itself, after the connection has been established.



On Fri, Jan 7, 2022, at 11:00 AM, Mike Bayer wrote:
> the idea of Table objects being linked to a database is something I thought 
> was a good idea in 2006, which is why for the last 15 years there's been this 
> notion of "bound metadata" that associates a specific engine with Table 
> objects.   however, probably by 2009 if not earlier, the limited and 
> misleading nature of this idea was pretty apparent not the least of which 
> because modern applications quite often need a certain Table object to apply 
> to lots of different databases, different kinds of databases, etc, and then 
> people were trying to hack around "bound metadata" not doing any of these 
> things, even though by then "bound metadata" was fully optional.  but since 
> the pattern was there, people were confused, "bound metadata" was present, 
> why aren't we using it then, why is it there, etc.
> 
> Here we are and when SQLAlchemy 2.0 betas are released hopefully in a few 
> months you'll see the notion of linking a Table directly to anything 
> regarding anything to do with locating a specific database is gone.
> 
> If your Table has things to do with it that you need to know when you execute 
> queries, which are invariant no matter what the database URL is, you can put 
> those things in the table.info dictionary.
> 
> Overall, URLs are meant to refer to "where is a particular database, get me 
> in" and that's it.  things that are *in* the database, tuning parameters, 
> etc. that all goes in config.   An app will usually have config that is more 
> than just a single URL argument.
> 
> On Fri, Jan 7, 2022, at 5:26 AM, nicolas...@jobteaser.com wrote:
>> Hi !
>> 
>> Ok. So if I understand you correctly, you want to keep query parameters 
>> solely for DBAPI drivers connection parameters and would hence not accept a 
>> PR that would implement something that changes that.
>> 
>> There are other reasons though for which I was looking into this. In 
>> particular, what I am mentioning is already sort of done by PyAthena. They 
>> use at least two query parameters that help tell where the data is stored.
>> One (`s3_staging_prefix`) tells where query results are stored and fits 
>> nicely amongst the connection parameters.
>> The second (`s3_prefix`) is used to tall where data should be stored when a 
>> table is created and does not fit so well.
>> 
>> It does not fit because you end-up relying on SchemaItem to be bound to a 
>> connection to get back those parameters 
>> ,
>>  but in many case this binding is not done.
>> In particular DDL statements compilation just blows in your face. A 
>> statement like:
>> 
>>   Table('name', MetaData(), Column('c', Integer)).create(bind=engine)
>> 
>> Fails with:
>> 
>>   File "~/pyathena/sqlalchemy_athena.py", line 313, in post_create_table
>> raw_connection = table.bind.raw_connection()
>>  AttributeError: 'NoneType' object has no attribute 'raw_connection'
>>  Table('name', MetaData(), Column('c', Integer)).create(bind=engine)
>> 
>> I guess the storage location of a table does fit in the table dialect kwargs:
>> 
>> Table('', MetaData(), ..., awsathena_location='s3://...')
>> 
>> Initially I thought it could be useful, e.g. when building ETL pipelines 
>> that moves data around, to be able to bind a table with the actual storage 
>> location as late as possible (to reuse a Table object).
>> 
>> But generally other bits in the table definition needs to change too, like 
>> the name of the schema. So there is no real benefit and one has to create 
>> several Table objects anyway.
>> And the use of the connection is just an unfortunate hack... And this is an 
>> issue that should be addressed in PyAthena.
>> 
>> Thanks for your input, helps choosing the better fix for this.
>> 
>> Regards,
>> Nicolas
>> 
>> Le jeudi 6 janvier 2022 à 18:18:49 UTC+1, Mike Bayer a écrit :
>>> __
>>> hey there -
>>> 
>>> database URLs do support query string parameters, however they have a 
>>> specific meaning which is that they are consumed by the DBAPI in use, not 
>>> the dialect 

Re: [sqlalchemy] [Question] Why not passing Connection URL query parameters to the dialect?

2022-01-07 Thread Mike Bayer
the idea of Table objects being linked to a database is something I thought was 
a good idea in 2006, which is why for the last 15 years there's been this 
notion of "bound metadata" that associates a specific engine with Table 
objects.   however, probably by 2009 if not earlier, the limited and misleading 
nature of this idea was pretty apparent not the least of which because modern 
applications quite often need a certain Table object to apply to lots of 
different databases, different kinds of databases, etc, and then people were 
trying to hack around "bound metadata" not doing any of these things, even 
though by then "bound metadata" was fully optional.  but since the pattern was 
there, people were confused, "bound metadata" was present, why aren't we using 
it then, why is it there, etc.

Here we are and when SQLAlchemy 2.0 betas are released hopefully in a few 
months you'll see the notion of linking a Table directly to anything regarding 
anything to do with locating a specific database is gone.

If your Table has things to do with it that you need to know when you execute 
queries, which are invariant no matter what the database URL is, you can put 
those things in the table.info dictionary.

Overall, URLs are meant to refer to "where is a particular database, get me in" 
and that's it.  things that are *in* the database, tuning parameters, etc. that 
all goes in config.   An app will usually have config that is more than just a 
single URL argument.

On Fri, Jan 7, 2022, at 5:26 AM, nicolas...@jobteaser.com wrote:
> Hi !
> 
> Ok. So if I understand you correctly, you want to keep query parameters 
> solely for DBAPI drivers connection parameters and would hence not accept a 
> PR that would implement something that changes that.
> 
> There are other reasons though for which I was looking into this. In 
> particular, what I am mentioning is already sort of done by PyAthena. They 
> use at least two query parameters that help tell where the data is stored.
> One (`s3_staging_prefix`) tells where query results are stored and fits 
> nicely amongst the connection parameters.
> The second (`s3_prefix`) is used to tall where data should be stored when a 
> table is created and does not fit so well.
> 
> It does not fit because you end-up relying on SchemaItem to be bound to a 
> connection to get back those parameters 
> ,
>  but in many case this binding is not done.
> In particular DDL statements compilation just blows in your face. A statement 
> like:
> 
>   Table('name', MetaData(), Column('c', Integer)).create(bind=engine)
> 
> Fails with:
> 
>   File "~/pyathena/sqlalchemy_athena.py", line 313, in post_create_table
> raw_connection = table.bind.raw_connection()
>  AttributeError: 'NoneType' object has no attribute 'raw_connection'
>  Table('name', MetaData(), Column('c', Integer)).create(bind=engine)
> 
> I guess the storage location of a table does fit in the table dialect kwargs:
> 
> Table('', MetaData(), ..., awsathena_location='s3://...')
> 
> Initially I thought it could be useful, e.g. when building ETL pipelines that 
> moves data around, to be able to bind a table with the actual storage 
> location as late as possible (to reuse a Table object).
> 
> But generally other bits in the table definition needs to change too, like 
> the name of the schema. So there is no real benefit and one has to create 
> several Table objects anyway.
> And the use of the connection is just an unfortunate hack... And this is an 
> issue that should be addressed in PyAthena.
> 
> Thanks for your input, helps choosing the better fix for this.
> 
> Regards,
> Nicolas
> 
> Le jeudi 6 janvier 2022 à 18:18:49 UTC+1, Mike Bayer a écrit :
>> __
>> hey there -
>> 
>> database URLs do support query string parameters, however they have a 
>> specific meaning which is that they are consumed by the DBAPI in use, not 
>> the dialect directly.  Please review the docs at 
>> https://docs.sqlalchemy.org/en/14/core/engines.html#custom-dbapi-connect-arguments-on-connect-routines
>>   for background on how these arguments are used.
>> 
>> 
>> 
>> On Thu, Jan 6, 2022, at 8:48 AM, nicolas...@jobteaser.com wrote:
>>> Hi !
>>> 
>>> While working on some improvements to PyAthena, I was looking into means to 
>>> pass some parameters to the dialect. Going through the code of the 
>>> `create_engine()` function code, I saw that dialects `__init__()` where 
>>> given dialect kwargs passed as kwargs 
>>> 
>>>  to the create_engine() function. But the dialect does not have access to 
>>> the connection URL.
>>> 
>>> E.g. you can do:
>>> 
>>> e = create_engine('', dialect_kwarg1=, 
>>> dialect_kwarg2='', ...)
>>> 
>>> But not:
>>> 
>>> e = 

[sqlalchemy] Create association of three or more tables

2022-01-07 Thread fco...@gmail.com
Hi all,

I would like to create some association of, at least, 3 tables User, Org 
(organisation) and Role : a User is given a Role on an Organisation.

So I began writing the following snippet but I am stucked as I do not know 
how I should write relationship() for User to be able to refer Orgs and 
Roles, Org to refer Users and Roles ...

I would like to avoid tricky code, do you have some advice on how I should 
do ?

# 
https://docs.sqlalchemy.org/en/14/orm/basic_relationships.html#association-object
from datetime import datetime
from sqlalchemy import create_engine
from sqlalchemy import Column, ForeignKey, Integer, Text, DateTime
from sqlalchemy.orm import declarative_base, relationship
from sqlalchemy.orm.session import Session

Base = declarative_base()

class UserOrgRole(Base):
__tablename__ = "user_org_role"

user_id = Column(
ForeignKey("user.id", ondelete="CASCADE"), primary_key=True
)
org_id = Column(ForeignKey("org.id", ondelete="CASCADE"), primary_key=True)
# Does it make sense ?
role_id = Column(
ForeignKey("role.id", ondelete="CASCADE"), primary_key=True
)

# I do not know what the relationship could back populate, org or role ???
user = relationship("User")
org = relationship("Org")
role = relationship("Role")


class User(Base):
__tablename__ = "user"

id = Column(Integer, primary_key=True)
username = Column(Text, index=True, nullable=False)
fullname = Column(Text, nullable=False)
account_type = Column(Text, nullable=False)

def __repr__(self):
return (
f""
)


class Org(Base):
__tablename__ = "org"

id = Column(Integer, primary_key=True)
name = Column(Text, index=True, nullable=False)
slug = Column(Text, index=True, nullable=False)
created_at = Column(DateTime)

def __repr__(self):
return (
f""
)


class Role(Base):
__tablename__ = "role"

id = Column(Integer, primary_key=True)
name = Column(Text, index=True, nullable=False)

def __repr__(self):
return f""


if __name__ == "__main__":
engine = create_engine(
"sqlite:///association_object_ternary.db", echo=False
)

Base.metadata.drop_all(engine)
Base.metadata.create_all(engine)

with Session(engine) as session:
# create parent, append a child via association
u1 = User(
username="jlondon",
fullname="Jack London",
account_type="member",
)
o1 = Org(name="o1", slug="o1", created_at=datetime.utcnow())
owner = Role(name="owner")

uor1 = UserOrgRole()
uor1.user = u1
uor1.org = o1
uor1.role = owner

with session.begin():
session.add(u1)
session.add(o1)
session.add(owner)

session.add(uor1)

Thanks for your help and for this nice library which has a great 
documentation.
Françoise

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sqlalchemy/b0619e40-c5d5-4590-a61e-7d22aa6ad993n%40googlegroups.com.


Re: [sqlalchemy] [Question] Why not passing Connection URL query parameters to the dialect?

2022-01-07 Thread nicolas...@jobteaser.com
Hi !

Ok. So if I understand you correctly, you want to keep query parameters 
solely for DBAPI drivers connection parameters and would hence not accept a 
PR that would implement something that changes that.

There are other reasons though for which I was looking into this. In 
particular, what I am mentioning is already sort of done by PyAthena. They 
use at least two query parameters that help tell where the data is stored.
One (`s3_staging_prefix`) tells where query results are stored and fits 
nicely amongst the connection parameters.
The second (`s3_prefix`) is used to tall where data should be stored when a 
table is created and does not fit so well.

It does not fit because you end-up relying on SchemaItem to be bound to a 
connection to get back those parameters 
,
 
but in many case this binding is not done.
In particular DDL statements compilation just blows in your face. A 
statement like:

  Table('name', MetaData(), Column('c', Integer)).create(bind=engine)

Fails with:

  File "~/pyathena/sqlalchemy_athena.py", line 313, in post_create_table
raw_connection = table.bind.raw_connection()
 AttributeError: 'NoneType' object has no attribute 'raw_connection'
 Table('name', MetaData(), Column('c', Integer)).create(bind=engine)

I guess the storage location of a table does fit in the table dialect 
kwargs:

Table('', MetaData(), ..., awsathena_location='s3://...')

Initially I thought it could be useful, e.g. when building ETL pipelines 
that moves data around, to be able to bind a table with the actual storage 
location as late as possible (to reuse a Table object).

But generally other bits in the table definition needs to change too, like 
the name of the schema. So there is no real benefit and one has to create 
several Table objects anyway.
And the use of the connection is just an unfortunate hack... And this is an 
issue that should be addressed in PyAthena.

Thanks for your input, helps choosing the better fix for this.

Regards,
Nicolas

Le jeudi 6 janvier 2022 à 18:18:49 UTC+1, Mike Bayer a écrit :

> hey there -
>
> database URLs do support query string parameters, however they have a 
> specific meaning which is that they are consumed by the DBAPI in use, not 
> the dialect directly.  Please review the docs at 
> https://docs.sqlalchemy.org/en/14/core/engines.html#custom-dbapi-connect-arguments-on-connect-routines
>   
> for background on how these arguments are used.
>
>
>
> On Thu, Jan 6, 2022, at 8:48 AM, nicolas...@jobteaser.com wrote:
>
> Hi !
>
> While working on some improvements to PyAthena, I was looking into means 
> to pass some parameters to the dialect. Going through the code of the `
> create_engine()` function code, I saw that dialects `__init__()` where 
> given dialect kwargs passed as kwargs 
> 
>  
> to the create_engine() function. But the dialect does not have access to 
> the connection URL.
>
> E.g. you can do:
>
> e = create_engine('', dialect_kwarg1=, 
> dialect_kwarg2='', ...)
>
> But not:
>
> e = create_engine('?dialect_kwarg1=_kwarg2=', 
> ...)
> # or
> e = create_engine('?kwarg1==', ...)
> # though I guess cause you can pass other kind of args, like pool args, 
> you'd like to keep # the `_` prefix
>
> I was wondering why? Particularly given that since the connection URL is 
> what determines the dialect, keeping dialect specific stuff in the URL does 
> not seem that far fetch. Or am I overlooking something?
>
> Why does it matters? I find that passing arguments through the URL very 
> handy. Allows to easily override certain configuration parameters, with 
> touching any code. It also makes it easy to exchange settings with other 
> people.
>
> If there are no particular reasons to not do this, would you accept a PR 
> to deal with this?
>
> Thanks,
> Nicolas.
>
>
> -- 
> SQLAlchemy - 
> The Python SQL Toolkit and Object Relational Mapper
>  
> http://www.sqlalchemy.org/
>  
> To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description.
> --- 
> You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to sqlalchemy+...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/sqlalchemy/f185f29c-9ec2-48b8-a5de-fe6112ab25c5n%40googlegroups.com
>  
> 
> .
>
>
>

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code,