Re: [sqlalchemy] SQLAlchemy 1.0.12 use inner JOIN instead of LEFT OUTER JOIN

2016-04-25 Thread Mike Bayer



On 04/25/2016 05:16 PM, Alex Dev wrote:

Hello,

I have a broken query when migrating from SQLAlchemy 0.9.4 to 1.0.12. It
seems to be linked to a behavioral change in the ORM
(http://docs.sqlalchemy.org/en/rel_1_0/changelog/migration_10.html#right-inner-join-nesting-now-the-default-for-joinedload-with-innerjoin-true)

Here is simplified version of the code:

|
# -*- coding: utf-8 -*-
fromsqlalchemy import*
fromsqlalchemy.ext.declarative importdeclarative_base
fromsqlalchemy.orm import*


Base=declarative_base()

_plant_table =Table('plant',Base.metadata,
Column('id',Integer,primary_key=True)
)

_plant_dimensionsseries_table =Table('plant_dimensionsseries',Base.metadata,
Column('plant_id',Integer,primary_key=True),
Column('dimensionsseriestype_id',Integer,primary_key=True),
ForeignKeyConstraint(['plant_id'],['plant.id'])
)

_view_plant_dimensionsseries_table
=Table('view_plant_dimensionsseries',Base.metadata,
Column('plant_id',Integer,primary_key=True),
Column('dimensionsseriestype_id',Integer,primary_key=True),
ForeignKeyConstraint(
['plant_id','dimensionsseriestype_id'],
['plant_dimensionsseries.plant_id','plant_dimensionsseries.dimensionsseriestype_id'])
)

classPlant(Base):
 __table__ =_plant_table

classPlantDimensionsseries(Base):
 __table__ =_plant_dimensionsseries_table
 _plant_id =__table__.c.plant_id
 _dimensionsseriestype_id =__table__.c.dimensionsseriestype_id

 plant =relationship('Plant',
 innerjoin=True,
 backref=backref('plant_dimensionsseries'))

classPlantDimensionsseriesDataComputed(Base):
 __table__ =_view_plant_dimensionsseries_table
 _plant_id =__table__.c.plant_id
 _dimensionsseriestype_id =__table__.c.dimensionsseriestype_id

# One-to-one relationship
 dimensionsseries =relationship('PlantDimensionsseries',
 innerjoin=True,
 backref=backref('data_computed',
 innerjoin=True))

if__name__ =='__main__':

 engine
=create_engine('postgresql://nurseryacme_employee@localhost:5432/nurseryacme')
Session=sessionmaker(bind=engine)
 session =Session()

# query 1:
# SQLAlchemy 0.9.4: Correct SQL generated
# SQLAlchemy 1.0.12: Wrong SQL generated, a inner JOIN is used instead
of a LEFT OUTER JOIN between plant_dimensionsseries and
view_plant_dimensionsseries
printsession.query(Plant).\
 outerjoin(Plant.plant_dimensionsseries).\
 options(contains_eager(Plant.plant_dimensionsseries)).\

options(joinedload(Plant.plant_dimensionsseries,PlantDimensionsseries.data_computed))


well this usage above is wrong.  You can't have contains_eager() and 
joinedload() along the same paths at the same time like that.   Also, 
chaining joinedload() from contains_eager() is not a very typical thing 
to do, it works, but joinedload() does not coordinate with 
contains_eager() in any way, and it has no idea that you are using 
outerjoin() to the left of it.


The two correct ways to do this are:

print session.query(Plant).\
join(Plant.plant_dimensionsseries).\

options(contains_eager(Plant.plant_dimensionsseries).joinedload(PlantDimensionsseries.data_computed, 
))


and

print session.query(Plant).\
outerjoin(Plant.plant_dimensionsseries).\

options(contains_eager(Plant.plant_dimensionsseries).joinedload(PlantDimensionsseries.data_computed, 
innerjoin=False))



or of course:


print session.query(Plant).\
outerjoin(Plant.plant_dimensionsseries).\

options(joinedload(Plant.plant_dimensionsseries).joinedload(PlantDimensionsseries.data_computed))


where in the last one, the joinedload() to the right coordinates with 
the joinedload to the left to become a LEFT OUTER JOIN.








# query 2:
# SQLAlchemy 1.0.12: Correct SQL generated
printsession.query(Plant).\
 outerjoin(Plant.plant_dimensionsseries).\
 options(contains_eager(Plant.plant_dimensionsseries)).\

options(joinedload(Plant.plant_dimensionsseries,PlantDimensionsseries.data_computed,innerjoin=False))

|

Result with SQLAlchemy 0.9.4:
|
# query 1
SELECT ...
FROM plant LEFT OUTER JOIN plant_dimensionsseries ON plant.id
=plant_dimensionsseries.plant_id *LEFT OUTER
JOIN*view_plant_dimensionsseries AS view_plant_dimensionsseries_1 ON
plant_dimensionsseries.plant_id =view_plant_dimensionsseries_1.plant_id
AND plant_dimensionsseries.dimensionsseriestype_id
=view_plant_dimensionsseries_1.dimensionsseriestype_id
# query 2
SELECT ...
FROM plant LEFT OUTER JOIN plant_dimensionsseries ON plant.id
=plant_dimensionsseries.plant_id *LEFT OUTER
JOIN*view_plant_dimensionsseries AS view_plant_dimensionsseries_1 ON
plant_dimensionsseries.plant_id =view_plant_dimensionsseries_1.plant_id
AND plant_dimensionsseries.dimensionsseriestype_id
=view_plant_dimensionsseries_1.dimensionsseriestype_id

|

Result with SQLAlchemy 1.0.12:
|
# query 1
SELECT ...
FROM plant LEFT OUTER JOIN plant_dimensionsseries ON plant.id
=plant_dimensionsseries.plant_id *JOIN*view_plant_dimensionsseries AS
view_p

Re: [sqlalchemy] Bug in BufferedColumnResultProxy class?

2016-04-25 Thread Mike Bayer



On 04/25/2016 11:04 AM, Piotr Dobrogost wrote:

On Mon, Apr 25, 2016 at 3:23 PM, Mike Bayer  wrote:


OK I can try to work with that but that's a very specific feature, you'd
need to be using the Query cache extension,


I have problem finding information on "Query cache extension".
Googling for  I get
http://docs.sqlalchemy.org/en/latest/orm/examples.html#module-examples.dogpile_caching
as the first result from SA's docs and
http://stackoverflow.com/questions/204918/does-sqlalchemy-support-caching
as the first result from Stackoverflow.

Is caching using dogpile what you call "Query cache extension"? If so
we don't use it.


this extension: 
http://docs.sqlalchemy.org/en/rel_1_0/orm/extensions/baked.html?highlight=baked#module-sqlalchemy.ext.baked








  or using the "compiled_cache"
option,


There's no "compiled_cache" string in Kotti framework so I guess we
don't use it.


option, *or* the error is being generated within the context of a Session
being flushed within an INSERT or UPDATE RETURNING statement, which itself I
don't think uses the traditional result set model in Oracle anyway.


It's hard to say. Right before printing call stack from line 519 of
result.py there's SELECT logged:


the stack trace shown here illustrates an INSERT statement in the 
context of an ORM flush, which does use the compiled_cache feature; so 
to that degree, this part of the puzzle makes sense.


However, in cx_oracle, an INSERT statement does not return rows there's 
no data to be returned; cursor.description should be None and the 
BufferredColumnResultProxy will not be used.An INSERT statement in 
Oracle that uses "RETURNING", as is usually the case for the ORM INSERT 
statement, uses OUT parameters to achieve this and will make use of the 
ReturningResultProxy to work around what's needed here.


Since we're working without a test case of any kind, are you sure that 
the result proxy indicating the problem is not a ReturningResultProxy ? 
  Are we dealing with a column that has a server-generated default 
value that is of type BLOB, CLOB, or similar ?








2016-04-25 16:36:02,954 INFO
[sqlalchemy.engine.base.Engine][waitress] SELECT nodes.id AS nodes_id,
nodes.type AS nodes_type, nodes.parent_id AS nodes_parent_id,
nodes.position AS nodes_position, nodes."_acl" AS nodes__acl,
nodes.name AS nodes_name, nodes.title AS nodes_title,
nodes.annotations AS nodes_annotations, nodes.path AS nodes_path
FROM nodes LEFT OUTER JOIN contents ON nodes.id = contents.id LEFT
OUTER JOIN documents ON contents.id = documents.id LEFT OUTER JOIN
files ON contents.id = files.id LEFT OUTER JOIN images ON contents.id
= images.id
WHERE nodes.parent_id IS NULL

and right before the traceback is printed there's rollback being logged:
2016-04-25 16:36:02,975 INFO  [sqlalchemy.engine.base.Engine][waitress] ROLLBACK

however I don't know if it answers your last question.


any of these true for your case ?  Looks like you haven't provided a stack
trace for your issue, I'm sure you have that, so send that along.


Please find all information at
https://gist.github.com/piotr-dobrogost/6d57cacb9e77f59748353b2b6c1334d7
Probably you are most interested in the raw log (file named "Full log
from the start of the application (with empty db)"). Please note that
I modified line 519 in result.py by inserting the following code:
import traceback; traceback.print_stack();
to get call stack at this moment of execution.
Also the nature of the bug is such that there's no information in the
log about original problem (not executing some processors which were
registered and executed earlier). I'm guessing there are scopes during
which one should not modify the list of registered processors. If it
happens that this list (as a consequence of this bug) is being cleared
while one of such scopes that it would be awesome if SA could somehow
notice that the list of processors was changed in some scope where it
should be left intact and log warning or raise hard error. This way
the reason for error would be much easier to find in this case and
probably also in future.


Regards,
Piotr Dobrogost



--
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.


[sqlalchemy] SQLAlchemy 1.0.12 use inner JOIN instead of LEFT OUTER JOIN

2016-04-25 Thread Alex Dev
Hello,

I have a broken query when migrating from SQLAlchemy 0.9.4 to 1.0.12. It 
seems to be linked to a behavioral change in the ORM 
(http://docs.sqlalchemy.org/en/rel_1_0/changelog/migration_10.html#right-inner-join-nesting-now-the-default-for-joinedload-with-innerjoin-true)

Here is simplified version of the code:

# -*- coding: utf-8 -*-
from sqlalchemy import *
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import *


Base = declarative_base()

_plant_table = Table('plant', Base.metadata,
Column('id', Integer, primary_key=True)
)

_plant_dimensionsseries_table = Table('plant_dimensionsseries', Base.
metadata,
Column('plant_id', Integer, primary_key=True),
Column('dimensionsseriestype_id', Integer, primary_key=True),
ForeignKeyConstraint(['plant_id'], ['plant.id'])
)

_view_plant_dimensionsseries_table = Table('view_plant_dimensionsseries', 
Base.metadata,
Column('plant_id', Integer, primary_key=True),
Column('dimensionsseriestype_id', Integer, primary_key=True),
ForeignKeyConstraint(
['plant_id', 'dimensionsseriestype_id'],
['plant_dimensionsseries.plant_id', 
'plant_dimensionsseries.dimensionsseriestype_id'])
)

class Plant(Base):
__table__ = _plant_table

class PlantDimensionsseries(Base):
__table__ = _plant_dimensionsseries_table
_plant_id = __table__.c.plant_id
_dimensionsseriestype_id = __table__.c.dimensionsseriestype_id

plant = relationship('Plant',
innerjoin=True,
backref=backref('plant_dimensionsseries'))

class PlantDimensionsseriesDataComputed(Base):
__table__ = _view_plant_dimensionsseries_table
_plant_id = __table__.c.plant_id
_dimensionsseriestype_id = __table__.c.dimensionsseriestype_id

# One-to-one relationship
dimensionsseries = relationship('PlantDimensionsseries',
innerjoin=True,
backref=backref('data_computed',
innerjoin=True))

if __name__ == '__main__':

engine = create_engine(
'postgresql://nurseryacme_employee@localhost:5432/nurseryacme')
Session = sessionmaker(bind=engine)
session = Session()

# query 1:
# SQLAlchemy 0.9.4: Correct SQL generated
# SQLAlchemy 1.0.12: Wrong SQL generated, a inner JOIN is used instead 
of a LEFT OUTER JOIN between plant_dimensionsseries and 
view_plant_dimensionsseries
print session.query(Plant).\
outerjoin(Plant.plant_dimensionsseries).\
options(contains_eager(Plant.plant_dimensionsseries)).\
options(joinedload(Plant.plant_dimensionsseries, 
PlantDimensionsseries.data_computed))

# query 2:
# SQLAlchemy 1.0.12: Correct SQL generated
print session.query(Plant).\
outerjoin(Plant.plant_dimensionsseries).\
options(contains_eager(Plant.plant_dimensionsseries)).\
options(joinedload(Plant.plant_dimensionsseries, 
PlantDimensionsseries.data_computed, innerjoin=False))


Result with SQLAlchemy 0.9.4:
# query 1
SELECT ...
FROM plant LEFT OUTER JOIN plant_dimensionsseries ON plant.id = 
plant_dimensionsseries.plant_id *LEFT OUTER JOIN* 
view_plant_dimensionsseries AS view_plant_dimensionsseries_1 ON 
plant_dimensionsseries.plant_id = view_plant_dimensionsseries_1.plant_id 
AND plant_dimensionsseries.dimensionsseriestype_id = 
view_plant_dimensionsseries_1.dimensionsseriestype_id
# query 2
SELECT ...
FROM plant LEFT OUTER JOIN plant_dimensionsseries ON plant.id = 
plant_dimensionsseries.plant_id *LEFT OUTER JOIN* 
view_plant_dimensionsseries AS view_plant_dimensionsseries_1 ON 
plant_dimensionsseries.plant_id = view_plant_dimensionsseries_1.plant_id 
AND plant_dimensionsseries.dimensionsseriestype_id = 
view_plant_dimensionsseries_1.dimensionsseriestype_id


Result with SQLAlchemy 1.0.12:
# query 1
SELECT ...
FROM plant LEFT OUTER JOIN plant_dimensionsseries ON plant.id = 
plant_dimensionsseries.plant_id *JOIN* view_plant_dimensionsseries AS 
view_plant_dimensionsseries_1 ON plant_dimensionsseries.plant_id = 
view_plant_dimensionsseries_1.plant_id AND 
plant_dimensionsseries.dimensionsseriestype_id 
= view_plant_dimensionsseries_1.dimensionsseriestype_id
# query 2
SELECT ...
FROM plant LEFT OUTER JOIN plant_dimensionsseries ON plant.id = 
plant_dimensionsseries.plant_id *LEFT OUTER JOIN* 
view_plant_dimensionsseries AS view_plant_dimensionsseries_1 ON 
plant_dimensionsseries.plant_id = view_plant_dimensionsseries_1.plant_id 
AND plant_dimensionsseries.dimensionsseriestype_id = 
view_plant_dimensionsseries_1.dimensionsseriestype_id

In query 1 under SQLAlchemy 1.0.12, the query discard many rows due the 
JOIN chained to the LEFT OUTER JOIN and this precisely what SQLAlchemy 
wanted to avoid if I refer to the description of "Right-nested inner joins 
available in joined eager loads" 
(http://docs.sqlalchemy.org/en/rel_1_0/changelog/migration_09.html#feature-2976).

I can fix the query 1 by adding the innerjoin=False as in query 

Re: [sqlalchemy] Bug in BufferedColumnResultProxy class?

2016-04-25 Thread Piotr Dobrogost
On Mon, Apr 25, 2016 at 3:23 PM, Mike Bayer  wrote:
>
> OK I can try to work with that but that's a very specific feature, you'd
> need to be using the Query cache extension,

I have problem finding information on "Query cache extension".
Googling for  I get
http://docs.sqlalchemy.org/en/latest/orm/examples.html#module-examples.dogpile_caching
as the first result from SA's docs and
http://stackoverflow.com/questions/204918/does-sqlalchemy-support-caching
as the first result from Stackoverflow.

Is caching using dogpile what you call "Query cache extension"? If so
we don't use it.

>  or using the "compiled_cache"
> option,

There's no "compiled_cache" string in Kotti framework so I guess we
don't use it.

> option, *or* the error is being generated within the context of a Session
> being flushed within an INSERT or UPDATE RETURNING statement, which itself I
> don't think uses the traditional result set model in Oracle anyway.

It's hard to say. Right before printing call stack from line 519 of
result.py there's SELECT logged:
2016-04-25 16:36:02,954 INFO
[sqlalchemy.engine.base.Engine][waitress] SELECT nodes.id AS nodes_id,
nodes.type AS nodes_type, nodes.parent_id AS nodes_parent_id,
nodes.position AS nodes_position, nodes."_acl" AS nodes__acl,
nodes.name AS nodes_name, nodes.title AS nodes_title,
nodes.annotations AS nodes_annotations, nodes.path AS nodes_path
FROM nodes LEFT OUTER JOIN contents ON nodes.id = contents.id LEFT
OUTER JOIN documents ON contents.id = documents.id LEFT OUTER JOIN
files ON contents.id = files.id LEFT OUTER JOIN images ON contents.id
= images.id
WHERE nodes.parent_id IS NULL

and right before the traceback is printed there's rollback being logged:
2016-04-25 16:36:02,975 INFO  [sqlalchemy.engine.base.Engine][waitress] ROLLBACK

however I don't know if it answers your last question.

> any of these true for your case ?  Looks like you haven't provided a stack
> trace for your issue, I'm sure you have that, so send that along.

Please find all information at
https://gist.github.com/piotr-dobrogost/6d57cacb9e77f59748353b2b6c1334d7
Probably you are most interested in the raw log (file named "Full log
from the start of the application (with empty db)"). Please note that
I modified line 519 in result.py by inserting the following code:
import traceback; traceback.print_stack();
to get call stack at this moment of execution.
Also the nature of the bug is such that there's no information in the
log about original problem (not executing some processors which were
registered and executed earlier). I'm guessing there are scopes during
which one should not modify the list of registered processors. If it
happens that this list (as a consequence of this bug) is being cleared
while one of such scopes that it would be awesome if SA could somehow
notice that the list of processors was changed in some scope where it
should be left intact and log warning or raise hard error. This way
the reason for error would be much easier to find in this case and
probably also in future.


Regards,
Piotr Dobrogost

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.


Re: [sqlalchemy] Bug in BufferedColumnResultProxy class?

2016-04-25 Thread Mike Bayer



On 04/25/2016 05:49 AM, Piotr Dobrogost wrote:

On Friday, April 22, 2016 at 5:22:38 PM UTC+2, Mike Bayer wrote:


On 04/22/2016 10:40 AM, Piotr Dobrogost wrote:
 > It seems BufferedColumnResultProxy class (used only in Oracle
dialect)
 > has a bug.

I cannot confirm it, would need a test case.  Here's one that is
extremely simple but exercises the features you describe.   I have no
doubt that what you're seeing is a bug however so I'd need to know what
to add to this test to illustrate the issue:


Thanks for quick response and creation of sample code. We have trouble
adding something that would trigger bug as we don't fully understand
source code of SA and in addition we use other framework (Kotti) which
adds its layer of complexity. Having said that we think that what's
needed to reproduce bug is to modify your test case so that line 519 in
result.py would get hit
(https://github.com/zzzeek/sqlalchemy/blob/rel_1_0_12/lib/sqlalchemy/engine/result.py#L519).


OK I can try to work with that but that's a very specific feature, you'd 
need to be using the Query cache extension, or using the 
"compiled_cache" option, *or* the error is being generated within the 
context of a Session being flushed within an INSERT or UPDATE RETURNING 
statement, which itself I don't think uses the traditional result set 
model in Oracle anyway.   Are any of these true for your case ?  Looks 
like you haven't provided a stack trace for your issue, I'm sure you 
have that, so send that along.







There is no need for you or your developers to attempt to debug deep
issues like these; I can identify them within minutes given a
demonstration of the behavior and I am glad to do this work once an
incorrect behavior is demonstrated.


Thank you very much for offering your help.

The work that you need to do when you encounter stack traces like this
is to isolate the behavior into a simple test case, such as the one I
have above.Using the "divide and conquer" approach, where you begin
with your whole application, then slowly remove pieces of it that
continue to exhibit the bad behavior until you have something entirely
minimal.  The guidelines that I often refer to at
http://stackoverflow.com/help/mcve
 have an excellent description
of this.


Sure thing. The problem in practice is that in order to create "simple
test case" it's often needed to get good understanding of used libraries
at the source code level. The good example is your sample test case
which we don't know how to modify in spite of the fact that we suspect
which specific line needs to be hit.


Regards,
Piotr Dobrogost

--
You received this message because you are subscribed to the Google
Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to sqlalchemy+unsubscr...@googlegroups.com
.
To post to this group, send email to sqlalchemy@googlegroups.com
.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.


--
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.


Re: [sqlalchemy] Bug in BufferedColumnResultProxy class?

2016-04-25 Thread Piotr Dobrogost
On Friday, April 22, 2016 at 5:22:38 PM UTC+2, Mike Bayer wrote:
>
>
> On 04/22/2016 10:40 AM, Piotr Dobrogost wrote: 
> > It seems BufferedColumnResultProxy class (used only in Oracle dialect) 
> > has a bug. 
>
 

> I cannot confirm it, would need a test case.  Here's one that is 
> extremely simple but exercises the features you describe.   I have no 
> doubt that what you're seeing is a bug however so I'd need to know what 
> to add to this test to illustrate the issue: 
>

Thanks for quick response and creation of sample code. We have trouble 
adding something that would trigger bug as we don't fully understand source 
code of SA and in addition we use other framework (Kotti) which adds its 
layer of complexity. Having said that we think that what's needed to 
reproduce bug is to modify your test case so that line 519 in result.py 
would get hit 
(https://github.com/zzzeek/sqlalchemy/blob/rel_1_0_12/lib/sqlalchemy/engine/result.py#L519).

There is no need for you or your developers to attempt to debug deep 
> issues like these; I can identify them within minutes given a 
> demonstration of the behavior and I am glad to do this work once an 
> incorrect behavior is demonstrated. 
>

Thank you very much for offering your help. 

The work that you need to do when you encounter stack traces like this 
> is to isolate the behavior into a simple test case, such as the one I 
> have above.Using the "divide and conquer" approach, where you begin 
> with your whole application, then slowly remove pieces of it that 
> continue to exhibit the bad behavior until you have something entirely 
> minimal.  The guidelines that I often refer to at 
> http://stackoverflow.com/help/mcve have an excellent description of this. 
>

Sure thing. The problem in practice is that in order to create "simple test 
case" it's often needed to get good understanding of used libraries at the 
source code level. The good example is your sample test case which we don't 
know how to modify in spite of the fact that we suspect which specific line 
needs to be hit.


Regards,
Piotr Dobrogost

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.