Re: [sqlalchemy] SQLAlchemy does not properly create temporary table for subqueries in MariaDB

2018-01-17 Thread Lukas Barth
Thanks a lot! So, if I get this correctly, the difference is that you 
assigned a label to the timestamp, and that you're doing "~exists()" 
instead of "~in()" in the delete clause, right?

I'll have to adapt this a bit to get it into my (obviously a bit more 
complicated) actual code, but thanks for the help so far!

Lukas

On Tuesday, January 16, 2018 at 12:14:14 AM UTC+1, Mike Bayer wrote:
>
> On Mon, Jan 15, 2018 at 6:09 PM, Mike Bayer  > wrote: 
> > On Mon, Jan 15, 2018 at 5:32 PM, Mike Bayer  > wrote: 
> > 
> > that's not quite right, because I'm not selecting from Test.  Adding 
> > the JOIN back in, I've tried aliasing Test everywhere and it still 
> > produces the error.  MySQL is really bad at subqueries.  give me a 
> > minute to find a query that works here. 
>
> OK try this: 
>
> latest_items = session.query( 
> Test.value, func.max(Test.timestamp).label('latest_timestamp') 
> ).group_by(Test.value).subquery(name='subquery1') 
>
>
> deleted_rows = ( 
> session.query(Test) 
> .filter((Test.timestamp < 5)) 
> .filter( 
> ~exists(). 
> where(Test.value == latest_items.c.value). 
> where(Test.timestamp == latest_items.c.latest_timestamp) 
> ).delete(synchronize_session=False) 
> ) 
>
>
> output: 
>
> DELETE FROM test WHERE test.timestamp < %s AND NOT (EXISTS (SELECT * 
> FROM (SELECT test.value AS value, max(test.timestamp) AS latest_timestamp 
> FROM test GROUP BY test.value) AS subquery1 
> WHERE test.value = subquery1.value AND test.timestamp = 
> subquery1.latest_timestamp)) 
>
>
>
> > 
> > 
> > 
> > 
> > 
> >> 
> >> 
> >> 
> >> 
> >>> 
> >>> … I guess that's a bug? At least I would assume that this should not 
> result 
> >>> in an OperationalError, especially since it works with sqlite? 
> >>> 
> >>> Thanks for any help, 
> >>> 
> >>> Lukas 
> >>> 
> >>> [0] https://tinloaf.de/~tinloaf/sqlalchemy/mariadb.html 
> >>> [1] https://tinloaf.de/~tinloaf/sqlalchemy/sqlite.html 
> >>> [2] 
> >>> 
> https://stackoverflow.com/questions/5816840/delete-i-cant-specify-target-table
>  
> >>> 
> >>> -- 
> >>> SQLAlchemy - 
> >>> The Python SQL Toolkit and Object Relational Mapper 
> >>> 
> >>> http://www.sqlalchemy.org/ 
> >>> 
> >>> To post example code, please provide an MCVE: Minimal, Complete, and 
> >>> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> >>> description. 
> >>> --- 
> >>> You received this message because you are subscribed to the Google 
> Groups 
> >>> "sqlalchemy" group. 
> >>> To unsubscribe from this group and stop receiving emails from it, send 
> an 
> >>> email to sqlalchemy+...@googlegroups.com . 
> >>> To post to this group, send email to sqlal...@googlegroups.com 
> . 
> >>> Visit this group at https://groups.google.com/group/sqlalchemy. 
> >>> For more options, visit https://groups.google.com/d/optout. 
>

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.


Re: [sqlalchemy] SQLAlchemy does not properly create temporary table for subqueries in MariaDB

2018-01-15 Thread Mike Bayer
On Mon, Jan 15, 2018 at 6:09 PM, Mike Bayer  wrote:
> On Mon, Jan 15, 2018 at 5:32 PM, Mike Bayer  wrote:
>
> that's not quite right, because I'm not selecting from Test.  Adding
> the JOIN back in, I've tried aliasing Test everywhere and it still
> produces the error.  MySQL is really bad at subqueries.  give me a
> minute to find a query that works here.

OK try this:

latest_items = session.query(
Test.value, func.max(Test.timestamp).label('latest_timestamp')
).group_by(Test.value).subquery(name='subquery1')


deleted_rows = (
session.query(Test)
.filter((Test.timestamp < 5))
.filter(
~exists().
where(Test.value == latest_items.c.value).
where(Test.timestamp == latest_items.c.latest_timestamp)
).delete(synchronize_session=False)
)


output:

DELETE FROM test WHERE test.timestamp < %s AND NOT (EXISTS (SELECT *
FROM (SELECT test.value AS value, max(test.timestamp) AS latest_timestamp
FROM test GROUP BY test.value) AS subquery1
WHERE test.value = subquery1.value AND test.timestamp =
subquery1.latest_timestamp))



>
>
>
>
>
>>
>>
>>
>>
>>>
>>> … I guess that's a bug? At least I would assume that this should not result
>>> in an OperationalError, especially since it works with sqlite?
>>>
>>> Thanks for any help,
>>>
>>> Lukas
>>>
>>> [0] https://tinloaf.de/~tinloaf/sqlalchemy/mariadb.html
>>> [1] https://tinloaf.de/~tinloaf/sqlalchemy/sqlite.html
>>> [2]
>>> https://stackoverflow.com/questions/5816840/delete-i-cant-specify-target-table
>>>
>>> --
>>> SQLAlchemy -
>>> The Python SQL Toolkit and Object Relational Mapper
>>>
>>> http://www.sqlalchemy.org/
>>>
>>> To post example code, please provide an MCVE: Minimal, Complete, and
>>> Verifiable Example. See http://stackoverflow.com/help/mcve for a full
>>> description.
>>> ---
>>> You received this message because you are subscribed to the Google Groups
>>> "sqlalchemy" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an
>>> email to sqlalchemy+unsubscr...@googlegroups.com.
>>> To post to this group, send email to sqlalchemy@googlegroups.com.
>>> Visit this group at https://groups.google.com/group/sqlalchemy.
>>> For more options, visit https://groups.google.com/d/optout.

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.


Re: [sqlalchemy] SQLAlchemy does not properly create temporary table for subqueries in MariaDB

2018-01-15 Thread Mike Bayer
On Mon, Jan 15, 2018 at 5:32 PM, Mike Bayer  wrote:
> On Mon, Jan 15, 2018 at 4:47 PM, Lukas Barth  wrote:
>> Hi,
>>
>> first things first: I put a complete notebook showing the problem at [0],
>> I'll only post the excerpts I deem useful below. The exact same code, this
>> time working, with an sqlite in-memory database is at [1]. I'm using
>> SQLAlchemy version 1.2.1, python 3.5, mysqlclient version 1.3.12 and MariaDB
>> version 10.1.29.
>>
>> I have this persistent class:
>> class Test(Base):
>> __tablename__ = "test"
>>
>> id = Column(Integer, primary_key=True)
>> timestamp = Column(Integer)
>> value = Column(Integer)
>>
>> Resulting in this schema:
>>
>>
>> CREATE TABLE test (
>>id INTEGER NOT NULL,
>>timestamp INTEGER,
>>value INTEGER,
>>PRIMARY KEY (id)
>> )
>>
>> What I want to achieve is to delete all objects from Test, which have a
>> timestamp less than some value (five in my example code), but which are
>> *not* the most recent entries for their respective value. For example: If
>> there is a Test object (call it A) with timestamp = 4 and value = 1234, and
>> all other (if there exist any at all…) Test objects with value = 1234 have a
>> timestamp of less than 4, then I want A to *not* be deleted, even though its
>> timestamp is less than 5.
>>
>>
>> I figured I go with two subqueries: The first subquery uses a group_by on
>> value, and max() on timestamp. These are the objects to be protected from
>> deletion. The second subquery retrieves the ids of the objects in subquery
>> 1. Then I can issue a delete statement, filtering so that the ids of the
>> objects to be deleted are not in subquery 2.
>>
>> Here's the first subquery:
>> protected_items = session.query(Test.id, Test.value,
>> func.max(Test.timestamp))\
>>  .group_by(Test.value).subquery(name='subquery1')
>>
>> And the second one:
>> protected_items_ids = session.query(Test.id).join(
>> protected_items, Test.id == protected_items.c.id)\
>> .subquery(name='subquery2')
>>
>> And finally, the deletion:
>>
>> deleted_rows = session.query(Test) \
>>   .filter((Test.timestamp < 5)) \
>>   .filter(~Test.id.in_(
>>   protected_items_ids)) \
>>   .delete(synchronize_session=False)
>>
>>
>> This works great when using it with a sqlite database. However, it gives an
>> OperationalError when using it with MariaDB. See the bottom of [0] for the
>> full stack trace. This is the error message I get from the MariaDB server:
>>
>> OperationalError: (_mysql_exceptions.OperationalError)
>>
>> (1093, "Table 'test' is specified twice, both as a target for 'DELETE' and
>> as a separate source for data")
>>
>> [SQL: 'DELETE FROM test WHERE test.timestamp < %s AND test.id NOT IN (SELECT
>> test.id \nFROM test INNER JOIN (SELECT test.id AS id, test.value AS value,
>> max(test.timestamp) AS max_1 \nFROM test GROUP BY test.value) AS subquery1
>> ON test.id = subquery1.id)'] [parameters: (5,)] (Background on this error
>> at: http://sqlalche.me/e/e3q8)
>>
>>
>> Doing dome digging, I think one needs to specify a name for the innermost
>> subquery (subquery 1), to make MariaDB create a temporary table for this. At
>> least that's what [2] suggests. I would have assumed the inner subquery to
>> be named ("… as subquery1"), since I specified "name = subquery1", but that
>> doesn't seem to be
>>
> print(protected_items)
>> SELECT test.id, test.value, max(test.timestamp) AS max_1
>> FROM test GROUP BY test.value
>
> doing a print() on the subquery object itself does not show it in
> context, as the subquery / alias name only comes out if you select
> FROM the object.  If you look at the SQL shown in the error message,
> it is rendering"AS subquery1", so that is not the problem here.
>
> the query as written doesn't actually seem to do anything with the
> subqueries, because you are putting this max(test.timestamp) in the
> columns clause of an embedded subquery, but you aren't filtering on it
> or doing anything at all with it, it's being thrown away.The query
> I think you need is:
>
> latest_items = session.query(
> Test.value, func.max(Test.timestamp).label('latest_timestamp')
> ).group_by(Test.value).subquery(name='subquery1')
>
> latest_item_ids = session.query(Test.id).filter(
> Test.timestamp == latest_items.c.latest_timestamp).\
> filter(Test.value == latest_items.c.value)
>
> deleted_rows = session.query(Test) \
>   .filter((Test.timestamp < 5)) \
>   .filter(~Test.id.in_(latest_item_ids)) \
>   .delete(synchronize_session=False)
>
>
>
> generating:
>
> DELETE FROM test WHERE test.timestamp < %s AND test.id NOT IN (SELECT
> test.id AS test_id
> FROM (SELECT test.value AS value, max(test.timestamp) AS latest_timestamp
> FROM test GROUP BY test.value) AS subquery1
> WHERE test.timestamp = subquery1

Re: [sqlalchemy] SQLAlchemy does not properly create temporary table for subqueries in MariaDB

2018-01-15 Thread Mike Bayer
On Mon, Jan 15, 2018 at 4:47 PM, Lukas Barth  wrote:
> Hi,
>
> first things first: I put a complete notebook showing the problem at [0],
> I'll only post the excerpts I deem useful below. The exact same code, this
> time working, with an sqlite in-memory database is at [1]. I'm using
> SQLAlchemy version 1.2.1, python 3.5, mysqlclient version 1.3.12 and MariaDB
> version 10.1.29.
>
> I have this persistent class:
> class Test(Base):
> __tablename__ = "test"
>
> id = Column(Integer, primary_key=True)
> timestamp = Column(Integer)
> value = Column(Integer)
>
> Resulting in this schema:
>
>
> CREATE TABLE test (
>id INTEGER NOT NULL,
>timestamp INTEGER,
>value INTEGER,
>PRIMARY KEY (id)
> )
>
> What I want to achieve is to delete all objects from Test, which have a
> timestamp less than some value (five in my example code), but which are
> *not* the most recent entries for their respective value. For example: If
> there is a Test object (call it A) with timestamp = 4 and value = 1234, and
> all other (if there exist any at all…) Test objects with value = 1234 have a
> timestamp of less than 4, then I want A to *not* be deleted, even though its
> timestamp is less than 5.
>
>
> I figured I go with two subqueries: The first subquery uses a group_by on
> value, and max() on timestamp. These are the objects to be protected from
> deletion. The second subquery retrieves the ids of the objects in subquery
> 1. Then I can issue a delete statement, filtering so that the ids of the
> objects to be deleted are not in subquery 2.
>
> Here's the first subquery:
> protected_items = session.query(Test.id, Test.value,
> func.max(Test.timestamp))\
>  .group_by(Test.value).subquery(name='subquery1')
>
> And the second one:
> protected_items_ids = session.query(Test.id).join(
> protected_items, Test.id == protected_items.c.id)\
> .subquery(name='subquery2')
>
> And finally, the deletion:
>
> deleted_rows = session.query(Test) \
>   .filter((Test.timestamp < 5)) \
>   .filter(~Test.id.in_(
>   protected_items_ids)) \
>   .delete(synchronize_session=False)
>
>
> This works great when using it with a sqlite database. However, it gives an
> OperationalError when using it with MariaDB. See the bottom of [0] for the
> full stack trace. This is the error message I get from the MariaDB server:
>
> OperationalError: (_mysql_exceptions.OperationalError)
>
> (1093, "Table 'test' is specified twice, both as a target for 'DELETE' and
> as a separate source for data")
>
> [SQL: 'DELETE FROM test WHERE test.timestamp < %s AND test.id NOT IN (SELECT
> test.id \nFROM test INNER JOIN (SELECT test.id AS id, test.value AS value,
> max(test.timestamp) AS max_1 \nFROM test GROUP BY test.value) AS subquery1
> ON test.id = subquery1.id)'] [parameters: (5,)] (Background on this error
> at: http://sqlalche.me/e/e3q8)
>
>
> Doing dome digging, I think one needs to specify a name for the innermost
> subquery (subquery 1), to make MariaDB create a temporary table for this. At
> least that's what [2] suggests. I would have assumed the inner subquery to
> be named ("… as subquery1"), since I specified "name = subquery1", but that
> doesn't seem to be
>
 print(protected_items)
> SELECT test.id, test.value, max(test.timestamp) AS max_1
> FROM test GROUP BY test.value

doing a print() on the subquery object itself does not show it in
context, as the subquery / alias name only comes out if you select
FROM the object.  If you look at the SQL shown in the error message,
it is rendering"AS subquery1", so that is not the problem here.

the query as written doesn't actually seem to do anything with the
subqueries, because you are putting this max(test.timestamp) in the
columns clause of an embedded subquery, but you aren't filtering on it
or doing anything at all with it, it's being thrown away.The query
I think you need is:

latest_items = session.query(
Test.value, func.max(Test.timestamp).label('latest_timestamp')
).group_by(Test.value).subquery(name='subquery1')

latest_item_ids = session.query(Test.id).filter(
Test.timestamp == latest_items.c.latest_timestamp).\
filter(Test.value == latest_items.c.value)

deleted_rows = session.query(Test) \
  .filter((Test.timestamp < 5)) \
  .filter(~Test.id.in_(latest_item_ids)) \
  .delete(synchronize_session=False)



generating:

DELETE FROM test WHERE test.timestamp < %s AND test.id NOT IN (SELECT
test.id AS test_id
FROM (SELECT test.value AS value, max(test.timestamp) AS latest_timestamp
FROM test GROUP BY test.value) AS subquery1
WHERE test.timestamp = subquery1.latest_timestamp AND test.value =
subquery1.value)


which is accepted by the database.




>
> … I guess that's a bug? At least I would assume that this should not result
> in an OperationalError,

[sqlalchemy] SQLAlchemy does not properly create temporary table for subqueries in MariaDB

2018-01-15 Thread Lukas Barth
Hi,

first things first: I put a complete notebook showing the problem at [0], 
I'll only post the excerpts I deem useful below. The exact same code, this 
time working, with an sqlite in-memory database is at [1]. I'm using 
SQLAlchemy version 1.2.1, python 3.5, mysqlclient version 1.3.12 and 
MariaDB version 10.1.29.

I have this persistent class:
class Test(Base):
__tablename__ = "test"

id = Column(Integer, primary_key=True)
timestamp = Column(Integer)
value = Column(Integer)

Resulting in this schema:
 

CREATE TABLE test ( 
   id INTEGER NOT NULL, 
   timestamp INTEGER, 
   value INTEGER, 
   PRIMARY KEY (id) 
)

What I want to achieve is to delete all objects from Test, which have a 
timestamp less than some value (five in my example code), but which are *not* 
the most recent entries for their respective value. For example: If there is a 
Test object (call it A) with timestamp = 4 and value = 1234, and all other (if 
there exist any at all…) Test objects with value = 1234 have a timestamp of 
less than 4, then I want A to *not* be deleted, even though its timestamp is 
less than 5.


I figured I go with two subqueries: The first subquery uses a group_by on 
value, and max() on timestamp. These are the objects to be protected from 
deletion. The second subquery retrieves the ids of the objects in subquery 
1. Then I can issue a delete statement, filtering so that the ids of the 
objects to be deleted are not in subquery 2.

Here's the first subquery:
protected_items = session.query(Test.id, Test.value,
func.max(Test.timestamp))\
 .group_by(Test.value).subquery(name='subquery1')

And the second one:
protected_items_ids = session.query(Test.id).join(
protected_items, Test.id == protected_items.c.id)\
.subquery(name='subquery2')

And finally, the deletion:

deleted_rows = session.query(Test) \
  .filter((Test.timestamp < 5)) \
  .filter(~Test.id.in_(
  protected_items_ids)) \
  .delete(synchronize_session=False)


This works great when using it with a sqlite database. However, it gives an 
OperationalError when using it with MariaDB. See the bottom of [0] for the 
full stack trace. This is the error message I get from the MariaDB server:

OperationalError: (_mysql_exceptions.OperationalError) 

(1093, "Table 'test' is specified twice, both as a target for 'DELETE' and as a 
separate source for data") 

[SQL: 'DELETE FROM test WHERE test.timestamp < %s AND test.id NOT IN (SELECT 
test.id \nFROM test INNER JOIN (SELECT test.id AS id, test.value AS value, 
max(test.timestamp) AS max_1 \nFROM test GROUP BY test.value) AS subquery1 ON 
test.id = subquery1.id)'] [parameters: (5,)] (Background on this error at: 
http://sqlalche.me/e/e3q8)


Doing dome digging, I think one needs to specify a name for the innermost 
subquery (subquery 1), to make MariaDB create a temporary table for this. 
At least that's what [2] suggests. I would have assumed the inner subquery 
to be named ("… as subquery1"), since I specified "name = subquery1", but 
that doesn't seem to be 

>>> print(protected_items)
SELECT test.id, test.value, max(test.timestamp) AS max_1 
FROM test GROUP BY test.value

… I guess that's a bug? At least I would assume that this should not result 
in an OperationalError, especially since it works with sqlite?

Thanks for any help,

Lukas

[0] https://tinloaf.de/~tinloaf/sqlalchemy/mariadb.html 
[1] https://tinloaf.de/~tinloaf/sqlalchemy/sqlite.html
[2] 
https://stackoverflow.com/questions/5816840/delete-i-cant-specify-target-table

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.