On Mon, Jan 15, 2018 at 5:32 PM, Mike Bayer <mike...@zzzcomputing.com> wrote:
> On Mon, Jan 15, 2018 at 4:47 PM, Lukas Barth <tinlo...@gmail.com> wrote:
>> Hi,
>>
>> first things first: I put a complete notebook showing the problem at [0],
>> I'll only post the excerpts I deem useful below. The exact same code, this
>> time working, with an sqlite in-memory database is at [1]. I'm using
>> SQLAlchemy version 1.2.1, python 3.5, mysqlclient version 1.3.12 and MariaDB
>> version 10.1.29.
>>
>> I have this persistent class:
>> class Test(Base):
>>     __tablename__ = "test"
>>
>>     id = Column(Integer, primary_key=True)
>>     timestamp = Column(Integer)
>>     value = Column(Integer)
>>
>> Resulting in this schema:
>>
>>
>> CREATE TABLE test (
>>    id INTEGER NOT NULL,
>>    timestamp INTEGER,
>>    value INTEGER,
>>    PRIMARY KEY (id)
>> )
>>
>> What I want to achieve is to delete all objects from Test, which have a
>> timestamp less than some value (five in my example code), but which are
>> *not* the most recent entries for their respective value. For example: If
>> there is a Test object (call it A) with timestamp = 4 and value = 1234, and
>> all other (if there exist any at all…) Test objects with value = 1234 have a
>> timestamp of less than 4, then I want A to *not* be deleted, even though its
>> timestamp is less than 5.
>>
>>
>> I figured I go with two subqueries: The first subquery uses a group_by on
>> value, and max() on timestamp. These are the objects to be protected from
>> deletion. The second subquery retrieves the ids of the objects in subquery
>> 1. Then I can issue a delete statement, filtering so that the ids of the
>> objects to be deleted are not in subquery 2.
>>
>> Here's the first subquery:
>> protected_items = session.query(Test.id, Test.value,
>>                                 func.max(Test.timestamp))\
>>                          .group_by(Test.value).subquery(name='subquery1')
>>
>> And the second one:
>> protected_items_ids = session.query(Test.id).join(
>>         protected_items, Test.id == protected_items.c.id)\
>>         .subquery(name='subquery2')
>>
>> And finally, the deletion:
>>
>> deleted_rows = session.query(Test) \
>>                       .filter((Test.timestamp < 5)) \
>>                       .filter(~Test.id.in_(
>>                           protected_items_ids)) \
>>                       .delete(synchronize_session=False)
>>
>>
>> This works great when using it with a sqlite database. However, it gives an
>> OperationalError when using it with MariaDB. See the bottom of [0] for the
>> full stack trace. This is the error message I get from the MariaDB server:
>>
>> OperationalError: (_mysql_exceptions.OperationalError)
>>
>> (1093, "Table 'test' is specified twice, both as a target for 'DELETE' and
>> as a separate source for data")
>>
>> [SQL: 'DELETE FROM test WHERE test.timestamp < %s AND test.id NOT IN (SELECT
>> test.id \nFROM test INNER JOIN (SELECT test.id AS id, test.value AS value,
>> max(test.timestamp) AS max_1 \nFROM test GROUP BY test.value) AS subquery1
>> ON test.id = subquery1.id)'] [parameters: (5,)] (Background on this error
>> at: http://sqlalche.me/e/e3q8)
>>
>>
>> Doing dome digging, I think one needs to specify a name for the innermost
>> subquery (subquery 1), to make MariaDB create a temporary table for this. At
>> least that's what [2] suggests. I would have assumed the inner subquery to
>> be named ("… as subquery1"), since I specified "name = subquery1", but that
>> doesn't seem to be
>>
>>>>> print(protected_items)
>> SELECT test.id, test.value, max(test.timestamp) AS max_1
>> FROM test GROUP BY test.value
>
> doing a print() on the subquery object itself does not show it in
> context, as the subquery / alias name only comes out if you select
> FROM the object.  If you look at the SQL shown in the error message,
> it is rendering"AS subquery1", so that is not the problem here.
>
> the query as written doesn't actually seem to do anything with the
> subqueries, because you are putting this max(test.timestamp) in the
> columns clause of an embedded subquery, but you aren't filtering on it
> or doing anything at all with it, it's being thrown away.    The query
> I think you need is:
>
> latest_items = session.query(
>     Test.value, func.max(Test.timestamp).label('latest_timestamp')
> ).group_by(Test.value).subquery(name='subquery1')
>
> latest_item_ids = session.query(Test.id).filter(
>     Test.timestamp == latest_items.c.latest_timestamp).\
>     filter(Test.value == latest_items.c.value)
>
> deleted_rows = session.query(Test) \
>                       .filter((Test.timestamp < 5)) \
>                       .filter(~Test.id.in_(latest_item_ids)) \
>                       .delete(synchronize_session=False)
>
>
>
> generating:
>
> DELETE FROM test WHERE test.timestamp < %s AND test.id NOT IN (SELECT
> test.id AS test_id
> FROM (SELECT test.value AS value, max(test.timestamp) AS latest_timestamp
> FROM test GROUP BY test.value) AS subquery1
> WHERE test.timestamp = subquery1.latest_timestamp AND test.value =
> subquery1.value)
>
>
> which is accepted by the database.

that's not quite right, because I'm not selecting from Test.  Adding
the JOIN back in, I've tried aliasing Test everywhere and it still
produces the error.  MySQL is really bad at subqueries.  give me a
minute to find a query that works here.





>
>
>
>
>>
>> … I guess that's a bug? At least I would assume that this should not result
>> in an OperationalError, especially since it works with sqlite?
>>
>> Thanks for any help,
>>
>> Lukas
>>
>> [0] https://tinloaf.de/~tinloaf/sqlalchemy/mariadb.html
>> [1] https://tinloaf.de/~tinloaf/sqlalchemy/sqlite.html
>> [2]
>> https://stackoverflow.com/questions/5816840/delete-i-cant-specify-target-table
>>
>> --
>> SQLAlchemy -
>> The Python SQL Toolkit and Object Relational Mapper
>>
>> http://www.sqlalchemy.org/
>>
>> To post example code, please provide an MCVE: Minimal, Complete, and
>> Verifiable Example. See http://stackoverflow.com/help/mcve for a full
>> description.
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "sqlalchemy" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to sqlalchemy+unsubscr...@googlegroups.com.
>> To post to this group, send email to sqlalchemy@googlegroups.com.
>> Visit this group at https://groups.google.com/group/sqlalchemy.
>> For more options, visit https://groups.google.com/d/optout.

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Reply via email to