On Mon, Jan 15, 2018 at 4:47 PM, Lukas Barth <[email protected]> wrote:
> Hi,
>
> first things first: I put a complete notebook showing the problem at [0],
> I'll only post the excerpts I deem useful below. The exact same code, this
> time working, with an sqlite in-memory database is at [1]. I'm using
> SQLAlchemy version 1.2.1, python 3.5, mysqlclient version 1.3.12 and MariaDB
> version 10.1.29.
>
> I have this persistent class:
> class Test(Base):
> __tablename__ = "test"
>
> id = Column(Integer, primary_key=True)
> timestamp = Column(Integer)
> value = Column(Integer)
>
> Resulting in this schema:
>
>
> CREATE TABLE test (
> id INTEGER NOT NULL,
> timestamp INTEGER,
> value INTEGER,
> PRIMARY KEY (id)
> )
>
> What I want to achieve is to delete all objects from Test, which have a
> timestamp less than some value (five in my example code), but which are
> *not* the most recent entries for their respective value. For example: If
> there is a Test object (call it A) with timestamp = 4 and value = 1234, and
> all other (if there exist any at all…) Test objects with value = 1234 have a
> timestamp of less than 4, then I want A to *not* be deleted, even though its
> timestamp is less than 5.
>
>
> I figured I go with two subqueries: The first subquery uses a group_by on
> value, and max() on timestamp. These are the objects to be protected from
> deletion. The second subquery retrieves the ids of the objects in subquery
> 1. Then I can issue a delete statement, filtering so that the ids of the
> objects to be deleted are not in subquery 2.
>
> Here's the first subquery:
> protected_items = session.query(Test.id, Test.value,
> func.max(Test.timestamp))\
> .group_by(Test.value).subquery(name='subquery1')
>
> And the second one:
> protected_items_ids = session.query(Test.id).join(
> protected_items, Test.id == protected_items.c.id)\
> .subquery(name='subquery2')
>
> And finally, the deletion:
>
> deleted_rows = session.query(Test) \
> .filter((Test.timestamp < 5)) \
> .filter(~Test.id.in_(
> protected_items_ids)) \
> .delete(synchronize_session=False)
>
>
> This works great when using it with a sqlite database. However, it gives an
> OperationalError when using it with MariaDB. See the bottom of [0] for the
> full stack trace. This is the error message I get from the MariaDB server:
>
> OperationalError: (_mysql_exceptions.OperationalError)
>
> (1093, "Table 'test' is specified twice, both as a target for 'DELETE' and
> as a separate source for data")
>
> [SQL: 'DELETE FROM test WHERE test.timestamp < %s AND test.id NOT IN (SELECT
> test.id \nFROM test INNER JOIN (SELECT test.id AS id, test.value AS value,
> max(test.timestamp) AS max_1 \nFROM test GROUP BY test.value) AS subquery1
> ON test.id = subquery1.id)'] [parameters: (5,)] (Background on this error
> at: http://sqlalche.me/e/e3q8)
>
>
> Doing dome digging, I think one needs to specify a name for the innermost
> subquery (subquery 1), to make MariaDB create a temporary table for this. At
> least that's what [2] suggests. I would have assumed the inner subquery to
> be named ("… as subquery1"), since I specified "name = subquery1", but that
> doesn't seem to be
>
>>>> print(protected_items)
> SELECT test.id, test.value, max(test.timestamp) AS max_1
> FROM test GROUP BY test.value
doing a print() on the subquery object itself does not show it in
context, as the subquery / alias name only comes out if you select
FROM the object. If you look at the SQL shown in the error message,
it is rendering"AS subquery1", so that is not the problem here.
the query as written doesn't actually seem to do anything with the
subqueries, because you are putting this max(test.timestamp) in the
columns clause of an embedded subquery, but you aren't filtering on it
or doing anything at all with it, it's being thrown away. The query
I think you need is:
latest_items = session.query(
Test.value, func.max(Test.timestamp).label('latest_timestamp')
).group_by(Test.value).subquery(name='subquery1')
latest_item_ids = session.query(Test.id).filter(
Test.timestamp == latest_items.c.latest_timestamp).\
filter(Test.value == latest_items.c.value)
deleted_rows = session.query(Test) \
.filter((Test.timestamp < 5)) \
.filter(~Test.id.in_(latest_item_ids)) \
.delete(synchronize_session=False)
generating:
DELETE FROM test WHERE test.timestamp < %s AND test.id NOT IN (SELECT
test.id AS test_id
FROM (SELECT test.value AS value, max(test.timestamp) AS latest_timestamp
FROM test GROUP BY test.value) AS subquery1
WHERE test.timestamp = subquery1.latest_timestamp AND test.value =
subquery1.value)
which is accepted by the database.
>
> … I guess that's a bug? At least I would assume that this should not result
> in an OperationalError, especially since it works with sqlite?
>
> Thanks for any help,
>
> Lukas
>
> [0] https://tinloaf.de/~tinloaf/sqlalchemy/mariadb.html
> [1] https://tinloaf.de/~tinloaf/sqlalchemy/sqlite.html
> [2]
> https://stackoverflow.com/questions/5816840/delete-i-cant-specify-target-table
>
> --
> SQLAlchemy -
> The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full
> description.
> ---
> You received this message because you are subscribed to the Google Groups
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/d/optout.
--
SQLAlchemy -
The Python SQL Toolkit and Object Relational Mapper
http://www.sqlalchemy.org/
To post example code, please provide an MCVE: Minimal, Complete, and Verifiable
Example. See http://stackoverflow.com/help/mcve for a full description.
---
You received this message because you are subscribed to the Google Groups
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.