Re: [sqlalchemy] SQLAlchemy does not properly create temporary table for subqueries in MariaDB
Thanks a lot! So, if I get this correctly, the difference is that you assigned a label to the timestamp, and that you're doing "~exists()" instead of "~in()" in the delete clause, right? I'll have to adapt this a bit to get it into my (obviously a bit more complicated) actual code, but thanks for the help so far! Lukas On Tuesday, January 16, 2018 at 12:14:14 AM UTC+1, Mike Bayer wrote: > > On Mon, Jan 15, 2018 at 6:09 PM, Mike Bayer > wrote: > > On Mon, Jan 15, 2018 at 5:32 PM, Mike Bayer > wrote: > > > > that's not quite right, because I'm not selecting from Test. Adding > > the JOIN back in, I've tried aliasing Test everywhere and it still > > produces the error. MySQL is really bad at subqueries. give me a > > minute to find a query that works here. > > OK try this: > > latest_items = session.query( > Test.value, func.max(Test.timestamp).label('latest_timestamp') > ).group_by(Test.value).subquery(name='subquery1') > > > deleted_rows = ( > session.query(Test) > .filter((Test.timestamp < 5)) > .filter( > ~exists(). > where(Test.value == latest_items.c.value). > where(Test.timestamp == latest_items.c.latest_timestamp) > ).delete(synchronize_session=False) > ) > > > output: > > DELETE FROM test WHERE test.timestamp < %s AND NOT (EXISTS (SELECT * > FROM (SELECT test.value AS value, max(test.timestamp) AS latest_timestamp > FROM test GROUP BY test.value) AS subquery1 > WHERE test.value = subquery1.value AND test.timestamp = > subquery1.latest_timestamp)) > > > > > > > > > > > > > > >> > >> > >> > >> > >>> > >>> … I guess that's a bug? At least I would assume that this should not > result > >>> in an OperationalError, especially since it works with sqlite? > >>> > >>> Thanks for any help, > >>> > >>> Lukas > >>> > >>> [0] https://tinloaf.de/~tinloaf/sqlalchemy/mariadb.html > >>> [1] https://tinloaf.de/~tinloaf/sqlalchemy/sqlite.html > >>> [2] > >>> > https://stackoverflow.com/questions/5816840/delete-i-cant-specify-target-table > > >>> > >>> -- > >>> SQLAlchemy - > >>> The Python SQL Toolkit and Object Relational Mapper > >>> > >>> http://www.sqlalchemy.org/ > >>> > >>> To post example code, please provide an MCVE: Minimal, Complete, and > >>> Verifiable Example. See http://stackoverflow.com/help/mcve for a full > >>> description. > >>> --- > >>> You received this message because you are subscribed to the Google > Groups > >>> "sqlalchemy" group. > >>> To unsubscribe from this group and stop receiving emails from it, send > an > >>> email to sqlalchemy+...@googlegroups.com . > >>> To post to this group, send email to sqlal...@googlegroups.com > . > >>> Visit this group at https://groups.google.com/group/sqlalchemy. > >>> For more options, visit https://groups.google.com/d/optout. > -- SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper http://www.sqlalchemy.org/ To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description. --- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at https://groups.google.com/group/sqlalchemy. For more options, visit https://groups.google.com/d/optout.
Re: [sqlalchemy] SQLAlchemy does not properly create temporary table for subqueries in MariaDB
On Mon, Jan 15, 2018 at 6:09 PM, Mike Bayer wrote: > On Mon, Jan 15, 2018 at 5:32 PM, Mike Bayer wrote: > > that's not quite right, because I'm not selecting from Test. Adding > the JOIN back in, I've tried aliasing Test everywhere and it still > produces the error. MySQL is really bad at subqueries. give me a > minute to find a query that works here. OK try this: latest_items = session.query( Test.value, func.max(Test.timestamp).label('latest_timestamp') ).group_by(Test.value).subquery(name='subquery1') deleted_rows = ( session.query(Test) .filter((Test.timestamp < 5)) .filter( ~exists(). where(Test.value == latest_items.c.value). where(Test.timestamp == latest_items.c.latest_timestamp) ).delete(synchronize_session=False) ) output: DELETE FROM test WHERE test.timestamp < %s AND NOT (EXISTS (SELECT * FROM (SELECT test.value AS value, max(test.timestamp) AS latest_timestamp FROM test GROUP BY test.value) AS subquery1 WHERE test.value = subquery1.value AND test.timestamp = subquery1.latest_timestamp)) > > > > > >> >> >> >> >>> >>> … I guess that's a bug? At least I would assume that this should not result >>> in an OperationalError, especially since it works with sqlite? >>> >>> Thanks for any help, >>> >>> Lukas >>> >>> [0] https://tinloaf.de/~tinloaf/sqlalchemy/mariadb.html >>> [1] https://tinloaf.de/~tinloaf/sqlalchemy/sqlite.html >>> [2] >>> https://stackoverflow.com/questions/5816840/delete-i-cant-specify-target-table >>> >>> -- >>> SQLAlchemy - >>> The Python SQL Toolkit and Object Relational Mapper >>> >>> http://www.sqlalchemy.org/ >>> >>> To post example code, please provide an MCVE: Minimal, Complete, and >>> Verifiable Example. See http://stackoverflow.com/help/mcve for a full >>> description. >>> --- >>> You received this message because you are subscribed to the Google Groups >>> "sqlalchemy" group. >>> To unsubscribe from this group and stop receiving emails from it, send an >>> email to sqlalchemy+unsubscr...@googlegroups.com. >>> To post to this group, send email to sqlalchemy@googlegroups.com. >>> Visit this group at https://groups.google.com/group/sqlalchemy. >>> For more options, visit https://groups.google.com/d/optout. -- SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper http://www.sqlalchemy.org/ To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description. --- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at https://groups.google.com/group/sqlalchemy. For more options, visit https://groups.google.com/d/optout.
Re: [sqlalchemy] SQLAlchemy does not properly create temporary table for subqueries in MariaDB
On Mon, Jan 15, 2018 at 5:32 PM, Mike Bayer wrote: > On Mon, Jan 15, 2018 at 4:47 PM, Lukas Barth wrote: >> Hi, >> >> first things first: I put a complete notebook showing the problem at [0], >> I'll only post the excerpts I deem useful below. The exact same code, this >> time working, with an sqlite in-memory database is at [1]. I'm using >> SQLAlchemy version 1.2.1, python 3.5, mysqlclient version 1.3.12 and MariaDB >> version 10.1.29. >> >> I have this persistent class: >> class Test(Base): >> __tablename__ = "test" >> >> id = Column(Integer, primary_key=True) >> timestamp = Column(Integer) >> value = Column(Integer) >> >> Resulting in this schema: >> >> >> CREATE TABLE test ( >>id INTEGER NOT NULL, >>timestamp INTEGER, >>value INTEGER, >>PRIMARY KEY (id) >> ) >> >> What I want to achieve is to delete all objects from Test, which have a >> timestamp less than some value (five in my example code), but which are >> *not* the most recent entries for their respective value. For example: If >> there is a Test object (call it A) with timestamp = 4 and value = 1234, and >> all other (if there exist any at all…) Test objects with value = 1234 have a >> timestamp of less than 4, then I want A to *not* be deleted, even though its >> timestamp is less than 5. >> >> >> I figured I go with two subqueries: The first subquery uses a group_by on >> value, and max() on timestamp. These are the objects to be protected from >> deletion. The second subquery retrieves the ids of the objects in subquery >> 1. Then I can issue a delete statement, filtering so that the ids of the >> objects to be deleted are not in subquery 2. >> >> Here's the first subquery: >> protected_items = session.query(Test.id, Test.value, >> func.max(Test.timestamp))\ >> .group_by(Test.value).subquery(name='subquery1') >> >> And the second one: >> protected_items_ids = session.query(Test.id).join( >> protected_items, Test.id == protected_items.c.id)\ >> .subquery(name='subquery2') >> >> And finally, the deletion: >> >> deleted_rows = session.query(Test) \ >> .filter((Test.timestamp < 5)) \ >> .filter(~Test.id.in_( >> protected_items_ids)) \ >> .delete(synchronize_session=False) >> >> >> This works great when using it with a sqlite database. However, it gives an >> OperationalError when using it with MariaDB. See the bottom of [0] for the >> full stack trace. This is the error message I get from the MariaDB server: >> >> OperationalError: (_mysql_exceptions.OperationalError) >> >> (1093, "Table 'test' is specified twice, both as a target for 'DELETE' and >> as a separate source for data") >> >> [SQL: 'DELETE FROM test WHERE test.timestamp < %s AND test.id NOT IN (SELECT >> test.id \nFROM test INNER JOIN (SELECT test.id AS id, test.value AS value, >> max(test.timestamp) AS max_1 \nFROM test GROUP BY test.value) AS subquery1 >> ON test.id = subquery1.id)'] [parameters: (5,)] (Background on this error >> at: http://sqlalche.me/e/e3q8) >> >> >> Doing dome digging, I think one needs to specify a name for the innermost >> subquery (subquery 1), to make MariaDB create a temporary table for this. At >> least that's what [2] suggests. I would have assumed the inner subquery to >> be named ("… as subquery1"), since I specified "name = subquery1", but that >> doesn't seem to be >> > print(protected_items) >> SELECT test.id, test.value, max(test.timestamp) AS max_1 >> FROM test GROUP BY test.value > > doing a print() on the subquery object itself does not show it in > context, as the subquery / alias name only comes out if you select > FROM the object. If you look at the SQL shown in the error message, > it is rendering"AS subquery1", so that is not the problem here. > > the query as written doesn't actually seem to do anything with the > subqueries, because you are putting this max(test.timestamp) in the > columns clause of an embedded subquery, but you aren't filtering on it > or doing anything at all with it, it's being thrown away.The query > I think you need is: > > latest_items = session.query( > Test.value, func.max(Test.timestamp).label('latest_timestamp') > ).group_by(Test.value).subquery(name='subquery1') > > latest_item_ids = session.query(Test.id).filter( > Test.timestamp == latest_items.c.latest_timestamp).\ > filter(Test.value == latest_items.c.value) > > deleted_rows = session.query(Test) \ > .filter((Test.timestamp < 5)) \ > .filter(~Test.id.in_(latest_item_ids)) \ > .delete(synchronize_session=False) > > > > generating: > > DELETE FROM test WHERE test.timestamp < %s AND test.id NOT IN (SELECT > test.id AS test_id > FROM (SELECT test.value AS value, max(test.timestamp) AS latest_timestamp > FROM test GROUP BY test.value) AS subquery1 > WHERE test.timestamp = subquery1
Re: [sqlalchemy] SQLAlchemy does not properly create temporary table for subqueries in MariaDB
On Mon, Jan 15, 2018 at 4:47 PM, Lukas Barth wrote: > Hi, > > first things first: I put a complete notebook showing the problem at [0], > I'll only post the excerpts I deem useful below. The exact same code, this > time working, with an sqlite in-memory database is at [1]. I'm using > SQLAlchemy version 1.2.1, python 3.5, mysqlclient version 1.3.12 and MariaDB > version 10.1.29. > > I have this persistent class: > class Test(Base): > __tablename__ = "test" > > id = Column(Integer, primary_key=True) > timestamp = Column(Integer) > value = Column(Integer) > > Resulting in this schema: > > > CREATE TABLE test ( >id INTEGER NOT NULL, >timestamp INTEGER, >value INTEGER, >PRIMARY KEY (id) > ) > > What I want to achieve is to delete all objects from Test, which have a > timestamp less than some value (five in my example code), but which are > *not* the most recent entries for their respective value. For example: If > there is a Test object (call it A) with timestamp = 4 and value = 1234, and > all other (if there exist any at all…) Test objects with value = 1234 have a > timestamp of less than 4, then I want A to *not* be deleted, even though its > timestamp is less than 5. > > > I figured I go with two subqueries: The first subquery uses a group_by on > value, and max() on timestamp. These are the objects to be protected from > deletion. The second subquery retrieves the ids of the objects in subquery > 1. Then I can issue a delete statement, filtering so that the ids of the > objects to be deleted are not in subquery 2. > > Here's the first subquery: > protected_items = session.query(Test.id, Test.value, > func.max(Test.timestamp))\ > .group_by(Test.value).subquery(name='subquery1') > > And the second one: > protected_items_ids = session.query(Test.id).join( > protected_items, Test.id == protected_items.c.id)\ > .subquery(name='subquery2') > > And finally, the deletion: > > deleted_rows = session.query(Test) \ > .filter((Test.timestamp < 5)) \ > .filter(~Test.id.in_( > protected_items_ids)) \ > .delete(synchronize_session=False) > > > This works great when using it with a sqlite database. However, it gives an > OperationalError when using it with MariaDB. See the bottom of [0] for the > full stack trace. This is the error message I get from the MariaDB server: > > OperationalError: (_mysql_exceptions.OperationalError) > > (1093, "Table 'test' is specified twice, both as a target for 'DELETE' and > as a separate source for data") > > [SQL: 'DELETE FROM test WHERE test.timestamp < %s AND test.id NOT IN (SELECT > test.id \nFROM test INNER JOIN (SELECT test.id AS id, test.value AS value, > max(test.timestamp) AS max_1 \nFROM test GROUP BY test.value) AS subquery1 > ON test.id = subquery1.id)'] [parameters: (5,)] (Background on this error > at: http://sqlalche.me/e/e3q8) > > > Doing dome digging, I think one needs to specify a name for the innermost > subquery (subquery 1), to make MariaDB create a temporary table for this. At > least that's what [2] suggests. I would have assumed the inner subquery to > be named ("… as subquery1"), since I specified "name = subquery1", but that > doesn't seem to be > print(protected_items) > SELECT test.id, test.value, max(test.timestamp) AS max_1 > FROM test GROUP BY test.value doing a print() on the subquery object itself does not show it in context, as the subquery / alias name only comes out if you select FROM the object. If you look at the SQL shown in the error message, it is rendering"AS subquery1", so that is not the problem here. the query as written doesn't actually seem to do anything with the subqueries, because you are putting this max(test.timestamp) in the columns clause of an embedded subquery, but you aren't filtering on it or doing anything at all with it, it's being thrown away.The query I think you need is: latest_items = session.query( Test.value, func.max(Test.timestamp).label('latest_timestamp') ).group_by(Test.value).subquery(name='subquery1') latest_item_ids = session.query(Test.id).filter( Test.timestamp == latest_items.c.latest_timestamp).\ filter(Test.value == latest_items.c.value) deleted_rows = session.query(Test) \ .filter((Test.timestamp < 5)) \ .filter(~Test.id.in_(latest_item_ids)) \ .delete(synchronize_session=False) generating: DELETE FROM test WHERE test.timestamp < %s AND test.id NOT IN (SELECT test.id AS test_id FROM (SELECT test.value AS value, max(test.timestamp) AS latest_timestamp FROM test GROUP BY test.value) AS subquery1 WHERE test.timestamp = subquery1.latest_timestamp AND test.value = subquery1.value) which is accepted by the database. > > … I guess that's a bug? At least I would assume that this should not result > in an OperationalError,
[sqlalchemy] SQLAlchemy does not properly create temporary table for subqueries in MariaDB
Hi, first things first: I put a complete notebook showing the problem at [0], I'll only post the excerpts I deem useful below. The exact same code, this time working, with an sqlite in-memory database is at [1]. I'm using SQLAlchemy version 1.2.1, python 3.5, mysqlclient version 1.3.12 and MariaDB version 10.1.29. I have this persistent class: class Test(Base): __tablename__ = "test" id = Column(Integer, primary_key=True) timestamp = Column(Integer) value = Column(Integer) Resulting in this schema: CREATE TABLE test ( id INTEGER NOT NULL, timestamp INTEGER, value INTEGER, PRIMARY KEY (id) ) What I want to achieve is to delete all objects from Test, which have a timestamp less than some value (five in my example code), but which are *not* the most recent entries for their respective value. For example: If there is a Test object (call it A) with timestamp = 4 and value = 1234, and all other (if there exist any at all…) Test objects with value = 1234 have a timestamp of less than 4, then I want A to *not* be deleted, even though its timestamp is less than 5. I figured I go with two subqueries: The first subquery uses a group_by on value, and max() on timestamp. These are the objects to be protected from deletion. The second subquery retrieves the ids of the objects in subquery 1. Then I can issue a delete statement, filtering so that the ids of the objects to be deleted are not in subquery 2. Here's the first subquery: protected_items = session.query(Test.id, Test.value, func.max(Test.timestamp))\ .group_by(Test.value).subquery(name='subquery1') And the second one: protected_items_ids = session.query(Test.id).join( protected_items, Test.id == protected_items.c.id)\ .subquery(name='subquery2') And finally, the deletion: deleted_rows = session.query(Test) \ .filter((Test.timestamp < 5)) \ .filter(~Test.id.in_( protected_items_ids)) \ .delete(synchronize_session=False) This works great when using it with a sqlite database. However, it gives an OperationalError when using it with MariaDB. See the bottom of [0] for the full stack trace. This is the error message I get from the MariaDB server: OperationalError: (_mysql_exceptions.OperationalError) (1093, "Table 'test' is specified twice, both as a target for 'DELETE' and as a separate source for data") [SQL: 'DELETE FROM test WHERE test.timestamp < %s AND test.id NOT IN (SELECT test.id \nFROM test INNER JOIN (SELECT test.id AS id, test.value AS value, max(test.timestamp) AS max_1 \nFROM test GROUP BY test.value) AS subquery1 ON test.id = subquery1.id)'] [parameters: (5,)] (Background on this error at: http://sqlalche.me/e/e3q8) Doing dome digging, I think one needs to specify a name for the innermost subquery (subquery 1), to make MariaDB create a temporary table for this. At least that's what [2] suggests. I would have assumed the inner subquery to be named ("… as subquery1"), since I specified "name = subquery1", but that doesn't seem to be >>> print(protected_items) SELECT test.id, test.value, max(test.timestamp) AS max_1 FROM test GROUP BY test.value … I guess that's a bug? At least I would assume that this should not result in an OperationalError, especially since it works with sqlite? Thanks for any help, Lukas [0] https://tinloaf.de/~tinloaf/sqlalchemy/mariadb.html [1] https://tinloaf.de/~tinloaf/sqlalchemy/sqlite.html [2] https://stackoverflow.com/questions/5816840/delete-i-cant-specify-target-table -- SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper http://www.sqlalchemy.org/ To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description. --- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at https://groups.google.com/group/sqlalchemy. For more options, visit https://groups.google.com/d/optout.