Thanks for the detailed explanation. I think I should explain why I run into this problem, as it's in line with emitting a warning on this use case.
1. I wrote the query without any kind of alias in SQLAlchemy. 2. Received an error from PostgreSQL "subquery in FROM must have an alias", so I added it in both places "to be safe". One for the subquery, one for the join. 3. After this the query worked - so I confusingly thought I did the right thing - but became unusably slow. Then I tried to debug step by step and arrived that the adding of alias to join() made it into a very different query which happened to be slow. I think I'm not the only user who adds aliases "everywhere" once receiving that alias is needed error. I thought that the j = ... is a subquery as it's in a select_from() line. So yes, in my case adding those warnings would be very helpful, directing me in the right direction. Zsolt On Fri, 29 Nov 2019 at 04:50, Mike Bayer <mike...@zzzcomputing.com> wrote: > > > > On Thu, Nov 28, 2019, at 4:31 PM, Zsolt Ero wrote: > > I have the following query: > > min_values = ( > sa.select( > [ > table.c.region_id, > table.c.operator, > table.c.day_str, > > sa.func.min(table.c.processed_time).label('min_processed_time'), > ] > ) > .group_by(table.c.region_id, table.c.operator, table.c.day_str) > .order_by(table.c.region_id, table.c.operator, table.c.day_str) > ).alias('mv') > > j = min_values.join( > config_mds, > sa.and_( > min_values.c.region_id == config_mds.c.region_id, > min_values.c.operator == config_mds.c.operator, > ), > ).alias('j') > > stmt = ( > sa.select([min_values.c.region_id, min_values.c.operator, > min_values.c.day_str]) > .select_from(j) > .where(min_values.c.min_processed_time < > config_mds.c.last_reprocess_time) > ) > > > Which works well, as long as I do not add the .alias('j') part. If I add it, > it generates a wildly different SQL code, which doesn't finish in reasonable > time (I terminated it after a few minutes whereas the first one finishes in a > few seconds. > > > sure, the "j" is shorthand for calling a SELECT of the join, then aliasing > it. you can no longer refer to 'min_values" in the subsequent query because > that means you want to add it to the FROM clause separately from the "j" and > it will go poorly. > > > > Here is the generated SQL without alias('j'): > > SELECT > mv.region_id, > mv.operator, > mv.day_str > FROM > ( > SELECT > status_changes.region_id AS region_id, > status_changes.operator AS OPERATOR, > status_changes.day_str AS day_str, > min(status_changes.processed_time) AS min_processed_time > FROM > status_changes > GROUP BY > status_changes.region_id, > status_changes.operator, > status_changes.day_str > ORDER BY > status_changes.region_id, > status_changes.operator, > status_changes.day_str > ) AS mv > JOIN config_mds ON mv.region_id = config_mds.region_id > AND mv.operator = config_mds.operator > WHERE > config_mds.last_reprocess_time > mv.min_processed_time > > > And here is with the added alias('j'), which doesn't terminates: > > SELECT > mv.region_id, > mv.operator, > mv.day_str > FROM > ( > SELECT > status_changes.region_id AS region_id, > status_changes.operator AS OPERATOR, > status_changes.day_str AS day_str, > min(status_changes.processed_time) AS min_processed_time > FROM > status_changes > GROUP BY > status_changes.region_id, > status_changes.operator, > status_changes.day_str > ORDER BY > status_changes.region_id, > status_changes.operator, > status_changes.day_str > ) AS mv, > config_mds, > ( > SELECT > mv.region_id AS mv_region_id, > mv.operator AS mv_operator, > mv.day_str AS mv_day_str, > mv.min_processed_time AS mv_min_processed_time, > config_mds.region_id AS config_mds_region_id, > config_mds.operator AS config_mds_operator, > config_mds.disable_download AS config_mds_disable_download, > config_mds.trips_url AS config_mds_trips_url, > config_mds.status_changes_url AS config_mds_status_changes_url, > config_mds.url_headers AS config_mds_url_headers, > config_mds.login_data AS config_mds_login_data, > config_mds.use_different_region AS config_mds_use_different_region, > config_mds.comment AS config_mds_comment, > config_mds.last_reprocess_time AS config_mds_last_reprocess_time, > config_mds.token_cache AS config_mds_token_cache > FROM > ( > SELECT > status_changes.region_id AS region_id, > status_changes.operator AS OPERATOR, > status_changes.day_str AS day_str, > min(status_changes.processed_time) AS min_processed_time > FROM > status_changes > GROUP BY > status_changes.region_id, > status_changes.operator, > status_changes.day_str > ORDER BY > status_changes.region_id, > status_changes.operator, > status_changes.day_str > ) AS mv > JOIN config_mds ON mv.region_id = config_mds.region_id > AND mv.operator = config_mds.operator > ) AS j > WHERE > config_mds.last_reprocess_time > mv.min_processed_time; > > > > This behaviour confuses me a bit. Why would adding an alias make a difference > to the generated query? > > > > you have: > > min_values = "(SELECT .... FROM ... ) AS mv" > > select([min_values.c.x]) = "SELECT mv.x FROM (SELECT ... FROM ..) AS mv" > > > then you say, "j = min_values.join(config_mds, ...).alias("j")" > > this is the same as saying: > > j = select([min_values.join(config_mds, ...)]).alias("j") > > an "alias" is only against a SELECT or a table; there is no "SELECT * FROM > (x JOIN y) AS j" syntax. Did a quick test and apparently Postgresql > allows that syntax but it fails on MySQL. SQLite seems to support it but > that's a newer thing as SQLite didn't support nested JOINs at all some years > ago. Adding the "AS <name>" to the plain parenthesies like that It's not > something I've seen in SQL before and SQLAlchemy has no concept of that right > now. > > so given that join().alias() is shorthand for select([join()]).alias(), this > means: > > select([j.c.x]) = "SELECT j.x FROM (SELECT mv.*, config_mds.* FROM (SELECT > ... FROM ..) AS mv JOIN config_mds ON ...) AS j" > > In SQLAlchemy 1.4 there's been a big scale-back of how things like alias() > and join() can be used in order to decrease this kind of confusion, and I was > sort of hoping this particular use was also deprecated but apparently it > isn't. so https://github.com/sqlalchemy/sqlalchemy/issues/5010 is added and > the join().alias() usage will emit a deprecation warning in 1.4 because it > does a thing that by today's standards is surprising. > > anyway, if you mix the join() and the join().alias() together, like: > > select([min_values.c.x]).select_from(j)... > > this is like: > > "SELECT mv.x FROM (SELECT ... FROM ..) AS mv, (SELECT mv.*, config_mds.* FROM > (SELECT ... FROM ..) AS mv JOIN config_mds ON ...) AS j" > > that is, saying select_from(j) doesn't mean that's the only thing you are > selecting from, it also has to select from min_values directly. > > TLDR join().alias() is doing a large implicit thing based on older > assumptions, when parenthesized "(a JOIN b)" clauses were not accepted by > SQLite and possibly others and SQLAlchemy always used SELECT... in order to > group joins. you can group joins in parentheses now but you still can't > give them an alias() without a SELECT implied. > > > > > > -- > SQLAlchemy - > The Python SQL Toolkit and Object Relational Mapper > > http://www.sqlalchemy.org/ > > To post example code, please provide an MCVE: Minimal, Complete, and > Verifiable Example. See http://stackoverflow.com/help/mcve for a full > description. > --- > You received this message because you are subscribed to the Google Groups > "sqlalchemy" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to sqlalchemy+unsubscr...@googlegroups.com. > To view this discussion on the web visit > https://groups.google.com/d/msgid/sqlalchemy/bbf11796-b9ed-4b6f-9188-c35585dfc01d%40googlegroups.com. > > > -- > SQLAlchemy - > The Python SQL Toolkit and Object Relational Mapper > > http://www.sqlalchemy.org/ > > To post example code, please provide an MCVE: Minimal, Complete, and > Verifiable Example. See http://stackoverflow.com/help/mcve for a full > description. > --- > You received this message because you are subscribed to a topic in the Google > Groups "sqlalchemy" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/sqlalchemy/nPKcBGWWY3s/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > sqlalchemy+unsubscr...@googlegroups.com. > To view this discussion on the web visit > https://groups.google.com/d/msgid/sqlalchemy/f26cf2af-cb16-4488-b75b-baca3c457e00%40www.fastmail.com. -- SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper http://www.sqlalchemy.org/ To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description. --- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/sqlalchemy/CAKw-smA-O2CPJwTys6r2KE0fFV%2BdnNL4jEgnB66VcGbJKO%2BObw%40mail.gmail.com.