:facepalm: i should have seen that much sooner. Thank you so much for your help. No longer need to do any of that crazy stuff I attempted!
On Thursday, February 7, 2019 at 5:13:41 PM UTC-5, Mike Bayer wrote: > > On Thu, Feb 7, 2019 at 2:49 PM Ian Miller <irmil...@gmail.com > <javascript:>> wrote: > > > > The reason why I iterated over .c._all_columns was because it shows the > references to the 2 "metadata_value" columns, whereas selectable.c only > references 1. > > > so that's a sign that you need to apply labels to these columns > because they are conflicting. with select(), call > select().apply_labels() so that these two columns are prefixed with > their table name in a label. > > > > > > For example, selectable.c shows the following: > > > > (Pdb++) list(query.c) > > [Column('metadata_value', VARCHAR(length=255), > table=<non_interval_query>), Column('created_at', TIMESTAMP(), > table=<non_interval_query>), <sqlalchemy.sql.elements.ColumnClause at > 0x7f15e1f6fef0; %(139731962070520 coalesce)s>]Enter code here... > > > > selectable.c._all_columns shows the following: > > > > (Pdb++) query.c._all_columns > > [Column('metadata_value', VARCHAR(length=255), > table=<non_interval_query>), Column('metadata_value', VARCHAR(length=255), > table=<non_interval_query>), Column('created_at', TIMESTAMP(), > table=<non_interval_query>), <sqlalchemy.sql.elements.ColumnClause at > 0x7f15e1f6fef0; %(139731962070520 coalesce)s>]Enter code here... > > > > > > > > > > On Thursday, February 7, 2019 at 12:53:51 PM UTC-5, Mike Bayer wrote: > >> > >> just as a note, don't access .c._all_columns, just iterate over > selectable.c > >> > >> On Thu, Feb 7, 2019 at 10:45 AM Ian Miller <irmil...@gmail.com> wrote: > >> > > >> > Hello Mike, > >> > > >> > Thank you for your response! I have currently constructed the ORM > implementation that you suggested in your response. Here's the code: > >> > > >> > def _rebuild_non_interval_query_for_group_by(self, session, query): > >> > from sqlalchemy import table, column, select > >> > from sqlalchemy.orm import aliased > >> > from collections import defaultdict, OrderedDict > >> > > >> > post_metadata = table( > >> > "post_metadata", column("post_id"), column("metadata_value") > >> > ) > >> > campaign_metadata = table( > >> > "campaign_metadata", column("campaign_id"), > column("metadata_value") > >> > ) > >> > asset_metadata = table( > >> > "asset_metadata", column("asset_id"), > column("metadata_value") > >> > ) > >> > > >> > vw_asset = table("vw_asset", column("id")) > >> > vw_campaign = table("vw_campaign", column("id")) > >> > vw_post = table("vw_post", column("id")) > >> > > >> > METADATA_PRIMARY_TABLE_MAP = { > >> > asset_metadata.name: vw_asset, > >> > campaign_metadata.name: vw_campaign, > >> > post_metadata.name: vw_post, > >> > } > >> > > >> > METADATA_NAME_TABLE_MAP = { > >> > asset_metadata.name: asset_metadata, > >> > campaign_metadata.name: campaign_metadata, > >> > post_metadata.name: post_metadata, > >> > } > >> > > >> > primary_tables = set() > >> > metadata_columns_count = defaultdict(int) > >> > metadata_alias = OrderedDict() > >> > columns = [] > >> > for c in query.c._all_columns: > >> > if c.name == "metadata_value": > >> > parent_column = list(c.base_columns)[0] > >> > table = parent_column.table > >> > primary_tables.add(METADATA_PRIMARY_TABLE_MAP[table.name]) > > >> > metadata_columns_count[METADATA_NAME_TABLE_MAP[table.name]] > += 1 > >> > alias_number = > metadata_columns_count[METADATA_NAME_TABLE_MAP[table.name]] > >> > alias_name = "{}_{}".format(table.name, alias_number) > >> > alias = aliased(parent_column.table, alias_name) > >> > metadata_alias[alias_name] = alias > >> > column = alias.c.metadata_value.label( > >> > "{}_{}_{}".format(table.name, alias_number, > "metadata_value") > >> > ) > >> > columns.append(column) > >> > else: > >> > columns.append(c) > >> > > >> > # start constructing query > >> > non_interval_query = > session.query(*columns).select_from(*primary_tables) > >> > > >> > for alias_name, alias in metadata_alias.items(): > >> > object_type = self._get_object_type_from_metadata_name( > alias.original.name) > >> > non_interval_query = ( > >> > non_interval_query > >> > .join( > >> > alias, > >> > getattr(alias.c, "{}_id".format(object_type)) == > METADATA_PRIMARY_TABLE_MAP[alias.original.name].c.id > >> > ) > >> > ) > >> > > >> > non_interval_query = > non_interval_query.subquery("non_interval_query") > >> > > >> > return non_interval_query > >> > > >> > > >> > > >> > The "metadata_alias" values are [('post_metadata_1", alias), > ('post_metadata_2', alias)] - the alias correspond to the post_metadata_1 > and post_metadata_2 alias in your example. However, when I reference these > in the join, the aliased table names are not "post_metadata_1" or > "post_metadata_2" - they're "post_metadata_3" and "post_metadata_4". I'm > unable to figure out why there's a new join seemingly created instead of > referencing the aliased tables that were passed in. > >> > > >> > Here's the query that the above generates: > >> > > >> > SELECT post_metadata_1.metadata_value AS > post_metadata_1_metadata_value, > >> > post_metadata_2.metadata_value AS > post_metadata_2_metadata_value, > >> > non_interval_query.created_at, > >> > non_interval_query.coalesce_1 \nFROM > >> > (SELECT post_metadata_3.metadata_value AS metadata_value, > post_metadata_4.metadata_value AS metadata_value, vw_post.created_at AS > created_at, coalesce(count(DISTINCT vw_post.id), :coalesce_2) AS > coalesce_1 \nFROM vw_post > >> > JOIN post_metadata AS post_metadata_3 ON post_metadata_3.post_id = > vw_post.id > >> > JOIN post_metadata AS post_metadata_4 ON post_metadata_4.post_id = > vw_post.id \nWHERE post_metadata_3.metadata_value IN (:metadata_value_1, > :metadata_value_2) > >> > AND post_metadata_4.metadata_value IN (:metadata_value_3, > :metadata_value_4) > >> > AND vw_post.created_at >= :created_at_1 > >> > AND vw_post.created_at <= :created_at_2 > >> > AND post_metadata_3.schema_uid = :schema_uid_1 > >> > AND post_metadata_3.metadata_name = :metadata_name_1 > >> > AND post_metadata_4.schema_uid = :schema_uid_2 > >> > AND post_metadata_4.metadata_name = :metadata_name_2 > >> > AND vw_post.license_id IN (:license_id_1, :license_id_2) > >> > GROUP BY vw_post.created_at, post_metadata_3.metadata_value, > post_metadata_4.metadata_value, vw_post.created_at) AS non_interval_query, > >> > vw_post > >> > JOIN post_metadata AS post_metadata_1 ON post_metadata_1.post_id = > vw_post.id > >> > JOIN post_metadata AS post_metadata_2 ON post_metadata_2.post_id = > vw_post.id; > >> > > >> > > >> > > >> > > >> > On Tuesday, February 5, 2019 at 11:51:25 AM UTC-5, Ian Miller wrote: > >> > Hello all - > >> > > >> > I am relatively new to using SQLAlchemy for more complex use cases. I > am in the process of creating a time series query, but I am unable to > reference a column by its alias at the top level of the query. > >> > > >> > This is the query that I am trying to address that SQLAlchemy is > currently generating: > >> > > >> > > >> > > >> > SELECT non_interval_query.metadata_value AS > non_interval_query_metadata_value, > >> > coalesce(sum(non_interval_query.coalesce_2), 0) AS coalesce_1, > >> > timestamp > >> > FROM > >> > (SELECT generate_series(date_trunc('day', > date('2019-01-06T00:00:00+00:00')), date_trunc('day', > date('2019-01-12T00:00:00+00:00')), '1 day') AS timestamp) AS time_series > >> > LEFT OUTER JOIN > >> > (SELECT post_metadata_1.metadata_value AS > post_metadata_1_metadata_value, > >> > post_metadata_2.metadata_value AS > post_metadata_2_metadata_value, > >> > vw_post.created_at AS vw_post_created_at, > >> > coalesce(count(DISTINCT vw_post.id), 0) AS coalesce_1 > >> > FROM vw_post > >> > JOIN post_metadata AS post_metadata_1 ON post_metadata_1.post_id = > vw_post.id > >> > JOIN post_metadata AS post_metadata_2 ON post_metadata_2.post_id = > vw_post.id > >> > WHERE post_metadata_1.metadata_value IN ('<metadata_values>') > >> > AND post_metadata_2.metadata_value IN ('<metadata_value>') > >> > AND vw_post.created_at >= '2019-01-06T00:00:00+00:00' > >> > AND vw_post.created_at <= '2019-01-12T00:00:00+00:00' > >> > AND post_metadata_1.schema_uid = '<schema_uid>' > >> > AND post_metadata_1.metadata_name = '<metadata_name>' > >> > AND post_metadata_2.schema_uid = '<schema_uid>' > >> > AND post_metadata_2.metadata_name = '<metadata_name>' > >> > AND vw_post.license_id IN (<license_ids>) > >> > GROUP BY vw_post.created_at, > >> > post_metadata_1.metadata_value, > >> > post_metadata_2.metadata_value, > >> > vw_post.created_at) AS non_interval_query ON > date_trunc('day', created_at) = timestamp; > >> > You'll notice that "non_interval_query.metadata_value AS > non_interval_query_metadata_value" specified at the beginning of the query > is ambiguous due to the 2 "metadata_value" selects in the > "non_interval_query" subquery. What I'm trying to do is have 2 selects at > the top level - one for "non_interval_query.post_metadata_1_metadata_value" > and one for "non_interval_query.post_metadata_2_metadata_value". > >> > > >> > > >> > For reference, here is the code used to generate the above query: > >> > > >> > > >> > > >> > > >> > def apply_date_group_by(self, session, query, range_gb_params): > >> > field_name = self.db.get("column") > >> > model = self._object.get("model") > >> > > >> > if not field_name or not model: > >> > raise ValueError("Invalid date group by") > >> > > >> > gb_column = self._build_column() > >> > interval = range_gb_params.get("interval") > >> > interval_type = range_gb_params.get("interval_type") > >> > > >> > time_series = func.generate_series( > >> > func.date_trunc(interval_type, > func.date(range_gb_params["start"])), > >> > func.date_trunc(interval_type, > func.date(range_gb_params["end"])), > >> > interval, > >> > ).label("timestamp") > >> > > >> > ts_column = column("timestamp") > >> > > >> > time_series_query = > session.query(time_series).subquery("time_series") > >> > non_interval_query = query.subquery("non_interval_query") > >> > # have to replace the original gb_column with the 'timestamp' > column > >> > # in order to properly merge the dataset into the time series > dataset > >> > non_gb_columns, gbs = self._prepare_non_gb_columns( > >> > ts_column, gb_column, non_interval_query.columns > >> > ) > >> > > >> > # construct query with correct position passed in from > `range_gb_params` > >> > query_position = range_gb_params.get("query_index_position", 0) > >> > non_gb_columns.insert(query_position, ts_column) > >> > > >> > date_gb_query = session.query(*non_gb_columns).select_from( > >> > time_series_query.outerjoin( > >> > non_interval_query, > >> > func.date_trunc(interval_type, column(field_name)) == > ts_column, > >> > ) > >> > ) > >> > > >> > if gbs: > >> > date_gb_query = date_gb_query.group_by(*gbs) > >> > > >> > return date_gb_query.order_by(ts_column) > >> > > >> > > >> > Any help on this would be greatly appreciated! > >> > > >> > > >> > -- > >> > SQLAlchemy - > >> > The Python SQL Toolkit and Object Relational Mapper > >> > > >> > http://www.sqlalchemy.org/ > >> > > >> > To post example code, please provide an MCVE: Minimal, Complete, and > Verifiable Example. See http://stackoverflow.com/help/mcve for a full > description. > >> > --- > >> > You received this message because you are subscribed to the Google > Groups "sqlalchemy" group. > >> > To unsubscribe from this group and stop receiving emails from it, > send an email to sqlalchemy+...@googlegroups.com. > >> > To post to this group, send email to sqlal...@googlegroups.com. > >> > Visit this group at https://groups.google.com/group/sqlalchemy. > >> > For more options, visit https://groups.google.com/d/optout. > > > > -- > > SQLAlchemy - > > The Python SQL Toolkit and Object Relational Mapper > > > > http://www.sqlalchemy.org/ > > > > To post example code, please provide an MCVE: Minimal, Complete, and > Verifiable Example. See http://stackoverflow.com/help/mcve for a full > description. > > --- > > You received this message because you are subscribed to the Google > Groups "sqlalchemy" group. > > To unsubscribe from this group and stop receiving emails from it, send > an email to sqlalchemy+...@googlegroups.com <javascript:>. > > To post to this group, send email to sqlal...@googlegroups.com > <javascript:>. > > Visit this group at https://groups.google.com/group/sqlalchemy. > > For more options, visit https://groups.google.com/d/optout. > -- SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper http://www.sqlalchemy.org/ To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description. --- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at https://groups.google.com/group/sqlalchemy. For more options, visit https://groups.google.com/d/optout.