[jira] [Commented] (PHOENIX-2679) Implement column family schema structure in Calcite-Phoenix
[ https://issues.apache.org/jira/browse/PHOENIX-2679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15566982#comment-15566982 ]

Maryann Xue commented on PHOENIX-2679:
--

This prevents "select from view" from working.

> Implement column family schema structure in Calcite-Phoenix
> ---
>
>                 Key: PHOENIX-2679
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2679
>             Project: Phoenix
>          Issue Type: Task
>            Reporter: Maryann Xue
>            Assignee: Maryann Xue
>              Labels: calcite
>         Attachments: PHOENIX-2679.wip.2.patch, PHOENIX-2679.wip.patch, suggested-calcite-changes-for-PHOENIX-2679.patch
>
>

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-2679) Implement column family schema structure in Calcite-Phoenix
[ https://issues.apache.org/jira/browse/PHOENIX-2679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1700#comment-1700 ]

Julian Hyde commented on PHOENIX-2679:
--

We're making sure that the hard stuff is in Calcite - in this case, name resolution of column families, and materialized view matching - and well unit-tested. Which just leaves the glue in Phoenix.
[jira] [Commented] (PHOENIX-2679) Implement column family schema structure in Calcite-Phoenix
[ https://issues.apache.org/jira/browse/PHOENIX-2679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1674#comment-1674 ]

James Taylor commented on PHOENIX-2679:
---

Ok, thanks for taking a look. Overall, I'm hoping we can push as much into the Calcite layer as possible. It seems that our integration layer is growing in complexity which unfortunately dilutes the value proposition a little bit (but don't get me wrong - we're gaining a huge amount of course too - we just need to keep an eye on this IMHO).
[jira] [Commented] (PHOENIX-2679) Implement column family schema structure in Calcite-Phoenix
[ https://issues.apache.org/jira/browse/PHOENIX-2679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1643#comment-1643 ]

Julian Hyde commented on PHOENIX-2679:
--

[~jamestaylor], I think [~maryannxue] has found a good balance. The generic stuff is in the Calcite PR, and the "glue" stuff is in the Phoenix patch. At some point someone else will want to do secondary index optimization in Calcite, and that will be the right time to look to pull some of the Phoenix stuff into Calcite. But even then, the combination of secondary indexes plus column families is very Phoenix-specific.
[jira] [Commented] (PHOENIX-2679) Implement column family schema structure in Calcite-Phoenix
[ https://issues.apache.org/jira/browse/PHOENIX-2679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15552930#comment-15552930 ]

Maryann Xue commented on PHOENIX-2679:
--

Found another issue here. Right now we re-order our columns so that columns from the same family are grouped together and each column family can be treated as a structured type. As a result, there's a test regression in OrderByIT (https://builds.apache.org/job/Phoenix-calcite/20/#showFailuresLink), because it has a table definition like:
{code}
CREATE TABLE t (
    a_string varchar not null,
    cf1.a integer,
    cf1.b varchar,
    col1 integer,
    cf2.c varchar,
    cf2.d integer,
    col2 integer
    CONSTRAINT pk PRIMARY KEY (a_string))
{code}
So when we implement PhoenixTable.getRowType(), the table definition is actually changed from "a_string, cf1.a, cf1.b, col1, cf2.c, cf2.d, col2" into "a_string, cf1.a, cf1.b, col1, col2, cf2.c, cf2.d". As a result, "UPSERT INTO T" expects a different column order (data type order) for its parameters, and we get an exception. This happens purely because of the column re-ordering and has nothing to do with this patch yet. UPSERT will get more complicated once we apply this patch, which means there's more work to do for UPSERT for column family support.

But regarding this problem alone, do you think it would make sense to add a check in Phoenix DDL that requires users to put column definitions from the same family together, so that in this case only "a_string, cf1.a, cf1.b, col1, col2, cf2.c, cf2.d" would be allowed, [~jamestaylor]?
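To make the re-ordering concrete, here is a minimal sketch in plain Java (no Phoenix or Calcite classes) that groups columns by family in order of first appearance. The class name, the "<pk>" and "<default>" keys, and the way a family is parsed out of the column name are all assumptions made for illustration; this is not the code behind PhoenixTable.getRowType().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper, for illustration only. */
final class ColumnFamilyGrouping {

    /**
     * Groups columns by family, keeping families in order of first appearance
     * and keeping the declared order within each family.
     */
    static List<String> groupByFamily(List<String> declared, Set<String> pkColumns) {
        // LinkedHashMap preserves the order in which families are first seen.
        Map<String, List<String>> byFamily = new LinkedHashMap<>();
        for (String column : declared) {
            String family;
            if (pkColumns.contains(column)) {
                family = "<pk>";                      // assumed marker for primary key columns
            } else {
                int dot = column.indexOf('.');
                family = dot < 0 ? "<default>"        // assumed marker for the default family
                                 : column.substring(0, dot);
            }
            byFamily.computeIfAbsent(family, k -> new ArrayList<>()).add(column);
        }
        List<String> grouped = new ArrayList<>();
        for (List<String> group : byFamily.values()) {
            grouped.addAll(group);
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<String> declared = Arrays.asList(
            "a_string", "cf1.a", "cf1.b", "col1", "cf2.c", "cf2.d", "col2");
        // Prints [a_string, cf1.a, cf1.b, col1, col2, cf2.c, cf2.d]
        System.out.println(groupByFamily(declared, Collections.singleton("a_string")));
    }
}
```

Under these assumptions col2 moves ahead of cf2.c and cf2.d, so positional UPSERT parameters bound in the declared order no longer line up with the row type.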
[jira] [Commented] (PHOENIX-2679) Implement column family schema structure in Calcite-Phoenix
[ https://issues.apache.org/jira/browse/PHOENIX-2679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15550207#comment-15550207 ]

James Taylor commented on PHOENIX-2679:
---

[~julianhyde] - do you think this is the best solution? Should this code live in Calcite? It feels like the two-level schema feature is half in Calcite and half in Phoenix.
[jira] [Commented] (PHOENIX-2679) Implement column family schema structure in Calcite-Phoenix
[ https://issues.apache.org/jira/browse/PHOENIX-2679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15550130#comment-15550130 ]

Maryann Xue commented on PHOENIX-2679:
--

Right now CalciteIT.testSelectFromView() still fails. An exception is thrown at the validation stage.
[jira] [Commented] (PHOENIX-2679) Implement column family schema structure in Calcite-Phoenix
[ https://issues.apache.org/jira/browse/PHOENIX-2679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15549550#comment-15549550 ]

Julian Hyde commented on PHOENIX-2679:
--

IIRC, and if it is useful, there is a way for a Table/RelOptTable to become a TableScan without referring to its place in the Schema hierarchy.
[jira] [Commented] (PHOENIX-2679) Implement column family schema structure in Calcite-Phoenix
[ https://issues.apache.org/jira/browse/PHOENIX-2679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15549497#comment-15549497 ]

Maryann Xue commented on PHOENIX-2679:
--

The reason why we need to put them in the Schema (at a later stage though) is that we need to get a RelOptTable in order to create a new TableScan.
{code}
// Look up the shadow table by swapping the last name component for the flattened name.
List<String> name = new ArrayList<>(table.getQualifiedName());
name.set(name.size() - 1, table.unwrap(PhoenixTable.class).getFlattenedName());
RelOptTable flattenedTable = table.getRelOptSchema().getTableForMember(name);
PhoenixTableScan newScan = PhoenixTableScan.create(scan.getCluster(), flattenedTable);
{code}
So these shadow tables are just Objects and do not have much impact other than that.
[jira] [Commented] (PHOENIX-2679) Implement column family schema structure in Calcite-Phoenix
[ https://issues.apache.org/jira/browse/PHOENIX-2679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15549468#comment-15549468 ]

Julian Hyde commented on PHOENIX-2679:
--

I believe that Schema is only used at validation time, not at planning time. If you leave the tables out of the Schema, user queries will never find them. In fact tables used by the planner don't even need to have names. (The entries in Schema are like inodes in a UNIX file system.)
[jira] [Commented] (PHOENIX-2679) Implement column family schema structure in Calcite-Phoenix
[ https://issues.apache.org/jira/browse/PHOENIX-2679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15549449#comment-15549449 ]

James Taylor commented on PHOENIX-2679:
---

Do we really have to do all that? Sounds brittle. If Calcite has support for this at validation time, why can't it support it better later in the planning process?
[jira] [Commented] (PHOENIX-2679) Implement column family schema structure in Calcite-Phoenix
[ https://issues.apache.org/jira/browse/PHOENIX-2679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15549420#comment-15549420 ]

Maryann Xue commented on PHOENIX-2679:
--

The shadow tables are actually what represent the real table structure in HBase, but they have tweaked names, just to be different from the original table names. In order not to let users access those made-up names in SQL queries, I just add those tables at a later stage, after validation but before planning.

Actually I made a mistake in the above comment. The test case can now pass even with a filter in the query. As long as "flattenTypes()" works OK, our shadow table replacement rule can work too. So right now the problem is just (1) flattening types and (2) disabling field trimming at that stage. We can actually apply a field trimming program after the shadow table replacement program.
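For readers who have not seen the patch, the shadow table replacement described in this thread can be pictured as a small tree rewrite. The sketch below is a toy in plain Java: the Node, Scan, FlattenProject and Filter classes only stand in for Calcite RelNodes, and the shadow name "T$flat" is made up. It does not use HepPlanner or any other Calcite API, and it is not the rule from the patch.

```java
import java.util.Collections;
import java.util.Map;

/** Toy plan tree; hypothetical classes that only mimic the shape of a relational plan. */
final class ShadowScanRewrite {

    interface Node {
        String describe();
    }

    /** Stands in for a TableScan. */
    static final class Scan implements Node {
        final String table;
        Scan(String table) { this.table = table; }
        public String describe() { return "Scan(" + table + ")"; }
    }

    /** Stands in for the "flattening" Project that sits on top of a structured-type scan. */
    static final class FlattenProject implements Node {
        final Node input;
        FlattenProject(Node input) { this.input = input; }
        public String describe() { return "FlattenProject(" + input.describe() + ")"; }
    }

    /** Stands in for a Filter. */
    static final class Filter implements Node {
        final String condition;
        final Node input;
        Filter(String condition, Node input) { this.condition = condition; this.input = input; }
        public String describe() { return "Filter(" + condition + ", " + input.describe() + ")"; }
    }

    /** Replaces FlattenProject(Scan(t)) with Scan(shadow of t) wherever a shadow name is known. */
    static Node rewrite(Node node, Map<String, String> shadowNames) {
        if (node instanceof FlattenProject) {
            Node input = ((FlattenProject) node).input;
            if (input instanceof Scan) {
                String shadow = shadowNames.get(((Scan) input).table);
                if (shadow != null) {
                    return new Scan(shadow);
                }
            }
            return new FlattenProject(rewrite(input, shadowNames));
        }
        if (node instanceof Filter) {
            Filter filter = (Filter) node;
            return new Filter(filter.condition, rewrite(filter.input, shadowNames));
        }
        return node;
    }

    public static void main(String[] args) {
        Map<String, String> shadowNames = Collections.singletonMap("T", "T$flat");
        Node plan = new Filter("cf1.a > 1", new FlattenProject(new Scan("T")));
        System.out.println(plan.describe());
        System.out.println(rewrite(plan, shadowNames).describe());
    }
}
```

In this toy, once the rewrite has run the plan contains only scans of flattened tables, which is the state in which the existing planning rules are expected to behave as before.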
[jira] [Commented] (PHOENIX-2679) Implement column family schema structure in Calcite-Phoenix
[ https://issues.apache.org/jira/browse/PHOENIX-2679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15549331#comment-15549331 ]

Julian Hyde commented on PHOENIX-2679:
--

Do these shadow tables exist in HBase?
[jira] [Commented] (PHOENIX-2679) Implement column family schema structure in Calcite-Phoenix
[ https://issues.apache.org/jira/browse/PHOENIX-2679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15547766#comment-15547766 ]

Maryann Xue commented on PHOENIX-2679:
--

I'm implementing it using shadow tables, which means each structured type Phoenix table (referenced in the query) will have a corresponding flattened type shadow table. At a point right before real query planning starts, we'll have a Hep program that replaces a "flattening" Project on top of a TableScan of the structured type table with a TableScan of the flattened type table. After this point, everything will work exactly the same as it did before. Secondary indexes will be defined based on these shadow tables, which I assume would just work the same as before too.

I've also managed to avoid the side effects of shadow tables by adding them through a Hook that's triggered after sql-to-rel conversion, so that they won't be visible at validation time.

Part of it is already working, but with some changes on the Calcite side, as shown below:
{code}
diff --git a/core/src/main/java/org/apache/calcite/prepare/Prepare.java b/core/src/main/java/org/apache/calcite/prepare/Prepare.java
index 70cddf6..8d9b5f4 100644
--- a/core/src/main/java/org/apache/calcite/prepare/Prepare.java
+++ b/core/src/main/java/org/apache/calcite/prepare/Prepare.java
@@ -258,7 +258,7 @@ public PreparedResult prepareSql(
     // Structured type flattening, view expansion, and plugging in physical
     // storage.
-    root = root.withRel(flattenTypes(root.rel, true));
+    root = root.withRel(sqlToRelConverter.flattenTypes(root.rel, true));
     if (this.context.config().forceDecorrelate()) {
       // Subquery decorrelation.
@@ -363,7 +363,7 @@ private boolean shouldTrim(RelNode rootRel) {
     // For now, don't trim if there are more than 3 joins. The projects
     // near the leaves created by trim migrate past joins and seem to
     // prevent join-reordering.
-    return THREAD_TRIM.get() || RelOptUtil.countJoins(rootRel) < 2;
+    return THREAD_TRIM.get() && RelOptUtil.countJoins(rootRel) < 2;
   }
   public RelRoot expandView(RelDataType rowType, String queryString,
{code}
The reasons are:
1. CalcitePrepareStmt.flattenTypes() doesn't really do anything to flatten types, but SqlToRelConverter.flattenTypes() does.
2. trimUnusedFields() doesn't seem to work right for structured types after flattening. The OR logic seems to be wrong by itself, or was it intended that way?
{code}
diff --git a/core/src/main/java/org/apache/calcite/sql2rel/RelStructuredTypeFlattener.java b/core/src/main/java/org/apache/calcite/sql2rel/RelStructuredTypeFlattener.java
index 82b1f4e..db508d6 100644
--- a/core/src/main/java/org/apache/calcite/sql2rel/RelStructuredTypeFlattener.java
+++ b/core/src/main/java/org/apache/calcite/sql2rel/RelStructuredTypeFlattener.java
@@ -27,6 +27,7 @@ import org.apache.calcite.rel.core.CorrelationId;
 import org.apache.calcite.rel.core.Sample;
 import org.apache.calcite.rel.core.Sort;
+import org.apache.calcite.rel.core.TableScan;
 import org.apache.calcite.rel.core.Uncollect;
 import org.apache.calcite.rel.logical.LogicalAggregate;
 import org.apache.calcite.rel.logical.LogicalCalc;
@@ -39,7 +40,6 @@ import org.apache.calcite.rel.logical.LogicalSort;
 import org.apache.calcite.rel.logical.LogicalTableFunctionScan;
 import org.apache.calcite.rel.logical.LogicalTableModify;
-import org.apache.calcite.rel.logical.LogicalTableScan;
 import org.apache.calcite.rel.logical.LogicalUnion;
 import org.apache.calcite.rel.logical.LogicalValues;
 import org.apache.calcite.rel.stream.LogicalChi;
@@ -648,7 +648,7 @@ private boolean isConstructor(RexNode rexNode) {
         || (call.isA(SqlKind.NEW_SPECIFICATION));
   }
-  public void rewriteRel(LogicalTableScan rel) {
+  public void rewriteRel(TableScan rel) {
     RelNode newRel = rel.getTable().toRel(toRelContext);
     if (!SqlTypeUtil.isFlat(rel.getRowType())) {
       final List<Pair<RexNode, String>> flattenedExpList = Lists.newArrayList();
{code}
3. Apparently it wouldn't work for PhoenixTableScan, which is part of rel.

[~julianhyde], for (2) and (3), do you think it would be reasonable to make these changes in Calcite? For (1), can we make it an option?

Aside from the points listed above, one big blocker right now is that the ProjectFilterTransposeRule cannot push the "flattening" Project through the filter, because the "flattening" Project is considered a trivial Project. So right now any query with a filter cannot work. I'll try to see if I can work this out.
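The question about the OR in shouldTrim() is easier to see as a truth table. The sketch below is plain Java with two made-up methods that mirror the two conditions from the Prepare.java diff; threadTrim stands in for THREAD_TRIM.get() and joinCount for RelOptUtil.countJoins(rootRel). It only shows how the two expressions differ and says nothing about which behavior Calcite intends.

```java
/** Hypothetical stand-ins for the two shouldTrim() conditions, for illustration only. */
final class ShouldTrimLogic {

    /** Mirrors the original condition: trim if the flag is set OR the query has few joins. */
    static boolean shouldTrimOr(boolean threadTrim, int joinCount) {
        return threadTrim || joinCount < 2;
    }

    /** Mirrors the patched condition: trim only if the flag is set AND the query has few joins. */
    static boolean shouldTrimAnd(boolean threadTrim, int joinCount) {
        return threadTrim && joinCount < 2;
    }

    public static void main(String[] args) {
        boolean[] flags = {false, true};
        int[] joinCounts = {0, 3};
        for (boolean flag : flags) {
            for (int joins : joinCounts) {
                System.out.println("threadTrim=" + flag + ", joins=" + joins
                    + " -> OR: " + shouldTrimOr(flag, joins)
                    + ", AND: " + shouldTrimAnd(flag, joins));
            }
        }
    }
}
```

With the OR form, a query with fewer than two joins is trimmed even when the flag is false, so the flag alone cannot switch trimming off for such queries. With the AND form, a false flag always disables trimming, which is what is needed to keep field trimming out of the way until the shadow table replacement has run.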