[jira] [Created] (CALCITE-5530) RelToSqlConverter[ORDER BY] generates an incorrect field alias when 2 projection fields have the same name
Abbas Gadhia created CALCITE-5530:

Summary: RelToSqlConverter[ORDER BY] generates an incorrect field alias when 2 projection fields have the same name
Key: CALCITE-5530
URL: https://issues.apache.org/jira/browse/CALCITE-5530
Project: Calcite
Issue Type: Bug
Components: core
Affects Versions: 1.33.0
Reporter: Abbas Gadhia

In queries typical of Teradata, if there exists an expression alias that is also a field in the underlying table, any references to that field in the ORDER BY refer to the underlying physical column rather than the expression alias. For example, in the following query and Rel:

{code:java}
SELECT UPPER(ENAME) AS EMPNO FROM scott.EMP order by EMPNO;
{code}

{noformat}
LogicalProject(EMPNO=[$0])
  LogicalSort(sort0=[$1], dir0=[ASC])
    LogicalProject(EMPNO=[UPPER($1)], EMPNO0=[$0])
      LogicalTableScan(table=[[scott, EMP]])
{noformat}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
Generated Aliases for Expressions - $f and expr$
Hi,

I noticed that while converting from SQL to Rel in SqlToRelConverter, the method deriveAlias is used, which creates aliases of the form EXPR$0 (when there is no specific alias assigned to an expression). However, if we use RelBuilder.project_, it creates aliases/field names using SqlValidatorUtil.F_SUGGESTER. It is apparent that F_SUGGESTER is intended for deriving field names; however, it also gets used for deriving names for unnamed expressions (in RelBuilder).

The check in SqlUtil.isGeneratedAlias covers only the EXPR$ aliases and not the $f aliases.

Questions:
Can we additionally check $f aliases as part of the SqlUtil.isGeneratedAlias check?
Is there any specific background to $f aliases that is not immediately apparent?

Thanks
Abbas
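For illustration, here is a minimal sketch of what the broadened check might look like. This is my own sketch of the proposed behaviour, not Calcite's actual implementation; the class name is made up, and "EXPR$" / "$f" are the two generated-alias prefixes mentioned above.

```java
// Sketch (not Calcite's code): a combined check for both generated-alias styles.
// "EXPR$n" comes from SqlToRelConverter's deriveAlias, while "$fn" comes from
// SqlValidatorUtil.F_SUGGESTER as used by RelBuilder.
public class GeneratedAliasCheck {
  static boolean isGeneratedAlias(String name) {
    return name != null
        && (name.startsWith("EXPR$") || name.startsWith("$f"));
  }

  public static void main(String[] args) {
    System.out.println(isGeneratedAlias("EXPR$0")); // true
    System.out.println(isGeneratedAlias("$f2"));    // true
    System.out.println(isGeneratedAlias("EMPNO"));  // false
  }
}
```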
Re: Volcano Planner - Tree unimplementable
Hi Masayuki,

Your suggestions helped quite a bit. The mistake I was making was trying to create a tree without ANY enumerable or bindable traits, since after the optimization phase I don't really need to "execute" anything; I just wished to print the optimal/re-shuffled tree in a special readable way. Your "catch all" converter (CassandraToEnumerableConverter) was part of the solution, i.e. (rel#7:Subset#1).

I was also making the mistake of not setting the root of the tree to a top-level EnumerableConvention/BindableConvention. This led to the other problem, i.e. (rel#4:Subset#1.NONE.[]) with best=null. I fixed this with the following:

    RelNode project = relBuilder.build();
    RelNode root = cluster.getPlanner().changeTraits(project,
        project.getTraitSet().replace(EnumerableConvention.INSTANCE).simplify());
    cluster.getPlanner().setRoot(root);

Thanks for all your help!

Regds
Abbas

On Saturday, 17 March 2018 11:37 PM, Masayuki Takahashi wrote:

> I mistook a sentence.
>
> wrong: This rel was selected at Set#1 because this one is ENUMERABLE and "best=null".
> correct: This rel was selected at Set#1 because this one is ENUMERABLE. "best=null" is the cause of the error.
>
> thanks.
> --
> Masayuki Takahashi
Re: Calcite class diagram
There are some interesting presentations given by Christian Tzolov and Jordan Halterman which document many of the important classes. Here are the links:

https://events.static.linuxfound.org/sites/events/files/slides/ApacheCon2016ChristianTzolov.v4.pdf
https://www.slideshare.net/JordanHalterman/introduction-to-apache-calcite

Regds
Abbas

On Monday, 19 March 2018 1:22 AM, Edmon Begoli wrote:

It is in a paper that went to SIGMOD this year: https://arxiv.org/abs/1802.10233

Keep in mind, it is not much, but if you read the paper you will get a pretty decent idea at the high-to-mid level of what is going on.

On Sun, Mar 18, 2018 at 3:49 PM, Muhammad Gelbana wrote:

> Is this UML shared somewhere I can access? Would you please share it? Any
> insight into Calcite's structure may make all the difference for me.
>
> Thanks,
> Gelbana
>
> On Sun, Mar 18, 2018 at 9:24 PM, Edmon Begoli wrote:
>
> > This is realistic for any real-life framework. UML breaks down very quickly
> > when applied to real software - both generation and maintenance.
> >
> > We have UML for some high-level concepts such as adapters, etc., but it is
> > conceptual, not 1:1 mapped class diagrams.
> >
> > On Sun, Mar 18, 2018 at 3:22 PM, Muhammad Gelbana wrote:
> >
> > > Has anyone generated a decent class diagram representing Calcite's core
> > > classes and their relations?
> > >
> > > I've just tried to create one using *code2uml* and in a few minutes it
> > > has generated a 5.9 GB image file! And it wasn't 10% complete!
> > >
> > > Would you kindly share if you have anything of the sort?
> > >
> > > Thanks,
> > > Gelbana
Volcano Planner - Tree unimplementable
Hi,

I'm having problems getting the Volcano Planner to implement my Rel tree. I keep getting errors like "could not be implemented; planner state".

My target tree has 2 conventions. The first is what I'm calling a "Native" convention, which is similar to the JdbcConvention in its semantics, but I don't use the JdbcConvention because it forces me to create JdbcTableScans that need an instance of a live JDBC DataSource. The second convention I have is called Foo (for lack of a better name), using which I intend to physically implement or execute my Rel tree.

Here is my original rel, which I built using 2 RelBuilders (one RelBuilder built using a custom TableScanFactory and the other RelBuilder built using an empty Context, i.e. the default TableScanFactory):

LogicalProject
  NativeTableScan

To implement this tree, I have a few rules set up, such as:

NativeToFooConverterRule (similar to JdbcToEnumerableConverterRule)
NativeProjectRule (similar to JdbcProjectRule)

I register these rules inside of the NativeConvention in the register method (similar to JdbcConvention). For now, I've hardcoded the cost (row count) of the NativeTable object (which resides inside RelOptSchema) to be 100.

The planner state, after it fails, is below. In case the full log with TRACE levels is needed, I've attached it here.
https://tinyurl.com/yct3hnle

java.lang.RuntimeException: org.apache.calcite.plan.RelOptPlanner$CannotPlanException: Node [rel#4:Subset#1.NONE.[]] could not be implemented; planner state:

Root: rel#4:Subset#1.NONE.[]

Original rel:
LogicalProject(subset=[rel#4:Subset#1.NONE.[]], id=[$0]): rowcount = 100.0, cumulative cost = {100.0 rows, 100.0 cpu, 0.0 io}, id = 3
  NativeTableScan(subset=[rel#2:Subset#0.NativeTeradataConvention.[]], table=[[s1, emp]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io}, id = 0

Sets:
Set#0, type: RecordType(INTEGER id, VARCHAR name, INTEGER department)
  rel#2:Subset#0.NativeTeradataConvention.[], best=rel#0, importance=0.9
    rel#0:NativeTableScan.NativeTeradataConvention.[](table=[s1, emp]), rowcount=100.0, cumulative cost={100.0 rows, 101.0 cpu, 0.0 io}
  rel#5:Subset#0.ENUMERABLE.[], best=null, importance=0.45
  rel#15:Subset#0.BeamConvention.[], best=rel#14, importance=0.45
    rel#14:NativeToBeamConverterRel.BeamConvention.[](input=rel#2:Subset#0.NativeTeradataConvention.[]), rowcount=100.0, cumulative cost={110.0 rows, 111.0 cpu, 0.0 io}
Set#1, type: RecordType(INTEGER id)
  rel#4:Subset#1.NONE.[], best=null, importance=1.0
    rel#3:LogicalProject.NONE.[](input=rel#2:Subset#0.NativeTeradataConvention.[],id=$0), rowcount=100.0, cumulative cost={inf}
  rel#7:Subset#1.ENUMERABLE.[], best=null, importance=0.5
    rel#6:EnumerableProject.ENUMERABLE.[](input=rel#5:Subset#0.ENUMERABLE.[],id=$0), rowcount=100.0, cumulative cost={inf}
  rel#9:Subset#1.NativeTeradataConvention.[], best=rel#8, importance=0.5
    rel#8:NativeProject.NativeTeradataConvention.[](input=rel#2:Subset#0.NativeTeradataConvention.[],id=$0), rowcount=100.0, cumulative cost={180.0 rows, 181.0 cpu, 0.0 io}
  rel#11:Subset#1.BeamConvention.[], best=rel#10, importance=0.5
    rel#10:NativeToBeamConverterRel.BeamConvention.[](input=rel#9:Subset#1.NativeTeradataConvention.[]), rowcount=100.0, cumulative cost={190.0 rows, 191.0 cpu, 0.0 io}

I tried referring to this email on the mailing list, but so far no luck.
https://mail-archives.apache.org/mod_mbox/calcite-dev/201606.mbox/%3ccfa8c0e7e88fc641b42fb1b7e70fc487b8bc8...@szxema508-mbx.china.huawei.com%3e

Thanks
Abbas
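For context on the rules mentioned above, this is the general shape such a converter rule usually takes, modelled on JdbcToEnumerableConverterRule. This is a sketch under assumptions: NativeConvention, FooConvention, and NativeToFooConverterRel are placeholder names from this thread, not real Calcite classes, and the surrounding project setup is omitted.

```java
// Sketch of a convention-bridging rule (placeholder types, not Calcite's code).
import org.apache.calcite.plan.RelTraitSet;
import org.apache.calcite.rel.RelNode;
import org.apache.calcite.rel.convert.ConverterRule;

public class NativeToFooConverterRule extends ConverterRule {
  public NativeToFooConverterRule(NativeConvention in, FooConvention out) {
    // Match any RelNode in the Native convention; offer it in the Foo convention.
    super(RelNode.class, in, out, "NativeToFooConverterRule");
  }

  @Override public RelNode convert(RelNode rel) {
    // Wrap the input in a converter rel carrying the output convention,
    // so the planner can cost the converted subtree.
    RelTraitSet outTraits = rel.getTraitSet().replace(getOutTrait());
    return new NativeToFooConverterRel(rel.getCluster(), outTraits, rel);
  }
}
```

Note that for the planner to pick anything, the root must also be converted to the final convention (the changeTraits/setRoot fix discussed elsewhere in this thread).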
Re: Multi Product support in a single RelNode tree
These are great inputs. Thank you!

On Thursday 15 February 2018, 1:04:18 AM IST, Julian Hyde <jh...@apache.org> wrote:

Schema (and SchemaPlus) is a namespace used to look up object names when validating a SQL query. It is not strictly required if you are building the query manually, or using RelBuilder.

The key is the TableScan objects (in this case JdbcTableScan) representing accesses to tables. Those tables could be foreign tables in the same schema, or in different schemas, or be free-floating objects not in any schema at all.

The important thing is the instance of JdbcConvention in their TableScan.getTraitSet(). That JdbcConvention contains the URL of the database, its dialect, etc. With different JdbcConvention instances you could join a table in an Oracle database to a table in a SqlServer database or even in a different Oracle database. But if two tables are in the same Oracle database they must have the same JdbcConvention instance. Otherwise Calcite will not consider creating a JdbcJoin (i.e. a join inside the target database).

Julian

> On Feb 13, 2018, at 10:01 PM, Abbas Gadhia <ab_gad...@yahoo.com.INVALID> wrote:
>
> Hi,
> I want to build a RelNode tree with different conventions on different
> RelNodes (for example: in the following select query "select * from t1,t2",
> t1 is a table from Oracle and t2 is a table from SqlServer).
>
> I'm confused whether I should be using a single SchemaPlus to hold table
> references from both Oracle and SqlServer or I should be creating a different
> SchemaPlus for each product. A different SchemaPlus would force me to use a
> different RelBuilder, so my guess is that a single SchemaPlus with the
> following hierarchy may suffice ("oracle" -> "database1" -> "schema1" ->
> "t1"). However, I suspect this single hierarchy (with the product name
> inside) may not play well with other parts of Calcite.
> Any thoughts, however small, would be appreciated.
> Thanks
> Abbas
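The one-JdbcConvention-per-database point can be sketched roughly as follows. This is a sketch under assumptions, not code from the thread: the convention names and the Expression parameters (which locate each JdbcSchema for code generation) are hypothetical.

```java
// Sketch: two distinct JdbcConvention instances, one per physical database.
// Tables from the same database must share one instance; otherwise Calcite
// will not consider pushing a join down into that database.
import org.apache.calcite.adapter.jdbc.JdbcConvention;
import org.apache.calcite.linq4j.tree.Expression;
import org.apache.calcite.sql.dialect.MssqlSqlDialect;
import org.apache.calcite.sql.dialect.OracleSqlDialect;

class PerDatabaseConventions {
  // oracleSchemaExpression / sqlServerSchemaExpression are placeholders for
  // the linq4j expressions that reference each JdbcSchema.
  static JdbcConvention oracleDb1(Expression oracleSchemaExpression) {
    return JdbcConvention.of(OracleSqlDialect.DEFAULT, oracleSchemaExpression, "ORACLE_DB1");
  }

  static JdbcConvention sqlServerDb1(Expression sqlServerSchemaExpression) {
    return JdbcConvention.of(MssqlSqlDialect.DEFAULT, sqlServerSchemaExpression, "SQLSERVER_DB1");
  }
}
```

Every TableScan over a table in ORACLE_DB1 would then carry the first convention in its trait set, and Calcite will only form a JdbcJoin between scans whose trait sets share the same instance.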
Multi Product support in a single RelNode tree
Hi,

I want to build a RelNode tree with different conventions on different RelNodes (for example: in the following select query "select * from t1,t2", t1 is a table from Oracle and t2 is a table from SqlServer).

I'm confused whether I should be using a single SchemaPlus to hold table references from both Oracle and SqlServer, or whether I should be creating a different SchemaPlus for each product. A different SchemaPlus would force me to use a different RelBuilder, so my guess is that a single SchemaPlus with the following hierarchy may suffice ("oracle" -> "database1" -> "schema1" -> "t1"). However, I suspect this single hierarchy (with the product name inside) may not play well with other parts of Calcite.

Any thoughts, however small, would be appreciated.

Thanks
Abbas
[jira] [Created] (CALCITE-1998) Hive - Version specific handling for NULLS FIRST/ NULLS LAST
Abbas Gadhia created CALCITE-1998:

Summary: Hive - Version specific handling for NULLS FIRST/ NULLS LAST
Key: CALCITE-1998
URL: https://issues.apache.org/jira/browse/CALCITE-1998
Project: Calcite
Issue Type: Bug
Components: core
Affects Versions: 1.15.0
Reporter: Abbas Gadhia
Assignee: Julian Hyde
Fix For: 1.15.0

This JIRA is for PR https://github.com/apache/calcite/pull/545

We are making the HiveSqlDialect version-aware for the NULLS FIRST and NULLS LAST feature. In https://issues.apache.org/jira/browse/HIVE-12994, the authors clarified that the default NullCollation for Hive is "NullDirection.LOW". Currently, the default set in Calcite for Hive is NullCollation.HIGH.

In this PR, we are making 2 changes:
# Change the default NullCollation from HIGH to LOW
# Add NullCollation emulation in the HiveSqlDialect when the version of the dialect is less than 2.1.0 or when the version is blank.

We're also adding a new dialect "BigQuerySqlDialect" with the "quoteString" as "`" based on https://cloud.google.com/bigquery/docs/reference/standard-sql/lexical
Translate RelToSql based on the version of the database
Hi,

I would like to add support in the SqlImplementor class for the NULLS FIRST/NULLS LAST feature that has recently been added to Hive: https://issues.apache.org/jira/browse/HIVE-12994

I know that I need to modify the method "addOrderItem"; however, I also realize that if I do so, I will break compatibility for versions of Hive that are less than 2.1.0 (since the feature was added only in 2.1.0). I also know that there is an active PR, https://issues.apache.org/jira/browse/CALCITE-1913, which attempts to add a version field to the SqlDialect. However, it seems that it will take time to complete.

What do you think I should do? Should I go ahead and add the "null direction" feature for Hive without consideration for backward compatibility?

Thanks
Abbas Gadhia
Table Alias in TableScan
Hi,

I'm trying to build a Hive query adapter using Calcite. My source system can be anything: SQL, NoSQL, etc. Since the source systems typically have business names/aliases given to their tables in the queries, I would like to preserve them in the Hive adapter layer, so that the translated query contains the same business names.

I was thinking of storing the alias in a custom "HiveTableScan extends TableScan" class. However, RelBuilder.scan() and the accompanying RelFactories don't allow me to do so (at least that's what I think). I know that Apache Hive doesn't go via the RelBuilder API/RelFactories and instead chooses to manually create a "new HiveTableScan" object with its accompanying "RelOptHiveTable" class.

Is this the way I should be going? Or is there a more elegant way? Sorry if this question might be too basic.

Best Regards
Abbas
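The "custom TableScan carrying an alias" idea from the question can be sketched as below. This is my own guess at the shape of such a class, not Hive's actual HiveTableScan; the class and field names are made up.

```java
// Sketch (hypothetical class): a TableScan subclass that carries the
// business alias alongside the underlying table, so a later RelToSql
// step could emit the alias instead of the physical name.
import org.apache.calcite.plan.RelOptCluster;
import org.apache.calcite.plan.RelOptTable;
import org.apache.calcite.plan.RelTraitSet;
import org.apache.calcite.rel.core.TableScan;

public class AliasedTableScan extends TableScan {
  private final String businessAlias;

  public AliasedTableScan(RelOptCluster cluster, RelTraitSet traitSet,
      RelOptTable table, String businessAlias) {
    super(cluster, traitSet, table);
    this.businessAlias = businessAlias;
  }

  public String getBusinessAlias() {
    return businessAlias;
  }
}
```

Since RelBuilder.scan() does not take extra arguments, such a scan would have to be constructed directly (as Hive does with "new HiveTableScan") or via a custom TableScanFactory.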
RelBuilder examples
Hi,

I'm trying to use RelBuilder to do a project and join together; however, none of the tests in RelBuilderTest.java seem to show an example of how to do so. Is there a place where I can find such an example? I'm trying to dig into Hive code to see how it's done there, but it's taking me quite some time to understand the CalcitePlanner.java code in Hive.

Any help would be appreciated.

Thanks
Abbas
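For what it's worth, a join followed by a project with RelBuilder might look roughly like this. This is a sketch under assumptions: it presumes a FrameworkConfig whose default schema contains EMP and DEPT tables with a common DEPTNO column (the usual scott test schema); the class and method names around it are made up.

```java
// Sketch: scan two tables, join them, then project a couple of columns.
import org.apache.calcite.rel.RelNode;
import org.apache.calcite.rel.core.JoinRelType;
import org.apache.calcite.tools.FrameworkConfig;
import org.apache.calcite.tools.RelBuilder;

public class JoinProjectExample {
  static RelNode build(FrameworkConfig config) {
    RelBuilder builder = RelBuilder.create(config);
    return builder
        .scan("EMP")                        // left input
        .scan("DEPT")                       // right input
        .join(JoinRelType.INNER, "DEPTNO")  // join on the shared DEPTNO column
        .project(builder.field("ENAME"),    // keep only two output columns
                 builder.field("DNAME"))
        .build();
  }
}
```

RelBuilder keeps a stack of inputs, so the two scan() calls push the join's left and right inputs, and join() pops both.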