[Question] Long Runtime for SubqueryRemoveRule, vs Expanding Subqueries
Hi Calcite community,

I'm running into some challenges using SubqueryRemoveRule and was hoping the community could help me find the gaps in my understanding.

I am attempting to remove subqueries from row expressions in a relational plan. My motivation is that I want to call metadata queries such as GetColumnOrigins on my relational plan, but these tools don't seem to work unless subqueries are removed.

To remove subqueries, I am...
- Using SqlToRelConverter to convert from SqlNode to RelRoot.
- After obtaining the RelRoot, using a HepProgram with subquery remove rules to remove the subqueries.

For small queries, this process works fine. However, when testing against larger queries (~2MB), removing subqueries takes about as much time as converting the SqlNode to a RelRoot, effectively doubling the time to analyze a query.

I've tried a deprecated setting, SqlToRelConverter.Config.isExpand, which seems to provide results similar to what I want with minimal performance impact. Since SqlToRelConverter.Config.isExpand is deprecated, I assumed that the non-deprecated way to expand subqueries would be at least as performant, but I am seeing worse performance. This leads me to believe that I'm using the APIs incorrectly, but I'm not sure what I am missing.

Q: Is there a more efficient way in Calcite to remove subqueries than what I've described above? Is there a risk in using SqlToRelConverter.Config.isExpand even though it's deprecated?

Q: Are there any other resources besides the Javadocs and GitHub projects that I could refer to in order to learn more about my problem?

Thanks for any help,
Logan
[jira] [Created] (CALCITE-6434) Specify identifier quoting for HiveSqlDialect and SparkSqlDialect
xiong duan created CALCITE-6434:
---
Summary: Specify identifier quoting for HiveSqlDialect and SparkSqlDialect
Key: CALCITE-6434
URL: https://issues.apache.org/jira/browse/CALCITE-6434
Project: Calcite
Issue Type: Improvement
Components: core
Affects Versions: 1.37.0
Reporter: xiong duan
Assignee: xiong duan
Fix For: 1.38.0

The SQL:
{code:java}
SELECT product.product_class_id C
FROM foodmart.product
LEFT JOIN (SELECT CASE COUNT(*)
                  WHEN 0 THEN NULL
                  WHEN 1 THEN MIN(product_class_id)
                  ELSE (SELECT NULL UNION ALL SELECT NULL) END $f0
           FROM foodmart.product) t0 ON TRUE
WHERE product.net_weight > t0.$f0
{code}
This SQL is generated for the SINGLE_VALUE aggregate function. Spark fails to parse it unless we add identifier quoting, such as `t0`.`$f0`.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
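To illustrate the quoting the issue asks for, here is a minimal, self-contained sketch. The class `BacktickQuoting` and its methods `quote` and `qualified` are hypothetical helpers written for this example; they are not Calcite's dialect API. The sketch also assumes that an embedded backtick is escaped by doubling it.

```java
public class BacktickQuoting {
    // Hypothetical helper (not part of Calcite): wraps one identifier in
    // backticks. Assumption: an embedded backtick is escaped by doubling it.
    static String quote(String identifier) {
        return "`" + identifier.replace("`", "``") + "`";
    }

    // Quotes each part of a qualified name and joins the parts with dots.
    static String qualified(String... parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            if (sb.length() > 0) {
                sb.append('.');
            }
            sb.append(quote(part));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The column reference from the WHERE clause in the report.
        System.out.println(qualified("t0", "$f0")); // prints `t0`.`$f0`
    }
}
```

With every part quoted, the generated alias `$f0` no longer appears as a bare identifier in the output SQL.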
[jira] [Created] (CALCITE-6433) SUBSTRING can return incorrect empty result for some parameters
Iurii Gerzhedovich created CALCITE-6433:
---
Summary: SUBSTRING can return incorrect empty result for some parameters
Key: CALCITE-6433
URL: https://issues.apache.org/jira/browse/CALCITE-6433
Project: Calcite
Issue Type: Improvement
Components: core
Affects Versions: 1.37.0
Reporter: Iurii Gerzhedovich

When the third parameter (length) is greater than Integer.MAX_VALUE, the SUBSTRING function can return an empty result, because the code clamps that value, after which it can no longer exceed Integer.MAX_VALUE.

A simple way to reproduce: add something like the following to *SqlOperatorTest*:
{noformat}
f.checkScalar(
    String.format("{fn SUBSTRING('abcdef', %d, %d)}", Integer.MIN_VALUE, 10L + Integer.MAX_VALUE),
    "abcdef", "VARCHAR(6) NOT NULL");
{noformat}
This is caused by the check that runs after clamping:
{noformat}
public static String substring(String c, int s, int l) {
  long e = (long) s + (long) l; // here we can get an incorrect length
  if (s > lc || e < 1L) {
    return "";
  }
{noformat}
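The arithmetic behind the report can be checked in isolation. The sketch below is not Calcite's code: `SubstringClampDemo`, `endAfterClamp` and `endWithoutClamp` are hypothetical names, and the clamp is assumed to be a plain saturation of the length to the int range before the end position is computed.

```java
public class SubstringClampDemo {
    // End position when the length is first clamped to the int range.
    // Assumption: this mirrors the order of operations described in the report.
    static long endAfterClamp(int start, long length) {
        int clamped = (int) Math.max(Integer.MIN_VALUE, Math.min(Integer.MAX_VALUE, length));
        return (long) start + (long) clamped;
    }

    // End position computed directly from the unclamped long length.
    static long endWithoutClamp(int start, long length) {
        return (long) start + length;
    }

    public static void main(String[] args) {
        int s = Integer.MIN_VALUE;
        long l = 10L + Integer.MAX_VALUE;
        System.out.println(endAfterClamp(s, l));   // prints -1
        System.out.println(endWithoutClamp(s, l)); // prints 9
    }
}
```

With the values from the test case, the clamped computation gives an end position of -1, so a check of the form `e < 1L` returns an empty string. The unclamped end position is 9, which lies past the end of 'abcdef', so the whole string would be expected.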