Re: Template SQL

Askar Bozcan Fri, 07 Apr 2023 08:03:54 -0700

Hey,
You can use Planner.reset(). Note that it has mutable state inside, so do
not reuse the same planner instance in any multi-threading environment.


(I am assuming you have access to table metadata so that you will be able
to convert your SqlNode tree into RelNode tree, relational expression tree)
- Only return results above a certain threshold when using GROUP BY, for
example (HAVING COUNT(col1) > 100).

I'm not quite sure I understand this question is HAVING part of the query?

- Restrict the column on which joins can happen, or else throw an error
(almost like analysisException in Spark)

Do you have access to table/schema metadata?
If you do:
1) Convert your parsed syntax tree (SqlNode) into a logical relational tree
(RelNode).
Watch this tutorial by Stamatis: https://www.youtube.com/watch?v=p1O3E33FIs8.
It will explain way better the usage than I can in an email. (And it what
introduced me to Calcite's basics :))
2) Traverse your relational tree by implementing RelShuttle
<https://calcite.apache.org/javadocAggregate/org/apache/calcite/rel/RelShuttle.html>.
(Look up Visitor pattern to better understand how this recursive traversal
works)
3) Ignore any RelNode's which are not LogicalJoin.
4) When you encounter LogicalJoin, traverse its children with getLeft and
getRight
5) When you encounter LogicalTableScan,  You can get its RelOptTable with
getTable, and then RelOptTable.getRowType() to find the fields.
(Not 100% about step 5, but should be close to it)


- Restrict the columns that can be passed in a select statement or else
throw an error (like in the point above)
Same logic as above. But instead of TableScan, look at the root
LogicalProject.

All of the above should be doable with just the parse tree (SqlNode) and
without access to table metadata using SqlShuttle; however, it's easier and
less error-prone with relational tree IMO.

- Askar


On Thu, 6 Apr 2023 at 23:31, Soumyadeep Mukhopadhyay <soumyamy...@gmail.com>
wrote:

> Hey Askar,
>
> It worked exactly how you suggested.
>
> ```
>
> private val sqlQuery: String = "SELECT list_of_columns FROM table_name
> WHERE predicate_column = 'predicate_value'"
> private val frameworkConfig: FrameworkConfig =
> Frameworks.newConfigBuilder.build()
> private val planner: Planner = Frameworks.getPlanner(frameworkConfig)
> private val planner2: Planner = Frameworks.getPlanner(frameworkConfig)
> private val planner3: Planner = Frameworks.getPlanner(frameworkConfig)
> private val sqlNode: SqlNode = planner.parse(sqlQuery)
> println(sqlNode.getKind)
> private val sqlSelectStmt: SqlSelect = sqlNode.asInstanceOf[SqlSelect]
>
> private val setSelectColumnsQuery = "SELECT age"
> private val selectList =
> planner2.parse(setSelectColumnsQuery).asInstanceOf[SqlSelect].getSelectList
> private val setFromTableQuery = "SELECT employee"
> private val fromTable =
> planner3.parse(setFromTableQuery).asInstanceOf[SqlSelect].getSelectList
>
> sqlSelectStmt.setSelectList(selectList)
> sqlSelectStmt.setFrom(fromTable)
> private val finalQuery = sqlSelectStmt.asInstanceOf[SqlNode]
> println(finalQuery.toSqlString(SnowflakeSqlDialect.DEFAULT))
>
> ```
>
> The only caveat I see is the need for a new planner for every new query.
> Should I do something else or is this expected?
>
> On a different note, I wanted to ask about how I can write my own rules. Is
> it possible to enforce following rules:
> - Only return results above a certain threshold when using GROUP BY, for
> example (HAVING COUNT(col1) > 100).
> - Restrict the column on which joins can happen, or else throw an error
> (almost like analysisException in Spark)
> - Restrict the columns that can be passed in a select statement or else
> throw an error (like in the point above)
>
> Not sure if it is feasible, please feel free to suggest. :) Thanks again
> for your time!
>
> With regards,
> Soumyadeep Mukhopadhyay.
>
>
> On Sun, 2 Apr 2023 at 8:59 PM, Oscar Mulin <askar.mu...@gmail.com> wrote:
>
> > Hey Soumyadeep,
> > I think that can work with a few caveats.
> > 0) Use the Planner from Frameworks
> > <
> >
> https://calcite.apache.org/javadocAggregate/org/apache/calcite/tools/Frameworks.html
> > >
> > 1) Parse the “template query" into a syntax tree (a root SqlNode) but do
> > not validate it
> > 2) Cast the root SqlNode into a SqlSelect
> > <
> >
> https://calcite.apache.org/javadocAggregate/org/apache/calcite/sql/SqlSelect.html
> > >
> > (can
> > verify that it's a SELECT by using getKind())
> > 3) Use the setters from SqlSelect to modify the children of root
> SqlSelect
> > node.
> > 4) Unparse the root query back into a string (SqlNode.unparse()).
> >
> > Now the tricky part is 3), as you have to set proper SqlNode types as
> > children of the SELECT node for column names, table names and for WHERE
> > predicate. I don't remember them properly, but what you can do is parse a
> > proper query (again, don't validate it as you'll then need table
> metadata),
> > and check the kinds (getKind()) of children SqlNode's and replace them
> with
> > your replacements using setters of root SqlSelect node.
> >
> > This should work, but I'm not 100% certain as I'm unable to check right
> > now.
> >
> > PS: Take care about dialects. Dialects are used in unparsing as a
> > "configuration" of SqlWriter, and can unparse the syntax tree differently
> > based on the dialect you have chosen.
> >
> > Kind regards,
> > Askar Bozcan
> >
> > On 2 Apr 2023, at 14:56, Soumyadeep Mukhopadhyay <soumyamy...@gmail.com>
> > wrote:
> >
> > Hello All,
> >
> > I have just heard of Apache Calcite and was exploring the possibilities.
> I
> > wish to achieve the following, and wanted to check if my hunch is
> correct:
> > - Use a template to build SQL queries, like use jinja-sql or even pebble
> > (interpret the SqlNode tree kind of structure from my template and then
> add
> > the necessary fields like table name and group by fields from an input)
> >
> > So what I am expecting is "SELECT ? FROM ? WHERE ?" would be inside a
> query
> > template (in Jinja-sql it may look like "SELECT {{select_fields |
> sqlsafe}}
> > FROM {{table_name | sqlsafe}} WHERE {{where_clause | sqlsafe}}" and
> values
> > like 'select_fields' would be substituted at run-time from a processing
> > engine like https://pypi.org/project/Jinja2/, but for Java) and the
> output
> > would be "SELECT col1 FROM table1 WHERE col1 IS NOT NULL" with some
> dialect
> > (like Snowflake or Big-Query).
> >
> > Is this possible? Any recommendations or suggestions are welcome. Even if
> > the approach feels wrong please let me know. :)
> > Thank you for your time and consideration.
> >
> > With regards,
> > Soumyadeep Mukhopadhyay.
> >
>

Re: Template SQL

Reply via email to