The purpose of Calcite’s Spark adapter is to circumvent Spark SQL and Catalyst entirely. Calcite parses the SQL, optimizes it to create a physical plan that uses Spark relational operators, and then converts that plan to a Spark program.
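As a rough sketch of what that looks like from the user's side: my understanding is that the Spark adapter is enabled through the `spark=true` property in the Calcite JDBC connect string, with `calcite-core` and `calcite-spark` on the classpath. The snippet below only builds the connect string (the table name `emps` and the commented-out query are illustrative, not from any particular schema):

```java
public class CalciteSparkConnect {
    // Build a Calcite JDBC URL; spark=true asks Calcite to implement the
    // physical plan using its Spark relational operators rather than its
    // default (enumerable) convention.
    static String connectUrl(boolean useSpark) {
        return "jdbc:calcite:spark=" + useSpark;
    }

    public static void main(String[] args) {
        System.out.println(connectUrl(true));
        // With calcite-core and calcite-spark on the classpath, a query
        // would then run through plain JDBC, e.g.:
        //   Connection conn = DriverManager.getConnection(connectUrl(true));
        //   ResultSet rs = conn.createStatement()
        //       .executeQuery("select * from emps where deptno = 10");
        // Calcite parses and optimizes the SQL itself; Catalyst is never
        // involved.
    }
}
```

Note that the SQL here is planned entirely by Calcite's own optimizer; Spark is used only as the execution engine.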
If you want to use Spark SQL and Catalyst, that’s totally fine, but don’t use Calcite for those cases.

Julian

> On Mar 16, 2018, at 11:44 AM, Linan Zheng <[email protected]> wrote:
>
> Hi Everyone,
>
> My name is Linan Zheng, and I am currently a senior CS student at Boston
> University. I am fascinated by the idea of adding Apache Spark's
> DataFrame/DataSet API support to Apache Calcite. Right now I am working on
> the proposal, which I hope I can get some advice on. My question is
> that, since Spark has implemented the Catalyst query optimizer in its Spark
> SQL, how should I approach Catalyst's planning rules (logical and physical)?
> And who should be in charge of the query optimization? Any advice and
> corrections will be much appreciated. Thank you for reading this email.
>
> --
> Best Regards,
> Linan Zheng
