+1

Since we are starting this refactoring for CarbonData 2.0, which is a major version upgrade, I suggest we also consider the following changes:

1. Make the global dictionary obsolete so that the planning phase is cleaner. Since Spark's Tungsten project, the benefit gained from the global dictionary is small.
2. Make the "stored by" syntax obsolete, so that the CREATE TABLE DDL fully complies with Hive and SparkSQL syntax, keeping only the "stored as" and "using" syntax.

Regards,
Jacky

On 2019/08/22 04:58:53, Ajith shetty <[email protected]> wrote:
> Hi Community
>
> From https://issues.apache.org/jira/browse/SPARK-18127 Spark provides
> SparkSessionExtensions in order to extended capabilities of spark. Carbon can
> use this in order to avoid the tight coupling due to CarbonSession in spark
> environment.
> https://spark.apache.org/docs/2.4.3/api/java/org/apache/spark/sql/SparkSessionExtensions.html
>
> Main Scope:
> 1. Compatible with Spark 2.3.2+
> 2. Make Carbon Parser Pluggable
>    a. Move to antlr4 based parser
> 3. Make Analyzer Rules Pluggable
> 4. Make Optimizer Rules Pluggable
> 5. Make Planning Strategies Pluggable
>
> We can have Sub jiras in order to cover all the scenarios due to this. Please
> input your thoughts.
>
> Regards
>
