Hi,
I am wondering what the benefit of Drill over Spark SQL is, since both of them 
share a lot in SQL optimization:
1. DAG execution, in-memory computation, and pipelining within a stage
2. Support for multiple data sources, like JSON, Parquet, CSV, TSV, HBase, Hive, etc.
3. Code generation
4. Columnar storage
...

One major difference between Drill and Spark SQL is that Drill is based on an MPP 
architecture, while Spark SQL is based on the MapReduce paradigm.

I am not familiar with MPP. Could someone elaborate on how Drill embraces MPP? 
It looks to me like Drill still needs a shuffle for operations like aggregations and joins.
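
To make clear what I mean by "shuffle", here is a minimal Python sketch of a 
hash-partitioned aggregation. It is not Drill or Spark code; the function names 
(shuffle_by_key, aggregate, distributed_sum) and the sample data are made up 
for illustration. The point is only that rows with the same key must end up on 
the same worker before a grouped sum can be finished, and I assume any 
distributed engine has to do some form of this.

```python
from collections import defaultdict


def shuffle_by_key(partitions, num_targets):
    """Repartition (key, value) rows so equal keys land in the same target."""
    targets = [[] for _ in range(num_targets)]
    for part in partitions:
        for key, value in part:
            # Hash partitioning: the target depends only on the key.
            targets[hash(key) % num_targets].append((key, value))
    return targets


def aggregate(partition):
    """Sum values per key within one partition (one hypothetical worker)."""
    totals = defaultdict(int)
    for key, value in partition:
        totals[key] += value
    return dict(totals)


def distributed_sum(partitions, num_targets=2):
    """Shuffle by key, aggregate each target, then merge the disjoint results."""
    result = {}
    for target in shuffle_by_key(partitions, num_targets):
        # Keys never overlap across targets, so a plain update is safe.
        result.update(aggregate(target))
    return result


# Two input partitions, as if read by two different workers.
partitions = [
    [("a", 1), ("b", 2)],
    [("a", 3), ("c", 4)],
]
totals = distributed_sum(partitions)
```

Here the key "a" appears in both input partitions, so its rows have to be moved 
to one place before the sum is correct. My question is how MPP changes this 
step, if at all.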
