Re: SparkSQL and star schema

2015-02-14 Thread Michael Armbrust
Yes, though for good performance it is usually important to make sure that you have statistics for the smaller dimension tables. Today that can be done by creating the tables in the Hive metastore and running ANALYZE TABLE <table> COMPUTE STATISTICS noscan on each of them. In Spark 1.3 this will happen automatically.
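As a rough illustration of the step above, here is a minimal Python sketch that builds the ANALYZE TABLE statement for each dimension table and passes it to whatever executes SQL in your setup. The table names (dim_date, dim_customer, dim_product) and the helper functions are hypothetical, made up for this example. The assumption is that on Spark 1.2 you would pass the `sql` method of a HiveContext as the runner, and that the size statistics gathered this way are what Spark SQL compares against spark.sql.autoBroadcastJoinThreshold when deciding whether a small table can be broadcast in a join.

```python
# Hypothetical dimension table names; replace them with the tables
# that are actually registered in your Hive metastore.
DIMENSION_TABLES = ["dim_date", "dim_customer", "dim_product"]


def analyze_statement(table):
    """Build the statistics statement from the reply for a single table."""
    return "ANALYZE TABLE {0} COMPUTE STATISTICS noscan".format(table)


def collect_dimension_stats(sql_runner, tables=DIMENSION_TABLES):
    """Send one ANALYZE TABLE statement per table to `sql_runner`.

    `sql_runner` is any callable that accepts a SQL string. On Spark 1.2
    that would presumably be `HiveContext(sc).sql` (an assumption about
    your environment, not something this sketch checks).
    """
    statements = [analyze_statement(t) for t in tables]
    for stmt in statements:
        sql_runner(stmt)
    # Return the statements so the caller can log or inspect them.
    return statements
```

For a dry run without a cluster, calling collect_dimension_stats(print) under Python 3 just prints the statements. Only the dimension tables are listed here because the reply singles out the smaller tables as the ones that need statistics.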

SparkSQL and star schema

2015-02-13 Thread Paolo Platter
Hi,

is SparkSQL + Parquet suitable to replicate a star schema?

Paolo Platter
AgileLab CTO