Re: SparkSQL + Parquet performance

2015-04-14 Thread Akhil Das
That depends entirely on your disk IO and the number of CPUs you have in the cluster. For example, if each machine has a disk IO of 100 MB/s and you have a handful of CPUs (say 40 cores on 10 machines), then I believe it could reach about 1 GB/s in aggregate. Thanks, Best Regards On Tue, Apr 7, 2015 at 2:48 AM,
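The estimate in the reply above can be reproduced as a back-of-the-envelope calculation. This is only a sketch: it assumes the 100 MB/s figure is per machine, that scans are disk-bound rather than CPU-bound, and that throughput scales linearly with machine count. The function names and the 500 GB dataset size are hypothetical and chosen for illustration.

```python
def aggregate_scan_throughput_mb_s(per_machine_disk_mb_s, machines):
    """Upper bound on cluster scan throughput, assuming linear scaling
    and that disk IO (not CPU or network) is the bottleneck."""
    return per_machine_disk_mb_s * machines


def scan_time_seconds(dataset_gb, per_machine_disk_mb_s, machines):
    """Rough time to read a dataset once at the aggregate throughput.
    Uses 1 GB = 1000 MB to match the ~1 GB/s figure in the thread."""
    throughput = aggregate_scan_throughput_mb_s(per_machine_disk_mb_s, machines)
    return dataset_gb * 1000 / throughput


if __name__ == "__main__":
    # Numbers from the thread: 100 MB/s of disk IO on each of 10 machines.
    print(aggregate_scan_throughput_mb_s(100, 10), "MB/s")
    # Hypothetical 500 GB dataset, full scan.
    print(scan_time_seconds(500, 100, 10), "seconds")
```

In practice Parquet's columnar layout and compression mean a query often reads far less than the full dataset, so this is best read as a ceiling on raw scan speed rather than a prediction of query time.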

SparkSQL + Parquet performance

2015-04-06 Thread Paolo Platter
Hi all, has anyone using SparkSQL + Parquet benchmarked storing Parquet files on HDFS versus CFS (Cassandra File System)? Which storage gives better SparkSQL + Parquet performance? Thanks, Paolo