That depends entirely on your disk I/O and the number of CPU cores you have in the cluster. For example, if each machine gives you disk I/O of about 100 MB/s and you have a modest number of cores (say 40 cores across 10 machines), then the aggregate read throughput could reach roughly 1 GB/s, I believe.
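To make the arithmetic behind that estimate explicit, here is a rough back-of-the-envelope sketch. It assumes the 100 MB/s figure is per machine and that the scan is limited by whichever is lower, total disk bandwidth or total CPU decode capacity. The function name and the per-core rate of 50 MB/s are hypothetical values chosen for illustration, not measurements.

```python
def estimate_scan_throughput_mb_s(disk_mb_s_per_machine, machines, cores, per_core_mb_s):
    """Rough upper bound on aggregate scan throughput, in MB/s.

    Assumes the scan is limited by the smaller of total disk bandwidth
    and total CPU processing capacity. per_core_mb_s is a hypothetical
    figure for how fast one core can decode data.
    """
    disk_bound = disk_mb_s_per_machine * machines  # total disk bandwidth
    cpu_bound = cores * per_core_mb_s              # total CPU capacity
    return min(disk_bound, cpu_bound)


# 10 machines at 100 MB/s each, 40 cores, assumed 50 MB/s per core
estimate = estimate_scan_throughput_mb_s(100, 10, 40, 50)
print(estimate)  # 1000 MB/s, i.e. about 1 GB/s
```

With these numbers the disks are the bottleneck (1000 MB/s against 2000 MB/s of assumed CPU capacity). With far fewer cores the CPU side would become the limit instead, so it is worth measuring both on your own cluster.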
Thanks
Best Regards

On Tue, Apr 7, 2015 at 2:48 AM, Paolo Platter <paolo.plat...@agilelab.it> wrote:

> Hi all,
>
> is there anyone using SparkSQL + Parquet that has made a benchmark
> about storing parquet files on HDFS or on CFS ( Cassandra File System )?
> What storage can improve performance of SparkSQL+ Parquet ?
>
> Thanks
>
> Paolo