Re: Spark Distribution of Small Dataset

2016-01-28 Thread Kevin Mellott
, Jan 28, 2016 at 4:41 AM, Philip Lee <philjj...@gmail.com> wrote:
> Hi,
>
> Simple question about Spark distribution of a small dataset.
>
> Let's say I have 8 machines with 48 cores and 48GB of RAM as a cluster.
> The dataset (ORC format, created by Hive) is small, about 1GB, but I co

Spark Distribution of Small Dataset

2016-01-28 Thread Philip Lee
Hi,

Simple question about Spark distribution of a small dataset.

Let's say I have 8 machines with 48 cores and 48GB of RAM as a cluster. The dataset (ORC format, created by Hive) is small, about 1GB, but I copied it to HDFS.

1) If spark-sql runs on the dataset distributed across HDFS on each machine, what happens