Hello guys

Q1: How does Spark determine the number of partitions when reading a Parquet 
file?

val df = sqlContext.parquetFile(path)

Is it in some way related to the number of Parquet row groups in my input?
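
For reference, this is how I'm inspecting the partition count on the resulting DataFrame (a quick check using the df from above):

// Number of partitions Spark created when reading the Parquet data
println(df.rdd.partitions.length)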

Q2: How can I reduce this number of partitions? Doing this:

df.rdd.coalesce(200).count

from the spark-shell causes job execution to hang… 
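
As a point of comparison, here is a minimal sketch of doing the repartitioning at the DataFrame level instead of on the underlying RDD (assuming DataFrame.repartition is available in this Spark version; I haven't verified whether it behaves differently):

// Shuffle the data into 200 partitions at the DataFrame level, then force evaluation
val reduced = df.repartition(200)
reduced.count()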

Any ideas? Thank you in advance. 

Eric