Hi,

My Hadoop cluster is configured with a replication factor of 2. I've added 
$HADOOP_HOME/config to the PATH as suggested in 
http://apache-spark-user-list.1001560.n3.nabble.com/hdfs-replication-on-saving-RDD-td289.html.
With Spark 1.4, rdd.saveAsTextFile writes its output with replication = 2. However, 
DataFrame.saveAsParquet writes its output with replication = 3. How can I force a Spark 
DataFrame to save Parquet files with a replication factor other than the default 
of 3?
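
For context, a cluster-wide replication factor of 2 is normally set through the
dfs.replication property. A minimal sketch of such an entry, assuming the
standard property name and that it lives in hdfs-site.xml (the actual file on
my cluster may contain other settings as well):

```xml
<!-- hdfs-site.xml (sketch): default block replication for newly created files -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
```

This is a client-side default, so a writer only uses it if it actually loads
this configuration when creating files.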

Best regards, Alexander
