RE: SparkSQL: How to specify replication factor on the persisted parquet files?

2015-06-09 Thread Haopu Wang
Subject: Re: SparkSQL: How to specify replication factor on the persisted parquet files? Then one possible workaround is to set dfs.replication in sc.hadoopConfiguration. However, this configuration is shared by all Spark jobs issued within the same application. Since different Spark jobs can

Re: SparkSQL: How to specify replication factor on the persisted parquet files?

2015-06-09 Thread ayan guha
Wang; user Subject: Re: SparkSQL: How to specify replication factor on the persisted parquet files? Then one possible workaround is to set dfs.replication in sc.hadoopConfiguration. However, this configuration is shared by all Spark jobs issued within the same application. Since different

RE: SparkSQL: How to specify replication factor on the persisted parquet files?

2015-06-08 Thread Haopu Wang
: Cheng Lian [mailto:lian.cs@gmail.com] Sent: Sunday, June 07, 2015 10:17 PM To: Haopu Wang; user Subject: Re: SparkSQL: How to specify replication factor on the persisted parquet files? Were you using HiveContext.setConf()? dfs.replication is a Hadoop configuration, but setConf() is only

Re: SparkSQL: How to specify replication factor on the persisted parquet files?

2015-06-08 Thread Cheng Lian
files. -Original Message- From: Cheng Lian [mailto:lian.cs@gmail.com] Sent: Sunday, June 07, 2015 10:17 PM To: Haopu Wang; user Subject: Re: SparkSQL: How to specify replication factor on the persisted parquet files? Were you using HiveContext.setConf()? dfs.replication is a Hadoop

Re: SparkSQL: How to specify replication factor on the persisted parquet files?

2015-06-07 Thread Cheng Lian
Were you using HiveContext.setConf()? dfs.replication is a Hadoop configuration, but setConf() is only used to set Spark SQL specific configurations. You may set it in your Hadoop core-site.xml. Cheng On 6/2/15 2:28 PM, Haopu Wang wrote: Hi, I'm trying to save SparkSQL DataFrame
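
The workaround discussed in the thread (setting dfs.replication on the shared Hadoop configuration) can be sketched as a set-then-restore helper, so that one write does not leave the changed value in place for later jobs. This is only a rough sketch under stated assumptions: `FakeHadoopConf` and `replication_factor` are hypothetical names made up for illustration, and the stand-in object only mimics the `get`/`set`/`unset` methods of a Hadoop `Configuration`. In PySpark the real object is usually reached through `sc._jsc.hadoopConfiguration()`, which is a private attribute and not part of the thread.

```python
from contextlib import contextmanager


class FakeHadoopConf:
    """Hypothetical stand-in for a Hadoop Configuration (illustration only)."""

    def __init__(self):
        self._props = {}

    def get(self, key):
        # Returns None when the key is not set.
        return self._props.get(key)

    def set(self, key, value):
        self._props[key] = value

    def unset(self, key):
        self._props.pop(key, None)


@contextmanager
def replication_factor(conf, factor):
    """Temporarily set dfs.replication on conf; restore the old value on exit."""
    key = "dfs.replication"
    previous = conf.get(key)
    conf.set(key, str(factor))
    try:
        yield conf
    finally:
        if previous is None:
            conf.unset(key)  # the key was not set before, so remove it again
        else:
            conf.set(key, previous)


conf = FakeHadoopConf()
conf.set("dfs.replication", "3")

with replication_factor(conf, 1):
    # A write issued here would see the temporary value.
    inside = conf.get("dfs.replication")

after = conf.get("dfs.replication")
```

Because the configuration is shared by all jobs in the application, a helper like this does not isolate jobs that run concurrently: any job that reads the configuration while the block is active sees the temporary value too.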