Cheng, thanks for the response. Yes, I was using HiveContext.setConf() to set "dfs.replication". However, I cannot change the value in the Hadoop core-site.xml because that would change the default for every HDFS file. I only want to change the replication factor of some specific files.
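One thing that might work here: set "dfs.replication" on the SparkContext's Hadoop configuration instead of through HiveContext.setConf(). That configuration is the one Spark hands to the HDFS output writers, so it should apply only to files written by this application and leave the rest of HDFS alone. A minimal, unverified sketch (the table names are made up for illustration):

  import org.apache.spark.{SparkConf, SparkContext}
  import org.apache.spark.sql.hive.HiveContext

  val sc = new SparkContext(new SparkConf().setAppName("replication-demo"))
  val sqlContext = new HiveContext(sc)

  // "dfs.replication" set on the SparkContext's Hadoop configuration affects
  // only files written by this application, not existing HDFS files.
  sc.hadoopConfiguration.setInt("dfs.replication", 2)

  val df = sqlContext.table("source_table")            // hypothetical source table
  df.write.format("parquet").saveAsTable("my_table")   // hypothetical target table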
-----Original Message-----
From: Cheng Lian [mailto:lian.cs....@gmail.com]
Sent: Sunday, June 07, 2015 10:17 PM
To: Haopu Wang; user
Subject: Re: SparkSQL: How to specify replication factor on the persisted parquet files?

Were you using HiveContext.setConf()? "dfs.replication" is a Hadoop configuration, but setConf() is only used to set Spark SQL specific configurations. You may set it in your Hadoop core-site.xml instead.

Cheng

On 6/2/15 2:28 PM, Haopu Wang wrote:
> Hi,
>
> I'm trying to save a SparkSQL DataFrame to a persistent Hive table using
> the default parquet data source.
>
> I don't know how to change the replication factor of the generated
> parquet files on HDFS.
>
> I tried to set "dfs.replication" on HiveContext but that didn't work.
> Any suggestions are appreciated very much!
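For parquet files that are already on HDFS, the replication factor can also be changed after the fact through the Hadoop FileSystem API, or from the command line with "hdfs dfs -setrep -R 2 <path>". An untested sketch, assuming a hypothetical warehouse path:

  import org.apache.hadoop.fs.{FileSystem, Path}

  val fs = FileSystem.get(sc.hadoopConfiguration)
  val tableDir = new Path("/user/hive/warehouse/my_table")   // hypothetical path
  // Set replication = 2 on each file directly under the table directory.
  fs.listStatus(tableDir).filter(_.isFile).foreach { status =>
    fs.setReplication(status.getPath, 2.toShort)
  }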