HAO REN created SPARK-6675:
------------------------------

Summary: HiveContext setConf seems not stable
Key: SPARK-6675
URL: https://issues.apache.org/jira/browse/SPARK-6675
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 1.3.0
Environment: AWS ec2 xlarge2 cluster launched by spark's script
Reporter: HAO REN

I find that HiveContext.setConf does not work reliably. The following code snippets show the problem.

snippet 1:
----------------------------------------------------------------
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.{SparkConf, SparkContext}

object Main extends App {

  val conf = new SparkConf()
    .setAppName("context-test")
    .setMaster("local[8]")
  val sc = new SparkContext(conf)
  val hc = new HiveContext(sc)

  // shuffle.partitions is set first, warehouse.dir second
  hc.setConf("spark.sql.shuffle.partitions", "10")
  hc.setConf("hive.metastore.warehouse.dir", "/home/spark/hive/warehouse_test")

  // print the effective value of each key
  hc.getAllConfs.filter(_._1.contains("warehouse.dir")).foreach(println)
  hc.getAllConfs.filter(_._1.contains("shuffle.partitions")).foreach(println)
}
----------------------------------------------------------------

Results:
(hive.metastore.warehouse.dir,/home/spark/hive/warehouse_test)
(spark.sql.shuffle.partitions,10)

snippet 2:
----------------------------------------------------------------
...
  // same calls as snippet 1, in the opposite order
  hc.setConf("hive.metastore.warehouse.dir", "/home/spark/hive/warehouse_test")
  hc.setConf("spark.sql.shuffle.partitions", "10")

  hc.getAllConfs.filter(_._1.contains("warehouse.dir")).foreach(println)
  hc.getAllConfs.filter(_._1.contains("shuffle.partitions")).foreach(println)
...
----------------------------------------------------------------

Results:
(hive.metastore.warehouse.dir,/user/hive/warehouse)
(spark.sql.shuffle.partitions,10)

The only difference between the two snippets is the order of the two setConf calls, yet they lead to two different Hive configurations: in snippet 2 the warehouse directory keeps the value /user/hive/warehouse instead of the value that was set.

It seems that HiveContext cannot set a new value for the "hive.metastore.warehouse.dir" key in a single setConf call, or when that call is the first one. Another setConf call is needed before changing "hive.metastore.warehouse.dir".

For example, setting "hive.metastore.warehouse.dir" twice gives the same result as snippet 1.

snippet 3:
----------------------------------------------------------------
...
  // the same key is set twice; the second call takes effect
  hc.setConf("hive.metastore.warehouse.dir", "/home/spark/hive/warehouse_test")
  hc.setConf("hive.metastore.warehouse.dir", "/home/spark/hive/warehouse_test")

  hc.getAllConfs.filter(_._1.contains("warehouse.dir")).foreach(println)
...
----------------------------------------------------------------

Results:
(hive.metastore.warehouse.dir,/home/spark/hive/warehouse_test)

You can reproduce this on the latest branch-1.3 (1.3.1-snapshot, htag = 7d029cb1eb6f1df1bce1a3f5784fb7ce2f981a33).

I have also tested the released 1.3.0 (htag = 4aaf48d46d13129f0f9bdafd771dd80fe568a7dc). It has the same problem.
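
Until the underlying problem is fixed, the observation from snippet 3 (repeating the call makes the value stick) could be wrapped in a small set-and-verify helper. The following is only a sketch of a possible workaround, not a fix: `ConfCheck`, `setChecked` and `maxAttempts` are hypothetical names that are not part of Spark, and the helper assumes that reading the key back through `getAllConfs` reflects the effective value, as in the snippets of this report.

```scala
object ConfCheck {

  /** Sets `key` to `value`, reads it back, and repeats the call
    * (up to `maxAttempts` times) until the read-back value matches.
    *
    * `set` and `readAll` are passed in as functions, so the helper itself
    * has no Spark dependency. Hypothetical helper, not a Spark API.
    *
    * @return true if the value was read back as expected, false otherwise
    */
  def setChecked(set: (String, String) => Unit,
                 readAll: () => Map[String, String])
                (key: String, value: String, maxAttempts: Int = 2): Boolean =
    // exists stops at the first attempt whose read-back matches
    (1 to maxAttempts).exists { _ =>
      set(key, value)
      readAll().get(key) == Some(value)
    }
}
```

With the `hc` from the snippets of this report, a call might look like this:

```scala
val applied = ConfCheck.setChecked(
  (k, v) => hc.setConf(k, v),   // explicit lambda, since setConf is overloaded
  () => hc.getAllConfs
)("hive.metastore.warehouse.dir", "/home/spark/hive/warehouse_test")

if (!applied) println("warehouse.dir was not applied after retrying")
```

The boolean result makes a silently ignored setting visible instead of leaving it to be discovered later through a table created in the wrong directory.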