[ https://issues.apache.org/jira/browse/SPARK-6675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592845#comment-14592845 ]
Yin Huai commented on SPARK-6675:
---------------------------------

[~invkrh] Can you try 1.4 or master and see if this problem still exists?

> HiveContext setConf is not stable
> ---------------------------------
>
>                 Key: SPARK-6675
>                 URL: https://issues.apache.org/jira/browse/SPARK-6675
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.3.0
>         Environment: AWS ec2 xlarge2 cluster launched by spark's script
>            Reporter: Hao Ren
>            Priority: Critical
>
> I find that HiveContext.setConf does not work correctly. Here are some code
> snippets showing the problem:
>
> snippet 1:
> {code}
> import org.apache.spark.sql.hive.HiveContext
> import org.apache.spark.{SparkConf, SparkContext}
>
> object Main extends App {
>   val conf = new SparkConf()
>     .setAppName("context-test")
>     .setMaster("local[8]")
>   val sc = new SparkContext(conf)
>   val hc = new HiveContext(sc)
>
>   hc.setConf("spark.sql.shuffle.partitions", "10")
>   hc.setConf("hive.metastore.warehouse.dir", "/home/spark/hive/warehouse_test")
>   hc.getAllConfs filter(_._1.contains("warehouse.dir")) foreach println
>   hc.getAllConfs filter(_._1.contains("shuffle.partitions")) foreach println
> }
> {code}
> Results:
> (hive.metastore.warehouse.dir,/home/spark/hive/warehouse_test)
> (spark.sql.shuffle.partitions,10)
>
> snippet 2:
> {code}
> ...
>   hc.setConf("hive.metastore.warehouse.dir", "/home/spark/hive/warehouse_test")
>   hc.setConf("spark.sql.shuffle.partitions", "10")
>   hc.getAllConfs filter(_._1.contains("warehouse.dir")) foreach println
>   hc.getAllConfs filter(_._1.contains("shuffle.partitions")) foreach println
> ...
> {code}
> Results:
> (hive.metastore.warehouse.dir,/user/hive/warehouse)
> (spark.sql.shuffle.partitions,10)
>
> You can see that I just swapped the two setConf calls, and that leads to two
> different Hive configurations.
> It seems that HiveContext cannot set a new value for the
> "hive.metastore.warehouse.dir" key when that is the only or the first
> "setConf" call.
> You need another "setConf" call before changing
> "hive.metastore.warehouse.dir". For example, setting
> "hive.metastore.warehouse.dir" twice works (snippet 3), as does snippet 1.
>
> snippet 3:
> {code}
> ...
>   hc.setConf("hive.metastore.warehouse.dir", "/home/spark/hive/warehouse_test")
>   hc.setConf("hive.metastore.warehouse.dir", "/home/spark/hive/warehouse_test")
>   hc.getAllConfs filter(_._1.contains("warehouse.dir")) foreach println
> ...
> {code}
> Results:
> (hive.metastore.warehouse.dir,/home/spark/hive/warehouse_test)
>
> You can reproduce this on the latest branch-1.3 (1.3.1-snapshot,
> htag = 7d029cb1eb6f1df1bce1a3f5784fb7ce2f981a33).
> I have also tested the released 1.3.0 (htag =
> 4aaf48d46d13129f0f9bdafd771dd80fe568a7dc). It has the same problem.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)