Hi, I am seeing some strange behaviour when using textFile to read data from HDFS in Spark 0.9.1. I get UnknownHostException errors where the Hadoop client tries to resolve the value of dfs.nameservices as a hostname and fails.
So far:
- this has been tested inside the shell
- the exact same code works with Spark 0.8.1
- the shell is launched with HADOOP_CONF_DIR pointing to our HA conf
- if some other RDD is created from HDFS first and succeeds, then this one works as well (might be related to the way the default Hadoop configuration is being shared?)
- it works when using the new MR API: sc.newAPIHadoopFile(path, classOf[TextInputFormat], classOf[LongWritable], classOf[Text], sc.hadoopConfiguration).map(_._2.toString) (spelled out in the PS below)

Hadoop distribution: 2.0.0-cdh4.1.2
Spark 0.9.1, packaged with the correct version of Hadoop

Eugen
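
PS: for reference, here is roughly what I run in the shell; the "mycluster" nameservice and the path are just placeholders for our actual values:

import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat

// fails with UnknownHostException: the HA nameservice name is treated as a hostname
val broken = sc.textFile("hdfs://mycluster/some/path")

// works: this overload takes sc.hadoopConfiguration explicitly,
// so the HA settings loaded from HADOOP_CONF_DIR are applied
val lines = sc.newAPIHadoopFile(
  "hdfs://mycluster/some/path",
  classOf[TextInputFormat],
  classOf[LongWritable],
  classOf[Text],
  sc.hadoopConfiguration
).map(_._2.toString)

lines.take(5).foreach(println)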