https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html You don't have to rely on a single NN. You can specify a kind of "NN HA alias", and the underlying HDFS client will connect to whichever NN is currently active. Thanks for pointing out HADOOP_CONF_DIR — it seems like the thing I need.
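For reference, a minimal sketch of what that "NN HA alias" (a logical nameservice) looks like in `hdfs-site.xml`, following the QJM HA doc linked above. The nameservice name `mycluster` and the hostnames are illustrative:

```
<!-- hdfs-site.xml: logical nameservice "mycluster" (name is illustrative) -->
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>namenode1.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>namenode2.example.com:8020</value>
</property>
<!-- tells the HDFS client how to find the currently active NN -->
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
```

Clients then address the cluster as `hdfs://mycluster/path` and fail over between NameNodes automatically.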
2017-03-26 14:31 GMT+02:00 Jianfeng (Jeff) Zhang <jzh...@hortonworks.com>:
>
> What do you mean by non-reliable? If you want to read/write 2 Hadoop clusters
> in one program, I am afraid this is the only way. It is impossible to
> specify multiple HADOOP_CONF_DIRs under one JVM classpath. Only one
> default configuration will be used.
>
>
> Best Regard,
> Jeff Zhang
>
>
> From: Serega Sheypak <serega.shey...@gmail.com>
> Reply-To: "users@zeppelin.apache.org" <users@zeppelin.apache.org>
> Date: Sunday, March 26, 2017 at 7:47 PM
> To: "users@zeppelin.apache.org" <users@zeppelin.apache.org>
> Subject: Re: Setting Zeppelin to work with multiple Hadoop clusters when
> running Spark.
>
> I know it, thanks, but it's not a reliable solution.
>
> 2017-03-26 5:23 GMT+02:00 Jianfeng (Jeff) Zhang <jzh...@hortonworks.com>:
>
>>
>> You can try to specify the namenode address for the hdfs file, e.g.
>>
>> spark.read.csv("hdfs://localhost:9009/file")
>>
>> Best Regard,
>> Jeff Zhang
>>
>>
>> From: Serega Sheypak <serega.shey...@gmail.com>
>> Reply-To: "users@zeppelin.apache.org" <users@zeppelin.apache.org>
>> Date: Sunday, March 26, 2017 at 2:47 AM
>> To: "users@zeppelin.apache.org" <users@zeppelin.apache.org>
>> Subject: Setting Zeppelin to work with multiple Hadoop clusters when
>> running Spark.
>>
>> Hi, I have three Hadoop clusters. Each cluster has its own NN HA
>> configured, and YARN.
>> I want to allow users to read from any cluster and write to any cluster.
>> The user should also be able to choose where to run the Spark job.
>> What is the right way to configure it in Zeppelin?
>>
>>
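Tying the two suggestions together: one way to read/write several HA clusters from a single JVM is to declare all the nameservices in the one client configuration that HADOOP_CONF_DIR points at, then use fully-qualified URIs like `hdfs://clusterA/data/file`. A sketch, assuming two illustrative nameservices `clusterA` and `clusterB` (the per-nameservice properties shown for `clusterA` must be repeated for each cluster):

```
<!-- hdfs-site.xml: declare every remote cluster's nameservice in one config -->
<property>
  <name>dfs.nameservices</name>
  <value>clusterA,clusterB</value>
</property>
<property>
  <name>dfs.ha.namenodes.clusterA</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.clusterA.nn1</name>
  <value>a-namenode1.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.clusterA.nn2</name>
  <value>a-namenode2.example.com:8020</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.clusterA</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<!-- ...same block again for clusterB... -->
```

With this in place, a Spark job can mix clusters by URI, e.g. read `hdfs://clusterA/...` and write `hdfs://clusterB/...`, without hardcoding a single NameNode host as in the earlier `hdfs://localhost:9009/file` example.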