I am already using HdfsBolt successfully (without HDFS HA). So I assume the client Java class is already in my classpath, since it comes with the Hadoop jar files that I load when I run my topology, unless there is a jar specific to the Hadoop HA classes that would contain the dfs.client.failover.proxy.provider class. In other words, I shouldn't need to take any specific action aside from configuring my hdfs-site.xml.
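For concreteness, here is a minimal sketch of the client-side HA settings I have in mind for hdfs-site.xml (the nameservice ID "mycluster" and the namenode hostnames are placeholders, not values from my actual cluster):

    <!-- Placeholder nameservice ID; use your cluster's logical name. -->
    <property>
      <name>dfs.nameservices</name>
      <value>mycluster</value>
    </property>
    <property>
      <name>dfs.ha.namenodes.mycluster</name>
      <value>nn1,nn2</value>
    </property>
    <!-- Placeholder hostnames; point these at your two namenodes. -->
    <property>
      <name>dfs.namenode.rpc-address.mycluster.nn1</name>
      <value>namenode1.example.com:8020</value>
    </property>
    <property>
      <name>dfs.namenode.rpc-address.mycluster.nn2</name>
      <value>namenode2.example.com:8020</value>
    </property>
    <property>
      <name>dfs.client.failover.proxy.provider.mycluster</name>
      <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>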
thanks,
Clay

On Thu, Feb 19, 2015 at 11:14 AM, Harsha <st...@harsha.io> wrote:
>
> Clay,
> When you are using the storm-hdfs connector you need to package
> core-site.xml and hdfs-site.xml from your cluster into your topology
> jar. You can configure the storm-hdfs bolt to pass the nameservice ID:
>
> HdfsBolt bolt = new HdfsBolt()
>         .withFsUrl("hdfs://myNameserviceID")
>         .withFileNameFormat(fileNameFormat)
>         .withRecordFormat(format)
>         .withRotationPolicy(rotationPolicy)
>         .withSyncPolicy(syncPolicy);
>
> The above is all that is needed to use namenode HA with storm-hdfs.
>
> -Harsha
>
> On Thu, Feb 19, 2015, at 08:58 AM, Bobby Evans wrote:
> > Hadoop has lots of different configurations in core-site.xml,
> > hdfs-site.xml, ... all of which eventually get loaded into the
> > Configuration object used to create a FileSystem instance. There are
> > so many different configurations related to security, HA, etc. that
> > it is almost impossible for me to guess exactly which ones you need
> > to have set correctly to make this work. Typically what we do for
> > Storm to be able to talk to HDFS is to package the complete set of
> > configs that appear on a Hadoop gateway with the topology jar when it
> > is shipped. This guarantees that the config is the same as on the
> > gateway and should behave the same way. You can also grab them from
> > the name node or any of the Hadoop compute nodes.
>
> This will work for the HdfsBolt, which loads default configurations
> from the classpath before overriding them with any custom
> configurations you set for that bolt.
>
> > - Bobby
> >
> > On Thursday, February 19, 2015 10:42 AM, clay teahouse
> > <clayteaho...@gmail.com> wrote:
> >
> > Bobby, what do you mean by client here? In this context, do you
> > consider HdfsBolt a client? If yes, then which configuration are you
> > referring to? I've seen the following, but I am not sure if I follow.
> >
> > - dfs.client.failover.proxy.provider.[nameservice ID] - the Java
> >   class that HDFS clients use to contact the Active NameNode.
> >   Configure the name of the Java class which will be used by the DFS
> >   Client to determine which NameNode is the current Active, and
> >   therefore which NameNode is currently serving client requests. The
> >   only implementation which currently ships with Hadoop is the
> >   ConfiguredFailoverProxyProvider, so use this unless you are using a
> >   custom one. For example:
> >
> >   <property>
> >     <name>dfs.client.failover.proxy.provider.mycluster</name>
> >     <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
> >   </property>
> >
> > thanks,
> > Clay
> >
> > On Thu, Feb 19, 2015 at 8:38 AM, Bobby Evans
> > <ev...@yahoo-inc.com.invalid> wrote:
> >
> > HDFS HA provides fail-over for the name node, and the client
> > determines which name node is the active one, but this should be
> > completely transparent to you if the client is configured correctly.
> > - Bobby
> >
> > On Thursday, February 19, 2015 6:47 AM, clay teahouse
> > <clayteaho...@gmail.com> wrote:
> >
> > Hi All,
> > Has anyone used HdfsBolt with hdfs in HA mode? How would you
> > determine which hdfs node is the active node?
> >
> > thanks
> > Clay
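Putting the thread together, a complete bolt definition along the lines Harsha describes might look like the following sketch (the path, delimiter, and rotation/sync thresholds are arbitrary illustration values, and "myNameserviceID" must match the nameservice defined in the hdfs-site.xml packaged with the topology jar):

    import org.apache.storm.hdfs.bolt.HdfsBolt;
    import org.apache.storm.hdfs.bolt.format.DefaultFileNameFormat;
    import org.apache.storm.hdfs.bolt.format.DelimitedRecordFormat;
    import org.apache.storm.hdfs.bolt.format.FileNameFormat;
    import org.apache.storm.hdfs.bolt.format.RecordFormat;
    import org.apache.storm.hdfs.bolt.rotation.FileRotationPolicy;
    import org.apache.storm.hdfs.bolt.rotation.FileSizeRotationPolicy;
    import org.apache.storm.hdfs.bolt.rotation.FileSizeRotationPolicy.Units;
    import org.apache.storm.hdfs.bolt.sync.CountSyncPolicy;
    import org.apache.storm.hdfs.bolt.sync.SyncPolicy;

    // Sync the filesystem after every 1000 tuples (illustrative value).
    SyncPolicy syncPolicy = new CountSyncPolicy(1000);

    // Rotate output files once they reach 5 MB (illustrative value).
    FileRotationPolicy rotationPolicy = new FileSizeRotationPolicy(5.0f, Units.MB);

    // Write pipe-delimited records under /storm/ (placeholder path).
    FileNameFormat fileNameFormat = new DefaultFileNameFormat().withPath("/storm/");
    RecordFormat format = new DelimitedRecordFormat().withFieldDelimiter("|");

    // Point the bolt at the logical nameservice, not at a single
    // namenode host; the HA settings in the packaged hdfs-site.xml tell
    // the DFS client how to find the active namenode.
    HdfsBolt bolt = new HdfsBolt()
            .withFsUrl("hdfs://myNameserviceID")
            .withFileNameFormat(fileNameFormat)
            .withRecordFormat(format)
            .withRotationPolicy(rotationPolicy)
            .withSyncPolicy(syncPolicy);

Since the topology jar carries the cluster's core-site.xml and hdfs-site.xml on its classpath, the DFS client handles the active/standby lookup and failover itself; nothing in the bolt needs to know which namenode is currently active.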