I am already using HdfsBolt successfully (without HDFS HA). So I assume the client Java class is already in my classpath, since it comes with the Hadoop jar files that I load when I run my topology, unless there is a jar specific to the Hadoop HA classes that would contain the dfs.client.failover.proxy.provider class. In other words, I shouldn't need to take any specific action aside from configuring my hdfs-site.xml.
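For concreteness, here is a minimal sketch of the client-side HA settings I have in mind for hdfs-site.xml (the nameservice ID "mycluster" and the namenode hostnames are placeholders, not values from my actual cluster):

    <!-- Placeholder nameservice ID; use your cluster's logical name. -->
    <property>
      <name>dfs.nameservices</name>
      <value>mycluster</value>
    </property>
    <property>
      <name>dfs.ha.namenodes.mycluster</name>
      <value>nn1,nn2</value>
    </property>
    <!-- Placeholder hostnames; point these at your two namenodes. -->
    <property>
      <name>dfs.namenode.rpc-address.mycluster.nn1</name>
      <value>namenode1.example.com:8020</value>
    </property>
    <property>
      <name>dfs.namenode.rpc-address.mycluster.nn2</name>
      <value>namenode2.example.com:8020</value>
    </property>
    <property>
      <name>dfs.client.failover.proxy.provider.mycluster</name>
      <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>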
thanks,
Clay

On Thu, Feb 19, 2015 at 11:14 AM, Harsha <st...@harsha.io> wrote:
>
> Clay,
> When you are using the storm-hdfs connector you need to package
> core-site.xml and hdfs-site.xml from your cluster into your topology
> jar. You can configure the storm-hdfs bolt to pass the nameservice ID:
>
> HdfsBolt bolt = new HdfsBolt()
>         .withFsUrl("hdfs://myNameserviceID")
>         .withFileNameFormat(fileNameFormat)
>         .withRecordFormat(format)
>         .withRotationPolicy(rotationPolicy)
>         .withSyncPolicy(syncPolicy);
>
> The above is all that is needed to use namenode HA with storm-hdfs.
>
> -Harsha
>
> On Thu, Feb 19, 2015, at 08:58 AM, Bobby Evans wrote:
> > Hadoop has lots of different configurations in core-site.xml,
> > hdfs-site.xml, ... all of which eventually get loaded into the
> > Configuration object used to create a FileSystem instance. There are
> > so many different configurations related to security, HA, etc. that
> > it is almost impossible for me to guess exactly which ones you need
> > to have set correctly to make this work. Typically what we do for
> > Storm to be able to talk to HDFS is to package the complete set of
> > configs that appear on a Hadoop gateway with the topology jar when it
> > is shipped. This guarantees that the config is the same as on the
> > gateway and should behave the same way. You can also grab them from
> > the name node or any of the Hadoop compute nodes.
>
> This will work for the HdfsBolt, which loads default configurations
> from the classpath before overriding them with any custom
> configurations you set for that bolt.
>
> > - Bobby
> >
> > On Thursday, February 19, 2015 10:42 AM, clay teahouse
> > <clayteaho...@gmail.com> wrote:
> >
> > Bobby, what do you mean by client here? In this context, do you
> > consider HdfsBolt a client? If yes, then which configuration are you
> > referring to? I've seen the following, but I am not sure if I follow.
> >
> > - dfs.client.failover.proxy.provider.[nameservice ID] - the Java
> >   class that HDFS clients use to contact the Active NameNode.
> >   Configure the name of the Java class which will be used by the DFS
> >   Client to determine which NameNode is the current Active, and
> >   therefore which NameNode is currently serving client requests. The
> >   only implementation which currently ships with Hadoop is the
> >   ConfiguredFailoverProxyProvider, so use this unless you are using a
> >   custom one. For example:
> >
> >   <property>
> >     <name>dfs.client.failover.proxy.provider.mycluster</name>
> >     <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
> >   </property>
> >
> > thanks,
> > Clay
> >
> > On Thu, Feb 19, 2015 at 8:38 AM, Bobby Evans
> > <ev...@yahoo-inc.com.invalid> wrote:
> >
> > HDFS HA provides fail-over for the name node, and the client
> > determines which name node is the active one, but this should be
> > completely transparent to you if the client is configured correctly.
> > - Bobby
> >
> > On Thursday, February 19, 2015 6:47 AM, clay teahouse
> > <clayteaho...@gmail.com> wrote:
> >
> > Hi All,
> > Has anyone used HdfsBolt with hdfs in HA mode? How would you
> > determine which hdfs node is the active node?
> >
> > thanks
> > Clay
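Putting the thread together, a complete bolt definition along the lines Harsha describes might look like the following sketch (the path, delimiter, and rotation/sync thresholds are arbitrary illustration values, and "myNameserviceID" must match the nameservice defined in the hdfs-site.xml packaged with the topology jar):

    import org.apache.storm.hdfs.bolt.HdfsBolt;
    import org.apache.storm.hdfs.bolt.format.DefaultFileNameFormat;
    import org.apache.storm.hdfs.bolt.format.DelimitedRecordFormat;
    import org.apache.storm.hdfs.bolt.format.FileNameFormat;
    import org.apache.storm.hdfs.bolt.format.RecordFormat;
    import org.apache.storm.hdfs.bolt.rotation.FileRotationPolicy;
    import org.apache.storm.hdfs.bolt.rotation.FileSizeRotationPolicy;
    import org.apache.storm.hdfs.bolt.rotation.FileSizeRotationPolicy.Units;
    import org.apache.storm.hdfs.bolt.sync.CountSyncPolicy;
    import org.apache.storm.hdfs.bolt.sync.SyncPolicy;

    // Sync the filesystem after every 1000 tuples (illustrative value).
    SyncPolicy syncPolicy = new CountSyncPolicy(1000);

    // Rotate output files once they reach 5 MB (illustrative value).
    FileRotationPolicy rotationPolicy = new FileSizeRotationPolicy(5.0f, Units.MB);

    // Write pipe-delimited records under /storm/ (placeholder path).
    FileNameFormat fileNameFormat = new DefaultFileNameFormat().withPath("/storm/");
    RecordFormat format = new DelimitedRecordFormat().withFieldDelimiter("|");

    // Point the bolt at the logical nameservice, not at a single
    // namenode host; the HA settings in the packaged hdfs-site.xml tell
    // the DFS client how to find the active namenode.
    HdfsBolt bolt = new HdfsBolt()
            .withFsUrl("hdfs://myNameserviceID")
            .withFileNameFormat(fileNameFormat)
            .withRecordFormat(format)
            .withRotationPolicy(rotationPolicy)
            .withSyncPolicy(syncPolicy);

Since the topology jar carries the cluster's core-site.xml and hdfs-site.xml on its classpath, the DFS client handles the active/standby lookup and failover itself; nothing in the bolt needs to know which namenode is currently active.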