Re: Configuring Multiple Data Nodes on Pseudo-distributed mode ?

Harsh J Fri, 10 Jun 2011 22:49:24 -0700

Kumar,

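Your config looks fine. I believe that post describes the 0.21/trunk
scripts. On a 0.20.x-based release such as CDH3, you can simply use
hadoop-daemon.sh instead. The "datanode running as process 5981. Stop
it first." error comes from the script's PID file check, not from a
port conflict, so you just have to move the PID file out of the way
before starting each additional DN.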


Here's how I do it on my Mac to start 3 DNs:

$ ls conf*
conf conf.1 conf.2
$ hadoop-daemon.sh start datanode # Default
$ rm pids/hadoop-harsh-datanode.pid
$ hadoop-daemon.sh --config conf.1 start datanode # conf.1 DN
$ rm pids/hadoop-harsh-datanode.pid
$ hadoop-daemon.sh --config conf.2 start datanode # conf.2 DN

To kill a particular DN, use jps or ps to find the one you want, then
kill the PID displayed.
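
If removing the PID file by hand feels fragile, a variant that should
also work is to give each DN its own PID directory. This is only a
sketch: it assumes the same conf, conf.1 and conf.2 directories as
above, the pids.* directory names and the HADOOP_DAEMON override are
made up for the example, and it will not help if the hadoop-env.sh in a
conf directory sets HADOOP_PID_DIR itself.

```shell
#!/bin/sh
# Sketch: start one DataNode per config directory, each with its own PID
# directory, so hadoop-daemon.sh never finds another DN's PID file.

# HADOOP_DAEMON is a hypothetical override hook for this example;
# it defaults to the real hadoop-daemon.sh on the PATH.
HADOOP_DAEMON="${HADOOP_DAEMON:-hadoop-daemon.sh}"

start_datanodes() {
    for conf_dir in "$@"; do
        # Illustrative naming: conf -> pids.conf, conf.1 -> pids.conf.1, ...
        pid_dir="pids.$(basename "$conf_dir")"
        mkdir -p "$pid_dir"
        # hadoop-daemon.sh writes its PID file under $HADOOP_PID_DIR.
        HADOOP_PID_DIR="$pid_dir" "$HADOOP_DAEMON" --config "$conf_dir" start datanode
    done
}

# Usage: start_datanodes conf conf.1 conf.2
```

With the PID files kept apart like this, stopping one DN should be a
matter of running the same command with "stop" in place of "start" and
the matching HADOOP_PID_DIR and --config values.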

On Sat, Jun 11, 2011 at 5:34 AM, Kumar Kandasami
<kumaravel.kandas...@gmail.com> wrote:
> Thank you Harsh.
>
> I have been following the documentation in the mailing list, and have an
> issue starting the second data node (because of port conflict).
>
> - First, I don't see the bin/hdfs  in the directory ( I am using Mac m/c and
> installed hadoop using CDH3 tarball)
> - I am using the following command instead of the one mentioned in the step
> #3 in the mailing list.
>
> ./bin/hadoop-daemon.sh --config ../conf2 start datanode
>
> Error: datanode running as process 5981. Stop it first.
>
> - Port configuration in the hdfs-site.xml below.
>
> Data Node #1: Conf file
>
> <property>
>    <name>dfs.replication</name>
>    <value>1</value>
>  </property>
>  <property>
>     <name>dfs.permissions</name>
>     <value>false</value>
>  </property>
>
>  <property>
>     <!-- specify this so that running 'hadoop namenode -format' formats the
> right dir -->
>     <name>dfs.name.dir</name>
>     <value>/var/lib/hadoop-0.20/cache/hadoop/dfs/name</value>
>  </property>
>
>  <property>
>     <name>dfs.data.dir</name>
>     <value>/var/lib/hadoop-0.20/cache/hadoop/dfs/data</value>
>  </property>
>  <property>
>    <name>dfs.datanode.address</name>
>    <value>0.0.0.0:50010</value>
>  </property>
>
>  <property>
>    <name>dfs.datanode.ipc.address</name>
>    <value>0.0.0.0:50020</value>
>    <description>
>      The datanode ipc server address and port.
>      If the port is 0 then the server will start on a free port.
>    </description>
>  </property>
>
>  <property>
>    <name>dfs.datanode.http.address</name>
>    <value>0.0.0.0:50075</value>
>  </property>
>
>  <property>
>    <name>dfs.datanode.https.address</name>
>    <value>0.0.0.0:50475</value>
>  </property>
> </configuration>
>
> Data Node #2: Conf (2) file
>
> <property>
>    <name>dfs.replication</name>
>    <value>1</value>
>  </property>
>  <property>
>     <name>dfs.permissions</name>
>     <value>false</value>
>  </property>
>
>  <property>
>     <!-- specify this so that running 'hadoop namenode -format' formats the
> right dir -->
>     <name>dfs.name.dir</name>
>     <value>/var/lib/hadoop-0.20/cache/hadoop/dfs/name</value>
>  </property>
>
>  <property>
>     <name>dfs.data.dir</name>
>     <value>/var/lib/hadoop-0.20/cache/hadoop/dfs/data2</value>
>  </property>
>
>  <property>
>    <name>dfs.datanode.address</name>
>    <value>0.0.0.0:50012</value>
>  </property>
>
>  <property>
>    <name>dfs.datanode.ipc.address</name>
>    <value>0.0.0.0:50022</value>
>    <description>
>      The datanode ipc server address and port.
>      If the port is 0 then the server will start on a free port.
>    </description>
>  </property>
>
>  <property>
>    <name>dfs.datanode.http.address</name>
>    <value>0.0.0.0:50077</value>
>  </property>
>
>  <property>
>    <name>dfs.datanode.https.address</name>
>    <value>0.0.0.0:50477</value>
>  </property>
>
>
>
> Kumar    _/|\_
> www.saisk.com
> ku...@saisk.com
> "making a profound difference with knowledge and creativity..."
>
>
> On Fri, Jun 10, 2011 at 12:20 AM, Harsh J <ha...@cloudera.com> wrote:
>
>> Try using search-hadoop.com, it's pretty kick-ass.
>>
>> Here's what you're seeking (Matt's reply in particular):
>>
>> http://search-hadoop.com/m/sApJY1zWgQV/multiple+datanodes&subj=Multiple+DataNodes+on+a+single+machine
>>
>> On Fri, Jun 10, 2011 at 9:04 AM, Kumar Kandasami
>> <kumaravel.kandas...@gmail.com> wrote:
>> > Environment: Mac 10.6.x.  Hadoop version: hadoop-0.20.2-cdh3u0
>> >
>> > Is there any good reference/link that provides configuration of additional
>> > data-nodes on a single machine (in pseudo distributed mode).
>> >
>> >
>> > Thanks for the support.
>> >
>> >
>> > Kumar    _/|\_
>> > www.saisk.com
>> > ku...@saisk.com
>> > "making a profound difference with knowledge and creativity..."
>> >
>>
>>
>>
>> --
>> Harsh J
>>
>



-- 
Harsh J
