Re: Secondary Namenode on hadoop 0.20.205 ?

2011-12-26 Thread Harsh J
Yes, checkpoints are helpful when your original NN image goes corrupt (very,
very rare if you use two or more dfs.name.dir locations to be safe).

On 27-Dec-2011, at 12:33 PM, praveenesh kumar wrote:

> Cool.
> I just did that.
> So now I am seeing my fsimage file in SNN's hadoop.tmp.dir.
> So in case my NN goes down, I can take this image file from SNN, paste it
> at NN's dfs.name.dir/current/fsimage,
> and have the NN up based on the last snapshot that the SNN had, right?
> 
> Thanks,
> Praveenesh
> 
> On Tue, Dec 27, 2011 at 12:20 PM, Harsh J  wrote:
>> The link Uma passed already covered that question:
>> http://www.cloudera.com/blog/2009/02/multi-host-secondarynamenode-configuration/
>> [dfs.http.address in hdfs-site.xml pointing to NN_HOST:50070 should do.]
>> 
>> Also, if you are using the tarball start/stop scripts, putting in the
> hostname for SNN in the conf/masters list is sufficient to get it
> auto-started there.
>> 
>> On 27-Dec-2011, at 11:36 AM, praveenesh kumar wrote:
>> 
>>> Thanks. But my 1st question is still unanswered.
>>> I have 8 DN/TT machines and 1 NN machine.
>>> I want to set one of my DN/TT machines as SNN.
>>> How do I configure my conf/*.xml files to achieve this?
>>> 
>>> Thanks,
>>> Praveenesh
>>> 
>>> On Mon, Dec 26, 2011 at 8:44 PM, Harsh J  wrote:
 (Answering beyond Uma's reply)
 
> Can a DN also act as SNN, any pros and cons of having this
> configuration ?
 
 You can run SNN on a regular slave box if you can't dedicate a box; it
 shouldn't be an issue for small clusters -- do ensure its disk
 configuration is proper, and that it is allocated nearly the same heap as
 the NameNode.

 For large clusters, where the fsimage and periodic edits files are larger,
 it would be worth placing it on a separate box given the SNN's
 interactions.
 
 On 26-Dec-2011, at 7:53 PM, Uma Maheswara Rao G wrote:
 
> Hey Praveenesh,
> 
> You can also start the secondary namenode by just giving the option
> ./hadoop secondarynamenode
> 
> DN can not act as secondary namenode. The basic work of the secondary
> namenode is to do checkpointing and to get the edits in sync with the
> Namenode up to the last checkpointing period. DN is to store the real
> data blocks physically.
> You also need to configure the correct namenode http address for the
> secondaryNN, so that it can connect to the NN for checkpointing operations.
> 
> http://hadoop.apache.org/common/docs/current/hdfs_user_guide.html#Secondary+NameNode
> You can configure the secondary node's IP in the masters file; start-dfs.sh
> itself will start the SNN automatically, as it starts the DN and NN as well.
> 
> You can also see
> http://www.cloudera.com/blog/2009/02/multi-host-secondarynamenode-configuration/
> 
> Regards,
> Uma
> 
> From: praveenesh kumar [praveen...@gmail.com]
> Sent: Monday, December 26, 2011 5:05 PM
> To: common-user@hadoop.apache.org
> Subject: Secondary Namenode on hadoop 0.20.205 ?
> 
> Hey people,
> 
> How can we setup another machine in the cluster as Secondary Namenode
> in hadoop 0.20.205 ?
> Can a DN also act as SNN, any pros and cons of having this
> configuration ?
> 
> Thanks,
> Praveenesh
 
>> 



Re: Secondary Namenode on hadoop 0.20.205 ?

2011-12-26 Thread praveenesh kumar
Cool.
I just did that.
So now I am seeing my fsimage file in SNN's hadoop.tmp.dir.
So in case my NN goes down, I can take this image file from SNN, paste it
at NN's dfs.name.dir/current/fsimage,
and have the NN up based on the last snapshot that the SNN had, right?

Thanks,
Praveenesh

On Tue, Dec 27, 2011 at 12:20 PM, Harsh J  wrote:
> The link Uma passed already covered that question:
> http://www.cloudera.com/blog/2009/02/multi-host-secondarynamenode-configuration/
> [dfs.http.address in hdfs-site.xml pointing to NN_HOST:50070 should do.]
>
> Also, if you are using the tarball start/stop scripts, putting in the
hostname for SNN in the conf/masters list is sufficient to get it
auto-started there.
>
> On 27-Dec-2011, at 11:36 AM, praveenesh kumar wrote:
>
>> Thanks. But my 1st question is still unanswered.
>> I have 8 DN/TT machines and 1 NN machine.
>> I want to set one of my DN/TT machines as SNN.
>> How do I configure my conf/*.xml files to achieve this?
>>
>> Thanks,
>> Praveenesh
>>
>> On Mon, Dec 26, 2011 at 8:44 PM, Harsh J  wrote:
>>> (Answering beyond Uma's reply)
>>>
 Can a DN also act as SNN, any pros and cons of having this
configuration ?
>>>
>>> You can run SNN on a regular slave box if you can't dedicate a box; it
>>> shouldn't be an issue for small clusters -- do ensure its disk
>>> configuration is proper, and that it is allocated nearly the same heap
>>> as the NameNode.
>>>
>>> For large clusters, where the fsimage and periodic edits files are
>>> larger, it would be worth placing it on a separate box given the SNN's
>>> interactions.
>>>
>>> On 26-Dec-2011, at 7:53 PM, Uma Maheswara Rao G wrote:
>>>
 Hey Praveenesh,

 You can also start the secondary namenode by just giving the option
./hadoop secondarynamenode

 DN can not act as secondary namenode. The basic work of the secondary
namenode is to do checkpointing and to get the edits in sync with the
Namenode up to the last checkpointing period. DN is to store the real
data blocks physically.
 You also need to configure the correct namenode http address for the
secondaryNN, so that it can connect to the NN for checkpointing operations.

http://hadoop.apache.org/common/docs/current/hdfs_user_guide.html#Secondary+NameNode
 You can configure the secondary node's IP in the masters file; start-dfs.sh
itself will start the SNN automatically, as it starts the DN and NN as well.

 You can also see
http://www.cloudera.com/blog/2009/02/multi-host-secondarynamenode-configuration/

 Regards,
 Uma
 
 From: praveenesh kumar [praveen...@gmail.com]
 Sent: Monday, December 26, 2011 5:05 PM
 To: common-user@hadoop.apache.org
 Subject: Secondary Namenode on hadoop 0.20.205 ?

 Hey people,

 How can we setup another machine in the cluster as Secondary Namenode
 in hadoop 0.20.205 ?
 Can a DN also act as SNN, any pros and cons of having this
configuration ?

 Thanks,
 Praveenesh
>>>
>
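
A rough recovery sketch for the scenario above -- hostnames and paths are
illustrative, not from this thread, and the supported route is the
-importCheckpoint option described in the HDFS user guide:

  # Copy the SNN's checkpoint directory to the NN host; by default it
  # lives under ${hadoop.tmp.dir}/dfs/namesecondary on the SNN.
  scp -r snn-host:/tmp/hadoop/dfs/namesecondary /data/checkpoint

  # With an empty dfs.name.dir, and fs.checkpoint.dir pointing at the
  # copied directory, let the NN import and save the checkpoint:
  bin/hadoop namenode -importCheckpoint

Note that edits made after the last checkpoint are lost either way; this
restores the last snapshot, not the latest namespace state.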


Re: Secondary Namenode on hadoop 0.20.205 ?

2011-12-26 Thread Harsh J
The link Uma passed already covered that question: 
http://www.cloudera.com/blog/2009/02/multi-host-secondarynamenode-configuration/
 [dfs.http.address in hdfs-site.xml pointing to NN_HOST:50070 should do.]
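
Concretely, that property in the SNN host's hdfs-site.xml would look
something like this sketch (the hostname is illustrative):

  <property>
    <name>dfs.http.address</name>
    <!-- must point at the NameNode's HTTP server, not the SNN's own,
         so the SNN can fetch the fsimage and edits to checkpoint -->
    <value>namenode-host:50070</value>
  </property>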

Also, if you are using the tarball start/stop scripts, putting in the hostname 
for SNN in the conf/masters list is sufficient to get it auto-started there.
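
For example, with the tarball scripts (hostname illustrative):

  # conf/masters lists the host(s) the SNN should run on, one per line
  echo "snn-host" > conf/masters
  # start-dfs.sh starts the NN locally, DNs on the conf/slaves hosts,
  # and the SNN on every host listed in conf/masters
  bin/start-dfs.sh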

On 27-Dec-2011, at 11:36 AM, praveenesh kumar wrote:

> Thanks. But my 1st question is still unanswered.
> I have 8 DN/TT machines and 1 NN machine.
> I want to set one of my DN/TT machines as SNN.
> How do I configure my conf/*.xml files to achieve this?
> 
> Thanks,
> Praveenesh
> 
> On Mon, Dec 26, 2011 at 8:44 PM, Harsh J  wrote:
>> (Answering beyond Uma's reply)
>> 
>>> Can a DN also act as SNN, any pros and cons of having this configuration ?
>> 
>> You can run SNN on a regular slave box if you can't dedicate a box; it
>> shouldn't be an issue for small clusters -- do ensure its disk
>> configuration is proper, and that it is allocated nearly the same heap as
>> the NameNode.
>> 
>> For large clusters, where the fsimage and periodic edits files are
>> larger, it would be worth placing it on a separate box given the SNN's
>> interactions.
>> 
>> On 26-Dec-2011, at 7:53 PM, Uma Maheswara Rao G wrote:
>> 
>>> Hey Praveenesh,
>>> 
>>>  You can also start the secondary namenode by just giving the option
>>> ./hadoop secondarynamenode
>>> 
>>> DN can not act as secondary namenode. The basic work of the secondary
>>> namenode is to do checkpointing and to get the edits in sync with the
>>> Namenode up to the last checkpointing period. DN is to store the real
>>> data blocks physically.
>>>  You also need to configure the correct namenode http address for the
>>> secondaryNN, so that it can connect to the NN for checkpointing operations.
>>> http://hadoop.apache.org/common/docs/current/hdfs_user_guide.html#Secondary+NameNode
>>> You can configure the secondary node's IP in the masters file; start-dfs.sh
>>> itself will start the SNN automatically, as it starts the DN and NN as well.
>>> 
>>> You can also see
>>> http://www.cloudera.com/blog/2009/02/multi-host-secondarynamenode-configuration/
>>> 
>>> Regards,
>>> Uma
>>> 
>>> From: praveenesh kumar [praveen...@gmail.com]
>>> Sent: Monday, December 26, 2011 5:05 PM
>>> To: common-user@hadoop.apache.org
>>> Subject: Secondary Namenode on hadoop 0.20.205 ?
>>> 
>>> Hey people,
>>> 
>>> How can we setup another machine in the cluster as Secondary Namenode
>>> in hadoop 0.20.205 ?
>>> Can a DN also act as SNN, any pros and cons of having this configuration ?
>>> 
>>> Thanks,
>>> Praveenesh
>> 



Re: Secondary Namenode on hadoop 0.20.205 ?

2011-12-26 Thread praveenesh kumar
Thanks. But my 1st question is still unanswered.
I have 8 DN/TT machines and 1 NN machine.
I want to set one of my DN/TT machines as SNN.
How do I configure my conf/*.xml files to achieve this?

Thanks,
Praveenesh

On Mon, Dec 26, 2011 at 8:44 PM, Harsh J  wrote:
> (Answering beyond Uma's reply)
>
>> Can a DN also act as SNN, any pros and cons of having this configuration ?
>
> You can run SNN on a regular slave box if you can't dedicate a box; it
> shouldn't be an issue for small clusters -- do ensure its disk
> configuration is proper, and that it is allocated nearly the same heap as
> the NameNode.
>
> For large clusters, where the fsimage and periodic edits files are
> larger, it would be worth placing it on a separate box given the SNN's
> interactions.
>
> On 26-Dec-2011, at 7:53 PM, Uma Maheswara Rao G wrote:
>
>> Hey Praveenesh,
>>
>>  You can also start the secondary namenode by just giving the option
>> ./hadoop secondarynamenode
>>
>> DN can not act as secondary namenode. The basic work of the secondary
>> namenode is to do checkpointing and to get the edits in sync with the
>> Namenode up to the last checkpointing period. DN is to store the real
>> data blocks physically.
>>  You also need to configure the correct namenode http address for the
>> secondaryNN, so that it can connect to the NN for checkpointing operations.
>> http://hadoop.apache.org/common/docs/current/hdfs_user_guide.html#Secondary+NameNode
>> You can configure the secondary node's IP in the masters file; start-dfs.sh
>> itself will start the SNN automatically, as it starts the DN and NN as well.
>>
>> You can also see
>> http://www.cloudera.com/blog/2009/02/multi-host-secondarynamenode-configuration/
>>
>> Regards,
>> Uma
>> 
>> From: praveenesh kumar [praveen...@gmail.com]
>> Sent: Monday, December 26, 2011 5:05 PM
>> To: common-user@hadoop.apache.org
>> Subject: Secondary Namenode on hadoop 0.20.205 ?
>>
>> Hey people,
>>
>> How can we setup another machine in the cluster as Secondary Namenode
>> in hadoop 0.20.205 ?
>> Can a DN also act as SNN, any pros and cons of having this configuration ?
>>
>> Thanks,
>> Praveenesh
>


anyone upgrade from 0.21 to 0.22?

2011-12-26 Thread steven zhuang
Hello all,
I have a running Hadoop 0.21.0 cluster. Since 0.22 is the stable version
and we don't use Kerberos, I just want to be sure: has anybody upgraded
from Hadoop 0.21 to 0.22 successfully?
Any hint would be greatly appreciated!



-- 
best wishes.
Steven


Re: How to remove ending tab separator in streaming output

2011-12-26 Thread Mahadev Konar
Bccing common-user and ccing mapred-user. Please use the correct
mailing lists for your questions.

You can use -Dstream.map.output.field.separator=
for specifying the separator.

  The link below should have more information.

http://hadoop.apache.org/common/docs/r0.20.205.0/streaming.html#Customizing+How+Lines+are+Split+into+Key%2FValue+Pairs
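
As a sketch, the option goes before the other streaming arguments; the jar
name and the I/O paths here are illustrative:

  bin/hadoop jar contrib/streaming/hadoop-streaming-0.20.205.0.jar \
    -D stream.map.output.field.separator=, \
    -input /user/me/in \
    -output /user/me/out \
    -mapper /bin/cat \
    -reducer NONE

Whether the trailing tab disappears also depends on the output format's own
key/value separator, so check the stream.* options on the linked page
against your version.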

Hope that helps.

thanks
mahadev


On Mon, Dec 26, 2011 at 2:42 AM, devdoer bird  wrote:
>
> HI:
>
> In streaming MR program, I use /bin/cat as a mapper and set reducer=NONE,
>  but the outputs all end with a tab separator. How do I configure the
> streaming MR program to remove the ending separator?
>
> Thanks.


Re: Hadoop configuration

2011-12-26 Thread Humayun kabir
Hi Uma,
Thanks a lot. At last it is running without errors. Thank you very much for
your suggestion.

On 26 December 2011 20:04, Uma Maheswara Rao G  wrote:

> Hey Humayun,
>  Looks like your hostname is still not resolving properly. Even though you
> configured hostnames as master, slave, etc., it is getting humayun as the
> hostname.
> Just edit the /etc/HOSTNAME file with the correct hostname you are
> expecting here.
> To confirm whether it is resolving properly or not, you can just do the
> below steps:
>   #hostname      // should print the hostname correctly (ex: master)
>   #hostname -i   // should resolve the correct IP (ex: master ip)
>
>
> and make sure slave and slave1 are pingable from each other.
>
> Regards,
> Uma
>
> 
> From: Humayun kabir [humayun0...@gmail.com]
> Sent: Saturday, December 24, 2011 9:51 PM
> To: common-user@hadoop.apache.org
> Subject: Re: Hadoop configuration
>
> I've checked my log files, but I don't understand why this error occurs.
> Here are my log files. Please give me some suggestions.
>
> jobtracker.log < http://paste.ubuntu.com/781181/ >
>
> namenode.log < http://paste.ubuntu.com/781183/ >
>
> datanode.log(1st machine) < http://paste.ubuntu.com/781176/ >
>
> datanode.log(2nd machine) < http://paste.ubuntu.com/781195/ >
>
> tasktracker.log(1st machine) < http://paste.ubuntu.com/781192/ >
>
> tasktracker.log(2nd machine) < http://paste.ubuntu.com/781197/ >
>
>
>
> On 24 December 2011 15:26, Joey Krabacher  wrote:
>
> > have you checked your log files for any clues?
> >
> > --Joey
> >
> > On Sat, Dec 24, 2011 at 3:15 AM, Humayun kabir 
> > wrote:
> > > Hi Uma,
> > >
> > > Thank you very much for your tips. We tried it on 3 nodes in
> > > VirtualBox as you suggested. But we are still facing the problem.
> > > Here are all our configuration files for all nodes. Please take a
> > > look and show us some ways to solve it. It would be great if you
> > > could help us in this regard.
> > >
> > > core-site.xml < http://pastebin.com/Twn5edrp >
> > > hdfs-site.xml < http://pastebin.com/k4hR4GE9 >
> > > mapred-site.xml < http://pastebin.com/gZuyHswS >
> > >
> > > /etc/hosts < http://pastebin.com/5s0yhgnj >
> > >
> > > output < http://paste.ubuntu.com/780807/ >
> > >
> > >
> > > Hope you will understand and extend your helping hand towards us.
> > >
> > > Have a nice day.
> > >
> > > Regards
> > > Humayun
> > >
> > > On 23 December 2011 17:31, Uma Maheswara Rao G 
> > wrote:
> > >
> > >> Hi Humayun ,
> > >>
> > >>  Lets assume you have JT, TT1, TT2, TT3
> > >>
> > >>  Now you should configure the /etc/hosts like the below example
> > >>
> > >>  10.18.xx.1 JT
> > >>
> > >>  10.18.xx.2 TT1
> > >>
> > >>  10.18.xx.3 TT2
> > >>
> > >>  10.18.xx.4 TT3
> > >>
> > >>   Configure the same set on all the machines, so that all task
> > >> trackers can talk to each other with hostnames correctly. Also
> > >> please remove some entries from your files:
> > >>
> > >>   127.0.0.1 localhost.localdomain localhost
> > >>
> > >>   127.0.1.1 humayun
> > >>
> > >>
> > >>
> > >> I have seen others already suggested many links for the regular
> > >> configuration items. Hope you might clear about them.
> > >>
> > >> hope it will help...
> > >>
> > >> Regards,
> > >>
> > >> Uma
> > >>
> > >> 
> > >>
> > >> From: Humayun kabir [humayun0...@gmail.com]
> > >> Sent: Thursday, December 22, 2011 10:34 PM
> > >> To: common-user@hadoop.apache.org; Uma Maheswara Rao G
> > >> Subject: Re: Hadoop configuration
> > >>
> > >> Hello Uma,
> > >>
> > >> Thanks for your cordial and quick reply. It would be great if you
> > >> could explain what you suggested doing. Right now we are running
> > >> the following configuration.
> > >>
> > >> We are using Hadoop on VirtualBox. When it is a single node, it
> > >> works fine for big datasets larger than the default block size.
> > >> But in the case of a multinode cluster (2 nodes) we are facing some
> > >> problems. We are able to ping both "Master->Slave" and
> > >> "Slave->Master".
> > >> When the input dataset is smaller than the default block size (64
> > >> MB) it works fine, but when the input dataset is larger than the
> > >> default block size it shows 'too many fetch failures' in the reduce
> > >> state.
> > >> here is the output link
> > >> http://paste.ubuntu.com/707517/
> > >>
> > >> this is our /etc/hosts file
> > >>
> > >> 192.168.60.147 humayun # Added by NetworkManager
> > >> 127.0.0.1 localhost.localdomain localhost
> > >> ::1 humayun localhost6.localdomain6 localhost6
> > >> 127.0.1.1 humayun
> > >>
> > >> # The following lines are desirable for IPv6 capable hosts
> > >> ::1 localhost ip6-localhost ip6-loopback
> > >> fe00::0 ip6-localnet
> > >> ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes

Re: Secondary Namenode on hadoop 0.20.205 ?

2011-12-26 Thread Harsh J
(Answering beyond Uma's reply)

> Can a DN also act as SNN, any pros and cons of having this configuration ?

You can run SNN on a regular slave box if you can't dedicate a box; it
shouldn't be an issue for small clusters -- do ensure its disk configuration
is proper, and that it is allocated nearly the same heap as the NameNode.

For large clusters, where the fsimage and periodic edits files are larger,
it would be worth placing it on a separate box given the SNN's interactions.

On 26-Dec-2011, at 7:53 PM, Uma Maheswara Rao G wrote:

> Hey Praveenesh,
> 
>  You can also start the secondary namenode by just giving the option
> ./hadoop secondarynamenode
> 
> DN can not act as secondary namenode. The basic work of the secondary
> namenode is to do checkpointing and to get the edits in sync with the
> Namenode up to the last checkpointing period. DN is to store the real data
> blocks physically.
>  You also need to configure the correct namenode http address for the
> secondaryNN, so that it can connect to the NN for checkpointing operations.
> http://hadoop.apache.org/common/docs/current/hdfs_user_guide.html#Secondary+NameNode
> You can configure the secondary node's IP in the masters file; start-dfs.sh
> itself will start the SNN automatically, as it starts the DN and NN as well.
> 
> You can also see
> http://www.cloudera.com/blog/2009/02/multi-host-secondarynamenode-configuration/
> 
> Regards,
> Uma
> 
> From: praveenesh kumar [praveen...@gmail.com]
> Sent: Monday, December 26, 2011 5:05 PM
> To: common-user@hadoop.apache.org
> Subject: Secondary Namenode on hadoop 0.20.205 ?
> 
> Hey people,
> 
> How can we setup another machine in the cluster as Secondary Namenode
> in hadoop 0.20.205 ?
> Can a DN also act as SNN, any pros and cons of having this configuration ?
> 
> Thanks,
> Praveenesh



RE: Secondary Namenode on hadoop 0.20.205 ?

2011-12-26 Thread Uma Maheswara Rao G
Hey Praveenesh,
  
  You can also start the secondary namenode by just giving the option
./hadoop secondarynamenode

DN can not act as secondary namenode. The basic work of the secondary
namenode is to do checkpointing and to get the edits in sync with the
Namenode up to the last checkpointing period. DN is to store the real data
blocks physically.
  You also need to configure the correct namenode http address for the
secondaryNN, so that it can connect to the NN for checkpointing operations.

http://hadoop.apache.org/common/docs/current/hdfs_user_guide.html#Secondary+NameNode
You can configure the secondary node's IP in the masters file; start-dfs.sh
itself will start the SNN automatically, as it starts the DN and NN as well.

You can also see
http://www.cloudera.com/blog/2009/02/multi-host-secondarynamenode-configuration/

Regards,
Uma

From: praveenesh kumar [praveen...@gmail.com]
Sent: Monday, December 26, 2011 5:05 PM
To: common-user@hadoop.apache.org
Subject: Secondary Namenode on hadoop 0.20.205 ?

Hey people,

How can we setup another machine in the cluster as Secondary Namenode
in hadoop 0.20.205 ?
Can a DN also act as SNN, any pros and cons of having this configuration ?

Thanks,
Praveenesh


RE: Hadoop configuration

2011-12-26 Thread Uma Maheswara Rao G
Hey Humayun,
 Looks like your hostname is still not resolving properly. Even though you
configured hostnames as master, slave, etc., it is getting humayun as the
hostname.
Just edit the /etc/HOSTNAME file with the correct hostname you are expecting
here.
To confirm whether it is resolving properly or not, you can just do the
below steps:
  #hostname      // should print the hostname correctly (ex: master)
  #hostname -i   // should resolve the correct IP (ex: master ip)


and make sure slave and slave1 are pingable from each other.

Regards,
Uma


From: Humayun kabir [humayun0...@gmail.com]
Sent: Saturday, December 24, 2011 9:51 PM
To: common-user@hadoop.apache.org
Subject: Re: Hadoop configuration

I've checked my log files, but I don't understand why this error occurs.
Here are my log files. Please give me some suggestions.

jobtracker.log < http://paste.ubuntu.com/781181/ >

namenode.log < http://paste.ubuntu.com/781183/ >

datanode.log(1st machine) < http://paste.ubuntu.com/781176/ >

datanode.log(2nd machine) < http://paste.ubuntu.com/781195/ >

tasktracker.log(1st machine) < http://paste.ubuntu.com/781192/ >

tasktracker.log(2nd machine) < http://paste.ubuntu.com/781197/ >



On 24 December 2011 15:26, Joey Krabacher  wrote:

> have you checked your log files for any clues?
>
> --Joey
>
> On Sat, Dec 24, 2011 at 3:15 AM, Humayun kabir 
> wrote:
> > Hi Uma,
> >
> > Thank you very much for your tips. We tried it on 3 nodes in VirtualBox
> > as you suggested. But we are still facing the problem. Here are all our
> > configuration files for all nodes. Please take a look and show us some
> > ways to solve it. It would be great if you could help us in this regard.
> >
> > core-site.xml < http://pastebin.com/Twn5edrp >
> > hdfs-site.xml < http://pastebin.com/k4hR4GE9 >
> > mapred-site.xml < http://pastebin.com/gZuyHswS >
> >
> > /etc/hosts < http://pastebin.com/5s0yhgnj >
> >
> > output < http://paste.ubuntu.com/780807/ >
> >
> >
> > Hope you will understand and extend your helping hand towards us.
> >
> > Have a nice day.
> >
> > Regards
> > Humayun
> >
> > On 23 December 2011 17:31, Uma Maheswara Rao G 
> wrote:
> >
> >> Hi Humayun ,
> >>
> >>  Lets assume you have JT, TT1, TT2, TT3
> >>
> >>  Now you should configure the /etc/hosts like the below example
> >>
> >>  10.18.xx.1 JT
> >>
> >>  10.18.xx.2 TT1
> >>
> >>  10.18.xx.3 TT2
> >>
> >>  10.18.xx.4 TT3
> >>
> >>   Configure the same set on all the machines, so that all task trackers
> >> can talk to each other with hostnames correctly. Also please remove
> >> some entries from your files:
> >>
> >>   127.0.0.1 localhost.localdomain localhost
> >>
> >>   127.0.1.1 humayun
> >>
> >>
> >>
> >> I have seen others already suggested many links for the regular
> >> configuration items. Hope you might clear about them.
> >>
> >> hope it will help...
> >>
> >> Regards,
> >>
> >> Uma
> >>
> >> 
> >>
> >> From: Humayun kabir [humayun0...@gmail.com]
> >> Sent: Thursday, December 22, 2011 10:34 PM
> >> To: common-user@hadoop.apache.org; Uma Maheswara Rao G
> >> Subject: Re: Hadoop configuration
> >>
> >> Hello Uma,
> >>
> >> Thanks for your cordial and quick reply. It would be great if you
> >> could explain what you suggested doing. Right now we are running the
> >> following configuration.
> >>
> >> We are using Hadoop on VirtualBox. When it is a single node, it works
> >> fine for big datasets larger than the default block size. But in the
> >> case of a multinode cluster (2 nodes) we are facing some problems. We
> >> are able to ping both "Master->Slave" and "Slave->Master".
> >> When the input dataset is smaller than the default block size (64 MB)
> >> it works fine, but when the input dataset is larger than the default
> >> block size it shows 'too many fetch failures' in the reduce state.
> >> here is the output link
> >> http://paste.ubuntu.com/707517/
> >>
> >> this is our /etc/hosts file
> >>
> >> 192.168.60.147 humayun # Added by NetworkManager
> >> 127.0.0.1 localhost.localdomain localhost
> >> ::1 humayun localhost6.localdomain6 localhost6
> >> 127.0.1.1 humayun
> >>
> >> # The following lines are desirable for IPv6 capable hosts
> >> ::1 localhost ip6-localhost ip6-loopback
> >> fe00::0 ip6-localnet
> >> ff00::0 ip6-mcastprefix
> >> ff02::1 ip6-allnodes
> >> ff02::2 ip6-allrouters
> >> ff02::3 ip6-allhosts
> >>
> >> 192.168.60.1 master
> >> 192.168.60.2 slave
> >>
> >>
> >> Regards,
> >>
> >> -Humayun.
> >>
> >>
> >> On 22 December 2011 15:47, Uma Maheswara Rao G  >> > wrote:
> >> Hey Humayun,
> >>
> >>  To solve the too many fetch failures problem, you should configure
> >> host mapping correctly.
> >> The tasktrackers should be able to ping each other.
> >>

Secondary Namenode on hadoop 0.20.205 ?

2011-12-26 Thread praveenesh kumar
Hey people,

How can we setup another machine in the cluster as Secondary Namenode
in hadoop 0.20.205 ?
Can a DN also act as SNN, any pros and cons of having this configuration ?

Thanks,
Praveenesh


How to remove ending tab separator in streaming output

2011-12-26 Thread devdoer bird
HI:

In streaming MR program, I use /bin/cat as a mapper and set reducer=NONE,
 but the outputs all end with a tab separator. How do I configure the
streaming MR program to remove the ending separator?

Thanks.