Re: Installing both hadoop1 and hadoop2 on single node

2014-05-05 Thread chandra kant
Thanks..
I set it accordingly..

On Monday, 5 May 2014, Shengjun Xin s...@gopivotal.com wrote:

 I think you need to set two different sets of environment variables for
 hadoop1 and hadoop2, such as HADOOP_HOME and HADOOP_CONF_DIR, and before you
 run a hadoop command, you need to make sure the correct environment variables
 are enabled.


 On Mon, May 5, 2014 at 12:36 PM, chandra kant 
 chandralakshmikan...@gmail.com wrote:


 Hi,
 Is it possible to install both hadoop1 and hadoop2 side by side on a
 single node for development purposes, so that I can choose either one by
 just shutting down one and starting the other?
 I installed hadoop 1.2.1 and ran it successfully.
 Next, when I try to do hdfs namenode -format from hadoop-2.2.0, it
 tries to format the hadoop.tmp.dir set up by hadoop-1.2.1, which is clearly
 not desirable. I set dfs.namenode.name.dir and dfs.datanode.data.dir
 in hdfs-site.xml to different
 locations.
 But again the same problem...
 Any suggestions?

 --
 Chandra




 --
 Regards
 Shengjun
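
A minimal sketch of the switching Shengjun describes: one small env file per
install, sourced in a fresh shell. Every path below is an assumption, not
something taken from the thread:

    # hadoop1.env -- source before using hadoop-1
    export HADOOP_PREFIX=/opt/hadoop-1.2.1          # hypothetical install path
    export HADOOP_HOME=$HADOOP_PREFIX
    export HADOOP_CONF_DIR=$HADOOP_PREFIX/conf
    export PATH=$HADOOP_PREFIX/bin:$PATH

    # hadoop2.env -- source before using hadoop-2
    export HADOOP_HOME=/opt/hadoop-2.2.0            # hypothetical install path
    export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
    export HADOOP_MAPRED_HOME=$HADOOP_HOME
    export HADOOP_COMMON_HOME=$HADOOP_HOME
    export HADOOP_HDFS_HOME=$HADOOP_HOME
    export HADOOP_YARN_HOME=$HADOOP_HOME
    export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

    # switch by starting a fresh shell and sourcing one file, e.g.
    #   . ~/hadoop2.env && hadoop version

Sourcing one file per fresh shell avoids commenting lines in and out of
.bash_profile, which comes up later in this thread.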



[Blog] Map-only jobs in Hadoop for beginners

2014-05-05 Thread unmesha sreeveni
Hi

http://www.unmeshasreeveni.blogspot.in/2014/05/map-only-jobs-in-hadoop.html

This is a post on Map-only Jobs in Hadoop for beginners.


-- 
Thanks & Regards

Unmesha Sreeveni U.B
Hadoop, Bigdata Developer
Center for Cyber Security | Amrita Vishwa Vidyapeetham
http://www.unmeshasreeveni.blogspot.in/
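
For readers who just want the takeaway: in the Hadoop 2.x Java API, a map-only
job is simply a job with zero reduce tasks. A minimal, hypothetical driver (the
class name and paths are placeholders, and the identity Mapper is used just for
illustration):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class MapOnlyExample {
      public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "map-only-example");
        job.setJarByClass(MapOnlyExample.class);
        job.setMapperClass(Mapper.class); // identity mapper, for illustration
        job.setNumReduceTasks(0);         // zero reducers = map-only: mapper
                                          // output is written directly, with
                                          // no shuffle or sort phase
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }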


Regarding a shell for hadoop with command completion and easier directory listing

2014-05-05 Thread Ponnusamy, Arivazhagan
Hi,
Does hadoop support command completion, or a state-aware tool that keeps track
of the current working directory in the hadoop file system, like a unix shell?

If not, are any open source tools available?

Regards,
James Arivazhagan Ponnusamy


Re: Installing both hadoop1 and hadoop2 on single node

2014-05-05 Thread chandra kant
I mean, I tried that already. In my .bash_profile I set up
HADOOP_HOME, HADOOP_MAPRED_HOME, HADOOP_COMMON_HOME, HADOOP_HDFS_HOME,
HADOOP_YARN_HOME, and HADOOP_CONF_DIR to point to the hadoop2 directory. And
similarly I have HADOOP_PREFIX and HADOOP_HOME for hadoop-1.
So I comment out the hadoop-2-specific conf lines in .bash_profile while
running hadoop-1, and vice versa.
And here the problem comes: when trying to run hadoop-2, it asks to
format the hadoop.tmp.dir set up by hadoop-1.

Just for context: I have done a fair amount of work on hadoop-1 and this is
my first attempt at hadoop-2.

--
Chandra


On Mon, May 5, 2014 at 11:31 AM, chandra kant 
chandralakshmikan...@gmail.com wrote:

 Thanks..
 I set it accordingly..


 On Monday, 5 May 2014, Shengjun Xin s...@gopivotal.com wrote:

 I think you need to set two different sets of environment variables for
 hadoop1 and hadoop2, such as HADOOP_HOME and HADOOP_CONF_DIR, and before you
 run a hadoop command, you need to make sure the correct environment variables
 are enabled.


 On Mon, May 5, 2014 at 12:36 PM, chandra kant 
 chandralakshmikan...@gmail.com wrote:


 Hi,
 Is it possible to install both hadoop1 and hadoop2 side by side on a
 single node for development purposes, so that I can choose either one by
 just shutting down one and starting the other?
 I installed hadoop 1.2.1 and ran it successfully.
 Next, when I try to do hdfs namenode -format from hadoop-2.2.0, it
 tries to format the hadoop.tmp.dir set up by hadoop-1.2.1, which is clearly
 not desirable. I set dfs.namenode.name.dir and dfs.datanode.data.dir
 in hdfs-site.xml to different
 locations.
 But again the same problem...
 Any suggestions?

 --
 Chandra




 --
 Regards
 Shengjun




Reading from jar file in local filesystem using hadoop FileSystem

2014-05-05 Thread Diego Fernandez
Hey guys, I'm new here.  I asked the following question on SO:
http://stackoverflow.com/questions/23478746/reading-from-jar-file-in-local-filesystem-using-hadoop-filesystem
but I figured someone here might have a better idea.

Any help would be greatly appreciated.

-- 
Diego Fernandez - 爱国


Are mapper classes re-instantiated for each record?

2014-05-05 Thread jeremy p
Let's say I have a TaskTracker that receives 5 records to process for a
single job.  When the TaskTracker processes the first record, it will
instantiate my Mapper class and execute my setup() function.  It will then
run the map() method on that record.  My question is this : what happens
when the map() method has finished processing the first record?  I'm
guessing it will do one of two things :

1) My cleanup() function will execute.  After the cleanup() method has
finished, this instance of the Mapper object will be destroyed.  When it is
time to process the next record, a new Mapper object will be instantiated.
 Then my setup() method will execute, the map() method will execute, the
cleanup() method will execute, and then the Mapper instance will be
destroyed.  When it is time to process the next record, a new Mapper object
will be instantiated.  This process will repeat itself until all 5 records
have been processed.  In other words, my setup() and cleanup() methods will
have been executed 5 times each.

or

2) When the map() method has finished processing my first record, the
Mapper instance will NOT be destroyed.  It will be reused for all 5
records.  When the map() method has finished processing the last record, my
cleanup() method will execute.  In other words, my setup() and cleanup()
methods will only execute 1 time each.

Thanks for the help!


issue about cluster balance

2014-05-05 Thread ch huang
hi, maillist:
 I have a 5-node hadoop cluster, and yesterday I added 5 new
boxes to my cluster. After that I started a balance task, but it moved only 7%
of the data to the new nodes in 20 hours, and I already set
dfs.datanode.balance.bandwidthPerSec to 10M, and the threshold is 10%. Why does
the balance task take so long?
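
For reference, a sketch of the knobs involved, assuming the Hadoop 2.x CLI.
One thing worth double-checking: in this era dfs.datanode.balance.bandwidthPerSec
takes a plain number of bytes per second, so a literal value like "10M" may not
parse as 10 MB/s.

    # 10 MB/s expressed in bytes/sec, pushed to the live datanodes (no restart):
    hdfs dfsadmin -setBalancerBandwidth 10485760

    # run the balancer with the 10% threshold:
    hdfs balancer -threshold 10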


Re: Installing both hadoop1 and hadoop2 on single node

2014-05-05 Thread Shengjun Xin
According to your description, I think it is still a configuration problem.
Before you run a hadoop command, did you check the hadoop version and the
hadoop environment variables? Are they what you want?
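
For example (a suggested sanity check, not something from the thread):

    which hadoop            # which install's bin/ is first on PATH?
    hadoop version          # which version actually answers?
    echo $HADOOP_CONF_DIR   # which conf directory will be read?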


On Mon, May 5, 2014 at 11:42 PM, chandra kant 
chandralakshmikan...@gmail.com wrote:

 I mean, I tried that already. In my .bash_profile I set up
 HADOOP_HOME, HADOOP_MAPRED_HOME, HADOOP_COMMON_HOME, HADOOP_HDFS_HOME,
 HADOOP_YARN_HOME, and HADOOP_CONF_DIR to point to the hadoop2 directory. And
 similarly I have HADOOP_PREFIX and HADOOP_HOME for hadoop-1.
 So I comment out the hadoop-2-specific conf lines in .bash_profile while
 running hadoop-1, and vice versa.
 And here the problem comes: when trying to run hadoop-2, it asks to
 format the hadoop.tmp.dir set up by hadoop-1.

 Just for context: I have done a fair amount of work on hadoop-1 and this is
 my first attempt at hadoop-2.

 --
 Chandra


 On Mon, May 5, 2014 at 11:31 AM, chandra kant 
 chandralakshmikan...@gmail.com wrote:

 Thanks..
 I set it accordingly..


 On Monday, 5 May 2014, Shengjun Xin s...@gopivotal.com wrote:

 I think you need to set two different sets of environment variables for
 hadoop1 and hadoop2, such as HADOOP_HOME and HADOOP_CONF_DIR, and before you
 run a hadoop command, you need to make sure the correct environment variables
 are enabled.


 On Mon, May 5, 2014 at 12:36 PM, chandra kant 
 chandralakshmikan...@gmail.com wrote:


 Hi,
 Is it possible to install both hadoop1 and hadoop2 side by side on a
 single node for development purposes, so that I can choose either one by
 just shutting down one and starting the other?
 I installed hadoop 1.2.1 and ran it successfully.
 Next, when I try to do hdfs namenode -format from hadoop-2.2.0, it
 tries to format the hadoop.tmp.dir set up by hadoop-1.2.1, which is clearly
 not desirable. I set dfs.namenode.name.dir and dfs.datanode.data.dir
 in hdfs-site.xml to different
 locations.
 But again the same problem...
 Any suggestions?

 --
 Chandra




 --
 Regards
 Shengjun





-- 
Regards
Shengjun


Yarn HA - ZooKeeper ACL for fencing

2014-05-05 Thread Manoj Samel
Hi,

For yarn.resourcemanager.zk-state-store.root-node.acl, the yarn-default.xml
says: "For fencing to work, the ACLs should be carefully set differently on
each ResourceManager such that all the ResourceManagers have shared admin
access and the Active ResourceManager takes over (exclusively) the
create-delete access."

Can someone give an actual example of such permissions?

Thanks,
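
For what it's worth, one hypothetical shape for such ACLs, using the digest
scheme (the user names and the <...-digest> values are placeholders for real
base64 digests). A ZooKeeper ACL is scheme:id:perms, with perms drawn from
c(reate), d(elete), r(ead), w(rite), a(dmin); the yarn-default text then
translates to "both RMs get rwa, and each RM claims cd for itself":

    <!-- yarn-site.xml on rm1 (hypothetical) -->
    <property>
      <name>yarn.resourcemanager.zk-state-store.root-node.acl</name>
      <value>digest:rm1:<rm1-digest>:cdrwa,digest:rm2:<rm2-digest>:rwa</value>
    </property>

    <!-- yarn-site.xml on rm2, mirrored -->
    <property>
      <name>yarn.resourcemanager.zk-state-store.root-node.acl</name>
      <value>digest:rm2:<rm2-digest>:cdrwa,digest:rm1:<rm1-digest>:rwa</value>
    </property>

As I understand the fencing scheme, the RM that becomes active applies its
configured ACL to the root node, taking over create-delete while the standby
keeps the shared read-write-admin access.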


Re: Installing both hadoop1 and hadoop2 on single node

2014-05-05 Thread Stanley Shi
Please be sure to use a different HADOOP_CONF_DIR for each of the two versions;
also, in the configuration, be sure to use different folders to store the
HDFS-related files.

Regards,
Stanley Shi



On Tue, May 6, 2014 at 8:41 AM, Shengjun Xin s...@gopivotal.com wrote:

 According to your description, I think it is still a configuration problem.
 Before you run a hadoop command, did you check the hadoop version and the
 hadoop environment variables? Are they what you want?


 On Mon, May 5, 2014 at 11:42 PM, chandra kant 
 chandralakshmikan...@gmail.com wrote:

 I mean, I tried that already. In my .bash_profile I set up
 HADOOP_HOME, HADOOP_MAPRED_HOME, HADOOP_COMMON_HOME, HADOOP_HDFS_HOME,
 HADOOP_YARN_HOME, and HADOOP_CONF_DIR to point to the hadoop2 directory. And
 similarly I have HADOOP_PREFIX and HADOOP_HOME for hadoop-1.
 So I comment out the hadoop-2-specific conf lines in .bash_profile while
 running hadoop-1, and vice versa.
 And here the problem comes: when trying to run hadoop-2, it asks to
 format the hadoop.tmp.dir set up by hadoop-1.

 Just for context: I have done a fair amount of work on hadoop-1 and this is
 my first attempt at hadoop-2.

 --
 Chandra


 On Mon, May 5, 2014 at 11:31 AM, chandra kant 
 chandralakshmikan...@gmail.com wrote:

 Thanks..
 I set it accordingly..


 On Monday, 5 May 2014, Shengjun Xin s...@gopivotal.com wrote:

 I think you need to set two different sets of environment variables for
 hadoop1 and hadoop2, such as HADOOP_HOME and HADOOP_CONF_DIR, and before you
 run a hadoop command, you need to make sure the correct environment variables
 are enabled.


 On Mon, May 5, 2014 at 12:36 PM, chandra kant 
 chandralakshmikan...@gmail.com wrote:


 Hi,
 Is it possible to install both hadoop1 and hadoop2 side by side on a
 single node for development purposes, so that I can choose either one by
 just shutting down one and starting the other?
 I installed hadoop 1.2.1 and ran it successfully.
 Next, when I try to do hdfs namenode -format from hadoop-2.2.0, it
 tries to format the hadoop.tmp.dir set up by hadoop-1.2.1, which is clearly
 not desirable. I set dfs.namenode.name.dir and dfs.datanode.data.dir
 in hdfs-site.xml to different
 locations.
 But again the same problem...
 Any suggestions?

 --
 Chandra




 --
 Regards
 Shengjun





 --
 Regards
 Shengjun
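
To make Stanley's point concrete, a sketch of keeping the two installs' storage
apart; every path below is a made-up example:

    <!-- hadoop-1.2.1: conf/core-site.xml and conf/hdfs-site.xml -->
    <property>
      <name>hadoop.tmp.dir</name>
      <value>/data/hadoop1/tmp</value>
    </property>
    <property>
      <name>dfs.name.dir</name>            <!-- hadoop-1 property names -->
      <value>/data/hadoop1/name</value>
    </property>
    <property>
      <name>dfs.data.dir</name>
      <value>/data/hadoop1/data</value>
    </property>

    <!-- hadoop-2.2.0: etc/hadoop/core-site.xml and etc/hadoop/hdfs-site.xml -->
    <property>
      <name>hadoop.tmp.dir</name>
      <value>/data/hadoop2/tmp</value>
    </property>
    <property>
      <name>dfs.namenode.name.dir</name>   <!-- hadoop-2 property names -->
      <value>/data/hadoop2/name</value>
    </property>
    <property>
      <name>dfs.datanode.data.dir</name>
      <value>/data/hadoop2/data</value>
    </property>

If the hadoop-2 format still touches hadoop-1's directories after this, the
hdfs command is most likely reading the wrong conf directory, which loops back
to checking HADOOP_CONF_DIR; note that several defaults, including the name
and data dirs, derive from hadoop.tmp.dir when they are not set explicitly.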



RE: issue about cluster balance

2014-05-05 Thread Rakesh R
Could you give more details? For example:

-  Could you convert the 7% into the total amount of moved data in MBs?

-  Also, could you tell me the 7% data movement per DN?

-  What values are showing for the ‘over-utilized’, ‘above-average’, 
‘below-average’, and ‘under-utilized’ nodes? The Balancer will do the pairing 
based on these values.

-  Please tell me the cluster topology - SAME_NODE_GROUP, SAME_RACK. 
Basically this matters when choosing the sourceNode vs balancerNode pairs as 
well as the proxy source.

-  Did you see whether all the DNs are getting utilized for the block movement?

-  Did any exceptions occur during block movement?

-  How many iterations ran in these hours?

-Rakesh

From: ch huang [mailto:justlo...@gmail.com]
Sent: 06 May 2014 06:10
To: user@hadoop.apache.org
Subject: issue about cluster balance

hi, maillist:
 I have a 5-node hadoop cluster, and yesterday I added 5 new boxes 
to my cluster. After that I started a balance task, but it moved only 7% of the 
data to the new nodes in 20 hours, and I already set 
dfs.datanode.balance.bandwidthPerSec to 10M, and the threshold is 10%. Why does 
the balance task take so long?


Specifying a node/host name on which the Application Master should run

2014-05-05 Thread Krishna Kishore Bonagiri
Hi,

  Is there a way to specify a host name on which we want to run our
ApplicationMaster? Can we do this when it is being launched from the
YarnClient?

Thanks,
Kishore
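
One possible route, hedged: I believe newer 2.x releases let the client shape
the AM container request through ApplicationSubmissionContext, and a
strict-locality request naming a single host should steer placement if the
scheduler honors it. A sketch, with the host name and sizes as placeholders:

    import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
    import org.apache.hadoop.yarn.api.records.Priority;
    import org.apache.hadoop.yarn.api.records.Resource;
    import org.apache.hadoop.yarn.api.records.ResourceRequest;

    // ...inside the usual YarnClient flow, after
    // YarnClientApplication app = yarnClient.createApplication();
    ApplicationSubmissionContext appContext =
        app.getApplicationSubmissionContext();

    Resource amResource = Resource.newInstance(1024, 1); // 1 GB, 1 vcore
    ResourceRequest amRequest = ResourceRequest.newInstance(
        Priority.newInstance(0),    // AM priority
        "node01.example.com",       // placeholder host for the AM
        amResource,
        1);                         // one AM container
    amRequest.setRelaxLocality(false);   // do not fall back to other nodes
    appContext.setAMContainerResourceRequest(amRequest);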


Re: Are mapper classes re-instantiated for each record?

2014-05-05 Thread Sergey Murylev
Hi Jeremy,

According to the official documentation
http://hadoop.apache.org/docs/r2.2.0/api/org/apache/hadoop/mapreduce/Mapper.html
the setup and cleanup calls are performed once for each InputSplit. In this
case your variant 2 is the correct one. But actually a single mapper can be
used for processing multiple InputSplits. In your case, if you have 5 files
with 1 record each, it can call setup/cleanup 5 times. But if your records are
in a single file, I think setup/cleanup should be called once.

--
Thanks,
Sergey

On 06/05/14 02:49, jeremy p wrote:
 Let's say I have a TaskTracker that receives 5 records to process for a
 single job.  When the TaskTracker processes the first record, it will
 instantiate my Mapper class and execute my setup() function.  It will
 then run the map() method on that record.  My question is this : what
 happens when the map() method has finished processing the first
 record?  I'm guessing it will do one of two things :

 1) My cleanup() function will execute.  After the cleanup() method has
 finished, this instance of the Mapper object will be destroyed.  When
 it is time to process the next record, a new Mapper object will be
 instantiated.  Then my setup() method will execute, the map() method
 will execute, the cleanup() method will execute, and then the Mapper
 instance will be destroyed.  When it is time to process the next
 record, a new Mapper object will be instantiated.  This process will
 repeat itself until all 5 records have been processed.  In other
 words, my setup() and cleanup() methods will have been executed 5
 times each.

 or

 2) When the map() method has finished processing my first record, the
 Mapper instance will NOT be destroyed.  It will be reused for all 5
 records.  When the map() method has finished processing the last
 record, my cleanup() method will execute.  In other words, my setup()
 and cleanup() methods will only execute 1 time each.

 Thanks for the help!
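
The shape of the answer is also visible in Mapper.run(), which drives an entire
task; paraphrased from the Hadoop 2.x Mapper source:

    // setup() runs once per task, map() once per record in the split,
    // and cleanup() once per task, not once per record.
    public void run(Context context) throws IOException, InterruptedException {
      setup(context);
      try {
        while (context.nextKeyValue()) {
          map(context.getCurrentKey(), context.getCurrentValue(), context);
        }
      } finally {
        cleanup(context);
      }
    }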


