add me

2014-05-25 Thread Michael Bronfman



Michael Bronfman
DevOps Engineer | Business Technology

o. +972.73.208.7418 | m. +972.54.445.1122
w. www.perion.com






Re: Web interface not showing applications

2014-05-25 Thread Wangda Tan
Thanks for reporting this, I will look into it.


On Sat, May 24, 2014 at 8:53 AM, bo yang bobyan...@gmail.com wrote:

 Yes, after the applications are shown on the main page, if I do step 1
 again, they disappear again.

 I use Hadoop 2.4, running on Windows.



 On Fri, May 23, 2014 at 4:09 PM, Wangda Tan wheele...@gmail.com wrote:

 Bo,
 Thanks for your steps. I want to know: after an application is shown on the
 main page, if you do step 1 again, will it disappear?
 And what version of Hadoop are you using?

 Thanks,
 Wangda


 On Sat, May 24, 2014 at 1:33 AM, bo yang bobyan...@gmail.com wrote:

 Wangda,

 Yes, it is still reproducible.

 Steps on my side:
 1. Go to Scheduler Page, click some child queue.
 2. Go to Main Page, I see no applications there.
 3. Go to Scheduler Page, click the root queue.
 4. Go to Main Page, I can see applications as normal then.

 Thanks,
 Bo



 On Fri, May 23, 2014 at 6:45 AM, Wangda Tan wheele...@gmail.com wrote:

 Boyu,
 Your problem is typically caused by forgetting to set mapreduce.framework.name=yarn
 in yarn-site.xml or mapred-site.xml.
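
 A minimal sketch of that setting, assuming it lives in mapred-site.xml
 under $HADOOP_CONF_DIR:

   <configuration>
     <!-- run MapReduce jobs on YARN instead of the default local runner -->
     <property>
       <name>mapreduce.framework.name</name>
       <value>yarn</value>
     </property>
   </configuration>

 Without this, jobs run in the local JVM and never reach the ResourceManager,
 so the web UI at port 8088 has nothing to show.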

 Bo,
 Is this reproducible? It looks like a bug we need to fix; could you
 tell me how to run into it if it's reproducible?


 On Thu, May 15, 2014 at 3:26 AM, bo yang bobyan...@gmail.com wrote:

 Hi Boyu,

 I hit a similar situation previously. I found that after I clicked some child
 queue on the Scheduler page, I could not see any applications on the main
 page. If I clicked the root queue on the Scheduler page, I could see
 applications on the main page again. You may want to give that a try.

 Regards,
 Bo



 On Wed, May 14, 2014 at 10:01 AM, Boyu Zhang boyuzhan...@gmail.com wrote:

 Dear all,

 I am using Hadoop 2.4.0 in pseudo-distributed mode. I tried to run the
 example wordcount program; it finished successfully, but I am not able
 to see the application/job from the localhost:8088 web interface.

 I started the job history daemon, and nothing shows at localhost:19888
 either.

 Can anybody provide any insights?

 Thanks a lot!
 Boyu
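
 One way to check whether the job really ran on YARN, independent of the
 web UI, is the yarn CLI (a sketch, assuming Hadoop 2.4's bin directory is
 on the path):

   # list applications the ResourceManager knows about, including finished ones
   yarn application -list -appStates FINISHED

 If the job ran with the local runner instead of YARN, nothing is listed
 here, which points back to the mapreduce.framework.name setting mentioned
 above.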









running mapreduce

2014-05-25 Thread dwld0...@gmail.com






Hi,
Once I run a MapReduce job, an unavailable process appears. Each time it
looks like this:

3472 ThriftServer
3134 NodeManager
3322 HRegionServer
4383 -- process information unavailable
4595 Jps
2978 DataNode

I deleted the process id file in the /tmp/hsperfdata_yarn directory, but it
still appears again after running MapReduce.






dwld0...@gmail.com



Re: running mapreduce

2014-05-25 Thread Ted Yu
Can you provide a bit more information, such as the release of Hadoop
you're running?

BTW, did you use the 'ps' command to see the command line for PID 4383?
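
For example, assuming a Linux box where 4383 is the PID in question:

  # show the full command line of the process jps cannot identify
  ps -fp 4383
  # or read it straight from the proc filesystem
  tr '\0' ' ' < /proc/4383/cmdline; echo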

Cheers


On Sun, May 25, 2014 at 7:30 AM, dwld0...@gmail.com dwld0...@gmail.com wrote:

 Hi
 Once I run a MapReduce job, an unavailable process appears.
 Each time it looks like this:
 3472 ThriftServer
 3134 NodeManager
 3322 HRegionServer
 4383 -- process information unavailable
 4595 Jps
 2978 DataNode

 I deleted the process id file in the /tmp/hsperfdata_yarn directory,
 but it still appears again after running MapReduce.





 --
 dwld0...@gmail.com



How to make sure data blocks are shared between 2 datanodes

2014-05-25 Thread Sindhu Hosamane

 Hello Friends, 
 
 I am running multiple datanodes on a single machine.

 The output of the jps command shows:
 Namenode
 Datanode
 Datanode
 Jobtracker
 Tasktracker
 Secondary Namenode

 which assures me that 2 datanodes are up and running. I execute Cascalog
 queries on this 2-datanode Hadoop cluster, and I get the results of the
 queries too. I am not sure if it is really using both datanodes (because
 I get results with one datanode anyway).

 (I read somewhere about HDFS storing data in datanodes like below:)
 1) HDFS might automatically move data from one DataNode to another
 if the free space on a DataNode falls below a certain threshold.
 2) Internally, a file is split into one or more blocks and these blocks
 are stored in a set of DataNodes.

 My doubts are:
 * Do I have to make any configuration changes in Hadoop to tell it to
 share data blocks between the 2 datanodes, or does it do so automatically?
 * Also, my test data is not too big, only 240 KB. According to point 1),
 I don't know if such small test data can trigger automatic movement of
 data from one datanode to another.
 * Also, what should the dfs.replication value be when I am running
 2 datanodes? (I guess it's 2.)

 Any advice or help would be very much appreciated.
 
 Best Regards,
 Sindhu
 
 



Re: How to make sure data blocks are shared between 2 datanodes

2014-05-25 Thread Peyman Mohajerian
Block sizes are typically 64 MB or 128 MB, so in your case only a single block
is involved, which means that if you have a single replica, only a single
data node will be used. The default replication factor is three, and since you
only have two data nodes, you will most likely have two copies of the data on
two separate data nodes.
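
To check this concretely, two short sketches (assuming a Hadoop 2.x install;
the file path is only an example):

  <!-- hdfs-site.xml: with two datanodes, a replication factor of 2 fits -->
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>

  # report block locations and replicas for a given file
  hdfs fsck /user/sindhu/input.txt -files -blocks -locations

fsck prints each block together with the datanodes holding its replicas,
which tells you directly whether both datanodes store a copy.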


On Sun, May 25, 2014 at 12:40 PM, Sindhu Hosamane sindh...@gmail.com wrote:


 Hello Friends,

 I am running multiple datanodes on a single machine.

 The output of the jps command shows:
 Namenode
 Datanode
 Datanode
 Jobtracker
 Tasktracker
 Secondary Namenode

 which assures me that 2 datanodes are up and running. I execute Cascalog
 queries on this 2-datanode Hadoop cluster, and I get the results of the
 queries too. I am not sure if it is really using both datanodes (because
 I get results with one datanode anyway).

 (I read somewhere about HDFS storing data in datanodes like below:)
 1) HDFS might automatically move data from one DataNode to another
 if the free space on a DataNode falls below a certain threshold.
 2) Internally, a file is split into one or more blocks and these blocks
 are stored in a set of DataNodes.

 My doubts are:
 * Do I have to make any configuration changes in Hadoop to tell it to
 share data blocks between the 2 datanodes, or does it do so automatically?
 * Also, my test data is not too big, only 240 KB. According to point 1),
 I don't know if such small test data can trigger automatic movement of
 data from one datanode to another.
 * Also, what should the dfs.replication value be when I am running
 2 datanodes? (I guess it's 2.)

 Any advice or help would be very much appreciated.

 Best Regards,
 Sindhu







Re: Re: running mapreduce

2014-05-25 Thread dwld0...@gmail.com






Hi,
It is CDH 5.0.0, Hadoop 2.3.0.

I found the unavailable process had disappeared this morning, but it appears
again on the map/reduce server after running a MapReduce job:

# jps
15371 Jps
2269 QuorumPeerMain
15306 -- process information unavailable
11295 DataNode
11455 NodeManager

# ps -ef | grep java
Only the three real processes are shown, without 15306.
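
A note on why this happens: jps reads the per-process files under
/tmp/hsperfdata_<user>, and prints "process information unavailable" when a
PID's file belongs to a different user than the one running jps, or is left
over from a dead process. A quick check, assuming the stray PID belongs to
the yarn user:

  # see which user owns the perf-data file for the mystery PID
  ls -l /tmp/hsperfdata_*/15306
  # run jps as that user so it can resolve the process name
  sudo -u yarn jps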


dwld0...@gmail.com
From: Ted Yu
Date: 2014-05-25 22:53
To: common-user@hadoop.apache.org
Subject: Re: running mapreduce

Can you provide a bit more information, such as the release of Hadoop
you're running?
BTW, did you use the 'ps' command to see the command line for PID 4383?

Cheers

On Sun, May 25, 2014 at 7:30 AM, dwld0...@gmail.com dwld0...@gmail.com wrote:


Hi,
Once I run a MapReduce job, an unavailable process appears.
Each time it looks like this:
3472 ThriftServer 
3134 NodeManager 

3322 HRegionServer 
4383 -- process information unavailable 

4595 Jps 
2978 DataNode

I deleted the process id file in the /tmp/hsperfdata_yarn directory,
but it still appears again after running MapReduce.







dwld0...@gmail.com





Re: Web interface not showing applications

2014-05-25 Thread bo yang
No problem. Thanks, Wangda, for following up on this issue!


On Sun, May 25, 2014 at 6:36 AM, Wangda Tan wheele...@gmail.com wrote:

 Thanks for reporting this, I will look into it.


 On Sat, May 24, 2014 at 8:53 AM, bo yang bobyan...@gmail.com wrote:

 Yes, after the applications are shown on the main page, if I do step 1
 again, they disappear again.

 I use Hadoop 2.4, running on Windows.



 On Fri, May 23, 2014 at 4:09 PM, Wangda Tan wheele...@gmail.com wrote:

 Bo,
 Thanks for your steps. I want to know: after an application is shown on the
 main page, if you do step 1 again, will it disappear?
 And what version of Hadoop are you using?

 Thanks,
 Wangda


 On Sat, May 24, 2014 at 1:33 AM, bo yang bobyan...@gmail.com wrote:

 Wangda,

 Yes, it is still reproducible.

 Steps on my side:
 1. Go to Scheduler Page, click some child queue.
 2. Go to Main Page, I see no applications there.
 3. Go to Scheduler Page, click the root queue.
 4. Go to Main Page, I can see applications as normal then.

 Thanks,
 Bo



 On Fri, May 23, 2014 at 6:45 AM, Wangda Tan wheele...@gmail.com wrote:

 Boyu,
 Your problem is typically caused by forgetting to set mapreduce.framework.name=yarn
 in yarn-site.xml or mapred-site.xml.

 Bo,
 Is this reproducible? It looks like a bug we need to fix; could you
 tell me how to run into it if it's reproducible?


 On Thu, May 15, 2014 at 3:26 AM, bo yang bobyan...@gmail.com wrote:

 Hi Boyu,

 I hit a similar situation previously. I found that after I clicked some child
 queue on the Scheduler page, I could not see any applications on the main
 page. If I clicked the root queue on the Scheduler page, I could see
 applications on the main page again. You may want to give that a try.

 Regards,
 Bo



 On Wed, May 14, 2014 at 10:01 AM, Boyu Zhang boyuzhan...@gmail.com wrote:

 Dear all,

 I am using Hadoop 2.4.0 in pseudo-distributed mode. I tried to run the
 example wordcount program; it finished successfully, but I am not able
 to see the application/job from the localhost:8088 web interface.

 I started the job history daemon, and nothing shows at localhost:19888
 either.

 Can anybody provide any insights?

 Thanks a lot!
 Boyu