add me
Michael Bronfman | DevOps Engineer, Business Technology | www.perion.com
Re: Web interface does not show applications
Thanks for reporting this, will look at it.

On Sat, May 24, 2014 at 8:53 AM, bo yang bobyan...@gmail.com wrote:

Yes, after the application is shown on the main page, doing step 1 again makes the applications disappear again. I use Hadoop 2.4, running on Windows.

On Fri, May 23, 2014 at 4:09 PM, Wangda Tan wheele...@gmail.com wrote:

Bo, thanks for your steps. I want to know: after the application is shown on the main page, if you do step 1 again, will the application disappear? And what version of Hadoop are you running? Thanks, Wangda

On Sat, May 24, 2014 at 1:33 AM, bo yang bobyan...@gmail.com wrote:

Wangda, yes, it is still reproducible. Steps on my side:
1. Go to the Scheduler page and click some child queue.
2. Go to the main page; I see no applications there.
3. Go to the Scheduler page and click the root queue.
4. Go to the main page; I can see applications as normal again.
Thanks, Bo

On Fri, May 23, 2014 at 6:45 AM, Wangda Tan wheele...@gmail.com wrote:

Boyu, your problem is typically caused by forgetting to set mapreduce.framework.name=yarn in yarn-site.xml or mapred-site.xml.

Bo, is this reproducible? It looks more like a bug we need to fix; if it is reproducible, could you tell me how to run into it?

On Thu, May 15, 2014 at 3:26 AM, bo yang bobyan...@gmail.com wrote:

Hi Boyu, I hit a similar situation previously. I found that after I clicked some child queue on the Scheduler page, I could not see any applications on the main page. If I click the root queue on the Scheduler page, I can see applications on the main page again. You may give that a try. Regards, Bo

On Wed, May 14, 2014 at 10:01 AM, Boyu Zhang boyuzhan...@gmail.com wrote:

Dear all, I am using Hadoop 2.4.0 in pseudo-distributed mode, and I tried to run the example wordcount program. It finished successfully, but I am not able to see the application/job from the localhost:8088 web interface. I started the job history daemon, and nothing shows at localhost:19888 either. Can anybody provide any pointers? Thanks a lot! Boyu
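Wangda's suggestion amounts to adding one property to mapred-site.xml so that MapReduce jobs run on YARN and therefore appear in the ResourceManager UI at localhost:8088 (a minimal sketch; the rest of the configuration file is assumed):

```xml
<!-- mapred-site.xml: run MapReduce jobs on YARN so they show up in the
     ResourceManager web UI. Without this, jobs run with the local runner
     and never register with the ResourceManager or the job history server. -->
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```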
running mapreduce
Hi,

Once I run MapReduce, an unavailable process appears. Each time it looks like this:

3472 ThriftServer
3134 NodeManager
3322 HRegionServer
4383 -- process information unavailable
4595 Jps
2978 DataNode

I delete the process id in the /tmp/hsperfdata_yarn directory, but it appears again after running MapReduce.

dwld0...@gmail.com
Re: running mapreduce
Can you provide a bit more information, such as the release of Hadoop you're running? BTW, did you use the 'ps' command to see the command line for 4383?

Cheers
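For background: jps shows "process information unavailable" when it finds a file under /tmp/hsperfdata_<user> for a PID it cannot read, typically a JVM that has already exited (each MapReduce run launches short-lived container JVMs as the yarn user, which can leave such files behind), or a JVM owned by another user. A minimal sketch for listing stale entries before deleting anything (the helper name is illustrative, not part of Hadoop):

```shell
# stale_hsperf DIR: print hsperfdata entries in DIR whose PID is not a
# live process. jps reports these as "process information unavailable".
stale_hsperf() {
  dir="$1"
  for f in "$dir"/*; do
    [ -e "$f" ] || continue          # skip if the glob matched nothing
    pid=$(basename "$f")             # hsperfdata files are named by PID
    # ps -p probes whether the PID is alive without touching the process
    if ! ps -p "$pid" > /dev/null 2>&1; then
      echo "$pid"
    fi
  done
}

# Example: stale_hsperf /tmp/hsperfdata_yarn
```

Deleting a file whose PID is still alive is pointless (a running JVM recreates it), which would explain why the entry reappears after every MapReduce run; only confirmed-dead entries are safe to clean up.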
How to make sure data blocks are shared between 2 datanodes
Hello Friends,

I am running multiple datanodes on a single machine. The output of the jps command shows:

Namenode
Datanode
Datanode
Jobtracker
Tasktracker
Secondary Namenode

which assures me that 2 datanodes are up and running. I execute Cascalog queries on this 2-datanode Hadoop cluster, and I get the results of the query too. I am not sure if it is really using both datanodes (because I would get results with one datanode anyway). I read somewhere about HDFS storing data in datanodes, like below:

1) An HDFS scheme might automatically move data from one DataNode to another if the free space on a DataNode falls below a certain threshold.
2) Internally, a file is split into one or more blocks, and these blocks are stored in a set of DataNodes.

My doubts are:
* Do I have to make any configuration changes in Hadoop to tell it to share data blocks between the 2 datanodes, or does it do so automatically?
* My test data is not too big; it is only 240 KB. According to point 1), I don't know if such small test data can initiate automatic movement of data from one datanode to another.
* What should the dfs.replication value be when I am running 2 datanodes? (I guess it's 2.)

Any advice or help would be very much appreciated.

Best Regards,
Sindhu
Re: How to make sure data blocks are shared between 2 datanodes
Block sizes are typically 64 MB or 128 MB, so in your case only a single block is involved, which means that with a single replica only a single datanode would be used. The default replication factor is three, and since you only have two datanodes, you will most likely have two copies of the data on two separate datanodes.
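To verify where the replicas of that single block actually live, hdfs fsck reports each block's replication count and DataNode locations. A sketch (the path and the sample output line below are illustrative, not from the thread; len=245760 corresponds to the 240 KB test file):

```shell
# On a live cluster you would run (path is hypothetical):
#   hdfs fsck /user/sindhu/data.csv -files -blocks -locations
# and read the repl= count and the bracketed DataNode list per block.
# Illustrative fsck output line, and how to pull the replication
# count out of it:
fsck_line='0. blk_1073741825_1001 len=245760 repl=2 [127.0.0.1:50010, 127.0.0.1:50011]'
echo "$fsck_line" | grep -o 'repl=[0-9]*'   # prints repl=2
```

With dfs.replication=2 and both datanodes healthy, fsck should list two locations per block; asking for more replicas than there are datanodes just leaves blocks reported as under-replicated.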
Re: Re: running mapreduce
Hi,

It is CDH 5.0.0, Hadoop 2.3.0. I found the unavailable process disappeared this morning, but it appears again on the map/reduce server after running MapReduce:

# jps
15371 Jps
2269 QuorumPeerMain
15306 -- process information unavailable
11295 DataNode
11455 NodeManager

# ps -ef | grep java
Only the three correct processes are shown, without 15306.

dwld0...@gmail.com
Re: Web interface does not show applications
No problem. Thanks, Wangda, for following up on this issue!