Re: Hadoop Multi-node cluster on Windows
Thanks, but that is almost a five-year-old tutorial, written when Hadoop was not natively supported on Windows. As of Hadoop 2.x, Windows is supported and setup is much simpler; no Cygwin is required.

From: Ted Yu yuzhih...@gmail.com
To: common-u...@hadoop.apache.org, user@hadoop.apache.org
Date: 06/04/2014 10:05 PM
Subject: Re: Hadoop Multi-node cluster on Windows

See also http://ebiquity.umbc.edu/Tutorials/Hadoop/05%20-%20Setup%20SSHD.html

On Wed, Jun 4, 2014 at 6:43 PM, wadood.chaudh...@instinet.com wrote:

Thanks. A Linux cluster requires SSH and so on for everything, including a single node. On Windows, I have so far been able to set up to the pseudo-distributed level without SSH or Cygwin. Someone was saying that some Windows service needs to be installed for communication across machines. Is that true?

- Message from Olivier Renault on 06/04/2014 09:36:17 PM -
To: user@hadoop.apache.org
Subject: RE: Hadoop Multi-node cluster on Windows

HDP runs natively on Linux and Windows; it is the same code base. You can install a Windows cluster in the same way as you do for Linux.

Olivier

On 4 Jun 2014 18:23, wadood.chaudh...@instinet.com wrote:

Yes, but Hortonworks runs Hadoop under a Linux virtual machine. I am talking about pure and simple Hadoop.

- Message from Ted Yu on 06/04/2014 09:18:51 PM -
To: common-u...@hadoop.apache.org
Subject: Re: Hadoop Multi-node cluster on Windows

Have you seen this?
http://hortonworks.com/blog/install-hadoop-windows-hortonworks-data-platform-2-0/

Cheers

On Wed, Jun 4, 2014 at 5:47 PM, wadood.chaudh...@instinet.com wrote:

Has anyone successfully implemented a multi-node cluster under Windows using the latest version of Hadoop? We were able to run pseudo-distributed. Does it still need SSH under Windows, or something installed as a service?
From: ch huang justlo...@gmail.com
To: user@hadoop.apache.org
Date: 06/04/2014 08:40 PM
Subject: issue about move data between two hadoop clusters

hi, maillist:
My company signed with a new IDC, and I need to move a total of 50T of data from the old Hadoop cluster to the new cluster in the new location. How do I do it?
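[A hedged sketch, not from the thread: DistCp is Hadoop's standard tool for bulk copies between clusters, and for 50T you would tune the number of copy mappers. The NameNode hostnames (old-nn, new-nn), port, and paths below are made-up placeholders.]

```shell
# Illustrative only: hostnames and paths are assumptions, not values from
# the thread. DistCp runs a MapReduce job that copies files in parallel:
#   -update  skip files already present at the destination
#   -p       preserve permissions/ownership
#   -m       number of copy mappers (raise it for a 50T transfer)
distcp_cmd="hadoop distcp -m 100 -update -p hdfs://old-nn:8020/data hdfs://new-nn:8020/data"
echo "$distcp_cmd"
```

On a live pair of clusters you would run the command itself (typically from the destination cluster) rather than just printing it.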
Re: Hadoop Multi-node cluster on Windows
Currently, as far as I know, only Hortonworks' distribution comes for Windows. Hadoop as such does not need SSH to work; SSH on Linux was needed for the automated deployments. If you want to avoid that, then I would say set up a single machine with a complete installation, where all packages and correct configurations are present, then make an image out of it and build the other nodes. But then you will need to log in to all nodes and start all services manually.

Hortonworks' HDP 2.1 on Windows step-by-step manual install guide is here: http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1-Win-latest/bk_releasenotes_HDP-Win/content/index.html

On Thu, Jun 5, 2014 at 11:29 AM, wadood.chaudh...@instinet.com wrote:

Thanks, but that is almost a five-year-old tutorial, written when Hadoop was not natively supported on Windows. As of Hadoop 2.x, Windows is supported and setup is much simpler; no Cygwin is required.
Re: Hadoop Multi-node cluster on Windows
If you have installed a single-node setup on Windows, then it should be easy to move to multi-node. I have never tried this myself on Windows, but this is what I do on Linux, and it should work on Windows as well:

- In the current node, instead of having anything as localhost or 127.0.0.1, point it to the real hostname or a static IP address.
- Make an image out of it, and launch similar instances using that image.
- Disable Windows security on all nodes.
- Make sure all nodes can communicate with each other over the network.
- After that, log in to each node and start the services there, and it should come up as a distributed node.

On Thu, Jun 5, 2014 at 11:59 AM, wadood.chaudh...@instinet.com wrote:

Thanks for looking into it. I have used Hortonworks, but that is a third-party tool and should not be required when Apache has started supporting Windows. Here is the Apache wiki article on Windows: https://wiki.apache.org/hadoop/Hadoop2OnWindows

I am able to set up in pseudo-distributed mode using the instructions above. Unfortunately, the article ends there, saying that instructions for a multi-node cluster will be added later. There is no mention that you cannot do it under Windows or that you must use Hortonworks.

Wadood

From: Nitin Pawar nitinpawar...@gmail.com
To: user@hadoop.apache.org
Date: 06/05/2014 02:09 AM
Subject: Re: Hadoop Multi-node cluster on Windows

Currently, as far as I know, only Hortonworks' distribution comes for Windows.
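[A hedged illustration of the "nothing pointing at localhost" step: the hostname master-node and port 9000 are made up, not values from the thread. The sketch writes a minimal core-site.xml and checks that no loopback address slipped in, since cloned worker nodes that keep localhost here would each talk to themselves.]

```shell
# Write a minimal core-site.xml whose fs.defaultFS names the real master
# host ("master-node" is a placeholder), then verify no loopback address
# remains before imaging the node.
cat > core-site.xml <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master-node:9000</value>
  </property>
</configuration>
EOF
# grep -c prints the number of matching lines; 0 means the file is clean.
loopbacks=$(grep -c -e localhost -e 127.0.0.1 core-site.xml || true)
echo "loopbacks=$loopbacks"
```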
Re: (Very) newbie questions
Are you able to compile it using mvn compile -Pnative -Dmaven.test.skip=true?

Raj K Singh
http://in.linkedin.com/in/rajkrrsingh
http://www.rajkrrsingh.blogspot.com
Mobile Tel: +91 (0)9899821370

On Wed, Jun 4, 2014 at 4:01 AM, Christian Convey christian.con...@gmail.com wrote:

I'm completely new to Hadoop, and I'm trying to build it for the first time. I cloned the Git repository, and I'm building the tag version release-2.4.0. I get a clean mvn compile -Pnative, but I'm getting a few unit tests failing when I run mvn test. Does anyone know to what extent failing Hadoop (Common) unit tests are a serious issue? I.e., can I have a healthy Hadoop build while still having had a few unit tests fail?
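[A hedged aside on the build question: a few failing unit tests do not by themselves mean the packaged artifacts are broken, and a common first step is to package while skipping tests, then re-run just the failing class. The test class name below is a made-up placeholder, not one from the thread.]

```shell
# Sketch, not gospel: -DskipTests compiles and packages without executing
# unit tests; -Dtest=<Class> re-runs a single test class afterwards.
build_cmd="mvn clean package -Pnative -DskipTests"
retest_cmd="mvn test -Dtest=SomeFailingTest"   # placeholder class name
echo "$build_cmd"
```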
Re: how can i monitor Decommission progress?
Use:

  $ hadoop dfsadmin -report

Raj K Singh
http://in.linkedin.com/in/rajkrrsingh
http://www.rajkrrsingh.blogspot.com
Mobile Tel: +91 (0)9899821370

On Sat, May 31, 2014 at 11:26 AM, ch huang justlo...@gmail.com wrote:

hi, maillist:
I decommissioned three nodes out of my cluster, but the question is: how can I see the decommission progress? I can only see the admin state from the web UI.
Re: how can i monitor Decommission progress?
The NameNode web UI provides that information. On the main web UI, click the link associated with decommissioning nodes.

Sent from phone

On Jun 5, 2014, at 10:36 AM, Raj K Singh rajkrrsi...@gmail.com wrote:

Use: $ hadoop dfsadmin -report
Re: how can i monitor Decommission progress?
But it cannot show me how much is already done.

On Fri, Jun 6, 2014 at 2:56 AM, Suresh Srinivas sur...@hortonworks.com wrote:

The NameNode web UI provides that information. On the main web UI, click the link associated with decommissioning nodes.
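[For watching progress from the command line, one rough option is to count datanodes still mid-decommission in the dfsadmin report. The report excerpt below is a fabricated sample so the filtering can be shown; on a live cluster you would pipe real hadoop dfsadmin -report output through the same grep.]

```shell
# Fabricated sample of `hadoop dfsadmin -report` output. Real reports also
# carry per-node block counts, which is the closest thing to "how much is
# already done" for a node being decommissioned.
report='Name: 10.0.0.11:50010
Decommission Status : Decommission in progress
Name: 10.0.0.12:50010
Decommission Status : Decommissioned
Name: 10.0.0.13:50010
Decommission Status : Normal'
in_progress=$(printf '%s\n' "$report" | grep -c 'Decommission in progress')
echo "in_progress=$in_progress"
```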
issue about change NN node
hi, maillist:
I want to replace the NameNode in my Hadoop cluster (no NN HA, no secondary NN). How can I do this?
NN recovery from metadata backup
Hi,

I am trying NameNode metadata recovery. I have taken a backup of the NameNode and JournalNode metadata; it contains edit logs and fsimages. There are two NNs in my system. I take a backup of the metadata on both NNs (HDFS metadata and QJM metadata) at a regular frequency.

I want to test the recovery procedure in a worst-case scenario. Assume both NNs and the JournalNode are down with the metadata completely deleted. I want to recover the NN metadata from backup and start the NN. I know that there could be data loss, as the latest changes done after the backup would be missing. Do you think such a scenario is possible/feasible? I am facing some issues related to txn id mismatch and committed txn id. Please tell me if there is a solution for the same.

Steps tried:
1. Take a metadata backup of the NN and QJM.
2. Do some HDFS file operations (create new files).
3. Stop the NN and JournalNode on both machines.
4. Delete the metadata from the /data/hdfs and journal directories.
5. Restore the fsimages from the backup (taken some time back).
6. Start the NN. It fails with the exceptions below.

Alternative: restore all the edit logs and fsimage to both the HDFS and QJM directories and start the NN, but it still fails. Both NNs are down and I can't bring them up. I don't want to format HDFS, as it will change the cluster ID and the backup won't be usable.

Exceptions:
- There appears to be a gap in the edit log. We expected txid 71453, but got txid 71466
- Client trying to move committed txid backward from 71599 to 71453
- recoverUnfinalizedSegments failed for required journal. Decided to synchronize log to startTxId: 71453 but logger 10.204.64.26:8485 had seen txid 71599 committed
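[The first exception can be reasoned about from file names alone: finalized edit segments under the NN/JN current/ directory are named edits_<starttxid>-<endtxid>, so a hole between one segment's end and the next segment's start is exactly a "gap in the edit log". The segment names below are fabricated to mirror the txids in the error; this is a toy diagnostic, not a recovery procedure.]

```shell
# Toy gap detector over edit-log segment names (fabricated so the restored
# backup ends at txid 71452 while the journals resume at 71466).
segments='edits_0000000000071000-0000000000071452
edits_0000000000071466-0000000000071599'
prev_end=''
gap=0
while IFS= read -r seg; do
  start=${seg#edits_}; start=${start%%-*}
  end=${seg##*-}
  # Strip leading zeros so the txids compare as decimal numbers.
  start=$(printf '%s\n' "$start" | sed 's/^0*//')
  end=$(printf '%s\n' "$end" | sed 's/^0*//')
  if [ -n "$prev_end" ] && [ "$start" -ne "$((prev_end + 1))" ]; then
    gap=1    # this segment does not start right after the previous one ended
  fi
  prev_end=$end
done <<EOF
$segments
EOF
echo "gap=$gap"
```

The errors quoted above say the journals have already seen txids newer than the restored fsimage, which suggests the NN and JN directories must be restored from the same consistent point in time for this style of recovery to have a chance.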
Re: How can a task know if its running as a MR1 or MR2 job?
Hi,

I guess when we call the submit() method, it in turn calls the setUseNewAPI() method to set the MR2-relevant settings. Here is what I found after a code walk-through of Hadoop:

  private void setUseNewAPI() throws IOException {
    int numReduces = conf.getNumReduceTasks();
    String oldMapperClass = "mapred.mapper.class";
    String oldReduceClass = "mapred.reducer.class";
    conf.setBooleanIfUnset("mapred.mapper.new-api",
        conf.get(oldMapperClass) == null);
    if (conf.getUseNewMapper()) {
      String mode = "new map API";
      ensureNotSet("mapred.input.format.class", mode);
      ensureNotSet(oldMapperClass, mode);
      if (numReduces != 0) {
        ensureNotSet("mapred.partitioner.class", mode);
      } else {
        ensureNotSet("mapred.output.format.class", mode);
      }
    } else {
      String mode = "map compatability";
      ensureNotSet("mapreduce.inputformat.class", mode);
      ensureNotSet("mapreduce.map.class", mode);
      if (numReduces != 0) {
        ensureNotSet("mapreduce.partitioner.class", mode);
      } else {
        ensureNotSet("mapreduce.outputformat.class", mode);
      }
    }
    if (numReduces != 0) {
      conf.setBooleanIfUnset("mapred.reducer.new-api",
          conf.get(oldReduceClass) == null);
      if (conf.getUseNewReducer()) {
        String mode = "new reduce API";
        ensureNotSet("mapred.output.format.class", mode);
        ensureNotSet(oldReduceClass, mode);
      } else {
        String mode = "reduce compatability";
        ensureNotSet("mapreduce.outputformat.class", mode);
        ensureNotSet("mapreduce.reduce.class", mode);
      }
    }
  }

Raj K Singh
http://in.linkedin.com/in/rajkrrsingh
http://www.rajkrrsingh.blogspot.com
Mobile Tel: +91 (0)9899821370

On Tue, Jun 3, 2014 at 5:34 PM, Michael Segel msegel_had...@hotmail.com wrote:

Just a quick question. Suppose you have an M/R job running. How does the Mapper or Reducer task know or find out if it's running as an M/R 1 or M/R 2 job? I would suspect the job context would hold that information, but on first glance I didn't see it. So what am I missing?

Thx

-Mike
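[A hedged note on Michael's actual question, which is about the MR1 vs. MR2 runtime rather than the old vs. new API: as far as I know, the usual signal is the mapreduce.framework.name configuration key, where "yarn" means MR2 and "classic"/"local" (or the key being absent on older clusters) means MR1. Inside a task that would be context.getConfiguration().get("mapreduce.framework.name"). The sketch below just demonstrates the lookup against a fabricated mapred-site.xml.]

```shell
# Fabricated mapred-site.xml; a real client would have this on its classpath.
cat > mapred-site.xml <<'EOF'
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF
# Crude value extraction for illustration (real code would use the
# Configuration API, not sed).
framework=$(sed -n 's/.*<value>\(.*\)<\/value>.*/\1/p' mapred-site.xml)
echo "framework=$framework"
```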