Hi All, I am doing a test migration from Apache Hadoop-1.2.0 to Apache Hadoop-2.0.6-alpha on a single-node environment.
I did the following:
* Installed Apache Hadoop-1.2.0.
* Ran the word count sample MR jobs. The jobs executed successfully.
* Stopped all the services in Apache Hadoop-1.2.0 and was then able to start all the services again.
* The previously submitted jobs are visible after the stop/start in the JobTracker URL.

Next I installed Apache Hadoop-2.0.6-alpha alongside. In its configuration files I used the SAME data directory locations that were in Apache Hadoop-1.2.0, namely:

core-site.xml
----------------
hadoop.tmp.dir = /home/cloud/hadoop_migration/hadoop-data/tempdir

hdfs-site.xml
-----------------
dfs.data.dir = /home/cloud/hadoop_migration/hadoop-data/data
dfs.name.dir = /home/cloud/hadoop_migration/hadoop-data/name

I am UNABLE to start the NameNode from the Apache Hadoop-2.0.6-alpha installation. I am getting the error:

2013-12-03 18:28:23,941 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2013-12-03 18:28:24,080 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2013-12-03 18:28:24,081 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system started
2013-12-03 18:28:24,576 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one image storage directory (dfs.namenode.name.dir) configured. Beware of dataloss due to lack of redundant storage directories!
2013-12-03 18:28:24,576 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one namespace edits storage directory (dfs.namenode.edits.dir) configured. Beware of dataloss due to lack of redundant storage directories!
2013-12-03 18:28:24,744 INFO org.apache.hadoop.util.HostsFileReader: Refreshing hosts (include/exclude) list
2013-12-03 18:28:24,749 INFO org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
2013-12-03 18:28:24,762 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: dfs.block.access.token.enable=false
2013-12-03 18:28:24,762 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: defaultReplication = 1
2013-12-03 18:28:24,762 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: maxReplication = 512
2013-12-03 18:28:24,762 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: minReplication = 1
2013-12-03 18:28:24,763 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: maxReplicationStreams = 2
2013-12-03 18:28:24,763 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: shouldCheckForEnoughRacks = false
2013-12-03 18:28:24,763 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: replicationRecheckInterval = 3000
2013-12-03 18:28:24,763 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: encryptDataTransfer = false
2013-12-03 18:28:24,771 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner = cloud (auth:SIMPLE)
2013-12-03 18:28:24,771 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup = supergroup
2013-12-03 18:28:24,771 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled = true
2013-12-03 18:28:24,771 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: HA Enabled: false
2013-12-03 18:28:24,776 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Append Enabled: true
2013-12-03 18:28:25,230 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names occuring more than 10 times
2013-12-03 18:28:25,243 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
2013-12-03 18:28:25,244 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
2013-12-03 18:28:25,244 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.namenode.safemode.extension = 30000
2013-12-03 18:28:25,288 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /home/cloud/hadoop_migration/hadoop-data/name/in_use.lock acquired by nodename 21...@impetus-942.impetus.co.in
2013-12-03 18:28:25,462 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode metrics system...
2013-12-03 18:28:25,462 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system stopped.
2013-12-03 18:28:25,473 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system shutdown complete.
2013-12-03 18:28:25,474 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
org.apache.hadoop.hdfs.server.common.IncorrectVersionException: Unexpected version of storage directory /home/cloud/hadoop_migration/hadoop-data/name. Reported: -41. Expecting = -40.
    at org.apache.hadoop.hdfs.server.common.Storage.setLayoutVersion(Storage.java:1079)
    at org.apache.hadoop.hdfs.server.common.Storage.setFieldsFromProperties(Storage.java:887)
    at org.apache.hadoop.hdfs.server.namenode.NNStorage.setFieldsFromProperties(NNStorage.java:583)
    at org.apache.hadoop.hdfs.server.common.Storage.readProperties(Storage.java:918)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:304)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:200)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:627)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:469)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:403)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:437)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:609)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:594)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1169)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1235)
2013-12-03 18:28:25,479 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2013-12-03 18:28:25,481 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at Impetus-942.impetus.co.in/192.168.41.106
************************************************************/

Independently, both installations (Apache Hadoop-1.2.0 and Apache Hadoop-2.0.6-alpha) work for me: I am able to run MR jobs on either installation on its own. But my aim is to migrate the data and the submitted jobs from Apache Hadoop-1.2.0 to Apache Hadoop-2.0.6-alpha. Are there any HDFS compatibility issues between Apache Hadoop-1.2.0 and Apache Hadoop-2.0.6-alpha?
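For what it's worth, the IncorrectVersionException above is raised while the NameNode reads the VERSION file under the name directory (dfs.name.dir/current/VERSION, a Java-properties-style file) and compares its layoutVersion field against the layout version the running code expects. A minimal sketch of that check, assuming the usual VERSION file layout (the layoutVersion value is taken from the error above; the other field values are made-up placeholders):

```python
# Sketch: reproduce the NameNode's storage layout-version check by hand.
# Assumes the standard VERSION file format: key=value lines, '#' comments.

def parse_version_file(text):
    """Parse a Hadoop storage VERSION file (Java properties style)."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

def check_layout(props, expected):
    """Mimic the comparison that raises IncorrectVersionException."""
    reported = int(props["layoutVersion"])
    if reported != expected:
        return ("Unexpected version of storage directory. "
                "Reported: %d. Expecting = %d." % (reported, expected))
    return "layout version OK"

# Illustrative VERSION contents as written by a Hadoop 1.2.0 NameNode;
# namespaceID/cTime are placeholders, layoutVersion matches the error log.
sample = """\
#Tue Dec 03 18:28:25 IST 2013
namespaceID=123456789
cTime=0
storageType=NAME_NODE
layoutVersion=-41
"""

props = parse_version_file(sample)
print(check_layout(props, expected=-40))
```

Reading the two numbers this way suggests the name directory was written with a newer on-disk layout (-41, from the 1.2 line) than the 2.0.6-alpha NameNode understands (-40), i.e. this particular 2.x release cannot read or upgrade that metadata; if that reading is right, a 2.x release whose supported layout version is at or beyond -41 would be needed for the upgrade path.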
Thanks,
-Nirmal

From: Nirmal Kumar
Sent: Wednesday, November 27, 2013 2:56 PM
To: user@hadoop.apache.org; rd...@iastate.edu
Subject: RE: Any reference for upgrade hadoop from 1.x to 2.2

Hello Sandy,

The post was useful and gave an insight into the migration. I am doing a test migration from Apache Hadoop-1.2.0 to Apache Hadoop-2.0.6-alpha on a single-node environment. I have Apache Hadoop-1.2.0 up and running. Can you please let me know the steps one should follow for the migration? I am thinking of doing something like:
* Install Apache Hadoop-2.0.6-alpha alongside the existing Apache Hadoop-1.2.0
* Use the same HDFS locations
* Change the various required configuration files
* Stop Apache Hadoop-1.2.0 and start Apache Hadoop-2.0.6-alpha
* Verify all the services are running
* Test via MapReduce (test MRv1 and MRv2 examples)
* Check the Web UI console and verify the MRv1 and MRv2 jobs

In a cluster environment these steps would need to be performed on all the nodes. The translation table mapping old configuration options to new ones would definitely be *very* useful. The existing Hadoop ecosystem components also need to be considered:
* Hive scripts
* Pig scripts
* Oozie workflows

Their compatibility and version support would need to be checked. I am also thinking of risks, such as data loss, that one should keep in mind.

Also I found: http://strataconf.com/strata2014/public/schedule/detail/32247

Thanks,
-Nirmal

From: Robert Dyer [mailto:psyb...@gmail.com]
Sent: Friday, November 22, 2013 9:08 PM
To: user@hadoop.apache.org
Subject: Re: Any reference for upgrade hadoop from 1.x to 2.2

Thanks Sandy! These seem helpful!

"MapReduce cluster configuration options have been split into YARN configuration options, which go in yarn-site.xml; and MapReduce configuration options, which go in mapred-site.xml. Many have been given new names to reflect the shift. ... We'll follow up with a full translation table in a future post."
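In the meantime, the handful of renames relevant to the directories discussed at the top of this thread can be sketched as a simple lookup. This is a partial, unofficial mapping, not the full translation table; in Hadoop 2.x the old property names are still accepted but are logged as deprecated:

```python
# Sketch: old (1.x) -> new (2.x) property names for a few core/HDFS options.
# Partial and unofficial -- only the properties mentioned in this thread.

RENAMED = {
    "fs.default.name": "fs.defaultFS",        # core-site.xml
    "dfs.name.dir": "dfs.namenode.name.dir",  # hdfs-site.xml
    "dfs.data.dir": "dfs.datanode.data.dir",  # hdfs-site.xml
}

def translate(old_conf):
    """Return the same settings keyed by the 2.x property names."""
    return {RENAMED.get(key, key): value for key, value in old_conf.items()}

# The name/data directories from the configuration shown earlier:
old = {
    "dfs.name.dir": "/home/cloud/hadoop_migration/hadoop-data/name",
    "dfs.data.dir": "/home/cloud/hadoop_migration/hadoop-data/data",
}
print(translate(old))
```

Unknown keys pass through unchanged here, which matches how 2.x treats properties that were not renamed (hadoop.tmp.dir, for example, kept its name).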
This type of translation table mapping old configuration to new would be *very* useful!

- Robert

On Fri, Nov 22, 2013 at 2:15 AM, Sandy Ryza <sandy.r...@cloudera.com> wrote:

For MapReduce and YARN, we recently published a couple of blog posts on migrating:
http://blog.cloudera.com/blog/2013/11/migrating-to-mapreduce-2-on-yarn-for-users/
http://blog.cloudera.com/blog/2013/11/migrating-to-mapreduce-2-on-yarn-for-operators/

hope that helps,
Sandy

On Fri, Nov 22, 2013 at 3:03 AM, Nirmal Kumar <nirmal.ku...@impetus.co.in> wrote:

Hi All,

I am also looking into migrating/upgrading from Apache Hadoop 1.x to Apache Hadoop 2.x. I didn't find any doc/guide/blog for this, although there are guides/docs for the CDH and HDP migrations/upgrades from Hadoop 1.x to Hadoop 2.x. Would referring to those be of some use? I am looking for similar guides/docs for Apache Hadoop 1.x to Apache Hadoop 2.x.

I found something on SlideShare, though I am not sure how useful it is going to be; I still need to verify that.
http://www.slideshare.net/mikejf12/an-example-apache-hadoop-yarn-upgrade

Any suggestions/comments will be of great help.

Thanks,
-Nirmal

From: Jilal Oussama [mailto:jilal.ouss...@gmail.com]
Sent: Friday, November 08, 2013 9:13 PM
To: user@hadoop.apache.org
Subject: Re: Any reference for upgrade hadoop from 1.x to 2.2

I am looking for the same thing; if anyone can point us in a good direction, please do. Thank you. (Currently running Hadoop 1.2.1)

2013/11/1 YouPeng Yang <yypvsxf19870...@gmail.com>

Hi users,

Are there any reference docs that introduce how to upgrade Hadoop from 1.x to 2.2?

Regards

________________________________
NOTE: This message may contain information that is confidential, proprietary, privileged or otherwise protected by law. The message is intended solely for the named addressee.
If received in error, please destroy and notify the sender. Any use of this email is prohibited when received in error. Impetus does not represent, warrant and/or guarantee, that the integrity of this communication has been maintained nor that the communication is free of errors, virus, interception or interference.