Hi Ilayaraja,
Hadoop ships with a utility, distcp, that should be able to copy data between
clusters running different versions of Hadoop. I am not sure, though, whether
it can read data from Hadoop 15.5. More information is available at:
http://hadoop.apache.org/common/docs/r0.19.1/distcp.html#cpver
That page does not say which versions are supported, but it could be worth a
try.
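As a rough sketch of what such a copy might look like: the section linked
above describes reading the source cluster over the read-only hftp://
filesystem and running distcp on the destination (newer) cluster. The small
Python helper below only assembles that command line. The host names, ports
and paths are placeholders I made up (50070 is the usual default for
dfs.http.address; 8020 is just an assumed NameNode RPC port), and I have not
verified that a 15.5 NameNode can actually be read this way.

```python
import shlex


def build_distcp_command(src_namenode_http, dst_namenode_rpc, src_path, dst_path):
    """Assemble a `hadoop distcp` command that reads the old cluster over HFTP.

    src_namenode_http: host:port of the old NameNode's HTTP address
                       (dfs.http.address), hypothetical value supplied by caller.
    dst_namenode_rpc:  host:port of the new NameNode's RPC address,
                       hypothetical value supplied by caller.
    """
    # Source is read-only HFTP; destination is a regular HDFS URI.
    src_uri = f"hftp://{src_namenode_http}{src_path}"
    dst_uri = f"hdfs://{dst_namenode_rpc}{dst_path}"
    return ["hadoop", "distcp", src_uri, dst_uri]


if __name__ == "__main__":
    # Placeholder hosts, ports and paths; replace with your own cluster details.
    cmd = build_distcp_command(
        "old-nn.example.com:50070",
        "new-nn.example.com:8020",
        "/user/data",
        "/user/data",
    )
    # Print the command so it can be reviewed before being run on the new cluster.
    print(" ".join(shlex.quote(part) for part in cmd))
```

The printed command would be run from the newer cluster, since HFTP cannot be
written to.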
You could also set up a local environment, migrate your HDFS data from the
older version to the newer one there, and then move it to Amazon EMR.
Regards,
Sagar
-----Original Message-----
From: ilayaraja [mailto:[email protected]]
Sent: Sunday, March 21, 2010 12:53 PM
To: hadoop-user; hadoop-dev
Subject: Hadoop Compatibility and EMR
Hi,
We've been using Hadoop 15.5 in our production environment, where we have
about 10 TB of data stored on the DFS.
The files were generated as MapReduce output. We want to move our environment
to Amazon Elastic MapReduce (EMR), which raises the following questions:
1. EMR supports only Hadoop 19.0 and above. Is it possible to use the existing
data, which was generated with Hadoop 15.5, from Hadoop 19.0?
2. Otherwise, how can we upgrade from Hadoop 15.5 to Hadoop 19.0? What issues
should we expect while doing so?
Regards,
Ilayaraja