MRv1 / MRv2 interoperability question

2014-04-11 Thread David Rosenstrauch
I'm in the process of migrating our Hadoop setup from MRv1 to MRv2 and have a question about interoperability. We run our Hadoop clusters in the cloud (AWS) in a transient fashion: we start up clusters when needed, push all output from HDFS to S3, and shut the clusters down when done.
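The HDFS-to-S3 push described above is commonly done with distcp; a minimal sketch, assuming a hypothetical bucket name and the s3n:// scheme used by Hadoop 2.x clients of that era:

    # Copy job output out of the transient cluster before shutting it down.
    # Assumes fs.s3n.awsAccessKeyId / fs.s3n.awsSecretAccessKey are set in core-site.xml.
    hadoop distcp hdfs:///user/output s3n://my-bucket/output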

RE: Hadoop 2.2.0-cdh5.0.0-beta-1 - MapReduce Streaming - Failed to run on a larger jobs

2014-04-11 Thread Phan, Truong Q
I could not find the attempt_1395628276810_0062_m_000149_0 attempt* logs in the HDFS /tmp directory. Where can I find these log files? Thanks and Regards, Truong Phan | P +61 2 8576 5771 | M +61 4 1463 7424 | E troung.p...@team.telstra.com | W www.telstra.com
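Task attempt logs do not live in HDFS /tmp by default; on YARN they sit on each NodeManager's local disks unless log aggregation is enabled. A hedged sketch, deriving the application ID from the attempt ID above:

    # Works only when yarn.log-aggregation-enable=true; otherwise look under
    # the yarn.nodemanager.log-dirs directories on the node that ran the attempt.
    yarn logs -applicationId application_1395628276810_0062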

Re: InputFormat and InputSplit - Network location name contains /:

2014-04-11 Thread Patcharee Thongtra
Hi Harsh, Many thanks! I got rid of the problem by updating the InputSplit's getLocations() to return hosts. Patcharee On 04/11/2014 06:16 AM, Harsh J wrote: Do not use the InputSplit's getLocations() API to supply your file path; it is not intended for such things, if that's what you've
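A minimal sketch of the fix being described, with hypothetical names (MyInputSplit, filePath): getLocations() should return the hostnames of the datanodes holding the split, which the scheduler uses for locality, never a file path:

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.Writable;
    import org.apache.hadoop.mapreduce.InputSplit;

    public class MyInputSplit extends InputSplit implements Writable {
        private String filePath;  // carry the path as ordinary split state
        private String[] hosts;   // datanode hostnames, e.g. {"node1", "node2"}

        public MyInputSplit() {}  // no-arg constructor needed for deserialization

        public MyInputSplit(String filePath, String[] hosts) {
            this.filePath = filePath;
            this.hosts = hosts;
        }

        @Override
        public String[] getLocations() throws IOException {
            return hosts;  // hostnames only; never the file path
        }

        @Override
        public long getLength() throws IOException {
            return 0;  // split size unknown in this sketch
        }

        @Override
        public void write(DataOutput out) throws IOException {
            // Locations are consumed client-side by the scheduler,
            // so only the path travels with the serialized split.
            Text.writeString(out, filePath);
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            filePath = Text.readString(in);
        }
    }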

Number of map task

2014-04-11 Thread Patcharee Thongtra
Hi, I wrote a custom InputFormat. When I ran a Pig script Load function using this InputFormat, the number of InputSplits was greater than 1, but there was only one map task handling all these splits. Does the number of map tasks not correspond to the number of splits? I think the job would be done quicker if
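Ordinarily MapReduce schedules one map task per InputSplit; a common reason for seeing fewer maps under Pig is that Pig combines small splits into a single map by default. A hedged sketch of disabling that behavior (property name as used by Pig's split-combination feature; verify against your Pig version):

    -- Put this at the top of the Pig script to get one map per split.
    SET pig.noSplitCombination true;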

Re: how can i archive old data in HDFS?

2014-04-11 Thread Peyman Mohajerian
There is http://hadoop.apache.org/docs/r1.2.1/hadoop_archives.html, but I'm not sure whether it compresses the data or not. On Thu, Apr 10, 2014 at 9:57 PM, Stanley Shi s...@gopivotal.com wrote: AFAIK, no tools now. Regards, Stanley Shi On Fri, Apr 11, 2014 at 9:09 AM, ch huang
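For reference, Hadoop Archives (HAR) pack many small files into one archive but do not compress their contents. A minimal sketch of creating one, with hypothetical paths:

    # Packs everything under /data/old into /archives/old.har; files stay uncompressed.
    hadoop archive -archiveName old.har -p /data/old /archives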

Which Hadoop 2.x .jars are necessary for Apache Commons VFS HDFS access?

2014-04-11 Thread Roger Whitcomb
Hi, I'm fairly new to Hadoop, but not to Apache, and I'm having a newbie kind of issue browsing HDFS files. I have written an Apache Commons VFS (Virtual File System) browser for the Apache Pivot GUI framework (I'm the PMC Chair for Pivot: full disclosure). And now I'm trying to get this
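The usual Hadoop 2.x answer to the jar question is to depend on the hadoop-client aggregator, which pulls in hadoop-common, hadoop-hdfs, and their runtime dependencies transitively. A minimal sketch as a Maven dependency (the version is an assumption):

    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>2.2.0</version>
    </dependency>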

RE: Which Hadoop 2.x .jars are necessary for Apache Commons VFS HDFS access?

2014-04-11 Thread david marion
Hi Roger, I wrote the HDFS provider for Commons VFS. I went back and looked at the source and tests, and I don't see anything wrong with what you are doing. I did develop it against Hadoop 1.1.2 at the time, so there might be an issue that is not accounted for with Hadoop 2. It was also not

RE: Which Hadoop 2.x .jars are necessary for Apache Commons VFS HDFS access?

2014-04-11 Thread david marion
Also, make sure that the jars on the classpath actually contain the HDFS file system. I'm looking at "No FileSystem for scheme: hdfs", which is an indicator of this condition. Dave
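In Hadoop 2.x, FileSystem implementations are registered through META-INF/services entries inside each jar, so a repackaged jar or an incomplete classpath can lose the hdfs mapping even when the classes themselves are present. A hedged sketch of working around "No FileSystem for scheme: hdfs" by naming the implementation explicitly (hostname is hypothetical):

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    public class HdfsSchemeCheck {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Name the implementation class directly instead of relying on the
            // META-INF/services entry that a repackaged jar may have lost.
            conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
            FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020/"), conf);
            System.out.println(fs.getUri());  // prints the resolved hdfs URI
        }
    }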

RE: Which Hadoop 2.x .jars are necessary for Apache Commons VFS HDFS access?

2014-04-11 Thread Roger Whitcomb
Hi Dave, Thanks for the responses. I guess I have a small question then: what exact class(es) would it be looking for that it can't find? I have all the .jar files I mentioned below on the classpath, and it is loading and executing stuff in the org.apache.hadoop.fs.FileSystem class

RE: Which Hadoop 2.x .jars are necessary for Apache Commons VFS HDFS access?

2014-04-11 Thread dlmarion
If memory serves me, it's in the hadoop-hdfs.jar file.
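The class VFS needs to resolve for the hdfs scheme is org.apache.hadoop.hdfs.DistributedFileSystem, which ships in hadoop-hdfs. A quick hedged check (jar version is an assumption):

    # Confirm the class is actually in the jar you put on the classpath.
    jar tf hadoop-hdfs-2.2.0.jar | grep DistributedFileSystem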

Resetting dead datanodes list

2014-04-11 Thread Ashwin Shankar
Hi, Hadoop-1's namenode UI displays dead datanodes even if those instances are terminated and are no longer part of the cluster. Is there a way to reset the dead datanode list without bouncing the namenode? This would help in my script (which would run nightly), which parses the HTML
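One commonly suggested approach (an assumption, not confirmed in this thread): maintain a dfs.hosts include file listing only live nodes and ask the namenode to re-read it, which drops the terminated instances from the report without a restart:

    # Hadoop-1: remove terminated hosts from the dfs.hosts include file, then:
    hadoop dfsadmin -refreshNodes
    # A nightly script may also find the CLI report easier to parse than the HTML UI:
    hadoop dfsadmin -report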