Re: Support for zipped input files

2009-03-10 Thread Ken Weiner
Thanks very much, Tom. You saved me a lot of time by confirming that it isn't available yet. I'll go vote for HADOOP-1824. On Tue, Mar 10, 2009 at 3:23 AM, Tom White t...@cloudera.com wrote: Hi Ken, Unfortunately, Hadoop doesn't yet support MapReduce on zipped files (see
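For context, HADOOP-1824 asks for zip archives as MapReduce input. Until it lands, one workaround is a custom RecordReader that unwraps the archive itself. Below is a minimal sketch of that idea against the old (0.19-era) API, assuming whole-file splits and one (entry name, entry bytes) record per zip entry; the class name and record layout are illustrative, not part of HADOOP-1824.

    // Illustrative sketch only (not from the thread): each zip entry becomes one record.
    // Assumes the owning InputFormat returns whole-file splits (isSplitable == false).
    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.util.zip.ZipEntry;
    import java.util.zip.ZipInputStream;

    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.BytesWritable;
    import org.apache.hadoop.io.IOUtils;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.FileSplit;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.RecordReader;

    public class ZipEntryRecordReader implements RecordReader<Text, BytesWritable> {
      private final ZipInputStream zip;
      private boolean done = false;

      public ZipEntryRecordReader(JobConf conf, FileSplit split) throws IOException {
        Path path = split.getPath();
        FileSystem fs = path.getFileSystem(conf);
        FSDataInputStream in = fs.open(path);
        zip = new ZipInputStream(in);
      }

      public boolean next(Text key, BytesWritable value) throws IOException {
        ZipEntry entry = zip.getNextEntry();
        if (entry == null) { done = true; return false; }
        key.set(entry.getName());
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        IOUtils.copyBytes(zip, buf, 4096, false); // stops at the end of this entry
        byte[] bytes = buf.toByteArray();
        value.set(bytes, 0, bytes.length);
        return true;
      }

      public Text createKey() { return new Text(); }
      public BytesWritable createValue() { return new BytesWritable(); }
      public long getPos() throws IOException { return 0; }
      public float getProgress() { return done ? 1.0f : 0.0f; }
      public void close() throws IOException { zip.close(); }
    }

A matching FileInputFormat subclass would override isSplitable to return false and hand back this reader from getRecordReader.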

Re: HDFS is corrupt, need to salvage the data.

2009-03-10 Thread Mayuran Yogarajah
lohit wrote: How many DataNodes do you have? From the output it looks like, at the point when you ran fsck, you had only one DataNode connected to your NameNode. Did you have others? Also, I see that your default replication is set to 1. Can you check whether your DataNodes are up and running?
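For anyone hitting the same situation, the usual way to answer those two questions from the command line is sketched below; the flags shown are from the 0.19-era tools, and hadoop-site.xml is where dfs.replication lives in that layout.

    # How many DataNodes does the NameNode currently see (live vs. dead)?
    bin/hadoop dfsadmin -report

    # Re-run fsck with more detail to see exactly which files and blocks are affected.
    bin/hadoop fsck / -files -blocks -locations

With dfs.replication set to 1 there is only a single copy of each block, so any block stored only on a DataNode that is down will show up as missing or corrupt until that node rejoins the cluster.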

streaming inputformat: class not found

2009-03-10 Thread t-alleyne
Hello, I'm trying to run a MapReduce job on a data file in which the keys and values alternate rows. E.g. key1 value1 key2 ... I've written my own InputFormat by extending FileInputFormat (the code for this class is below). The problem is that when I run hadoop streaming with the command
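A common cause of a class-not-found error in streaming is that the jar containing the custom InputFormat never makes it onto the job's classpath. A hedged sketch of an invocation, assuming the class is com.example.KeyValueRowInputFormat packaged in kvrows.jar (both names are made up here) and that the streaming build accepts the generic -libjars option; failing that, the jar can be added to HADOOP_CLASSPATH or dropped into $HADOOP_HOME/lib on all nodes.

    # Generic options (-libjars) go first; -inputformat takes the fully qualified class name.
    bin/hadoop jar contrib/streaming/hadoop-*-streaming.jar \
        -libjars kvrows.jar \
        -inputformat com.example.KeyValueRowInputFormat \
        -input  /data/alternating-rows \
        -output /data/out \
        -mapper /bin/cat \
        -reducer /bin/cat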

Re: HDFS is corrupt, need to salvage the data.

2009-03-10 Thread Raghu Angadi
Mayuran Yogarajah wrote: lohit wrote: How many DataNodes do you have? From the output it looks like, at the point when you ran fsck, you had only one DataNode connected to your NameNode. Did you have others? Also, I see that your default replication is set to 1. Can you check if your

Why is a large number of [(heavy) key, (light) value] pairs faster than [(light) key, (heavy) value]?

2009-03-10 Thread Gyanit
I have a large number of key/value pairs, and I don't actually care whether the data goes in the key or in the value. Let me be more exact: the number of (k, v) pairs after the combiner is about 1 million, and I have approximately 1 KB of data for each pair. I can put it in the keys or in the values. I have experimented with both options, (heavy key, light value) vs
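To make the two layouts concrete, here is a minimal old-API mapper sketch of the comparison being described; the tab-separated parsing and the ~1 KB payload are placeholders, not the poster's actual data.

    import java.io.IOException;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;

    // Illustrative only: the same ~1 KB payload emitted either in the key or in the value.
    public class HeavyKeyOrValueMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, Text> {

      private static final boolean HEAVY_KEY = true;  // flip to compare the two layouts

      public void map(LongWritable offset, Text line,
                      OutputCollector<Text, Text> out, Reporter reporter) throws IOException {
        // Placeholder parsing: a small id, a tab, then a ~1 KB payload.
        String record = line.toString();
        int tab = record.indexOf('\t');
        if (tab < 0) { return; }
        String id = record.substring(0, tab);
        String payload = record.substring(tab + 1);

        if (HEAVY_KEY) {
          out.collect(new Text(payload), new Text(id));   // heavy key, light value
        } else {
          out.collect(new Text(id), new Text(payload));   // light key, heavy value
        }
      }
    }

Only keys are compared during the sort/shuffle, so moving the payload between key and value changes what the framework has to compare and spill, which is what the experiment above is probing.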

Re: Jobs stalling forever

2009-03-10 Thread Amareshwari Sriramadasu
This is due to HADOOP-5233, which got fixed in branch 0.19.2. -Amareshwari Nathan Marz wrote: Every now and then, I have jobs that stall forever with one map task remaining. The last remaining map task says it is at 100% and, in the logs, it says it is in the process of committing. However, the task