Re: some map run really slow

2011-08-12 Thread wd
Thanks, io.sort.mb is 200M here; I'll do more tests on it. On Fri, Aug 12, 2011 at 8:47 PM, Florin P wrote: > Hello! > We've encountered slow map processes when sending large amounts of data > from the mapper process to the reducer process. The output of the map process > will be first w

Re: Creating a custom composite key

2011-08-12 Thread Anthony Urso
This is fairly common. Just write your key as a Java class, implement WritableComparable, and do the right thing with your compareTo() and hashCode()/equals() methods. SecondarySort.IntPair in the examples may be inspirational. On Fri, Aug 12, 2011 at 3:49 PM, Roger Chen wrote: > Hi all, > Is an
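
A minimal sketch of the composite key described above, assuming a hypothetical class name TextPair and String fields. To compile without Hadoop on the classpath it implements plain Comparable; in a real job the class would be declared "implements WritableComparable<TextPair>", and the write()/readFields() methods below already match the Writable signatures. The roundTrip() helper is illustrative only and not part of any Hadoop API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Objects;

// Hypothetical composite key holding two strings. In a Hadoop job, declare it
// as "implements WritableComparable<TextPair>" instead of Comparable.
final class TextPair implements Comparable<TextPair> {
    private String first = "";
    private String second = "";

    // Hadoop creates key instances reflectively, so keep a no-arg constructor.
    TextPair() {}

    TextPair(String first, String second) {
        this.first = first;
        this.second = second;
    }

    String getFirst() { return first; }
    String getSecond() { return second; }

    // Same signature as Writable.write(DataOutput).
    // Note: writeUTF is limited to strings that encode to under 64 KB.
    public void write(DataOutput out) throws IOException {
        out.writeUTF(first);
        out.writeUTF(second);
    }

    // Same signature as Writable.readFields(DataInput); reads fields in the
    // same order that write() emitted them.
    public void readFields(DataInput in) throws IOException {
        first = in.readUTF();
        second = in.readUTF();
    }

    // Sort by the first field, then break ties on the second.
    @Override
    public int compareTo(TextPair other) {
        int cmp = first.compareTo(other.first);
        return cmp != 0 ? cmp : second.compareTo(other.second);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof TextPair)) return false;
        TextPair p = (TextPair) o;
        return first.equals(p.first) && second.equals(p.second);
    }

    // Must agree with equals(): the default hash partitioner uses hashCode()
    // to pick the reducer, so equal keys have to hash identically.
    @Override
    public int hashCode() {
        return Objects.hash(first, second);
    }

    @Override
    public String toString() { return first + "\t" + second; }

    // Illustrative helper: serialize a key and read it back into a new instance.
    static TextPair roundTrip(TextPair key) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            key.write(new DataOutputStream(bytes));
            TextPair copy = new TextPair();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With a key like this, a mapper would emit something along the lines of context.write(new TextPair("a", "b"), new LongWritable(1)) and register the class with job.setMapOutputKeyClass(TextPair.class); the field values here are made up for illustration.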

Creating a custom composite key

2011-08-12 Thread Roger Chen
Hi all, Is anybody familiar with how to define a custom composite key in a format such as (Text, Text), so that the key and value written to the context could be ((Text, Text), LongWritable)? Thanks, -- Roger Chen UC Davis Genome Center

Possible to override the context.write() method in ReduceContext?

2011-08-12 Thread Ross Nordeen
Using 0.20.2... Is it possible to override the context.write() method in ReduceContext? I have an entire set of Reducers that I would like to all use a specific function just before every context.write() but I don't want them to worry about that logic, just to have it handled transparently. Fo

Re: some map run really slow

2011-08-12 Thread Florin P
Hello! We've encountered slow map processes when sending large amounts of data from the mapper process to the reducer process. The output of the map process will be first written into a buffer whose size is given by the io.sort.mb property defined in core-site.xml. Its default value is 100.
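
To make the buffer sizing above concrete, here is a rough back-of-the-envelope sketch. It assumes the map-side buffer starts spilling to disk once it is filled to the fraction given by io.sort.spill.percent (0.80 by default in this Hadoop generation), and it ignores the part of the buffer reserved for record metadata, so real spill counts will typically be somewhat higher. The class and method names are hypothetical, and the 1 GiB map output is a made-up figure.

```java
// Rough estimate of map-side spills; an approximation, not Hadoop's own logic.
final class SpillEstimate {

    // mapOutputBytes: serialized map output produced by one map task.
    // ioSortMb:       value of io.sort.mb, in megabytes.
    // spillPercent:   value of io.sort.spill.percent (assumed 0.80 by default).
    static long estimateSpills(long mapOutputBytes, int ioSortMb, double spillPercent) {
        if (ioSortMb <= 0 || spillPercent <= 0.0 || spillPercent > 1.0) {
            throw new IllegalArgumentException("invalid buffer settings");
        }
        if (mapOutputBytes <= 0) {
            return 0;
        }
        // Bytes the buffer can hold before a spill is triggered.
        long thresholdBytes = (long) (ioSortMb * 1024L * 1024L * spillPercent);
        // Ceiling division: a partially filled buffer is still flushed once.
        return (mapOutputBytes + thresholdBytes - 1) / thresholdBytes;
    }

    public static void main(String[] args) {
        long oneGiB = 1024L * 1024L * 1024L; // hypothetical map output size
        System.out.println("io.sort.mb=100: " + estimateSpills(oneGiB, 100, 0.80));
        System.out.println("io.sort.mb=200: " + estimateSpills(oneGiB, 200, 0.80));
    }
}
```

Under these assumptions, doubling io.sort.mb roughly halves the number of spill files a map task has to merge, which is why the setting is a common first knob for slow maps with large outputs.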

Re: some map run really slow

2011-08-12 Thread wd
Yes, there is a reducer. In fact, I'm using Hive to run map reduce jobs, and the reducer is a Perl script. The data sent to the reducer is about 1/3 or 1/4 of the map input data. On Fri, Aug 12, 2011 at 5:26 PM, Florin P wrote: > Hello! > Did you have a reducer class involved? If yes, what is the amount

Re: some map run really slow

2011-08-12 Thread Florin P
Hello! Did you have a reducer class involved? If yes, what is the amount of data that you are sending from the mapper to the reducer? Regards, Florin --- On Fri, 8/12/11, wd wrote: > From: wd > Subject: some map run really slow > To: mapreduce-user@hadoop.apache.org > Date: Friday, August 12,

Context.getInputSplit() returns null

2011-08-12 Thread Vegar Hatlevik
I am using hadoop 0.20.2 on CDH. I am trying to get the filename of the file currently being processed. I will extract some information from the filename which will determine the data processing to be performed. I want to do this because I need to process a lar

Re:Re: Status: FAILED Error: null, What's this problem?

2011-08-12 Thread 谭军
Hi Mostafa, you were right: the port number has changed to 50070. But I don't know how to read the userlog, and I did not get any helpful information from it. Meta VERSION="1" . Job JOBID="job_201108111938_0002" JOBNAME="Retrieval" USER="root" SUBMIT_TIME="1313135011123" JOBCONF="hdfs://cx1:9000/home/cx/softw

some map run really slow

2011-08-12 Thread wd
Hi, Here are the logs for the map runs in one map/reduce job. map1 ran for 1 min 2 sec and processed 48572 rows 2011-08-12 01:34:28,313 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Initializing Self 10 MAP 2011-08-12 01:34:28,313 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Initializing Self 0 T