Here is the Jira issue, and the beginning of a patch:
https://issues.apache.org/jira/browse/MAPREDUCE-4866
There is indeed a limitation on the byte array size (around Integer.MAX_VALUE).
Maybe we could use BigArrays to overcome this limitation?
What do you think?
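For context on the BigArrays idea: libraries such as fastutil's BigArrays emulate arrays larger than 2 GB as an array of fixed-size segments indexed by a long. A minimal self-contained sketch of that technique (this is an illustration of the idea, not the fastutil API):

```java
// Sketch of a "big array": split a long index into segment + offset,
// sidestepping Java's int-limited array length.
public class BigByteArray {
    static final int SEGMENT_BITS = 20;              // 1 MiB segments
    static final int SEGMENT_SIZE = 1 << SEGMENT_BITS;
    final byte[][] segments;

    BigByteArray(long size) {
        // Round up to a whole number of segments.
        int n = (int) ((size + SEGMENT_SIZE - 1) >> SEGMENT_BITS);
        segments = new byte[n][SEGMENT_SIZE];
    }

    byte get(long i) {
        return segments[(int) (i >> SEGMENT_BITS)][(int) (i & (SEGMENT_SIZE - 1))];
    }

    void set(long i, byte v) {
        segments[(int) (i >> SEGMENT_BITS)][(int) (i & (SEGMENT_SIZE - 1))] = v;
    }

    public static void main(String[] args) {
        BigByteArray a = new BigByteArray(4L * SEGMENT_SIZE);
        a.set(3L * SEGMENT_SIZE + 7, (byte) 42);
        System.out.println(a.get(3L * SEGMENT_SIZE + 7)); // prints 42
    }
}
```

The segment size here is kept small so the demo runs anywhere; a real implementation would use much larger segments.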
regards
Olivier
On Dec 6.
Hi Oleg,
Speculative tasks are launched as extra TaskAttempts in MR jobs.
And, if no reducer class is set, MR launches the default Reducer
class (IdentityReducer).
Thanks,
Tsuyoshi
On Sun, Dec 9, 2012 at 11:53 PM, Oleg Zhurakousky
oleg.zhurakou...@gmail.com wrote:
I am studying user logs on the two
Hi All
I got the exception below. Is the issue related to
https://issues.apache.org/jira/browse/MAPREDUCE-1182 ?
I am using CDH3U1.
2012-12-10 06:22:39,688 FATAL org.apache.hadoop.mapred.Task:
attempt_201211120903_9197_r_24_0 : Map output copy failure :
java.lang.OutOfMemoryError: Java heap
In ReduceTask.java we have the code below:

maxSize = (int)(conf.getInt("mapred.job.reduce.total.mem.bytes",
    (int)Math.min(Runtime.getRuntime().maxMemory(),
                  Integer.MAX_VALUE))
    * maxInMemCopyUse);
But in the patch:
maxSize = (long)Math.min(
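The int-versus-long difference matters because a byte count narrowed to int keeps only the low 32 bits and wraps once the value passes Integer.MAX_VALUE. A self-contained sketch of the arithmetic (illustrative values, not the actual patch):

```java
// Illustrates why shuffle byte counts must be tracked as long:
// narrowing a >2 GB long to int keeps only the low 32 bits.
public class IntOverflowSketch {
    public static void main(String[] args) {
        long threeGb = 3L * 1024 * 1024 * 1024;  // a plausible reducer heap
        int truncated = (int) threeGb;           // wraps to -1073741824
        System.out.println(truncated);
        // Clamping before the cast avoids the wrap, but caps the usable
        // value at Integer.MAX_VALUE (~2 GB):
        int capped = (int) Math.min(threeGb, Integer.MAX_VALUE);
        System.out.println(capped);              // prints 2147483647
    }
}
```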
MR launches multiple attempts for a single Task in case of TaskAttempt failures
or when speculative execution is turned on. In either case, a given Task will
only ever have one successful TaskAttempt whose output will be accepted
(committed).
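For reference, speculative execution can be toggled per job in the MRv1-era configuration (these property names are from that generation of Hadoop; the values shown are illustrative):

```xml
<!-- mapred-site.xml: both properties default to true -->
<property>
  <name>mapred.map.tasks.speculative.execution</name>
  <value>true</value>
</property>
<property>
  <name>mapred.reduce.tasks.speculative.execution</name>
  <value>false</value>
</property>
```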
Number of reduces is set to 1 by default in
Not familiar with your apr stuff, but you should capture the getJobStatus()
method instead of getAllJobs(). getJobStatus() is what is called for
individual jobs; getAllJobs() is called only when you try to list jobs.
Thanks,
+Vinod Kumar Vavilapalli
Hortonworks Inc.
http://hortonworks.com/
On Dec
Are you seeing any performance impact with this cache increase? It is
normal on Linux systems for the page cache to grow large.
-Bharath
From: Andy Isaacson a...@cloudera.com
To: user@hadoop.apache.org
Sent: Monday, December 10, 2012 11:23 AM
Subject: Re: Strange
What was the job or query you were running?
A couple of suggestions:
1. Reduce the data set size with job chaining.
2. Increase the Reduce task heap.
3. If you are using Hive/Pig, you may want to tune your query.
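On the heap suggestion: in that generation of Hadoop (MRv1/CDH3), the map and reduce child JVM heap is set via mapred.child.java.opts. A hypothetical fragment, with the -Xmx value as a placeholder to size for your nodes:

```xml
<!-- mapred-site.xml -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx1024m</value>
</property>
```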
-Bharath
From: Manoj Babu manoj...@gmail.com
To:
Yes, there is a performance impact. It should be visible in the graph I
attached. Basically, the CPU is spending much more time in System, and
User time is lowered.
When this happens (if I don't do a drop_caches in time), the MR job winds
up taking significantly longer than usual.
On Mon,
On Sun, Dec 9, 2012 at 5:45 AM, a...@hsk.hk a...@hsk.hk wrote:
Hi,
I always set vm.swappiness = 0 on my Hadoop servers (PostgreSQL
servers too).
I have just done this for that machine. So far, I have not seen a
recurrence of the strange behavior; it appears this might have solved
the
However, in the case Oleg is talking about, the attempts are:
attempt_201212051224_0021_m_00_0
attempt_201212051224_0021_m_02_0
attempt_201212051224_0021_m_03_0
These aren't multiple attempts of a single task, are they? They are
actually different tasks. If they were multiple
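This is easy to check mechanically: attempt IDs have the shape attempt_&lt;cluster timestamp&gt;_&lt;job&gt;_&lt;m|r&gt;_&lt;task&gt;_&lt;attempt&gt;, and two IDs name attempts of the same task only when everything before the final attempt counter matches. A small sketch (the zero-padded IDs below are assumed full forms of the truncated ones above):

```java
// Hypothetical helper: strip the trailing attempt counter from an
// attempt ID to recover the task identity.
public class AttemptIds {
    static String taskOf(String attemptId) {
        return attemptId.substring(0, attemptId.lastIndexOf('_'));
    }

    public static void main(String[] args) {
        String a = "attempt_201212051224_0021_m_000000_0";
        String b = "attempt_201212051224_0021_m_000002_0";
        // Different task numbers, so these are different tasks:
        System.out.println(taskOf(a).equals(taskOf(b))); // prints false
    }
}
```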
Hi all!
What are the best practices for starting MR jobs?
Currently I start my jobs via cron.
I also start jobs from an internal timer in my Java application (which is
similar to starting them via cron).
What other approaches are there for starting MR jobs?
Perhaps some best practices apply here?
--
Best regards,
Ivan
Hello Ivan,
Instead of running it as a cron job, you can launch MR jobs through
Apache Oozie, the workflow engine for MapReduce, Pig, etc. For more
details you can visit the Oozie home page at: oozie.apache.org
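To make the Oozie suggestion concrete, a cron-like daily schedule is expressed with a coordinator. A hypothetical fragment where the name, dates, and workflow path are all placeholders:

```xml
<!-- Oozie coordinator running a workflow once a day, replacing a cron entry -->
<coordinator-app name="daily-mr-job" frequency="${coord:days(1)}"
                 start="2012-12-11T00:00Z" end="2013-12-11T00:00Z"
                 timezone="UTC" xmlns="uri:oozie:coordinator:0.2">
  <action>
    <workflow>
      <app-path>${nameNode}/user/ivan/apps/my-mr-workflow</app-path>
    </workflow>
  </action>
</coordinator-app>
```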
Regards,
Mohammad Tariq
On Tue, Dec 11, 2012 at 11:32 AM, Ivan Ryndin