The CapacityScheduler provides exactly this. Set up 2 queues with appropriate
capacities for each:
http://hadoop.apache.org/common/docs/r1.0.0/capacity_scheduler.html
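A minimal sketch of that setup for Hadoop 1.0 (the queue names "prod" and "adhoc" and the 70/30 split are assumptions for illustration): enable the scheduler and declare the queues in mapred-site.xml, then give each queue a capacity in capacity-scheduler.xml.

```xml
<!-- mapred-site.xml: enable the CapacityScheduler and name the queues -->
<property>
  <name>mapred.jobtracker.taskScheduler</name>
  <value>org.apache.hadoop.mapred.CapacityTaskScheduler</value>
</property>
<property>
  <name>mapred.queue.names</name>
  <value>prod,adhoc</value>
</property>

<!-- capacity-scheduler.xml: percentage of cluster capacity per queue -->
<property>
  <name>mapred.capacity-scheduler.queue.prod.capacity</name>
  <value>70</value>
</property>
<property>
  <name>mapred.capacity-scheduler.queue.adhoc.capacity</name>
  <value>30</value>
</property>
```

Jobs then pick a queue at submission time, e.g. with -Dmapred.job.queue.name=adhoc.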
Arun
On Jan 17, 2012, at 10:57 PM, edward choi wrote:
Hi,
I often run into situations like this:
I am running a very heavy
Does it mean that on an average 1 file has only 2 blocks ( with
replication=1 ) ?
On 1/18/12, M. C. Srivas mcsri...@gmail.com wrote:
Konstantin's paper
http://www.usenix.org/publications/login/2010-04/openpdfs/shvachko.pdf
mentions that on average a file consumes about 600 bytes of memory in
The problem I've run into more than memory is having the system CPU
time get out of control. My guess is that the threshold for what is
considered overloaded is going to be dependent on your system setup,
what you're running on it, and what bounds your jobs.
On Tue, Jan 17, 2012 at 22:06,
Hi,
I have a small cluster of 4 datanodes, all datanodes are running version 0.20.
I am trying to decommission one of the nodes however I am seeing the DFS usage
fluctuate from 2.07 - 2.09 GB and then it drops back down to 2.07, the block
count also fluctuates in a similar pattern. It has been
It worked, thank you, Harsh.
Mark
On Wed, Jan 18, 2012 at 1:16 AM, Harsh J ha...@cloudera.com wrote:
Ah, sorry about missing that. The settings would go in core-site.xml
(hdfs-site.xml will no longer be relevant once you switch to using
S3).
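For reference, the core-site.xml entries would look something like this (the bucket name and keys are placeholders; the fs.s3n.* credential properties apply to the native S3 filesystem):

```xml
<!-- core-site.xml: use an S3 bucket as the default filesystem -->
<property>
  <name>fs.default.name</name>
  <value>s3n://your-bucket</value>
</property>
<property>
  <name>fs.s3n.awsAccessKeyId</name>
  <value>YOUR_ACCESS_KEY</value>
</property>
<property>
  <name>fs.s3n.awsSecretAccessKey</name>
  <value>YOUR_SECRET_KEY</value>
</property>
```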
On 18-Jan-2012, at 12:36 PM, Mark Kerzner
I would strongly suggest using this method to read S3 only.
I have had problems with writing large volumes of data to S3 from Hadoop
using native s3fs. Supposedly a fix is on the way from Amazon (it is an
undocumented internal error being thrown). However, this fix is already 2
months late.
Awesome, Matt, thank you so much!
Mark
On Wed, Jan 18, 2012 at 10:53 AM, Matt Pouttu-Clarke
matt.pouttu-cla...@icrossing.com wrote:
I would strongly suggest using this method to read S3 only.
I have had problems with writing large volumes of data to S3 from Hadoop
using native
Hi,
Memory loading in most Linux distros is not readily available from
top or the usual suspects; in fact, looking at top is rather
misleading. Linux can run just fine with committed memory greater than
100%. What you want to look at is the % of committed memory relative to
the total memory.
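One quick way to see that ratio (an illustrative check, not part of the original message): /proc/meminfo exposes the committed total as Committed_AS and the ceiling as CommitLimit. Note that CommitLimit is only enforced when vm.overcommit_memory=2.

```shell
# Show committed memory vs. the commit limit from /proc/meminfo
grep -E '^(CommitLimit|Committed_AS)' /proc/meminfo
```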
The map tasks fail timing out after 600 sec.
I am processing one 9 GB file with 16,000,000 records. Each record (think
of it as a line) generates hundreds of key value pairs.
The job is unusual in that the output of the mapper, in terms of records or
bytes, is orders of magnitude larger than the
Sounds like mapred.task.timeout? The default is 10 minutes.
http://hadoop.apache.org/common/docs/current/mapred-default.html
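If the maps legitimately need that long, the timeout can be raised; the value is in milliseconds (the 30-minute figure below is just an example), set either in mapred-site.xml or per job:

```xml
<property>
  <name>mapred.task.timeout</name>
  <value>1800000</value>
</property>
```

Or on the command line at submission: -Dmapred.task.timeout=1800000.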
Thanks,
Tom
On Wed, Jan 18, 2012 at 2:05 PM, Steve Lewis lordjoe2...@gmail.com wrote:
The map tasks fail timing out after 600 sec.
I am processing one 9 GB file with
Does it always fail at the same place? Does the task log shows something
unusual?
On Wed, Jan 18, 2012 at 3:46 PM, Steve Lewis lordjoe2...@gmail.com wrote:
I KNOW it is a task timeout - what I do NOT know is WHY merely cutting the
number of writes causes it to go away. It seems to imply that
It always fails with a task timeout and that error gives me very little
indication of where the error occurs. The one piece of data I have is that
if I only call context.write 1 in 100 times it does not time out suggesting
that it is not MY code that is timing out.
I could try to time the write
Perhaps you are not reporting progress throughout your task. If you
happen to run a large enough job you hit the default timeout
mapred.task.timeout (which defaults to 10 min). Perhaps you should
consider reporting progress in your mapper/reducer by calling
progress() on the Reporter
1) I do a lot of progress reporting
2) Why would the job succeed when the only change in the code is
if(NumberWrites++ % 100 == 0)
context.write(key,value);
Comment out the test, allowing full writes, and the job fails.
Since every write is a report, I assume that something in the
Steve
Does the timeout happen for all the map jobs? Are you using some kind of shared
storage for map outputs? Any problems with the physical disks? If the shuffle
phase has started could the disks be I/O waiting between the read and write?
Raj
From: Steve
In my hands the problem occurs in all map jobs - an associate with a
different cluster (mine has 8 nodes, his 40) reports 80% of map tasks fail
with a few succeeding -
I suspect some kind of an I/O wait but fail to see how it gets to 600 sec
On Wed, Jan 18, 2012 at 4:50 PM, Raj V
But Steve, it is your code... :-)
Here is a simple test...
Set your code up where the run fails...
Add a simple timer to see how long you spend in the Mapper.map() method.
Only print out the time if it's greater than, let's say, 500 seconds...
The other thing is to update a dynamic counter in
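The timer suggestion above could be sketched in plain Java like this (the 500-second threshold and the stand-in map body are assumptions; a real version would wrap the body of Mapper.map()):

```java
// Minimal sketch: time a map() body and only log calls that exceed a threshold.
public class MapTimer {
    // ~500 seconds in milliseconds, per the suggestion above (an assumed value)
    public static final long THRESHOLD_MS = 500_000;

    // Wraps the map work, returns elapsed time, and logs only slow calls
    public static long timedMap(Runnable mapBody) {
        long start = System.nanoTime();
        mapBody.run();
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        if (elapsedMs > THRESHOLD_MS) {
            System.err.println("slow map() call: " + elapsedMs + " ms");
        }
        return elapsedMs;
    }

    public static void main(String[] args) {
        // Stand-in for real map work; a fast call should not be logged
        long ms = timedMap(() -> { });
        System.out.println("elapsed ms: " + ms);
    }
}
```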
You can try the following
- make it into a map only job (for debug purposes)
- start your shuffle phase after all the maps are complete (there is a
parameter for this)
- characterize your disks for performance
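The parameter Raj refers to for delaying the shuffle is presumably mapred.reduce.slowstart.completed.maps; setting it to 1.0 makes the reducers (and hence the shuffle) wait until every map has finished:

```xml
<property>
  <name>mapred.reduce.slowstart.completed.maps</name>
  <value>1.0</value>
</property>
```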
Raj
Sent from Samsung Mobile
Steve Lewis lordjoe2...@gmail.com wrote:
In my hands
Hi everyone,
Any ideas on how to tackle this kind of situation.
Thanks,
Praveenesh
On Tue, Jan 17, 2012 at 1:02 PM, praveenesh kumar praveen...@gmail.com wrote:
I have a replication factor of 2, because of the reason that I can not
afford 3 replicas on my cluster.
fsck output was saying block