Disclaimer: I'm a newbie!
Howdy,
Got a quick question. The -libjars option doesn't seem to work for me in
pretty much my first (or maybe second) MapReduce job.
Here's what I'm doing:
$ bin/hadoop jar sherlock.jar somepkg.FindSchoolsJob -libjars HStats-1A18.jar input output
sherlock.jar has my
I wrote the following code today. We have our own flow execution logic which
calls the following to collect counters.
enum COUNT_COLLECTION {
    LOG,         // log the counters
    ADD_TO_CONF  // add counters to JobConf
}
protected static void collectCounters(Runni
Hi,
I chain multiple jobs in my program. Job 1's reduce function has a counter.
I want Job 3's reduce function to read the value of that counter. How?
Thanks.
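One common approach, offered here as an assumption because the replies to this question are not shown, is to let the driver read the counter once Job 1 has completed and copy its value into Job 3's configuration before submitting Job 3. In Hadoop this would typically go through the job's counters and calls such as Configuration.setLong() and getLong(). The sketch below uses plain Java with a Map standing in for the job configuration; the class, method, and key names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class CounterHandoffDemo {
    // Hypothetical configuration key; any unique name would do.
    static final String COUNTER_KEY = "job1.reduce.counter";

    // Stand-in for Job 1: the "counter" is the number of values its reducer saw.
    static long runJob1(String[] reduceInputs) {
        long counter = 0;
        for (String ignored : reduceInputs) {
            counter++;
        }
        return counter;
    }

    // Driver side: after Job 1 completes, store its counter in Job 3's configuration.
    static Map<String, String> configureJob3(long job1Counter) {
        Map<String, String> conf = new HashMap<>();
        conf.put(COUNTER_KEY, Long.toString(job1Counter));
        return conf;
    }

    // Task side: Job 3's reducer reads the value back, defaulting to 0 if absent.
    static long readInJob3Reducer(Map<String, String> conf) {
        String raw = conf.get(COUNTER_KEY);
        return raw == null ? 0L : Long.parseLong(raw);
    }

    public static void main(String[] args) {
        long counter = runJob1(new String[] {"a", "b", "c"});
        Map<String, String> job3Conf = configureJob3(counter);
        System.out.println(readInJob3Reducer(job3Conf));
    }
}
```

This only works because the jobs run one after another: the counter value is final by the time Job 3 is configured.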
Hi,
On Thu, Dec 9, 2010 at 4:35 PM, Pedro Costa wrote:
> Hi,
>
> 1 - Hadoop MR contains a TaskMemoryManagerThread class that is used to
> manage memory usage of tasks running under a TaskTracker. Why does Hadoop
> MR need a class to manage memory? Why can't it rely on the JVM, or is
> this class h
Hi James,
The ReduceDriver is configured to receive a list of inputs because
lists have ordering guarantees whereas other Iterables/Collections do
not; for determinism's sake, it is best to guarantee that you're
calling reduce() with an ordered set of values when testing.
It would be stellar if y
> 2 - How does the JT know that a Map or Reduce Task has finished? Is it
> through the heartbeat?
Exactly. Tasks communicate with their TTs through the umbilical, and each
TT communicates with the JT via heartbeat (and heartbeat response).
Greg
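The two hops described above can be pictured with a toy model. This is a minimal sketch in plain Java, not Hadoop code: the class and method names are hypothetical, and the real umbilical and heartbeat protocols carry far more state. It only shows that the JobTracker learns about a finished task when the next heartbeat arrives, not at the moment the task reports to its TaskTracker.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class HeartbeatDemo {
    // Toy JobTracker: learns about finished tasks only from heartbeats.
    static class JobTracker {
        private final Set<String> finished = new HashSet<>();

        void heartbeat(List<String> completedTaskIds) {
            finished.addAll(completedTaskIds);
        }

        boolean knowsFinished(String taskId) {
            return finished.contains(taskId);
        }
    }

    // Toy TaskTracker: collects task reports and forwards them on the next heartbeat.
    static class TaskTracker {
        private final JobTracker jobTracker;
        private final List<String> pendingReports = new ArrayList<>();

        TaskTracker(JobTracker jobTracker) {
            this.jobTracker = jobTracker;
        }

        // Called by a task over the "umbilical" when it completes.
        void done(String taskId) {
            pendingReports.add(taskId);
        }

        // Periodic heartbeat: ship everything reported since the last one.
        void sendHeartbeat() {
            jobTracker.heartbeat(new ArrayList<>(pendingReports));
            pendingReports.clear();
        }
    }

    public static void main(String[] args) {
        JobTracker jt = new JobTracker();
        TaskTracker tt = new TaskTracker(jt);
        tt.done("task_m_000001");
        System.out.println(jt.knowsFinished("task_m_000001")); // before the heartbeat
        tt.sendHeartbeat();
        System.out.println(jt.knowsFinished("task_m_000001")); // after the heartbeat
    }
}
```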
Thanks everyone. It turned out that I was using the wrong port. The issue was
resolved.
From: hadoopman [mailto:hadoop...@gmail.com]
Sent: Monday, December 06, 2010 6:26 PM
To: mapreduce-user@hadoop.apache.org
Subject: Re: distcp just fails (was:distcp fails with ConnectException)
On 12/06/2010
Hello.
I'm unsure whether this is a bug or an oversight, but since I've not found any
reference to it anywhere, I figured I might bring it to light.
I've been using MultipleInputs for several of my MapReduce jobs, where I am
joining together different forms of data. However, I have encountered
Take a look at NLineInputFormat. You might want to use it in combination with
DistributedCache.
Sent from my iPhone
On Dec 9, 2010, at 5:02 AM, Narinder Kumar wrote:
> Hi All,
>
> We have a problem at hand which we would like to solve using distributed and
> parallel processing.
>
> Probl
For 1, TMMT uses ProcessTree to check for tasks that are running beyond
their memory limits and kills them.
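The check described here can be sketched roughly as follows. This is not the actual TaskMemoryManagerThread or ProcessTree code; it is a plain-Java illustration under the assumption that a task's usage is the sum over its root process and all descendant processes, which is one reason the JVM heap limit alone would not cover it. All names and data structures are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

class MemoryLimitDemo {
    // Sums the memory of a process and all of its descendants.
    static long cumulativeMemory(int pid,
                                 Map<Integer, List<Integer>> children,
                                 Map<Integer, Long> memoryByPid) {
        long total = memoryByPid.getOrDefault(pid, 0L);
        for (int child : children.getOrDefault(pid, Collections.<Integer>emptyList())) {
            total += cumulativeMemory(child, children, memoryByPid);
        }
        return total;
    }

    // Returns the root pids of tasks whose process tree exceeds its limit,
    // in ascending order. A real monitor would then kill these trees.
    static List<Integer> tasksOverLimit(Map<Integer, Long> limitByRootPid,
                                        Map<Integer, List<Integer>> children,
                                        Map<Integer, Long> memoryByPid) {
        List<Integer> offenders = new ArrayList<>();
        for (Map.Entry<Integer, Long> task : limitByRootPid.entrySet()) {
            long used = cumulativeMemory(task.getKey(), children, memoryByPid);
            if (used > task.getValue()) {
                offenders.add(task.getKey());
            }
        }
        Collections.sort(offenders);
        return offenders;
    }
}
```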
On Thu, Dec 9, 2010 at 3:05 AM, Pedro Costa wrote:
> Hi,
>
> 1 - Hadoop MR contains a TaskMemoryManagerThread class that is used to
> manage memory usage of tasks running under a TaskTracker. Why Ha
Hi All,
We have a problem at hand which we would like to solve using distributed and
parallel processing.
*Problem context* : We have a Map (Entity, Value). The entity can have a
parent which in turn will have its parent and so on till we reach the head.
I have to traverse this tree and do some c
Hi,
1 - Hadoop MR contains a TaskMemoryManagerThread class that is used to
manage memory usage of tasks running under a TaskTracker. Why does Hadoop
MR need a class to manage memory? Why can't it rely on the JVM, or is
this class here for another purpose?
2 - How does the JT know that a Map or Reduce
Hi,
This relates to a bug we had a while back.
When running a reducer, if you want to buffer the values, you normally need
to take a copy of each value as you iterate through them. This is because
the iterator always returns the same object but the contents of the object
get filled with each valu
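The object-reuse behaviour described in this message can be reproduced without Hadoop. The following is a minimal sketch in plain Java: MutableValue is a hypothetical stand-in for a mutable Writable, and reusingIterable() mimics an iterator that hands back the same instance on every call to next(). It shows why buffering references goes wrong and why copying each value fixes it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a mutable Writable; not a Hadoop class.
class MutableValue {
    String contents;

    MutableValue(String contents) {
        this.contents = contents;
    }

    MutableValue copy() {
        return new MutableValue(contents);
    }
}

class ValueReuseDemo {
    // Mimics the reduce-side iterator: one shared instance, refilled on each next().
    static Iterable<MutableValue> reusingIterable(final List<String> raw) {
        return () -> new Iterator<MutableValue>() {
            private final MutableValue shared = new MutableValue(null);
            private int index = 0;

            @Override
            public boolean hasNext() {
                return index < raw.size();
            }

            @Override
            public MutableValue next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                shared.contents = raw.get(index++);
                return shared;
            }
        };
    }

    // Buggy: every buffered reference points at the same shared object.
    static List<String> bufferWithoutCopy(List<String> raw) {
        List<MutableValue> buffer = new ArrayList<>();
        for (MutableValue value : reusingIterable(raw)) {
            buffer.add(value);
        }
        return contentsOf(buffer);
    }

    // Correct: take a copy of each value before buffering it.
    static List<String> bufferWithCopy(List<String> raw) {
        List<MutableValue> buffer = new ArrayList<>();
        for (MutableValue value : reusingIterable(raw)) {
            buffer.add(value.copy());
        }
        return contentsOf(buffer);
    }

    private static List<String> contentsOf(List<MutableValue> buffer) {
        List<String> out = new ArrayList<>();
        for (MutableValue value : buffer) {
            out.add(value.contents);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("a", "b", "c");
        System.out.println(bufferWithoutCopy(raw)); // prints [c, c, c]
        System.out.println(bufferWithCopy(raw));    // prints [a, b, c]
    }
}
```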
13 matches