-libjars?

2010-12-09 Thread Vipul Pandey
Disclaimer: a newbie!!! Howdy? Got a quick question. The -libjars option doesn't seem to work for me in pretty much my first (or maybe second) MapReduce job. Here's what I'm doing: $bin/hadoop jar sherlock.jar somepkg.FindSchoolsJob -libjars HStats-1A18.jar input output sherlock.jar has my

Re: How to share Same Counter in Multiple Jobs?

2010-12-09 Thread Ted Yu
I wrote the following code today. We have our own flow execution logic which calls the following to collect counters. enum COUNT_COLLECTION { LOG,// log the counters ADD_TO_CONF// add counters to JobConf } protected static void collectCounters(Runni

How to share Same Counter in Multiple Jobs?

2010-12-09 Thread Savannah Beckett
Hi,   I chain multiple jobs in my program.  Job 1's reduce function has a counter.  I want job 3's reduce function to read this Job 1's counter.  How?  Thanks.

Re: Memory Manager in Hadoop MR

2010-12-09 Thread Hemanth Yamijala
Hi, On Thu, Dec 9, 2010 at 4:35 PM, Pedro Costa wrote: > Hi, > > 1 - Hadoop MR contains a TaskMemoryManagerThread class that is used to > manage memory usage of tasks running under a TaskTracker. Why Hadoop > MR needs a class to manage memory? Why it couldn't rely on the JVM, or > this class is h

Re: Behaviour of reducer's Iterable in MR unit.

2010-12-09 Thread Aaron Kimball
Hi James, The ReduceDriver is configured to receive a list of inputs because lists have ordering guarantees whereas other Iterables/Collections do not; for determinism's sake, it is best to guarantee that you're calling reduce() with an ordered set of values when testing. It would be stellar if y

Re: Memory Manager in Hadoop MR

2010-12-09 Thread Greg Roelofs
> 2 - How the JT knows that a Map or Reduce Task finished? Is through > the heartbeat? Exactly. Tasks communicate with their TTs through the umbilical, and each TT communicates with the JT via heartbeat (and heartbeat response). Greg

RE: distcp just fails (was:distcp fails with ConnectException)

2010-12-09 Thread Deepika Khera
Thanks everyone. It turned out that I was using the wrong port. The issue was resolved. From: hadoopman [mailto:hadoop...@gmail.com] Sent: Monday, December 06, 2010 6:26 PM To: mapreduce-user@hadoop.apache.org Subject: Re: distcp just fails (was:distcp fails with ConnectException) On 12/06/2010

MultipleInputs and Paths Containing Commas

2010-12-09 Thread Ghigliotti, Matthew
Hello. I'm unsure whether this is a bug or an oversight, but since I've not found any reference to it anywhere, I figured I might bring it to light. I've been using MultipleInputs for several of my MapReduce jobs, where I am joining together different forms of data. However, I have encountered

Re: Map-Reduce Applicability With All-In Memory Data

2010-12-09 Thread Jason
Take a look at NLineInputFormat. You might want to use it in combination with DistributedCache. Sent from my iPhone On Dec 9, 2010, at 5:02 AM, Narinder Kumar wrote: > Hi All, > > We have a problem in hand which we would like to solve using Distributed and > Parallel Processing. > > Probl

Re: Memory Manager in Hadoop MR

2010-12-09 Thread Ted Yu
For 1, TMMT uses ProcessTree to check for tasks that are running beyond their memory limits and kills them. On Thu, Dec 9, 2010 at 3:05 AM, Pedro Costa wrote: > Hi, > > 1 - Hadoop MR contains a TaskMemoryManagerThread class that is used to > manage memory usage of tasks running under a TaskTracker. Why Ha

Map-Reduce Applicability With All-In Memory Data

2010-12-09 Thread Narinder Kumar
Hi All, We have a problem at hand which we would like to solve using Distributed and Parallel Processing. *Problem context* : We have a Map (Entity, Value). The entity can have a parent which in turn will have its parent and so on till we reach the head. I have to traverse this tree and do some c

Memory Manager in Hadoop MR

2010-12-09 Thread Pedro Costa
Hi, 1 - Hadoop MR contains a TaskMemoryManagerThread class that is used to manage memory usage of tasks running under a TaskTracker. Why does Hadoop MR need a class to manage memory? Why can't it rely on the JVM, or is this class here for another purpose? 2 - How does the JT know that a Map or Reduce

Behaviour of reducer's Iterable in MR unit.

2010-12-09 Thread James Hammerton
Hi, This relates to a bug we had a while back. When running a reducer, if you want to buffer the values, you normally need to take a copy of each value as you iterate through them. This is because the iterator always returns the same object but the contents of the object get filled with each valu
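The object-reuse behaviour described in this message can be illustrated without Hadoop at all. The sketch below is only an assumption-laden stand-in, not Hadoop or MRUnit code: `MutableValue` is a hypothetical class playing the role of a Writable, and `reusingIterable` is a hypothetical helper that imitates an iterator returning one shared, refilled instance. It shows why buffering the references directly loses data, while copying each value preserves it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ReusedValueDemo {

    /** Hypothetical mutable value, standing in for a Writable-like object. */
    static final class MutableValue {
        private String content = "";

        MutableValue() {}

        /** Copy constructor: takes a snapshot of the other value's content. */
        MutableValue(MutableValue other) { this.content = other.content; }

        void set(String content) { this.content = content; }

        String get() { return content; }
    }

    /** Hypothetical Iterable that returns ONE shared instance, refilled on each next(). */
    static Iterable<MutableValue> reusingIterable(List<String> raw) {
        final MutableValue shared = new MutableValue();
        return () -> {
            final Iterator<String> it = raw.iterator();
            return new Iterator<MutableValue>() {
                @Override public boolean hasNext() { return it.hasNext(); }

                @Override public MutableValue next() {
                    shared.set(it.next()); // overwrite the shared object's content
                    return shared;         // always the same reference
                }
            };
        };
    }

    /** Buggy buffering: stores the same reference repeatedly. */
    static List<String> bufferWithoutCopy(List<String> raw) {
        List<MutableValue> buffer = new ArrayList<>();
        for (MutableValue v : reusingIterable(raw)) {
            buffer.add(v);
        }
        return contents(buffer);
    }

    /** Correct buffering: stores a fresh copy of each value. */
    static List<String> bufferWithCopy(List<String> raw) {
        List<MutableValue> buffer = new ArrayList<>();
        for (MutableValue v : reusingIterable(raw)) {
            buffer.add(new MutableValue(v));
        }
        return contents(buffer);
    }

    private static List<String> contents(List<MutableValue> buffer) {
        List<String> out = new ArrayList<>();
        for (MutableValue v : buffer) {
            out.add(v.get());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("a", "b", "c");
        System.out.println(bufferWithoutCopy(raw)); // prints [c, c, c]
        System.out.println(bufferWithCopy(raw));    // prints [a, b, c]
    }
}
```

A test harness that passes an ordinary list of distinct objects to reduce() would not reproduce the first result, which may be why a bug of this kind can slip past unit tests.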