For the record, I figured this out.
The default task scheduler simply assigns tasks to trackers on a first-come,
first-served basis as the trackers' heartbeats are received, although at a
large enough scale this might tend to favour data-local and then rack-local
trackers.
I switched over to using FairS
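To make the heartbeat-driven behaviour described above concrete, here is a minimal toy model of it. It is only a sketch and not Hadoop's actual scheduler code: the names `Task`, `rack_of`, `assign_on_heartbeat` and `simulate` are hypothetical, and the rack is assumed to be derivable from host names of the form rack1node1.

```python
from collections import namedtuple

# Hypothetical toy model; these names are illustrative, not Hadoop classes.
Task = namedtuple("Task", ["task_id", "split_host"])


def rack_of(host):
    # Assumption: hosts are named like "rack1node1", so the rack is the
    # part before "node".
    return host.split("node")[0]


def assign_on_heartbeat(pending, tracker_host):
    """Pick one pending task for the tracker that just sent a heartbeat.

    Preference order: data-local, then rack-local, then whatever is next.
    Returns (task, locality), or (None, None) if nothing is pending.
    """
    if not pending:
        return None, None
    for task in pending:
        if task.split_host == tracker_host:
            pending.remove(task)
            return task, "data-local"
    for task in pending:
        if rack_of(task.split_host) == rack_of(tracker_host):
            pending.remove(task)
            return task, "rack-local"
    # No local candidate: hand out the oldest pending task anyway.
    return pending.pop(0), "non-local"


def simulate(tasks, heartbeats):
    """Serve heartbeats strictly in arrival order and log each assignment."""
    pending = list(tasks)
    log = []
    for host in heartbeats:
        task, locality = assign_on_heartbeat(pending, host)
        if task is not None:
            log.append((task.task_id, host, locality))
    return log
```

In this model, if both splits live on rack1node1 but heartbeats from other nodes arrive first, both tasks are handed out before rack1node1 ever reports in, so nothing runs data-local.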
You can also use HackReduce's datasets for this.
http://hackreduce.org/datasets
Regards
On 6/7/2011 1:56 PM, Jonathan Coveney wrote:
Have you taken a look at the O'Reilly Hadoop book? It deals
consistently with a weather dataset that is, I believe, largely available.
2011/6/7 Francesco
I found from Googling around that I should probably be seeing messages like
"Choosing data-local task" and "Choosing rack-local task" - from
JobInProgress::addRunningTaskToTIP(). (e.g. here:
http://mail-archives.apache.org/mod_mbox/hadoop-mapreduce-dev/201012.mbox/%3C2120373776.44711293532894724.Ja
Great!
Thank you Jonathan
2011/6/7 Jonathan Coveney
> Have you taken a look at the O'Reilly Hadoop book? It deals consistently
> with a weather dataset that is, I believe, largely available.
>
>
> 2011/6/7 Francesco De Luca
>
>> Hello Sean,
>>
>> not exactly. I mean some applications like word
Have you taken a look at the O'Reilly Hadoop book? It deals consistently
with a weather dataset that is, I believe, largely available.
2011/6/7 Francesco De Luca
> Hello Sean,
>
> not exactly. I mean some applications like word count or inverted index
> and the related input data.
>
> 2011/6/7
Hello Sean,
not exactly. I mean some applications like word count or inverted index and
the related input data.
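For reference, an inverted index of the kind mentioned above can be sketched in a few lines of plain Python that mimic the map, shuffle and reduce steps. This is a minimal illustration only, with no Hadoop involved; `map_phase`, `reduce_phase` and `inverted_index` are hypothetical names and the input documents are made up.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    # Emit one (word, doc_id) pair per distinct lowercase word in the document.
    for word in set(text.lower().split()):
        yield word, doc_id


def reduce_phase(word, doc_ids):
    # Collapse all document ids seen for a word into a sorted posting list.
    return word, sorted(set(doc_ids))


def inverted_index(documents):
    # The dict of lists stands in for the shuffle step, grouping pairs by word.
    grouped = defaultdict(list)
    for doc_id, text in documents.items():
        for word, emitted_id in map_phase(doc_id, text):
            grouped[word].append(emitted_id)
    return dict(reduce_phase(word, ids) for word, ids in grouped.items())
```

Calling `inverted_index({"d1": "hadoop map reduce", "d2": "hadoop scheduler"})` maps each word to the list of documents that contain it.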
2011/6/7 Sean Owen
> Not sure if it's quite what you mean, but Apache Mahout is essentially a set
> of Hadoop applications for machine learning, a bunch of runnable jobs (some
> with
Not sure if it's quite what you mean, but Apache Mahout is essentially a set
of Hadoop applications for machine learning, a bunch of runnable jobs (some
with example data too).
mahout.apache.org
On Tue, Jun 7, 2011 at 3:54 PM, Francesco De Luca wrote:
> Where can I find some Hadoop MapReduce app
Where can I find some Hadoop MapReduce application examples (other than word
count) with associated input files?
Thanks
Harsh, thanks for the clarification.
But my mappers always seem to run elsewhere. Here's an example with 2
splits, both on rack1node1, but the 2 mappers get started on other nodes.
Could the "choosing a non-local task" message be significant?
I have actually read through the JobTracker source, bu