Hello,
I am trying to write a simple sorting application for Hadoop. This is what
I have come up with so far. Suppose I have 100 lines of data and 10 mappers; each
of the 10 mappers will sort the data given to it. What I am unable to figure out is
how to join these outputs into one big sorted arra
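
For illustration only, the joining step asked about here is a k-way merge of runs that are already sorted. The sketch below is plain Python rather than Hadoop code; `sort_chunk` and `merge_sorted_runs` are hypothetical names standing in for the per-mapper sort and the final merge.

```python
import heapq


def sort_chunk(lines):
    """Stand-in for one mapper: sort only the lines it was given."""
    return sorted(lines)


def merge_sorted_runs(runs):
    """k-way merge of already-sorted runs into one sorted list."""
    return list(heapq.merge(*runs))


# 100 made-up lines in reverse order, split across 10 simulated mappers.
lines = [f"line-{n:03d}" for n in range(100, 0, -1)]
chunks = [lines[i:i + 10] for i in range(0, len(lines), 10)]

runs = [sort_chunk(chunk) for chunk in chunks]
merged = merge_sorted_runs(runs)
```

The merge only ever compares the current head of each run, so it never has to re-sort the full data set.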
Hello Everybody,
I have a small question. I want to know how one would implement
divide and conquer algorithms in Hadoop. For example, suppose I want to implement
a merge sort of 100 lines in Hadoop. There will be 10 mappers, each sorting 10 lines.
Now comes the tough part
In the tradition
Hi Abhishek,
If you use input lines as your output keys in map, Hadoop internals
will do the work for you and the keys will appear in sorted order in
your reduce (you can use IdentityReducer). This needs a slight
adjustment if your input lines aren't unique.
If you have R reducers, this will crea
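
A rough simulation of the idea above, in plain Python rather than the Hadoop API: the map step emits each input line as a key, the framework's shuffle is imitated by sorting and grouping on that key, and the reduce step passes keys through. How to handle non-unique lines is not spelled out in the message, so writing each key once per value received is an assumption here, and all function names are hypothetical.

```python
from itertools import groupby


def map_phase(lines):
    """Emit each input line as the key, with a dummy value."""
    return [(line, 1) for line in lines]


def shuffle(pairs):
    """Imitate the shuffle: sort by key, then group values per key."""
    ordered = sorted(pairs, key=lambda kv: kv[0])
    return [(key, [value for _, value in group])
            for key, group in groupby(ordered, key=lambda kv: kv[0])]


def reduce_phase(grouped):
    """Pass keys through; repeat a key once per value (assumed
    adjustment so that duplicate input lines are not lost)."""
    output = []
    for key, values in grouped:
        output.extend([key] * len(values))
    return output


lines = ["pear", "apple", "fig", "apple"]
result = reduce_phase(shuffle(map_phase(lines)))
```

With a reducer that writes each key only once, the two "apple" lines would collapse into one, which is why duplicates need the extra step.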
Hi,
> I have a small question. I want to know how one would implement
> divide and conquer algorithms in Hadoop. For example, suppose I want to
> implement
> a merge sort of 100 lines in Hadoop. There will be 10 mappers, each sorting 10 lines.
> Now comes the tough part
>
> In the traditio
Hi all,
here is a weird observation. The keys in the result of *ONE* reducer are
ordered like this:
18166
18169
1817
18171
18172
Why does key "1817" come after "18169"? It would make sense if that key were "18170",
but it isn't! Why does it happen and basically, how does hadoop tell k
I'm not sure this sort of problem will be efficient in Hadoop, but it's
the kind of problem WaveFS[1] is designed for. It propagates
intermediate values across the cluster, allowing for algorithms to run
in parallel, but coalesce shared products from distributed calculations.
Without the need to for
Hi all!
I'm using Hadoop to make huge Weka learning files. As you know, a Weka file
(ARFF) has some special file headers. Is there a way for me to add them at
the beginning of the file using FileOutputFormat? If so, how can I do that?
Thanks!
Regards
Song Liu
Hi Gang, it is sorting the keys lexicographically.
--Prateek.
On Sun, Feb 28, 2010 at 3:23 PM, Gang Luo wrote:
> Hi all,
> here is a weird observation. The keys in the result of *ONE* reducer are
> ordered like this:
> 18166
> 18169
> 1817
> 18171
> 18172
>
> Why does key "1817" come after "18169"? It
Hi Gang,
What's your reduce output key type? It looks like you're using Text
instead of IntWritable, causing your keys to be sorted
lexicographically instead of numerically.
Sorting is done with a comparator that defines how an arbitrary
element compares to another. Hashing serves a different pur
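
To make the difference concrete, here is a small plain-Python sketch (not Hadoop code) that sorts the keys from the original question once as text and once as numbers. Python strings stand in for Text keys and `int` stands in for IntWritable; that mapping is an analogy, not the Hadoop comparator itself.

```python
keys = ["18172", "1817", "18166", "18171", "18169"]

# Text-style ordering: compare character by character, so "1817"
# sorts after "18169" because '7' > '6' at the fourth character.
as_text = sorted(keys)

# Numeric ordering: compare the integer values instead.
as_numbers = sorted(keys, key=int)

print(as_text)     # ['18166', '18169', '1817', '18171', '18172']
print(as_numbers)  # ['1817', '18166', '18169', '18171', '18172']
```

The text ordering reproduces the sequence reported in the question, which is consistent with the keys having been compared as strings.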
Hi all,
I just wanted to let you guys know that HDFS support has been added to the
recently released version 0.8.5 of muCommander ( http://www.mucommander.com/ ),
allowing you to browse, read and write to an HDFS cluster with the convenience
of a graphical user interface.
I'm considering adding
Thanks to Ed and Prateek, who pointed this out in the previous mails. Yes, I use Text
instead of IntWritable. It makes sense if the keys are sorted in lexicographical order.
-Gang
- Original Message
From: Ed Mazur
To: common-user@hadoop.apache.org
Sent: 2010/2/28 (Sun) 4:28:46 PM
Subject: Re: no complete sort
Hi Gang,
Write the header in the setup method if you are using the new Hadoop API.
On Sun, Feb 28, 2010 at 1:26 PM, Song Liu wrote:
> Hi all!
> I'm using Hadoop to make huge Weka learning files. As you know, a Weka file
> (ARFF) has some special file headers. Is there a way for me to add them at
> the beginning o
Hi,
Is there any way we can chain the reducers? That is, the reducers initially
work on some data, the output of these reducers is then sent to the same reducers
again, and so on, similar to how the conquer step takes place in divide and
conquer algorithms. I hope you understand what I am trying to ask