It seems like the InMemoryFileSystem class has been deprecated in Hadoop
0.19.1. Why?
I want to reuse the output of reduce as the input of the next map. Cascading
does not work, because each step depends on the data of the previous one. I
treat each timestep's MapReduce job as a synchronization step. If the InMemoryFileSys
Is there a way to implement some OutputCollector that can do what Andy wants
to do?
On Thu, Apr 2, 2009 at 10:21 AM, Rasit OZDAS wrote:
> Andy, I didn't try this feature. But I know that Yahoo had a
> performance record with this file format.
> I came across a file system included in hadoop code
I don't really see what the downside of reading it from disk is. A
list of word counts should be pretty small on disk so it shouldn't
take long to read it into a HashMap. Doing anything else is going to
cause you to go a long way out of your way to end up with the same
result.
-Bryan
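
For reference, a minimal sketch of the read-it-back approach described above, assuming the job wrote its results with the default TextOutputFormat (one "word<TAB>count" line per record). WordCountLoader is a hypothetical helper name, not a Hadoop class. The parser only needs a BufferedReader, so on a real cluster the reader could wrap the stream returned by FileSystem.open() for each part file; here a StringReader stands in for that file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not part of Hadoop): loads wordcount output into a HashMap.
final class WordCountLoader {

    // Parses "word<TAB>count" lines, the layout TextOutputFormat writes by default.
    // Lines without a tab are skipped; a non-numeric count throws NumberFormatException.
    static Map<String, Long> load(BufferedReader reader) throws IOException {
        Map<String, Long> counts = new HashMap<>();
        String line;
        while ((line = reader.readLine()) != null) {
            int tab = line.lastIndexOf('\t');
            if (tab < 0) {
                continue; // blank or malformed line
            }
            String word = line.substring(0, tab);
            long count = Long.parseLong(line.substring(tab + 1).trim());
            // Sum the counts in case the same word shows up more than once.
            counts.merge(word, count, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Sample data standing in for the contents of a part file (assumption).
        String sample = "hadoop\t2\nmap\t5\nreduce\t3\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            Map<String, Long> counts = load(reader);
            System.out.println("map occurs " + counts.get("map") + " times");
        }
    }
}
```

Since a list of word counts is usually small, one pass like this per part file should be cheap compared with the job itself.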
On
That seems interesting. We have 3 replicas by default.
Is there a way to set, let's say, a replication factor of 1 for job-specific files only?
2009/4/2 Owen O'Malley :
>
> On Apr 2, 2009, at 2:41 AM, andy2005cst wrote:
>
>>
>> I need to use the output of the reduce, but I don't know how to do.
>> use t
On Apr 2, 2009, at 2:41 AM, andy2005cst wrote:
I need to use the output of the reduce, but I don't know how to do it.
Using the wordcount program as an example: if I want to collect the word counts
into a hashtable for further use, how can I do that?
You can use an output format and then an input form
Andy, I haven't tried this feature. But I know that Yahoo had a
performance record with this file format.
I came across a file system included in the Hadoop code (probably that
one) when searching the source code.
Luckily I found it: org.apache.hadoop.fs.InMemoryFileSystem
But if you have a lot of big fil
Thanks for your reply. Let me explain more clearly: since MapReduce is just
one step of my program, I need to use the output of reduce for further
computation, so I do not want to write the output to disk, but
want to get a collection or list of the output in RAM. if it directly
wirt
Hi, Hadoop is normally designed to write to disk. There is a special file
format that writes output to RAM instead of disk.
But I have no idea whether it's what you're looking for.
If what you said exists, there should be a mechanism which sends output as
objects rather than file content across
I need to use the output of the reduce, but I don't know how to do it.
Using the wordcount program as an example: if I want to collect the word counts
into a hashtable for further use, how can I do that?
The example only shows how to write the result to disk.
My email is: andy2005...@gmail.com
looking forward