Re: HELP: I wanna store the output value into a list not write to the disk

2009-04-02 Thread He Chen
It seems like the InMemoryFileSystem class has been deprecated in Hadoop 0.19.1. Why? I want to reuse the output of reduce as the input of the next map. Cascading does not work, because each step depends on the data of the previous one. I run each timestep's MapReduce job synchronously. If the InMemoryFileSys

Re: HELP: I wanna store the output value into a list not write to the disk

2009-04-02 Thread Farhan Husain
Is there a way to implement some OutputCollector that can do what Andy wants to do?

On Thu, Apr 2, 2009 at 10:21 AM, Rasit OZDAS wrote:
> Andy, I didn't try this feature. But I know that Yahoo had a
> performance record with this file format.
> I came across a file system included in hadoop code

Re: HELP: I wanna store the output value into a list not write to the disk

2009-04-02 Thread Bryan Duxbury
I don't really see what the downside of reading it from disk is. A list of word counts should be pretty small on disk so it shouldn't take long to read it into a HashMap. Doing anything else is going to cause you to go a long way out of your way to end up with the same result. -Bryan On
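A minimal sketch of the approach Bryan describes, assuming the job wrote its word counts as plain text with one tab-separated "word<TAB>count" pair per line, and that the output has already been opened as a Reader (for example, a FileReader over a local copy of a part file). The class name WordCountLoader and the sample data are hypothetical and only for illustration; no Hadoop API is used here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: loads "word<TAB>count" lines into a HashMap.
public class WordCountLoader {

    // Reads every line from the source and sums the counts per word.
    // Lines without a tab separator are skipped.
    static Map<String, Integer> load(Reader source) throws IOException {
        Map<String, Integer> counts = new HashMap<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                int tab = line.lastIndexOf('\t');
                if (tab < 0) {
                    continue; // not a key/value line
                }
                String word = line.substring(0, tab);
                int count = Integer.parseInt(line.substring(tab + 1).trim());
                // Sum counts in case the same word appears in several part files.
                counts.merge(word, count, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample standing in for the contents of a reduce output file.
        String sample = "hadoop\t3\nmap\t5\nreduce\t2\n";
        Map<String, Integer> counts = load(new StringReader(sample));
        System.out.println("map -> " + counts.get("map"));
    }
}
```

To read real job output, the StringReader in main could be replaced with a FileReader over each part file in turn, merging all of them into one map.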

Re: HELP: I wanna store the output value into a list not write to the disk

2009-04-02 Thread Rasit OZDAS
That seems interesting. We have a replication factor of 3 by default. Is there a way to set, let's say, a replication factor of 1 for job-specific files only?

2009/4/2 Owen O'Malley:
> On Apr 2, 2009, at 2:41 AM, andy2005cst wrote:
>> I need to use the output of the reduce, but I don't know how to do.
>> use t

Re: HELP: I wanna store the output value into a list not write to the disk

2009-04-02 Thread Owen O'Malley
On Apr 2, 2009, at 2:41 AM, andy2005cst wrote:
> I need to use the output of the reduce, but I don't know how to do.
> use the wordcount program as an example if i want to collect the wordcount
> into a hashtable for further use, how can i do?

You can use an output format and then an input form

Re: HELP: I wanna store the output value into a list not write to the disk

2009-04-02 Thread Rasit OZDAS
Andy, I didn't try this feature. But I know that Yahoo had a performance record with this file format. I came across a file system included in hadoop code (probably that one) when searching the source code. Luckily I found it: org.apache.hadoop.fs.InMemoryFileSystem But if you have a lot of big fil

Re: HELP: I wanna store the output value into a list not write to the disk

2009-04-02 Thread andy2005cst
Thanks for your reply. Let me explain more clearly. Since MapReduce is just one step of my program, I need to use the output of reduce for future computation, so I do not want to write the output to disk; I want to get the collection or list of the output in RAM. if it directly wirt

Re: HELP: I wanna store the output value into a list not write to the disk

2009-04-02 Thread Rasit OZDAS
Hi, Hadoop is normally designed to write to disk. There is a special file format which writes output to RAM instead of disk, but I don't know if it's what you're looking for. If what you said exists, there should be a mechanism which sends output as objects rather than file content across

HELP: I wanna store the output value into a list not write to the disk

2009-04-02 Thread andy2005cst
I need to use the output of the reduce, but I don't know how. Taking the wordcount program as an example: if I want to collect the word counts into a hashtable for further use, how can I do that? The example only shows how to write the result to disk. My email is: andy2005...@gmail.com looking forward