Re: Pipelining Mappers and Reducers

2010-08-08 Thread Shai Erera
Hi, I've done some work and thought I'd report back the results (which are not too encouraging). Approach 1: * Mappers output, as a side effect, a Lucene Directory (written on disk) and a pair where the value is the location of the index on disk and the key is unimportant for now. * Reducer merges the on

Re: How read a key/value pair?

2010-08-08 Thread Pedro Costa
1 - I would like to create a generic method that could read the map outputs produced on the map side and print them out. I'm having some difficulty working out how to read the map output files that are written to file or held in memory on the map side. Can you give me an example of how I can

Re: How read a key/value pair?

2010-08-08 Thread Josh Patterson
Pedro, Could you provide a little more context? Exactly what are you trying to do? Josh Cloudera On Sun, Aug 8, 2010 at 1:20 PM, Pedro Costa wrote: > Hi, > > I would like to read a map output file that has the key/value pair. > Can anyone give an example in java please? > > Thanks > > -- > Pedr

How read a key/value pair?

2010-08-08 Thread Pedro Costa
Hi, I would like to read a map output file that contains key/value pairs. Can anyone give an example in Java, please? Thanks -- Pedro

Re: Map output files contains headers?

2010-08-08 Thread Pedro Costa
I checked out those two files. On the reduce side, this is the file: A ??) and on the map side, this is the file: A ??%??d The hash should only be computed over the letter A, and now it's evident that this is not what is being hashed. The file should only contain the letter A. Maybe the re

Re: Map output files contains headers?

2010-08-08 Thread Dennis
Why don't you check the 10-byte data and the 2-byte data to see the differences? Dennis. --- On Sun, 8/8/10, Pedro Costa wrote: From: Pedro Costa Subject: Re: Map output files contains headers? To: mapreduce-user@hadoop.apache.org Date: Sunday, August 8, 2010, 9:26 PM That's funny. I'
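
Dennis's suggestion can be sketched as a small standalone Java helper that hex-dumps two byte arrays, hashes each, and reports the first offset at which they differ. This is only an illustration under stated assumptions: the class name `ByteDiff`, its method names, the choice of MD5, and the sample byte values are all hypothetical, since the thread does not show the actual file contents or which hash Pedro used. It uses only the JDK and no Hadoop classes.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of Hadoop: compares two captured byte arrays.
class ByteDiff {

    // Render bytes as lowercase hex so non-printable bytes are visible.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // MD5 is an assumption here; the thread does not say which hash was used.
    static String md5Hex(byte[] data) throws NoSuchAlgorithmException {
        return toHex(MessageDigest.getInstance("MD5").digest(data));
    }

    // Index of the first differing byte, the shorter length if one array
    // is a prefix of the other, or -1 if the arrays are identical.
    static int firstDifference(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            if (a[i] != b[i]) {
                return i;
            }
        }
        return a.length == b.length ? -1 : n;
    }

    public static void main(String[] args) throws Exception {
        // Placeholder data only: a 10-byte and a 2-byte array standing in for
        // whatever was captured on the map side and on the reduce side.
        byte[] mapSide = {0x41, 0, 0, 0, 0, 0, 0, 0, 0, 0};
        byte[] reduceSide = {0x41, 0};

        System.out.println("map side:    " + toHex(mapSide) + " md5=" + md5Hex(mapSide));
        System.out.println("reduce side: " + toHex(reduceSide) + " md5=" + md5Hex(reduceSide));
        System.out.println("first difference at offset " + firstDifference(mapSide, reduceSide));
    }
}
```

Dumping the bytes first shows whether the two sides are hashing the same range at all; if the lengths differ, the digests will differ no matter what the record contents are.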

Re: Map output files contains headers?

2010-08-08 Thread Pedro Costa
I'm testing the hashes using the wordcount example. On Sun, Aug 8, 2010 at 2:32 PM, welman Lu wrote: > Are you sure the value type in the pair output by the mapper > is the same as the value type in the pair accepted by the > reducer? > > I use BytesWritable, and the sizes of the data are the same.

Re: Map output files contains headers?

2010-08-08 Thread welman Lu
Are you sure the value type in the pair output by the mapper is the same as the value type in the pair accepted by the reducer? I use BytesWritable, and the sizes of the data are the same.

Re: Map output files contains headers?

2010-08-08 Thread Pedro Costa
That's funny. I'm hashing the map output file on the map side before it's written to memory, and hashing it again after the reduce fetches the map output; as a result I get two different hashes. The size of the data hashed on the map side is 10 bytes, and on the reduce side it is 2 bytes. An

Re: Map output files contains headers?

2010-08-08 Thread Dennis
I don't think there is any header. Dennis --- On Sun, 8/8/10, Pedro Costa wrote: From: Pedro Costa Subject: Map output files contains headers? To: mapreduce-user@hadoop.apache.org Date: Sunday, August 8, 2010, 9:01 PM Hi, The map output files that are produced on the map side and that the re

Map output files contains headers?

2010-08-08 Thread Pedro Costa
Hi, Do the map output files that are produced on the map side, and that the reduce will fetch, contain only data, or do they also contain a header? If they contain a header, what is its size? Thanks, -- Pedro