None of that.
I checked the input file's SequenceFile header and it says
"org.apache.hadoop.io.compress.zlib.BuiltInZlibDeflater"
Kim
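For anyone wanting to verify this themselves, here is a minimal sketch (plain Java, no Hadoop dependency; the class and method names are made up for illustration) of checking a file's first bytes for the SequenceFile magic "SEQ" plus the format version byte — the codec class name reported above appears further into the header:

```java
public class SeqHeaderCheck {
    // A SequenceFile begins with the 3-byte magic "SEQ" followed by a
    // one-byte format version; compression details come later in the header.
    static boolean looksLikeSequenceFile(byte[] firstBytes) {
        return firstBytes.length >= 4
                && firstBytes[0] == 'S'
                && firstBytes[1] == 'E'
                && firstBytes[2] == 'Q';
    }

    static int version(byte[] firstBytes) {
        return firstBytes[3];
    }

    public static void main(String[] args) {
        byte[] header = {'S', 'E', 'Q', 6};
        System.out.println(looksLikeSequenceFile(header) + " v" + version(header));
        // prints: true v6
    }
}
```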
On Fri, Mar 28, 2014 at 10:34 AM, Hardik Pandya wrote:
What is your compression format: gzip, lzo, or snappy?
For LZO final output:
FileOutputFormat.setCompressOutput(conf, true);
FileOutputFormat.setOutputCompressorClass(conf, LzoCodec.class);
In addition, to make LZO output splittable, you need to build an LZO index file.
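For reference, the two calls above in context — a sketch only, assuming the old mapred API and the hadoop-lzo library on the classpath (LzoCodec and LzoIndexer come from that library, not core Hadoop):

```java
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobConf;
import com.hadoop.compression.lzo.LzoCodec;

// ... inside job setup ...
JobConf conf = new JobConf();
FileOutputFormat.setCompressOutput(conf, true);
FileOutputFormat.setOutputCompressorClass(conf, LzoCodec.class);

// After the job finishes, index the .lzo output so it becomes splittable,
// e.g. from the command line:
//   hadoop jar /path/to/hadoop-lzo.jar com.hadoop.compression.lzo.LzoIndexer <output-dir>
```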
On Thu, Mar 27, 2014 at 8:57 PM, Kim
Thanks folks.
I was not aware that my input data file had been compressed.
FileOutputFormat.setCompressOutput() was set to true when the file was
written. 8-(
Kim
On Thu, Mar 27, 2014 at 5:46 PM, Mostafa Ead wrote:
The following might answer you partially:
Input key is not read from HDFS; it is auto-generated as the offset of the
input value in the input file. I think that is (partially) why HDFS bytes
read is smaller than HDFS bytes written.
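To illustrate the point about auto-generated keys, a small sketch (plain Java; the helper name is hypothetical) mimicking how TextInputFormat-style readers assign each record the byte offset at which it starts, rather than reading key bytes from HDFS:

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class OffsetKeys {
    // Hypothetical helper: returns the offset-style key for each line,
    // i.e. the byte position where that line starts in the file.
    static List<Long> offsetKeys(String fileContents) {
        List<Long> keys = new ArrayList<>();
        long offset = 0;
        for (String line : fileContents.split("\n")) {
            keys.add(offset);
            offset += line.getBytes(StandardCharsets.UTF_8).length + 1; // +1 for '\n'
        }
        return keys;
    }

    public static void main(String[] args) {
        System.out.println(offsetKeys("alpha\nbeta\ncarol\n"));
        // prints: [0, 6, 11]
    }
}
```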
On Mar 27, 2014 1:34 PM, "Kim Chew" wrote:
I am also wondering: if, say, I have two identical timestamps, they are
going to be written to the same file. Does MultipleOutputs handle appending?
Thanks.
Kim
On Thu, Mar 27, 2014 at 12:30 PM, Thomas Bentsen wrote:
Yea, gonna do that. 8-)
Kim
On Thu, Mar 27, 2014 at 12:30 PM, Thomas Bentsen wrote:
Have you checked the content of the files you write?
/th
On Thu, 2014-03-27 at 11:43 -0700, Kim Chew wrote:
I have a simple M/R job using a Mapper only, thus no reducer. The mapper
reads a timestamp from the value, generates a path to the output file, and
writes the key and value to the output file.
The input file is a sequence file, not compressed, stored in HDFS; it has a
size of 162.68 MB.
Output a
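As a side note on the "generates a path from the timestamp" step: a minimal sketch of one way such a path could be derived (plain Java; the directory layout and helper name are illustrative, not taken from the job above):

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class TimestampPath {
    // Illustrative helper: map an epoch-millis timestamp to an output
    // directory such as output/2014/03/27/18. Records carrying identical
    // timestamps map to the same path, which is why appending behavior
    // in MultipleOutputs matters here.
    static String pathFor(long epochMillis) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy/MM/dd/HH");
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return "output/" + fmt.format(new Date(epochMillis));
    }

    public static void main(String[] args) {
        System.out.println(pathFor(1395945780000L));
        // prints: output/2014/03/27/18
    }
}
```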