https://issues.apache.org/jira/browse/HADOOP-6298
On Tue, Oct 1, 2013 at 12:39 AM, Chandra Mohan, Ananda Vel Murugan
ananda.muru...@honeywell.com wrote:
Hi,
I am using Hadoop 1.0.2 and have written a MapReduce job. I have a
requirement to process each file as a whole, without splitting, so I wrote a
new input format that overrides the isSplitable() method, along with a new
RecordReader implementation that reads the whole file. I followed the sample
in Chapter 7 of "Hadoop: The Definitive Guide". In my job, the mapper emits
BytesWritable as the value. I want to take the bytes and read some specific
information from them, using a ByteArrayInputStream for further processing.
But strangely, the following code prints different numbers, and because of
this I am getting errors.
// value - BytesWritable
System.out.println("Bytes length " + value.getLength()); // Bytes length 1931650
byte[] bytes = value.getBytes();
System.out.println("Bytes array length " + bytes.length); // Bytes array length 2897340
My file size is 1931650 bytes. I don't know why the byte array is bigger
than the original file.
Any idea what is going wrong? Please help. Thanks in advance.
Regards,
Anand.C
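
The JIRA issue linked above (HADOOP-6298) describes this behaviour: getBytes() returns the padded backing buffer of the BytesWritable, while getLength() reports how many of those bytes are valid. A minimal sketch of working only with the valid region, in plain Java that simulates the BytesWritable buffer/length contract rather than depending on the Hadoop classes (the copyValidBytes helper and the sample sizes here are illustrative, not part of any Hadoop API):

```java
import java.io.ByteArrayInputStream;
import java.util.Arrays;

public class ValidBytes {
    // Copy only the first 'length' valid bytes out of the padded backing
    // array, mirroring what one would do with value.getBytes()/getLength().
    public static byte[] copyValidBytes(byte[] backing, int length) {
        return Arrays.copyOfRange(backing, 0, length);
    }

    public static void main(String[] args) {
        // Simulated BytesWritable state: a capacity buffer larger than the data.
        byte[] backing = new byte[10];             // cf. bytes.length == 2897340
        byte[] payload = {10, 20, 30, 40, 50, 60}; // cf. getLength() == 1931650
        System.arraycopy(payload, 0, backing, 0, payload.length);
        int validLength = payload.length;

        // Option 1: copy the valid region before further processing.
        byte[] valid = copyValidBytes(backing, validLength);
        System.out.println("valid.length = " + valid.length); // 6, not 10

        // Option 2: skip the copy and bound the stream directly, using the
        // (buf, offset, length) constructor of ByteArrayInputStream.
        ByteArrayInputStream in = new ByteArrayInputStream(backing, 0, validLength);
        System.out.println("available = " + in.available()); // 6
    }
}
```

If memory allows the extra copy, Option 1 gives a byte[] whose length matches the file size; Option 2 avoids the copy entirely, which matters at multi-megabyte values like the one above. Later Hadoop releases also added a copyBytes() convenience on BytesWritable for the same purpose, though I believe it is not available in 1.0.2.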