Nathan,

If you're using BytesWritable, be aware that it doesn't return only the valid bytes; it can return more than that (the backing buffer may be larger than the valid data). The issue is discussed here: http://www.nabble.com/can%27t-read-the-SequenceFile-correctly-td21866960.html
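For context, the usual fix is to read only the first getLength() bytes of the array that getBytes() returns, rather than the whole backing buffer. Here is a minimal plain-Java sketch of that pattern (no Hadoop dependency; the buffer and validBytes names are illustrative, not Hadoop API):

```java
import java.util.Arrays;

public class ValidBytesSketch {

    // Copy out only the valid prefix of a backing buffer, the way you
    // would combine value.getBytes() with value.getLength() in Hadoop.
    static byte[] validBytes(byte[] backingBuffer, int validLength) {
        return Arrays.copyOf(backingBuffer, validLength);
    }

    public static void main(String[] args) {
        // Simulates a BytesWritable whose buffer has grown past the data:
        // 8-byte capacity, but only the first 3 bytes are valid.
        byte[] buffer = {1, 2, 3, 0, 0, 0, 0, 0};
        int length = 3;

        byte[] valid = validBytes(buffer, length);
        System.out.println(valid.length); // prints 3, not 8
    }
}
```

If you pass the raw buffer to code that assumes every byte is meaningful, the trailing garbage gets processed as data, which can produce exactly the kind of inflated byte counts described above.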
Cheers,
Rasit

2009/2/18 Nathan Marz <nat...@rapleaf.com>

> Hello,
>
> I'm seeing very odd numbers from the HDFS job tracker page. I have a job
> that operates over approximately 200 GB of data (209715200047 bytes to be
> exact), and HDFS bytes read is 2,103,170,802,501 (2 TB).
>
> The "Map input bytes" is set to "209,714,811,510", which is a correct
> number.
>
> The job only took 10 minutes to run, so there's no way that that much data
> was actually read. Anyone have any idea of what's going on here?
>
> Thanks,
> Nathan Marz

--
M. Raşit ÖZDAŞ