The header is expected to contain the full names of the key class and the value
class, so if the header is only written along with the first record (?) then the
file can indeed fail to respect its own format.

I haven't tried it, but LazyOutputFormat should solve your problem.
https://hadoop.apache.org/docs/current/api/index.html?org/apache/hadoop/mapred/lib/LazyOutputFormat.html
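
In case it is useful, here is a minimal sketch of how it could be wired up,
using the new-API counterpart (org.apache.hadoop.mapreduce.lib.output.LazyOutputFormat).
It wraps the real output format so that a part file, and therefore its
SequenceFile header, is only created once the first record is actually
written. The class name, key/value types and paths below are placeholders,
not taken from your job.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.LazyOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

public class LazySeqFileJob {
  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "lazy-seqfile-output");
    job.setJarByClass(LazySeqFileJob.class);

    // Wrap SequenceFileOutputFormat so that output files are created lazily,
    // i.e. only when the first key/value pair is actually written.
    LazyOutputFormat.setOutputFormatClass(job, SequenceFileOutputFormat.class);

    job.setOutputKeyClass(LongWritable.class);
    job.setOutputValueClass(Text.class);

    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}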

Regards

Bertrand Dechoux

On Tue, Jul 22, 2014 at 10:39 PM, Edward Capriolo <edlinuxg...@gmail.com>
wrote:

> I have two processes. One writes sequence files directly to HDFS; the
> other is a Hive table that reads these files.
>
> All works well, with the exception that I am only flushing the files
> periodically. SequenceFile input format gets angry when it encounters
> 0-byte seq files.
>
> I was considering a flush and sync on the first record write. I was also
> thinking I should just be able to hack the sequence file input format to
> skip 0-byte files and not throw the exception from readFully() that it
> sometimes does.
>
> Anyone ever tackled this?
>
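
For the flush-and-sync-on-first-record idea above, a rough sketch of what the
writing process could do, assuming a Hadoop 2.x SequenceFile.Writer (which
exposes hflush()); the path, key/value types and the record loop are
placeholders:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class EagerSyncSeqWriter {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    Path out = new Path(args[0]);

    try (SequenceFile.Writer writer = SequenceFile.createWriter(conf,
        SequenceFile.Writer.file(out),
        SequenceFile.Writer.keyClass(LongWritable.class),
        SequenceFile.Writer.valueClass(Text.class))) {

      boolean first = true;
      for (long i = 0; i < 100; i++) {            // placeholder record source
        writer.append(new LongWritable(i), new Text("record-" + i));
        if (first) {
          // Flush the header plus the first record out of the client buffer
          // so that a concurrent reader sees more than an empty stream.
          writer.hflush();
          first = false;
        }
      }
    }
  }
}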
