I am storing binary data in binary columns in Hive, using RCFile as the storage format. Most of the data in my very large table reads back fine, but queries over certain time frames fail with RCFile/compression errors. The data loads without any errors. Is this a filesystem-level corruption issue? Is it something tunable? How would I even go about troubleshooting something like this? The error I get is:
Hive Runtime Error while processing writable SEQ -org.apache.hadoop.hive.ql.io.RCFile$KeyBuffer/org.apache.hadoop.hive.ql.io.RCFile$ValueBuffer 'org.apache.hadoop.io.compress.GzipCodec hive.io.rcfile.column.number 72 T6J