Riju Trivedi created HIVE-29123:
-----------------------------------

             Summary: Extend ProtobufInputFormat to handle EOFException for partially written proto files.
                 Key: HIVE-29123
                 URL: https://issues.apache.org/jira/browse/HIVE-29123
             Project: Hive
          Issue Type: Improvement
            Reporter: Riju Trivedi
            Assignee: Riju Trivedi


HiveProtoLoggingHook in Hive and ProtoHistoryLoggingService in Tez log query 
execution events, query plans, and other runtime statistics to protobuf files. 
These proto files are exposed as EXTERNAL tables and read through 
ProtobufMessageInputFormat.

An abrupt AM kill or OOM event can leave behind empty or partially written 
proto files. Querying a table that contains such files fails with an 
EOFException:
{code:java}
Caused by: java.io.EOFException
    at java.base/java.io.DataInputStream.readFully(DataInputStream.java:202)
    at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:70)
    at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:120)
    at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2505)
    at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2637)
    at org.apache.hadoop.mapred.SequenceFileRecordReader.next(SequenceFileRecordReader.java:82)
    at org.apache.hadoop.hive.ql.io.protobuf.ProtobufMessageInputFormat$1.next(ProtobufMessageInputFormat.java:124)
    at org.apache.hadoop.hive.ql.io.protobuf.ProtobufMessageInputFormat$1.next(ProtobufMessageInputFormat.java:84)
    at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
    ... 24 more
{code}
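
The behavior requested here can be illustrated with a minimal sketch. It is 
not the actual Hive change: it uses plain java.io with a simple 
length-prefixed record layout as a stand-in for the SequenceFile reader, and 
the names TolerantRecordReader and readAll are hypothetical. The idea is that 
an EOFException raised while reading a record is treated as the end of the 
file, so records that were fully written before the crash are still returned.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only; not part of Hive or Hadoop.
final class TolerantRecordReader {

    // Reads records laid out as [4-byte length][payload] (an assumed format).
    // An empty stream or a truncated last record ends the read quietly.
    static List<byte[]> readAll(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        List<byte[]> records = new ArrayList<>();
        while (true) {
            try {
                int len = in.readInt();          // EOFException if the header is cut off
                if (len < 0) {
                    // A negative length is corruption, not truncation: fail loudly.
                    throw new IOException("Corrupt record length: " + len);
                }
                byte[] payload = new byte[len];
                in.readFully(payload);           // EOFException if the payload is cut off
                records.add(payload);
            } catch (EOFException e) {
                // Empty or partially written file: keep the complete records.
                return records;
            }
        }
    }
}
```

Only EOFException is swallowed in this sketch; other IOExceptions still 
propagate, so genuine corruption or I/O failures are not hidden.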
 



