[ https://issues.apache.org/jira/browse/HIVE-29123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sai Hemanth Gantasala resolved HIVE-29123.
------------------------------------------
Fix Version/s: 4.2.0
Resolution: Fixed
[~rtrivedi12] - Thanks for your contribution. This patch has been merged into
the master branch.
> Extend ProtobufInputFormat to handle EOFException for partially written proto
> files.
> ------------------------------------------------------------------------------------
>
> Key: HIVE-29123
> URL: https://issues.apache.org/jira/browse/HIVE-29123
> Project: Hive
> Issue Type: Improvement
> Reporter: Riju Trivedi
> Assignee: Riju Trivedi
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.2.0
>
>
> In Hive, the {{HiveProtoLoggingHook}}, and in Tez, the
> {{ProtoHistoryLoggingService}}, are responsible for logging query
> execution details, query plans, and other runtime statistics into protocol
> buffer (protobuf) files.
> These protobuf files are made accessible via EXTERNAL tables and are read
> using the {{ProtobufMessageInputFormat}}.
> However, in cases of abrupt *Application Master termination* or OutOfMemory
> errors, these proto files may be left empty or partially written.
> Attempting to query these EXTERNAL tables when such corrupted files are
> present can lead to query failures, typically with an {{EOFException}}.
> {code:java}
> Caused by: java.io.EOFException
>     at java.base/java.io.DataInputStream.readFully(DataInputStream.java:202)
>     at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:70)
>     at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:120)
>     at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2505)
>     at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2637)
>     at org.apache.hadoop.mapred.SequenceFileRecordReader.next(SequenceFileRecordReader.java:82)
>     at org.apache.hadoop.hive.ql.io.protobuf.ProtobufMessageInputFormat$1.next(ProtobufMessageInputFormat.java:124)
>     at org.apache.hadoop.hive.ql.io.protobuf.ProtobufMessageInputFormat$1.next(ProtobufMessageInputFormat.java:84)
>     at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
>     ... 24 more
> {code}
>
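The general idea behind tolerating a partially written file can be sketched with the JDK alone. The example below is not the patch merged for this issue and does not use the Hadoop or Hive APIs. It assumes a hypothetical length-prefixed record layout, and the class and method names (TruncationTolerantReader, writeRecords, readRecords) are invented for illustration. It only shows the pattern of treating an EOFException raised mid-record as the end of input, so that complete records are still returned and an empty file yields no rows.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not the HIVE-29123 implementation.
final class TruncationTolerantReader {

    /** Writes each record as a 4-byte length followed by its UTF-8 payload. */
    static byte[] writeRecords(String... records) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (String record : records) {
                byte[] payload = record.getBytes(StandardCharsets.UTF_8);
                out.writeInt(payload.length);
                out.write(payload);
            }
        }
        return bytes.toByteArray();
    }

    /**
     * Reads records until the stream ends. An EOFException, whether at a
     * clean record boundary or in the middle of a record, stops the read
     * and returns the complete records seen so far.
     */
    static List<String> readRecords(InputStream in) throws IOException {
        List<String> records = new ArrayList<>();
        DataInputStream data = new DataInputStream(in);
        while (true) {
            try {
                int length = data.readInt();   // EOFException if the header is missing or cut short
                byte[] payload = new byte[length];
                data.readFully(payload);       // EOFException if the body is cut short
                records.add(new String(payload, StandardCharsets.UTF_8));
            } catch (EOFException e) {
                // Treat truncation as end of input instead of failing the whole read.
                return records;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] full = writeRecords("query-1", "query-2", "query-3");
        // Simulate a writer that died before finishing the last record.
        byte[] truncated = Arrays.copyOf(full, full.length - 3);
        System.out.println(readRecords(new ByteArrayInputStream(truncated)));
        // Simulate a file that was created but never written to.
        System.out.println(readRecords(new ByteArrayInputStream(new byte[0])));
    }
}
```

Other IOException types are deliberately left to propagate in this sketch, so that genuine I/O failures are not silently hidden along with truncation.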
--
This message was sent by Atlassian Jira
(v8.20.10#820010)