[ https://issues.apache.org/jira/browse/AVRO-2109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16717787#comment-16717787 ]
ASF subversion and git services commented on AVRO-2109:
-------------------------------------------------------

Commit a731fab500606404ecfd755717b441109ccf7337 in avro's branch refs/heads/branch-1.8 from [~gszadovszky]
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=a731fab ]

AVRO-2109: Reset buffers in case of IOException

Closes #260

Signed-off-by: Zoltan Ivanfi <z...@cloudera.com>
Signed-off-by: sacharya <su...@apache.org>
Signed-off-by: Nandor Kollar <nkol...@apache.org>
(cherry picked from commit 673261c8656124cc58bee65fe5e8c779350779ee)

> Reset buffers in case of IOException
> ------------------------------------
>
>                 Key: AVRO-2109
>                 URL: https://issues.apache.org/jira/browse/AVRO-2109
>             Project: Apache Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.8.2
>            Reporter: Gabor Szadovszky
>            Assignee: Gabor Szadovszky
>            Priority: Major
>             Fix For: 1.7.8, 1.9.0, 1.8.3
>
>
> If an {{IOException}} is thrown from {{DataFileWriter.writeBlock}}, the
> {{buffer}} and {{blockCount}} are not reset, so duplicated data is written
> out on {{close}}/{{flush}}.
> Whether we should reset the buffer after an exception is really a
> conceptual question. If an exception occurs while writing the file, we
> should expect the file to be corrupt, so the possible duplication of data
> should not matter.
> On the other hand, if the file is already corrupt, why would we try to
> write anything again at file close?
> This issue comes from a Flume issue where the HDFS wait thread is
> interrupted because of a timeout while writing an Avro file. The actual
> block has already been written properly, but because of the
> {{IOException}} caused by the thread interrupt we invoke {{close()}} on the
> writer, which writes the block again along with some other data (maybe a
> duplicated sync marker), and that makes the file corrupt.
> [~busbey], [~nkollar], [~zi], any thoughts?

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
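
The failure mode described in the issue can be reproduced in miniature with a toy writer. This is only a sketch under stated assumptions, not Avro's actual code: {{BlockWriter}}, {{FailOnceAfterWrite}}, the {{resetOnFailure}} flag and the {{run}} helper are all hypothetical names invented for illustration. It assumes the sink stores the bytes and only then throws, as in the Flume case, and it shows how resetting the buffer and the block count in a {{finally}} clause keeps a later {{close()}} from writing the same block again.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class BlockResetSketch {

    /** Hypothetical stand-in for a block-buffering writer; NOT Avro's DataFileWriter. */
    static class BlockWriter {
        private final OutputStream out;
        private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        private final boolean resetOnFailure; // illustrative switch between the two behaviours
        private int blockCount = 0;

        BlockWriter(OutputStream out, boolean resetOnFailure) {
            this.out = out;
            this.resetOnFailure = resetOnFailure;
        }

        void append(byte[] record) {
            buffer.write(record, 0, record.length);
            blockCount++;
        }

        void writeBlock() throws IOException {
            if (blockCount == 0) {
                return; // nothing buffered, nothing to write
            }
            boolean written = false;
            try {
                out.write(buffer.toByteArray());
                out.flush();
                written = true;
            } finally {
                // Without the reset on failure, the same block stays buffered
                // and is written a second time by close().
                if (written || resetOnFailure) {
                    buffer.reset();
                    blockCount = 0;
                }
            }
        }

        void close() throws IOException {
            writeBlock();
        }
    }

    /** Sink that stores the bytes, then fails on the first flush (simulated interrupt). */
    static class FailOnceAfterWrite extends OutputStream {
        final ByteArrayOutputStream stored = new ByteArrayOutputStream();
        private boolean failed = false;

        @Override
        public void write(int b) {
            stored.write(b);
        }

        @Override
        public void flush() throws IOException {
            if (!failed) {
                failed = true;
                throw new IOException("simulated interrupt after the data was stored");
            }
        }
    }

    /** Writes one record, hits the simulated failure, then closes; returns what the sink holds. */
    static String run(boolean resetOnFailure) {
        FailOnceAfterWrite sink = new FailOnceAfterWrite();
        BlockWriter writer = new BlockWriter(sink, resetOnFailure);
        writer.append("rec1;".getBytes(StandardCharsets.UTF_8));
        try {
            writer.writeBlock();
        } catch (IOException e) {
            // The caller reacts to the failure by closing the writer, as in the Flume case.
        }
        try {
            writer.close();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return new String(sink.stored.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // prints "rec1;rec1;" (block duplicated)
        System.out.println(run(true));  // prints "rec1;" (block written once)
    }
}
```

Resetting in {{finally}} deliberately drops the buffered block even when the write never reached the sink. That matches the reasoning in the issue: once an {{IOException}} has occurred the file has to be treated as suspect anyway, so there is little value in retrying the block at close time.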