[ https://issues.apache.org/jira/browse/AVRO-2109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16311607#comment-16311607 ]
ASF subversion and git services commented on AVRO-2109:
-------------------------------------------------------

Commit a731fab500606404ecfd755717b441109ccf7337 in avro's branch refs/heads/branch-1.8 from [~gszadovszky]
[ https://git-wip-us.apache.org/repos/asf?p=avro.git;h=a731fab ]

AVRO-2109: Reset buffers in case of IOException

Closes #260

Signed-off-by: Zoltan Ivanfi <z...@cloudera.com>
Signed-off-by: sacharya <su...@apache.org>
Signed-off-by: Nandor Kollar <nkol...@apache.org>
(cherry picked from commit 673261c8656124cc58bee65fe5e8c779350779ee)

> Reset buffers in case of IOException
> ------------------------------------
>
>                 Key: AVRO-2109
>                 URL: https://issues.apache.org/jira/browse/AVRO-2109
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.8.2
>            Reporter: Gabor Szadovszky
>            Assignee: Gabor Szadovszky
>             Fix For: 1.7.8, 1.8.3
>
>
> If an {{IOException}} is thrown from {{DataFileWriter.writeBlock}}, the
> {{buffer}} and {{blockCount}} are not reset, so duplicated data is written
> out on {{close}}/{{flush}}.
> Whether we should reset the buffer after an exception is really a
> conceptual question. If an exception occurs while writing the file, we
> should expect the file to be corrupt, so the possible duplication of data
> should not matter.
> On the other hand, if the file is already corrupt, why would we try to
> write anything again at file close?
> This issue comes from a Flume issue where the HDFS wait thread is
> interrupted because of a timeout while writing an Avro file. The current
> block has already been written properly, but because of the {{IOException}}
> caused by the thread interrupt we invoke {{close()}} on the writer, which
> writes the block again along with some other data (possibly a duplicated
> sync marker), and that corrupts the file.
> [~busbey], [~nkollar], [~zi], any thoughts?
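
For illustration only, the behaviour discussed in the issue can be sketched with a simplified, hypothetical writer. This is not Avro's `DataFileWriter` and not the committed patch: `BlockWriterSketch`, `FailOnceStream`, `append` and `pendingRecords` are names invented here, and the `try`/`finally` reset is just one plausible way to guarantee that `buffer` and `blockCount` are cleared when the write fails. The stream is assumed to accept the bytes and then fail on `flush()`, mimicking the Flume case where the block reached the file before the interrupt.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical block-buffering writer; a stand-in, not Avro's DataFileWriter. */
final class BlockWriterSketch {
    private final OutputStream out;
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private int blockCount = 0;

    BlockWriterSketch(OutputStream out) {
        this.out = out;
    }

    /** Buffers one record until the next writeBlock(). */
    void append(byte[] record) {
        buffer.write(record, 0, record.length);
        blockCount++;
    }

    int pendingRecords() {
        return blockCount;
    }

    /** Writes the buffered block; the buffer is reset even if the write fails. */
    void writeBlock() throws IOException {
        if (blockCount == 0) {
            return; // nothing buffered, nothing to write
        }
        try {
            out.write(buffer.toByteArray());
            out.flush();
        } finally {
            // Reset on success AND on IOException, so a later close()/flush()
            // cannot write the same block a second time.
            buffer.reset();
            blockCount = 0;
        }
    }

    public static void main(String[] args) throws IOException {
        FailOnceStream out = new FailOnceStream();
        BlockWriterSketch writer = new BlockWriterSketch(out);
        writer.append(new byte[] {1, 2, 3});
        try {
            writer.writeBlock();
        } catch (IOException e) {
            System.out.println("first write failed: " + e.getMessage());
        }
        writer.writeBlock(); // what close() would trigger; nothing is pending
        System.out.println("bytes in sink: " + out.sink.size());
    }
}

/** Hypothetical stream: stores the bytes, then fails the first flush(). */
final class FailOnceStream extends OutputStream {
    final ByteArrayOutputStream sink = new ByteArrayOutputStream();
    private boolean failNextFlush = true;

    @Override
    public void write(int b) {
        sink.write(b);
    }

    @Override
    public void flush() throws IOException {
        if (failNextFlush) {
            failNextFlush = false;
            throw new IOException("simulated interrupt after the block was written");
        }
    }
}
```

Without the `finally` reset, the second `writeBlock()` call in `main` would append the same three bytes again, which corresponds to the duplication described in the issue.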