[ https://issues.apache.org/jira/browse/AVRO-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tom White resolved AVRO-1685.
-----------------------------
       Resolution: Fixed
         Assignee: Sehrope Sarkuni
     Hadoop Flags: Reviewed
    Fix Version/s: 1.8.0

I committed this. Thanks Sehrope!

> Allow specifying sync in DataFileWriter.create
> ----------------------------------------------
>
>                 Key: AVRO-1685
>                 URL: https://issues.apache.org/jira/browse/AVRO-1685
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>            Reporter: Sehrope Sarkuni
>            Assignee: Sehrope Sarkuni
>            Priority: Minor
>             Fix For: 1.8.0
>
>         Attachments: AVRO-1685.patch
>
>
> Currently DataFileWriter generates a random 16-byte sync each time a new file
> is created. This means that even if you write the exact same data with a new
> file writer, the file itself will be slightly different (specifically, the
> sync will be different).
> I'd like to be able to generate the exact same file multiple times. To do so,
> I need a way to specify the 16-byte sync.
> I've created a patch that adds this functionality by adding an overload of
> create() that takes a byte[] as the third parameter. If the byte array is
> null, then a random sync is generated using the same internal static
> generateSync() method as before. If it's not null, then its length is checked
> and it's used as the sync. The other two overloads of create(...) have been
> modified to call the three-parameter version with a null sync.
> The patch includes three additional tests to check the error cases (invalid
> length) and verify that generating the same file twice results in the same
> byte array (i.e. an exact match).
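
The null-or-16-bytes rule described in the issue can be sketched in isolation as follows. This is not the code from AVRO-1685.patch: the class name SyncMarkers, the method resolveSync, the use of SecureRandom as a stand-in for the internal generateSync(), the exception type, and the defensive copy are all assumptions made for illustration.

```java
import java.security.SecureRandom;
import java.util.Arrays;

// Hypothetical helper, not part of Avro: mirrors the rule described in the issue.
final class SyncMarkers {
    static final int SYNC_SIZE = 16;
    private static final SecureRandom RANDOM = new SecureRandom();

    private SyncMarkers() {}

    // Returns the sync to use: a random one when the caller passes null,
    // otherwise a copy of the caller's array after checking its length.
    static byte[] resolveSync(byte[] sync) {
        if (sync == null) {
            // Stand-in for the internal generateSync() mentioned in the issue.
            byte[] generated = new byte[SYNC_SIZE];
            RANDOM.nextBytes(generated);
            return generated;
        }
        if (sync.length != SYNC_SIZE) {
            // The exception type here is an assumption, not taken from the patch.
            throw new IllegalArgumentException(
                "sync must be exactly " + SYNC_SIZE + " bytes, got " + sync.length);
        }
        // Copy so later changes to the caller's array do not affect the writer.
        return sync.clone();
    }

    public static void main(String[] args) {
        byte[] fixed = new byte[SYNC_SIZE];
        Arrays.fill(fixed, (byte) 0x2A);
        // A caller-supplied sync is reproduced exactly on every call.
        System.out.println(Arrays.equals(resolveSync(fixed), resolveSync(fixed)));
        // Passing null yields a fresh random sync of the required length.
        System.out.println(resolveSync(null).length);
    }
}
```

With the patch applied, a caller wanting reproducible files would presumably pass the same 16-byte array as the third argument of create() for each file, while existing callers of the two-parameter overloads would keep getting a random sync.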