Thanks for your reply. I suspected this, so I will create a JIRA ticket.
Vyacheslav

On Feb 21, 2012, at 6:02 PM, Scott Carey wrote:

> On 2/21/12 7:29 AM, "Vyacheslav Zholudev" <vyacheslav.zholu...@gmail.com>
> wrote:
>
>> Yep, I saw that method as well as the stackoverflow post. However, I'm
>> interested how to append to a file on the arbitrary file system, not only
>> on the local one.
>>
>> I want to get an OutputStream based on the Path and the FileSystem
>> implementation and then pass it for appending to avro methods.
>>
>> Is that possible?
>
> It is not possible without modifying DataFileWriter. Please open a JIRA
> ticket.
>
> It could not simply append to an OutputStream, since it must either:
> * Seek to the start to validate the schemas match and find the sync
>   marker, or
> * Trust that the schemas match and find the sync marker from the last block
>
> DataFileWriter cannot refer to Hadoop classes such as FileSystem, but we
> could add something to the mapred module that takes a Path and FileSystem
> and returns something that implements an interface that DataFileWriter can
> append to. This would be something that is both a
> http://avro.apache.org/docs/1.6.2/api/java/org/apache/avro/file/SeekableInput.html
> and an OutputStream, or has both an InputStream from the start of the
> existing file and an OutputStream at the end.
>
>>
>> Thanks,
>> Vyacheslav
>>
>> On Feb 21, 2012, at 5:29 AM, Harsh J wrote:
>>
>>> Hi,
>>>
>>> Use the appendTo feature of the DataFileWriter. See
>>>
>>> http://avro.apache.org/docs/1.6.2/api/java/org/apache/avro/file/DataFileWriter.html#appendTo(java.io.File)
>>>
>>> For a quick setup example, read also:
>>>
>>> http://stackoverflow.com/questions/8806689/can-you-append-data-to-an-existing-avro-data-file
>>>
>>> On Tue, Feb 21, 2012 at 3:15 AM, Vyacheslav Zholudev
>>> <vyacheslav.zholu...@gmail.com> wrote:
>>>> Hi,
>>>>
>>>> is it possible to append to an already existing avro file when it was
>>>> written and closed before?
>>>>
>>>> If I use
>>>> outputStream = fs.append(avroFilePath);
>>>>
>>>> then later on I get: java.io.IOException: Invalid sync!
>>>>
>>>> Probably because the schema is written twice and some other issues.
>>>>
>>>> If I use outputStream = fs.create(avroFilePath); then the avro file
>>>> gets overwritten.
>>>>
>>>> Thanks,
>>>> Vyacheslav
>>>
>>> --
>>> Harsh J
>>> Customer Ops. Engineer
>>> Cloudera | http://tiny.cloudera.com/about
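
For readers following the thread: the constraint Scott describes (an appender must first read the start of the file to validate the schema and recover the sync marker, and only then write at the end) can be illustrated with a small stdlib-only sketch. This is a toy model, not Avro's actual container format or API. The class `ToyContainerFile`, its byte layout, and all method names are hypothetical and chosen only for illustration. It uses `RandomAccessFile` as a stand-in for "something that is both seekable input and an output".

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Toy container file (NOT Avro's real format):
 *   header: [int schemaLength][schema bytes][16-byte sync marker]
 *   block:  [int payloadLength][payload bytes][16-byte sync marker]
 */
class ToyContainerFile {
    static final int SYNC_SIZE = 16;

    /** Creates (or truncates) the file and writes the header exactly once. */
    static void create(File file, String schema) throws IOException {
        byte[] sync = new byte[SYNC_SIZE];
        new SecureRandom().nextBytes(sync); // fresh random marker per file
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.setLength(0);
            byte[] schemaBytes = schema.getBytes(StandardCharsets.UTF_8);
            raf.writeInt(schemaBytes.length);
            raf.write(schemaBytes);
            raf.write(sync);
        }
    }

    /** Seeks to the start, checks the schema, returns the file's sync marker. */
    static byte[] readHeader(RandomAccessFile raf, String expectedSchema) throws IOException {
        raf.seek(0);
        byte[] schemaBytes = new byte[raf.readInt()];
        raf.readFully(schemaBytes);
        String found = new String(schemaBytes, StandardCharsets.UTF_8);
        if (!found.equals(expectedSchema)) {
            throw new IOException("Schema mismatch: file has " + found);
        }
        byte[] sync = new byte[SYNC_SIZE];
        raf.readFully(sync);
        return sync;
    }

    /** Appends one block, reusing the existing marker instead of writing a new header. */
    static void append(File file, String schema, String record) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            byte[] sync = readHeader(raf, schema); // read from the start first
            raf.seek(raf.length());                // then write at the end
            byte[] payload = record.getBytes(StandardCharsets.UTF_8);
            raf.writeInt(payload.length);
            raf.write(payload);
            raf.write(sync);
        }
    }

    /** Reads every block, failing if a block does not end with the header's marker. */
    static List<String> readAll(File file, String schema) throws IOException {
        List<String> records = new ArrayList<>();
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            byte[] sync = readHeader(raf, schema);
            while (raf.getFilePointer() < raf.length()) {
                byte[] payload = new byte[raf.readInt()];
                raf.readFully(payload);
                byte[] marker = new byte[SYNC_SIZE];
                raf.readFully(marker);
                if (!Arrays.equals(marker, sync)) {
                    throw new IOException("Invalid sync!");
                }
                records.add(new String(payload, StandardCharsets.UTF_8));
            }
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("toy-container", ".bin");
        file.deleteOnExit();
        create(file, "schemaA");
        append(file, "schemaA", "first");
        append(file, "schemaA", "second"); // file was closed in between
        System.out.println(readAll(file, "schemaA"));
    }
}
```

In this toy model, a writer that only had an OutputStream positioned at the end of the file would have no way to learn the existing marker. If it wrote a second header with a new marker, `readAll` would reject the file at the first appended bytes, which mirrors the "Invalid sync!" symptom reported at the bottom of the thread.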