Was a JIRA ticket ever created regarding appending to an existing Avro file on 
HDFS?

What is the status of such a capability, a year out from when the issue below 
was raised?

On Wed, 22 Feb 2012 10:57:48 +0100, "Vyacheslav Zholudev" 
<vyacheslav.zholu...@gmail.com> wrote:

> Thanks for your reply, I suspected this. 
>
> I will create a JIRA ticket.
>
> Vyacheslav
> 
> On Feb 21, 2012, at 6:02 PM, Scott Carey wrote:
> 
>> 
>> On 2/21/12 7:29 AM, "Vyacheslav Zholudev" <vyacheslav.zholu...@gmail.com>
>> wrote:
>> 
>>> Yep, I saw that method as well as the Stack Overflow post. However, I'm
>>> interested in how to append to a file on an arbitrary file system, not
>>> only on the local one.
>>> 
>>> I want to get an OutputStream based on the Path and the FileSystem
>>> implementation and then pass it to the Avro methods for appending.
>>> 
>>> Is that possible?
>> 
>> It is not possible without modifying DataFileWriter. Please open a JIRA
>> ticket.  
>> 
>> It could not simply append to an OutputStream, since it must either:
>> * Seek to the start to validate the schemas match and find the sync
>> marker, or
>> * Trust that the schemas match and find the sync marker from the last
>> block
>> 
>> DataFileWriter cannot refer to Hadoop classes such as FileSystem, but we
>> could add something to the mapred module that takes a Path and
>> FileSystem and returns something that implements an interface that
>> DataFileWriter can append to.  This would be something that is both a
>> SeekableInput
>> (http://avro.apache.org/docs/1.6.2/api/java/org/apache/avro/file/SeekableInput.html)
>> and an OutputStream, or has both an InputStream from the start of the
>> existing file and an OutputStream at the end.
>> 
>>> Thanks,
>>> Vyacheslav
>>> 
>>> On Feb 21, 2012, at 5:29 AM, Harsh J wrote:
>>> 
>>>> Hi,
>>>> 
>>>> Use the appendTo feature of the DataFileWriter. See
>>>> 
>>>> http://avro.apache.org/docs/1.6.2/api/java/org/apache/avro/file/DataFileWriter.html#appendTo(java.io.File)
>>>> 
>>>> For a quick setup example, read also:
>>>> 
>>>> http://stackoverflow.com/questions/8806689/can-you-append-data-to-an-existing-avro-data-file
>>>> 
>>>> On Tue, Feb 21, 2012 at 3:15 AM, Vyacheslav Zholudev
>>>> <vyacheslav.zholu...@gmail.com> wrote:
>>>>> Hi,
>>>>> 
>>>>> Is it possible to append to an existing Avro file that was already
>>>>> written and closed?
>>>>> 
>>>>> If I use
>>>>> outputStream = fs.append(avroFilePath);
>>>>> 
>>>>> then later on I get: java.io.IOException: Invalid sync!
>>>>> 
>>>>> Probably because the schema is written twice, among other issues.
>>>>> 
>>>>> If I use outputStream = fs.create(avroFilePath); then the Avro file
>>>>> gets overwritten.
>>>>> 
>>>>> Thanks,
>>>>> Vyacheslav
>>>> 
>>>> -- 
>>>> Harsh J
>>>> Customer Ops. Engineer
>>>> Cloudera | http://tiny.cloudera.com/about
