[jira] [Commented] (AVRO-2178) avro C++ api support of tail reading of a growing avro file

2018-07-29 Thread Thiruvalluvan M. G. (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561161#comment-16561161
 ] 

Thiruvalluvan M. G. commented on AVRO-2178:
---

It is an interesting use case.

I don't think the Avro data file implementation in any language in this 
project will handle this use case well.

Looking at the C++ implementation specifically, we have an underlying byte 
stream that can be "backed up" a bit, but that is not guaranteed to work. I 
don't think it is going to be easy to support this use case.

If you can take the latency impact, your reader process will have to wait for 
the writer to finish writing the whole file. The easiest way to achieve this 
is to have the writer write into a temporary file and, when done, rename that 
file to the actual destination name. This approach has the side benefit that 
partially written files are never visible to potential consumers.
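
The write-then-rename idea can be sketched as follows. This is a minimal 
illustration using only the standard library, not Avro code: 
publish_atomically is a hypothetical helper, the ".tmp" suffix is an arbitrary 
choice, and the std::ofstream stands in for whatever actually produces the 
Avro data file. It assumes the temporary and final paths are on the same 
filesystem, where a POSIX rename replaces the destination atomically.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: write `payload` to "<final_path>.tmp", then rename it
// to `final_path`. The ofstream is a stand-in for the real Avro writer.
void publish_atomically(const std::string& final_path, const std::string& payload) {
    assert(!final_path.empty());
    // Same directory as the destination, so the rename stays on one filesystem.
    const std::string tmp_path = final_path + ".tmp";

    {
        std::ofstream out(tmp_path, std::ios::binary | std::ios::trunc);
        if (!out) throw std::runtime_error("cannot open " + tmp_path);
        out.write(payload.data(), static_cast<std::streamsize>(payload.size()));
        out.flush();
        if (!out) throw std::runtime_error("write failed: " + tmp_path);
    }  // the stream is closed here, before the rename

    // Readers watching `final_path` never observe a half-written file.
    if (std::rename(tmp_path.c_str(), final_path.c_str()) != 0) {
        throw std::runtime_error("rename failed: " + tmp_path);
    }
}
```

A reader that only ever opens the destination name then sees either no file 
(or the previous one) or the complete new one.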

> avro C++ api support of tail reading of a growing avro file
> ---
>
> Key: AVRO-2178
> URL: https://issues.apache.org/jira/browse/AVRO-2178
> Project: Avro
> Issue Type: Improvement
> Components: c++
> Affects Versions: 1.8.2
> Reporter: peien
> Priority: Major
>
> Two processes, one is writing to an avro data file, another wishes to read 
> the latest written data.
> The problem with current C++ API is that when it reaches the EOF, an 
> exception will be thrown, and from the user perspective, I have no way to 
> retry or 'tail read' it again from the last good position.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AVRO-2178) avro C++ api support of tail reading of a growing avro file

2018-08-20 Thread William Matthews (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16585500#comment-16585500
 ] 

William Matthews commented on AVRO-2178:


I have a pull request tracked in 
https://issues.apache.org/jira/browse/AVRO-2214 to add support for seeking. 
It would give you a bit of a workaround: "read to the end, stat the file until 
it grows, seek to the last known sync marker, read to the end, repeat". Would 
something like that work for you?
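
The polling loop described above might look roughly like this. It is a sketch 
that deliberately avoids the Avro API, since the seek support is still a pull 
request: TailReader is a hypothetical class, raw bytes stand in for Avro 
blocks, and the saved byte offset plays the role of the "last known sync 
marker". It relies on POSIX stat().

```cpp
#include <sys/stat.h>

#include <cstddef>
#include <fstream>
#include <string>
#include <utility>

// Hypothetical helper, not part of the Avro C++ API. It remembers how far the
// file has been consumed; in an Avro reader that position would be the last
// sync marker that was read successfully.
class TailReader {
public:
    explicit TailReader(std::string path) : path_(std::move(path)) {}

    // Returns the bytes appended since the previous call, or an empty string
    // if the file is missing or has not grown.
    std::string poll() {
        struct stat st;
        if (::stat(path_.c_str(), &st) != 0) return {};
        const std::size_t size = static_cast<std::size_t>(st.st_size);
        if (size <= offset_) return {};  // nothing new yet

        std::ifstream in(path_, std::ios::binary);
        if (!in) return {};
        in.seekg(static_cast<std::streamoff>(offset_));  // resume at saved position
        std::string chunk(size - offset_, '\0');
        in.read(&chunk[0], static_cast<std::streamsize>(chunk.size()));
        chunk.resize(static_cast<std::size_t>(in.gcount()));
        offset_ += chunk.size();
        return chunk;
    }

    std::size_t offset() const { return offset_; }

private:
    std::string path_;
    std::size_t offset_ = 0;
};
```

A caller would invoke poll() in a loop and sleep briefly whenever it returns 
nothing. With real Avro data the saved position should only advance past 
complete blocks, so that a block the writer has not finished yet is read again 
on the next pass.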



[jira] [Commented] (AVRO-2178) avro C++ api support of tail reading of a growing avro file

2018-10-03 Thread Thiruvalluvan M. G. (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16637782#comment-16637782
 ] 

Thiruvalluvan M. G. commented on AVRO-2178:
---

A solution to your problem is to use a [Unix named 
pipe|https://en.wikipedia.org/wiki/Named_pipe]. Instead of writing to a 
regular file, create a named pipe and have both the writer and the reader open 
it. This works because reading from a named pipe blocks until more data is 
available or the writer explicitly closes the file (see [this 
documentation|https://www.systutorials.com/docs/linux/man/7-pipe/], for example).
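
The blocking behaviour can be seen in a small POSIX sketch. It does not use 
the Avro API: both function names are hypothetical, a forked child stands in 
for the separate writer process, and plain bytes stand in for Avro data. 
Whether the Avro C++ data file reader works unchanged on a pipe is not 
verified here.

```cpp
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cstddef>
#include <string>

// Reads from the FIFO until every writer has closed its end (read() returns 0).
std::string read_until_writer_closes(const std::string& fifo_path) {
    std::string data;
    const int fd = ::open(fifo_path.c_str(), O_RDONLY);  // blocks until a writer opens
    if (fd < 0) return data;
    char buf[4096];
    ssize_t n;
    while ((n = ::read(fd, buf, sizeof buf)) > 0) {  // blocks while the pipe is empty
        data.append(buf, static_cast<std::size_t>(n));
    }
    ::close(fd);
    return data;
}

// Hypothetical demo: the child process writes `payload`, the parent reads it.
std::string fifo_round_trip(const std::string& fifo_path, const std::string& payload) {
    ::unlink(fifo_path.c_str());
    if (::mkfifo(fifo_path.c_str(), 0600) != 0) return {};

    const pid_t pid = ::fork();
    if (pid < 0) {
        ::unlink(fifo_path.c_str());
        return {};
    }
    if (pid == 0) {  // child: the writer
        const int fd = ::open(fifo_path.c_str(), O_WRONLY);  // blocks until a reader opens
        if (fd < 0) ::_exit(1);
        const char* p = payload.data();
        std::size_t left = payload.size();
        while (left > 0) {
            const ssize_t w = ::write(fd, p, left);
            if (w <= 0) break;
            p += w;
            left -= static_cast<std::size_t>(w);
        }
        ::close(fd);  // closing the write end is what ends the reader's loop
        ::_exit(0);
    }

    std::string received = read_until_writer_closes(fifo_path);  // parent: the reader
    ::waitpid(pid, nullptr, 0);
    ::unlink(fifo_path.c_str());
    return received;
}
```

Note that the reader never sees an end-of-file condition while the writer 
still has the pipe open, which is the property that a regular file lacks.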



[jira] [Commented] (AVRO-2178) avro C++ api support of tail reading of a growing avro file

2018-11-12 Thread Thiruvalluvan M. G. (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16684629#comment-16684629
 ] 

Thiruvalluvan M. G. commented on AVRO-2178:
---

The problem is not specific to our C++ implementation; it will happen with any 
implementation. I will close the issue as 'not a problem' unless someone 
objects.
