[jira] [Created] (AVRO-2215) Add Java support to convert to Apache Arrow
Chen Liang created AVRO-2215:

Summary: Add Java support to convert to Apache Arrow
Key: AVRO-2215
URL: https://issues.apache.org/jira/browse/AVRO-2215
Project: Avro
Issue Type: Improvement
Components: java
Reporter: Chen Liang

It would be great if Avro could support converting data to and from the Apache Arrow format. This conversion is already available for the Parquet format; I wonder whether it is doable for Avro and whether there is any plan for it. In our use case, we are mainly targeting conversion in Java.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (AVRO-2178) avro C++ api support of tail reading of a growing avro file
[ https://issues.apache.org/jira/browse/AVRO-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585500#comment-16585500 ]

William Matthews commented on AVRO-2178:

I have a pull request, tracked in https://issues.apache.org/jira/browse/AVRO-2214, that adds support for seeking. It would give you a bit of a workaround: read to the end, stat the file until it grows, seek to the last known sync marker, read to the end, repeat. Would something like that work for you?

> avro C++ api support of tail reading of a growing avro file
> ---
>
> Key: AVRO-2178
> URL: https://issues.apache.org/jira/browse/AVRO-2178
> Project: Avro
> Issue Type: Improvement
> Components: c++
> Affects Versions: 1.8.2
> Reporter: peien
> Priority: Major
>
> Two processes, one is writing to an avro data file, another wishes to read
> the latest written data.
> The problem with current C++ API is that when it reaches the EOF, an
> exception will be thrown, and from the user perspective, I have no way to
> retry or 'tail read' it again from the last good position.
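The workaround described in the comment above (read to the end, stat the file until it grows, seek back to the last known good position, read to the end, repeat) can be sketched in a few lines. This is only an illustration in Python, not the Avro C++ or Java API. It assumes a hypothetical newline-delimited record file in which the newline stands in for Avro's sync marker, and `read_new_records` is a made-up helper name.

```python
import os


def read_new_records(path, offset):
    """Return (records, new_offset) for complete records appended since offset.

    Assumption: records are newline-terminated byte strings. In a real Avro
    file the boundary would be a sync marker, not a newline.
    """
    size = os.stat(path).st_size  # "stat the file" to see whether it has grown
    if size <= offset:
        return [], offset  # nothing new; the caller can sleep and retry

    with open(path, "rb") as f:
        f.seek(offset)  # resume from the last known good position
        data = f.read(size - offset)

    # Consume only up to the last complete record boundary. A trailing
    # partial record is left in place and picked up on a later call.
    end = data.rfind(b"\n")
    if end < 0:
        return [], offset
    return data[:end].split(b"\n"), offset + end + 1
```

A caller would keep the returned offset and invoke the function again after a short sleep. Because the offset only ever advances past a complete record, a half-written record at the end of the file is never consumed, which is the role the sync marker plays in the proposal.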
[jira] [Commented] (AVRO-2214) Support sync and seek in C++ DataFileReader
[ https://issues.apache.org/jira/browse/AVRO-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585494#comment-16585494 ]

ASF GitHub Bot commented on AVRO-2214:
--

wmatthews-google opened a new pull request #328: AVRO-2214 Support sync and seek in C++ DataFileReader
URL: https://github.com/apache/avro/pull/328

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Support sync and seek in C++ DataFileReader
> ---
>
> Key: AVRO-2214
> URL: https://issues.apache.org/jira/browse/AVRO-2214
> Project: Avro
> Issue Type: Improvement
> Components: c++
> Affects Versions: 1.8.2
> Reporter: William Matthews
> Priority: Minor
>
> Java DataFileReader supports sync, seek, pastSync, etc. which allow parallel
> reads of files, and reasonably efficient "tailing" of files. It would be
> great if these were supported in C++ too.
> Also, I think this would serve as a bit of a workaround for
> https://issues.apache.org/jira/browse/AVRO-2178 (stat a file & see if it has
> grown, sync/seek, read, repeat).
[jira] [Created] (AVRO-2214) Support sync and seek in C++ DataFileReader
William Matthews created AVRO-2214:
--

Summary: Support sync and seek in C++ DataFileReader
Key: AVRO-2214
URL: https://issues.apache.org/jira/browse/AVRO-2214
Project: Avro
Issue Type: Improvement
Components: c++
Affects Versions: 1.8.2
Reporter: William Matthews

Java DataFileReader supports sync, seek, pastSync, etc. which allow parallel reads of files, and reasonably efficient "tailing" of files. It would be great if these were supported in C++ too.

Also, I think this would serve as a bit of a workaround for https://issues.apache.org/jira/browse/AVRO-2178 (stat a file & see if it has grown, sync/seek, read, repeat).
[jira] [Commented] (AVRO-2213) C++ tests fail with boost 1.67+
[ https://issues.apache.org/jira/browse/AVRO-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585476#comment-16585476 ]

ASF GitHub Bot commented on AVRO-2213:
--

wmatthews-google opened a new pull request #327: AVRO-2213 C++ tests fail with boost 1.67+
URL: https://github.com/apache/avro/pull/327

> C++ tests fail with boost 1.67+
> ---
>
> Key: AVRO-2213
> URL: https://issues.apache.org/jira/browse/AVRO-2213
> Project: Avro
> Issue Type: Bug
> Components: c++
> Affects Versions: 1.8.2
> Reporter: William Matthews
> Priority: Minor
>
> Boost 1.67 and later return an error when multiple tests are added with
> the same name
> (https://www.boost.org/doc/libs/1_67_0/libs/test/doc/html/boost_test/change_log.html).
> This fails for 3 tests:
>
> CodecTests.cc:
> Test setup error: boost::unit_test::framework::setup_error: test unit with
> name 'testCodecResolving2_0' registered multiple times
>
> DataFileTests.cc:
> Test setup error: boost::unit_test::framework::setup_error: test unit with
> name 'DataFileTest__testReadFull' registered multiple times
>
> unittest.cc:
> Test setup error: boost::unit_test::framework::setup_error: test unit with
> name 'T__test' registered multiple times
[jira] [Created] (AVRO-2213) C++ tests fail with boost 1.67+
William Matthews created AVRO-2213:
--

Summary: C++ tests fail with boost 1.67+
Key: AVRO-2213
URL: https://issues.apache.org/jira/browse/AVRO-2213
Project: Avro
Issue Type: Bug
Components: c++
Affects Versions: 1.8.2
Reporter: William Matthews

Boost 1.67 and later return an error when multiple tests are added with the same name (https://www.boost.org/doc/libs/1_67_0/libs/test/doc/html/boost_test/change_log.html). This fails for 3 tests:

CodecTests.cc:
Test setup error: boost::unit_test::framework::setup_error: test unit with name 'testCodecResolving2_0' registered multiple times

DataFileTests.cc:
Test setup error: boost::unit_test::framework::setup_error: test unit with name 'DataFileTest__testReadFull' registered multiple times

unittest.cc:
Test setup error: boost::unit_test::framework::setup_error: test unit with name 'T__test' registered multiple times