[GitHub] [arrow] liyafan82 commented on pull request #8949: ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4

2021-03-22 Thread GitBox
liyafan82 commented on pull request #8949: URL: https://github.com/apache/arrow/pull/8949#issuecomment-804525790 > @liyafan82 @emkornfield Can one of you update https://github.com/apache/arrow/blob/master/docs/source/status.rst#ipc-format once this is all finished? @pitrou I wil

[GitHub] [arrow] liyafan82 commented on pull request #8949: ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4

2021-03-17 Thread GitBox
liyafan82 commented on pull request #8949: URL: https://github.com/apache/arrow/pull/8949#issuecomment-801600759 > If you've already started [ARROW-11899](https://issues.apache.org/jira/browse/ARROW-11899) then I'll let you finish it up, hopefully it isn't too much work. We are discussing

[GitHub] [arrow] liyafan82 commented on pull request #8949: ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4

2021-03-17 Thread GitBox
liyafan82 commented on pull request #8949: URL: https://github.com/apache/arrow/pull/8949#issuecomment-801583143 > +1 thank you. @liyafan82 did you have plans to work on the follow-up items or ZSTD? Otherwise I can take them up. > > @HedgehogCode any thoughts on how to proceed for LZ

[GitHub] [arrow] liyafan82 commented on pull request #8949: ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4

2021-03-17 Thread GitBox
liyafan82 commented on pull request #8949: URL: https://github.com/apache/arrow/pull/8949#issuecomment-800898224 > Please update the docs to match, something like: > "Slice the buffer to contain the uncompressed bytes" Updated. Thank you.

[GitHub] [arrow] liyafan82 commented on pull request #8949: ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4

2021-03-17 Thread GitBox
liyafan82 commented on pull request #8949: URL: https://github.com/apache/arrow/pull/8949#issuecomment-800897864 > With the new enum, maybe we can make this an accessor that returns an enum instead, and then the byte can be extracted from there where necessary? Sounds good. I have

[GitHub] [arrow] liyafan82 commented on pull request #8949: ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4

2021-03-17 Thread GitBox
liyafan82 commented on pull request #8949: URL: https://github.com/apache/arrow/pull/8949#issuecomment-800883260 > Thanks @liyafan82, a few more minor comments. I'd like to see this merged sooner rather than later so we can do the follow-up work. If you don't have bandwidth please let me kno

[GitHub] [arrow] liyafan82 commented on pull request #8949: ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4

2021-03-11 Thread GitBox
liyafan82 commented on pull request #8949: URL: https://github.com/apache/arrow/pull/8949#issuecomment-796674674 @emkornfield I have replied to each of the previous comments, so it may be ready for a new round of review. Thanks.

[GitHub] [arrow] liyafan82 commented on pull request #8949: ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4

2021-03-09 Thread GitBox
liyafan82 commented on pull request #8949: URL: https://github.com/apache/arrow/pull/8949#issuecomment-794829156 > @liyafan82 let me know when you think this is ready for re-review. Like I said, I think getting a baseline working so we can do the follow-up work makes the most sense

[GitHub] [arrow] liyafan82 commented on pull request #8949: ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4

2021-03-09 Thread GitBox
liyafan82 commented on pull request #8949: URL: https://github.com/apache/arrow/pull/8949#issuecomment-794791234 Thanks for the discussions. @HedgehogCode Thanks a lot for the performance data. Do you have any idea about the performance difference? Is it due to the fault of our imple

[GitHub] [arrow] liyafan82 commented on pull request #8949: ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4

2021-03-04 Thread GitBox
liyafan82 commented on pull request #8949: URL: https://github.com/apache/arrow/pull/8949#issuecomment-791193697 > @liyafan82 nice work. Left a few comments about API and structure let me know what you think. @emkornfield Thanks a lot for your comments. I will resolve them one by one

[GitHub] [arrow] liyafan82 commented on pull request #8949: ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4

2021-03-03 Thread GitBox
liyafan82 commented on pull request #8949: URL: https://github.com/apache/arrow/pull/8949#issuecomment-790239103 > Congratulations @liyafan82 ! Do you have an idea how hard it will be to add zstd support? @pitrou Support for zstd should be much easier, as you can see, most of the ef

[GitHub] [arrow] liyafan82 commented on pull request #8949: ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4

2021-03-03 Thread GitBox
liyafan82 commented on pull request #8949: URL: https://github.com/apache/arrow/pull/8949#issuecomment-789684969 The integration tests have passed. Please take another look when you have time, dear reviewers (maybe just review the last three commits). Thanks a lot.

[GitHub] [arrow] liyafan82 commented on pull request #8949: ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4

2021-03-03 Thread GitBox
liyafan82 commented on pull request #8949: URL: https://github.com/apache/arrow/pull/8949#issuecomment-789649806 > I've restarted the integration CI job, it seemed stuck downloading the docker image. @pitrou Thanks for your help.

[GitHub] [arrow] liyafan82 commented on pull request #8949: ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4

2021-03-01 Thread GitBox
liyafan82 commented on pull request #8949: URL: https://github.com/apache/arrow/pull/8949#issuecomment-788520085 > @liyafan82 Thanks for your great work! I have cherry-picked your code into my project to enable LZ4 compression, and I encountered the following two bugs. Looking forward to your

[GitHub] [arrow] liyafan82 commented on pull request #8949: ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4

2021-02-23 Thread GitBox
liyafan82 commented on pull request #8949: URL: https://github.com/apache/arrow/pull/8949#issuecomment-784850125 To avoid the direct dependency on the lz4 library, I have extracted the concrete compression codec implementations to a separate module. Will continue to work on the integration

[GitHub] [arrow] liyafan82 commented on pull request #8949: ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4

2021-02-12 Thread GitBox
liyafan82 commented on pull request #8949: URL: https://github.com/apache/arrow/pull/8949#issuecomment-778084083 > @liyafan82 could you enable the java integration test to confirm that reading the files generated by C++ works before we merge (once we verify it is working I can take a final

[GitHub] [arrow] liyafan82 commented on pull request #8949: ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4

2021-02-05 Thread GitBox
liyafan82 commented on pull request #8949: URL: https://github.com/apache/arrow/pull/8949#issuecomment-773186162 Switched to the commons-compress library, according to @emkornfield's suggestion.

[GitHub] [arrow] liyafan82 commented on pull request #8949: ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4

2021-01-31 Thread GitBox
liyafan82 commented on pull request #8949: URL: https://github.com/apache/arrow/pull/8949#issuecomment-770514765 > @liyafan82 per recent discussion on mailing list. I looked into it and the lz4 page mentioned https://commons.apache.org/proper/commons-compress/javadocs/api-release/org/apach

[GitHub] [arrow] liyafan82 commented on pull request #8949: ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4

2021-01-15 Thread GitBox
liyafan82 commented on pull request #8949: URL: https://github.com/apache/arrow/pull/8949#issuecomment-760767374 > The [comment in the BodyCompression protobuf](https://github.com/apache/arrow/blob/master/format/Message.fbs#L59-L65) states: > > > Each constituent buffer is first com

[GitHub] [arrow] liyafan82 commented on pull request #8949: ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4

2021-01-07 Thread GitBox
liyafan82 commented on pull request #8949: URL: https://github.com/apache/arrow/pull/8949#issuecomment-756003757 > When I use the changes and try to compress and decompress an empty buffer (by using a variable sized vector with only missing values) I get a SIGSEGV ([hs_err_pid10504.log](ht

[GitHub] [arrow] liyafan82 commented on pull request #8949: ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4

2021-01-05 Thread GitBox
liyafan82 commented on pull request #8949: URL: https://github.com/apache/arrow/pull/8949#issuecomment-755045402 > When I use the changes and try to compress and decompress an empty buffer (by using a variable sized vector with only missing values) I get a SIGSEGV ([hs_err_pid10504.log](ht

[GitHub] [arrow] liyafan82 commented on pull request #8949: ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4

2021-01-05 Thread GitBox
liyafan82 commented on pull request #8949: URL: https://github.com/apache/arrow/pull/8949#issuecomment-755044988 > Is it possible to add a test to confirm that this can be read/written from the C++ implementation? @emkornfield I think it is a good idea to provide e2e cross-language