[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization
jiangjiguang commented on PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1449379971

> > @gszadovszky @wgtmac I have added a new workflow named Vector-plugins, can you run it ?
>
> It seems that this PR is closed. Could you please reopen it and see if it can run automatically?

@wgtmac Sorry, I may have closed it by a misclick. I have reopened it.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@parquet.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization
jiangjiguang commented on PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1449363247

@gszadovszky @wgtmac I have added a new workflow named Vector-plugins. Could you run it?
[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization
jiangjiguang commented on PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1449362200

> @gszadovszky @wgtmac This feature needs the avx512vbmi and avx512_vbmi2 instruction sets, so it needs GitHub Actions runners with Intel Ice Lake CPUs. I do not know how to select runners with Intel Ice Lake, so I have filed a request for help ([actions/runner#2467](https://github.com/actions/runner/issues/2467)).

@gszadovszky @wgtmac
[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization
jiangjiguang commented on PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1448194127

@gszadovszky @wgtmac This feature needs the avx512vbmi and avx512_vbmi2 instruction sets, so it needs GitHub Actions runners with Intel Ice Lake CPUs. I do not know how to select runners with Intel Ice Lake, so I have filed a request for help (https://github.com/actions/runner/issues/2467).
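A workflow step could guard the vector tests by checking whether the machine it landed on advertises the two instruction sets named above. The following is only a minimal sketch, assuming a Linux runner where the kernel reports CPU features on the "flags" line of /proc/cpuinfo as `avx512vbmi` and `avx512_vbmi2`. The class `CpuFlagCheck` and its methods are hypothetical names for illustration and are not part of parquet-mr or of the PR.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not part of parquet-mr.
class CpuFlagCheck {
    // Flag names as the Linux kernel spells them in /proc/cpuinfo.
    static final List<String> REQUIRED = List.of("avx512vbmi", "avx512_vbmi2");

    // Returns true if a "flags" line from /proc/cpuinfo lists every required flag.
    static boolean hasRequiredFlags(String flagsLine) {
        int colon = flagsLine.indexOf(':');
        String flags = colon >= 0 ? flagsLine.substring(colon + 1) : flagsLine;
        Set<String> present = new HashSet<>(Arrays.asList(flags.trim().split("\\s+")));
        return present.containsAll(REQUIRED);
    }

    // Reads /proc/cpuinfo and checks the first "flags" line.
    // Returns false when the file is missing (for example, on non-Linux hosts).
    static boolean currentCpuSupported() throws IOException {
        Path cpuinfo = Path.of("/proc/cpuinfo");
        if (!Files.isReadable(cpuinfo)) {
            return false;
        }
        for (String line : Files.readAllLines(cpuinfo)) {
            if (line.startsWith("flags")) {
                return hasRequiredFlags(line);
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("AVX-512 VBMI/VBMI2 available: " + currentCpuSupported());
    }
}
```

A check like this does not select an Ice Lake runner; it only lets a job skip or fail fast when the hardware lacks the required instructions.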
[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization
jiangjiguang commented on PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1427240074

> > @wgtmac PTAL again
>
> Generally this patch looks good to me now. Thanks @jiangjiguang for working on it!
>
> Could you approve the workflow and take another pass? @gszadovszky @shangxinli @ggershinsky

@gszadovszky @shangxinli @ggershinsky Could you take a look at the PR? Thanks.
[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization
jiangjiguang commented on PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1418781509

@wgtmac I have added documentation on how big data applications can use the Java Vector API.
[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization
jiangjiguang commented on PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1413064662

@wgtmac I understand your concerns:

1. I will keep the content of the PR updated as needed when Java changes.
2. I have written a test that verifies the generated code: org.apache.parquet.column.values.bitpacking.TestByteBitPacking512VectorLE
3. I have finished the TPC-H integration testing with Spark. I could write a document describing best practices for running those tests.
[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization
jiangjiguang commented on PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1400441832

> Thank @gszadovszky a lot for helping with this PR!
>
> +1 for what @gszadovszky said. The mainstream runtime JDK is still 1.8. Parquet is one of the underlying building blocks for many big data applications. The bare minimum, for now, is to keep java8 compatible. Otherwise forcing applications to upgrade to jdk17 because of Parquet is disruptive and impacts adoptions.
>
> @jiangjiguang, I am very happy to see you have this PR to help the Parquet community. Would you mind starting an email discussion to [dev@parquet.apache.org](mailto:dev@parquet.apache.org) for this topic?
>
> cc @ggershinsky @wgtmac @shangxinli

@gszadovszky @shangxinli @wgtmac I started the discussion about how to upgrade to Java 17 over a month ago, but nobody has joined it. So I have updated the PR so that it no longer depends on upgrading to Java 17. The default compilation still targets Java 8. Anyone who wants to use the Java 17 Vector API to speed up Parquet decoding can add the Maven build parameters -P java17-target -P vector to get the corresponding jars.
[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization
jiangjiguang commented on PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1350805103

@wgtmac I have resubmitted the PR. Could you review it again?
[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization
jiangjiguang commented on PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1350287549

> This work looks promising! It would be great if you can add some micro-benchmark to parquet-benchmarks.

@wgtmac I have added the micro-benchmark to parquet-benchmarks. This is the result:

![image](https://user-images.githubusercontent.com/12368495/207491959-a1a22134-98fd-45f6-aa08-1934584e0fbb.png)
[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization
jiangjiguang commented on PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1328405659

@gszadovszky What I meant is that whether the build supports Java 17 or stays compatible with Java 8 depends on the compile environment. Could you start a discussion on how to support Java 17?
[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization
jiangjiguang commented on PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1327398085

@gszadovszky I agree. As far as I know, some projects (such as Hadoop, Presto, and Flink) are working towards upgrading to Java 17. Spark already supports Java 17, and Trino requires at least Java 17.0.3. So I think Parquet should support Java 17 as soon as possible while staying compatible with Java 8, because the Java 17 Vector API can bring a 4x to 8x performance gain for Parquet encode/decode. The PR uses the Maven profile "-P vector" and code generation to support Java 17 while remaining compatible with Java 8.
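For readers unfamiliar with the operation being accelerated, the sketch below shows what scalar bit-packing decode does: it extracts fixed-width values from a byte stream, least significant bit first, which is the layout Parquet's bit-packed runs use. It is only an illustration under that assumption. The class `ScalarBitUnpack` is a hypothetical name, and this is neither the code parquet-mr generates nor the vectorized implementation from the PR.

```java
import java.util.Arrays;

// Hypothetical scalar reference for illustration only; not parquet-mr code.
class ScalarBitUnpack {
    // Unpacks `count` values of `bitWidth` bits each (1..32) from `packed`.
    // Values are stored back to back, starting at the least significant bit
    // of the first byte.
    static int[] unpack(byte[] packed, int bitWidth, int count) {
        int[] out = new int[count];
        long mask = (1L << bitWidth) - 1;
        long buffer = 0;      // bits read from the input but not yet consumed
        int bitsInBuffer = 0; // number of valid bits in `buffer`
        int pos = 0;          // next input byte to read
        for (int i = 0; i < count; i++) {
            // Refill one byte at a time until a whole value is available.
            while (bitsInBuffer < bitWidth) {
                buffer |= (packed[pos++] & 0xFFL) << bitsInBuffer;
                bitsInBuffer += 8;
            }
            out[i] = (int) (buffer & mask);
            buffer >>>= bitWidth;
            bitsInBuffer -= bitWidth;
        }
        return out;
    }

    public static void main(String[] args) {
        // The values 0..7 at bit width 3 occupy exactly three bytes.
        byte[] packed = {(byte) 0x88, (byte) 0xC6, (byte) 0xFA};
        System.out.println(Arrays.toString(unpack(packed, 3, 8)));
        // prints [0, 1, 2, 3, 4, 5, 6, 7]
    }
}
```

The loop handles one value per iteration with data-dependent shifts, which is the kind of work a SIMD implementation can spread across many lanes at once.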
[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization
jiangjiguang commented on PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1326934392

@gszadovszky Could you review the PR? Thanks.
[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization
jiangjiguang commented on PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1326291164

@gszadovszky I have resubmitted the code. Could you approve the workflow again?
[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization
jiangjiguang commented on PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1325814649

@wangyum Could you review the PR? Thanks.
[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization
jiangjiguang commented on PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1321101130

@wangyum @gszadovszky This PR always fails to build, and I do not know why. Is the failure caused by "1 workflow awaiting approval"? Please help me. This is my first PR to the parquet-mr community. Thanks!