[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2023-02-28 Thread via GitHub


jiangjiguang commented on PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1449379971

   > > @gszadovszky @wgtmac I have added a new workflow named Vector-plugins. 
Could you run it?
   > 
   > It seems that this PR is closed. Could you please reopen it and see if it 
can run automatically?
   
   @wgtmac Sorry, I probably closed it by mistake. I have reopened it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@parquet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2023-02-28 Thread via GitHub


jiangjiguang commented on PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1449363247

   @gszadovszky @wgtmac I have added a new workflow named Vector-plugins. Could 
you run it?





[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2023-02-28 Thread via GitHub


jiangjiguang commented on PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1449362200

   > @gszadovszky @wgtmac This feature needs the avx512vbmi and avx512_vbmi2 
instruction sets, so it needs GitHub Actions runners with Intel Ice Lake CPUs. 
I do not know how to select such runners, so I have filed a help request 
([actions/runner#2467](https://github.com/actions/runner/issues/2467)).
   
   @gszadovszky  @wgtmac 





[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2023-02-28 Thread via GitHub


jiangjiguang commented on PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1448194127

   @gszadovszky @wgtmac This feature needs the avx512vbmi and avx512_vbmi2 
instruction sets, so it needs GitHub Actions runners with Intel Ice Lake CPUs. 
I do not know how to select such runners, so I have filed a help request 
(https://github.com/actions/runner/issues/2467).
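
   As a rough illustration of the hardware requirement above, a job could check 
for the two instruction sets before running the vector tests. The sketch below 
is not part of the PR; it assumes a Linux runner that exposes CPU flags in 
/proc/cpuinfo, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of parquet-mr.
final class CpuFlagCheck {
    // Flag names as quoted in the comment above; assumed to match the
    // spelling used in the "flags" line of /proc/cpuinfo on Linux.
    static final List<String> REQUIRED = List.of("avx512vbmi", "avx512_vbmi2");

    // Returns true if the first "flags" line of the given cpuinfo text
    // contains every required flag.
    static boolean hasRequiredFlags(String cpuinfo) {
        for (String line : cpuinfo.split("\n")) {
            if (!line.startsWith("flags")) {
                continue;
            }
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue;
            }
            String[] tokens = line.substring(colon + 1).trim().split("\\s+");
            Set<String> flags = new HashSet<>(Arrays.asList(tokens));
            return flags.containsAll(REQUIRED);
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        Path cpuinfo = Path.of("/proc/cpuinfo");
        if (!Files.isReadable(cpuinfo)) {
            System.out.println("Cannot read /proc/cpuinfo; skipping check.");
            return;
        }
        boolean ok = hasRequiredFlags(Files.readString(cpuinfo));
        System.out.println("Required AVX-512 flags present: " + ok);
    }
}
```

   A workflow step could run this first and skip the vector tests when it 
reports false, instead of failing on runners without the required CPU.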
   





[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2023-02-12 Thread via GitHub


jiangjiguang commented on PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1427240074

   > > @wgtmac PTAL again
   > 
   > Generally this patch looks good to me now. Thanks @jiangjiguang for 
working on it!
   > 
   > Could you approve the workflow and take another pass? @gszadovszky 
@shangxinli @ggershinsky
   
   @gszadovszky @shangxinli @ggershinsky Could you take a look at the PR? Thanks.





[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2023-02-06 Thread via GitHub


jiangjiguang commented on PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1418781509

   @wgtmac I added documentation on how big data applications can use the Java 
Vector API.





[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2023-02-01 Thread via GitHub


jiangjiguang commented on PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1413064662

   @wgtmac I understand your concerns:
1. I will keep the PR updated as needed when Java changes.
2. I have written a test to verify the generated code: 
org.apache.parquet.column.values.bitpacking.TestByteBitPacking512VectorLE
3. I have finished TPC-H integration testing with Spark; I could write a 
document describing best practices for running these tests.
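
   For context on item 2, a test of vectorized decoding usually compares its 
output against a plain scalar decoder. The sketch below shows such a scalar 
reference for little-endian bit-packing (values stored starting at the least 
significant bit of each byte). It is an illustration only, not the code in the 
PR or in parquet-mr; the class and method names are hypothetical.

```java
// Hypothetical scalar reference decoder, not part of parquet-mr.
final class ScalarBitUnpack {
    // Unpacks 8 values of `bitWidth` bits each from `in`, where values are
    // packed back to back starting at the least significant bit of in[0].
    static int[] unpack8(byte[] in, int bitWidth) {
        if (bitWidth < 1 || bitWidth > 32) {
            throw new IllegalArgumentException("bitWidth must be in [1, 32]");
        }
        if (in.length < bitWidth) {
            // 8 values * bitWidth bits = bitWidth bytes
            throw new IllegalArgumentException("need at least bitWidth bytes");
        }
        int[] out = new int[8];
        long mask = (1L << bitWidth) - 1;
        for (int i = 0; i < 8; i++) {
            int bitPos = i * bitWidth;
            int byteIdx = bitPos >>> 3; // first byte holding this value
            int shift = bitPos & 7;     // bit offset inside that byte
            long window = 0;
            // A value spans at most 5 bytes (7 offset bits + 32 value bits).
            for (int b = 0; b < 5 && byteIdx + b < in.length; b++) {
                window |= (in[byteIdx + b] & 0xFFL) << (8 * b);
            }
            out[i] = (int) ((window >>> shift) & mask);
        }
        return out;
    }

    public static void main(String[] args) {
        // The values 0..7 at bit width 3 pack into these three bytes.
        byte[] packed = {(byte) 0x88, (byte) 0xC6, (byte) 0xFA};
        System.out.println(java.util.Arrays.toString(unpack8(packed, 3)));
    }
}
```

   A vectorized decoder can then be checked by asserting that, for every bit 
width and a range of inputs, it produces the same array as this reference.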





[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2023-01-23 Thread via GitHub


jiangjiguang commented on PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1400441832

   > Thanks a lot, @gszadovszky, for helping with this PR!
   > 
   > +1 for what @gszadovszky said. The mainstream runtime JDK is still 1.8. 
Parquet is one of the underlying building blocks for many big data 
applications. The bare minimum, for now, is to stay compatible with Java 8. 
Otherwise, forcing applications to upgrade to JDK 17 because of Parquet is 
disruptive and hurts adoption.
   > 
   > @jiangjiguang, I am very happy to see you have this PR to help the Parquet 
community. Would you mind starting an email discussion to 
[dev@parquet.apache.org](mailto:dev@parquet.apache.org) for this topic?
   > 
   > cc @ggershinsky @wgtmac
   
   @gszadovszky @shangxinli @wgtmac I started the discussion about upgrading to 
Java 17 over a month ago, but nobody has joined it, so I have updated the PR so 
that it no longer depends on that upgrade.
   The default build still targets Java 8.
   To use the Java 17 Vector API to speed up Parquet decoding, add the Maven 
build parameters -P java17-target -P vector to get the corresponding jars.





[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2022-12-14 Thread GitBox


jiangjiguang commented on PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1350805103

   @wgtmac I have resubmitted the PR. Could you review it again?





[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2022-12-13 Thread GitBox


jiangjiguang commented on PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1350287549

   > This work looks promising! It would be great if you can add some 
micro-benchmark to parquet-benchmarks.
   
   @wgtmac I have added the micro-benchmark to parquet-benchmarks. This is the 
result:
   
   
![image](https://user-images.githubusercontent.com/12368495/207491959-a1a22134-98fd-45f6-aa08-1934584e0fbb.png)
   





[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2022-11-27 Thread GitBox


jiangjiguang commented on PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1328405659

   > @gszadovszky What I described (supporting Java 17 while staying compatible 
with Java 8) depends on the compile environment. Could you start a discussion 
on how to support Java 17?
   
   





[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2022-11-25 Thread GitBox


jiangjiguang commented on PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1327398085

   @gszadovszky I agree. As far as I know, some projects (such as Hadoop, 
Presto, and Flink) are working towards upgrading to Java 17. Spark already 
supports Java 17, and Trino requires at least Java 17.0.3. So I think Parquet 
should support Java 17 as soon as possible while staying compatible with 
Java 8, because the Java 17 Vector API can bring a 4x to 8x performance gain 
for Parquet encode/decode. The PR uses the Maven profile "-P vector" and code 
generation to support Java 17 while remaining compatible with Java 8.





[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2022-11-24 Thread GitBox


jiangjiguang commented on PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1326934392

   @gszadovszky Could you review the PR? Thanks.





[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2022-11-24 Thread GitBox


jiangjiguang commented on PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1326291164

   @gszadovszky I resubmitted the code. Could you approve the workflow again?





[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2022-11-23 Thread GitBox


jiangjiguang commented on PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1325814649

   @wangyum Could you review the PR? Thanks.





[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2022-11-20 Thread GitBox


jiangjiguang commented on PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1321101130

   @wangyum @gszadovszky This PR always fails to build and I do not know why. 
Is the failure caused by "1 workflow awaiting approval"? Please help me; this 
is my first PR to the parquet-mr community. Thanks!

