[jira] [Commented] (PARQUET-2159) Parquet bit-packing de/encode optimization

2023-02-28 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694877#comment-17694877 ] ASF GitHub Bot commented on PARQUET-2159: - jiangjiguang commented on code in PR

[GitHub] [parquet-mr] jiangjiguang commented on a diff in pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2023-02-28 Thread via GitHub
jiangjiguang commented on code in PR #1011: URL: https://github.com/apache/parquet-mr/pull/1011#discussion_r1121226677 ## .github/workflows/vector-plugins.yml: ## @@ -0,0 +1,56 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreem

[jira] [Commented] (PARQUET-2159) Parquet bit-packing de/encode optimization

2023-02-28 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694876#comment-17694876 ] ASF GitHub Bot commented on PARQUET-2159: - jiangjiguang commented on code in PR

[GitHub] [parquet-mr] jiangjiguang commented on a diff in pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2023-02-28 Thread via GitHub
jiangjiguang commented on code in PR #1011: URL: https://github.com/apache/parquet-mr/pull/1011#discussion_r1121226362 ## .github/workflows/vector-plugins.yml: ## @@ -0,0 +1,56 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreem

[jira] [Commented] (PARQUET-2159) Parquet bit-packing de/encode optimization

2023-02-28 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694874#comment-17694874 ] ASF GitHub Bot commented on PARQUET-2159: - wgtmac commented on code in PR #1011

[GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2023-02-28 Thread via GitHub
wgtmac commented on code in PR #1011: URL: https://github.com/apache/parquet-mr/pull/1011#discussion_r1121218282 ## .github/workflows/vector-plugins.yml: ## @@ -0,0 +1,56 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements.

[GitHub] [parquet-mr] shangxinli merged pull request #1037: Add Gang Wu as committer

2023-02-28 Thread via GitHub
shangxinli merged PR #1037: URL: https://github.com/apache/parquet-mr/pull/1037 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@parquet.

[jira] [Commented] (PARQUET-2159) Parquet bit-packing de/encode optimization

2023-02-28 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694856#comment-17694856 ] ASF GitHub Bot commented on PARQUET-2159: - wgtmac commented on PR #1011: URL: h

[GitHub] [parquet-mr] wgtmac commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2023-02-28 Thread via GitHub
wgtmac commented on PR #1011: URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1449381064 > @wgtmac sorry, may be my wrong click, I have reopened it NP. Seems it is running. https://github.com/apache/parquet-mr/actions/runs/4300429406/jobs/7496638049

[jira] [Commented] (PARQUET-2159) Parquet bit-packing de/encode optimization

2023-02-28 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694855#comment-17694855 ] ASF GitHub Bot commented on PARQUET-2159: - jiangjiguang commented on PR #1011:

[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2023-02-28 Thread via GitHub
jiangjiguang commented on PR #1011: URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1449379971 > > @gszadovszky @wgtmac I have added a new workflow named Vector-plugins, can you run it? > > It seems that this PR is closed. Could you please reopen it and see if it ca

[jira] [Commented] (PARQUET-2159) Parquet bit-packing de/encode optimization

2023-02-28 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694853#comment-17694853 ] ASF GitHub Bot commented on PARQUET-2159: - jiangjiguang opened a new pull reque

[jira] [Commented] (PARQUET-2159) Parquet bit-packing de/encode optimization

2023-02-28 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694852#comment-17694852 ] ASF GitHub Bot commented on PARQUET-2159: - wgtmac commented on PR #1011: URL: h

[GitHub] [parquet-mr] wgtmac commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2023-02-28 Thread via GitHub
wgtmac commented on PR #1011: URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1449365344 > @gszadovszky @wgtmac I have added a new workflow named Vector-plugins, can you run it? It seems that this PR is closed. Could you please reopen it and see if it can run automa

[jira] [Commented] (PARQUET-2230) Add a new rewrite command powered by ParquetRewriter

2023-02-28 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694851#comment-17694851 ] ASF GitHub Bot commented on PARQUET-2230: - wgtmac commented on PR #1036: URL: h

[GitHub] [parquet-mr] wgtmac commented on pull request #1036: PARQUET-2230: [CLI] Deprecate commands replaced by rewrite

2023-02-28 Thread via GitHub
wgtmac commented on PR #1036: URL: https://github.com/apache/parquet-mr/pull/1036#issuecomment-1449364300 > @wgtmac, the interface of parquet-cli is not the code but the commands. We are excluding parquet-cli from compatibility checks anyway, so you can modify the code as you wish (no depre

[jira] [Commented] (PARQUET-2159) Parquet bit-packing de/encode optimization

2023-02-28 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694850#comment-17694850 ] ASF GitHub Bot commented on PARQUET-2159: - jiangjiguang commented on PR #1011:

[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2023-02-28 Thread via GitHub
jiangjiguang commented on PR #1011: URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1449363247 @gszadovszky @wgtmac I have added a new workflow named Vector-plugins, can you run it?

[jira] [Commented] (PARQUET-2159) Parquet bit-packing de/encode optimization

2023-02-28 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694849#comment-17694849 ] ASF GitHub Bot commented on PARQUET-2159: - jiangjiguang closed pull request #10

[jira] [Commented] (PARQUET-2159) Parquet bit-packing de/encode optimization

2023-02-28 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694848#comment-17694848 ] ASF GitHub Bot commented on PARQUET-2159: - jiangjiguang commented on PR #1011:

[GitHub] [parquet-mr] jiangjiguang closed pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2023-02-28 Thread via GitHub
jiangjiguang closed pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization URL: https://github.com/apache/parquet-mr/pull/1011

[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2023-02-28 Thread via GitHub
jiangjiguang commented on PR #1011: URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1449362200 > > > > @gszadovszky @wgtmac This feature needs the avx512vbmi and avx512_vbmi2 instruction sets, so it needs GitHub Actions runners with Intel Ice Lake CPUs. I do not know how to sel

[GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1037: Add Gang Wu as committer

2023-02-28 Thread via GitHub
wgtmac commented on code in PR #1037: URL: https://github.com/apache/parquet-mr/pull/1037#discussion_r1121134966 ## dev/COMMITTERS.md: ## @@ -50,7 +50,7 @@ Below is more information about each committer (in alphabetical order). If this | Wes McKinney | wesm

[GitHub] [parquet-mr] shangxinli commented on pull request #1037: Add Gang Wu as committer

2023-02-28 Thread via GitHub
shangxinli commented on PR #1037: URL: https://github.com/apache/parquet-mr/pull/1037#issuecomment-1449328980 @wgtmac check your IDs

[GitHub] [parquet-mr] shangxinli opened a new pull request, #1037: Add Gang Wu as committer

2023-02-28 Thread via GitHub
shangxinli opened a new pull request, #1037: URL: https://github.com/apache/parquet-mr/pull/1037 Make sure you have checked _all_ steps below. ### Jira - [ ] My PR addresses the following [Parquet Jira](https://issues.apache.org/jira/browse/PARQUET/) issues and references them

Re: Parquet Null logical type question

2023-02-28 Thread Micah Kornfield
It is a validation bug that you can read and write values to the column. My understanding is that the use case for the type comes from more loosely typed systems that infer schemas on the fly and then write Parquet. In these systems if a column contains all Null values then the actual type c

Re: Fallback Encoding for Very Sparse or Sorted Datasets

2023-02-28 Thread Gang Wu
Hi Patrick, Thanks for reporting the issue! Let me try to answer your question briefly. 1. In your case, the good data is dictionary-encoded [1] and the size of the dictionary is 1. 2. The RLE encoding [2] you have observed from the good data applies only to the indices after dictionary encodin
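The interaction described in this reply, a one-entry dictionary whose indices are then run-length encoded, can be illustrated with a small conceptual sketch. This is not parquet-mr code and does not reproduce Parquet's actual RLE/bit-packing hybrid byte layout; the function names dictionary_encode and run_length_encode are hypothetical and exist only for this illustration.

```python
from itertools import groupby


def dictionary_encode(values):
    """Return (dictionary, indices).

    The dictionary holds each distinct value once, in first-seen order;
    indices holds one dictionary position per input value.
    """
    dictionary = []
    positions = {}
    indices = []
    for value in values:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        indices.append(positions[value])
    return dictionary, indices


def run_length_encode(indices):
    """Collapse consecutive equal indices into (index, run_length) pairs."""
    return [(index, sum(1 for _ in run)) for index, run in groupby(indices)]


# A column in which every row holds the same value (illustrative data only).
column = ["same"] * 1000
dictionary, indices = dictionary_encode(column)
runs = run_length_encode(indices)
```

Under these assumptions the dictionary has a single entry and the 1000 indices collapse into one run, which is why such a column can look as if it were "RLE-encoded" even though the run-length step only ever sees dictionary indices, never the raw values.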

Parquet Null logical type question

2023-02-28 Thread Jerry Adair
Hi, I am just learning of the Parquet Null logical type. I've read the documentation, as well as the brief inline commentary in the types header. That states that the Null logical type can annotate any primitive type. What I find confusing is that if I create a Parquet table with a primitive

Re: [IANA #1264350] application/vnd.apache.parquet registration request

2023-02-28 Thread Scott Lutwyche
Hi Amanda, Yes please keep it open I'm reaching out to the apache support team and a colleague to see if there is interest in providing some help to get the required information to you. Especially around the interoperability considerations. The feedback from Darell has outlined pretty clearly wha

[IANA #1264350] application/vnd.apache.parquet registration request

2023-02-28 Thread Amanda Baber via RT
Will do. I'll check back in a few weeks if we haven't heard from you. Thanks! Amanda On Mon Feb 27 23:23:01 2023, scott.lutwy...@des.qld.gov.au wrote: > Hi Amanda, > Yes please keep it open > I'm reaching out to the apache support team and a colleague to see if > there is interest in providing so

Fallback Encoding for Very Sparse or Sorted Datasets

2023-02-28 Thread Patrick Hansert
Hello everyone! First of all, I hope I'm in the right place. The contribution guidelines directed me here after I discovered the registration in the Jira tracker is closed. I'm a Ph.D. student at RPTU Kaiserslautern-Landau, and my current research revolves around sorting-based improvements to

[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2023-02-28 Thread via GitHub
jiangjiguang commented on PR #1011: URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1448194127 > @gszadovszky @wgtmac This feature needs the avx512vbmi and avx512_vbmi2 instruction sets, so it needs GitHub Actions runners with Intel Ice Lake CPUs. I do not know how to select runne
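One way a CI job could guard against running on hardware without these instruction sets is to inspect the CPU flags before starting the vector tests. The sketch below is only an illustration and is not taken from the PR: it assumes a Linux runner that exposes /proc/cpuinfo and that the kernel spells the flags exactly as in the comment (avx512vbmi, avx512_vbmi2). The names REQUIRED_FLAGS, missing_cpu_flags and read_cpuinfo are hypothetical.

```python
# Flags named in the comment above; assumed to match the Linux cpuinfo spelling.
REQUIRED_FLAGS = {"avx512vbmi", "avx512_vbmi2"}


def missing_cpu_flags(cpuinfo_text, required=REQUIRED_FLAGS):
    """Return the required flags that do not appear in any 'flags' line, sorted."""
    present = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            present.update(value.split())
    return sorted(set(required) - present)


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the kernel's CPU description (Linux only)."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()


# Example use inside a job step (not executed here):
#     missing = missing_cpu_flags(read_cpuinfo())
#     if missing:
#         raise SystemExit("skipping vector tests, missing: " + ", ".join(missing))
```

A check like this does not select a suitable runner; it only lets a job skip or fail early when the runner it was given lacks the required instructions.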

[jira] [Commented] (PARQUET-2159) Parquet bit-packing de/encode optimization

2023-02-28 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694545#comment-17694545 ] ASF GitHub Bot commented on PARQUET-2159: - jiangjiguang commented on PR #1011:

[jira] [Commented] (PARQUET-2159) Parquet bit-packing de/encode optimization

2023-02-28 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694542#comment-17694542 ] ASF GitHub Bot commented on PARQUET-2159: - jiangjiguang commented on code in PR

[GitHub] [parquet-mr] jiangjiguang commented on a diff in pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2023-02-28 Thread via GitHub
jiangjiguang commented on code in PR #1011: URL: https://github.com/apache/parquet-mr/pull/1011#discussion_r1120068917 ## README.md: ## @@ -83,6 +83,20 @@ Parquet is a very active project, and new features are being added quickly. Here * Column stats * Delta encoding * Index

[jira] [Commented] (PARQUET-2103) crypto exception in print toPrettyJSON

2023-02-28 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694441#comment-17694441 ] ASF GitHub Bot commented on PARQUET-2103: - ggershinsky merged PR #1019: URL: ht

[GitHub] [parquet-mr] ggershinsky merged pull request #1019: PARQUET-2103: Fix crypto exception in print toPrettyJSON

2023-02-28 Thread via GitHub
ggershinsky merged PR #1019: URL: https://github.com/apache/parquet-mr/pull/1019