[jira] [Created] (PARQUET-2363) ParquetRewriter should encrypt the V2 page header

2023-10-13 Thread Xianyang Liu (Jira)
Xianyang Liu created PARQUET-2363: - Summary: ParquetRewriter should encrypt the V2 page header Key: PARQUET-2363 URL: https://issues.apache.org/jira/browse/PARQUET-2363 Project: Parquet Issue

[PR] PARQUET-2363: ParquetRewriter should encrypt the V2 page header [parquet-mr]

2023-10-13 Thread via GitHub
ConeyLiu opened a new pull request, #1169: URL: https://github.com/apache/parquet-mr/pull/1169 Make sure you have checked _all_ steps below. This PR adds support for encrypting the V2 page header when encrypting/re-encrypting files. ### Jira - [ ] My PR addresses the fol

[jira] [Commented] (PARQUET-2363) ParquetRewriter should encrypt the V2 page header

2023-10-13 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774837#comment-17774837 ] ASF GitHub Bot commented on PARQUET-2363: - ConeyLiu opened a new pull request,

Re: [PR] PARQUET-2363: ParquetRewriter should encrypt the V2 page header [parquet-mr]

2023-10-13 Thread via GitHub
ConeyLiu commented on code in PR #1169: URL: https://github.com/apache/parquet-mr/pull/1169#discussion_r1357998012 ## parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetFileWriter.java: ## @@ -784,13 +830,38 @@ public void writeDataPageV2(int rowCount, int nullCount,

[jira] [Commented] (PARQUET-2363) ParquetRewriter should encrypt the V2 page header

2023-10-13 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774838#comment-17774838 ] ASF GitHub Bot commented on PARQUET-2363: - ConeyLiu commented on code in PR #11

Re: Drop parquet-thrift

2023-10-13 Thread Fokko Driesprong
Looking at the history, the last contribution to the module was in Jan 2021. This has been released to the public. My main concern is

[jira] [Commented] (PARQUET-2363) ParquetRewriter should encrypt the V2 page header

2023-10-13 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774840#comment-17774840 ] ASF GitHub Bot commented on PARQUET-2363: - ConeyLiu commented on PR #1169: URL:

Re: [PR] PARQUET-2363: ParquetRewriter should encrypt the V2 page header [parquet-mr]

2023-10-13 Thread via GitHub
ConeyLiu commented on PR #1169: URL: https://github.com/apache/parquet-mr/pull/1169#issuecomment-1761189862 Hi @wgtmac, please help to review this when you are free. Thanks a lot.

Re: [PR] PARQUET-2347: Add interface layer between Parquet and Hadoop Configuration [parquet-mr]

2023-10-13 Thread via GitHub
ConeyLiu commented on code in PR #1141: URL: https://github.com/apache/parquet-mr/pull/1141#discussion_r1358376809 ## parquet-common/src/main/java/org/apache/parquet/conf/ParquetConfiguration.java: ## @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software Foundation (ASF) und

[jira] [Commented] (PARQUET-2347) Add interface layer between Parquet and Hadoop Configuration

2023-10-13 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774936#comment-17774936 ] ASF GitHub Bot commented on PARQUET-2347: - ConeyLiu commented on code in PR #11

Re: [PR] PARQUET-2347: Add interface layer between Parquet and Hadoop Configuration [parquet-mr]

2023-10-13 Thread via GitHub
ConeyLiu commented on code in PR #1141: URL: https://github.com/apache/parquet-mr/pull/1141#discussion_r1358382320 ## parquet-hadoop/src/main/java/org/apache/parquet/ParquetReadOptions.java: ## @@ -19,9 +19,12 @@ package org.apache.parquet; +import org.apache.hadoop.conf.Co

[jira] [Commented] (PARQUET-2347) Add interface layer between Parquet and Hadoop Configuration

2023-10-13 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774937#comment-17774937 ] ASF GitHub Bot commented on PARQUET-2347: - ConeyLiu commented on code in PR #11

Re: [PR] PARQUET-2347: Add interface layer between Parquet and Hadoop Configuration [parquet-mr]

2023-10-13 Thread via GitHub
ConeyLiu commented on code in PR #1141: URL: https://github.com/apache/parquet-mr/pull/1141#discussion_r1358384634 ## parquet-hadoop/src/main/java/org/apache/parquet/ParquetReadOptions.java: ## @@ -185,6 +232,31 @@ public static class Builder { protected int maxAllocationSi

[jira] [Commented] (PARQUET-2347) Add interface layer between Parquet and Hadoop Configuration

2023-10-13 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774940#comment-17774940 ] ASF GitHub Bot commented on PARQUET-2347: - ConeyLiu commented on code in PR #11

Re: [PR] PARQUET-2347: Add interface layer between Parquet and Hadoop Configuration [parquet-mr]

2023-10-13 Thread via GitHub
ConeyLiu commented on code in PR #1141: URL: https://github.com/apache/parquet-mr/pull/1141#discussion_r1358390515 ## parquet-hadoop/src/main/java/org/apache/parquet/hadoop/DirectZstd.java: ## @@ -134,6 +144,10 @@ BytesInput getBytesInput() { } private static BufferPool

[jira] [Commented] (PARQUET-2347) Add interface layer between Parquet and Hadoop Configuration

2023-10-13 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774941#comment-17774941 ] ASF GitHub Bot commented on PARQUET-2347: - ConeyLiu commented on code in PR #11

Re: [PR] PARQUET-2347: Add interface layer between Parquet and Hadoop Configuration [parquet-mr]

2023-10-13 Thread via GitHub
ConeyLiu commented on code in PR #1141: URL: https://github.com/apache/parquet-mr/pull/1141#discussion_r1358392204 ## parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputFormat.java: ## @@ -211,6 +213,19 @@ private static UnboundRecordFilter getUnboundRecordFilte

[jira] [Commented] (PARQUET-2347) Add interface layer between Parquet and Hadoop Configuration

2023-10-13 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774942#comment-17774942 ] ASF GitHub Bot commented on PARQUET-2347: - ConeyLiu commented on code in PR #11

Re: [PR] PARQUET-2347: Add interface layer between Parquet and Hadoop Configuration [parquet-mr]

2023-10-13 Thread via GitHub
ConeyLiu commented on code in PR #1141: URL: https://github.com/apache/parquet-mr/pull/1141#discussion_r1358397563 ## parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetWriter.java: ## @@ -276,15 +279,40 @@ public ParquetWriter(Path file, Configuration conf, WriteSup

[jira] [Commented] (PARQUET-2347) Add interface layer between Parquet and Hadoop Configuration

2023-10-13 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774945#comment-17774945 ] ASF GitHub Bot commented on PARQUET-2347: - ConeyLiu commented on code in PR #11

Re: [PR] PARQUET-2347: Add interface layer between Parquet and Hadoop Configuration [parquet-mr]

2023-10-13 Thread via GitHub
ConeyLiu commented on code in PR #1141: URL: https://github.com/apache/parquet-mr/pull/1141#discussion_r1358398949 ## parquet-hadoop/src/main/java/org/apache/parquet/hadoop/api/ReadSupport.java: ## @@ -75,14 +76,32 @@ public ReadContext init( throw new UnsupportedOperationE

[jira] [Commented] (PARQUET-2347) Add interface layer between Parquet and Hadoop Configuration

2023-10-13 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774949#comment-17774949 ] ASF GitHub Bot commented on PARQUET-2347: - ConeyLiu commented on code in PR #11

Re: [PR] PARQUET-2347: Add interface layer between Parquet and Hadoop Configuration [parquet-mr]

2023-10-13 Thread via GitHub
ConeyLiu commented on code in PR #1141: URL: https://github.com/apache/parquet-mr/pull/1141#discussion_r1358403120 ## parquet-hadoop/src/main/java/org/apache/parquet/hadoop/api/WriteSupport.java: ## @@ -105,6 +106,15 @@ public Map getExtraMetaData() { */ public abstract W

[jira] [Commented] (PARQUET-2347) Add interface layer between Parquet and Hadoop Configuration

2023-10-13 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774950#comment-17774950 ] ASF GitHub Bot commented on PARQUET-2347: - ConeyLiu commented on code in PR #11

Re: [PR] PARQUET-2347: Add interface layer between Parquet and Hadoop Configuration [parquet-mr]

2023-10-13 Thread via GitHub
ConeyLiu commented on code in PR #1141: URL: https://github.com/apache/parquet-mr/pull/1141#discussion_r1358402436 ## parquet-hadoop/src/main/java/org/apache/parquet/hadoop/api/ReadSupport.java: ## @@ -101,6 +120,24 @@ abstract public RecordMaterializer prepareForRead(

[jira] [Commented] (PARQUET-2347) Add interface layer between Parquet and Hadoop Configuration

2023-10-13 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774951#comment-17774951 ] ASF GitHub Bot commented on PARQUET-2347: - ConeyLiu commented on code in PR #11

Re: [PR] PARQUET-2347: Add interface layer between Parquet and Hadoop Configuration [parquet-mr]

2023-10-13 Thread via GitHub
ConeyLiu commented on code in PR #1141: URL: https://github.com/apache/parquet-mr/pull/1141#discussion_r1358406653 ## parquet-avro/src/main/java/org/apache/parquet/avro/AvroReadSupport.java: ## @@ -154,6 +171,10 @@ private static RecordMaterializer newCompatMaterializer( }

[jira] [Commented] (PARQUET-2347) Add interface layer between Parquet and Hadoop Configuration

2023-10-13 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774952#comment-17774952 ] ASF GitHub Bot commented on PARQUET-2347: - ConeyLiu commented on code in PR #11

Re: [PR] PARQUET-2347: Add interface layer between Parquet and Hadoop Configuration [parquet-mr]

2023-10-13 Thread via GitHub
ConeyLiu commented on code in PR #1141: URL: https://github.com/apache/parquet-mr/pull/1141#discussion_r1358407684 ## parquet-avro/src/main/java/org/apache/parquet/avro/AvroWriteSupport.java: ## @@ -405,6 +413,10 @@ private Binary fromAvroString(Object value) { } private

[jira] [Commented] (PARQUET-2347) Add interface layer between Parquet and Hadoop Configuration

2023-10-13 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774953#comment-17774953 ] ASF GitHub Bot commented on PARQUET-2347: - ConeyLiu commented on code in PR #11

Re: [PR] PARQUET-2347: Add interface layer between Parquet and Hadoop Configuration [parquet-mr]

2023-10-13 Thread via GitHub
ConeyLiu commented on code in PR #1141: URL: https://github.com/apache/parquet-mr/pull/1141#discussion_r1358412633 ## parquet-thrift/src/main/java/org/apache/parquet/thrift/ThriftRecordConverter.java: ## @@ -855,6 +857,18 @@ public ThriftRecordConverter(ThriftReader thriftReader

[jira] [Commented] (PARQUET-2347) Add interface layer between Parquet and Hadoop Configuration

2023-10-13 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774955#comment-17774955 ] ASF GitHub Bot commented on PARQUET-2347: - ConeyLiu commented on code in PR #11

Re: [PR] PARQUET-2347: Add interface layer between Parquet and Hadoop Configuration [parquet-mr]

2023-10-13 Thread via GitHub
ConeyLiu commented on PR #1141: URL: https://github.com/apache/parquet-mr/pull/1141#issuecomment-1761710017 @amousavigourabi we have added so many public methods to keep the backward compatibility. So which one is preferred (I think it should be the `ParquetConfiguration`)? Should we depreca

[jira] [Commented] (PARQUET-2347) Add interface layer between Parquet and Hadoop Configuration

2023-10-13 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774958#comment-17774958 ] ASF GitHub Bot commented on PARQUET-2347: - ConeyLiu commented on PR #1141: URL:

Re: [PR] allow read old parquet file which is maked by old api with old avro version which allow wrong default value in schema [parquet-mr]

2023-10-13 Thread via GitHub
ConeyLiu commented on code in PR #1140: URL: https://github.com/apache/parquet-mr/pull/1140#discussion_r1358420480 ## parquet-avro/src/main/java/org/apache/parquet/avro/AvroReadSupport.java: ## @@ -129,10 +129,10 @@ public RecordMaterializer prepareForRead( avroSchema = n

Re: [PR] PARQUET-2347: Add interface layer between Parquet and Hadoop Configuration [parquet-mr]

2023-10-13 Thread via GitHub
amousavigourabi commented on code in PR #1141: URL: https://github.com/apache/parquet-mr/pull/1141#discussion_r1358424942 ## parquet-hadoop/src/main/java/org/apache/parquet/ParquetReadOptions.java: ## @@ -185,6 +232,31 @@ public static class Builder { protected int maxAlloc

[jira] [Commented] (PARQUET-2347) Add interface layer between Parquet and Hadoop Configuration

2023-10-13 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774961#comment-17774961 ] ASF GitHub Bot commented on PARQUET-2347: - amousavigourabi commented on code in

Re: [PR] PARQUET-2347: Add interface layer between Parquet and Hadoop Configuration [parquet-mr]

2023-10-13 Thread via GitHub
amousavigourabi commented on code in PR #1141: URL: https://github.com/apache/parquet-mr/pull/1141#discussion_r1358428311 ## parquet-hadoop/src/main/java/org/apache/parquet/hadoop/api/ReadSupport.java: ## @@ -75,14 +76,32 @@ public ReadContext init( throw new UnsupportedOpe

[jira] [Commented] (PARQUET-2347) Add interface layer between Parquet and Hadoop Configuration

2023-10-13 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774962#comment-17774962 ] ASF GitHub Bot commented on PARQUET-2347: - amousavigourabi commented on code in

Re: [PR] PARQUET-2347: Add interface layer between Parquet and Hadoop Configuration [parquet-mr]

2023-10-13 Thread via GitHub
amousavigourabi commented on code in PR #1141: URL: https://github.com/apache/parquet-mr/pull/1141#discussion_r1358432821 ## parquet-hadoop/src/main/java/org/apache/parquet/hadoop/api/ReadSupport.java: ## @@ -101,6 +120,24 @@ abstract public RecordMaterializer prepareForRead(

[jira] [Commented] (PARQUET-2347) Add interface layer between Parquet and Hadoop Configuration

2023-10-13 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774964#comment-17774964 ] ASF GitHub Bot commented on PARQUET-2347: - amousavigourabi commented on code in

Re: [PR] PARQUET-2347: Add interface layer between Parquet and Hadoop Configuration [parquet-mr]

2023-10-13 Thread via GitHub
amousavigourabi commented on PR #1141: URL: https://github.com/apache/parquet-mr/pull/1141#issuecomment-1761739906 > @amousavigourabi we have added so many public methods to keep the backward compatibility. So which one is preferred (I think it should be the `ParquetConfiguration`)? Should w

[jira] [Commented] (PARQUET-2347) Add interface layer between Parquet and Hadoop Configuration

2023-10-13 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774967#comment-17774967 ] ASF GitHub Bot commented on PARQUET-2347: - amousavigourabi commented on PR #114

[jira] [Commented] (PARQUET-2361) Reduce failure rate of unit test testParquetFileWithBloomFilterWithFpp

2023-10-13 Thread Atour Mousavi Gourabi (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774984#comment-17774984 ] Atour Mousavi Gourabi commented on PARQUET-2361: [~fengjiajie] +1 for t

[VOTE][RESULT] Add Float16 type to specification

2023-10-13 Thread Ben Harkins
With 5 +1 binding votes and 3 +1 non-binding, the vote passes. Thank you to everyone who participated! Votes: - Antoine Pitrou - +1 (non-binding) - Xinli shang - +1 - Ryan Blue - +1 - Gábor Szádovszky - +1 - Micah Kornfield - +1 (non-binding) - Gang Wu - +1 (non-binding) - D

[PR] PARQUET-2361: Reduce failure rate of unit test [parquet-mr]

2023-10-13 Thread via GitHub
fengjiajie opened a new pull request, #1170: URL: https://github.com/apache/parquet-mr/pull/1170 Reduce failure rate of unit test testParquetFileWithBloomFilterWithFpp Change-Id: Ic230f197b0996333a082bb05bd201963d05d862e ``` [INFO] Results: [INFO] Error: Failures:

[jira] [Commented] (PARQUET-2361) Reduce failure rate of unit test testParquetFileWithBloomFilterWithFpp

2023-10-13 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17775112#comment-17775112 ] ASF GitHub Bot commented on PARQUET-2361: - fengjiajie opened a new pull request

[jira] [Commented] (PARQUET-2361) Reduce failure rate of unit test testParquetFileWithBloomFilterWithFpp

2023-10-13 Thread Feng Jiajie (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17775115#comment-17775115 ] Feng Jiajie commented on PARQUET-2361: -- Hi [~amousavigourabi] I found two issues

[jira] [Comment Edited] (PARQUET-2361) Reduce failure rate of unit test testParquetFileWithBloomFilterWithFpp

2023-10-13 Thread Feng Jiajie (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17775115#comment-17775115 ] Feng Jiajie edited comment on PARQUET-2361 at 10/14/23 3:05 AM: -

[jira] [Comment Edited] (PARQUET-2361) Reduce failure rate of unit test testParquetFileWithBloomFilterWithFpp

2023-10-13 Thread Feng Jiajie (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17775115#comment-17775115 ] Feng Jiajie edited comment on PARQUET-2361 at 10/14/23 3:07 AM: -