[jira] [Updated] (PARQUET-1006) ColumnChunkPageWriter uses only heap memory.

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-1006: - Fix Version/s: (was: 1.13.0) > ColumnChunkPageWriter uses only heap memory. >

[jira] [Assigned] (PARQUET-2081) Encryption translation tool - Parquet-hadoop

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu reassigned PARQUET-2081: Assignee: Xinli Shang > Encryption translation tool - Parquet-hadoop >

[jira] [Resolved] (PARQUET-2081) Encryption translation tool - Parquet-hadoop

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2081. -- Resolution: Fixed > Encryption translation tool - Parquet-hadoop >

[jira] [Updated] (PARQUET-2212) Add ByteBuffer api for decryptors to allow direct memory to be decrypted

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2212: - Fix Version/s: (was: 1.12.3) > Add ByteBuffer api for decryptors to allow direct memory to be

[jira] [Resolved] (PARQUET-2127) Security risk in latest parquet-jackson-1.12.2.jar

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2127. -- Resolution: Fixed > Security risk in latest parquet-jackson-1.12.2.jar >

[jira] [Resolved] (PARQUET-2076) Improve Travis CI build Performance

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2076. -- Resolution: Fixed > Improve Travis CI build Performance > --- > >

[jira] [Updated] (PARQUET-2134) Incorrect type checking in HadoopStreams.wrap

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2134: - Fix Version/s: 1.13.0 > Incorrect type checking in HadoopStreams.wrap >

[jira] [Updated] (PARQUET-2154) ParquetFileReader should close its input stream when `filterRowGroups` throw Exception in constructor

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2154: - Fix Version/s: 1.13.0 > ParquetFileReader should close its input stream when `filterRowGroups` throw

[jira] [Updated] (PARQUET-2192) Add Java 17 build test to GitHub action

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2192: - Fix Version/s: 1.13.0 > Add Java 17 build test to GitHub action >

[jira] [Updated] (PARQUET-2167) CLI show footer command fails if Parquet file contains date fields

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2167: - Fix Version/s: 1.13.0 > CLI show footer command fails if Parquet file contains date fields >

[jira] [Updated] (PARQUET-2161) Row positions are computed incorrectly when range or offset metadata filter is used

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2161: - Fix Version/s: 1.13.0 > Row positions are computed incorrectly when range or offset metadata filter >

[jira] [Updated] (PARQUET-2160) Close decompression stream to free off-heap memory in time

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2160: - Fix Version/s: 1.13.0 > Close decompression stream to free off-heap memory in time >

[jira] [Updated] (PARQUET-2185) ParquetReader constructed using builder fails to read encrypted files

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2185: - Fix Version/s: 1.13.0 > ParquetReader constructed using builder fails to read encrypted files >

[jira] [Updated] (PARQUET-2198) Vulnerabilities in jackson-databind

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2198: - Fix Version/s: 1.13.0 > Vulnerabilities in jackson-databind > --- > >

[jira] [Updated] (PARQUET-2142) parquet-cli without hadoop throws java.lang.NoSuchMethodError on any parquet file access command

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2142: - Fix Version/s: 1.13.0 > parquet-cli without hadoop throws java.lang.NoSuchMethodError on any parquet

[jira] [Updated] (PARQUET-2219) ParquetFileReader throws a runtime exception when a file contains only headers and now row data

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2219: - Fix Version/s: 1.13.0 > ParquetFileReader throws a runtime exception when a file contains only >

[jira] [Updated] (PARQUET-1711) [parquet-protobuf] stack overflow when work with well known json type

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-1711: - Fix Version/s: 1.13.0 > [parquet-protobuf] stack overflow when work with well known json type >

[jira] [Updated] (PARQUET-2177) Fix parquet-cli not to fail showing descriptions

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2177: - Fix Version/s: 1.13.0 > Fix parquet-cli not to fail showing descriptions >

[jira] [Updated] (PARQUET-2244) Dictionary filter may skip row-groups incorrectly when evaluating notIn

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2244: - Fix Version/s: 1.13.0 > Dictionary filter may skip row-groups incorrectly when evaluating notIn >

[jira] [Updated] (PARQUET-2247) Fail-fast if CapacityByteArrayOutputStream write overflow

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2247: - Fix Version/s: 1.13.0 > Fail-fast if CapacityByteArrayOutputStream write overflow >

[jira] [Updated] (PARQUET-2241) ByteStreamSplitDecoder broken in presence of nulls

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2241: - Fix Version/s: 1.13.0 > ByteStreamSplitDecoder broken in presence of nulls >

[jira] [Updated] (PARQUET-2251) Avoid generating Bloomfilter when all pages of a column are encoded by dictionary

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2251: - Fix Version/s: 1.13.0 > Avoid generating Bloomfilter when all pages of a column are encoded by >

[jira] [Updated] (PARQUET-2157) Add BloomFilter fpp config

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2157: - Fix Version/s: 1.13.0 > Add BloomFilter fpp config > -- > >

[jira] [Updated] (PARQUET-2164) CapacityByteArrayOutputStream overflow while writing causes negative row group sizes to be written

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2164: - Fix Version/s: 1.13.0 > CapacityByteArrayOutputStream overflow while writing causes negative row >

[jira] [Updated] (PARQUET-2202) Redundant String allocation on the hot path in CapacityByteArrayOutputStream.setByte

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2202: - Fix Version/s: 1.13.0 > Redundant String allocation on the hot path in >

[jira] [Updated] (PARQUET-2243) Support zstd-jni in DirectCodecFactory

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2243: - Fix Version/s: 1.13.0 > Support zstd-jni in DirectCodecFactory >

[jira] [Updated] (PARQUET-2103) crypto exception in print toPrettyJSON

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2103: - Fix Version/s: 1.13.0 > crypto exception in print toPrettyJSON >

[jira] [Updated] (PARQUET-2138) Add ShowBloomFilterCommand to parquet-cli

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2138: - Fix Version/s: 1.13.0 > Add ShowBloomFilterCommand to parquet-cli >

[jira] [Updated] (PARQUET-2191) Upgrade Scala to 2.12.17

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2191: - Fix Version/s: 1.13.0 > Upgrade Scala to 2.12.17 > > > Key:

[jira] [Updated] (PARQUET-2169) Upgrade Avro to version 1.11.1

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2169: - Fix Version/s: 1.13.0 > Upgrade Avro to version 1.11.1 > -- > >

[jira] [Updated] (PARQUET-2196) Support LZ4_RAW codec

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2196: - Fix Version/s: 1.13.0 > Support LZ4_RAW codec > - > > Key:

[jira] [Assigned] (PARQUET-2191) Upgrade Scala to 2.12.17

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu reassigned PARQUET-2191: Assignee: Yuming Wang > Upgrade Scala to 2.12.17 > > >

[jira] [Updated] (PARQUET-2197) Document uniform encryption

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2197: - Fix Version/s: 1.13.0 > Document uniform encryption > --- > >

[jira] [Updated] (PARQUET-2155) Upgrade protobuf version to 3.17.3

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2155: - Fix Version/s: 1.13.0 > Upgrade protobuf version to 3.17.3 > -- > >

[jira] [Updated] (PARQUET-2195) Add scan command to parquet-cli

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2195: - Fix Version/s: 1.13.0 > Add scan command to parquet-cli > --- > >

[jira] [Updated] (PARQUET-2224) Publish SBOM artifacts

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2224: - Fix Version/s: 1.13.0 > Publish SBOM artifacts > -- > > Key:

[jira] [Updated] (PARQUET-2176) Parquet writers should allow for configurable index/statistics truncation

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2176: - Fix Version/s: 1.13.0 > Parquet writers should allow for configurable index/statistics truncation >

[jira] [Updated] (PARQUET-2208) Add details to nested column encryption config doc and exception text

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2208: - Fix Version/s: 1.13.0 > Add details to nested column encryption config doc and exception text >

[jira] [Updated] (PARQUET-2246) Add short circuit logic to column index filter

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2246: - Fix Version/s: 1.13.0 > Add short circuit logic to column index filter >

[jira] [Updated] (PARQUET-2227) Refactor different file rewriters to use single implementation

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2227: - Fix Version/s: 1.13.0 > Refactor different file rewriters to use single implementation >

[jira] [Updated] (PARQUET-2226) Support merge Bloom Filter

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2226: - Fix Version/s: 1.13.0 > Support merge Bloom Filter > -- > >

[jira] [Updated] (PARQUET-2258) Storing toString fields in FilterPredicate instances can lead to memory pressure

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2258: - Fix Version/s: 1.13.0 (was: 1.12.3) > Storing toString fields in

[jira] [Resolved] (PARQUET-2075) Unified Rewriter Tool

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2075. -- Fix Version/s: 1.13.0 Resolution: Fixed > Unified Rewriter Tool > --- >

[jira] [Updated] (PARQUET-2159) Parquet bit-packing de/encode optimization

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2159: - Fix Version/s: 1.13.0 > Parquet bit-packing de/encode optimization >

[jira] [Updated] (PARQUET-2252) Make some methods public to allow external projects to implement page skipping

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2252: - Fix Version/s: 1.13.0 > Make some methods public to allow external projects to implement page skipping

[jira] [Updated] (PARQUET-2229) ParquetRewriter supports masking and encrypting the same column

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2229: - Fix Version/s: 1.13.0 > ParquetRewriter supports masking and encrypting the same column >

[jira] [Updated] (PARQUET-2228) ParquetRewriter supports more than one input file

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2228: - Fix Version/s: 1.13.0 > ParquetRewriter supports more than one input file >

[jira] [Updated] (PARQUET-2230) Add a new rewrite command powered by ParquetRewriter

2023-04-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2230: - Fix Version/s: 1.13.0 > Add a new rewrite command powered by ParquetRewriter >

[jira] [Commented] (PARQUET-2224) Publish SBOM artifacts

2023-03-28 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17706171#comment-17706171 ] Gang Wu commented on PARQUET-2224: -- Thanks [~dongjoon] for the detail! > Publish SBOM artifacts >

[jira] [Commented] (PARQUET-2224) Publish SBOM artifacts

2023-03-27 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17705396#comment-17705396 ] Gang Wu commented on PARQUET-2224: -- [~ste...@apache.org] Do you have relevant JIRAs? We are releasing

[jira] [Created] (PARQUET-2262) Fix local build failure from maven-surefire-plugin due to missing surefire.argLine

2023-03-26 Thread Gang Wu (Jira)
Gang Wu created PARQUET-2262: Summary: Fix local build failure from maven-surefire-plugin due to missing surefire.argLine Key: PARQUET-2262 URL: https://issues.apache.org/jira/browse/PARQUET-2262

[jira] [Commented] (PARQUET-2224) Publish SBOM artifacts

2023-03-26 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17705145#comment-17705145 ] Gang Wu commented on PARQUET-2224: -- Thanks for reminding me. I have assigned it to you. [~dongjoon]

[jira] [Assigned] (PARQUET-2224) Publish SBOM artifacts

2023-03-26 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu reassigned PARQUET-2224: Assignee: Dongjoon Hyun > Publish SBOM artifacts > -- > >

[jira] [Resolved] (PARQUET-2154) ParquetFileReader should close its input stream when `filterRowGroups` throw Exception in constructor

2023-03-26 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2154. -- Resolution: Fixed > ParquetFileReader should close its input stream when `filterRowGroups` throw >

[jira] [Resolved] (PARQUET-2161) Row positions are computed incorrectly when range or offset metadata filter is used

2023-03-26 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2161. -- Resolution: Fixed > Row positions are computed incorrectly when range or offset metadata filter >

[jira] [Resolved] (PARQUET-2138) Add ShowBloomFilterCommand to parquet-cli

2023-03-26 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2138. -- Resolution: Fixed > Add ShowBloomFilterCommand to parquet-cli >

[jira] [Resolved] (PARQUET-2134) Incorrect type checking in HadoopStreams.wrap

2023-03-26 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2134. -- Resolution: Fixed > Incorrect type checking in HadoopStreams.wrap >

[jira] [Resolved] (PARQUET-2155) Upgrade protobuf version to 3.17.3

2023-03-26 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2155. -- Assignee: Chao Sun Resolution: Fixed > Upgrade protobuf version to 3.17.3 >

[jira] [Resolved] (PARQUET-2167) CLI show footer command fails if Parquet file contains date fields

2023-03-26 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2167. -- Resolution: Fixed > CLI show footer command fails if Parquet file contains date fields >

[jira] [Resolved] (PARQUET-2169) Upgrade Avro to version 1.11.1

2023-03-26 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2169. -- Resolution: Fixed > Upgrade Avro to version 1.11.1 > -- > >

[jira] [Resolved] (PARQUET-2191) Upgrade Scala to 2.12.17

2023-03-26 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2191. -- Resolution: Fixed > Upgrade Scala to 2.12.17 > > > Key:

[jira] [Resolved] (PARQUET-2192) Add Java 17 build test to GitHub action

2023-03-26 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2192. -- Resolution: Fixed > Add Java 17 build test to GitHub action >

[jira] [Resolved] (PARQUET-2185) ParquetReader constructed using builder fails to read encrypted files

2023-03-26 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2185. -- Resolution: Fixed > ParquetReader constructed using builder fails to read encrypted files >

[jira] [Resolved] (PARQUET-2197) Document uniform encryption

2023-03-26 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2197. -- Resolution: Fixed > Document uniform encryption > --- > >

[jira] [Resolved] (PARQUET-2176) Parquet writers should allow for configurable index/statistics truncation

2023-03-26 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2176. -- Resolution: Fixed > Parquet writers should allow for configurable index/statistics truncation >

[jira] [Resolved] (PARQUET-1711) [parquet-protobuf] stack overflow when work with well known json type

2023-03-26 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-1711. -- Resolution: Fixed > [parquet-protobuf] stack overflow when work with well known json type >

[jira] [Assigned] (PARQUET-2195) Add scan command to parquet-cli

2023-03-26 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu reassigned PARQUET-2195: Assignee: Gang Wu > Add scan command to parquet-cli > --- > >

[jira] [Resolved] (PARQUET-2195) Add scan command to parquet-cli

2023-03-26 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2195. -- Resolution: Fixed > Add scan command to parquet-cli > --- > >

[jira] [Resolved] (PARQUET-2177) Fix parquet-cli not to fail showing descriptions

2023-03-26 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2177. -- Resolution: Fixed > Fix parquet-cli not to fail showing descriptions >

[jira] [Resolved] (PARQUET-2198) Vulnerabilities in jackson-databind

2023-03-26 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2198. -- Resolution: Fixed > Vulnerabilities in jackson-databind > --- > >

[jira] [Resolved] (PARQUET-2208) Add details to nested column encryption config doc and exception text

2023-03-26 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2208. -- Assignee: Gidon Gershinsky Resolution: Fixed > Add details to nested column encryption config

[jira] [Resolved] (PARQUET-2224) Publish SBOM artifacts

2023-03-26 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2224. -- Resolution: Fixed > Publish SBOM artifacts > -- > > Key:

[jira] [Resolved] (PARQUET-2103) crypto exception in print toPrettyJSON

2023-03-25 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2103. -- Resolution: Fixed > crypto exception in print toPrettyJSON > --

[jira] [Resolved] (PARQUET-2159) Parquet bit-packing de/encode optimization

2023-03-25 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2159. -- Fix Version/s: (was: 1.13.0) Resolution: Fixed > Parquet bit-packing de/encode

[jira] [Updated] (PARQUET-2252) Make some methods public to allow external projects to implement page skipping

2023-03-25 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2252: - Issue Type: Improvement (was: New Feature) > Make some methods public to allow external projects to

[jira] [Resolved] (PARQUET-2164) CapacityByteArrayOutputStream overflow while writing causes negative row group sizes to be written

2023-03-25 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2164. -- Fix Version/s: (was: 1.12.3) Resolution: Fixed > CapacityByteArrayOutputStream overflow

[jira] [Resolved] (PARQUET-2202) Redundant String allocation on the hot path in CapacityByteArrayOutputStream.setByte

2023-03-25 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2202. -- Resolution: Fixed > Redundant String allocation on the hot path in >

[jira] [Updated] (PARQUET-2259) [Site] Update parquet site

2023-03-16 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2259: - Summary: [Site] Update parquet site (was: Update parquet site) > [Site] Update parquet site >

[jira] [Created] (PARQUET-2259) Update parquet site

2023-03-16 Thread Gang Wu (Jira)
Gang Wu created PARQUET-2259: Summary: Update parquet site Key: PARQUET-2259 URL: https://issues.apache.org/jira/browse/PARQUET-2259 Project: Parquet Issue Type: Task Components:

[jira] [Resolved] (PARQUET-2219) ParquetFileReader throws a runtime exception when a file contains only headers and now row data

2023-03-16 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2219. -- Resolution: Fixed > ParquetFileReader throws a runtime exception when a file contains only >

[jira] [Resolved] (PARQUET-2230) Add a new rewrite command powered by ParquetRewriter

2023-03-16 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2230. -- Resolution: Fixed > Add a new rewrite command powered by ParquetRewriter >

[jira] [Commented] (PARQUET-2255) BloomFilter and float point is ambiguous

2023-03-13 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17699716#comment-17699716 ] Gang Wu commented on PARQUET-2255: -- I think there is a similar issue in the dictionary encoding of

[jira] [Commented] (PARQUET-2256) Adding Compression for BloomFilter

2023-03-13 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17699694#comment-17699694 ] Gang Wu commented on PARQUET-2256: -- Apache ORC supports compression of bloom filter. It would be nice

[jira] [Created] (PARQUET-2257) [Format] Add bloom_filter_length to ColumnMetaData

2023-03-13 Thread Gang Wu (Jira)
Gang Wu created PARQUET-2257: Summary: [Format] Add bloom_filter_length to ColumnMetaData Key: PARQUET-2257 URL: https://issues.apache.org/jira/browse/PARQUET-2257 Project: Parquet Issue Type:

[jira] [Updated] (PARQUET-2256) Adding Compression for BloomFilter

2023-03-13 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2256: - Component/s: (was: parquet-cpp) > Adding Compression for BloomFilter >

[jira] [Updated] (PARQUET-2256) Adding Compression for BloomFilter

2023-03-13 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2256: - Component/s: parquet-format > Adding Compression for BloomFilter > --

[jira] [Commented] (PARQUET-2255) BloomFilter and float point is ambiguous

2023-03-13 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17699691#comment-17699691 ] Gang Wu commented on PARQUET-2255: -- cc [~gszadovszky] [~emkornfi...@gmail.com] > BloomFilter and

[jira] [Commented] (PARQUET-2255) BloomFilter and float point is ambiguous

2023-03-13 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17699686#comment-17699686 ] Gang Wu commented on PARQUET-2255: -- These are good questions. Let me try to answer them from the

[jira] [Commented] (PARQUET-2254) Build a BloomFilter with a more precise size

2023-03-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17699223#comment-17699223 ] Gang Wu commented on PARQUET-2254: -- The optimization in the filter makes sense to me. Back to the

[jira] [Commented] (PARQUET-2254) Build a BloomFilter with a more precise size

2023-03-07 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17697463#comment-17697463 ] Gang Wu commented on PARQUET-2254: -- Here are two questions: 1) creating bloom filters without explicit

[jira] [Created] (PARQUET-2253) Postpone dictionary encoding decision for starting null pages.

2023-03-02 Thread Gang Wu (Jira)
Gang Wu created PARQUET-2253: Summary: Postpone dictionary encoding decision for starting null pages. Key: PARQUET-2253 URL: https://issues.apache.org/jira/browse/PARQUET-2253 Project: Parquet

[jira] [Commented] (PARQUET-2222) [Format] RLE encoding spec incorrect for v2 data pages

2023-02-27 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693890#comment-17693890 ] Gang Wu commented on PARQUET-2222: -- The implementations are consistent between parquet-cpp and

[jira] [Commented] (PARQUET-2222) [Format] RLE encoding spec incorrect for v2 data pages

2023-02-27 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693881#comment-17693881 ] Gang Wu commented on PARQUET-2222: -- I think the reason is that *DataPageHeaderV2* has two fields to

[jira] [Assigned] (PARQUET-1629) Page-level CRC checksum verification for DataPageV2

2023-02-24 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu reassigned PARQUET-1629: Assignee: Gang Wu > Page-level CRC checksum verification for DataPageV2 >

[jira] [Commented] (PARQUET-2249) Parquet spec (parquet.thrift) is inconsistent w.r.t. ColumnIndex + NaNs

2023-02-21 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17691626#comment-17691626 ] Gang Wu commented on PARQUET-2249: -- [~jfinis] Yes, please submit a PR when you are ready. Thanks. >

[jira] [Commented] (PARQUET-2249) Parquet spec (parquet.thrift) is inconsistent w.r.t. ColumnIndex + NaNs

2023-02-20 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17691199#comment-17691199 ] Gang Wu commented on PARQUET-2249: -- As of today, there are many different parquet implementations

[jira] [Commented] (PARQUET-2249) Parquet spec (parquet.thrift) is inconsistent w.r.t. ColumnIndex + NaNs

2023-02-19 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17691009#comment-17691009 ] Gang Wu commented on PARQUET-2249: -- When a page only contains NaN, the page statistics do not set

[jira] [Commented] (PARQUET-2240) DateTimeFormatter is used in static context, but not thread safe

2023-02-17 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17690260#comment-17690260 ] Gang Wu commented on PARQUET-2240: -- Thanks for reporting the issue. Could you please open a PR?

[jira] [Created] (PARQUET-2248) ParquetRewriter supports merging files by record

2023-02-16 Thread Gang Wu (Jira)
Gang Wu created PARQUET-2248: Summary: ParquetRewriter supports merging files by record Key: PARQUET-2248 URL: https://issues.apache.org/jira/browse/PARQUET-2248 Project: Parquet Issue Type:

[jira] [Resolved] (PARQUET-2229) ParquetRewriter supports masking and encrypting the same column

2023-02-11 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2229. -- Resolution: Fixed > ParquetRewriter supports masking and encrypting the same column >
