[jira] [Assigned] (PARQUET-2241) ByteStreamSplitDecoder broken in presence of nulls

2023-02-10 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu reassigned PARQUET-2241: Assignee: Gang Wu > ByteStreamSplitDecoder broken in presence of nulls >

[jira] [Updated] (PARQUET-2241) ByteStreamSplitDecoder broken in presence of nulls

2023-02-10 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2241: - Fix Version/s: (was: format-2.10.0) > ByteStreamSplitDecoder broken in presence of nulls >

[jira] [Updated] (PARQUET-2241) ByteStreamSplitDecoder broken in presence of nulls

2023-02-10 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2241: - Component/s: parquet-mr > ByteStreamSplitDecoder broken in presence of nulls >

[jira] [Commented] (PARQUET-2241) ByteStreamSplitDecoder broken in presence of nulls

2023-02-10 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17686941#comment-17686941 ] Gang Wu commented on PARQUET-2241: -- Have you seen any relevant issue in production? [~gershinsky]

[jira] [Commented] (PARQUET-2241) ByteStreamSplitDecoder broken in presence of nulls

2023-02-10 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17686938#comment-17686938 ] Gang Wu commented on PARQUET-2241: -- It seems that the *ByteStreamSplitValuesReader* in the parquet-mr

[jira] [Resolved] (PARQUET-2227) Refactor different file rewriters to use single implementation

2023-01-29 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2227. -- Resolution: Fixed > Refactor different file rewriters to use single implementation >

[jira] [Commented] (PARQUET-2233) Parquet Travis CI jobs to be turned off February 15th

2023-01-24 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17680439#comment-17680439 ] Gang Wu commented on PARQUET-2233: -- I see there is a comment in the travis yaml file saying that the

[jira] [Comment Edited] (PARQUET-1622) Add BYTE_STREAM_SPLIT encoding

2023-01-17 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17678056#comment-17678056 ] Gang Wu edited comment on PARQUET-1622 at 1/18/23 3:05 AM: --- The issue raised

[jira] [Commented] (PARQUET-1622) Add BYTE_STREAM_SPLIT encoding

2023-01-17 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17678056#comment-17678056 ] Gang Wu commented on PARQUET-1622: -- The issue raised by [~mwish] above may also exist in the

[jira] [Created] (PARQUET-2230) Add a new rewrite command powered by ParquetRewriter

2023-01-14 Thread Gang Wu (Jira)
Gang Wu created PARQUET-2230: Summary: Add a new rewrite command powered by ParquetRewriter Key: PARQUET-2230 URL: https://issues.apache.org/jira/browse/PARQUET-2230 Project: Parquet Issue Type:

[jira] [Created] (PARQUET-2229) ParquetRewriter supports masking and encrypting the same column

2023-01-14 Thread Gang Wu (Jira)
Gang Wu created PARQUET-2229: Summary: ParquetRewriter supports masking and encrypting the same column Key: PARQUET-2229 URL: https://issues.apache.org/jira/browse/PARQUET-2229 Project: Parquet

[jira] [Created] (PARQUET-2228) ParquetRewriter supports more than one input file

2023-01-14 Thread Gang Wu (Jira)
Gang Wu created PARQUET-2228: Summary: ParquetRewriter supports more than one input file Key: PARQUET-2228 URL: https://issues.apache.org/jira/browse/PARQUET-2228 Project: Parquet Issue Type:

[jira] [Created] (PARQUET-2227) Refactor different file rewriters to use single implementation

2023-01-14 Thread Gang Wu (Jira)
Gang Wu created PARQUET-2227: Summary: Refactor different file rewriters to use single implementation Key: PARQUET-2227 URL: https://issues.apache.org/jira/browse/PARQUET-2227 Project: Parquet

[jira] [Assigned] (PARQUET-2219) ParquetFileReader throws a runtime exception when a file contains only headers and now row data

2023-01-08 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu reassigned PARQUET-2219: Assignee: Gang Wu > ParquetFileReader throws a runtime exception when a file contains only >

[jira] [Commented] (PARQUET-2221) [Format] Encoding spec incorrect for dictionary fallback

2023-01-03 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17654256#comment-17654256 ] Gang Wu commented on PARQUET-2221: -- IMHO, the spec is authoritative to the reader implementation to

[jira] [Commented] (PARQUET-2219) ParquetFileReader throws a runtime exception when a file contains only headers and now row data

2022-12-15 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17648350#comment-17648350 ] Gang Wu commented on PARQUET-2219: -- cc [~emkornfield] > ParquetFileReader throws a runtime exception

[jira] [Commented] (PARQUET-2219) ParquetFileReader throws a runtime exception when a file contains only headers and now row data

2022-12-15 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17648342#comment-17648342 ] Gang Wu commented on PARQUET-2219: -- According to the error message, it seems that empty row group is

[jira] [Resolved] (PARQUET-2196) Support LZ4_RAW codec

2022-12-15 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu resolved PARQUET-2196. -- Resolution: Resolved > Support LZ4_RAW codec > - > > Key:

[jira] [Assigned] (PARQUET-2196) Support LZ4_RAW codec

2022-12-15 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu reassigned PARQUET-2196: Assignee: Gang Wu > Support LZ4_RAW codec > - > > Key:

[jira] [Commented] (PARQUET-2075) Unified Rewriter Tool

2022-12-09 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17645349#comment-17645349 ] Gang Wu commented on PARQUET-2075: -- As discussed offline, I will work on it. So I just changed the

[jira] [Assigned] (PARQUET-2075) Unified Rewriter Tool

2022-12-09 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu reassigned PARQUET-2075: Assignee: Gang Wu (was: Xinli Shang) > Unified Rewriter Tool > --- > >

[jira] [Commented] (PARQUET-1404) [C++] Add index pages to the format to support efficient page skipping to parquet-cpp

2022-12-01 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17641983#comment-17641983 ] Gang Wu commented on PARQUET-1404: -- Hi [~encodedgeek], I will definitely go through your change as

[jira] [Commented] (PARQUET-1404) [C++] Add index pages to the format to support efficient page skipping to parquet-cpp

2022-12-01 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17641967#comment-17641967 ] Gang Wu commented on PARQUET-1404: -- Hi [~mdeepak], I am working on

[jira] [Assigned] (PARQUET-1404) [C++] Add index pages to the format to support efficient page skipping to parquet-cpp

2022-12-01 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu reassigned PARQUET-1404: Assignee: Gang Wu (was: Deepak Majeti) > [C++] Add index pages to the format to support

[jira] [Created] (PARQUET-2211) [C++] Print ColumnMetaData.encoding_stats field

2022-11-01 Thread Gang Wu (Jira)
Gang Wu created PARQUET-2211: Summary: [C++] Print ColumnMetaData.encoding_stats field Key: PARQUET-2211 URL: https://issues.apache.org/jira/browse/PARQUET-2211 Project: Parquet Issue Type:

[jira] [Updated] (PARQUET-2196) Support LZ4_RAW codec

2022-09-27 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2196: - Description: There is a long history about the LZ4 interoperability of parquet files between

[jira] [Updated] (PARQUET-2196) Support LZ4_RAW codec

2022-09-27 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated PARQUET-2196: - Description: There is a long history about the LZ4 interoperability of parquet files between

[jira] [Created] (PARQUET-2196) Support LZ4_RAW codec

2022-09-27 Thread Gang Wu (Jira)
Gang Wu created PARQUET-2196: Summary: Support LZ4_RAW codec Key: PARQUET-2196 URL: https://issues.apache.org/jira/browse/PARQUET-2196 Project: Parquet Issue Type: Improvement

[jira] [Created] (PARQUET-2195) Add scan command to parquet-cli

2022-09-23 Thread Gang Wu (Jira)
Gang Wu created PARQUET-2195: Summary: Add scan command to parquet-cli Key: PARQUET-2195 URL: https://issues.apache.org/jira/browse/PARQUET-2195 Project: Parquet Issue Type: Improvement
