[jira] [Commented] (PARQUET-2221) [Format] Encoding spec incorrect for dictionary fallback

2023-11-21 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17788519#comment-17788519 ] Micah Kornfield commented on PARQUET-2221: -- I agree with [~wgtmac] here.  I th

[jira] [Commented] (PARQUET-2345) The Parquet Spec doesn't specify whether multiple columns are allowed to have the same name.

2023-10-01 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17770902#comment-17770902 ] Micah Kornfield commented on PARQUET-2345: -- I've at least seen in the wild two

[jira] [Created] (PARQUET-2261) [Format] Add statistics that reflect decoded size to metadata

2023-03-25 Thread Micah Kornfield (Jira)
Micah Kornfield created PARQUET-2261: Summary: [Format] Add statistics that reflect decoded size to metadata Key: PARQUET-2261 URL: https://issues.apache.org/jira/browse/PARQUET-2261 Project: Parq

[jira] [Resolved] (PARQUET-2225) [C++] Allow reading dense with RecordReader

2023-03-03 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield resolved PARQUET-2225. -- Fix Version/s: cpp-11.0.0 Resolution: Fixed Issue resolved by pull request 178

[jira] [Assigned] (PARQUET-2201) Add Stress test for RecordReader SkipRecords

2023-02-23 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield reassigned PARQUET-2201: Assignee: fatemah > Add Stress test for RecordReader SkipRecords > -

[jira] [Resolved] (PARQUET-2201) Add Stress test for RecordReader SkipRecords

2023-02-23 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield resolved PARQUET-2201. -- Fix Version/s: cpp-11.0.0 Resolution: Fixed Issue resolved by pull request 148

[jira] [Assigned] (PARQUET-2210) [C++] Skip pages based on header metadata using a callback

2023-01-12 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield reassigned PARQUET-2210: Assignee: fatemah > [C++] Skip pages based on header metadata using a callback >

[jira] [Resolved] (PARQUET-2210) [C++] Skip pages based on header metadata using a callback

2023-01-12 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield resolved PARQUET-2210. -- Fix Version/s: cpp-11.0.0 Resolution: Fixed Issue resolved by pull request 146

[jira] [Commented] (PARQUET-2219) ParquetFileReader throws a runtime exception when a file contains only headers and no row data

2023-01-06 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1762#comment-1762 ] Micah Kornfield commented on PARQUET-2219: -- I'm not aware of anything in the s

[jira] [Commented] (PARQUET-1222) Specify a well-defined sorting order for float and double types

2022-10-08 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17614581#comment-17614581 ] Micah Kornfield commented on PARQUET-1222: -- Elevating the specification level

[jira] [Commented] (PARQUET-1222) Specify a well-defined sorting order for float and double types

2022-09-29 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17611356#comment-17611356 ] Micah Kornfield commented on PARQUET-1222: -- I'd propose the following "fix": -

[jira] [Commented] (PARQUET-2175) Skip method skips levels and not rows for repeated fields

2022-08-24 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584388#comment-17584388 ] Micah Kornfield commented on PARQUET-2175: -- I think the current signature is

[jira] [Resolved] (PARQUET-2172) [C++] Make field return const NodePtr& instead of forcing copy of shared_ptr

2022-08-12 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield resolved PARQUET-2172. -- Resolution: Fixed > [C++] Make field return const NodePtr& instead of forcing copy of

[jira] [Updated] (PARQUET-2172) [C++] Make field return const NodePtr& instead of forcing copy of shared_ptr

2022-08-12 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated PARQUET-2172: - Fix Version/s: cpp-10.0.0 > [C++] Make field return const NodePtr& instead of forcing c

[jira] [Created] (PARQUET-2172) [C++] Make field return const NodePtr& instead of forcing copy of shared_ptr

2022-08-11 Thread Micah Kornfield (Jira)
Micah Kornfield created PARQUET-2172: Summary: [C++] Make field return const NodePtr& instead of forcing copy of shared_ptr Key: PARQUET-2172 URL: https://issues.apache.org/jira/browse/PARQUET-2172

[jira] [Assigned] (PARQUET-2172) [C++] Make field return const NodePtr& instead of forcing copy of shared_ptr

2022-08-11 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield reassigned PARQUET-2172: Assignee: Micah Kornfield > [C++] Make field return const NodePtr& instead of fo

[jira] [Commented] (PARQUET-1711) [parquet-protobuf] stack overflow when work with well known json type

2022-07-23 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17570331#comment-17570331 ] Micah Kornfield commented on PARQUET-1711: -- {quote}[~emkornfield] Can we expec

[jira] [Resolved] (PARQUET-2163) Parquet C++ Float Runtime Error in Decimal Schema

2022-07-06 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield resolved PARQUET-2163. -- Fix Version/s: cpp-9.0.0 Resolution: Fixed Issue resolved by pull request 1345

[jira] [Commented] (PARQUET-1711) [parquet-protobuf] stack overflow when work with well known json type

2022-05-29 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17543672#comment-17543672 ] Micah Kornfield commented on PARQUET-1711: -- the way one could handle this is a

[jira] [Commented] (PARQUET-2122) Adding Bloom filter to small Parquet file bloats in size X1700

2022-05-09 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17534123#comment-17534123 ] Micah Kornfield commented on PARQUET-2122: -- I believe the answer is the Bloom

[jira] [Commented] (PARQUET-2133) Support Int8 and Int16 as basic type

2022-04-08 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17519738#comment-17519738 ] Micah Kornfield commented on PARQUET-2133: -- before we start working on it it s

[jira] [Resolved] (PARQUET-2131) Number values decoded DCHECKs should be exceptions

2022-03-04 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield resolved PARQUET-2131. -- Fix Version/s: cpp-8.0.0 Resolution: Fixed Issue resolved by pull request 1249

[jira] [Resolved] (PARQUET-2130) Crash on non-standard map key name in debug

2022-03-04 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield resolved PARQUET-2130. -- Fix Version/s: cpp-8.0.0 Resolution: Fixed Issue resolved by pull request 1248

[jira] [Updated] (PARQUET-2118) [C++] thift_internal.h assumes shared_ptr type in some cases

2022-02-06 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated PARQUET-2118: - Component/s: parquet-cpp > [C++] thift_internal.h assumes shared_ptr type in some cases

[jira] [Updated] (PARQUET-2118) [C++] thift_internal.h assumes shared_ptr type in some cases

2022-02-06 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated PARQUET-2118: - Summary: [C++] thift_internal.h assumes shared_ptr type in some cases (was: thift_inte

[jira] [Moved] (PARQUET-2118) thift_internal.h assumes shared_ptr type in some cases

2022-02-06 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield moved ARROW-15596 to PARQUET-2118: -- Key: PARQUET-2118 (was: ARROW-15596) Workflow: patc

[jira] [Updated] (PARQUET-1361) [C++] 1.4.1 library allows creation of parquet file w/NULL values for INT types

2022-01-03 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated PARQUET-1361: - Component/s: parquet-mr > [C++] 1.4.1 library allows creation of parquet file w/NULL va

[jira] [Resolved] (PARQUET-2095) [C++] Read Parquet file with MapArray

2021-10-23 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield resolved PARQUET-2095. -- Resolution: Not A Problem > [C++] Read Parquet file with MapArray > -

[jira] [Commented] (PARQUET-2095) [C++] Read Parquet file with MapArray

2021-10-07 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17425958#comment-17425958 ] Micah Kornfield commented on PARQUET-2095: -- Hi [~longshanpdd] did the above re

[jira] [Created] (PARQUET-2099) [C++] Statistics::num_values() is misleading

2021-09-30 Thread Micah Kornfield (Jira)
Micah Kornfield created PARQUET-2099: Summary: [C++] Statistics::num_values() is misleading Key: PARQUET-2099 URL: https://issues.apache.org/jira/browse/PARQUET-2099 Project: Parquet Iss

[jira] [Commented] (PARQUET-2095) [C++] Read Parquet file with MapArray

2021-09-26 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17420316#comment-17420316 ] Micah Kornfield commented on PARQUET-2095: -- Can you run ValidateFull on the ar

[jira] [Commented] (PARQUET-2095) [C++] Read Parquet file with MapArray

2021-09-25 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17420158#comment-17420158 ] Micah Kornfield commented on PARQUET-2095: -- Hi it isn't clear if this is repor

[jira] [Commented] (PARQUET-2092) [Go] Fix in go implementation

2021-09-14 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17415060#comment-17415060 ] Micah Kornfield commented on PARQUET-2092: -- OK, would you mind closing this an

[jira] [Commented] (PARQUET-2092) [Go] Fix in go implementation

2021-09-14 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17415056#comment-17415056 ] Micah Kornfield commented on PARQUET-2092: -- [~zeroshade] the move option if yo

[jira] [Commented] (PARQUET-2092) [Go] Fix in go implementation

2021-09-14 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17415050#comment-17415050 ] Micah Kornfield commented on PARQUET-2092: -- Hmm, it doesn't look like I have p

[jira] [Commented] (PARQUET-2092) [Go] Fix in go implementation

2021-09-14 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17415048#comment-17415048 ] Micah Kornfield commented on PARQUET-2092: -- I'm going to move this to the Arro

[jira] [Resolved] (PARQUET-2090) [C++] Parquet writes incorrect file_offset

2021-09-13 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield resolved PARQUET-2090. -- Resolution: Invalid > [C++] Parquet writes incorrect file_offset > -

[jira] [Commented] (PARQUET-2090) [C++] Parquet writes incorrect file_offset

2021-09-13 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17414716#comment-17414716 ] Micah Kornfield commented on PARQUET-2090: -- [~csun]  according the [spec|http

[jira] [Assigned] (PARQUET-2089) [C++] RowGroupMetaData file_offset set incorrectly

2021-09-13 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield reassigned PARQUET-2089: Assignee: Micah Kornfield > [C++] RowGroupMetaData file_offset set incorrectly >

[jira] [Assigned] (PARQUET-2090) [C++] Parquet writes incorrect file_offset

2021-09-13 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield reassigned PARQUET-2090: Assignee: Micah Kornfield > [C++] Parquet writes incorrect file_offset > --

[jira] [Updated] (PARQUET-2089) [C++] RowGroupMetaData file_offset set incorrectly

2021-09-13 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated PARQUET-2089: - Summary: [C++] RowGroupMetaData file_offset set incorrectly (was: RowGroupMetaData fil

[jira] [Commented] (PARQUET-2089) RowGroupMetaData file_offset set incorrectly

2021-09-13 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17414695#comment-17414695 ] Micah Kornfield commented on PARQUET-2089: -- CC [~zeroshade] > RowGroupMetaDat

[jira] [Commented] (PARQUET-2090) [C++] Parquet writes incorrect file_offset

2021-09-13 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17414694#comment-17414694 ] Micah Kornfield commented on PARQUET-2090: -- CC [~zeroshade] > [C++] Parquet w

[jira] [Moved] (PARQUET-2090) [C++] Parquet writes incorrect file_offset

2021-09-13 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield moved ARROW-13941 to PARQUET-2090: -- Component/s: (was: Parquet) parquet-cpp

[jira] [Moved] (PARQUET-2089) RowGroupMetaData file_offset set incorrectly

2021-09-13 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield moved ARROW-13609 to PARQUET-2089: -- Component/s: (was: C++) parquet-cpp

[jira] [Commented] (PARQUET-1361) [C++] 1.4.1 library allows creation of parquet file w/NULL values for INT types

2021-08-22 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17402882#comment-17402882 ] Micah Kornfield commented on PARQUET-1361: -- Sorry for the late reply, but I th

[jira] [Created] (PARQUET-2067) [C++] null_count and num_nulls incorrect for repeated columns

2021-07-15 Thread Micah Kornfield (Jira)
Micah Kornfield created PARQUET-2067: Summary: [C++] null_count and num_nulls incorrect for repeated columns Key: PARQUET-2067 URL: https://issues.apache.org/jira/browse/PARQUET-2067 Project: Par

[jira] [Moved] (PARQUET-2066) [C++][Parquet] num_rows is incorrect for nested types

2021-07-15 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield moved ARROW-13349 to PARQUET-2066: -- Component/s: (was: Parquet) (was: C+

[jira] [Resolved] (PARQUET-2056) [C++] Add ability for retrieving dictionary and indices separately for ColumnReader

2021-06-17 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield resolved PARQUET-2056. -- Fix Version/s: cpp-5.0.0 Resolution: Fixed Issue resolved by pull request 1053

[jira] [Assigned] (PARQUET-2056) [C++] Add ability for retrieving dictionary and indices separately for ColumnReader

2021-06-15 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield reassigned PARQUET-2056: Assignee: Jinpeng Zhou (was: Micah Kornfield) > [C++] Add ability for retrievi

[jira] [Updated] (PARQUET-2056) [C++] Add ability for retrieving dictionary and indices separately for ColumnReader

2021-06-08 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated PARQUET-2056: - Component/s: parquet-cpp > [C++] Add ability for retrieving dictionary and indices sep

[jira] [Moved] (PARQUET-2056) [C++] Add ability for retrieving dictionary and indices separately for ColumnReader

2021-06-08 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield moved ARROW-13012 to PARQUET-2056: -- Key: PARQUET-2056 (was: ARROW-13012) Workflow: patc

[jira] [Commented] (PARQUET-1122) [C++] Support 2-level list encoding in Arrow decoding

2021-04-02 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17314180#comment-17314180 ] Micah Kornfield commented on PARQUET-1122: -- Yes, all common types should not b

[jira] [Resolved] (PARQUET-1122) [C++] Support 2-level list encoding in Arrow decoding

2021-04-02 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield resolved PARQUET-1122. -- Resolution: Implemented > [C++] Support 2-level list encoding in Arrow decoding > ---

[jira] [Resolved] (PARQUET-1990) [C++] ConvertedType::NA is written out in some cases

2021-03-31 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield resolved PARQUET-1990. -- Resolution: Fixed Issue resolved by pull request 9863 [https://github.com/apache/arro

[jira] [Commented] (PARQUET-1991) Reserve ConvertedType==24 due to bug in parquet-cpp implementation

2021-03-31 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17312478#comment-17312478 ] Micah Kornfield commented on PARQUET-1991: -- I'm OK with won't fix.  I was thin

[jira] [Commented] (PARQUET-1990) [C++] ConvertedType::NA is written out in some cases

2021-03-31 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17312462#comment-17312462 ] Micah Kornfield commented on PARQUET-1990: -- Nice find on the reverted format c

[jira] [Updated] (PARQUET-2003) Decimal Statistics emitted by parquet-cpp are broken

2021-03-18 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated PARQUET-2003: - Summary: Decimal Statistics emitted by parquet-cpp are broken (was: Decimal Statistic

[jira] [Created] (PARQUET-2003) Decimal Statistics admitted for parquet-cpp are broken

2021-03-18 Thread Micah Kornfield (Jira)
Micah Kornfield created PARQUET-2003: Summary: Decimal Statistics admitted for parquet-cpp are broken Key: PARQUET-2003 URL: https://issues.apache.org/jira/browse/PARQUET-2003 Project: Parquet

[jira] [Commented] (PARQUET-1995) [C++][Parquet] Crash at parquet::TypedColumnWriterImpl<>::WriteBatchSpaced

2021-03-09 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298550#comment-17298550 ] Micah Kornfield commented on PARQUET-1995: -- Well it seems like a bug someplace

[jira] [Commented] (PARQUET-1995) [C++][Parquet] Crash at parquet::TypedColumnWriterImpl<>::WriteBatchSpaced

2021-03-09 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298523#comment-17298523 ] Micah Kornfield commented on PARQUET-1995: -- we've also had some other bugs rel

[jira] [Commented] (PARQUET-1995) [C++][Parquet] Crash at parquet::TypedColumnWriterImpl<>::WriteBatchSpaced

2021-03-09 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298515#comment-17298515 ] Micah Kornfield commented on PARQUET-1995: -- This is a little bit hard to diagn

[jira] [Updated] (PARQUET-1990) [C++] ConvertedType::NA is written out in some cases

2021-02-28 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated PARQUET-1990: - Summary: [C++] ConvertedType::NA is written out in some cases (was: [C++] ConvertedTy

[jira] [Created] (PARQUET-1991) Reserve ConvertedType==24 due to bug in parquet-cpp implementation

2021-02-28 Thread Micah Kornfield (Jira)
Micah Kornfield created PARQUET-1991: Summary: Reserve ConvertedType==24 due to bug in parquet-cpp implementation Key: PARQUET-1991 URL: https://issues.apache.org/jira/browse/PARQUET-1991 Project:

[jira] [Updated] (PARQUET-1990) [C++] ConvertedType::NA is attempted to be written out in some cases

2021-02-28 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated PARQUET-1990: - Description: This makes it an invalid thrift enum.  ::NA is a placeholder enum internal

[jira] [Created] (PARQUET-1990) [C++] ConvertedType::NA is attempted to be written out in some cases

2021-02-28 Thread Micah Kornfield (Jira)
Micah Kornfield created PARQUET-1990: Summary: [C++] ConvertedType::NA is attempted to be written out in some cases Key: PARQUET-1990 URL: https://issues.apache.org/jira/browse/PARQUET-1990 Projec

[jira] [Commented] (PARQUET-1783) [C++] Parquet statistics wrong for dictionary type

2021-02-28 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292623#comment-17292623 ] Micah Kornfield commented on PARQUET-1783: -- This was reported as ARROW-11634 

[jira] [Assigned] (PARQUET-1655) [C++] Decimal comparisons used for min/max statistics are not correct

2021-02-23 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield reassigned PARQUET-1655: Assignee: Micah Kornfield > [C++] Decimal comparisons used for min/max statistic

[jira] [Commented] (PARQUET-1987) Document how a schema can have columns splitted over different files

2021-02-20 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17287834#comment-17287834 ] Micah Kornfield commented on PARQUET-1987: -- This was discussed a little bit on

[jira] [Commented] (PARQUET-1987) Document how a schema can have columns splitted over different files

2021-02-20 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17287833#comment-17287833 ] Micah Kornfield commented on PARQUET-1987: -- CC [~raduteodorescu] > Document h

[jira] [Commented] (PARQUET-1985) Improve integration tests between implementations

2021-02-17 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286274#comment-17286274 ] Micah Kornfield commented on PARQUET-1985: -- I think trying to shoe horn struct

[jira] [Commented] (PARQUET-1958) Forced UTF8 encoding of BYTE_ARRAY on stream::read/write

2021-01-22 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17270548#comment-17270548 ] Micah Kornfield commented on PARQUET-1958: -- I actually am not sure that the ch

[jira] [Commented] (PARQUET-1946) Parquet File not readable by Google big query (works with Spark)

2020-12-06 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244962#comment-17244962 ] Micah Kornfield commented on PARQUET-1946: -- I'm not an expert on the tool.  Lo

[jira] [Commented] (PARQUET-1946) Parquet File not readable by Google big query (works with Spark)

2020-11-29 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17240479#comment-17240479 ] Micah Kornfield commented on PARQUET-1946: -- Are you using V2 datapages?  BQ do

[jira] [Commented] (PARQUET-1936) WriteBatchSpaced writes incorrect value for parquet when input contains NULL list

2020-11-15 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232503#comment-17232503 ] Micah Kornfield commented on PARQUET-1936: -- [~Ruta Dhaneshwar] sure, do you ma

[jira] [Commented] (PARQUET-1936) WriteBatchSpaced writes incorrect value for parquet when input contains NULL list

2020-10-30 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17223995#comment-17223995 ] Micah Kornfield commented on PARQUET-1936: -- [~Ruta Dhaneshwar] part of this mi

[jira] [Commented] (PARQUET-1935) [C++][Parquet] nullptr access violation when writing arrays of non-nullable values

2020-10-30 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17223994#comment-17223994 ] Micah Kornfield commented on PARQUET-1935: -- Yes this is possibly a 2.0.1 bug f

[jira] [Commented] (PARQUET-1935) [C++][Parquet] nullptr access violation when writing arrays of non-nullable values

2020-10-23 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17219933#comment-17219933 ] Micah Kornfield commented on PARQUET-1935: -- One workaround for this is to dete

[jira] [Moved] (PARQUET-1935) [C++][Parquet] nullptr access violation when writing arrays of non-nullable values

2020-10-23 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield moved ARROW-10377 to PARQUET-1935: -- Key: PARQUET-1935 (was: ARROW-10377) Affec

[jira] [Updated] (PARQUET-1935) [C++][Parquet] nullptr access violation when writing arrays of non-nullable values

2020-10-23 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated PARQUET-1935: - Component/s: parquet-cpp > [C++][Parquet] nullptr access violation when writing arrays

[jira] [Created] (PARQUET-1933) [Format] Clarify encodings and data page guidance.

2020-10-21 Thread Micah Kornfield (Jira)
Micah Kornfield created PARQUET-1933: Summary: [Format] Clarify encodings and data page guidance. Key: PARQUET-1933 URL: https://issues.apache.org/jira/browse/PARQUET-1933 Project: Parquet

[jira] [Resolved] (PARQUET-1904) [C++] Export file_offset in RowGroupMetaData

2020-08-26 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield resolved PARQUET-1904. -- Resolution: Fixed > [C++] Export file_offset in RowGroupMetaData > --

[jira] [Commented] (PARQUET-1904) [C++] Export file_offset in RowGroupMetaData

2020-08-26 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17185582#comment-17185582 ] Micah Kornfield commented on PARQUET-1904: -- [~wesm] [~uwe] I don't have access

[jira] [Moved] (PARQUET-1904) [C++] Export file_offset in RowGroupMetaData

2020-08-26 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield moved ARROW-9824 to PARQUET-1904: - Component/s: (was: C++) parquet-cpp

[jira] [Created] (PARQUET-1899) [C++] Deprecated ReadBatchSpaced in parquet/column_reader

2020-08-19 Thread Micah Kornfield (Jira)
Micah Kornfield created PARQUET-1899: Summary: [C++] Deprecated ReadBatchSpaced in parquet/column_reader Key: PARQUET-1899 URL: https://issues.apache.org/jira/browse/PARQUET-1899 Project: Parquet

[jira] [Assigned] (PARQUET-1882) Writing an all-null column and then reading it with buffered_stream aborts the process

2020-07-11 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield reassigned PARQUET-1882: Assignee: Micah Kornfield > Writing an all-null column and then reading it with

[jira] [Created] (PARQUET-1877) [C++] Reconcile container size with string size for memory issues

2020-06-16 Thread Micah Kornfield (Jira)
Micah Kornfield created PARQUET-1877: Summary: [C++] Reconcile container size with string size for memory issues Key: PARQUET-1877 URL: https://issues.apache.org/jira/browse/PARQUET-1877 Project:

[jira] [Assigned] (PARQUET-1839) values_read not updated in ReadBatchSpaced

2020-05-03 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield reassigned PARQUET-1839: Assignee: Micah Kornfield > values_read not updated in ReadBatchSpaced > --

[jira] [Commented] (PARQUET-1841) [C++] Experiment to see if using SIMD shuffle operations for DecodeSpaced improves performance

2020-04-21 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17089347#comment-17089347 ] Micah Kornfield commented on PARQUET-1841: -- I've been using parquet-arrow-read

[jira] [Comment Edited] (PARQUET-1841) [C++] Experiment to see if using SIMD shuffle operations for DecodeSpaced improves performance

2020-04-16 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17085431#comment-17085431 ] Micah Kornfield edited comment on PARQUET-1841 at 4/17/20, 4:52 AM: -

[jira] [Commented] (PARQUET-1841) [C++] Experiment to see if using SIMD shuffle operations for DecodeSpaced improves performance

2020-04-16 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17085433#comment-17085433 ] Micah Kornfield commented on PARQUET-1841: -- To get this assigned to you you wi

[jira] [Commented] (PARQUET-1841) [C++] Experiment to see if using SIMD shuffle operations for DecodeSpaced improves performance

2020-04-16 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17085431#comment-17085431 ] Micah Kornfield commented on PARQUET-1841: -- For AVX512 enabled processor the m

[jira] [Assigned] (PARQUET-1841) [C++] Experiment to see if using SIMD shuffle operations for DecodeSpaced improves performance

2020-04-16 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield reassigned PARQUET-1841: Assignee: Micah Kornfield > [C++] Experiment to see if using SIMD shuffle operat

[jira] [Created] (PARQUET-1841) [C++] Experiment to see if using SIMD shuffle operations for DecodeSpaced improves performance

2020-04-13 Thread Micah Kornfield (Jira)
Micah Kornfield created PARQUET-1841: Summary: [C++] Experiment to see if using SIMD shuffle operations for DecodeSpaced improves performance Key: PARQUET-1841 URL: https://issues.apache.org/jira/browse/PARQUE

[jira] [Updated] (PARQUET-1840) [C++] DecodeSpaced copies more values then necessary

2020-04-12 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated PARQUET-1840: - Summary: [C++] DecodeSpaced copies more values then necessary (was: [C++] DecodeSpaced

[jira] [Updated] (PARQUET-1840) DecodeSpaced copies/touches more values then necessary

2020-04-12 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated PARQUET-1840: - Summary: DecodeSpaced copies/touches more values then necessary (was: DecodeSpaced cop

[jira] [Updated] (PARQUET-1840) [C++] DecodeSpaced copies/touches more values then necessary

2020-04-12 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated PARQUET-1840: - Summary: [C++] DecodeSpaced copies/touches more values then necessary (was: DecodeSpac

[jira] [Created] (PARQUET-1840) DecodeSpaced copies more values then necessary

2020-04-12 Thread Micah Kornfield (Jira)
Micah Kornfield created PARQUET-1840: Summary: DecodeSpaced copies more values then necessary Key: PARQUET-1840 URL: https://issues.apache.org/jira/browse/PARQUET-1840 Project: Parquet Is

[jira] [Updated] (PARQUET-1838) [C++] Expose an API that allows direct writing of RLE information for rep/def levels when writing parquet files

2020-04-10 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated PARQUET-1838: - Description: When writing data to parquet it can potentially be more efficient to have

[jira] [Created] (PARQUET-1838) [C++] Expose an API that allows direct writing of RLE information for rep/def levels when writing parquet files

2020-04-10 Thread Micah Kornfield (Jira)
Micah Kornfield created PARQUET-1838: Summary: [C++] Expose an API that allows direct writing of RLE information for rep/def levels when writing parquet files Key: PARQUET-1838 URL: https://issues.apache.org/j
