[GitHub] [orc] wgtmac commented on a diff in pull request #1089: ORC-1152:[C++] Support writing short decimals in RLEv2

2022-04-17 Thread GitBox
wgtmac commented on code in PR #1089: URL: https://github.com/apache/orc/pull/1089#discussion_r851909221 ## c++/test/TestWriter.cc: ## @@ -1996,5 +1996,5 @@ namespace orc { } } - INSTANTIATE_TEST_CASE_P(OrcTest, WriterTest, Values(FileVersion::v_0_11(), FileVersion::

[GitHub] [orc] dongjoon-hyun commented on a diff in pull request #1089: ORC-1152:[C++] Support writing short decimals in RLEv2

2022-04-17 Thread GitBox
dongjoon-hyun commented on code in PR #1089: URL: https://github.com/apache/orc/pull/1089#discussion_r851862440 ## c++/test/TestWriter.cc: ## @@ -1996,5 +1996,5 @@ namespace orc { } } - INSTANTIATE_TEST_CASE_P(OrcTest, WriterTest, Values(FileVersion::v_0_11(), FileVe

[GitHub] [orc] dongjoon-hyun commented on a diff in pull request #1088: ORC-1151: [C++] Fix `ColumnWriter` for non-UTC Timestamp columns

2022-04-17 Thread GitBox
dongjoon-hyun commented on code in PR #1088: URL: https://github.com/apache/orc/pull/1088#discussion_r851862133 ## c++/src/ColumnWriter.cc: ## @@ -1837,7 +1837,7 @@ namespace orc { // TimestampVectorBatch already stores data in UTC int64_t millsUTC = secs[i] *

[GitHub] [orc] wgtmac commented on a diff in pull request #1089: ORC-1152:[C++] Support writing short decimals in RLEv2

2022-04-17 Thread GitBox
wgtmac commented on code in PR #1089: URL: https://github.com/apache/orc/pull/1089#discussion_r851860205 ## c++/test/TestWriter.cc: ## @@ -1996,5 +1996,5 @@ namespace orc { } } - INSTANTIATE_TEST_CASE_P(OrcTest, WriterTest, Values(FileVersion::v_0_11(), FileVersion::

[GitHub] [orc] wgtmac commented on a diff in pull request #1088: ORC-1151: [C++] Fix `ColumnWriter` for non-UTC Timestamp columns

2022-04-17 Thread GitBox
wgtmac commented on code in PR #1088: URL: https://github.com/apache/orc/pull/1088#discussion_r851858410 ## c++/src/ColumnWriter.cc: ## @@ -1837,7 +1837,7 @@ namespace orc { // TimestampVectorBatch already stores data in UTC int64_t millsUTC = secs[i] * 1000 +

[GitHub] [orc] wgtmac commented on a diff in pull request #1087: ORC-1150: [C++] Optimize RowReaderImpl::computeBatchSize() by pre-computation

2022-04-17 Thread GitBox
wgtmac commented on code in PR #1087: URL: https://github.com/apache/orc/pull/1087#discussion_r851849272 ## c++/src/Reader.cc: ## @@ -1186,41 +1186,46 @@ namespace orc { uint64_t currentRowInStripe,

[GitHub] [orc] dongjoon-hyun commented on a diff in pull request #1088: ORC-1151: [C++] Fix `ColumnWriter` for non-UTC Timestamp columns

2022-04-17 Thread GitBox
dongjoon-hyun commented on code in PR #1088: URL: https://github.com/apache/orc/pull/1088#discussion_r851841698 ## c++/src/ColumnWriter.cc: ## @@ -1837,7 +1837,7 @@ namespace orc { // TimestampVectorBatch already stores data in UTC int64_t millsUTC = secs[i] *

[GitHub] [orc] dongjoon-hyun commented on a diff in pull request #1088: ORC-1151: [C++] Fix `ColumnWriter` for non-UTC Timestamp columns

2022-04-17 Thread GitBox
dongjoon-hyun commented on code in PR #1088: URL: https://github.com/apache/orc/pull/1088#discussion_r851849032 ## c++/src/ColumnWriter.cc: ## @@ -1837,7 +1837,7 @@ namespace orc { // TimestampVectorBatch already stores data in UTC int64_t millsUTC = secs[i] *

[GitHub] [orc] dongjoon-hyun commented on a diff in pull request #1088: ORC-1151: [C++] Fix `ColumnWriter` for non-UTC Timestamp columns

2022-04-17 Thread GitBox
dongjoon-hyun commented on code in PR #1088: URL: https://github.com/apache/orc/pull/1088#discussion_r851842056 ## c++/test/TestTimestampStatistics.cc: ## @@ -57,4 +62,97 @@ namespace orc { EXPECT_EQ("Data type: Timestamp\nValues: 12\nHas null: no\nMinimum: 1995-01-01 00:0

[GitHub] [orc] dongjoon-hyun commented on a diff in pull request #1088: ORC-1151: [C++] Fix `ColumnWriter` for non-UTC Timestamp columns

2022-04-17 Thread GitBox
dongjoon-hyun commented on code in PR #1088: URL: https://github.com/apache/orc/pull/1088#discussion_r851842026 ## c++/test/TestTimestampStatistics.cc: ## @@ -57,4 +62,97 @@ namespace orc { EXPECT_EQ("Data type: Timestamp\nValues: 12\nHas null: no\nMinimum: 1995-01-01 00:0

[GitHub] [orc] noirello commented on a diff in pull request #1088: ORC-1151: [C++] Incorrect statistics for Timestamp column with non UTC writer time zones

2022-04-17 Thread GitBox
noirello commented on code in PR #1088: URL: https://github.com/apache/orc/pull/1088#discussion_r851760919 ## c++/src/ColumnWriter.cc: ## @@ -1837,7 +1837,7 @@ namespace orc { // TimestampVectorBatch already stores data in UTC int64_t millsUTC = secs[i] * 1000

[GitHub] [orc] noirello commented on a diff in pull request #1088: ORC-1151: [C++] Incorrect statistics for Timestamp column with non UTC writer time zones

2022-04-17 Thread GitBox
noirello commented on code in PR #1088: URL: https://github.com/apache/orc/pull/1088#discussion_r851760641 ## c++/src/ColumnWriter.cc: ## @@ -1837,7 +1837,7 @@ namespace orc { // TimestampVectorBatch already stores data in UTC int64_t millsUTC = secs[i] * 1000

[GitHub] [orc] coderex2522 opened a new pull request, #1089: ORC-1152:[C++] Support writing short decimals in RLEv2

2022-04-17 Thread GitBox
coderex2522 opened a new pull request, #1089: URL: https://github.com/apache/orc/pull/1089 ### What changes were proposed in this pull request? This PR adds support for writing short decimal64 values in ORCv2. The original writer code comes from [ORC-49](https://github.com/apache/orc/pull/257)

[jira] [Created] (ORC-1152) [C++] Support encoding decimals in RLEv2

2022-04-17 Thread ZhangXin (Jira)
ZhangXin created ORC-1152:
Summary: [C++] Support encoding decimals in RLEv2
Key: ORC-1152
URL: https://issues.apache.org/jira/browse/ORC-1152
Project: ORC
Issue Type: Task
Reporter: Zha