Hi Gang,
For writes I'm seeing "parquet-mr version 1.11.1" and "parquet-mr version
1.10.1". I need to look more into the page headers to check for
consistency. At the column level, in some cases the number of values read
by pyarrow is consistent with num_rows and in some cases it is consistent
[
https://issues.apache.org/jira/browse/PARQUET-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17790918#comment-17790918
]
ASF GitHub Bot commented on PARQUET-2396:
-
zhangjiashen commented on code in PR #1219:
URL:
zhangjiashen commented on code in PR #1219:
URL: https://github.com/apache/parquet-mr/pull/1219#discussion_r1408788644
##
parquet-column/src/main/java/org/apache/parquet/internal/column/columnindex/ColumnIndexBuilder.java:
##
@@ -298,24 +295,22 @@ public >
[
https://issues.apache.org/jira/browse/PARQUET-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17790917#comment-17790917
]
ASF GitHub Bot commented on PARQUET-2396:
-
zhangjiashen commented on code in PR #1219:
URL:
Hi Micah,
Does the FileMetaData.version [1] provide any information about
the writer? What about the num_values in each page header? Is
the actual number of values consistent with num_values in the
ColumnMetaData?
[1]
[
https://issues.apache.org/jira/browse/PARQUET-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17790840#comment-17790840
]
ASF GitHub Bot commented on PARQUET-2386:
-
wgtmac commented on PR #1209:
URL:
wgtmac commented on PR #1209:
URL: https://github.com/apache/parquet-mr/pull/1209#issuecomment-1831045581
Thanks for the improvement!
Could you please take a look at this? @shangxinli @gszadovszky @Fokko
--
This is an automated message from the Apache Git Service.
To respond to the
[
https://issues.apache.org/jira/browse/PARQUET-2397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17790822#comment-17790822
]
ASF GitHub Bot commented on PARQUET-2397:
-
Fokko opened a new pull request, #1220:
URL:
Fokko opened a new pull request, #1220:
URL: https://github.com/apache/parquet-mr/pull/1220
Make sure you have checked _all_ steps below.
### Jira
- [ ] My PR addresses the following [Parquet
Jira](https://issues.apache.org/jira/browse/PARQUET/) issues and references
them in
Fokko Driesprong created PARQUET-2397:
-
Summary: Make use of `isEmpty`
Key: PARQUET-2397
URL: https://issues.apache.org/jira/browse/PARQUET-2397
Project: Parquet
Issue Type: Improvement
Fokko Driesprong created PARQUET-2396:
-
Summary: Refactor `ColumnIndexBuilder`
Key: PARQUET-2396
URL: https://issues.apache.org/jira/browse/PARQUET-2396
Project: Parquet
Issue Type:
Fokko opened a new pull request, #1219:
URL: https://github.com/apache/parquet-mr/pull/1219
Small refactor to improve readability
[
https://issues.apache.org/jira/browse/PARQUET-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17790819#comment-17790819
]
ASF GitHub Bot commented on PARQUET-2396:
-
Fokko opened a new pull request, #1219:
URL:
[
https://issues.apache.org/jira/browse/PARQUET-2395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17790815#comment-17790815
]
ASF GitHub Bot commented on PARQUET-2395:
-
Fokko opened a new pull request, #1218:
URL:
Fokko opened a new pull request, #1218:
URL: https://github.com/apache/parquet-mr/pull/1218
Fokko Driesprong created PARQUET-2395:
-
Summary: Prefer `singletonList`
Key: PARQUET-2395
URL: https://issues.apache.org/jira/browse/PARQUET-2395
Project: Parquet
Issue Type: Improvement
[
https://issues.apache.org/jira/browse/PARQUET-2395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Fokko Driesprong updated PARQUET-2395:
--
Summary: Prefer `singletonList` over `asList` (was: Prefer `singletonList`)
>
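For readers unfamiliar with the change named in the PARQUET-2395 summary, the following is a minimal sketch of the general `singletonList` versus `asList` difference. The class and method names are hypothetical and are not taken from parquet-mr; only `Arrays.asList` and `Collections.singletonList` are real JDK calls.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative only: class and method names are hypothetical, not parquet-mr code.
class SingletonListExample {

  // Before: Arrays.asList wraps a varargs array in a fixed-size, writable list.
  static List<String> viaAsList(String column) {
    return Arrays.asList(column);
  }

  // After: Collections.singletonList returns an immutable one-element list.
  static List<String> viaSingletonList(String column) {
    return Collections.singletonList(column);
  }

  public static void main(String[] args) {
    List<String> before = viaAsList("id");
    List<String> after = viaSingletonList("id");
    System.out.println(before.equals(after)); // both hold the single element "id"

    before.set(0, "name"); // allowed: the asList view can be overwritten in place
    try {
      after.set(0, "name"); // rejected: singletonList cannot be modified
    } catch (UnsupportedOperationException e) {
      System.out.println("singletonList is immutable");
    }
  }
}
```

The two lists compare equal, so the swap is only safe where callers never write to the returned list.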
Fokko Driesprong created PARQUET-2394:
-
Summary: Use `computeIfAbsent` in `MessageColumnIO`
Key: PARQUET-2394
URL: https://issues.apache.org/jira/browse/PARQUET-2394
Project: Parquet
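As a rough illustration of the kind of change the PARQUET-2394 summary describes, the sketch below contrasts a manual get/null-check/put sequence with `Map.computeIfAbsent`. The class, method, and variable names are hypothetical; the actual code in `MessageColumnIO` is not shown in this listing and may differ.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: names are hypothetical, not taken from MessageColumnIO.
class ComputeIfAbsentExample {

  // Before: look up, create on miss, store, then use.
  static void addVerbose(Map<String, List<Integer>> index, String key, int value) {
    List<Integer> values = index.get(key);
    if (values == null) {
      values = new ArrayList<>();
      index.put(key, values);
    }
    values.add(value);
  }

  // After: computeIfAbsent creates and stores the list only when the key is missing.
  static void addConcise(Map<String, List<Integer>> index, String key, int value) {
    index.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
  }

  public static void main(String[] args) {
    Map<String, List<Integer>> verbose = new HashMap<>();
    Map<String, List<Integer>> concise = new HashMap<>();
    for (int i = 0; i < 4; i++) {
      String key = (i % 2 == 0) ? "even" : "odd";
      addVerbose(verbose, key, i);
      addConcise(concise, key, i);
    }
    System.out.println(verbose.equals(concise)); // both maps end up with the same contents
  }
}
```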
[
https://issues.apache.org/jira/browse/PARQUET-2394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17790812#comment-17790812
]
ASF GitHub Bot commented on PARQUET-2394:
-
Fokko opened a new pull request, #1217:
URL:
Fokko opened a new pull request, #1217:
URL: https://github.com/apache/parquet-mr/pull/1217
[
https://issues.apache.org/jira/browse/PARQUET-2393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17790810#comment-17790810
]
ASF GitHub Bot commented on PARQUET-2393:
-
Fokko opened a new pull request, #1216:
URL:
Fokko Driesprong created PARQUET-2393:
-
Summary: Make `ColumnIOCreatorVisitor` static
Key: PARQUET-2393
URL: https://issues.apache.org/jira/browse/PARQUET-2393
Project: Parquet
Issue
Fokko opened a new pull request, #1216:
URL: https://github.com/apache/parquet-mr/pull/1216
[
https://issues.apache.org/jira/browse/PARQUET-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17790807#comment-17790807
]
ASF GitHub Bot commented on PARQUET-2392:
-
Fokko opened a new pull request, #1215:
URL:
Fokko opened a new pull request, #1215:
URL: https://github.com/apache/parquet-mr/pull/1215
StringBuilder only makes sense when you concatenate in a loop.
Fokko Driesprong created PARQUET-2392:
-
Summary: Remove StringBuilder in `LogicalTypeAnnotation`
Key: PARQUET-2392
URL: https://issues.apache.org/jira/browse/PARQUET-2392
Project: Parquet
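The PR note for PARQUET-2392 says a `StringBuilder` only pays off when concatenating in a loop. A minimal sketch of that distinction follows; the method names and the `DECIMAL(9,2)` style string are assumptions chosen for illustration, not the actual `LogicalTypeAnnotation` code.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: method names and formats are hypothetical.
class StringConcatExample {

  // Before: a builder for a single, fixed-shape expression adds noise without benefit.
  static String describeWithBuilder(String name, int precision, int scale) {
    StringBuilder sb = new StringBuilder();
    sb.append(name).append("(").append(precision).append(",").append(scale).append(")");
    return sb.toString();
  }

  // After: one concatenation expression is shorter and produces the same string.
  static String describeWithConcat(String name, int precision, int scale) {
    return name + "(" + precision + "," + scale + ")";
  }

  // A loop is where an explicit builder still helps: it avoids creating
  // a new intermediate String on every iteration.
  static String joinInLoop(List<String> parts) {
    StringBuilder sb = new StringBuilder();
    for (String part : parts) {
      sb.append(part).append(';');
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(describeWithBuilder("DECIMAL", 9, 2));
    System.out.println(describeWithConcat("DECIMAL", 9, 2));
    System.out.println(joinInLoop(Arrays.asList("a", "b", "c")));
  }
}
```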
[
https://issues.apache.org/jira/browse/PARQUET-2391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17790805#comment-17790805
]
ASF GitHub Bot commented on PARQUET-2391:
-
Fokko opened a new pull request, #1214:
URL:
Fokko opened a new pull request, #1214:
URL: https://github.com/apache/parquet-mr/pull/1214
Fokko Driesprong created PARQUET-2391:
-
Summary: Remove unnecessary unboxing
Key: PARQUET-2391
URL: https://issues.apache.org/jira/browse/PARQUET-2391
Project: Parquet
Issue Type:
[
https://issues.apache.org/jira/browse/PARQUET-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17790800#comment-17790800
]
ASF GitHub Bot commented on PARQUET-2390:
-
Fokko opened a new pull request, #1213:
URL:
Fokko opened a new pull request, #1213:
URL: https://github.com/apache/parquet-mr/pull/1213
They are easier to read
Fokko Driesprong created PARQUET-2390:
-
Summary: Replace anonymous functions with lambdas
Key: PARQUET-2390
URL: https://issues.apache.org/jira/browse/PARQUET-2390
Project: Parquet
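The PARQUET-2390 change swaps anonymous classes for lambdas on the grounds that they are easier to read. The sketch below shows the general pattern on a `Comparator`; the class and field names are hypothetical, and the functional interfaces actually touched by the PR are not visible in this listing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustrative only: names are hypothetical, not parquet-mr code.
class LambdaExample {

  // Before: an anonymous class implementing a single-method interface.
  static final Comparator<String> BY_LENGTH_ANONYMOUS = new Comparator<String>() {
    @Override
    public int compare(String a, String b) {
      return Integer.compare(a.length(), b.length());
    }
  };

  // After: the same behavior expressed as a lambda.
  static final Comparator<String> BY_LENGTH_LAMBDA =
      (a, b) -> Integer.compare(a.length(), b.length());

  // Sorts a copy so the caller's list is left untouched.
  static List<String> sorted(List<String> input, Comparator<String> comparator) {
    List<String> copy = new ArrayList<>(input);
    copy.sort(comparator);
    return copy;
  }

  public static void main(String[] args) {
    List<String> names = Arrays.asList("ccc", "b", "aa");
    System.out.println(sorted(names, BY_LENGTH_ANONYMOUS));
    System.out.println(sorted(names, BY_LENGTH_LAMBDA));
  }
}
```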
Fokko opened a new pull request, #1212:
URL: https://github.com/apache/parquet-mr/pull/1212
Just some cleanup
Fokko Driesprong created PARQUET-2389:
-
Summary: Remove redundant initializers
Key: PARQUET-2389
URL: https://issues.apache.org/jira/browse/PARQUET-2389
Project: Parquet
Issue Type:
[
https://issues.apache.org/jira/browse/PARQUET-2389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17790797#comment-17790797
]
ASF GitHub Bot commented on PARQUET-2389:
-
Fokko opened a new pull request, #1212:
URL:
[
https://issues.apache.org/jira/browse/PARQUET-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17790795#comment-17790795
]
ASF GitHub Bot commented on PARQUET-2388:
-
Fokko opened a new pull request, #1211:
URL:
Fokko opened a new pull request, #1211:
URL: https://github.com/apache/parquet-mr/pull/1211
Not being used
Fokko Driesprong created PARQUET-2388:
-
Summary: Deprecate `CHARSETS` on `PlainValuesWriter`
Key: PARQUET-2388
URL: https://issues.apache.org/jira/browse/PARQUET-2388
Project: Parquet
[
https://issues.apache.org/jira/browse/PARQUET-2387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17790793#comment-17790793
]
ASF GitHub Bot commented on PARQUET-2387:
-
Fokko opened a new pull request, #1210:
URL:
Fokko opened a new pull request, #1210:
URL: https://github.com/apache/parquet-mr/pull/1210
Fokko Driesprong created PARQUET-2387:
-
Summary: Simplify `hasFieldsIgnored` expression
Key: PARQUET-2387
URL: https://issues.apache.org/jira/browse/PARQUET-2387
Project: Parquet
Issue
[
https://issues.apache.org/jira/browse/PARQUET-2344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17790791#comment-17790791
]
ASF GitHub Bot commented on PARQUET-2344:
-
Fokko commented on PR #1192:
URL:
Fokko commented on PR #1192:
URL: https://github.com/apache/parquet-mr/pull/1192#issuecomment-1830859619
@wgtmac Thanks for splitting out the format upgrade. Always a good idea to
make PRs smaller.
I finally fixed all the tests, and this looks good to go to me
[
https://issues.apache.org/jira/browse/PARQUET-2296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Fokko Driesprong resolved PARQUET-2296.
---
Resolution: Fixed
> Bump easymock from 3.4 to 5.1.0
>
[
https://issues.apache.org/jira/browse/PARQUET-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Fokko Driesprong resolved PARQUET-2300.
---
Assignee: Gang Wu
Resolution: Fixed
> Update jackson-core 2.13.4 to a
[
https://issues.apache.org/jira/browse/PARQUET-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Fokko Driesprong resolved PARQUET-2336.
---
Resolution: Fixed
> Add caching key to CodecFactory
>
[
https://issues.apache.org/jira/browse/PARQUET-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Fokko Driesprong resolved PARQUET-2384.
---
Resolution: Fixed
> Mark toOriginalType as deprecated
>
[
https://issues.apache.org/jira/browse/PARQUET-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Fokko Driesprong resolved PARQUET-2368.
---
Resolution: Fixed
> Update japicmp to 1.18.1
>
>
>
[
https://issues.apache.org/jira/browse/PARQUET-2382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Fokko Driesprong updated PARQUET-2382:
--
Fix Version/s: 2.0.0
(was: 1.14.0)
> Remove the deprecated
[
https://issues.apache.org/jira/browse/PARQUET-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17790790#comment-17790790
]
ASF GitHub Bot commented on PARQUET-2384:
-
Fokko merged PR #1202:
URL:
Fokko merged PR #1206:
URL: https://github.com/apache/parquet-mr/pull/1206
[
https://issues.apache.org/jira/browse/PARQUET-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17790789#comment-17790789
]
ASF GitHub Bot commented on PARQUET-2384:
-
Fokko commented on PR #1202:
URL:
Fokko merged PR #1202:
URL: https://github.com/apache/parquet-mr/pull/1202
Fokko commented on PR #1202:
URL: https://github.com/apache/parquet-mr/pull/1202#issuecomment-1830855336
Thanks for the review @wgtmac
We've recently encountered files that have inconsistencies between the
number of rows specified in the row group [1] and the total number of
values in a column [2] for non-repeated columns (within a file there is
inconsistency between columns but all counts appear to be greater than or
equal to
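The consistency check described in this message can be sketched as follows, assuming that for a non-repeated column the chunk-level value count (nulls included) is expected to match the row group's row count. `RowGroup` and `ColumnChunk` here are hypothetical, simplified stand-ins for the footer metadata; they are not the parquet-mr or pyarrow API, and the sample numbers are invented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified stand-ins for Parquet footer metadata; not a real library API.
class RowCountCheck {

  static final class ColumnChunk {
    final String path;
    final long numValues;
    final boolean repeated;

    ColumnChunk(String path, long numValues, boolean repeated) {
      this.path = path;
      this.numValues = numValues;
      this.repeated = repeated;
    }
  }

  static final class RowGroup {
    final long numRows;
    final List<ColumnChunk> columns;

    RowGroup(long numRows, List<ColumnChunk> columns) {
      this.numRows = numRows;
      this.columns = columns;
    }
  }

  // Reports every non-repeated column whose value count differs from the row count.
  // Repeated columns are skipped because they may legitimately hold more values than rows.
  static List<String> inconsistentColumns(RowGroup rowGroup) {
    List<String> mismatches = new ArrayList<>();
    for (ColumnChunk column : rowGroup.columns) {
      if (!column.repeated && column.numValues != rowGroup.numRows) {
        mismatches.add(column.path + ": num_values=" + column.numValues
            + ", num_rows=" + rowGroup.numRows);
      }
    }
    return mismatches;
  }

  public static void main(String[] args) {
    // Invented sample data: one consistent column, one inconsistent, one repeated.
    RowGroup rowGroup = new RowGroup(100, Arrays.asList(
        new ColumnChunk("id", 100, false),
        new ColumnChunk("name", 103, false),
        new ColumnChunk("tags.element", 250, true)));
    inconsistentColumns(rowGroup).forEach(System.out::println);
  }
}
```

With real files, the same comparison would be fed from the footer's row group and column chunk metadata rather than hand-built objects.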
[
https://issues.apache.org/jira/browse/PARQUET-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17790682#comment-17790682
]
ASF GitHub Bot commented on PARQUET-2386:
-
amousavigourabi commented on PR #1209:
URL:
amousavigourabi commented on PR #1209:
URL: https://github.com/apache/parquet-mr/pull/1209#issuecomment-1830328306
The `.editorconfig` has been expanded for IntelliJ and is mostly compliant
with the Spotless configuration. IntelliJ refactoring and Spotless have some
minor disagreements on
[
https://issues.apache.org/jira/browse/PARQUET-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17790681#comment-17790681
]
ASF GitHub Bot commented on PARQUET-2386:
-
amousavigourabi commented on PR #1209:
URL:
amousavigourabi commented on PR #1209:
URL: https://github.com/apache/parquet-mr/pull/1209#issuecomment-1830320974
@wgtmac
[
https://issues.apache.org/jira/browse/PARQUET-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17790680#comment-17790680
]
ASF GitHub Bot commented on PARQUET-2386:
-
amousavigourabi opened a new pull request, #1209:
amousavigourabi opened a new pull request, #1209:
URL: https://github.com/apache/parquet-mr/pull/1209
Make sure you have checked _all_ steps below.
### Jira
- [x] My PR addresses the following [Parquet
Jira](https://issues.apache.org/jira/browse/PARQUET/) issues and references
62 matches