[GitHub] [arrow] nevi-me commented on pull request #8402: ARROW-8426: [Rust] [Parquet] - Add more support for converting Dicts

2020-10-26 Thread GitBox
nevi-me commented on pull request #8402: URL: https://github.com/apache/arrow/pull/8402#issuecomment-716974211 Merged

[GitHub] [arrow] nevi-me commented on pull request #8402: ARROW-8426: [Rust] [Parquet] - Add more support for converting Dicts

2020-10-25 Thread GitBox
nevi-me commented on pull request #8402: URL: https://github.com/apache/arrow/pull/8402#issuecomment-716110760 @carols10cents @alamb I think the whole reader logic needs replumbing ... There's at least a 1:1 mapping between Parquet types and Arrow types, and we can cast from Arrow types to

[GitHub] [arrow] nevi-me commented on pull request #8402: ARROW-8426: [Rust] [Parquet] - Add more support for converting Dicts

2020-10-24 Thread GitBox
nevi-me commented on pull request #8402: URL: https://github.com/apache/arrow/pull/8402#issuecomment-716092619 I've botched this branch a bit with my rebase on the parquet branch. I rebased it against the parquet branch, but then I started getting stack overflows on datafusion and parqu

[GitHub] [arrow] nevi-me commented on pull request #8402: ARROW-8426: [Rust] [Parquet] - Add more support for converting Dicts

2020-10-10 Thread GitBox
nevi-me commented on pull request #8402: URL: https://github.com/apache/arrow/pull/8402#issuecomment-706556497 Parquet's dictionary encoding is a complexity of its own. My understanding is that after a certain size, the dictionary no longer grows, but the additional values are stored the no
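
The behaviour described in that comment (a dictionary that stops growing past a certain size, with later values stored outside it) can be illustrated with a small sketch. This is not the Parquet crate's code: `CappedDictEncoder`, `Encoded`, and `max_entries` are hypothetical names, the cap here counts entries rather than bytes, and the fallback is applied per value rather than per page, purely to keep the example short.

```rust
use std::collections::HashMap;

/// Hypothetical encoded form of one value: either an index into the
/// dictionary, or the raw value stored as-is once the dictionary is full.
#[derive(Debug, PartialEq)]
enum Encoded {
    DictIndex(usize),
    Plain(String),
}

/// Hypothetical encoder whose dictionary stops growing at `max_entries`.
/// (Assumption for illustration: the cap is a number of entries.)
struct CappedDictEncoder {
    max_entries: usize,
    dictionary: Vec<String>,
    lookup: HashMap<String, usize>,
}

impl CappedDictEncoder {
    fn new(max_entries: usize) -> Self {
        Self {
            max_entries,
            dictionary: Vec::new(),
            lookup: HashMap::new(),
        }
    }

    /// Encode one value, falling back to plain storage when the value is
    /// new and the dictionary has already reached its cap.
    fn encode(&mut self, value: &str) -> Encoded {
        // Values already in the dictionary always encode as an index.
        if let Some(&idx) = self.lookup.get(value) {
            return Encoded::DictIndex(idx);
        }
        // New value and room left: grow the dictionary.
        if self.dictionary.len() < self.max_entries {
            let idx = self.dictionary.len();
            self.dictionary.push(value.to_string());
            self.lookup.insert(value.to_string(), idx);
            return Encoded::DictIndex(idx);
        }
        // Dictionary is full: store the value itself.
        Encoded::Plain(value.to_string())
    }

    /// Recover the original value from either encoded form.
    fn decode(&self, encoded: &Encoded) -> String {
        match encoded {
            Encoded::DictIndex(idx) => self.dictionary[*idx].clone(),
            Encoded::Plain(value) => value.clone(),
        }
    }
}
```

With `CappedDictEncoder::new(2)`, the first two distinct values receive dictionary indices, a third distinct value comes back as `Encoded::Plain`, and repeats of the first two still resolve to their indices. A reader therefore has to handle both forms within the same column, which is part of what makes the conversion non-trivial.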