As a follow-up to the "V3" discussions [1][2] I wanted to start a thread on
improvements to encodings.
There are several areas to pursue here:
1. Curating a standard set of benchmarks and criteria for determining whether
a new encoding is worth adding.
2. Developing new encodings.
3. Better implement
As a follow-up to the "V3" discussions [1][2] I wanted to start a thread on
improvements to the footer metadata.
Based on conversation so far, there have been a few proposals [3][4][5] to
help better support files with wide schemas and many row-groups. I think
there are a lot of interesting ideas
As a follow-up to the "V3" discussions [1][2] I wanted to start a
discussion to see who is interested in improving Parquet infrastructure.
In particular, as we consider newer features, I think we should be
considering regular major version releases, to allow for new features to
become default.
The
Hi Everyone,
Just to follow up, conversations on the summary doc [1] have largely slowed
down. I see roughly three different tracks, and I'll start
threads to get a sense of who is interested (please be on the lookout for
discussion threads). I think as those conversations branch
Thanks a lot Weston for bringing this up.
Last time we discussed a potential Java upgrade, Hadoop was what prevented
us from doing so. Hadoop is still on Java 8.
If we want to keep Arrow on the latest version, we will need to upgrade to
Java 11. In this case we won't be able to support Hadoop wit