Hi Parquet Dev -

There have been some conversations about content stored in the
parquet-format GitHub repo vs. the website. From a cursory pass of the
parquet-format <https://github.com/apache/parquet-format> repo, it looks
like, other than the markdown documentation stored there, most of the
core code was marked as deprecated in
https://github.com/apache/parquet-format/pull/105 and its content was moved
to parquet-mr, so the parquet-format repo now really only exists to host
this file:
https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift
It's possible I'm missing something, but is my understanding correct?

If so, would it make sense to just deprecate parquet-format as a repo, move
the content to be exclusively hosted on parquet-site
<https://github.com/apache/parquet-site/tree/asf-site>, and host the thrift
file elsewhere? This would solve the content duplication problem between
parquet-format and the website, and would remove the overhead of managing a
separate repo. I know there is a benefit to having comments/discussions on
PRs or issues in the repo, but we could also pretty easily move those
discussions to the site repo.

I'm sure this proposal will elicit some strong responses, but I wanted to
see if anyone has insights here or if I'm missing anything.

Thanks, Vinoo

