Hi Parquet Dev,

There have been some conversations about content stored in the parquet-format GitHub repo vs. the website. From a cursory pass of the parquet-format <https://github.com/apache/parquet-format> repo, it looks like, other than the markdown documentation stored in the repo, most of the core code was marked as deprecated in https://github.com/apache/parquet-format/pull/105, its content was moved to parquet-mr, and the repo now really only exists to host this file: https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift. It's possible I'm missing something, but is my understanding correct?

If so, would it make sense to deprecate parquet-format as a repo, move the content so it is hosted exclusively on parquet-site <https://github.com/apache/parquet-site/tree/asf-site>, and host the thrift file elsewhere? This would solve the content duplication problem between parquet-format and the website, and would remove the need to manage a separate repo. I know there is benefit to having comments and discussions on PRs or issues in the repo, but we could also pretty easily port this to the site. I'm sure this proposal will elicit some strong responses, but I wanted to see if anyone has insights here, or if I'm missing anything.

Thanks,
Vinoo <[email protected]>
