emkornfield commented on code in PR #53:
URL: https://github.com/apache/parquet-site/pull/53#discussion_r1583283195


##########
content/en/docs/Overview/_index.md:
##########
@@ -7,3 +7,30 @@ description: >
 ---
 
 Apache Parquet is a columnar storage format available to any project in the 
Hadoop ecosystem, regardless of the choice of data processing framework, data 
model or programming language.
+
+This documentation contains information about both the 
[parquet-mr](https://github.com/apache/parquet-mr) and 
[parquet-format](https://github.com/apache/parquet-format) projects. 
+
+
+### Parquet Format
+
+The "Parquet Format" project hosts the official specification of the Parquet 
file format, defining how data is structured and stored. This specification, 
along with Thrift metadata definitions and other crucial components, is 
essential for developers to effectively read and write Parquet files. The 
parquet-format project specifically contains the format specifications needed 
to understand and properly utilize Parquet files.
+
+As a repository focused on specification, the parquet-format repository does 
not contain source code. 
+
+
+### Parquet MR 
+
+The parquet-mr GitHub repository is part of the Apache Parquet project and 
specifically focuses on providing Java tools for handling the Parquet file 
format within the Hadoop ecosystem. Essentially, this repository includes all 
the necessary Java libraries and modules that allow developers to read and 
write Parquet files.
+
+ Parquet MR can be thought of as a "reference" implementation of 
parquet-format. There are a number of other Parquet Format implementations, 
such as [parquet-cpp](https://github.com/apache/parquet-cpp) and [parquet 
rust](https://github.com/apache/arrow-rs/blob/master/parquet/README.md). 
+
+
+* Java/Scala Implementation: It contains the core Java/Scala implementation of 
the Parquet format, making it possible to use Parquet files in Java 
applications, particularly those based on Hadoop.
+
+* Utilities and APIs: It provides various utilities and APIs for working with 
Parquet files, including tools for data import/export, schema management, and 
data conversion.
+
+
+###  Other Clients / Libraries / Tools

Review Comment:
   I'm not sure it is crucial; I added some more below. I think the mailing 
list discussion on how the Parquet community views other implementations is 
something we should document once we reach a consensus.



