wgtmac commented on code in PR #69:
URL: https://github.com/apache/parquet-site/pull/69#discussion_r1667870201
##########
content/en/docs/File Format/_index.md:
##########
@@ -11,29 +11,29 @@ This file and the thrift definition should be read together to understand the fo
```
4-byte magic number "PAR1"
- <Column 1 Chunk 1 + Column Metadata>
- <Column 2 Chunk 1 + Column Metadata>
+ <Column 1 Chunk 1>
+ <Column 2 Chunk 1>
...
- <Column N Chunk 1 + Column Metadata>
- <Column 1 Chunk 2 + Column Metadata>
- <Column 2 Chunk 2 + Column Metadata>
+ <Column N Chunk 1>
+ <Column 1 Chunk 2>
+ <Column 2 Chunk 2>
...
- <Column N Chunk 2 + Column Metadata>
+ <Column N Chunk 2>
...
- <Column 1 Chunk M + Column Metadata>
- <Column 2 Chunk M + Column Metadata>
+ <Column 1 Chunk M>
+ <Column 2 Chunk M>
...
- <Column N Chunk M + Column Metadata>
+ <Column N Chunk M>
File Metadata
4-byte length in bytes of file metadata (little endian)
4-byte magic number "PAR1"
```
In the above example, there are N columns in this table, split into M row
-groups. The file metadata contains the locations of all the column metadata
+groups. The file metadata contains the locations of all the column chunk
start locations. More details on what is contained in the metadata can be found
in the Thrift definition.
-Metadata is written after the data to allow for single pass writing.
+File Metadata is written after the data to allow for single pass writing.
Review Comment:
```suggestion
File metadata is written after the data to allow for single pass writing.
```
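
For context on why the trailing placement matters: with the layout in this hunk, a reader can find the file metadata from the tail of the file alone. Below is a minimal sketch in Python of that lookup, not the actual reader implementation. The helper name `locate_file_metadata` and the toy buffer are hypothetical, and the sketch only locates the metadata bytes; it does not decode the Thrift structure.

```python
import struct

MAGIC = b"PAR1"  # 4-byte magic number at both ends of the file


def locate_file_metadata(data: bytes) -> tuple:
    """Return (offset, length) of the serialized file metadata in `data`.

    Hypothetical helper: it relies only on the layout quoted in the diff
    (magic, column chunks, file metadata, 4-byte little-endian length, magic).
    """
    if len(data) < 12:
        raise ValueError("too short to hold both magic numbers and a length")
    if data[:4] != MAGIC or data[-4:] != MAGIC:
        raise ValueError("missing PAR1 magic number")
    # The 4 bytes before the trailing magic hold the metadata length.
    (length,) = struct.unpack("<I", data[-8:-4])
    offset = len(data) - 8 - length
    if offset < 4:
        raise ValueError("metadata length exceeds file size")
    return offset, length


# Toy buffer with placeholder bytes; this is NOT a valid Parquet file.
chunks = b"\x00" * 10    # stands in for the column chunks
metadata = b"\x01" * 5   # stands in for the serialized file metadata
blob = MAGIC + chunks + metadata + struct.pack("<I", len(metadata)) + MAGIC

offset, length = locate_file_metadata(blob)
```

Because the length and magic sit at the very end, a writer can stream the column chunks first and emit the file metadata last, which is the single pass writing the sentence refers to.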