rdblue commented on code in PR #4945:
URL: https://github.com/apache/iceberg/pull/4945#discussion_r892950181


##########
format/spec.md:
##########
@@ -631,6 +632,29 @@ When expiring snapshots, retention policies in table and snapshot references are
     2. The snapshot is not one of the first `min-snapshots-to-keep` in the branch (including the branch's referenced snapshot)
 5. Expire any snapshot not in the set of snapshots to retain.
 
+#### Statistics file
+
+Statistics files are valid [Puffin files](../puffin-spec). Statistics are informational. A reader can choose to
+ignore statistics information. Statistics support is not required to read the table correctly.
+
+Statistics files' metadata within `statistics` table snapshot field is a struct with the following fields:
+
+| v2         | Field name                      | Type                                   | Description                                                                                            |
+|------------|---------------------------------|----------------------------------------|--------------------------------------------------------------------------------------------------------|
+| _required_ | **`statistics-path`**           | `string`                               | Path of the statistics file. See [Puffin file format](../puffin-spec).                                 |
+| _required_ | **`file-size-in-bytes`**        | `long`                                 | Size of the statistics file.                                                                           |
+| _required_ | **`file-footer-size-in-bytes`** | `long`                                 | Size of the statistics file's footer. See [Puffin file format](../puffin-spec) for footer definition.  |
+| _required_ | **`source-sequence-number`**    | `long`                                 | Table sequence number at which the stats were calculated                                               |
+| _required_ | **`statistics-metadata`**       | `list<statistic metadata>` (see below) | A list of the statistics metadata for statistics contained in the file with structure described below. |

Review Comment:
   Can we call this something other than `statistics-metadata`? Maybe 
`blob-metadata` instead since this is likely to be shared with indexes?
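
To make the naming question concrete, here is a minimal sketch of how one `statistics` entry from the table above might be modeled and serialized. It is only an illustration: the `StatisticsFile` class, its `to_dict` helper, and the example path are hypothetical and are not part of Iceberg's API. The per-blob entries are kept opaque because their structure is defined elsewhere in the spec.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class StatisticsFile:
    """Hypothetical in-memory form of one `statistics` entry (not Iceberg's API)."""

    statistics_path: str
    file_size_in_bytes: int
    file_footer_size_in_bytes: int
    source_sequence_number: int
    # Per-blob metadata is left as opaque dicts; its fields are not shown in this hunk.
    blob_metadata: List[Dict[str, Any]] = field(default_factory=list)

    def to_dict(self, list_field_name: str = "blob-metadata") -> Dict[str, Any]:
        """Serialize with the hyphenated field names from the spec table.

        `list_field_name` is a parameter only so both candidate names
        (`statistics-metadata` and `blob-metadata`) can be compared.
        """
        return {
            "statistics-path": self.statistics_path,
            "file-size-in-bytes": self.file_size_in_bytes,
            "file-footer-size-in-bytes": self.file_footer_size_in_bytes,
            "source-sequence-number": self.source_sequence_number,
            list_field_name: list(self.blob_metadata),
        }


# Example values are made up for illustration.
entry = StatisticsFile(
    statistics_path="s3://example-bucket/table/metadata/stats.puffin",
    file_size_in_bytes=4096,
    file_footer_size_in_bytes=512,
    source_sequence_number=7,
)
serialized = entry.to_dict()
```

With the rename, only the key of the list field changes. The four scalar fields are unaffected, which is why a neutral name could be reused if index files later share this structure.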



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

