This is an automated email from the ASF dual-hosted git repository.
russellspitzer pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/iceberg.git
The following commit(s) were added to refs/heads/main by this push:
new 7738e1d722 Spec: Fix table of content generation (#11067)
7738e1d722 is described below
commit 7738e1d7228474e36f661cfa1a15a2e8f8410bcd
Author: Ajantha Bhat <[email protected]>
AuthorDate: Sat Oct 26 02:07:05 2024 +0530
Spec: Fix table of content generation (#11067)
---
format/spec.md | 94 +++++++++++++++++++++++++++++-----------------------------
1 file changed, 47 insertions(+), 47 deletions(-)
diff --git a/format/spec.md b/format/spec.md
index 601cbcc3bc..6b80e876ed 100644
--- a/format/spec.md
+++ b/format/spec.md
@@ -30,13 +30,13 @@ Versions 1 and 2 of the Iceberg spec are complete and adopted by the community.
The format version number is incremented when new features are added that will
break forward-compatibility---that is, when older readers would not read newer
table features correctly. Tables may continue to be written with an older
version of the spec to ensure compatibility by not using features that are not
yet implemented by processing engines.
-#### Version 1: Analytic Data Tables
+### Version 1: Analytic Data Tables
Version 1 of the Iceberg spec defines how to manage large analytic tables
using immutable file formats: Parquet, Avro, and ORC.
All version 1 data and metadata files are valid after upgrading a table to
version 2. [Appendix E](#version-2) documents how to default version 2 fields
when reading version 1 metadata.
-#### Version 2: Row-level Deletes
+### Version 2: Row-level Deletes
Version 2 of the Iceberg spec adds row-level updates and deletes for analytic
tables with immutable files.
@@ -44,7 +44,7 @@ The primary change in version 2 adds delete files to encode rows that are delete
In addition to row-level deletes, version 2 makes some requirements stricter
for writers. The full set of changes are listed in [Appendix E](#version-2).
-#### Version 3: Extended Types and Capabilities
+### Version 3: Extended Types and Capabilities
Version 3 of the Iceberg spec extends data types and existing metadata
structures to add new capabilities:
@@ -75,7 +75,7 @@ Data files in snapshots are tracked by one or more manifest files that contain a
The manifests that make up a snapshot are stored in a manifest list file. Each
manifest list stores metadata about manifests, including partition stats and
data file counts. These stats are used to avoid reading manifests that are not
required for an operation.
-#### Optimistic Concurrency
+### Optimistic Concurrency
An atomic swap of one table metadata file for another provides the basis for
serializable isolation. Readers use the snapshot that was current when they
load the table metadata and are not affected by changes until they refresh and
pick up a new metadata location.
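[Editor's aside] The optimistic concurrency scheme described above amounts to a compare-and-swap loop over the table's metadata pointer. A minimal sketch, with hypothetical names and a toy catalog rather than Iceberg's actual API:

```python
class CommitFailedException(Exception):
    pass

class Catalog:
    """Toy catalog: one pointer per table, the current metadata file location."""
    def __init__(self):
        self.pointers = {}

    def swap(self, table, expected, new):
        # In a real catalog this check-and-put must be atomic.
        if self.pointers.get(table) != expected:
            raise CommitFailedException("base metadata is no longer current")
        self.pointers[table] = new

def commit(catalog, table, write_new_metadata, max_retries=3):
    for _ in range(max_retries):
        base = catalog.pointers.get(table)   # metadata the update is based on
        new = write_new_metadata(base)       # re-apply the update to that base
        try:
            catalog.swap(table, base, new)   # atomic swap; fails if another
            return new                       # writer committed first
        except CommitFailedException:
            continue                         # refresh and retry
    raise CommitFailedException("retries exhausted")
```

Readers are unaffected by this loop: they keep using whichever metadata location they loaded until they refresh.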
@@ -85,7 +85,7 @@ If the snapshot on which an update is based is no longer current, the writer mus
The conditions required for a write to successfully commit determine the
isolation level. Writers can select what to validate and can make different
isolation guarantees.
-#### Sequence Numbers
+### Sequence Numbers
The relative age of data and delete files relies on a sequence number that is
assigned to every successful commit. When a snapshot is created for a commit,
it is optimistically assigned the next sequence number, and it is written into
the snapshot's metadata. If the commit fails and must be retried, the sequence
number is reassigned and written into new snapshot metadata.
@@ -94,7 +94,7 @@ All manifests, data files, and delete files created for a snapshot inherit the s
Inheriting the sequence number from manifest metadata allows writing a new
manifest once and reusing it in commit retries. To change a sequence number for
a retry, only the manifest list must be rewritten -- which would be rewritten
anyway with the latest set of manifests.
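[Editor's aside] The retry optimization above can be sketched as a small resolution step over manifest-list entries. The dict layout below is hypothetical (Iceberg stores manifest lists in Avro); a `None` sequence number stands in for "inherit from the snapshot":

```python
def assign_sequence_numbers(manifests, snapshot_sequence_number):
    """Resolve inherited sequence numbers for one snapshot's manifest list.

    A sequence_number of None means the manifest was written for this commit
    and inherits the snapshot's (possibly reassigned) number, so on a retry
    only the manifest list itself needs to be rewritten.
    """
    return [
        {**m, "sequence_number": snapshot_sequence_number
              if m["sequence_number"] is None else m["sequence_number"]}
        for m in manifests
    ]
```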
-#### Row-level Deletes
+### Row-level Deletes
Row-level deletes are stored in delete files.
@@ -106,7 +106,7 @@ There are two ways to encode a row-level delete:
Like data files, delete files are tracked by partition. In general, a delete
file must be applied to older data files with the same partition; see [Scan
Planning](#scan-planning) for details. Column metrics can be used to determine
whether a delete file's rows overlap the contents of a data file or a scan
range.
-#### File System Operations
+### File System Operations
Iceberg only requires that file systems support the following operations:
@@ -121,9 +121,9 @@ Tables do not require random-access writes. Once written, data and metadata file
Tables do not require rename, except for tables that use atomic rename to
implement the commit operation for new metadata files.
-# Specification
+## Specification
-### Terms
+#### Terms
* **Schema** -- Names and types of fields in a table.
* **Partition spec** -- A definition of how partition values are derived from
data fields.
@@ -133,7 +133,7 @@ Tables do not require rename, except for tables that use atomic rename to implem
* **Data file** -- A file that contains rows of a table.
* **Delete file** -- A file that encodes rows of a table that are deleted by
position or data values.
-### Writer requirements
+#### Writer requirements
Some tables in this spec have columns that specify requirements for tables by
version. These requirements are intended for writers when adding metadata files
(including manifest files and manifest lists) to a table with the given
version.
@@ -158,19 +158,19 @@ Readers should be more permissive because v1 metadata files are allowed in v2 ta
Readers may be more strict for metadata JSON files because the JSON files are
not reused and will always match the table version. Required fields that were
not present in or were optional in prior versions may be handled as required
fields. For example, a v2 table that is missing `last-sequence-number` can
throw an exception.
-### Writing data files
+#### Writing data files
All columns must be written to data files even if they introduce redundancy
with metadata stored in manifest files (e.g. columns with identity partition
transforms). Writing all columns provides a backup in case of corruption or
bugs in the metadata layer.
Writers are not allowed to commit files with a partition spec that contains a
field with an unknown transform.
-## Schemas and Data Types
+### Schemas and Data Types
A table's **schema** is a list of named columns. All data types are either
primitives or nested types, which are maps, lists, or structs. A table schema
is also a struct type.
For the representations of these types in Avro, ORC, and Parquet file formats,
see Appendix A.
-### Nested Types
+#### Nested Types
A **`struct`** is a tuple of typed values. Each field in the tuple is named
and has an integer id that is unique in the table schema. Each field can be
either optional or required, meaning that values can (or cannot) be null.
Fields may be any type. Fields may have an optional comment or doc string.
Fields can have [default values](#default-values).
@@ -178,7 +178,7 @@ A **`list`** is a collection of values with some element type. The element field
A **`map`** is a collection of key-value pairs with a key type and a value
type. Both the key field and value field each have an integer id that is unique
in the table schema. Map keys are required and map values can be either
optional or required. Both map keys and map values may be any type, including
nested types.
-### Primitive Types
+#### Primitive Types
Supported primitive types are defined in the table below. Primitive types
added after v1 have an "added by" version that is the first spec version in
which the type is allowed. For example, nanosecond-precision timestamps are
part of the v3 spec; using v3 types in v1 or v2 tables can break forward
compatibility.
@@ -211,7 +211,7 @@ Notes:
For details on how to serialize a schema to JSON, see Appendix C.
-### Default values
+#### Default values
Default values can be tracked for struct fields (both nested structs and the
top-level schema's struct). There can be two defaults with a field:
@@ -227,7 +227,7 @@ All columns of `unknown` type must default to null. Non-null values for `initial
Default values are attributes of fields in schemas and serialized with fields
in the JSON format. See [Appendix C](#appendix-c-json-serialization).
-### Schema Evolution
+#### Schema Evolution
Schemas may be evolved by type promotion or adding, deleting, renaming, or
reordering fields in structs (both nested structs and the top-level schema’s
struct).
@@ -275,7 +275,7 @@ Struct evolution requires the following rules for default values:
* If a field value is missing from a struct's `write-default`, the field's
`write-default` must be used for the field
-#### Column Projection
+##### Column Projection
Columns in Iceberg data files are selected by field id. The table schema's
column names and order may change after a data file is written, and projection
must be done using field ids.
@@ -307,7 +307,7 @@ Field mapping fields are constrained by the following rules:
For details on serialization, see [Appendix C](#name-mapping-serialization).
-### Identifier Field IDs
+#### Identifier Field IDs
A schema can optionally track the set of primitive fields that identify rows
in a table, using the property `identifier-field-ids` (see JSON encoding in
Appendix C).
@@ -316,7 +316,7 @@ Two rows are the "same"---that is, the rows represent the same entity---if the i
Identifier fields may be nested in structs but cannot be nested within maps or
lists. Float, double, and optional fields cannot be used as identifier fields
and a nested field cannot be used as an identifier field if it is nested in an
optional struct, to avoid null values in identifiers.
-### Reserved Field IDs
+#### Reserved Field IDs
Iceberg tables must not use field ids greater than 2147483447
(`Integer.MAX_VALUE - 200`). This id range is reserved for metadata columns
that can be used in user data schemas, like the `_file` column that holds the
file path in which a row was stored.
@@ -335,7 +335,7 @@ The set of metadata columns is:
| **`2147483543 _row_id`** | `long` | A unique long assigned when row-lineage is enabled, see [Row Lineage](#row-lineage) |
| **`2147483542 _last_updated_sequence_number`** | `long` | The sequence number which last updated this row when row-lineage is enabled, see [Row Lineage](#row-lineage) |
-### Row Lineage
+#### Row Lineage
In v3 and later, an Iceberg table can track row lineage fields for all newly
created rows. Row lineage is enabled by setting the field `row-lineage` to
true in the table's metadata. When enabled, engines must maintain the
`next-row-id` table field and the following row-level fields when writing data
files:
@@ -347,7 +347,7 @@ These fields are assigned and updated by inheritance because the commit sequence
When row lineage is enabled, new snapshots cannot include [Equality
Deletes](#equality-delete-files). Row lineage is incompatible with equality
deletes because lineage values must be maintained, but equality deletes are
used to avoid reading existing data before writing changes.
-#### Row lineage assignment
+##### Row lineage assignment
Row lineage fields are written when row lineage is enabled. When not enabled,
row lineage fields (`_row_id` and `_last_updated_sequence_number`) must not be
written to data files. The rest of this section applies when row lineage is
enabled.
@@ -368,7 +368,7 @@ When an existing row is moved to a different data file for any reason, writers a
3. If the write has not modified the row, the existing non-null
`_last_updated_sequence_number` value must be copied to the new data file
-#### Row lineage example
+##### Row lineage example
This example demonstrates how `_row_id` and `_last_updated_sequence_number`
are assigned for a snapshot when row lineage is enabled. This starts with a
table with row lineage enabled and a `next-row-id` of 1000.
@@ -409,7 +409,7 @@ Files `data2` and `data3` are written with `null` for `first_row_id` and are ass
When the new snapshot is committed, the table's `next-row-id` must also be
updated (even if the new snapshot is not in the main branch). Because 225 rows
were added (`added1`: 100 + `added2`: 0 + `added3`: 125), the new value is
1,000 + 225 = 1,225:
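[Editor's aside] The `next-row-id` update is plain arithmetic over the added files' record counts; as an illustrative helper:

```python
def advance_next_row_id(next_row_id, added_record_counts):
    # next-row-id advances by the number of rows added in the snapshot,
    # keeping it higher than any row ID assigned so far.
    return next_row_id + sum(added_record_counts)

# advance_next_row_id(1000, [100, 0, 125])  # -> 1225
```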
-### Enabling Row Lineage for Non-empty Tables
+##### Enabling Row Lineage for Non-empty Tables
Any snapshot without the field `first-row-id` does not have any lineage
information and values for `_row_id` and `_last_updated_sequence_number` cannot
be assigned accurately.
@@ -419,7 +419,7 @@ null should be explicitly written. After this point, rows are treated as if they
and assigned `_row_id` and `_last_updated_sequence_number` as if they were new rows.
-## Partitioning
+### Partitioning
Data files are stored in manifests with a tuple of partition values that are
used in scans to filter out files that cannot contain records that match the
scan’s filter predicate. Partition values for a data file must be the same for
all records stored in the data file. (Manifests store data files from any
partition, as long as the partition spec is the same for the data files.)
@@ -440,7 +440,7 @@ Two partition specs are considered equivalent with each other if they have the s
Partition field IDs must be reused if an existing partition spec contains an
equivalent field.
-### Partition Transforms
+#### Partition Transforms
| Transform name | Description | Source types | Result type |
|-------------------|--------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------|-------------|
@@ -458,7 +458,7 @@ All transforms must return `null` for a `null` input value.
The `void` transform may be used to replace the transform in an existing
partition field so that the field is effectively dropped in v1 tables. See
partition evolution below.
-### Bucket Transform Details
+#### Bucket Transform Details
Bucket partition transforms use a 32-bit hash of the source value. The 32-bit
hash implementation is the 32-bit Murmur3 hash, x86 variant, seeded with 0.
@@ -475,7 +475,7 @@ Notes:
For hash function details by type, see Appendix B.
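[Editor's aside] For concreteness, a sketch of the bucket transform for `long` values: 32-bit Murmur3 (x86 variant, seed 0) over the value's little-endian 8-byte encoding, masked non-negative, then reduced modulo the bucket count. This is a from-scratch illustration written for this note, not Iceberg's implementation:

```python
def murmur3_x86_32(data: bytes, seed: int = 0) -> int:
    """32-bit Murmur3 hash, x86 variant (as required by the spec above)."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed
    n = len(data) // 4 * 4
    for i in range(0, n, 4):            # 4-byte blocks, little-endian
        k = int.from_bytes(data[i:i + 4], "little")
        k = (k * c1) & 0xFFFFFFFF
        k = ((k << 15) | (k >> 17)) & 0xFFFFFFFF
        k = (k * c2) & 0xFFFFFFFF
        h ^= k
        h = ((h << 13) | (h >> 19)) & 0xFFFFFFFF
        h = (h * 5 + 0xE6546B64) & 0xFFFFFFFF
    k = 0
    for b in reversed(data[n:]):        # trailing 1-3 bytes, if any
        k = (k << 8) | b
    if n != len(data):
        k = (k * c1) & 0xFFFFFFFF
        k = ((k << 15) | (k >> 17)) & 0xFFFFFFFF
        k = (k * c2) & 0xFFFFFFFF
        h ^= k
    h ^= len(data)                      # finalization mix
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & 0xFFFFFFFF
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & 0xFFFFFFFF
    h ^= h >> 16
    return h

def bucket_long(value: int, num_buckets: int) -> int:
    # Longs hash their little-endian 8-byte two's-complement encoding; the
    # hash is masked to a non-negative int before taking the modulus.
    data = (value & 0xFFFFFFFFFFFFFFFF).to_bytes(8, "little")
    return (murmur3_x86_32(data) & 0x7FFFFFFF) % num_buckets
```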
-### Truncate Transform Details
+#### Truncate Transform Details
| **Type** | **Config** | **Truncate specification** | **Examples** |
|---------------|-----------------------|------------------------------------------------------------------|----------------------------------|
@@ -493,7 +493,7 @@ Notes:
4. In contrast to strings, binary values do not have an assumed encoding and
are truncated to `L` bytes.
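[Editor's aside] The truncate rules above are easy to state in code. Illustrative helpers only (`w` is the width for numbers, `length` the length for strings and binary); Python's `%` already yields the non-negative remainder the spec needs:

```python
def truncate_int(w: int, v: int) -> int:
    # v - (v % W) with a non-negative remainder: truncate(10, -1) = -10.
    return v - (v % w)

def truncate_string(length: int, s: str) -> str:
    # Strings truncate to at most `length` code points (always valid UTF-8).
    return s[:length]

def truncate_binary(length: int, b: bytes) -> bytes:
    # Binary has no assumed encoding and truncates to `length` bytes.
    return b[:length]
```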
-### Partition Evolution
+#### Partition Evolution
Table partitioning can be evolved by adding, removing, renaming, or reordering
partition spec fields.
@@ -510,7 +510,7 @@ In v1, partition field IDs were not tracked, but were assigned sequentially star
3. Only add partition fields at the end of the previous partition spec
-## Sorting
+### Sorting
Users can sort their data within partitions by columns to improve performance.
The information on how the data is sorted can be declared per data or delete
file, by a **sort order**.
@@ -530,7 +530,7 @@ Sorting floating-point numbers should produce the following behavior: `-NaN` < `
A data or delete file is associated with a sort order by the sort order's id
within [a manifest](#manifests). Therefore, the table must declare all the sort
orders for lookup. A table could also be configured with a default sort order
id, indicating how the new data should be sorted by default. Writers should use
this default sort order to sort the data on write, but are not required to if
the default order is prohibitively expensive, as it would be for streaming
writes.
-## Manifests
+### Manifests
A manifest is an immutable Avro file that lists data files or delete files,
along with each file’s partition data tuple, metrics, and tracking information.
One or more manifest files are used to store a [snapshot](#snapshots), which
tracks all of the files in a table at some point in time. Manifests are tracked
by a [manifest list](#manifest-lists) for each table snapshot.
@@ -598,7 +598,7 @@ The `partition` struct stores the tuple of partition values for each file. Its t
The column metrics maps are used when filtering to select both data and delete
files. For delete files, the metrics must store bounds and counts for all
deleted rows, or must be omitted. Storing metrics for deleted rows ensures that
the values can be used during job planning to find delete files that must be
merged during a scan.
-### Manifest Entry Fields
+#### Manifest Entry Fields
The manifest entry fields are used to keep track of the snapshot in which
files were added or logically deleted. The `data_file` struct is nested inside
of the manifest entry so that it can be easily passed to job planning without
the manifest entry fields.
@@ -616,7 +616,7 @@ Notes:
1. Technically, data files can be deleted when the last snapshot that contains
the file as “live” data is garbage collected. But this is harder to detect and
requires finding the diff of multiple snapshots. It is easier to track what
files are deleted in a snapshot and delete them when that snapshot expires. It
is not recommended to add a deleted file back to a table. Adding a deleted file
can lead to edge cases where incremental deletes can break table snapshots.
2. Manifest list files are required in v2, so that the `sequence_number` and
`snapshot_id` to inherit are always available.
-### Sequence Number Inheritance
+#### Sequence Number Inheritance
Manifests track the sequence number when a data or delete file was added to
the table.
@@ -629,7 +629,7 @@ Inheriting sequence numbers through the metadata tree allows writing a new manif
When reading v1 manifests with no sequence number column, sequence numbers for
all files must default to 0.
-### First Row ID Inheritance
+#### First Row ID Inheritance
Row ID inheritance is used when row lineage is enabled. When not enabled, a
data file's `first_row_id` must always be set to `null`. The rest of this
section applies when row lineage is enabled.
@@ -639,7 +639,7 @@ When reading, the `first_row_id` is assigned by replacing `null` with the manife
The `first_row_id` is only inherited for added data files. The inherited value
must be written into the data file metadata for existing and deleted entries.
The value of `first_row_id` for delete files is always `null`.
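[Editor's aside] A sketch of how a reader might resolve inherited `first_row_id` values for one manifest's entries. The dict field names are hypothetical; the exact metadata layout is defined in the Manifests section:

```python
def resolve_first_row_ids(entries, manifest_first_row_id):
    """Replace null first_row_id for added data files by inheritance.

    Each added data file with a null first_row_id takes the next available ID
    and reserves one ID per record; existing/deleted entries and delete files
    keep their stored value (delete files are always null).
    """
    next_id = manifest_first_row_id
    resolved = []
    for entry in entries:
        first = entry["first_row_id"]
        if entry["content"] == "data" and entry["status"] == "added" and first is None:
            first = next_id
            next_id += entry["record_count"]
        resolved.append({**entry, "first_row_id": first})
    return resolved
```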
-## Snapshots
+### Snapshots
A snapshot consists of the following fields:
@@ -673,7 +673,7 @@ Manifests for a snapshot are tracked by a manifest list.
Valid snapshots are stored as a list in table metadata. For serialization, see
Appendix C.
-### Snapshot Row IDs
+#### Snapshot Row IDs
When row lineage is not enabled, `first-row-id` must be omitted. The rest of
this section applies when row lineage is enabled.
@@ -811,13 +811,13 @@ When expiring snapshots, retention policies in table and snapshot references are
2. The snapshot is not one of the first `min-snapshots-to-keep` in the
branch (including the branch's referenced snapshot)
5. Expire any snapshot not in the set of snapshots to retain.
-## Table Metadata
+### Table Metadata
Table metadata is stored as JSON. Each table metadata change creates a new
table metadata file that is committed by an atomic operation. This operation is
used to ensure that a new version of table metadata replaces the version on
which it was based. This produces a linear history of table versions and
ensures that concurrent writes are not lost.
The atomic operation used to commit metadata depends on how tables are tracked
and is not standardized by this spec. See the sections below for examples.
-### Table Metadata Fields
+#### Table Metadata Fields
Table metadata consists of the following fields:
@@ -853,7 +853,7 @@ For serialization details, see Appendix C.
When a new snapshot is added, the table's `next-row-id` should be updated to
the previous `next-row-id` plus the sum of `record_count` for all data files
added in the snapshot (this is also equal to the sum of `added_rows_count` for
all manifests added in the snapshot). This ensures that `next-row-id` is always
higher than any assigned row ID in the table.
-### Table Statistics
+#### Table Statistics
Table statistics files are valid [Puffin files](puffin-spec.md). Statistics
are informational. A reader can choose to
ignore statistics information. Statistics support is not required to read the
table correctly. A table can contain
@@ -881,7 +881,7 @@ Blob metadata is a struct with the following fields:
| _optional_ | _optional_ | **`properties`** | `map<string, string>` | Additional properties associated with the statistic. Subset of Blob properties in the Puffin file. |
-### Partition Statistics
+#### Partition Statistics
Partition statistics files are based on [partition statistics file
spec](#partition-statistics-file).
Partition statistics are not required for reading or planning and readers may
ignore them.
@@ -897,7 +897,7 @@ Partition statistics file must be registered in the table metadata file to be co
| _required_ | _required_ | **`statistics-path`** | `string` | Path of the partition statistics file. See [Partition statistics file](#partition-statistics-file). |
| _required_ | _required_ | **`file-size-in-bytes`** | `long` | Size of the partition statistics file. |
-#### Partition Statistics File
+##### Partition Statistics File
Statistics information for each unique partition tuple is stored as a row in
any of the table's data file formats (for example, Parquet or ORC).
These rows must be sorted (in ascending order with NULLS FIRST) by the
`partition` field to optimize filtering rows while scanning.
@@ -934,7 +934,7 @@ The unified partition type looks like `Struct<field#1, field#2, field#3>`.
and then the table has evolved into `spec#1` which has just one field
`{field#2}`.
The unified partition type looks like `Struct<field#1, field#2>`.
-## Commit Conflict Resolution and Retry
+### Commit Conflict Resolution and Retry
When two commits happen at the same time and are based on the same version,
only one commit will succeed. In most cases, the failed commit can be applied
to the new current version of table metadata and retried. Updates verify the
conditions under which they can be applied to a new version and retry if those
conditions are met.
@@ -944,7 +944,7 @@ When two commits happen at the same time and are based on the same version, only
* Table schema updates and partition spec changes must validate that the
schema has not changed between the base version and the current version.
-### File System Tables
+#### File System Tables
_Note: This file system based scheme to commit a metadata file is
**deprecated** and will be removed in version 4 of this spec. The scheme is
**unsafe** in object stores and local file systems._
@@ -963,7 +963,7 @@ Notes:
1. The file system table scheme is implemented in
[HadoopTableOperations](../javadoc/{{ icebergVersion
}}/index.html?org/apache/iceberg/hadoop/HadoopTableOperations.html).
-### Metastore Tables
+#### Metastore Tables
The atomic swap needed to commit new versions of table metadata can be
implemented by storing a pointer in a metastore or database that is updated
with a check-and-put operation [1]. The check-and-put validates that the
version of the table that a write is based on is still current and then makes
the new metadata from the write the current version.
@@ -980,7 +980,7 @@ Notes:
1. The metastore table scheme is partly implemented in
[BaseMetastoreTableOperations](../javadoc/{{ icebergVersion
}}/index.html?org/apache/iceberg/BaseMetastoreTableOperations.html).
-## Delete Formats
+### Delete Formats
This section details how to encode row-level deletes in Iceberg delete files.
Row-level deletes are not supported in v1.
@@ -991,7 +991,7 @@ Row-level delete files are tracked by manifests, like data files. A separate set
Both position and equality deletes allow encoding deleted row values with a
delete. This can be used to reconstruct a stream of changes to a table.
-### Position Delete Files
+#### Position Delete Files
Position-based delete files identify deleted rows by file and position in one
or more data files, and may optionally contain the deleted row.
@@ -1016,7 +1016,7 @@ The rows in the delete file must be sorted by `file_path` then `pos` to optimize
* Sorting by `file_path` allows filter pushdown by file in columnar storage
formats.
* Sorting by `pos` allows filtering rows while scanning, to avoid keeping
deletes in memory.
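[Editor's aside] The required ordering can be shown directly with a hypothetical helper: sorting by `(file_path, pos)` tuples groups deletes by file and keeps positions ascending within each file.

```python
def sort_position_deletes(deletes):
    """deletes: iterable of (file_path, pos) pairs, as in a position delete file."""
    return sorted(deletes, key=lambda d: (d[0], d[1]))
```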
-### Equality Delete Files
+#### Equality Delete Files
Equality delete files identify deleted rows in a collection of data files by
one or more column values, and may optionally contain additional columns of the
deleted row.
@@ -1068,7 +1068,7 @@ equality_ids=[1, 2]
If a delete column in an equality delete file is later dropped from the table,
it must still be used when applying the equality deletes. If a column was added
to a table and later used as a delete column in an equality delete file, the
column value is read for older data files using normal projection rules
(defaults to `null`).
-### Delete File Stats
+#### Delete File Stats
Manifests hold the same statistics for delete files and data files. For delete
files, the metrics describe the values that were deleted.