jackye1995 opened a new pull request #2688:
URL: https://github.com/apache/iceberg/pull/2688


   Add DynamoDB catalog implementation, with the following specifications:
   1. identifier column (partition key): table identifier string, or 
`NAMESPACE` for namespaces
   2. namespace column (sort key): namespace string
   3. a global secondary index with namespace as partition key, identifier as 
sort key
   4. version column: UUID string, used for optimistic locking
   5. updated_at column: timestamp long, used to record latest update time
   6. created_at column: timestamp long, used to record initial create time
   7. p.[property_key] column: string, used to store properties (namespace 
properties or Iceberg-defined table properties including `table_type`, 
`metadata_location` and `previous_metadata_location`)
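
A minimal sketch of the row layout described above, using plain Python dicts rather than a real DynamoDB client. The attribute names (`identifier`, `namespace`, `version`, `updated_at`, `created_at`) follow the column descriptions in this PR but may not match the exact names in the code; the helper functions and the `namespace.table` identifier format are assumptions for illustration.

```python
import time
import uuid

NAMESPACE_MARKER = "NAMESPACE"  # identifier value used for namespace rows
PROPERTY_PREFIX = "p."          # properties are flattened with this prefix


def _base_item(identifier, namespace, properties, now_ms=None):
    """Build one row: key columns, bookkeeping columns, flattened properties."""
    now_ms = int(time.time() * 1000) if now_ms is None else now_ms
    item = {
        "identifier": identifier,      # partition key
        "namespace": namespace,        # sort key
        "version": str(uuid.uuid4()),  # used for optimistic locking
        "created_at": now_ms,
        "updated_at": now_ms,
    }
    for key, value in properties.items():
        item[PROPERTY_PREFIX + key] = value
    return item


def table_item(namespace, table_name, properties, now_ms=None):
    # Assumption: the table identifier string is "<namespace>.<table_name>".
    return _base_item(f"{namespace}.{table_name}", namespace, properties, now_ms)


def namespace_item(namespace, properties, now_ms=None):
    # All namespace rows share the same partition key value.
    return _base_item(NAMESPACE_MARKER, namespace, properties, now_ms)
```

Because every namespace row uses `NAMESPACE` as its partition key, namespace rows live in one partition, while each table row gets its own partition key.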
   
   This design has the following benefits:
   1. the table identifier is used directly as the partition key, which avoids 
the potential hot partition issue of using namespace as the partition key and 
table name as the sort key
   2. namespace operations are clustered in a single partition to avoid 
affecting table commit operations
   3. a reverse GSI is used for the list tables operation, and all other 
operations are single-row operations or single-partition queries
   4. a string UUID version field is used for optimistic locking instead of 
updated_at, because two processes committing in the same millisecond would 
write the same timestamp and go undetected
   5. multi-row transaction is used for `renameTable` to ensure idempotency
   6. storage per row and update overhead are minimized by flattening 
properties with a `p.` prefix, instead of placing them in a single nested map 
type column.
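
The optimistic locking in benefit 4 can be illustrated with a small in-memory stand-in. This is only a sketch of the compare-and-swap idea, not the code in this PR: `InMemoryTable`, `CommitConflict`, and `commit` are hypothetical names, and a real implementation would express the check as a DynamoDB conditional write.

```python
import uuid


class CommitConflict(Exception):
    """Raised when the stored version differs from the one the writer read."""


class InMemoryTable:
    """Hypothetical stand-in for a table keyed by (identifier, namespace)."""

    def __init__(self):
        self._rows = {}

    def get(self, key):
        row = self._rows.get(key)
        return dict(row) if row is not None else None

    def put_if_version(self, key, new_row, expected_version):
        # Write only if the stored version still matches what the writer read.
        current = self._rows.get(key)
        current_version = current["version"] if current else None
        if current_version != expected_version:
            raise CommitConflict(f"expected {expected_version}, found {current_version}")
        self._rows[key] = dict(new_row)


def commit(store, key, expected_version, new_metadata_location):
    """Swap in a new metadata location and a fresh UUID version."""
    current = store.get(key) or {}
    new_row = dict(current)
    previous = current.get("p.metadata_location")
    if previous is not None:
        new_row["p.previous_metadata_location"] = previous
    new_row["p.metadata_location"] = new_metadata_location
    new_row["version"] = str(uuid.uuid4())  # unique even within one millisecond
    store.put_if_version(key, new_row, expected_version)
    return new_row["version"]
```

Two writers that read the same version cannot both succeed: the first commit replaces the UUID, so the second writer's expected version no longer matches and it must refresh and retry.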
   
   Limitations:
   1. To avoid complications in parsing namespace, dot (`.`) is not allowed in 
any level of namespace
   2. Similarly, to avoid complications in parsing table identifier, dot is not 
allowed in table name.
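
The reason for the dot restriction can be seen in a short sketch. It assumes that namespace levels and the table name are joined with dots to form the identifier string; the function names are hypothetical and not taken from this PR.

```python
def _check_no_dot(kind, value):
    if not value:
        raise ValueError(f"{kind} must not be empty")
    if "." in value:
        raise ValueError(f"{kind} must not contain a dot: {value!r}")


def to_identifier(namespace_levels, table_name):
    """Join namespace levels and table name into one dot-separated string."""
    if not namespace_levels:
        raise ValueError("namespace must have at least one level")
    for level in namespace_levels:
        _check_no_dot("namespace level", level)
    _check_no_dot("table name", table_name)
    return ".".join(list(namespace_levels) + [table_name])


def parse_identifier(identifier):
    """Split an identifier back into (namespace_levels, table_name)."""
    parts = identifier.split(".")
    if len(parts) < 2 or any(not part for part in parts):
        raise ValueError(f"invalid table identifier: {identifier!r}")
    return parts[:-1], parts[-1]
```

With dots rejected on input, `parse_identifier(to_identifier(levels, name))` always returns the original levels and name; if a level such as `a.b` were allowed, the joined string could not be split unambiguously.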
   
   @yyanyy @rdblue @SreeramGarlapati @johnclara @danielcweeks 


