This is an automated email from the ASF dual-hosted git repository.

abukor pushed a commit to branch branch-1.17.x
in repository https://gitbox.apache.org/repos/asf/kudu.git

The following commit(s) were added to refs/heads/branch-1.17.x by this push:
     new 476aaa995 KUDU-1945 Update docs with non-unique PK
476aaa995 is described below

commit 476aaa995f17605849ecebd646dcd4deee83dcbf
Author: Marton Greber <greber...@gmail.com>
AuthorDate: Wed Apr 26 18:42:41 2023 +0000

    KUDU-1945 Update docs with non-unique PK

    Added small update to cover non-unique primary key.
    For further info, I added a link to the examples folder.
    Right now we only have the C++ example in place for non-unique PK,
    I plan to translate that example to Java and Python as well.

    Change-Id: I84e1c6b85d4fdb5ac95bad611246c071a63bcd31
    Reviewed-on: http://gerrit.cloudera.org:8080/19809
    Tested-by: Kudu Jenkins
    Reviewed-by: Wenzhe Zhou <wz...@cloudera.com>
    Reviewed-by: Abhishek Chennaka <achenn...@cloudera.com>
    Reviewed-on: http://gerrit.cloudera.org:8080/19924
    Reviewed-by: Marton Greber <greber...@gmail.com>
    Reviewed-by: Yingchun Lai <laiyingc...@apache.org>
---
 docs/administration.adoc |  3 +++
 docs/schema_design.adoc  | 38 +++++++++++++++++++++++++++++++++++++-
 2 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/docs/administration.adoc b/docs/administration.adoc
index 7aad2731a..ed9a88727 100644
--- a/docs/administration.adoc
+++ b/docs/administration.adoc
@@ -317,6 +317,9 @@ link:https://spark.apache.org/docs/latest/#downloading[Spark documentation].
 Additionally review the Apache Spark documentation for link:https://spark.apache.org/docs/latest/submitting-applications.html[Submitting Applications].
 
+NOTE: Restoring tables with non-unique primary keys/auto-incrementing columns is
+not supported currently.
+
 ==== Backing up tables
 
 To backup one or more Kudu tables the `KuduBackup` Spark job can be used.
diff --git a/docs/schema_design.adoc b/docs/schema_design.adoc
index fc7800b4e..95d4d251c 100644
--- a/docs/schema_design.adoc
+++ b/docs/schema_design.adoc
@@ -234,9 +234,12 @@ or double type.
 Once set during table creation, the set of columns in the primary key may not be altered.
 
-Unlike an RDBMS, Kudu does not provide an auto-incrementing column feature,
+Unlike an RDBMS, Kudu does not provide an explicit auto-incrementing column feature,
 so the application must always provide the full primary key during insert.
 
+Columns which do not satisfy the uniqueness constraint can still be used as primary keys, by
+specifying them as non-unique primary keys.
+
 Row delete and update operations must also specify the full primary key of the row to be changed. Kudu does not natively support range deletes or updates.
 
@@ -257,6 +260,39 @@ NOTE: Primary key indexing optimizations apply to scans on individual tablets.
 See the <<partition-pruning>> section for details on how scans can use predicates to skip entire tablets.
 
+[[non-unique_primary_keys]]
+=== Non-unique Primary Key Index
+
+While specifying columns as non-unique primary key, Kudu internally creates an auto-incrementing
+column. The specified columns and the auto-incrementing column form the effective primary key.
+
+NOTE: The auto-incrementing counter which is used to assign value for auto-incrementing column is
+managed by Kudu, the counter values are monotonically increasing per tablet.
+
+Non-unique primary key columns must be non-nullable, and may not be a boolean, float
+or double type.
+
+Once set during table creation, the set of columns in the non-unique primary key and the
+auto-incrementing column can not be altered.
+
+For inserts, one has to provide values for the non-unique primary key columns without specifying
+the values for auto-incrementing column. The auto-incrementing column is populated on the server
+side automatically.
+
+For updates/deletes the full set of key columns is necessary. One has to perform a scan before
+update/delete operation to get the auto-incrementing value.
+
+Upsert operation is not supported on tables with non-unique primary key.
+
+The non-unique primary key values of a column may not be updated after the row is inserted.
+However, the row may be deleted and re-inserted with the updated value, moreover a new
+auto-incrementing counter value is assigned during insertion for auto-incrementing column.
+
+Restoring tables with non-unique primary keys is not supported currently.
+
+For more details on how to use non-unique primary key, please check the
+link:https://github.com/apache/kudu/tree/master/examples[examples] folder.
+
 [[Backfilling]]
 === Considerations for Backfill Inserts
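
The semantics described in the added "Non-unique Primary Key Index" text can be illustrated with a small in-memory model. This is only a sketch and not the Kudu client API: the `ToyTablet` class, its method names, and the counter starting at 1 are assumptions made for illustration. It models a single tablet, where the effective primary key is the user-supplied key plus a counter value assigned at insert time, and where an update or delete needs a prior scan to learn that value.

```python
from itertools import count


class ToyTablet:
    """Toy model of one tablet of a table with a non-unique primary key.

    Hypothetical illustration only; this is not the Kudu client API.
    """

    def __init__(self):
        # Assumption: the counter starts at 1. The commit only states that
        # counter values are monotonically increasing per tablet.
        self._counter = count(1)
        # Maps the effective primary key (user_key, auto_id) to the row value.
        self._rows = {}

    def insert(self, user_key, value):
        # The caller supplies only the non-unique key; the auto-incrementing
        # part is assigned here, standing in for the server side.
        auto_id = next(self._counter)
        self._rows[(user_key, auto_id)] = value
        return auto_id

    def scan(self, user_key):
        # Return (effective_key, value) pairs whose non-unique key matches.
        hits = [(k, v) for k, v in self._rows.items() if k[0] == user_key]
        return sorted(hits, key=lambda pair: pair[0])

    def update(self, user_key, auto_id, value):
        # Updates require the full effective key.
        key = (user_key, auto_id)
        if key not in self._rows:
            raise KeyError(key)
        self._rows[key] = value

    def delete(self, user_key, auto_id):
        # Deletes require the full effective key as well.
        del self._rows[(user_key, auto_id)]


tablet = ToyTablet()
tablet.insert("sensor-a", 1.5)
tablet.insert("sensor-a", 2.5)  # same user key, stored as a distinct row

# Scan first to learn the auto-incrementing value, then update that row.
(first_key, _), *_ = tablet.scan("sensor-a")
tablet.update(*first_key, 1.75)
```

In this model, changing the non-unique key of a row means deleting it and inserting it again, which assigns a fresh counter value, mirroring the behaviour the documentation change describes.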