[jira] [Commented] (KUDU-1945) Support generation of surrogate primary keys (or tables with no PK)

ASF subversion and git services (Jira) Thu, 19 Jan 2023 20:03:07 -0800


    [ https://issues.apache.org/jira/browse/KUDU-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17678977#comment-17678977 ]


ASF subversion and git services commented on KUDU-1945:
-------------------------------------------------------

Commit f64ed2aac40515ae46132a9bc3cdf7ad5b3f33de in kudu's branch 
refs/heads/master from wzhou-code
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=f64ed2aac ]

KUDU-1945: Support non unique primary key for Java client

This patch adds new APIs to the Java client for creating a ColumnSchema
with a non-unique primary key. When a table with a non-unique primary
key is created, an auto-incrementing column named "auto_incrementing_id"
is added to the table automatically as a key column. The non-unique
key columns and the auto-incrementing column together form the
effective primary key.
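
To illustrate how the effective primary key described above lets rows
with the same user-supplied key coexist, here is a minimal in-memory
sketch. It is a toy model and not the Kudu Java client API: the class
NonUniqueKeyTableSketch, its methods, and the starting counter value
are all hypothetical, and a single String stands in for the non-unique
key columns.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy in-memory model of a table with a non-unique primary key.
// NOT the Kudu client API; every name here is hypothetical.
final class NonUniqueKeyTableSketch {

    // Effective primary key: (non-unique user key, auto-incrementing id).
    private static final class EffectiveKey implements Comparable<EffectiveKey> {
        final String userKey;
        final long autoIncrementingId;

        EffectiveKey(String userKey, long autoIncrementingId) {
            this.userKey = userKey;
            this.autoIncrementingId = autoIncrementingId;
        }

        @Override
        public int compareTo(EffectiveKey other) {
            int c = userKey.compareTo(other.userKey);
            return c != 0 ? c : Long.compare(autoIncrementingId, other.autoIncrementingId);
        }
    }

    private final TreeMap<EffectiveKey, String> rows = new TreeMap<>();
    private long nextId = 1; // assumed starting value, for illustration only

    // The caller supplies only the non-unique key; the id is generated here.
    long insert(String userKey, String value) {
        long id = nextId++;
        rows.put(new EffectiveKey(userKey, id), value);
        return id;
    }

    // Returns every row sharing the given user key, ordered by generated id.
    List<String> scan(String userKey) {
        EffectiveKey from = new EffectiveKey(userKey, Long.MIN_VALUE);
        EffectiveKey to = new EffectiveKey(userKey, Long.MAX_VALUE);
        return new ArrayList<>(rows.subMap(from, true, to, true).values());
    }

    int size() {
        return rows.size();
    }

    public static void main(String[] args) {
        NonUniqueKeyTableSketch table = new NonUniqueKeyTableSketch();
        table.insert("host-a", "GET /");
        table.insert("host-a", "GET /about"); // same user key, new row
        table.insert("host-b", "GET /");
        System.out.println(table.scan("host-a"));
        System.out.println(table.size());
    }
}
```

Because the generated id is part of the key, the second insert for
"host-a" creates a new row instead of colliding with the first.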

UPSERT and UPSERT_IGNORE operations are not currently supported for
Kudu tables with an auto-incrementing column due to a limitation in
the Kudu server.

The auto-incrementing column cannot be added, removed, or renamed with
the Alter Table APIs.

Testing:
 - Added unit tests for the Java client library.
 - Manually ran integration tests with Impala that create a table
   with a non-unique primary key and run queries covering
   describe/insert/update/delete/upsert/CTAS/select/alter, etc.
   The Kudu-related end-to-end tests passed.

Change-Id: I7e2501d6b3d66f6466959e4f3f1ed0f5e08dfe5c
Reviewed-on: http://gerrit.cloudera.org:8080/19384
Reviewed-by: Alexey Serbin <ale...@apache.org>
Reviewed-by: Abhishek Chennaka <achenn...@cloudera.com>
Tested-by: Alexey Serbin <ale...@apache.org>


> Support generation of surrogate primary keys (or tables with no PK)
> -------------------------------------------------------------------
>
>                 Key: KUDU-1945
>                 URL: https://issues.apache.org/jira/browse/KUDU-1945
>             Project: Kudu
>          Issue Type: New Feature
>          Components: client, master, tablet
>            Reporter: Todd Lipcon
>            Priority: Major
>              Labels: roadmap-candidate
>
> Many use cases have data where there is no "natural" primary key. For 
> example, a web log use case mostly cares about partitioning and not about 
> precise sorting by timestamp, and timestamps themselves are not necessarily 
> unique. Rather than forcing users to come up with their own surrogate primary 
> keys, Kudu should support some kind of "auto_increment" equivalent which 
> generates primary keys on insertion. Alternatively, Kudu could support tables 
> which are partitioned but not internally sorted.
> The advantages would be:
> - Kudu can pick primary keys on insertion to guarantee that there is no 
> compaction required on the table (e.g., always assign a new key higher than 
> any existing key in the local tablet). This can improve write throughput 
> substantially, especially compared to naive PK generation schemes that a user 
> might pick such as UUID, which would generate a uniform random-insert 
> workload (worst case for performance).
> - Make Kudu easier to use for such use cases (no extra client code necessary)
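
The contrast drawn above between always-increasing keys and random keys
can be sketched with a toy sorted list. This is only an illustration
under stated assumptions: it does not model Kudu's tablet storage or
compaction, the class InsertOrderSketch and its methods are
hypothetical, and a seeded random long stands in for a UUID-style key.
It counts how many inserts land at the tail of the sorted order, which
is the cheap append case.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Toy comparison of key-generation schemes; not a model of Kudu internals.
final class InsertOrderSketch {

    // Inserts keys one by one into a sorted list and counts how many
    // landed at the tail (pure appends) rather than in the middle.
    static int countTailInserts(long[] keys) {
        List<Long> sorted = new ArrayList<>();
        int tailInserts = 0;
        for (long key : keys) {
            int pos = Collections.binarySearch(sorted, key);
            int insertAt = pos >= 0 ? pos : -(pos + 1);
            if (insertAt == sorted.size()) {
                tailInserts++;
            }
            sorted.add(insertAt, key);
        }
        return tailInserts;
    }

    // Keys that are always higher than any existing key.
    static long[] monotonicKeys(int n) {
        long[] keys = new long[n];
        for (int i = 0; i < n; i++) {
            keys[i] = i + 1L;
        }
        return keys;
    }

    // Uniformly random keys, standing in for a UUID-style scheme.
    static long[] randomKeys(int n, long seed) {
        Random random = new Random(seed);
        long[] keys = new long[n];
        for (int i = 0; i < n; i++) {
            keys[i] = random.nextLong();
        }
        return keys;
    }

    public static void main(String[] args) {
        int n = 1000;
        System.out.println("monotonic tail inserts: " + countTailInserts(monotonicKeys(n)));
        System.out.println("random tail inserts:    " + countTailInserts(randomKeys(n, 42L)));
    }
}
```

With increasing keys every insert is an append, while random keys
scatter inserts across the whole sorted range.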



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
