[ https://issues.apache.org/jira/browse/KUDU-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17702312#comment-17702312 ]
ASF subversion and git services commented on KUDU-1945:
-------------------------------------------------------

Commit 80a18400024d36f84c7fa53eac0fba8e2f8b726d in kudu's branch refs/heads/master from Marton Greber
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=80a184000 ]

KUDU-1945 Auto-incrementing column, Python client

This patch adds the Python part of the client side changes for the
auto incrementing column feature. A new ColumnSpec called
non_unique_primary_key is added. Semantically it behaves like
primary_key:
- only one column can have the non_unique_primary_key ColumnSpec in a
  given SchemaBuilder context,
- if it exists, it must be defined in the first place,
- compound keys are defined through a set function.

Functionally, non-unique primary keys don't need to fulfill the
uniqueness constraint. An auto incrementing column is added in the
background automatically once a non-unique primary key is specified.
The non-unique keys and the auto incrementing column together form the
effective primary key.

Some technical notes:
- The name of the auto incrementing column is hardcoded into the
  kudu.Schema class. This is a reserved column name, users can't create
  columns with it. On the client facing side, this reserved string is
  reachable through kudu.Schema.get_auto_incrementing_column_name().
- In this initial version there is no support for UPSERT and
  UPSERT_IGNORE operations.
- With non-unique primary key, one can't use the tuple/list
  initialization for new inserts.
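
The rules in the commit message above can be illustrated with a small
stand-alone model. This is only a sketch and does not use the kudu Python
client: ToySchemaBuilder, ToyTable, set_non_unique_keys and the
AUTO_INCREMENTING_COLUMN value are hypothetical names made up for the
illustration. In the real client the reserved name is obtained from
kudu.Schema.get_auto_incrementing_column_name().

```python
# Toy model only: this is NOT the kudu Python client API. It mirrors the
# rules stated in the commit message so they can be checked in isolation.

# Placeholder for the reserved column name (assumption for this sketch).
AUTO_INCREMENTING_COLUMN = "auto_incrementing_id"


class ToySchemaBuilder:
    def __init__(self):
        self.columns = []      # column names in definition order
        self.key_columns = []  # user-specified non-unique key columns

    def add_column(self, name, non_unique_primary_key=False):
        if name == AUTO_INCREMENTING_COLUMN:
            raise ValueError("reserved column name: " + name)
        if name in self.columns:
            raise ValueError("duplicate column: " + name)
        if non_unique_primary_key:
            if self.key_columns:
                raise ValueError("only one column may carry the spec")
            if self.columns:
                raise ValueError("the key column must be defined first")
            self.key_columns = [name]
        self.columns.append(name)
        return self

    def set_non_unique_keys(self, names):
        # Compound keys go through a separate "set" call in this model.
        if self.key_columns:
            raise ValueError("key already specified on a column")
        missing = [n for n in names if n not in self.columns]
        if missing:
            raise ValueError("unknown columns: " + ", ".join(missing))
        self.key_columns = list(names)
        return self

    def effective_primary_key(self):
        # User key columns plus the automatically added column.
        if not self.key_columns:
            return []
        return self.key_columns + [AUTO_INCREMENTING_COLUMN]


class ToyTable:
    def __init__(self, builder):
        self.key = builder.effective_primary_key()
        if not self.key:
            raise ValueError("this model needs a non-unique primary key")
        self.rows = {}
        self._next_id = 1

    def insert(self, row):
        # The counter is filled in by the table, never by the caller.
        full = dict(row)
        full[AUTO_INCREMENTING_COLUMN] = self._next_id
        self._next_id += 1
        pk = tuple(full[c] for c in self.key)
        self.rows[pk] = full
        return pk


if __name__ == "__main__":
    builder = ToySchemaBuilder()
    builder.add_column("host", non_unique_primary_key=True).add_column("msg")
    table = ToyTable(builder)
    # Two rows share the same user key and still get distinct effective keys.
    print(table.insert({"host": "web1", "msg": "GET /"}))
    print(table.insert({"host": "web1", "msg": "GET /about"}))
```

In this model two inserts with the same user-supplied key value do not
collide, because the counter makes the effective primary key distinct. How
the real feature assigns counter values is not covered by this sketch.
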

Change-Id: I94622680c5eb32eb2746a3b84c73699c1a37618c
Reviewed-on: http://gerrit.cloudera.org:8080/19566
Tested-by: Kudu Jenkins
Reviewed-by: Abhishek Chennaka <achenn...@cloudera.com>
Reviewed-by: Yingchun Lai <laiyingc...@apache.org>
Reviewed-by: Alexey Serbin <ale...@apache.org>


> Support generation of surrogate primary keys (or tables with no PK)
> -------------------------------------------------------------------
>
>                 Key: KUDU-1945
>                 URL: https://issues.apache.org/jira/browse/KUDU-1945
>             Project: Kudu
>          Issue Type: New Feature
>          Components: client, master, tablet
>            Reporter: Todd Lipcon
>            Priority: Major
>              Labels: roadmap-candidate
>
> Many use cases have data where there is no "natural" primary key. For
> example, a web log use case mostly cares about partitioning and not about
> precise sorting by timestamp, and timestamps themselves are not necessarily
> unique. Rather than forcing users to come up with their own surrogate primary
> keys, Kudu should support some kind of "auto_increment" equivalent which
> generates primary keys on insertion. Alternatively, Kudu could support tables
> which are partitioned but not internally sorted.
> The advantages would be:
> - Kudu can pick primary keys on insertion to guarantee that there is no
> compaction required on the table (eg always assign a new key higher than any
> existing key in the local tablet). This can improve write throughput
> substantially, especially compared to naive PK generation schemes that a user
> might pick such as UUID, which would generate a uniform random-insert
> workload (worst case for performance)
> - Make Kudu easier to use for such use cases (no extra client code necessary)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)