[ https://issues.apache.org/jira/browse/KUDU-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17678977#comment-17678977 ]
ASF subversion and git services commented on KUDU-1945:
-------------------------------------------------------

Commit f64ed2aac40515ae46132a9bc3cdf7ad5b3f33de in kudu's branch refs/heads/master from wzhou-code
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=f64ed2aac ]

KUDU-1945: Support non unique primary key for Java client

This patch adds new APIs to create ColumnSchema with non unique
primary key for Java client.
When a table with non unique primary key is created, Auto-Incrementing
Column "auto_incrementing_id" will be added automatically to the table
as the key column. The non-unique key columns and the auto-incrementing
column together form the effective primary key.
UPSERT/UPSERT_IGNORE operations are not supported now for Kudu table
with auto-incrementing column due to limitation in Kudu server.
Auto-Incrementing column cannot be added, removed or renamed with
Alter Table APIs.

Testing:
 - Added unit-test for Java client library.
 - Manually ran integration test with Impala for creating table
   with non unique primary key, and ran queries for operations:
   describe/insert/update/delete/upsert/CTAS/select/alter, etc.
   Passed Kudu related end-to-end tests.

Change-Id: I7e2501d6b3d66f6466959e4f3f1ed0f5e08dfe5c
Reviewed-on: http://gerrit.cloudera.org:8080/19384
Reviewed-by: Alexey Serbin <ale...@apache.org>
Reviewed-by: Abhishek Chennaka <achenn...@cloudera.com>
Tested-by: Alexey Serbin <ale...@apache.org>


> Support generation of surrogate primary keys (or tables with no PK)
> -------------------------------------------------------------------
>
>                 Key: KUDU-1945
>                 URL: https://issues.apache.org/jira/browse/KUDU-1945
>             Project: Kudu
>          Issue Type: New Feature
>          Components: client, master, tablet
>            Reporter: Todd Lipcon
>            Priority: Major
>              Labels: roadmap-candidate
>
> Many use cases have data where there is no "natural" primary key. For
> example, a web log use case mostly cares about partitioning and not about
> precise sorting by timestamp, and timestamps themselves are not necessarily
> unique. Rather than forcing users to come up with their own surrogate primary
> keys, Kudu should support some kind of "auto_increment" equivalent which
> generates primary keys on insertion. Alternatively, Kudu could support tables
> which are partitioned but not internally sorted.
> The advantages would be:
> - Kudu can pick primary keys on insertion to guarantee that there is no
> compaction required on the table (eg always assign a new key higher than any
> existing key in the local tablet). This can improve write throughput
> substantially, especially compared to naive PK generation schemes that a user
> might pick such as UUID, which would generate a uniform random-insert
> workload (worst case for performance)
> - Make Kudu easier to use for such use cases (no extra client code necessary)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
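
The commit message says the non-unique key columns and the auto-incrementing column together form the effective primary key. The sketch below is a toy in-memory model of that idea in plain Java, not Kudu code: the class names `NonUniqueKeyTable` and `EffectiveKey`, the single table-wide counter, and its starting value of 1 are all assumptions made for illustration and say nothing about how the Kudu server assigns values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy model (hypothetical, not part of the Kudu client): rows are sorted by an effective key. */
final class NonUniqueKeyTable {

    /** Effective primary key: the user-supplied key plus a generated counter value. */
    static final class EffectiveKey implements Comparable<EffectiveKey> {
        final String userKey;
        final long autoIncrementingId;

        EffectiveKey(String userKey, long autoIncrementingId) {
            this.userKey = userKey;
            this.autoIncrementingId = autoIncrementingId;
        }

        @Override
        public int compareTo(EffectiveKey other) {
            // Order by the user key first, then by the generated id.
            int byUserKey = userKey.compareTo(other.userKey);
            return byUserKey != 0
                    ? byUserKey
                    : Long.compare(autoIncrementingId, other.autoIncrementingId);
        }
    }

    private final TreeMap<EffectiveKey, String> rows = new TreeMap<>();
    private long nextId = 1; // assumption: one counter for the whole toy table

    /** Inserts a row; a repeated userKey still yields a distinct effective key. */
    EffectiveKey insert(String userKey, String value) {
        EffectiveKey key = new EffectiveKey(userKey, nextId++);
        rows.put(key, value);
        return key;
    }

    /** Returns every value stored under userKey, in insertion order. */
    List<String> scan(String userKey) {
        EffectiveKey low = new EffectiveKey(userKey, Long.MIN_VALUE);
        EffectiveKey high = new EffectiveKey(userKey, Long.MAX_VALUE);
        return new ArrayList<>(rows.subMap(low, true, high, true).values());
    }

    int size() {
        return rows.size();
    }
}
```

In this model, inserting two rows with the user key `"host-1"` stores two separate rows, because the generated id makes each effective key unique; `scan("host-1")` then returns both values.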