Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17553 )
Change subject: WIP IMPALA-10557: Support Kudu's multi-row transaction ...................................................................... Patch Set 3: (7 comments) http://gerrit.cloudera.org:8080/#/c/17553/2/be/src/service/client-request-state.h File be/src/service/client-request-state.h: http://gerrit.cloudera.org:8080/#/c/17553/2/be/src/service/client-request-state.h@756 PS2, Line 756: pen transaction nit: is this specific to Hive? If so, maybe mention that in these comments? Same with AbortTransaction, GetTransactionId, ClearTransactionState? http://gerrit.cloudera.org:8080/#/c/17553/2/be/src/service/query-options.h File be/src/service/query-options.h: http://gerrit.cloudera.org:8080/#/c/17553/2/be/src/service/query-options.h@252 PS2, Line 252: ADVANCED Given this feature is considered experimental in Kudu, would it make sense to mark this as DEVELOPMENT? Is there any kind of experimental tag that Impala exposes? http://gerrit.cloudera.org:8080/#/c/17553/2/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java: http://gerrit.cloudera.org:8080/#/c/17553/2/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java@206 PS2, Line 206: t nit: maybe prefix this with "kudu" so it's clear the transactionality is specific to Kudu? Same elsewhere? Especially so there's no risk of confusing this with Hive transactions (or the next system's transactions). http://gerrit.cloudera.org:8080/#/c/17553/1/fe/src/main/java/org/apache/impala/common/KuduTransactionManager.java File fe/src/main/java/org/apache/impala/common/KuduTransactionManager.java: http://gerrit.cloudera.org:8080/#/c/17553/1/fe/src/main/java/org/apache/impala/common/KuduTransactionManager.java@54 PS1, Line 54: put(queryId, txn); > No, it does not support. Right, IIUC Impala doesn't have this capability right now, but we may consider extending Kudu's API to include a transaction ID rather than relying on query ID. As far as I can tell, each Impala query does correspond to a single transaction, so this key seems to be ok for now. http://gerrit.cloudera.org:8080/#/c/17553/2/fe/src/main/java/org/apache/impala/service/Frontend.java File fe/src/main/java/org/apache/impala/service/Frontend.java: http://gerrit.cloudera.org:8080/#/c/17553/2/fe/src/main/java/org/apache/impala/service/Frontend.java@1827 PS2, Line 1827: timeline.markEvent("Kudu transaction aborted"); nit: does this timeline have a query ID associated with it already? If not, maybe add the query ID here too? http://gerrit.cloudera.org:8080/#/c/17553/3/fe/src/main/java/org/apache/impala/service/Frontend.java File fe/src/main/java/org/apache/impala/service/Frontend.java: http://gerrit.cloudera.org:8080/#/c/17553/3/fe/src/main/java/org/apache/impala/service/Frontend.java@1717 PS3, Line 1717: else if ((targetTable instanceof FeKuduTable) : && queryOptions.isKudu_transaction()) { : FeKuduTable kuduTable = (FeKuduTable) targetTable; : KuduClient client = KuduUtil.getKuduClient(kuduTable.getKuduMasterHosts()); : KuduTransaction txn = null; : try { : // Open Kudu transaction. : txn = client.newTransaction(); : timeline.markEvent("Kudu transaction opened with query id: " : + queryCtx.getQuery_id().toString()); : insertStmt.setTransactionToken(txn.serialize()); : kuduTxnManager_.addTransaction(queryCtx.getQuery_id(), txn); : queryCtx.setIs_kudu_transactional(true); : } catch (IOException e) { : if (txn != null) txn.close(); : throw new TransactionException(e.getMessage()); : } : } On Slack we discussed some considerations with regards to which daemon is in charge of creating vs aborting vs committing the transaction. It'd be nice if either here, or perhaps in the KuduTransactionManager, we could mention those considerations. Especially if it explains why the commit flow doesn't look identical to that of Hive transactions. http://gerrit.cloudera.org:8080/#/c/17553/2/testdata/cluster/node_templates/common/etc/kudu/tserver.conf.tmpl File testdata/cluster/node_templates/common/etc/kudu/tserver.conf.tmpl: http://gerrit.cloudera.org:8080/#/c/17553/2/testdata/cluster/node_templates/common/etc/kudu/tserver.conf.tmpl@18 PS2, Line 18: -time_source=system_unsync In your tests, we'll need to add --enable_txn_system_client_init here as well. -- To view, visit http://gerrit.cloudera.org:8080/17553 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I876ada48991afdff5d61b5d6a0417571aba7cb34 Gerrit-Change-Number: 17553 Gerrit-PatchSet: 3 Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com> Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com> Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com> Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com> Gerrit-Comment-Date: Tue, 08 Jun 2021 02:11:55 +0000 Gerrit-HasComments: Yes