[ 
https://issues.apache.org/jira/browse/KUDU-3666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Serbin updated KUDU-3666:
--------------------------------
    Summary: ReplicatedAlterTableTest.AlterTableAndDropTablet fails from time 
to time due to absence of exactly-once semantics for AlterTable  (was: 
ReplicatedAlterTableTest.AlterTableAndDropTablet fails from time to time due to 
absence of only-once semantics for AlterTable)

> ReplicatedAlterTableTest.AlterTableAndDropTablet fails from time to time due 
> to absence of exactly-once semantics for AlterTable
> --------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: KUDU-3666
>                 URL: https://issues.apache.org/jira/browse/KUDU-3666
>             Project: Kudu
>          Issue Type: Bug
>          Components: master, test
>            Reporter: Alexey Serbin
>            Priority: Major
>         Attachments: alter_table-test.00-debug.txt.xz, 
> alter_table-test.00-release.txt.xz, alter_table-test.01-release.txt.xz
>
>
> The ReplicatedAlterTableTest.AlterTableAndDropTablet test fails
> intermittently.  Failures manifest as error messages like the following:
> {noformat}
> src/kudu/integration-tests/alter_table-test.cc:2378: Failure
> Failed
> Bad status: Already present: The column already exists: new_c39 
> {noformat}
> {noformat}
> src/kudu/integration-tests/alter_table-test.cc:2378: Failure
> Failed
> Bad status: Already present: The column already exists: new_c44  
> {noformat}
> {noformat}
> src/kudu/integration-tests/alter_table-test.cc:2385: Failure
> Failed
> Bad status: Invalid argument: no range partition to drop: 9 <= VALUES < 10 
> {noformat}
> The culprit appears to be a retried AlterTable RPC request: the client
> assumed that the request had failed, while in fact it had succeeded on the
> server side, so the retry was rejected because the alteration had already
> been applied.  To address the issue, exactly-once RPC semantics (i.e. the
> kudu.rpc.track_rpc_result option in protobuf) need to be enabled for the
> AlterTable(AlterTableRequestPB) RPC method of masters as well.  At the time
> of writing, it is enabled only for the Write(WriteRequestPB) RPC method of
> tablet servers.
> Full test logs are attached for convenience.  Each log contains evidence of
> a re-attempted RPC request, e.g.:
> {noformat}
> W20250531 02:02:56.259193 30345 master_proxy_rpc.cc:203] Re-attempting AlterTable request to leader Master (127.22.204.254:43629)
> {noformat}
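
The failure mode described above can be reproduced in a small toy model. The sketch below is not Kudu code: ToyMaster, alter_table_add_column and add_column_with_lost_response are hypothetical names, and keying the tracked results by a (client ID, sequence number) pair is an assumption made for illustration. It only shows why retrying a non-idempotent request yields "Already present" unless the server remembers the outcome of the first attempt.

```python
from dataclasses import dataclass, field


class AlreadyPresent(Exception):
    """Stand-in for an 'Already present' status returned by the server."""


@dataclass
class ToyMaster:
    """Hypothetical in-memory model of a master; not actual Kudu code."""
    track_results: bool
    columns: set = field(default_factory=set)
    # (client_id, seq_no) -> outcome of the first attempt (assumed key shape)
    completed: dict = field(default_factory=dict)

    def alter_table_add_column(self, client_id, seq_no, name):
        key = (client_id, seq_no)
        if self.track_results and key in self.completed:
            # Retry of a request that has already run: replay its result
            # instead of applying the alteration a second time.
            return self.completed[key]
        if name in self.columns:
            raise AlreadyPresent(f"The column already exists: {name}")
        self.columns.add(name)
        result = "OK"
        if self.track_results:
            self.completed[key] = result
        return result


def add_column_with_lost_response(master, name):
    """Apply the change, drop the response, then retry with the same ID."""
    # First attempt succeeds on the server, but the client never sees it.
    master.alter_table_add_column("client-1", 1, name)
    try:
        # The client re-sends the very same request.
        return master.alter_table_add_column("client-1", 1, name)
    except AlreadyPresent as e:
        return f"Already present: {e}"
```

Without result tracking, the retry surfaces the same kind of error as in the test failures; with tracking, the retry returns the outcome of the original attempt.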



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
