I stumbled on an idea for a simpler block job transactions implementation. This series uses patches from John Snow's "[PATCH v6 00/10] block: incremental backup transactions" series. I wasn't sure if the idea would work, hence this RFC instead of asking John to spend his time chasing the idea.
This series solves the problem at the blockjob level instead of in qmp 'transaction'. This way no callbacks need to be intercepted and refcounts don't need to be added to existing objects. The advantage is that this approach only requires small changes to blockdev.c and block/backup.c. The logic is encapsulated in the BlockJobTxn object, keeping the transaction behavior isolated and easy to maintain.

Recap: motivation for block job transactions
--------------------------------------------
If an incremental backup block job fails then we reclaim the bitmap so the job can be retried. The problem comes when multiple jobs are started as part of a qmp 'transaction' command. We need to group these jobs in a transaction so that either all jobs complete successfully or all bitmaps are reclaimed.

Without transactions, there is a case where some jobs complete successfully and throw away their bitmaps, making it impossible to retry the backup by rerunning the command if one of the jobs fails.

How does this implementation work?
----------------------------------
These patches add a BlockJobTxn object with the following API:

  txn = block_job_txn_new();
  block_job_txn_add_job(txn, job1);
  block_job_txn_add_job(txn, job2);
  block_job_txn_begin();

The jobs either both complete successfully or they both fail/cancel. If the user cancels job1 then job2 will also be cancelled and vice versa.

Jobs stay alive waiting for other jobs to complete. They can be cancelled by the user during this time. Job blockers are still in effect and no other block job can run on this device in the meantime (since QEMU currently only allows one job per device). This is the main drawback of this approach, but it is reasonable since you probably don't want to run other jobs/operations until you're sure the backup was successful (you won't be able to retry a failed backup if there's a new job running).

Adding transaction support to the backup job is very easy.
It just needs to make a call before throwing away the bitmap and returning from its coroutine:

  block_job_txn_prepare_to_complete(job->txn, job, ret);

  if (job->sync_bitmap) {
      BdrvDirtyBitmap *bm;
      if (ret < 0 || block_job_is_cancelled(&job->common)) {
          ...

John Snow (4):
  qapi: Add transaction support to block-dirty-bitmap operations
  iotests: add transactional incremental backup test
  block: rename BlkTransactionState and BdrvActionOps
  iotests: 124 - transactional failure test

Kashyap Chamarthy (1):
  qmp-commands.hx: Update the supported 'transaction' operations

Stefan Hajnoczi (4):
  block: keep bitmap if incremental backup job is cancelled
  block: add block job transactions
  blockdev: make BlockJobTxn available to qmp 'transaction'
  block/backup: support block job transactions

 block.c                    |  19 ++-
 block/backup.c             |   9 +-
 blockdev.c                 | 298 +++++++++++++++++++++++++++++++++++---------
 blockjob.c                 | 160 ++++++++++++++++++++++++
 docs/bitmaps.md            |   6 +-
 include/block/block.h      |   2 +-
 include/block/block_int.h  |   6 +-
 include/block/blockjob.h   |  49 ++++++++
 qapi-schema.json           |   6 +-
 qmp-commands.hx            |  21 +++-
 tests/qemu-iotests/124     | 180 ++++++++++++++++++++++++++-
 tests/qemu-iotests/124.out |   4 +-
 trace-events               |   4 +
 13 files changed, 679 insertions(+), 85 deletions(-)

-- 
2.4.2