* Daniel P. Berrangé (berra...@redhat.com) wrote: > Due to its long term heritage most of the migration code just invokes > 'error_report' when problems hit. This was fine for HMP, since the > messages get redirected from stderr, into the HMP console. It is not > OK for QMP because the errors will not be fed back to the QMP client. > > This wasn't a terrible real world problem with QMP so far because > live migration happens in the background, so at least on the target side > there is not a QMP command that needs to capture the incoming migration. > It is a problem on the source side but it doesn't hit frequently as the > source side has fewer failure scenarios. None the less on both sides it > would be desirable if 'query-migrate' can report errors correctly. > With the introduction of the load-snapshot QMP commands, the need for > error reporting becomes more pressing. > > Wiring up good error reporting is a large and difficult job, which > this series does NOT complete. The focus here has been on converting > all methods in savevm.c which have an 'int' return value capable of > reporting errors. This covers most of the infrastructure for controlling > the migration state serialization / protocol. > > The remaining part that is missing error reporting are the callbacks in > the VMStateDescription struct which can return failure codes, but have > no "Error **errp" parameter. Thinking about how this might be dealt with > in future, a big bang conversion is likely non-viable. We'll probably > want to introduce a duplicate set of callbacks with the "Error **errp" > parameter and convert impls in batches, eventually removing the > original callbacks. I don't intend todo that myself in the immediate > future. > > IOW, this patch series probably solves 50% of the problem, but we > still do need the rest to get ideal error reporting. > > In doing this savevm conversion I noticed a bunch of places which > see and then ignore errors. I only fixed one or two of them which > were clearly dubious. Other places in savevm.c where it seemed it > was probably ok to ignore errors, I've left using error_report() > on the basis that those are really warnings. Perhaps they could > be changed to warn_report() instead. > > There are alot of patches here, but I felt it was easier to review > for correctness if I converted 1 function at a time. The series > does not neccessarily have to be reviewed/appied in 1 go.
After this series, what do my errors look like, and where do they end up? Do I get my nice backtrace shwoing that device failed, then that was part of that one... Dave > Daniel P. Berrangé (33): > migration: push Error **errp into qemu_loadvm_state() > migration: push Error **errp into qemu_loadvm_state_header() > migration: push Error **errp into qemu_loadvm_state_setup() > migration: push Error **errp into qemu_load_device_state() > migration: push Error **errp into qemu_loadvm_state_main() > migration: push Error **errp into qemu_loadvm_section_start_full() > migration: push Error **errp into qemu_loadvm_section_part_end() > migration: push Error **errp into loadvm_process_command() > migration: push Error **errp into loadvm_handle_cmd_packaged() > migration: push Error **errp into loadvm_postcopy_handle_advise() > migration: push Error **errp into ram_postcopy_incoming_init() > migration: push Error **errp into loadvm_postcopy_handle_listen() > migration: push Error **errp into loadvm_postcopy_handle_run() > migration: push Error **errp into loadvm_postcopy_ram_handle_discard() > migration: make loadvm_postcopy_handle_resume() void > migration: push Error **errp into loadvm_handle_recv_bitmap() > migration: push Error **errp into loadvm_process_enable_colo() > migration: push Error **errp into colo_init_ram_cache() > migration: push Error **errp into check_section_footer() > migration: push Error **errp into global_state_store() > migration: remove error reporting from qemu_fopen_bdrv() callers > migration: push Error **errp into qemu_savevm_state_iterate() > migration: simplify some error reporting in save_snapshot() > migration: push Error **errp into qemu_savevm_state_setup() > migration: push Error **errp into qemu_savevm_state_complete_precopy() > migration: push Error **errp into > qemu_savevm_state_complete_precopy_non_iterable() > migration: push Error **errp into qemu_savevm_state_complete_precopy() > migration: push Error **errp into qemu_savevm_send_packaged() > migration: push Error **errp into qemu_savevm_live_state() > migration: push Error **errp into qemu_save_device_state() > migration: push Error **errp into qemu_savevm_state_resume_prepare() > migration: push Error **errp into postcopy_resume_handshake() > migration: push Error **errp into postcopy_do_resume() > > include/migration/colo.h | 2 +- > include/migration/global_state.h | 2 +- > migration/colo.c | 12 +- > migration/global_state.c | 6 +- > migration/migration.c | 80 ++- > migration/postcopy-ram.c | 8 +- > migration/postcopy-ram.h | 2 +- > migration/ram.c | 17 +- > migration/ram.h | 4 +- > migration/savevm.c | 594 ++++++++++-------- > migration/savevm.h | 23 +- > .../tests/internal-snapshots-qapi.out | 3 +- > 12 files changed, 427 insertions(+), 326 deletions(-) > > -- > 2.29.2 > -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK