On Mon, 08 May 2023 11:01:55 +0200 Cornelia Huck <coh...@redhat.com> wrote:
> On Mon, May 08 2023, Markus Armbruster <arm...@redhat.com> wrote:
>
> > css_clear_io_interrupt() aborts on unexpected ioctl() errors, and I
> > wonder whether that's appropriate.  Let's have a closer look:

Just for my understanding, was there a field problem with this code, or
is it more of a theoretical concern (i.e. no known crashes)?

> >
> >     static void css_clear_io_interrupt(uint16_t subchannel_id,
> >                                        uint16_t subchannel_nr)
> >     {
> >         Error *err = NULL;
> >         static bool no_clear_irq;
> >         S390FLICState *fs = s390_get_flic();
> >         S390FLICStateClass *fsc = s390_get_flic_class(fs);
> >         int r;
> >
> >         if (unlikely(no_clear_irq)) {
> >             return;
> >         }
> >         r = fsc->clear_io_irq(fs, subchannel_id, subchannel_nr);
> >         switch (r) {
> >         case 0:
> >             break;
> >         case -ENOSYS:
> >             no_clear_irq = true;
> >             /*
> >              * Ignore unavailability, as the user can't do anything
> >              * about it anyway.
> >              */
> >             break;
> >         default:
> >             error_setg_errno(&err, -r, "unexpected error condition");
> >             error_propagate(&error_abort, err);
> >         }
> >     }
> >
> > The default case is abort() with a liberal amount of lipstick applied.
> > Let's ignore the lipstick and focus on the abort().

Nod.

> >
> > fsc->clear_io_irq is either qemu_s390_clear_io_flic() or
> > kvm_s390_clear_io_flic().

Right.

> >
> > Only kvm_s390_clear_io_flic() can return non-zero: -errno when ioctl()
> > fails.

Agreed, this is the case right now. This was not the case when the code
was written: qemu_s390_clear_io_flic() used to be missing functionality
and always returned -ENOSYS.

> >
> > The ioctl() is KVM_SET_DEVICE_ATTR for KVM_DEV_FLIC_CLEAR_IO_IRQ with
> > subchannel_id and subchannel_nr.  I.e. we assume that this can only fail
> > with ENOSYS, and crash hard when the assumption turns out to be wrong.

Yes, this is the assumption and the current behavior.

> >
> > Is this error condition a programming error?
> > I figure it can be one only if the ioctl()'s contract promises us it
> > cannot fail in any other way unless we violate preconditions.

AFAIK and AFAIR it is indeed only possible in case of a programming
error somewhere, and this was almost certainly my intention with this
code. For example, if a future implementer of a meaningful
qemu_s390_clear_io_flic() were to decide to use a multitude of error
codes, the implementer would also have to touch this code and handle
those accordingly to avoid crashes.

As for the ioctl() (KVM_SET_DEVICE_ATTR for KVM_DEV_FLIC_CLEAR_IO_IRQ),
I'm afraid there is no really authoritative contract: the current
implementation, the documentation under Documentation/virt/kvm in the
Linux source tree, and this code in QEMU are the de-facto contract.

linux/Documentation/virt/kvm/api.rst says
"""
4.81 KVM_HAS_DEVICE_ATTR
------------------------

:Capability: KVM_CAP_DEVICE_CTRL, KVM_CAP_VM_ATTRIBUTES for vm device,
             KVM_CAP_VCPU_ATTRIBUTES for vcpu device
             KVM_CAP_SYS_ATTRIBUTES for system (/dev/kvm) device
:Type: device ioctl, vm ioctl, vcpu ioctl
:Parameters: struct kvm_device_attr
:Returns: 0 on success, -1 on error

Errors:

  =====      =============================================================
  ENXIO      The group or attribute is unknown/unsupported for this device
             or hardware support is missing.
  =====      =============================================================

Tests whether a device supports a particular attribute.  A successful
return indicates the attribute is implemented.  It does not necessarily
indicate that the attribute can be read or written in the device's
current state.  "addr" is ignored.
"""
and we do check for availability and cover that via -ENOSYS.

For KVM_DEV_FLIC_CLEAR_IO_IRQ, only the following error code is
documented in linux/Documentation/virt/kvm/devices/s390_flic.rst, which
is to my knowledge the most authoritative source:
"""
.. note:: The KVM_DEV_FLIC_CLEAR_IO_IRQ ioctl will return EINVAL in case a
   zero schid is specified
"""
but a look at the code tells us that -EFAULT is also possible if the
supplied address is broken.

To sum it up, there is nothing that should go wrong with the given
operation, and to the best of my knowledge seeing an error code on the
ioctl would indicate either a programming error on the client side (QEMU
messed it up) or that there is something wrong with the kernel.

> >
> > Is the error condition fatal, i.e. continuing would be unsafe?

If the kernel is broken, probably. It is certainly unexpected.

> >
> > If it's a fatal programming error, then abort() is appropriate.
> >
> > If it's fatal, but not a programming error, we should exit(1) instead.

It might not be a QEMU programming error. I really see no reason why a
combination of a sane QEMU and a sane kernel would give us an error code
other than -ENOSYS.

> >
> > If it's a survivable programming error, use of abort() is a matter of
> > taste.

The fact that we might have failed to clear some interrupts which the
s390 architecture obliges us to clear is not expected to have grave
consequences.

>
> From what I remember, this was introduced to clean up a potentially
> queued interrupt that is not supposed to be delivered, so the worst
> thing that could happen on failure is a spurious interrupt (same as what
> could happen if the kernel flic doesn't provide this function in the
> first place.) My main worry would be changes/breakages on the kernel
> side (while the QEMU side remains unchanged).

Agreed. And I hope anybody changing the kernel would test the new error
code and notice the QEMU crashes. This was my intention in the first
place.

>
> So, I think we should continue to log the error in any case; but I don't
> have a strong opinion as to whether we should use exit(1) (as I wouldn't
> consider it a programming error) or just continue. Halil, your choice :)
>

Neither do I have a strong opinion.
I think a hard crash is easier to spot than a warning message (I mean
both in CI and in case of manual testing). But it is a trade-off. Just
carrying on without checking error codes is in my opinion not really
likely to get us into a pickle either. I don't think the function
performed is essential, especially not for a Linux guest.

For me this is really an 'assert' situation. Is there a QEMU way of
dealing with that?

Regards,
Halil
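
For reference, the three outcomes weighed in this thread (success, -ENOSYS meaning "not available", and anything else being unexpected) can be sketched in standalone C. This is only an illustration of the "log and continue" option, not QEMU code: the names clear_irq_action, classify_clear_irq_result() and handle_clear_irq_result() are hypothetical, and plain fprintf() stands in for whatever reporting facility QEMU would actually use.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical names for illustration only; none of this is QEMU API. */
enum clear_irq_action {
    CLEAR_IRQ_OK,           /* operation succeeded */
    CLEAR_IRQ_UNAVAILABLE,  /* -ENOSYS: not provided, stop trying */
    CLEAR_IRQ_UNEXPECTED,   /* anything else: QEMU or kernel problem */
};

/* Map a 0 / -errno style return value to the cases from the thread. */
static enum clear_irq_action classify_clear_irq_result(int r)
{
    switch (r) {
    case 0:
        return CLEAR_IRQ_OK;
    case -ENOSYS:
        return CLEAR_IRQ_UNAVAILABLE;
    default:
        return CLEAR_IRQ_UNEXPECTED;
    }
}

/*
 * "Log and continue" policy. Returns true if the result was expected.
 * *no_clear_irq is latched on -ENOSYS, mirroring the static flag in
 * css_clear_io_interrupt().
 */
static bool handle_clear_irq_result(int r, bool *no_clear_irq)
{
    assert(no_clear_irq != NULL);

    switch (classify_clear_irq_result(r)) {
    case CLEAR_IRQ_OK:
        return true;
    case CLEAR_IRQ_UNAVAILABLE:
        /* The user can't do anything about it; remember and move on. */
        *no_clear_irq = true;
        return true;
    case CLEAR_IRQ_UNEXPECTED:
    default:
        fprintf(stderr, "clear_io_irq: unexpected error: %s\n",
                strerror(r < 0 ? -r : r));
        return false;
    }
}
```

A caller preferring the hard-crash behaviour could call abort() (or exit(1)) when handle_clear_irq_result() returns false; a caller preferring to carry on would simply ignore the return value after the message has been printed.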