Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices
On Tue, 2013-02-05 at 15:42 +0200, Gleb Natapov wrote: > On Tue, Feb 05, 2013 at 06:37:35AM -0700, Alex Williamson wrote: > > On Tue, 2013-02-05 at 12:05 +, Pandarathil, Vijaymohan R wrote: > > > > > > > -Original Message- > > > > From: linux-pci-ow...@vger.kernel.org [mailto:linux-pci- > > > > ow...@vger.kernel.org] On Behalf Of Gleb Natapov > > > > Sent: Tuesday, February 05, 2013 3:37 AM > > > > To: Pandarathil, Vijaymohan R > > > > Cc: Blue Swirl; Alex Williamson; Bjorn Helgaas; Ortiz, Lance E; > > > > k...@vger.kernel.org; qemu-de...@nongnu.org; linux-...@vger.kernel.org; > > > > linux-kernel@vger.kernel.org > > > > Subject: Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for > > > > VFIO- > > > > PCI devices > > > > > > > > On Tue, Feb 05, 2013 at 10:59:41AM +, Pandarathil, Vijaymohan R > > > > wrote: > > > > > > > > > > > > > > > > -Original Message- > > > > > > From: Gleb Natapov [mailto:g...@redhat.com] > > > > > > Sent: Tuesday, February 05, 2013 1:21 AM > > > > > > To: Pandarathil, Vijaymohan R > > > > > > Cc: Blue Swirl; Alex Williamson; Bjorn Helgaas; Ortiz, Lance E; > > > > > > k...@vger.kernel.org; qemu-de...@nongnu.org; > > > > > > linux-...@vger.kernel.org; > > > > > > linux-kernel@vger.kernel.org > > > > > > Subject: Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER > > > > > > for > > > > VFIO- > > > > > > PCI devices > > > > > > > > > > > > On Tue, Feb 05, 2013 at 09:05:19AM +, Pandarathil, Vijaymohan R > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > -Original Message- > > > > > > > > From: Gleb Natapov [mailto:g...@redhat.com] > > > > > > > > Sent: Tuesday, February 05, 2013 12:05 AM > > > > > > > > To: Blue Swirl > > > > > > > > Cc: Pandarathil, Vijaymohan R; Alex Williamson; Bjorn Helgaas; > > > > Ortiz, > > > > > > Lance > > > > > > > > E; k...@vger.kernel.org; qemu-de...@nongnu.org; linux- > > > > > > p...@vger.kernel.org; > > > > > > > > linux-kernel@vger.kernel.org > > > > > > > > Subject: Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support > > > > > > > > AER > > > > for > > > > > > VFIO- > > > > > > > > PCI devices > > > > > > > > > > > > > > > > On Sun, Feb 03, 2013 at 04:36:11PM +, Blue Swirl wrote: > > > > > > > > > On Sun, Feb 3, 2013 at 2:10 PM, Pandarathil, Vijaymohan R > > > > > > > > > wrote: > > > > > > > > > > - Create eventfd per vfio device assigned to a guest > > > > and > > > > > > > > register an > > > > > > > > > > event handler > > > > > > > > > > > > > > > > > > > > - This fd is passed to the vfio_pci driver through > > > > > > > > > > the > > > > > > SET_IRQ > > > > > > > > ioctl > > > > > > > > > > > > > > > > > > > > - When the device encounters an error, the eventfd > > > > > > > > > > is > > > > > > signalled > > > > > > > > > > and the qemu eventfd handler gets invoked. > > > > > > > > > > > > > > > > > > > > - In the handler decide what action to take. Current > > > > action > > > > > > > > taken > > > > > > > > > > is to terminate the guest. > > > > > > > > > > > > > > > > > > Usually this is not OK, but I guess this is not guest > > > > triggerable. > > > > > > > > > > > > > > > > > Still not OK. Why not stop a guest with appropriate stop reason? > > > > > > > > > > > > > > The thinking was that since this is a hardware error, we would > > >
Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices
On Tue, Feb 05, 2013 at 06:37:35AM -0700, Alex Williamson wrote: > On Tue, 2013-02-05 at 12:05 +, Pandarathil, Vijaymohan R wrote: > > > > > -Original Message- > > > From: linux-pci-ow...@vger.kernel.org [mailto:linux-pci- > > > ow...@vger.kernel.org] On Behalf Of Gleb Natapov > > > Sent: Tuesday, February 05, 2013 3:37 AM > > > To: Pandarathil, Vijaymohan R > > > Cc: Blue Swirl; Alex Williamson; Bjorn Helgaas; Ortiz, Lance E; > > > k...@vger.kernel.org; qemu-de...@nongnu.org; linux-...@vger.kernel.org; > > > linux-kernel@vger.kernel.org > > > Subject: Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for > > > VFIO- > > > PCI devices > > > > > > On Tue, Feb 05, 2013 at 10:59:41AM +, Pandarathil, Vijaymohan R wrote: > > > > > > > > > > > > > -Original Message- > > > > > From: Gleb Natapov [mailto:g...@redhat.com] > > > > > Sent: Tuesday, February 05, 2013 1:21 AM > > > > > To: Pandarathil, Vijaymohan R > > > > > Cc: Blue Swirl; Alex Williamson; Bjorn Helgaas; Ortiz, Lance E; > > > > > k...@vger.kernel.org; qemu-de...@nongnu.org; > > > > > linux-...@vger.kernel.org; > > > > > linux-kernel@vger.kernel.org > > > > > Subject: Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for > > > VFIO- > > > > > PCI devices > > > > > > > > > > On Tue, Feb 05, 2013 at 09:05:19AM +, Pandarathil, Vijaymohan R > > > wrote: > > > > > > > > > > > > > > > > > > > -Original Message- > > > > > > > From: Gleb Natapov [mailto:g...@redhat.com] > > > > > > > Sent: Tuesday, February 05, 2013 12:05 AM > > > > > > > To: Blue Swirl > > > > > > > Cc: Pandarathil, Vijaymohan R; Alex Williamson; Bjorn Helgaas; > > > Ortiz, > > > > > Lance > > > > > > > E; k...@vger.kernel.org; qemu-de...@nongnu.org; linux- > > > > > p...@vger.kernel.org; > > > > > > > linux-kernel@vger.kernel.org > > > > > > > Subject: Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER > > > for > > > > > VFIO- > > > > > > > PCI devices > > > > > > > > > > > > > > On Sun, Feb 03, 2013 at 04:36:11PM +, Blue Swirl wrote: > > > > > > > > On Sun, Feb 3, 2013 at 2:10 PM, Pandarathil, Vijaymohan R > > > > > > > > wrote: > > > > > > > > > - Create eventfd per vfio device assigned to a guest > > > and > > > > > > > register an > > > > > > > > > event handler > > > > > > > > > > > > > > > > > > - This fd is passed to the vfio_pci driver through the > > > > > SET_IRQ > > > > > > > ioctl > > > > > > > > > > > > > > > > > > - When the device encounters an error, the eventfd is > > > > > signalled > > > > > > > > > and the qemu eventfd handler gets invoked. > > > > > > > > > > > > > > > > > > - In the handler decide what action to take. Current > > > action > > > > > > > taken > > > > > > > > > is to terminate the guest. > > > > > > > > > > > > > > > > Usually this is not OK, but I guess this is not guest > > > triggerable. > > > > > > > > > > > > > > > Still not OK. Why not stop a guest with appropriate stop reason? > > > > > > > > > > > > The thinking was that since this is a hardware error, we would want > > > to > > > > > stop the guest at the earliest. The hw_error() routine which aborts > > > > > the > > > > > qemu process was suggested by Alex and that seemed appropriate. > > > > > Earlier > > > I > > > > > was using qemu_system_shutdown_request(). Any suggestions ? > > > > > > > > > > > I am thinking vm_stop(). Stopping SMP guest (and UP too in fact) > > > > > involves sending IPIs to other cpus running guest's vcpus. Both exit() > > > > > and vm_stop() will do it, but f
Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices
On Tue, 2013-02-05 at 12:05 +, Pandarathil, Vijaymohan R wrote: > > > -Original Message- > > From: linux-pci-ow...@vger.kernel.org [mailto:linux-pci- > > ow...@vger.kernel.org] On Behalf Of Gleb Natapov > > Sent: Tuesday, February 05, 2013 3:37 AM > > To: Pandarathil, Vijaymohan R > > Cc: Blue Swirl; Alex Williamson; Bjorn Helgaas; Ortiz, Lance E; > > k...@vger.kernel.org; qemu-de...@nongnu.org; linux-...@vger.kernel.org; > > linux-kernel@vger.kernel.org > > Subject: Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for VFIO- > > PCI devices > > > > On Tue, Feb 05, 2013 at 10:59:41AM +, Pandarathil, Vijaymohan R wrote: > > > > > > > > > > -Original Message- > > > > From: Gleb Natapov [mailto:g...@redhat.com] > > > > Sent: Tuesday, February 05, 2013 1:21 AM > > > > To: Pandarathil, Vijaymohan R > > > > Cc: Blue Swirl; Alex Williamson; Bjorn Helgaas; Ortiz, Lance E; > > > > k...@vger.kernel.org; qemu-de...@nongnu.org; linux-...@vger.kernel.org; > > > > linux-kernel@vger.kernel.org > > > > Subject: Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for > > VFIO- > > > > PCI devices > > > > > > > > On Tue, Feb 05, 2013 at 09:05:19AM +, Pandarathil, Vijaymohan R > > wrote: > > > > > > > > > > > > > > > > -Original Message- > > > > > > From: Gleb Natapov [mailto:g...@redhat.com] > > > > > > Sent: Tuesday, February 05, 2013 12:05 AM > > > > > > To: Blue Swirl > > > > > > Cc: Pandarathil, Vijaymohan R; Alex Williamson; Bjorn Helgaas; > > Ortiz, > > > > Lance > > > > > > E; k...@vger.kernel.org; qemu-de...@nongnu.org; linux- > > > > p...@vger.kernel.org; > > > > > > linux-kernel@vger.kernel.org > > > > > > Subject: Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER > > for > > > > VFIO- > > > > > > PCI devices > > > > > > > > > > > > On Sun, Feb 03, 2013 at 04:36:11PM +, Blue Swirl wrote: > > > > > > > On Sun, Feb 3, 2013 at 2:10 PM, Pandarathil, Vijaymohan R > > > > > > > wrote: > > > > > > > > - Create eventfd per vfio device assigned to a guest > > and > > > > > > register an > > > > > > > > event handler > > > > > > > > > > > > > > > > - This fd is passed to the vfio_pci driver through the > > > > SET_IRQ > > > > > > ioctl > > > > > > > > > > > > > > > > - When the device encounters an error, the eventfd is > > > > signalled > > > > > > > > and the qemu eventfd handler gets invoked. > > > > > > > > > > > > > > > > - In the handler decide what action to take. Current > > action > > > > > > taken > > > > > > > > is to terminate the guest. > > > > > > > > > > > > > > Usually this is not OK, but I guess this is not guest > > triggerable. > > > > > > > > > > > > > Still not OK. Why not stop a guest with appropriate stop reason? > > > > > > > > > > The thinking was that since this is a hardware error, we would want > > to > > > > stop the guest at the earliest. The hw_error() routine which aborts the > > > > qemu process was suggested by Alex and that seemed appropriate. Earlier > > I > > > > was using qemu_system_shutdown_request(). Any suggestions ? > > > > > > > > > I am thinking vm_stop(). Stopping SMP guest (and UP too in fact) > > > > involves sending IPIs to other cpus running guest's vcpus. Both exit() > > > > and vm_stop() will do it, but former is implicitly in the kernel and > > > > later is explicitly in QEMU. > > > > > > I had used vm_stop(RUN_STATE_SHUTDOWN) earlier in my code. But while > > testing, guest ended up in a hang rather than exiting. There seems to some > > cleanup work which is being done as part of vm_stop. In our case, we wanted > > the guest to exit immediately. So use of hw_error() seemed appropriate. > > > > > What makes you think it hang? It stopped, precisely what it should do if > > you call vm_stop(). Now it is possible for vm user to investigate what > > happened and even salvage some data from guest memory. > > That was ignorance on my part on the expected behavior of vm_stop(). > So what you are suggesting is to stop the guest displaying an appropriate > error/next-steps message and have the users do any > data-collection/investigation > and then manually kill the guest, if they so desire. Right ? > > Sounds reasonable. As long as the guest is not touching the device, it should > be okay. > Alex, Any comments ? What's the libvirt behavior when a guest goes to vm_stop? My only concern would be whether the user is going to be confused by a state where the vm is still up, but not running. I imagine they'll have to manually stop it and restart it to continue. Thanks, Alex -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices
On Tue, Feb 05, 2013 at 12:05:11PM +, Pandarathil, Vijaymohan R wrote: > > > > -Original Message- > > From: linux-pci-ow...@vger.kernel.org [mailto:linux-pci- > > ow...@vger.kernel.org] On Behalf Of Gleb Natapov > > Sent: Tuesday, February 05, 2013 3:37 AM > > To: Pandarathil, Vijaymohan R > > Cc: Blue Swirl; Alex Williamson; Bjorn Helgaas; Ortiz, Lance E; > > k...@vger.kernel.org; qemu-de...@nongnu.org; linux-...@vger.kernel.org; > > linux-kernel@vger.kernel.org > > Subject: Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for VFIO- > > PCI devices > > > > On Tue, Feb 05, 2013 at 10:59:41AM +, Pandarathil, Vijaymohan R wrote: > > > > > > > > > > -Original Message- > > > > From: Gleb Natapov [mailto:g...@redhat.com] > > > > Sent: Tuesday, February 05, 2013 1:21 AM > > > > To: Pandarathil, Vijaymohan R > > > > Cc: Blue Swirl; Alex Williamson; Bjorn Helgaas; Ortiz, Lance E; > > > > k...@vger.kernel.org; qemu-de...@nongnu.org; linux-...@vger.kernel.org; > > > > linux-kernel@vger.kernel.org > > > > Subject: Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for > > VFIO- > > > > PCI devices > > > > > > > > On Tue, Feb 05, 2013 at 09:05:19AM +, Pandarathil, Vijaymohan R > > wrote: > > > > > > > > > > > > > > > > -Original Message- > > > > > > From: Gleb Natapov [mailto:g...@redhat.com] > > > > > > Sent: Tuesday, February 05, 2013 12:05 AM > > > > > > To: Blue Swirl > > > > > > Cc: Pandarathil, Vijaymohan R; Alex Williamson; Bjorn Helgaas; > > Ortiz, > > > > Lance > > > > > > E; k...@vger.kernel.org; qemu-de...@nongnu.org; linux- > > > > p...@vger.kernel.org; > > > > > > linux-kernel@vger.kernel.org > > > > > > Subject: Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER > > for > > > > VFIO- > > > > > > PCI devices > > > > > > > > > > > > On Sun, Feb 03, 2013 at 04:36:11PM +, Blue Swirl wrote: > > > > > > > On Sun, Feb 3, 2013 at 2:10 PM, Pandarathil, Vijaymohan R > > > > > > > wrote: > > > > > > > > - Create eventfd per vfio device assigned to a guest > > and > > > > > > register an > > > > > > > > event handler > > > > > > > > > > > > > > > > - This fd is passed to the vfio_pci driver through the > > > > SET_IRQ > > > > > > ioctl > > > > > > > > > > > > > > > > - When the device encounters an error, the eventfd is > > > > signalled > > > > > > > > and the qemu eventfd handler gets invoked. > > > > > > > > > > > > > > > > - In the handler decide what action to take. Current > > action > > > > > > taken > > > > > > > > is to terminate the guest. > > > > > > > > > > > > > > Usually this is not OK, but I guess this is not guest > > triggerable. > > > > > > > > > > > > > Still not OK. Why not stop a guest with appropriate stop reason? > > > > > > > > > > The thinking was that since this is a hardware error, we would want > > to > > > > stop the guest at the earliest. The hw_error() routine which aborts the > > > > qemu process was suggested by Alex and that seemed appropriate. Earlier > > I > > > > was using qemu_system_shutdown_request(). Any suggestions ? > > > > > > > > > I am thinking vm_stop(). Stopping SMP guest (and UP too in fact) > > > > involves sending IPIs to other cpus running guest's vcpus. Both exit() > > > > and vm_stop() will do it, but former is implicitly in the kernel and > > > > later is explicitly in QEMU. > > > > > > I had used vm_stop(RUN_STATE_SHUTDOWN) earlier in my code. But while > > testing, guest ended up in a hang rather than exiting. There seems to some > > cleanup work which is being done as part of vm_stop. In our case, we wanted > > the guest to exit immediately. So use of hw_error() seemed appropriate. > > > > > What makes you think it hang? It stopped, precisely what it should do if > > you call vm_stop(). Now it is possible for vm user to investigate what > > happened and even salvage some data from guest memory. > > That was ignorance on my part on the expected behavior of vm_stop(). > So what you are suggesting is to stop the guest displaying an appropriate > error/next-steps message and have the users do any > data-collection/investigation > and then manually kill the guest, if they so desire. Right ? > Yes. -- Gleb. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices
> -Original Message- > From: linux-pci-ow...@vger.kernel.org [mailto:linux-pci- > ow...@vger.kernel.org] On Behalf Of Gleb Natapov > Sent: Tuesday, February 05, 2013 3:37 AM > To: Pandarathil, Vijaymohan R > Cc: Blue Swirl; Alex Williamson; Bjorn Helgaas; Ortiz, Lance E; > k...@vger.kernel.org; qemu-de...@nongnu.org; linux-...@vger.kernel.org; > linux-kernel@vger.kernel.org > Subject: Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for VFIO- > PCI devices > > On Tue, Feb 05, 2013 at 10:59:41AM +, Pandarathil, Vijaymohan R wrote: > > > > > > > -Original Message- > > > From: Gleb Natapov [mailto:g...@redhat.com] > > > Sent: Tuesday, February 05, 2013 1:21 AM > > > To: Pandarathil, Vijaymohan R > > > Cc: Blue Swirl; Alex Williamson; Bjorn Helgaas; Ortiz, Lance E; > > > k...@vger.kernel.org; qemu-de...@nongnu.org; linux-...@vger.kernel.org; > > > linux-kernel@vger.kernel.org > > > Subject: Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for > VFIO- > > > PCI devices > > > > > > On Tue, Feb 05, 2013 at 09:05:19AM +, Pandarathil, Vijaymohan R > wrote: > > > > > > > > > > > > > -Original Message- > > > > > From: Gleb Natapov [mailto:g...@redhat.com] > > > > > Sent: Tuesday, February 05, 2013 12:05 AM > > > > > To: Blue Swirl > > > > > Cc: Pandarathil, Vijaymohan R; Alex Williamson; Bjorn Helgaas; > Ortiz, > > > Lance > > > > > E; k...@vger.kernel.org; qemu-de...@nongnu.org; linux- > > > p...@vger.kernel.org; > > > > > linux-kernel@vger.kernel.org > > > > > Subject: Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER > for > > > VFIO- > > > > > PCI devices > > > > > > > > > > On Sun, Feb 03, 2013 at 04:36:11PM +, Blue Swirl wrote: > > > > > > On Sun, Feb 3, 2013 at 2:10 PM, Pandarathil, Vijaymohan R > > > > > > wrote: > > > > > > > - Create eventfd per vfio device assigned to a guest > and > > > > > register an > > > > > > > event handler > > > > > > > > > > > > > > - This fd is passed to the vfio_pci driver through the > > > SET_IRQ > > > > > ioctl > > > > > > > > > > > > > > - When the device encounters an error, the eventfd is > > > signalled > > > > > > > and the qemu eventfd handler gets invoked. > > > > > > > > > > > > > > - In the handler decide what action to take. Current > action > > > > > taken > > > > > > > is to terminate the guest. > > > > > > > > > > > > Usually this is not OK, but I guess this is not guest > triggerable. > > > > > > > > > > > Still not OK. Why not stop a guest with appropriate stop reason? > > > > > > > > The thinking was that since this is a hardware error, we would want > to > > > stop the guest at the earliest. The hw_error() routine which aborts the > > > qemu process was suggested by Alex and that seemed appropriate. Earlier > I > > > was using qemu_system_shutdown_request(). Any suggestions ? > > > > > > > I am thinking vm_stop(). Stopping SMP guest (and UP too in fact) > > > involves sending IPIs to other cpus running guest's vcpus. Both exit() > > > and vm_stop() will do it, but former is implicitly in the kernel and > > > later is explicitly in QEMU. > > > > I had used vm_stop(RUN_STATE_SHUTDOWN) earlier in my code. But while > testing, guest ended up in a hang rather than exiting. There seems to some > cleanup work which is being done as part of vm_stop. In our case, we wanted > the guest to exit immediately. So use of hw_error() seemed appropriate. > > > What makes you think it hang? It stopped, precisely what it should do if > you call vm_stop(). Now it is possible for vm user to investigate what > happened and even salvage some data from guest memory. That was ignorance on my part on the expected behavior of vm_stop(). So what you are suggesting is to stop the guest displaying an appropriate error/next-steps message and have the users do any data-collection/investigation and then manually kill the guest, if they so desire. Right ? Sounds reasonable. As long as the guest is not touching the device, it should be okay. Alex, Any comments ? Thanks Vijay > > -- > Gleb. > -- > To unsubscribe from this list: send the line "unsubscribe linux-pci" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices
On Tue, Feb 05, 2013 at 10:59:41AM +, Pandarathil, Vijaymohan R wrote: > > > > -Original Message- > > From: Gleb Natapov [mailto:g...@redhat.com] > > Sent: Tuesday, February 05, 2013 1:21 AM > > To: Pandarathil, Vijaymohan R > > Cc: Blue Swirl; Alex Williamson; Bjorn Helgaas; Ortiz, Lance E; > > k...@vger.kernel.org; qemu-de...@nongnu.org; linux-...@vger.kernel.org; > > linux-kernel@vger.kernel.org > > Subject: Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for VFIO- > > PCI devices > > > > On Tue, Feb 05, 2013 at 09:05:19AM +, Pandarathil, Vijaymohan R wrote: > > > > > > > > > > -Original Message- > > > > From: Gleb Natapov [mailto:g...@redhat.com] > > > > Sent: Tuesday, February 05, 2013 12:05 AM > > > > To: Blue Swirl > > > > Cc: Pandarathil, Vijaymohan R; Alex Williamson; Bjorn Helgaas; Ortiz, > > Lance > > > > E; k...@vger.kernel.org; qemu-de...@nongnu.org; linux- > > p...@vger.kernel.org; > > > > linux-kernel@vger.kernel.org > > > > Subject: Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for > > VFIO- > > > > PCI devices > > > > > > > > On Sun, Feb 03, 2013 at 04:36:11PM +, Blue Swirl wrote: > > > > > On Sun, Feb 3, 2013 at 2:10 PM, Pandarathil, Vijaymohan R > > > > > wrote: > > > > > > - Create eventfd per vfio device assigned to a guest and > > > > register an > > > > > > event handler > > > > > > > > > > > > - This fd is passed to the vfio_pci driver through the > > SET_IRQ > > > > ioctl > > > > > > > > > > > > - When the device encounters an error, the eventfd is > > signalled > > > > > > and the qemu eventfd handler gets invoked. > > > > > > > > > > > > - In the handler decide what action to take. Current action > > > > taken > > > > > > is to terminate the guest. > > > > > > > > > > Usually this is not OK, but I guess this is not guest triggerable. > > > > > > > > > Still not OK. Why not stop a guest with appropriate stop reason? > > > > > > The thinking was that since this is a hardware error, we would want to > > stop the guest at the earliest. The hw_error() routine which aborts the > > qemu process was suggested by Alex and that seemed appropriate. Earlier I > > was using qemu_system_shutdown_request(). Any suggestions ? > > > > > I am thinking vm_stop(). Stopping SMP guest (and UP too in fact) > > involves sending IPIs to other cpus running guest's vcpus. Both exit() > > and vm_stop() will do it, but former is implicitly in the kernel and > > later is explicitly in QEMU. > > I had used vm_stop(RUN_STATE_SHUTDOWN) earlier in my code. But while testing, > guest ended up in a hang rather than exiting. There seems to some cleanup > work which is being done as part of vm_stop. In our case, we wanted the guest > to exit immediately. So use of hw_error() seemed appropriate. > What makes you think it hang? It stopped, precisely what it should do if you call vm_stop(). Now it is possible for vm user to investigate what happened and even salvage some data from guest memory. -- Gleb. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices
> -Original Message- > From: Gleb Natapov [mailto:g...@redhat.com] > Sent: Tuesday, February 05, 2013 1:21 AM > To: Pandarathil, Vijaymohan R > Cc: Blue Swirl; Alex Williamson; Bjorn Helgaas; Ortiz, Lance E; > k...@vger.kernel.org; qemu-de...@nongnu.org; linux-...@vger.kernel.org; > linux-kernel@vger.kernel.org > Subject: Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for VFIO- > PCI devices > > On Tue, Feb 05, 2013 at 09:05:19AM +, Pandarathil, Vijaymohan R wrote: > > > > > > > -Original Message- > > > From: Gleb Natapov [mailto:g...@redhat.com] > > > Sent: Tuesday, February 05, 2013 12:05 AM > > > To: Blue Swirl > > > Cc: Pandarathil, Vijaymohan R; Alex Williamson; Bjorn Helgaas; Ortiz, > Lance > > > E; k...@vger.kernel.org; qemu-de...@nongnu.org; linux- > p...@vger.kernel.org; > > > linux-kernel@vger.kernel.org > > > Subject: Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for > VFIO- > > > PCI devices > > > > > > On Sun, Feb 03, 2013 at 04:36:11PM +, Blue Swirl wrote: > > > > On Sun, Feb 3, 2013 at 2:10 PM, Pandarathil, Vijaymohan R > > > > wrote: > > > > > - Create eventfd per vfio device assigned to a guest and > > > register an > > > > > event handler > > > > > > > > > > - This fd is passed to the vfio_pci driver through the > SET_IRQ > > > ioctl > > > > > > > > > > - When the device encounters an error, the eventfd is > signalled > > > > > and the qemu eventfd handler gets invoked. > > > > > > > > > > - In the handler decide what action to take. Current action > > > taken > > > > > is to terminate the guest. > > > > > > > > Usually this is not OK, but I guess this is not guest triggerable. > > > > > > > Still not OK. Why not stop a guest with appropriate stop reason? > > > > The thinking was that since this is a hardware error, we would want to > stop the guest at the earliest. The hw_error() routine which aborts the > qemu process was suggested by Alex and that seemed appropriate. Earlier I > was using qemu_system_shutdown_request(). Any suggestions ? > > > I am thinking vm_stop(). Stopping SMP guest (and UP too in fact) > involves sending IPIs to other cpus running guest's vcpus. Both exit() > and vm_stop() will do it, but former is implicitly in the kernel and > later is explicitly in QEMU. I had used vm_stop(RUN_STATE_SHUTDOWN) earlier in my code. But while testing, guest ended up in a hang rather than exiting. There seems to some cleanup work which is being done as part of vm_stop. In our case, we wanted the guest to exit immediately. So use of hw_error() seemed appropriate. Thoughts ? Vijay > > -- > Gleb. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices
On Tue, Feb 05, 2013 at 09:05:19AM +, Pandarathil, Vijaymohan R wrote: > > > > -Original Message- > > From: Gleb Natapov [mailto:g...@redhat.com] > > Sent: Tuesday, February 05, 2013 12:05 AM > > To: Blue Swirl > > Cc: Pandarathil, Vijaymohan R; Alex Williamson; Bjorn Helgaas; Ortiz, Lance > > E; k...@vger.kernel.org; qemu-de...@nongnu.org; linux-...@vger.kernel.org; > > linux-kernel@vger.kernel.org > > Subject: Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for VFIO- > > PCI devices > > > > On Sun, Feb 03, 2013 at 04:36:11PM +, Blue Swirl wrote: > > > On Sun, Feb 3, 2013 at 2:10 PM, Pandarathil, Vijaymohan R > > > wrote: > > > > - Create eventfd per vfio device assigned to a guest and > > register an > > > > event handler > > > > > > > > - This fd is passed to the vfio_pci driver through the SET_IRQ > > ioctl > > > > > > > > - When the device encounters an error, the eventfd is signalled > > > > and the qemu eventfd handler gets invoked. > > > > > > > > - In the handler decide what action to take. Current action > > taken > > > > is to terminate the guest. > > > > > > Usually this is not OK, but I guess this is not guest triggerable. > > > > > Still not OK. Why not stop a guest with appropriate stop reason? > > The thinking was that since this is a hardware error, we would want to stop > the guest at the earliest. The hw_error() routine which aborts the qemu > process was suggested by Alex and that seemed appropriate. Earlier I was > using qemu_system_shutdown_request(). Any suggestions ? > I am thinking vm_stop(). Stopping SMP guest (and UP too in fact) involves sending IPIs to other cpus running guest's vcpus. Both exit() and vm_stop() will do it, but former is implicitly in the kernel and later is explicitly in QEMU. -- Gleb. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices
> -Original Message- > From: Gleb Natapov [mailto:g...@redhat.com] > Sent: Tuesday, February 05, 2013 12:05 AM > To: Blue Swirl > Cc: Pandarathil, Vijaymohan R; Alex Williamson; Bjorn Helgaas; Ortiz, Lance > E; k...@vger.kernel.org; qemu-de...@nongnu.org; linux-...@vger.kernel.org; > linux-kernel@vger.kernel.org > Subject: Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for VFIO- > PCI devices > > On Sun, Feb 03, 2013 at 04:36:11PM +, Blue Swirl wrote: > > On Sun, Feb 3, 2013 at 2:10 PM, Pandarathil, Vijaymohan R > > wrote: > > > - Create eventfd per vfio device assigned to a guest and > register an > > > event handler > > > > > > - This fd is passed to the vfio_pci driver through the SET_IRQ > ioctl > > > > > > - When the device encounters an error, the eventfd is signalled > > > and the qemu eventfd handler gets invoked. > > > > > > - In the handler decide what action to take. Current action > taken > > > is to terminate the guest. > > > > Usually this is not OK, but I guess this is not guest triggerable. > > > Still not OK. Why not stop a guest with appropriate stop reason? The thinking was that since this is a hardware error, we would want to stop the guest at the earliest. The hw_error() routine which aborts the qemu process was suggested by Alex and that seemed appropriate. Earlier I was using qemu_system_shutdown_request(). Any suggestions ? Thanks Vijay > > -- > Gleb. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices
On Sun, Feb 03, 2013 at 04:36:11PM +, Blue Swirl wrote: > On Sun, Feb 3, 2013 at 2:10 PM, Pandarathil, Vijaymohan R > wrote: > > - Create eventfd per vfio device assigned to a guest and register an > > event handler > > > > - This fd is passed to the vfio_pci driver through the SET_IRQ ioctl > > > > - When the device encounters an error, the eventfd is signalled > > and the qemu eventfd handler gets invoked. > > > > - In the handler decide what action to take. Current action taken > > is to terminate the guest. > > Usually this is not OK, but I guess this is not guest triggerable. > Still not OK. Why not stop a guest with appropriate stop reason? -- Gleb. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices
On Sun, 2013-02-03 at 14:10 +, Pandarathil, Vijaymohan R wrote: > - Create eventfd per vfio device assigned to a guest and register an > event handler > > - This fd is passed to the vfio_pci driver through the SET_IRQ ioctl > > - When the device encounters an error, the eventfd is signalled > and the qemu eventfd handler gets invoked. > > - In the handler decide what action to take. Current action taken > is to terminate the guest. > > Signed-off-by: Vijay Mohan Pandarathil > --- > hw/vfio_pci.c | 105 > + > linux-headers/linux/vfio.h | 1 + > 2 files changed, 106 insertions(+) > > diff --git a/hw/vfio_pci.c b/hw/vfio_pci.c > index c51ae67..4e2f768 100644 > --- a/hw/vfio_pci.c > +++ b/hw/vfio_pci.c > @@ -130,6 +130,8 @@ typedef struct VFIODevice { > QLIST_ENTRY(VFIODevice) next; > struct VFIOGroup *group; > bool reset_works; > +EventNotifier err_notifier; > +bool pci_aer; Re-order these for alignment please. ie: struct VFIOGroup *group; EventNotifier err_notifier; bool reset_works; bool pci_aer; > } VFIODevice; > > typedef struct VFIOGroup { > @@ -1922,6 +1924,106 @@ static void vfio_put_device(VFIODevice *vdev) > } > } > > +static void vfio_err_notifier_handler(void *opaque) > +{ > +VFIODevice *vdev = opaque; > + > +if (!event_notifier_test_and_clear(&vdev->err_notifier)) { > +return; > +} > + > +/* > + * TBD. Retrieve the error details and decide what action > + * needs to be taken. One of the actions could be to pass > + * the error to the guest and have the guest driver recover > + * from the error. This requires that PCIe capabilities be > + * exposed to the guest. At present, we just terminate the > + * guest to contain the error. > + */ > + > +error_report("%s (%04x:%02x:%02x.%x)" > +"Unrecoverable error detected... Terminating guest\n", > +__func__, vdev->host.domain, vdev->host.bus, > +vdev->host.slot, vdev->host.function); > + > +hw_error("(%04x:%02x:%02x.%x) Unrecoverable device error\n", > +vdev->host.domain, vdev->host.bus, > +vdev->host.slot, vdev->host.function); > + > +return; As Blue Swirl mentions, these returns at the end of void functions are unnecessary. > +} > + > +static void vfio_register_err_notifier(VFIODevice *vdev) > +{ > +int ret; > +int argsz; > +struct vfio_irq_set *irq_set; > +int32_t *pfd; > + > +if (event_notifier_init(&vdev->err_notifier, 0)) { > +error_report("vfio: Warning: Unable to init event notifier for error > detection\n"); > +return; > +} > + > +argsz = sizeof(*irq_set) + sizeof(*pfd); > + > +irq_set = g_malloc0(argsz); > +irq_set->argsz = argsz; > +irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | > + VFIO_IRQ_SET_ACTION_TRIGGER; > +irq_set->index = VFIO_PCI_ERR_IRQ_INDEX; > +irq_set->start = 0; > +irq_set->count = 1; > +pfd = (int32_t *)&irq_set->data; > + > +*pfd = event_notifier_get_fd(&vdev->err_notifier); > +qemu_set_fd_handler(*pfd, vfio_err_notifier_handler, NULL, vdev); > + > +ret = ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, irq_set); > +if (ret) { > +DPRINTF("vfio: Error notification not supported for the device\n"); We should know this already though, right? Where's our call to VFIO_DEVICE_GET_IRQ_INFO for this index? I'd expect that should happen in vfio_get_device where it can set some flag true, then this function would exit immediately if that flag isn't set. Then by the time we're here, it's a legitimate error_report if we think this should work and doesn't. > +qemu_set_fd_handler(*pfd, NULL, NULL, vdev); > +event_notifier_cleanup(&vdev->err_notifier); > +g_free(irq_set); > +return; > +} > +g_free(irq_set); > +vdev->pci_aer = 1; bool, so set to true or false. > +return; > +} > +static void vfio_unregister_err_notifier(VFIODevice *vdev) > +{ > +int argsz; > +struct vfio_irq_set *irq_set; > +int32_t *pfd; > +int ret; > + > +if (!vdev->pci_aer) { > +return; > +} > + > +argsz = sizeof(*irq_set) + sizeof(*pfd); > + > +irq_set = g_malloc0(argsz); > +irq_set->argsz = argsz; > +irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | > + VFIO_IRQ_SET_ACTION_TRIGGER; > +irq_set->index = VFIO_PCI_ERR_IRQ_INDEX; > +irq_set->start = 0; > +irq_set->count = 1; > +pfd = (int32_t *)&irq_set->data; > +*pfd = -1; > + > +ret = ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, irq_set); > +if (ret) { > +DPRINTF("vfio: Failed to de-assign error fd: %d\n", ret); This is also a legitimate error_report. In general, if kernel vfio-pci does or doesn't support a non-critical feature, that's a DPRINTF. If it's told us the feature is there and something d
Re: [PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices
On Sun, Feb 3, 2013 at 2:10 PM, Pandarathil, Vijaymohan R wrote: > - Create eventfd per vfio device assigned to a guest and register an > event handler > > - This fd is passed to the vfio_pci driver through the SET_IRQ ioctl > > - When the device encounters an error, the eventfd is signalled > and the qemu eventfd handler gets invoked. > > - In the handler decide what action to take. Current action taken > is to terminate the guest. Usually this is not OK, but I guess this is not guest triggerable. > > Signed-off-by: Vijay Mohan Pandarathil > --- > hw/vfio_pci.c | 105 > + > linux-headers/linux/vfio.h | 1 + > 2 files changed, 106 insertions(+) > > diff --git a/hw/vfio_pci.c b/hw/vfio_pci.c > index c51ae67..4e2f768 100644 > --- a/hw/vfio_pci.c > +++ b/hw/vfio_pci.c > @@ -130,6 +130,8 @@ typedef struct VFIODevice { > QLIST_ENTRY(VFIODevice) next; > struct VFIOGroup *group; > bool reset_works; > +EventNotifier err_notifier; > +bool pci_aer; > } VFIODevice; > > typedef struct VFIOGroup { > @@ -1922,6 +1924,106 @@ static void vfio_put_device(VFIODevice *vdev) > } > } > > +static void vfio_err_notifier_handler(void *opaque) > +{ > +VFIODevice *vdev = opaque; > + > +if (!event_notifier_test_and_clear(&vdev->err_notifier)) { > +return; > +} > + > +/* > + * TBD. Retrieve the error details and decide what action > + * needs to be taken. One of the actions could be to pass > + * the error to the guest and have the guest driver recover > + * from the error. This requires that PCIe capabilities be > + * exposed to the guest. At present, we just terminate the > + * guest to contain the error. > + */ > + > +error_report("%s (%04x:%02x:%02x.%x)" > +"Unrecoverable error detected... Terminating guest\n", > +__func__, vdev->host.domain, vdev->host.bus, > +vdev->host.slot, vdev->host.function); > + > +hw_error("(%04x:%02x:%02x.%x) Unrecoverable device error\n", > +vdev->host.domain, vdev->host.bus, > +vdev->host.slot, vdev->host.function); > + > +return; Useless, please remove. > +} > + > +static void vfio_register_err_notifier(VFIODevice *vdev) > +{ > +int ret; > +int argsz; > +struct vfio_irq_set *irq_set; > +int32_t *pfd; > + > +if (event_notifier_init(&vdev->err_notifier, 0)) { > +error_report("vfio: Warning: Unable to init event notifier for error > detection\n"); > +return; > +} > + > +argsz = sizeof(*irq_set) + sizeof(*pfd); > + > +irq_set = g_malloc0(argsz); > +irq_set->argsz = argsz; > +irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | > + VFIO_IRQ_SET_ACTION_TRIGGER; > +irq_set->index = VFIO_PCI_ERR_IRQ_INDEX; > +irq_set->start = 0; > +irq_set->count = 1; > +pfd = (int32_t *)&irq_set->data; > + > +*pfd = event_notifier_get_fd(&vdev->err_notifier); > +qemu_set_fd_handler(*pfd, vfio_err_notifier_handler, NULL, vdev); > + > +ret = ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, irq_set); > +if (ret) { > +DPRINTF("vfio: Error notification not supported for the device\n"); > +qemu_set_fd_handler(*pfd, NULL, NULL, vdev); > +event_notifier_cleanup(&vdev->err_notifier); > +g_free(irq_set); > +return; > +} > +g_free(irq_set); > +vdev->pci_aer = 1; > +return; Ditto. > +} > +static void vfio_unregister_err_notifier(VFIODevice *vdev) > +{ > +int argsz; > +struct vfio_irq_set *irq_set; > +int32_t *pfd; > +int ret; > + > +if (!vdev->pci_aer) { > +return; > +} > + > +argsz = sizeof(*irq_set) + sizeof(*pfd); > + > +irq_set = g_malloc0(argsz); > +irq_set->argsz = argsz; > +irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | > + VFIO_IRQ_SET_ACTION_TRIGGER; > +irq_set->index = VFIO_PCI_ERR_IRQ_INDEX; > +irq_set->start = 0; > +irq_set->count = 1; > +pfd = (int32_t *)&irq_set->data; > +*pfd = -1; > + > +ret = ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, irq_set); > +if (ret) { > +DPRINTF("vfio: Failed to de-assign error fd: %d\n", ret); > +} > +g_free(irq_set); > +qemu_set_fd_handler(event_notifier_get_fd(&vdev->err_notifier), > +NULL, NULL, vdev); > +event_notifier_cleanup(&vdev->err_notifier); > +return; Ditto. > +} > static int vfio_initfn(PCIDevice *pdev) > { > VFIODevice *pvdev, *vdev = DO_UPCAST(VFIODevice, pdev, pdev); > @@ -2032,6 +2134,8 @@ static int vfio_initfn(PCIDevice *pdev) > } > } > > +vfio_register_err_notifier(vdev); > + > return 0; > > out_teardown: > @@ -2049,6 +2153,7 @@ static void vfio_exitfn(PCIDevice *pdev) > VFIODevice *vdev = DO_UPCAST(VFIODevice, pdev, pdev); > VFIOGroup *group = vdev->group; > > +vfio_unregister_err_noti
[PATCH v3 3/3] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices
- Create eventfd per vfio device assigned to a guest and register an event handler - This fd is passed to the vfio_pci driver through the SET_IRQ ioctl - When the device encounters an error, the eventfd is signalled and the qemu eventfd handler gets invoked. - In the handler decide what action to take. Current action taken is to terminate the guest. Signed-off-by: Vijay Mohan Pandarathil --- hw/vfio_pci.c | 105 + linux-headers/linux/vfio.h | 1 + 2 files changed, 106 insertions(+) diff --git a/hw/vfio_pci.c b/hw/vfio_pci.c index c51ae67..4e2f768 100644 --- a/hw/vfio_pci.c +++ b/hw/vfio_pci.c @@ -130,6 +130,8 @@ typedef struct VFIODevice { QLIST_ENTRY(VFIODevice) next; struct VFIOGroup *group; bool reset_works; +EventNotifier err_notifier; +bool pci_aer; } VFIODevice; typedef struct VFIOGroup { @@ -1922,6 +1924,106 @@ static void vfio_put_device(VFIODevice *vdev) } } +static void vfio_err_notifier_handler(void *opaque) +{ +VFIODevice *vdev = opaque; + +if (!event_notifier_test_and_clear(&vdev->err_notifier)) { +return; +} + +/* + * TBD. Retrieve the error details and decide what action + * needs to be taken. One of the actions could be to pass + * the error to the guest and have the guest driver recover + * from the error. This requires that PCIe capabilities be + * exposed to the guest. At present, we just terminate the + * guest to contain the error. + */ + +error_report("%s (%04x:%02x:%02x.%x)" +"Unrecoverable error detected... Terminating guest\n", +__func__, vdev->host.domain, vdev->host.bus, +vdev->host.slot, vdev->host.function); + +hw_error("(%04x:%02x:%02x.%x) Unrecoverable device error\n", +vdev->host.domain, vdev->host.bus, +vdev->host.slot, vdev->host.function); + +return; +} + +static void vfio_register_err_notifier(VFIODevice *vdev) +{ +int ret; +int argsz; +struct vfio_irq_set *irq_set; +int32_t *pfd; + +if (event_notifier_init(&vdev->err_notifier, 0)) { +error_report("vfio: Warning: Unable to init event notifier for error detection\n"); +return; +} + +argsz = sizeof(*irq_set) + sizeof(*pfd); + +irq_set = g_malloc0(argsz); +irq_set->argsz = argsz; +irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | + VFIO_IRQ_SET_ACTION_TRIGGER; +irq_set->index = VFIO_PCI_ERR_IRQ_INDEX; +irq_set->start = 0; +irq_set->count = 1; +pfd = (int32_t *)&irq_set->data; + +*pfd = event_notifier_get_fd(&vdev->err_notifier); +qemu_set_fd_handler(*pfd, vfio_err_notifier_handler, NULL, vdev); + +ret = ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, irq_set); +if (ret) { +DPRINTF("vfio: Error notification not supported for the device\n"); +qemu_set_fd_handler(*pfd, NULL, NULL, vdev); +event_notifier_cleanup(&vdev->err_notifier); +g_free(irq_set); +return; +} +g_free(irq_set); +vdev->pci_aer = 1; +return; +} +static void vfio_unregister_err_notifier(VFIODevice *vdev) +{ +int argsz; +struct vfio_irq_set *irq_set; +int32_t *pfd; +int ret; + +if (!vdev->pci_aer) { +return; +} + +argsz = sizeof(*irq_set) + sizeof(*pfd); + +irq_set = g_malloc0(argsz); +irq_set->argsz = argsz; +irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | + VFIO_IRQ_SET_ACTION_TRIGGER; +irq_set->index = VFIO_PCI_ERR_IRQ_INDEX; +irq_set->start = 0; +irq_set->count = 1; +pfd = (int32_t *)&irq_set->data; +*pfd = -1; + +ret = ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, irq_set); +if (ret) { +DPRINTF("vfio: Failed to de-assign error fd: %d\n", ret); +} +g_free(irq_set); +qemu_set_fd_handler(event_notifier_get_fd(&vdev->err_notifier), +NULL, NULL, vdev); +event_notifier_cleanup(&vdev->err_notifier); +return; +} static int vfio_initfn(PCIDevice *pdev) { VFIODevice *pvdev, *vdev = DO_UPCAST(VFIODevice, pdev, pdev); @@ -2032,6 +2134,8 @@ static int vfio_initfn(PCIDevice *pdev) } } +vfio_register_err_notifier(vdev); + return 0; out_teardown: @@ -2049,6 +2153,7 @@ static void vfio_exitfn(PCIDevice *pdev) VFIODevice *vdev = DO_UPCAST(VFIODevice, pdev, pdev); VFIOGroup *group = vdev->group; +vfio_unregister_err_notifier(vdev); pci_device_set_intx_routing_notifier(&vdev->pdev, NULL); vfio_disable_interrupts(vdev); if (vdev->intx.mmap_timer) { diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h index f787b72..6b20849 100644 --- a/linux-headers/linux/vfio.h +++ b/linux-headers/linux/vfio.h @@ -310,6 +310,7 @@ enum { VFIO_PCI_INTX_IRQ_INDEX, VFIO_PCI_MSI_IRQ_INDEX, VFIO_PCI_MSIX_IRQ_INDEX, + VFIO_PCI_ERR_IRQ_INDEX,