On 29.01.19 14:50, Pierre Morel wrote: > On 29/01/2019 11:24, David Hildenbrand wrote: >>>>> I'm wondering what the architecture says regarding those events -- can >>>>> someone with access to the documentation comment? >>>> >>>> Ping. Any comments from the IBM folks? > > Hi, > > Sorry to have wait so long. > At least Collin was faster. > >>>> >>> >>> >>> So the idea here is that if we have a PCI device that is the process of >>> being deconfigured and we are also in the middle of a reset, then let's >>> accelerate deconfiguring of the PCI device during the reset. Makes sense. > > to me too. > However, how do we ensure that the guest got time to respond to the > first deconfigure request?
30 seconds, then the reboot. On a reboot, I don't see why we should give a guest more time. "It's dead", rip out the card as the guest refused to hand it back. (maybe it crashed! but after a reboot the guest state is reset and baught back to life) > >>> >>> Note: >>> >>> The callback function will deconfigure the the device and put it into >>> standby mode. However, a PCI device should only go into standby from the >>> *disabled state* (which it could already be in due to the unplug >>> sequence), or from a *permanent error state* (something we should >>> hopefully never see -- this means something went seriously wrong with >>> the device). > > Not completely exact, the CHSC event 0x304, on the initiative of the > host, force the "deconfigure state" from "configure state" generally, > whatever sub-state it has (enabled/disabled/error...). > >> >> Right, this should already have been checked before setting up the timer. > > Apropos timer, do we need a timer or wouldn't it be better to use a > delay / a timer + condition? I don't think we need a timer at all. > > AFAIU we get out of the unplug without waiting for any answer from the > guest and we surely get the timer triggering after the reset has been done. > That seems bad. This is the case right now, correct. > > >> >>> >>> Two things I'm concerned about: >>> >>> 1) >>> >>> What I would suggest is adding a check for the pbdev->state for >>> ZPCI_FS_DISABLED || ZPCI_FS_PERMANENT_ERROR. If it is in either of these >>> states, then we're safe to deconfigure and put into standby. If the >> >> We setup a timer if !ZPCI_FS_STANDBY and !ZPCI_FS_RESERVED. >> >> So for >> - ZPCI_FS_DISABLED >> - ZPCI_FS_ENABLED >> - ZPCI_FS_BLOCKED >> - ZPCI_FS_ERROR >> - ZPCI_FS_PERMANENT_ERROR >> >> We setup a timer and simply go ahead and unplug the device when the >> timer expires ("forced unplug"). > > I agree only for ZPCI_FS_ENABLED why do we need to be smooth for other > states? > ZPCI_FS_DISABLED may be a candidate even I do not see the interrest but > other states of the device should issue a HP_EVENT_DECONFIGURE_REQUEST > and we do not need a timer (or any delay) You can always expect that your guest driver is dead. > >> >> Changing that behavior might be more invasive. Simply not unplugging in >> s390_pcihost_timer_cb() on some of these states would mean that somebody >> issued a request and that requests is simply lost/ignored. Not sure if >> that is what we want. I think we need separate patches to change >> something like that. Especially >> >> 1. What happens if the device was in ZPCI_FS_DISABLED, the guest ignores >> the unplug request and moves the device to ZPCI_FS_ENABLED before the >> timer expires? These are corner cases to consider. > > +1, we must ensure to do the work inside the unplug CB. > >> >> 2. Do we need a timer at all? Now that Patch #1 introduces >> unplug_requests, we are free to ignore requests from the user if the >> guest is not reacting. I would really favor getting rid of the timer >> completely. Was there a special reason why this was introduced? > > Yes, to let a chance to the guest to smoothly relinquish the device. > (for example sync/clean the disk) > However I do not think it is right implemented. > >> >> No other architecture (e.g. ACPI) uses such a timer. They use a simple >> flag to remember if a request is pending. I would really favor going >> into that direction. > > I am not sure that the Intel architecture is a good example. :) Right, we all learned that zPCI did it better. (sorry ;) ) > > AFAIU they do not wait for the guest to have relinquish the device. > Or do they? > How long do they wait? They wait for ever. And I guess we should do the same thing. If the guest driver is broken (and this is really a rare scenario!) we would not get the device back. Which is perfectly fine in my point of view. In all other scenarios I guess the guest will simply respond in a timely manner. And ripping out stuff from the guest always feels wrong. (yes the channel subsystem works like that, but here we actually have a choice) If we reboot, we can unplug the device. Otherwise, let's keep it simply and don't use a timer. Thanks! -- Thanks, David / dhildenb