[Bug 1925496] Re: nvme disk cannot be hotplugged after removal
This issue has been moved to the new bug tracker here:

https://gitlab.com/qemu-project/qemu/-/issues/423

Thus let's close this version in the Launchpad tracker now.

** Bug watch added: gitlab.com/qemu-project/qemu/-/issues #423
   https://gitlab.com/qemu-project/qemu/-/issues/423

** Changed in: qemu
       Status: Incomplete => Invalid

--
You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1925496

Title:
  nvme disk cannot be hotplugged after removal

Status in QEMU:
  Invalid

Bug description:
  Hello,

  When I try to re-add an nvme disk shortly after removing it, I get an error about duplicate ID. See the following commands to reproduce. This happens consistently on all VMs that I tested:

  attach
  ==

  $VAR1 = {
      'arguments' => {
          'command-line' => 'drive_add auto "file=/dev/zvol/rpool/data/vm-2-disk-1,if=none,id=drive-nvme1,format=raw,cache=none,aio=native,detect-zeroes=on"'
      },
      'execute' => 'human-monitor-command'
  };
  $VAR1 = {
      'execute' => 'device_add',
      'arguments' => {
          'serial' => 'nvme1',
          'drive' => 'drive-nvme1',
          'driver' => 'nvme',
          'id' => 'nvme1'
      }
  };

  detach
  ===

  $VAR1 = {
      'arguments' => {
          'id' => 'nvme1'
      },
      'execute' => 'device_del'
  };
  $VAR1 = {
      'execute' => 'human-monitor-command',
      'arguments' => {
          'command-line' => 'drive_del drive-nvme1'
      }
  };

  reattach
  ===

  $VAR1 = {
      'arguments' => {
          'command-line' => 'drive_add auto "file=/dev/zvol/rpool/data/vm-2-disk-1,if=none,id=drive-nvme1,format=raw,cache=none,aio=native,detect-zeroes=on"'
      },
      'execute' => 'human-monitor-command'
  };

  and I get: "Duplicate ID 'drive-nvme1' for drive", although it does not show up in query-block or query-pci anymore after the first detach.

  Is this a bug or am I missing something? Please advise.

  Best regards,
  Oguz

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1925496/+subscriptions
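For readers who drive QEMU over a raw QMP socket rather than through Perl, the attach/detach/reattach steps from the report can be written out as JSON messages. This is only an illustrative sketch: the helper names (`hmp`, `attach_cmds`, `detach_cmds`) are made up for this example, and the commands and arguments are copied from the report as-is. It only builds the messages; it does not talk to QEMU.

```python
import json

# IDs and drive options are taken verbatim from the bug report.
DRIVE_ID = "drive-nvme1"
DEVICE_ID = "nvme1"
DRIVE_OPTS = (
    "file=/dev/zvol/rpool/data/vm-2-disk-1,if=none,id=" + DRIVE_ID
    + ",format=raw,cache=none,aio=native,detect-zeroes=on"
)


def hmp(command_line):
    """Wrap a human-monitor command line in a QMP message (hypothetical helper)."""
    return {
        "execute": "human-monitor-command",
        "arguments": {"command-line": command_line},
    }


def attach_cmds():
    """Messages for the 'attach' step: add the drive, then the nvme device."""
    return [
        hmp('drive_add auto "' + DRIVE_OPTS + '"'),
        {
            "execute": "device_add",
            "arguments": {
                "driver": "nvme",
                "id": DEVICE_ID,
                "drive": DRIVE_ID,
                "serial": "nvme1",
            },
        },
    ]


def detach_cmds():
    """Messages for the 'detach' step: remove the device, then the drive."""
    return [
        {"execute": "device_del", "arguments": {"id": DEVICE_ID}},
        hmp("drive_del " + DRIVE_ID),
    ]


if __name__ == "__main__":
    # attach, detach, then the drive_add from the 'reattach' step
    for msg in attach_cmds() + detach_cmds() + attach_cmds()[:1]:
        print(json.dumps(msg))
```

In the report, the last message of this sequence is the one that fails with the duplicate ID error.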
[Bug 1925496] Re: nvme disk cannot be hotplugged after removal
The QEMU project is currently moving its bug tracking to another system. For this we need to know how to transfer the bug to the new system (if still necessary). Thus we're setting the status to "Incomplete" now.

In the unlikely case that the bug has already been fixed in the latest upstream version of QEMU, then please close this ticket as "Fix released".

If it is not fixed yet and you think that this bug report here should be moved to the new system, then you have two options:

1) If you already have an account on gitlab.com, please open a new ticket for this problem in our new tracker here:

   https://gitlab.com/qemu-project/qemu/-/issues

   and then close this ticket here on Launchpad (or let it expire automatically after 60 days). Please mention the URL of this bug ticket on Launchpad in the new ticket on GitLab.

2) If you don't have an account on gitlab.com and don't intend to get one, but still would like to keep this ticket opened, then please switch the state back to "New" or "Confirmed" within the next 60 days (otherwise it will get closed as "Expired"). We will then eventually migrate the ticket automatically to the new system (but you won't be the reporter of the bug in the new system and thus you won't get notified on changes anymore).

Thank you and sorry for the inconvenience.

** Changed in: qemu
       Status: Confirmed => Incomplete
Re: [Bug 1925496] Re: nvme disk cannot be hotplugged after removal
On 5/3/21 9:27 AM, Klaus Jensen wrote:
> On Apr 28 15:00, Max Reitz wrote:
>> On 28.04.21 12:12, Klaus Jensen wrote:
>>> On Apr 28 09:31, Oguz Bektas wrote:
>>>>> My understanding is that this is the expected behavior. The reason is
>>>>> that the drive cannot be deleted immediately when the device is
>>>>> hot-unplugged, since it might not be safe (other parts of QEMU could
>>>>> be using it, like background block jobs).
>>>>>
>>>>> On the other hand, if the drive is removed explicitly through QMP (or
>>>>> in the monitor with drive_del), the drive id remains "in use". This
>>>>> might be a completely different bug that is unrelated to the nvme
>>>>> device.
>>>>
>>>> Using the same commands I can hot-plug and hot-unplug a scsi disk like
>>>> this without issue - this behavior only appeared on nvme devices.
>>>
>>> Kevin, Max, can you shed any light on this?
>>>
>>> Specifically, what is the expected behavior wrt. the drive when
>>> unplugging a device that has one attached?
>>>
>>> If the scsi disk is capable of "cleaning up" immediately, then I
>>> suppose that some steps are missing in the nvme unrealization.
>>
>
> Hi Max,
>
> Thanks for your help!
>
>> I’m not very strong when it comes to questions about guest devices, but
>> looking into the QAPI documentation, it says that when DEVICE_DELETED
>> is emitted, it’s safe to reuse the device ID. Before then, it isn’t,
>> because the guest may yet need to release the device.
>>
>
> This is specifically related to releasing the "drive", not the device.
> The problem is that when the device is deleted (using device_del), the
> pci device goes away rapidly, but the drive (as shown in `info block`)
> lingers for a couple of seconds before going into the "unlocked" state.
> If the drive is then deleted (`drive_del`) everything is good, but if
> the drive is deleted within those couple of seconds, the drive_del
> completes successfully, but the drive id never becomes available again.
>
>> So the questions that come to my mind are:
>> - Do nvme guest devices have a release protocol with the guest, so
>> that it just may take some time for the guest to release the device?
>> I.e. that this would be perfectly normal and documented behavior?
>> (Perhaps this just isn’t the case for scsi, or the guest just releases
>> those devices much quicker)
>>
>
> The NVMe device is a PCIDevice, I wonder if that is what adds some delay
> on this?
>
>> - Did Oguz always wait for the DEVICE_DELETED event before reusing the
>> ID? Sounds like it would be a bug if reusing the ID after getting
>> that event failed.
>>
>
> From the bug report, it does not look like anything like that is done.
>
> I basically don't understand the deletion protocol here and why the drive
> is not released immediately. Even if I add a call to
> blockdev_mark_auto_del() there is a delay, but the drive does get
> automatically deleted.

FWIW, I've just sent a patch to re-enable NVMe namespace hotplug; basically you can 'hot-delete' an nvme device via 'device_del ', but you cannot 'hot-add' an nvme device via 'device_add '. Or, rather, you can, but then you end up with an NVMe controller with no namespaces, which tends to be kinda pointless.

Cheers,

Hannes
--
Dr. Hannes Reinecke, Kernel Storage Architect
h...@suse.de, +49 911 74053 688
SUSE Software Solutions Germany GmbH, 90409 Nürnberg
GF: F. Imendörffer, HRB 36809 (AG Nürnberg)
Re: [Bug 1925496] Re: nvme disk cannot be hotplugged after removal
On Apr 28 15:00, Max Reitz wrote:
> On 28.04.21 12:12, Klaus Jensen wrote:
>> On Apr 28 09:31, Oguz Bektas wrote:
>>>> My understanding is that this is the expected behavior. The reason is
>>>> that the drive cannot be deleted immediately when the device is
>>>> hot-unplugged, since it might not be safe (other parts of QEMU could
>>>> be using it, like background block jobs).
>>>>
>>>> On the other hand, if the drive is removed explicitly through QMP (or
>>>> in the monitor with drive_del), the drive id remains "in use". This
>>>> might be a completely different bug that is unrelated to the nvme
>>>> device.
>>>
>>> Using the same commands I can hot-plug and hot-unplug a scsi disk like
>>> this without issue - this behavior only appeared on nvme devices.
>>
>> Kevin, Max, can you shed any light on this?
>>
>> Specifically, what is the expected behavior wrt. the drive when
>> unplugging a device that has one attached?
>>
>> If the scsi disk is capable of "cleaning up" immediately, then I
>> suppose that some steps are missing in the nvme unrealization.

Hi Max,

Thanks for your help!

> I’m not very strong when it comes to questions about guest devices, but
> looking into the QAPI documentation, it says that when DEVICE_DELETED
> is emitted, it’s safe to reuse the device ID. Before then, it isn’t,
> because the guest may yet need to release the device.

This is specifically related to releasing the "drive", not the device. The problem is that when the device is deleted (using device_del), the pci device goes away rapidly, but the drive (as shown in `info block`) lingers for a couple of seconds before going into the "unlocked" state. If the drive is then deleted (`drive_del`) everything is good, but if the drive is deleted within those couple of seconds, the drive_del completes successfully, but the drive id never becomes available again.

> So the questions that come to my mind are:
> - Do nvme guest devices have a release protocol with the guest, so
> that it just may take some time for the guest to release the device?
> I.e. that this would be perfectly normal and documented behavior?
> (Perhaps this just isn’t the case for scsi, or the guest just releases
> those devices much quicker)

The NVMe device is a PCIDevice, I wonder if that is what adds some delay on this?

> - Did Oguz always wait for the DEVICE_DELETED event before reusing the
> ID? Sounds like it would be a bug if reusing the ID after getting
> that event failed.

From the bug report, it does not look like anything like that is done.

I basically don't understand the deletion protocol here and why the drive is not released immediately. Even if I add a call to blockdev_mark_auto_del() there is a delay, but the drive does get automatically deleted.
Re: [Bug 1925496] Re: nvme disk cannot be hotplugged after removal
On 28.04.21 12:12, Klaus Jensen wrote:
> On Apr 28 09:31, Oguz Bektas wrote:
>>> My understanding is that this is the expected behavior. The reason is
>>> that the drive cannot be deleted immediately when the device is
>>> hot-unplugged, since it might not be safe (other parts of QEMU could
>>> be using it, like background block jobs).
>>>
>>> On the other hand, if the drive is removed explicitly through QMP (or
>>> in the monitor with drive_del), the drive id remains "in use". This
>>> might be a completely different bug that is unrelated to the nvme
>>> device.
>>
>> Using the same commands I can hot-plug and hot-unplug a scsi disk like
>> this without issue - this behavior only appeared on nvme devices.
>
> Kevin, Max, can you shed any light on this?
>
> Specifically, what is the expected behavior wrt. the drive when
> unplugging a device that has one attached?
>
> If the scsi disk is capable of "cleaning up" immediately, then I
> suppose that some steps are missing in the nvme unrealization.

I’m not very strong when it comes to questions about guest devices, but looking into the QAPI documentation, it says that when DEVICE_DELETED is emitted, it’s safe to reuse the device ID. Before then, it isn’t, because the guest may yet need to release the device.

So the questions that come to my mind are:

- Do nvme guest devices have a release protocol with the guest, so that it just may take some time for the guest to release the device? I.e. that this would be perfectly normal and documented behavior? (Perhaps this just isn’t the case for scsi, or the guest just releases those devices much quicker)

- Did Oguz always wait for the DEVICE_DELETED event before reusing the ID? Sounds like it would be a bug if reusing the ID after getting that event failed.

Max
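The ordering described above (issue `device_del`, then wait for `DEVICE_DELETED` before reusing the device ID) can be sketched as a small filter over incoming QMP messages. This is a minimal sketch under stated assumptions: the messages are assumed to be already parsed into Python dicts, `wait_for_device_deleted` is a hypothetical helper and not a QEMU API, and the sample stream is invented for illustration. Whether waiting for the event also frees the drive ID in the nvme case is exactly the question this thread leaves open.

```python
def wait_for_device_deleted(messages, device_id):
    """Consume parsed QMP messages until DEVICE_DELETED names device_id.

    Returns True once the matching event is seen, or False if the
    message stream ends first. Command replies and unrelated events
    are skipped.
    """
    for msg in messages:
        if msg.get("event") != "DEVICE_DELETED":
            continue
        # The event's data carries the device ID when the device has one.
        if msg.get("data", {}).get("device") == device_id:
            return True
    return False


if __name__ == "__main__":
    # Invented sample stream: a command reply, an unrelated event,
    # then the event we are waiting for.
    stream = [
        {"return": {}},
        {"event": "DEVICE_DELETED", "data": {"device": "other0"}},
        {"event": "DEVICE_DELETED",
         "data": {"device": "nvme1", "path": "/machine/peripheral/nvme1"}},
    ]
    if wait_for_device_deleted(stream, "nvme1"):
        print("nvme1 deleted; device ID may be reused")
```

A real client would read these messages from the QMP socket and would likely want a timeout, which this sketch omits.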
Re: [Bug 1925496] Re: nvme disk cannot be hotplugged after removal
On Apr 28 09:31, Oguz Bektas wrote:
>> My understanding is that this is the expected behavior. The reason is
>> that the drive cannot be deleted immediately when the device is
>> hot-unplugged, since it might not be safe (other parts of QEMU could
>> be using it, like background block jobs).
>>
>> On the other hand, if the drive is removed explicitly through QMP (or
>> in the monitor with drive_del), the drive id remains "in use". This
>> might be a completely different bug that is unrelated to the nvme
>> device.
>
> Using the same commands I can hot-plug and hot-unplug a scsi disk like
> this without issue - this behavior only appeared on nvme devices.

Kevin, Max, can you shed any light on this?

Specifically, what is the expected behavior wrt. the drive when unplugging a device that has one attached?

If the scsi disk is capable of "cleaning up" immediately, then I suppose that some steps are missing in the nvme unrealization.
[Bug 1925496] Re: nvme disk cannot be hotplugged after removal
> My understanding is that this is the expected behavior. The reason is
> that the drive cannot be deleted immediately when the device is
> hot-unplugged, since it might not be safe (other parts of QEMU could be
> using it, like background block jobs).

> On the other hand, if the drive is removed explicitly through QMP (or
> in the monitor with drive_del), the drive id remains "in use". This
> might be a completely different bug that is unrelated to the nvme
> device.

Using the same commands I can hot-plug and hot-unplug a scsi disk like this without issue - this behavior only appeared on nvme devices.
[Bug 1925496] Re: nvme disk cannot be hotplugged after removal
So, I had to investigate this a bit, since it is a part of QEMU that I'm not too familiar with.

My understanding is that this is the expected behavior. The reason is that the drive cannot be deleted immediately when the device is hot-unplugged, since it might not be safe (other parts of QEMU could be using it, like background block jobs).

What we *can* do is make sure we mark the drive for auto deletion when it is safe to do so. I'll add a patch for that.

On the other hand, if the drive is removed explicitly through QMP (or in the monitor with drive_del), the drive id remains "in use". This might be a completely different bug that is unrelated to the nvme device.
[Bug 1925496] Re: nvme disk cannot be hotplugged after removal
BTW Re: Regression, I think it's not, because this didn't work a year ago either, but I wasn't sure if it's a bug.
[Bug 1925496] Re: nvme disk cannot be hotplugged after removal
** Changed in: qemu
     Assignee: (unassigned) => Klaus Jensen (birkelund)

** Changed in: qemu
       Status: New => Confirmed
[Bug 1925496] Re: nvme disk cannot be hotplugged after removal
Hi, this is happening on qemu 5.2.0
[Bug 1925496] Re: nvme disk cannot be hotplugged after removal
Hi,

What QEMU version is this happening on? Is this the -rc4, is it a regression?