Re: [Xen-devel] [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer
On 28/09/2018 00:03, Sander Eikelenboom wrote: > On 27/09/18 23:48, Boris Ostrovsky wrote: >> On 9/27/18 5:37 PM, Jens Axboe wrote: >>> On 9/27/18 2:33 PM, Sander Eikelenboom wrote: On 27/09/18 21:06, Boris Ostrovsky wrote: > On 9/27/18 2:56 PM, Jens Axboe wrote: >> On 9/27/18 12:52 PM, Sander Eikelenboom wrote: >>> On 27/09/18 16:26, Jens Axboe wrote: On 9/27/18 1:12 AM, Juergen Gross wrote: > On 22/09/18 21:55, Boris Ostrovsky wrote: >> Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants") >> added support for purging persistent grants when they are not in >> use. As >> part of the purge, the grants were removed from the grant buffer, >> This >> eventually causes the buffer to become empty, with BUG_ON triggered >> in >> get_free_grant(). This can be observed even on an idle system, within >> 20-30 minutes. >> >> We should keep the grants in the buffer when purging, and only free >> the >> grant ref. >> >> Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants") >> Signed-off-by: Boris Ostrovsky > Reviewed-by: Juergen Gross Since Konrad is out, I'm going to queue this up for 4.19. >>> Hi Boris/Juergen. >>> >>> Last week i tested a linux-4.19-rc4 kernel with xen-next and this patch >>> from Boris pulled on top. >>> Unfortunately it made a VM hang (probably because it's rootFS is >>> shuffled from under it's feet > What do you mean by "rootFS is shuffled from under it's feet " ? Assumption that block-front getting borked and either a kernel crash or rootfs becoming mounted readonly. Didn't (try) to check though. 
>>> and it gave these in dom0 dmesg: >>> >>> [ 9251.696090] xen-blkback: requesting a grant already in use >>> [ 9251.705861] xen-blkback: trying to add a gref that's already in the >>> tree >>> [ 9251.715781] xen-blkback: requesting a grant already in use >>> [ 9251.725756] xen-blkback: trying to add a gref that's already in the >>> tree >>> [ 9251.735698] xen-blkback: requesting a grant already in use >>> [ 9251.745573] xen-blkback: trying to add a gref that's already in the >>> tree >>> >>> The VM was a HVM with 4 vcpu's and 2 phy disks: >>> xen-blkback: backend/vbd/14/768: using 4 queues, protocol 1 >>> (x86_64-abi) persistent grants >>> xen-blkback: backend/vbd/14/832: using 4 queues, protocol 1 >>> (x86_64-abi) persistent grants >>> >>> >>> Currently i have been running 4.19-rc5 with xen-next on top and commit >>> a46b53672b2c reverted, for a couple of days. That seems to run stable >>> for me (since it's a small box so i'm not hit by what a46b53672b2c >>> tried to fix. >>> >>> If you can come up with a debug patch i can give that a spin tomorrow >>> evening or in the weekend, so we are hopefully still in time for the >>> 4.19 release. >> At this late in the game, might make more sense to simply revert the >> buggy commit. Especially since what is currently out there doesn't fix >> the issue for you. Don't know if Boris or Juergen have a hunch about the issue, if not perhaps a revert is the best. >>> Anyone? Unless I hear otherwise, I'll revert the series tomorrow. >> >> Juergen may have something to say by tomorrow, but from my perspective, >> given that we are coming up on rc6 --- yes. >> >> I looked at the patches again and didn't see anything obvious. >> >> -boris > > Could also be that what i hit is a latent bug, > that is not caused by these patches but merely got uncovered by them. 
>
> xl dmesg also shows quite some:
> (XEN) [2018-09-24 03:15:46.847] grant_table.c:1755:d14v0 Expanding d14 grant table from 19 to 20 frames
> (XEN) [2018-09-24 03:15:46.849] grant_table.c:1755:d14v0 Expanding d14 grant table from 20 to 21 frames
> (and has done that for ages on my box not leading to any direct problems to
> my knowledge)
>
> I don't know if there could be related and something around the (persistent)
> grants for block devices could be leaking under some conditions?

I could reproduce the issue Boris has seen and I have found the fault in
his patch. Just testing a fix.

Juergen
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer
On 27/09/18 23:48, Boris Ostrovsky wrote: > On 9/27/18 5:37 PM, Jens Axboe wrote: >> On 9/27/18 2:33 PM, Sander Eikelenboom wrote: >>> On 27/09/18 21:06, Boris Ostrovsky wrote: On 9/27/18 2:56 PM, Jens Axboe wrote: > On 9/27/18 12:52 PM, Sander Eikelenboom wrote: >> On 27/09/18 16:26, Jens Axboe wrote: >>> On 9/27/18 1:12 AM, Juergen Gross wrote: On 22/09/18 21:55, Boris Ostrovsky wrote: > Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants") > added support for purging persistent grants when they are not in use. > As > part of the purge, the grants were removed from the grant buffer, This > eventually causes the buffer to become empty, with BUG_ON triggered in > get_free_grant(). This can be observed even on an idle system, within > 20-30 minutes. > > We should keep the grants in the buffer when purging, and only free > the > grant ref. > > Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants") > Signed-off-by: Boris Ostrovsky Reviewed-by: Juergen Gross >>> Since Konrad is out, I'm going to queue this up for 4.19. >>> >> Hi Boris/Juergen. >> >> Last week i tested a linux-4.19-rc4 kernel with xen-next and this patch >> from Boris pulled on top. >> Unfortunately it made a VM hang (probably because it's rootFS is >> shuffled from under it's feet What do you mean by "rootFS is shuffled from under it's feet " ? >>> Assumption that block-front getting borked and either a kernel crash or >>> rootfs becoming mounted readonly. Didn't (try) to check though. 
>>> >> and it gave these in dom0 dmesg: >> >> [ 9251.696090] xen-blkback: requesting a grant already in use >> [ 9251.705861] xen-blkback: trying to add a gref that's already in the >> tree >> [ 9251.715781] xen-blkback: requesting a grant already in use >> [ 9251.725756] xen-blkback: trying to add a gref that's already in the >> tree >> [ 9251.735698] xen-blkback: requesting a grant already in use >> [ 9251.745573] xen-blkback: trying to add a gref that's already in the >> tree >> >> The VM was a HVM with 4 vcpu's and 2 phy disks: >> xen-blkback: backend/vbd/14/768: using 4 queues, protocol 1 (x86_64-abi) >> persistent grants >> xen-blkback: backend/vbd/14/832: using 4 queues, protocol 1 (x86_64-abi) >> persistent grants >> >> >> Currently i have been running 4.19-rc5 with xen-next on top and commit >> a46b53672b2c reverted, for a couple of days. That seems to run stable >> for me (since it's a small box so i'm not hit by what a46b53672b2c >> tried to fix. >> >> If you can come up with a debug patch i can give that a spin tomorrow >> evening or in the weekend, so we are hopefully still in time for the >> 4.19 release. > At this late in the game, might make more sense to simply revert the > buggy commit. Especially since what is currently out there doesn't fix > the issue for you. >>> Don't know if Boris or Juergen have a hunch about the issue, if not >>> perhaps a revert is the best. >> Anyone? Unless I hear otherwise, I'll revert the series tomorrow. > > Juergen may have something to say by tomorrow, but from my perspective, > given that we are coming up on rc6 --- yes. > > I looked at the patches again and didn't see anything obvious. > > -boris Could also be that what i hit is a latent bug, that is not caused by these patches but merely got uncovered by them. 

xl dmesg also shows quite some:
(XEN) [2018-09-24 03:15:46.847] grant_table.c:1755:d14v0 Expanding d14 grant table from 19 to 20 frames
(XEN) [2018-09-24 03:15:46.849] grant_table.c:1755:d14v0 Expanding d14 grant table from 20 to 21 frames
(and it has done that for ages on my box, not leading to any direct problems
to my knowledge)

I don't know if these could be related, and whether something around the
(persistent) grants for block devices could be leaking under some conditions?

-- 
Sander
Re: [Xen-devel] [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer
On 9/27/18 5:37 PM, Jens Axboe wrote: > On 9/27/18 2:33 PM, Sander Eikelenboom wrote: >> On 27/09/18 21:06, Boris Ostrovsky wrote: >>> On 9/27/18 2:56 PM, Jens Axboe wrote: On 9/27/18 12:52 PM, Sander Eikelenboom wrote: > On 27/09/18 16:26, Jens Axboe wrote: >> On 9/27/18 1:12 AM, Juergen Gross wrote: >>> On 22/09/18 21:55, Boris Ostrovsky wrote: Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants") added support for purging persistent grants when they are not in use. As part of the purge, the grants were removed from the grant buffer, This eventually causes the buffer to become empty, with BUG_ON triggered in get_free_grant(). This can be observed even on an idle system, within 20-30 minutes. We should keep the grants in the buffer when purging, and only free the grant ref. Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants") Signed-off-by: Boris Ostrovsky >>> Reviewed-by: Juergen Gross >> Since Konrad is out, I'm going to queue this up for 4.19. >> > Hi Boris/Juergen. > > Last week i tested a linux-4.19-rc4 kernel with xen-next and this patch > from Boris pulled on top. > Unfortunately it made a VM hang (probably because it's rootFS is shuffled > from under it's feet >>> What do you mean by "rootFS is shuffled from under it's feet " ? >> Assumption that block-front getting borked and either a kernel crash or >> rootfs becoming mounted readonly. Didn't (try) to check though. 
>> > and it gave these in dom0 dmesg: > > [ 9251.696090] xen-blkback: requesting a grant already in use > [ 9251.705861] xen-blkback: trying to add a gref that's already in the > tree > [ 9251.715781] xen-blkback: requesting a grant already in use > [ 9251.725756] xen-blkback: trying to add a gref that's already in the > tree > [ 9251.735698] xen-blkback: requesting a grant already in use > [ 9251.745573] xen-blkback: trying to add a gref that's already in the > tree > > The VM was a HVM with 4 vcpu's and 2 phy disks: > xen-blkback: backend/vbd/14/768: using 4 queues, protocol 1 (x86_64-abi) > persistent grants > xen-blkback: backend/vbd/14/832: using 4 queues, protocol 1 (x86_64-abi) > persistent grants > > > Currently i have been running 4.19-rc5 with xen-next on top and commit > a46b53672b2c reverted, for a couple of days. That seems to run stable > for me (since it's a small box so i'm not hit by what a46b53672b2c > tried to fix. > > If you can come up with a debug patch i can give that a spin tomorrow > evening or in the weekend, so we are hopefully still in time for the > 4.19 release. At this late in the game, might make more sense to simply revert the buggy commit. Especially since what is currently out there doesn't fix the issue for you. >> Don't know if Boris or Juergen have a hunch about the issue, if not >> perhaps a revert is the best. > Anyone? Unless I hear otherwise, I'll revert the series tomorrow. Juergen may have something to say by tomorrow, but from my perspective, given that we are coming up on rc6 --- yes. I looked at the patches again and didn't see anything obvious. -boris ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer
On 9/27/18 2:33 PM, Sander Eikelenboom wrote: > On 27/09/18 21:06, Boris Ostrovsky wrote: >> On 9/27/18 2:56 PM, Jens Axboe wrote: >>> On 9/27/18 12:52 PM, Sander Eikelenboom wrote: On 27/09/18 16:26, Jens Axboe wrote: > On 9/27/18 1:12 AM, Juergen Gross wrote: >> On 22/09/18 21:55, Boris Ostrovsky wrote: >>> Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants") >>> added support for purging persistent grants when they are not in use. As >>> part of the purge, the grants were removed from the grant buffer, This >>> eventually causes the buffer to become empty, with BUG_ON triggered in >>> get_free_grant(). This can be observed even on an idle system, within >>> 20-30 minutes. >>> >>> We should keep the grants in the buffer when purging, and only free the >>> grant ref. >>> >>> Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants") >>> Signed-off-by: Boris Ostrovsky >> Reviewed-by: Juergen Gross > Since Konrad is out, I'm going to queue this up for 4.19. > Hi Boris/Juergen. Last week i tested a linux-4.19-rc4 kernel with xen-next and this patch from Boris pulled on top. Unfortunately it made a VM hang (probably because it's rootFS is shuffled from under it's feet >> >> What do you mean by "rootFS is shuffled from under it's feet " ? > > Assumption that block-front getting borked and either a kernel crash or > rootfs becoming mounted readonly. Didn't (try) to check though. 
> and it gave these in dom0 dmesg: [ 9251.696090] xen-blkback: requesting a grant already in use [ 9251.705861] xen-blkback: trying to add a gref that's already in the tree [ 9251.715781] xen-blkback: requesting a grant already in use [ 9251.725756] xen-blkback: trying to add a gref that's already in the tree [ 9251.735698] xen-blkback: requesting a grant already in use [ 9251.745573] xen-blkback: trying to add a gref that's already in the tree The VM was a HVM with 4 vcpu's and 2 phy disks: xen-blkback: backend/vbd/14/768: using 4 queues, protocol 1 (x86_64-abi) persistent grants xen-blkback: backend/vbd/14/832: using 4 queues, protocol 1 (x86_64-abi) persistent grants Currently i have been running 4.19-rc5 with xen-next on top and commit a46b53672b2c reverted, for a couple of days. That seems to run stable for me (since it's a small box so i'm not hit by what a46b53672b2c tried to fix. If you can come up with a debug patch i can give that a spin tomorrow evening or in the weekend, so we are hopefully still in time for the 4.19 release. >>> At this late in the game, might make more sense to simply revert the >>> buggy commit. Especially since what is currently out there doesn't fix >>> the issue for you. > > Don't know if Boris or Juergen have a hunch about the issue, if not > perhaps a revert is the best. Anyone? Unless I hear otherwise, I'll revert the series tomorrow. -- Jens Axboe ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer
On 27/09/18 21:06, Boris Ostrovsky wrote: > On 9/27/18 2:56 PM, Jens Axboe wrote: >> On 9/27/18 12:52 PM, Sander Eikelenboom wrote: >>> On 27/09/18 16:26, Jens Axboe wrote: On 9/27/18 1:12 AM, Juergen Gross wrote: > On 22/09/18 21:55, Boris Ostrovsky wrote: >> Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants") >> added support for purging persistent grants when they are not in use. As >> part of the purge, the grants were removed from the grant buffer, This >> eventually causes the buffer to become empty, with BUG_ON triggered in >> get_free_grant(). This can be observed even on an idle system, within >> 20-30 minutes. >> >> We should keep the grants in the buffer when purging, and only free the >> grant ref. >> >> Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants") >> Signed-off-by: Boris Ostrovsky > Reviewed-by: Juergen Gross Since Konrad is out, I'm going to queue this up for 4.19. >>> Hi Boris/Juergen. >>> >>> Last week i tested a linux-4.19-rc4 kernel with xen-next and this patch >>> from Boris pulled on top. >>> Unfortunately it made a VM hang (probably because it's rootFS is shuffled >>> from under it's feet > > What do you mean by "rootFS is shuffled from under it's feet " ? Assumption that block-front getting borked and either a kernel crash or rootfs becoming mounted readonly. Didn't (try) to check though. 
>>> and it gave these in dom0 dmesg:
>>>
>>> [ 9251.696090] xen-blkback: requesting a grant already in use
>>> [ 9251.705861] xen-blkback: trying to add a gref that's already in the tree
>>> [ 9251.715781] xen-blkback: requesting a grant already in use
>>> [ 9251.725756] xen-blkback: trying to add a gref that's already in the tree
>>> [ 9251.735698] xen-blkback: requesting a grant already in use
>>> [ 9251.745573] xen-blkback: trying to add a gref that's already in the tree
>>>
>>> The VM was a HVM with 4 vcpu's and 2 phy disks:
>>> xen-blkback: backend/vbd/14/768: using 4 queues, protocol 1 (x86_64-abi) persistent grants
>>> xen-blkback: backend/vbd/14/832: using 4 queues, protocol 1 (x86_64-abi) persistent grants
>>>
>>>
>>> Currently i have been running 4.19-rc5 with xen-next on top and commit
>>> a46b53672b2c reverted, for a couple of days. That seems to run stable
>>> for me (since it's a small box so i'm not hit by what a46b53672b2c
>>> tried to fix.
>>>
>>> If you can come up with a debug patch i can give that a spin tomorrow
>>> evening or in the weekend, so we are hopefully still in time for the
>>> 4.19 release.
>> At this late in the game, might make more sense to simply revert the
>> buggy commit. Especially since what is currently out there doesn't fix
>> the issue for you.

Don't know if Boris or Juergen have a hunch about the issue; if not,
perhaps a revert is the best.

> If decision is to revert then I think the whole series needs to be
> reverted.
>
> -boris
>

For Boris and Juergen:

Would it make sense to have an "xen-next" branch in the xen-tip tree that is:
- based on the previous stable kernel
- and has the for-linus branches for the upcoming kernel release on top;
- and has the patches for net(-next) and block changes on top
  (since these don't go via the tree but only via mailing-list patches);
  (which are scattered, difficult to track and use for automated testing)
- and dependency patches for the above if necessary to be able to build.

So there is one branch that can be used to test ALL pending kernel-related
Xen patches, and which could be used in OSStest without as many potential
false alarms as linux-next will have?

-- 
Sander
Re: [Xen-devel] [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer
On 9/27/18 1:06 PM, Boris Ostrovsky wrote: > On 9/27/18 2:56 PM, Jens Axboe wrote: >> On 9/27/18 12:52 PM, Sander Eikelenboom wrote: >>> On 27/09/18 16:26, Jens Axboe wrote: On 9/27/18 1:12 AM, Juergen Gross wrote: > On 22/09/18 21:55, Boris Ostrovsky wrote: >> Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants") >> added support for purging persistent grants when they are not in use. As >> part of the purge, the grants were removed from the grant buffer, This >> eventually causes the buffer to become empty, with BUG_ON triggered in >> get_free_grant(). This can be observed even on an idle system, within >> 20-30 minutes. >> >> We should keep the grants in the buffer when purging, and only free the >> grant ref. >> >> Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants") >> Signed-off-by: Boris Ostrovsky > Reviewed-by: Juergen Gross Since Konrad is out, I'm going to queue this up for 4.19. >>> Hi Boris/Juergen. >>> >>> Last week i tested a linux-4.19-rc4 kernel with xen-next and this patch >>> from Boris pulled on top. >>> Unfortunately it made a VM hang (probably because it's rootFS is shuffled >>> from under it's feet > > What do you mean by "rootFS is shuffled from under it's feet " ? 
> >>> and it gave these in dom0 dmesg: >>> >>> [ 9251.696090] xen-blkback: requesting a grant already in use >>> [ 9251.705861] xen-blkback: trying to add a gref that's already in the tree >>> [ 9251.715781] xen-blkback: requesting a grant already in use >>> [ 9251.725756] xen-blkback: trying to add a gref that's already in the tree >>> [ 9251.735698] xen-blkback: requesting a grant already in use >>> [ 9251.745573] xen-blkback: trying to add a gref that's already in the tree >>> >>> The VM was a HVM with 4 vcpu's and 2 phy disks: >>> xen-blkback: backend/vbd/14/768: using 4 queues, protocol 1 (x86_64-abi) >>> persistent grants >>> xen-blkback: backend/vbd/14/832: using 4 queues, protocol 1 (x86_64-abi) >>> persistent grants >>> >>> >>> Currently i have been running 4.19-rc5 with xen-next on top and commit >>> a46b53672b2c reverted, for a couple of days. That seems to run stable >>> for me (since it's a small box so i'm not hit by what a46b53672b2c >>> tried to fix. >>> >>> If you can come up with a debug patch i can give that a spin tomorrow >>> evening or in the weekend, so we are hopefully still in time for the >>> 4.19 release. >> At this late in the game, might make more sense to simply revert the >> buggy commit. Especially since what is currently out there doesn't fix >> the issue for you. > > If decision is to revert then I think the whole series needs to be > reverted. Yes, definitely. -- Jens Axboe ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer
On 9/27/18 2:56 PM, Jens Axboe wrote: > On 9/27/18 12:52 PM, Sander Eikelenboom wrote: >> On 27/09/18 16:26, Jens Axboe wrote: >>> On 9/27/18 1:12 AM, Juergen Gross wrote: On 22/09/18 21:55, Boris Ostrovsky wrote: > Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants") > added support for purging persistent grants when they are not in use. As > part of the purge, the grants were removed from the grant buffer, This > eventually causes the buffer to become empty, with BUG_ON triggered in > get_free_grant(). This can be observed even on an idle system, within > 20-30 minutes. > > We should keep the grants in the buffer when purging, and only free the > grant ref. > > Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants") > Signed-off-by: Boris Ostrovsky Reviewed-by: Juergen Gross >>> Since Konrad is out, I'm going to queue this up for 4.19. >>> >> Hi Boris/Juergen. >> >> Last week i tested a linux-4.19-rc4 kernel with xen-next and this patch from >> Boris pulled on top. >> Unfortunately it made a VM hang (probably because it's rootFS is shuffled >> from under it's feet What do you mean by "rootFS is shuffled from under it's feet " ? 
>> and it gave these in dom0 dmesg: >> >> [ 9251.696090] xen-blkback: requesting a grant already in use >> [ 9251.705861] xen-blkback: trying to add a gref that's already in the tree >> [ 9251.715781] xen-blkback: requesting a grant already in use >> [ 9251.725756] xen-blkback: trying to add a gref that's already in the tree >> [ 9251.735698] xen-blkback: requesting a grant already in use >> [ 9251.745573] xen-blkback: trying to add a gref that's already in the tree >> >> The VM was a HVM with 4 vcpu's and 2 phy disks: >> xen-blkback: backend/vbd/14/768: using 4 queues, protocol 1 (x86_64-abi) >> persistent grants >> xen-blkback: backend/vbd/14/832: using 4 queues, protocol 1 (x86_64-abi) >> persistent grants >> >> >> Currently i have been running 4.19-rc5 with xen-next on top and commit >> a46b53672b2c reverted, for a couple of days. That seems to run stable >> for me (since it's a small box so i'm not hit by what a46b53672b2c >> tried to fix. >> >> If you can come up with a debug patch i can give that a spin tomorrow >> evening or in the weekend, so we are hopefully still in time for the >> 4.19 release. > At this late in the game, might make more sense to simply revert the > buggy commit. Especially since what is currently out there doesn't fix > the issue for you. If decision is to revert then I think the whole series needs to be reverted. -boris ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer
On 9/27/18 12:52 PM, Sander Eikelenboom wrote: > On 27/09/18 16:26, Jens Axboe wrote: >> On 9/27/18 1:12 AM, Juergen Gross wrote: >>> On 22/09/18 21:55, Boris Ostrovsky wrote: Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants") added support for purging persistent grants when they are not in use. As part of the purge, the grants were removed from the grant buffer, This eventually causes the buffer to become empty, with BUG_ON triggered in get_free_grant(). This can be observed even on an idle system, within 20-30 minutes. We should keep the grants in the buffer when purging, and only free the grant ref. Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants") Signed-off-by: Boris Ostrovsky >>> >>> Reviewed-by: Juergen Gross >> >> Since Konrad is out, I'm going to queue this up for 4.19. >> > > Hi Boris/Juergen. > > Last week i tested a linux-4.19-rc4 kernel with xen-next and this patch from > Boris pulled on top. > Unfortunately it made a VM hang (probably because it's rootFS is shuffled > from under it's feet > and it gave these in dom0 dmesg: > > [ 9251.696090] xen-blkback: requesting a grant already in use > [ 9251.705861] xen-blkback: trying to add a gref that's already in the tree > [ 9251.715781] xen-blkback: requesting a grant already in use > [ 9251.725756] xen-blkback: trying to add a gref that's already in the tree > [ 9251.735698] xen-blkback: requesting a grant already in use > [ 9251.745573] xen-blkback: trying to add a gref that's already in the tree > > The VM was a HVM with 4 vcpu's and 2 phy disks: > xen-blkback: backend/vbd/14/768: using 4 queues, protocol 1 (x86_64-abi) > persistent grants > xen-blkback: backend/vbd/14/832: using 4 queues, protocol 1 (x86_64-abi) > persistent grants > > > Currently i have been running 4.19-rc5 with xen-next on top and commit > a46b53672b2c reverted, for a couple of days. That seems to run stable > for me (since it's a small box so i'm not hit by what a46b53672b2c > tried to fix. 
>
> If you can come up with a debug patch i can give that a spin tomorrow
> evening or in the weekend, so we are hopefully still in time for the
> 4.19 release.

At this late in the game, might make more sense to simply revert the
buggy commit. Especially since what is currently out there doesn't fix
the issue for you.

-- 
Jens Axboe
Re: [Xen-devel] [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer
On 27/09/18 16:26, Jens Axboe wrote:
> On 9/27/18 1:12 AM, Juergen Gross wrote:
>> On 22/09/18 21:55, Boris Ostrovsky wrote:
>>> Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>> added support for purging persistent grants when they are not in use. As
>>> part of the purge, the grants were removed from the grant buffer, This
>>> eventually causes the buffer to become empty, with BUG_ON triggered in
>>> get_free_grant(). This can be observed even on an idle system, within
>>> 20-30 minutes.
>>>
>>> We should keep the grants in the buffer when purging, and only free the
>>> grant ref.
>>>
>>> Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>> Signed-off-by: Boris Ostrovsky
>>
>> Reviewed-by: Juergen Gross
>
> Since Konrad is out, I'm going to queue this up for 4.19.
>

Hi Boris/Juergen.

Last week I tested a linux-4.19-rc4 kernel with xen-next and this patch
from Boris pulled on top.
Unfortunately it made a VM hang (probably because its rootFS is shuffled
from under its feet) and it gave these in dom0 dmesg:

[ 9251.696090] xen-blkback: requesting a grant already in use
[ 9251.705861] xen-blkback: trying to add a gref that's already in the tree
[ 9251.715781] xen-blkback: requesting a grant already in use
[ 9251.725756] xen-blkback: trying to add a gref that's already in the tree
[ 9251.735698] xen-blkback: requesting a grant already in use
[ 9251.745573] xen-blkback: trying to add a gref that's already in the tree

The VM was an HVM with 4 vCPUs and 2 phy disks:
xen-blkback: backend/vbd/14/768: using 4 queues, protocol 1 (x86_64-abi) persistent grants
xen-blkback: backend/vbd/14/832: using 4 queues, protocol 1 (x86_64-abi) persistent grants

Currently I have been running 4.19-rc5 with xen-next on top and commit
a46b53672b2c reverted, for a couple of days. That seems to run stable
for me (since it's a small box, I'm not hit by what a46b53672b2c
tried to fix).

If you can come up with a debug patch I can give that a spin tomorrow
evening or in the weekend, so we are hopefully still in time for the
4.19 release.

-- 
Sander
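
To make the two dom0 messages in this report easier to read, here is a minimal user-space model of bookkeeping that would produce them. It is only a sketch, assuming the backend tracks persistent grants by gref and marks an entry while a request is using it; `struct gnt_table`, `track_gnt()`, `claim_gnt()`, `release_gnt()` and the status codes are hypothetical names for illustration, not the xen-blkback code, and the sketch says nothing about why the frontend triggered these paths.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_GNTS 8 /* hypothetical table size for this model */

enum gnt_status {
    GNT_OK = 0,
    GNT_IN_USE = -1,    /* models "requesting a grant already in use" */
    GNT_DUPLICATE = -2, /* models "trying to add a gref that's already in the tree" */
    GNT_FULL = -3,
    GNT_UNKNOWN = -4
};

struct tracked_gnt {
    unsigned int gref;
    int present; /* slot holds a tracked grant */
    int active;  /* a request is currently using the grant */
};

struct gnt_table {
    struct tracked_gnt slot[MAX_GNTS];
};

/* Mark every slot as empty. */
static void gnt_table_init(struct gnt_table *t)
{
    assert(t != NULL);
    for (int i = 0; i < MAX_GNTS; i++) {
        t->slot[i].gref = 0;
        t->slot[i].present = 0;
        t->slot[i].active = 0;
    }
}

/* Linear lookup by gref; returns NULL when the gref is not tracked. */
static struct tracked_gnt *find_gnt(struct gnt_table *t, unsigned int gref)
{
    for (int i = 0; i < MAX_GNTS; i++)
        if (t->slot[i].present && t->slot[i].gref == gref)
            return &t->slot[i];
    return NULL;
}

/* Start tracking a gref; refuse if the same gref is already tracked. */
static int track_gnt(struct gnt_table *t, unsigned int gref)
{
    if (find_gnt(t, gref))
        return GNT_DUPLICATE;
    for (int i = 0; i < MAX_GNTS; i++) {
        if (!t->slot[i].present) {
            t->slot[i].gref = gref;
            t->slot[i].present = 1;
            t->slot[i].active = 0;
            return GNT_OK;
        }
    }
    return GNT_FULL;
}

/* Claim a tracked gref for a request; refuse if it is already claimed. */
static int claim_gnt(struct gnt_table *t, unsigned int gref)
{
    struct tracked_gnt *g = find_gnt(t, gref);

    if (!g)
        return GNT_UNKNOWN;
    if (g->active)
        return GNT_IN_USE;
    g->active = 1;
    return GNT_OK;
}

/* Request finished: the grant may be claimed again. */
static void release_gnt(struct gnt_table *t, unsigned int gref)
{
    struct tracked_gnt *g = find_gnt(t, gref);

    if (g)
        g->active = 0;
}
```

In this model, claiming gref 42 twice without a release in between returns `GNT_IN_USE`, and tracking gref 42 a second time returns `GNT_DUPLICATE`. Both correspond to the same gref reaching the backend while its earlier use is still on record.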
Re: [Xen-devel] [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer
On 9/27/18 1:12 AM, Juergen Gross wrote:
> On 22/09/18 21:55, Boris Ostrovsky wrote:
>> Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>> added support for purging persistent grants when they are not in use. As
>> part of the purge, the grants were removed from the grant buffer, This
>> eventually causes the buffer to become empty, with BUG_ON triggered in
>> get_free_grant(). This can be observed even on an idle system, within
>> 20-30 minutes.
>>
>> We should keep the grants in the buffer when purging, and only free the
>> grant ref.
>>
>> Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>> Signed-off-by: Boris Ostrovsky
>
> Reviewed-by: Juergen Gross

Since Konrad is out, I'm going to queue this up for 4.19.

-- 
Jens Axboe
Re: [Xen-devel] [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer
On 22/09/18 21:55, Boris Ostrovsky wrote:
> Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
> added support for purging persistent grants when they are not in use. As
> part of the purge, the grants were removed from the grant buffer, This
> eventually causes the buffer to become empty, with BUG_ON triggered in
> get_free_grant(). This can be observed even on an idle system, within
> 20-30 minutes.
>
> We should keep the grants in the buffer when purging, and only free the
> grant ref.
>
> Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
> Signed-off-by: Boris Ostrovsky

Reviewed-by: Juergen Gross

Juergen
[Xen-devel] [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer
Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
added support for purging persistent grants when they are not in use. As
part of the purge, the grants were removed from the grant buffer. This
eventually causes the buffer to become empty, with BUG_ON triggered in
get_free_grant(). This can be observed even on an idle system, within
20-30 minutes.

We should keep the grants in the buffer when purging, and only free the
grant ref.

Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
Signed-off-by: Boris Ostrovsky
---
 drivers/block/xen-blkfront.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index a71d817e900d..3b441fe69c0d 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -2667,11 +2667,9 @@ static void purge_persistent_grants(struct blkfront_info *info)
 			    gnttab_query_foreign_access(gnt_list_entry->gref))
 				continue;
 
-			list_del(&gnt_list_entry->node);
 			gnttab_end_foreign_access(gnt_list_entry->gref, 0, 0UL);
+			gnt_list_entry->gref = GRANT_INVALID_REF;
 			rinfo->persistent_gnts_c--;
-			__free_page(gnt_list_entry->page);
-			kfree(gnt_list_entry);
 		}
 
 		spin_unlock_irqrestore(&rinfo->ring_lock, flags);
-- 
2.17.0
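
The difference between the two purge behaviours described in the commit message can be modelled in a few lines of user-space C. This is only a sketch under stated assumptions: `struct grant`, `struct ring`, `purge_remove()`, `purge_keep()` and `buffer_entries()` are hypothetical stand-ins rather than the xen-blkfront code, the pool size and gref values are made up, and this `get_free_grant()` returns NULL where the driver, per the description, triggers a BUG_ON.

```c
#include <assert.h>
#include <stddef.h>

#define GRANT_INVALID_REF 0 /* sentinel value chosen for this model */
#define POOL_SIZE 4         /* hypothetical buffer size */

struct grant {
    unsigned int gref; /* GRANT_INVALID_REF when no grant ref is held */
    int in_buffer;     /* 1 while the entry is still in the buffer */
};

struct ring {
    struct grant pool[POOL_SIZE];
    int persistent_gnts_c; /* entries currently holding a grant ref */
};

/* Fill the buffer; every entry starts out holding a grant ref. */
static void ring_init(struct ring *r)
{
    assert(r != NULL);
    for (int i = 0; i < POOL_SIZE; i++) {
        r->pool[i].gref = 100u + (unsigned int)i;
        r->pool[i].in_buffer = 1;
    }
    r->persistent_gnts_c = POOL_SIZE;
}

/* Behaviour described for a46b53672b2c: purged entries leave the buffer. */
static void purge_remove(struct ring *r)
{
    for (int i = 0; i < POOL_SIZE; i++) {
        struct grant *g = &r->pool[i];

        if (!g->in_buffer || g->gref == GRANT_INVALID_REF)
            continue;
        g->in_buffer = 0; /* stands in for list_del() plus freeing the entry */
        r->persistent_gnts_c--;
    }
}

/* Behaviour after the patch: drop the grant ref, keep the entry. */
static void purge_keep(struct ring *r)
{
    for (int i = 0; i < POOL_SIZE; i++) {
        struct grant *g = &r->pool[i];

        if (!g->in_buffer || g->gref == GRANT_INVALID_REF)
            continue;
        g->gref = GRANT_INVALID_REF; /* stands in for ending foreign access */
        r->persistent_gnts_c--;
    }
}

/* Number of entries a later request could still pick up. */
static int buffer_entries(const struct ring *r)
{
    int n = 0;

    for (int i = 0; i < POOL_SIZE; i++)
        n += r->pool[i].in_buffer;
    return n;
}

/* First entry still in the buffer, or NULL if the buffer is empty. */
static struct grant *get_free_grant(struct ring *r)
{
    for (int i = 0; i < POOL_SIZE; i++)
        if (r->pool[i].in_buffer)
            return &r->pool[i];
    return NULL;
}
```

Starting from a full pool, `purge_remove()` leaves `buffer_entries()` at 0, so the next `get_free_grant()` has nothing to hand out. After `purge_keep()` all entries remain and only their `gref` is reset to `GRANT_INVALID_REF`.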