Re: [Xen-devel] [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer

2018-09-27 Thread Juergen Gross
On 28/09/2018 00:03, Sander Eikelenboom wrote:
> On 27/09/18 23:48, Boris Ostrovsky wrote:
>> On 9/27/18 5:37 PM, Jens Axboe wrote:
>>> On 9/27/18 2:33 PM, Sander Eikelenboom wrote:
>>>> On 27/09/18 21:06, Boris Ostrovsky wrote:
>>>>> On 9/27/18 2:56 PM, Jens Axboe wrote:
>>>>>> On 9/27/18 12:52 PM, Sander Eikelenboom wrote:
>>>>>>> On 27/09/18 16:26, Jens Axboe wrote:
>>>>>>>> On 9/27/18 1:12 AM, Juergen Gross wrote:
>>>>>>>>> On 22/09/18 21:55, Boris Ostrovsky wrote:
>>>>>>>>>> Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>>>>>>>>> added support for purging persistent grants when they are not in use. As
>>>>>>>>>> part of the purge, the grants were removed from the grant buffer, This
>>>>>>>>>> eventually causes the buffer to become empty, with BUG_ON triggered in
>>>>>>>>>> get_free_grant(). This can be observed even on an idle system, within
>>>>>>>>>> 20-30 minutes.
>>>>>>>>>>
>>>>>>>>>> We should keep the grants in the buffer when purging, and only free the
>>>>>>>>>> grant ref.
>>>>>>>>>>
>>>>>>>>>> Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>>>>>>>>> Signed-off-by: Boris Ostrovsky
>>>>>>>>> Reviewed-by: Juergen Gross
>>>>>>>> Since Konrad is out, I'm going to queue this up for 4.19.
>>>>>>>>
>>>>>>> Hi Boris/Juergen.
>>>>>>>
>>>>>>> Last week i tested a linux-4.19-rc4 kernel with xen-next and this patch from Boris pulled on top.
>>>>>>> Unfortunately it made a VM hang (probably because it's rootFS is shuffled from under it's feet
>>>>> What do you mean by "rootFS is shuffled from under it's feet " ?
>>>> Assumption that block-front getting borked and either a kernel crash or
>>>> rootfs becoming mounted readonly. Didn't (try) to check though.
>>>>
>>>>>>> and it gave these in dom0 dmesg:
>>>>>>>
>>>>>>> [ 9251.696090] xen-blkback: requesting a grant already in use
>>>>>>> [ 9251.705861] xen-blkback: trying to add a gref that's already in the tree
>>>>>>> [ 9251.715781] xen-blkback: requesting a grant already in use
>>>>>>> [ 9251.725756] xen-blkback: trying to add a gref that's already in the tree
>>>>>>> [ 9251.735698] xen-blkback: requesting a grant already in use
>>>>>>> [ 9251.745573] xen-blkback: trying to add a gref that's already in the tree
>>>>>>>
>>>>>>> The VM was a HVM with 4 vcpu's and 2 phy disks:
>>>>>>> xen-blkback: backend/vbd/14/768: using 4 queues, protocol 1 (x86_64-abi) persistent grants
>>>>>>> xen-blkback: backend/vbd/14/832: using 4 queues, protocol 1 (x86_64-abi) persistent grants
>>>>>>>
>>>>>>>
>>>>>>> Currently i have been running 4.19-rc5 with xen-next on top and commit
>>>>>>> a46b53672b2c reverted, for a couple of days. That seems to run stable
>>>>>>> for me (since it's a small box so i'm not hit by what a46b53672b2c
>>>>>>> tried to fix.
>>>>>>>
>>>>>>> If you can come up with a debug patch i can give that a spin tomorrow
>>>>>>> evening or in the weekend, so we are hopefully still in time for the
>>>>>>> 4.19 release.
>>>>>> At this late in the game, might make more sense to simply revert the
>>>>>> buggy commit.  Especially since what is currently out there doesn't fix
>>>>>> the issue for you.
>>>> Don't know if Boris or Juergen have a hunch about the issue, if not
>>>> perhaps a revert is the best.
>>> Anyone? Unless I hear otherwise, I'll revert the series tomorrow.
>>
>> Juergen may have something to say by tomorrow, but from my perspective,
>> given that we are coming up on rc6 --- yes.
>>
>> I looked at the patches again and didn't see anything obvious.
>>
>> -boris
> 
> Could also be that what i hit is a latent bug, 
> that is not caused by these patches but merely got uncovered by them.
> 
> xl dmesg also shows quite some:
> (XEN) [2018-09-24 03:15:46.847] grant_table.c:1755:d14v0 Expanding d14 
> grant table from 19 to 20 frames
> (XEN) [2018-09-24 03:15:46.849] grant_table.c:1755:d14v0 Expanding d14 
> grant table from 20 to 21 frames
> (and has done that for ages on my box not leading to any direct problems to 
> my knowledge)
> 
> I don't know if there could be related and something around the (persistent) 
> grants for block devices could be leaking under some conditions?

I could reproduce the issue Boris has seen and I have found the fault
in his patch. Just testing a fix.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer

2018-09-27 Thread Sander Eikelenboom
On 27/09/18 23:48, Boris Ostrovsky wrote:
> On 9/27/18 5:37 PM, Jens Axboe wrote:
>> On 9/27/18 2:33 PM, Sander Eikelenboom wrote:
>>> On 27/09/18 21:06, Boris Ostrovsky wrote:
>>>> On 9/27/18 2:56 PM, Jens Axboe wrote:
>>>>> On 9/27/18 12:52 PM, Sander Eikelenboom wrote:
>>>>>> On 27/09/18 16:26, Jens Axboe wrote:
>>>>>>> On 9/27/18 1:12 AM, Juergen Gross wrote:
>>>>>>>> On 22/09/18 21:55, Boris Ostrovsky wrote:
>>>>>>>>> Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>>>>>>>> added support for purging persistent grants when they are not in use. As
>>>>>>>>> part of the purge, the grants were removed from the grant buffer, This
>>>>>>>>> eventually causes the buffer to become empty, with BUG_ON triggered in
>>>>>>>>> get_free_grant(). This can be observed even on an idle system, within
>>>>>>>>> 20-30 minutes.
>>>>>>>>>
>>>>>>>>> We should keep the grants in the buffer when purging, and only free the
>>>>>>>>> grant ref.
>>>>>>>>>
>>>>>>>>> Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>>>>>>>> Signed-off-by: Boris Ostrovsky
>>>>>>>> Reviewed-by: Juergen Gross
>>>>>>> Since Konrad is out, I'm going to queue this up for 4.19.
>>>>>>>
>>>>>> Hi Boris/Juergen.
>>>>>>
>>>>>> Last week i tested a linux-4.19-rc4 kernel with xen-next and this patch from Boris pulled on top.
>>>>>> Unfortunately it made a VM hang (probably because it's rootFS is shuffled from under it's feet
>>>> What do you mean by "rootFS is shuffled from under it's feet " ?
>>> Assumption that block-front getting borked and either a kernel crash or 
>>> rootfs becoming mounted readonly. Didn't (try) to check though.
>>>
>>>>>> and it gave these in dom0 dmesg:
>>>>>>
>>>>>> [ 9251.696090] xen-blkback: requesting a grant already in use
>>>>>> [ 9251.705861] xen-blkback: trying to add a gref that's already in the tree
>>>>>> [ 9251.715781] xen-blkback: requesting a grant already in use
>>>>>> [ 9251.725756] xen-blkback: trying to add a gref that's already in the tree
>>>>>> [ 9251.735698] xen-blkback: requesting a grant already in use
>>>>>> [ 9251.745573] xen-blkback: trying to add a gref that's already in the tree
>>>>>>
>>>>>> The VM was a HVM with 4 vcpu's and 2 phy disks:
>>>>>> xen-blkback: backend/vbd/14/768: using 4 queues, protocol 1 (x86_64-abi) persistent grants
>>>>>> xen-blkback: backend/vbd/14/832: using 4 queues, protocol 1 (x86_64-abi) persistent grants
>>>>>>
>>>>>>
>>>>>> Currently i have been running 4.19-rc5 with xen-next on top and commit
>>>>>> a46b53672b2c reverted, for a couple of days. That seems to run stable
>>>>>> for me (since it's a small box so i'm not hit by what a46b53672b2c
>>>>>> tried to fix.
>>>>>>
>>>>>> If you can come up with a debug patch i can give that a spin tomorrow
>>>>>> evening or in the weekend, so we are hopefully still in time for the
>>>>>> 4.19 release.
>>>>> At this late in the game, might make more sense to simply revert the
>>>>> buggy commit.  Especially since what is currently out there doesn't fix
>>>>> the issue for you.
>>> Don't know if Boris or Juergen have a hunch about the issue, if not
>>> perhaps a revert is the best.
>> Anyone? Unless I hear otherwise, I'll revert the series tomorrow.
> 
> Juergen may have something to say by tomorrow, but from my perspective,
> given that we are coming up on rc6 --- yes.
> 
> I looked at the patches again and didn't see anything obvious.
> 
> -boris

It could also be that what I hit is a latent bug that is not caused by
these patches but merely got uncovered by them.

xl dmesg also shows quite a few of these:
(XEN) [2018-09-24 03:15:46.847] grant_table.c:1755:d14v0 Expanding d14 grant table from 19 to 20 frames
(XEN) [2018-09-24 03:15:46.849] grant_table.c:1755:d14v0 Expanding d14 grant table from 20 to 21 frames
(It has done that for ages on my box without leading to any direct problems, to my knowledge.)

I don't know whether these could be related, and whether something around the
(persistent) grants for block devices could be leaking under some conditions?

--
Sander



Re: [Xen-devel] [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer

2018-09-27 Thread Boris Ostrovsky
On 9/27/18 5:37 PM, Jens Axboe wrote:
> On 9/27/18 2:33 PM, Sander Eikelenboom wrote:
>> On 27/09/18 21:06, Boris Ostrovsky wrote:
>>> On 9/27/18 2:56 PM, Jens Axboe wrote:
>>>> On 9/27/18 12:52 PM, Sander Eikelenboom wrote:
>>>>> On 27/09/18 16:26, Jens Axboe wrote:
>>>>>> On 9/27/18 1:12 AM, Juergen Gross wrote:
>>>>>>> On 22/09/18 21:55, Boris Ostrovsky wrote:
>>>>>>>> Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>>>>>>> added support for purging persistent grants when they are not in use. As
>>>>>>>> part of the purge, the grants were removed from the grant buffer, This
>>>>>>>> eventually causes the buffer to become empty, with BUG_ON triggered in
>>>>>>>> get_free_grant(). This can be observed even on an idle system, within
>>>>>>>> 20-30 minutes.
>>>>>>>>
>>>>>>>> We should keep the grants in the buffer when purging, and only free the
>>>>>>>> grant ref.
>>>>>>>>
>>>>>>>> Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>>>>>>> Signed-off-by: Boris Ostrovsky
>>>>>>> Reviewed-by: Juergen Gross
>>>>>> Since Konrad is out, I'm going to queue this up for 4.19.
>>>>>>
>>>>> Hi Boris/Juergen.
>>>>>
>>>>> Last week i tested a linux-4.19-rc4 kernel with xen-next and this patch from Boris pulled on top.
>>>>> Unfortunately it made a VM hang (probably because it's rootFS is shuffled from under it's feet
>>> What do you mean by "rootFS is shuffled from under it's feet " ?
>> Assumption that block-front getting borked and either a kernel crash or 
>> rootfs becoming mounted readonly. Didn't (try) to check though.
>>
>>>>> and it gave these in dom0 dmesg:
>>>>>
>>>>> [ 9251.696090] xen-blkback: requesting a grant already in use
>>>>> [ 9251.705861] xen-blkback: trying to add a gref that's already in the tree
>>>>> [ 9251.715781] xen-blkback: requesting a grant already in use
>>>>> [ 9251.725756] xen-blkback: trying to add a gref that's already in the tree
>>>>> [ 9251.735698] xen-blkback: requesting a grant already in use
>>>>> [ 9251.745573] xen-blkback: trying to add a gref that's already in the tree
>>>>>
>>>>> The VM was a HVM with 4 vcpu's and 2 phy disks:
>>>>> xen-blkback: backend/vbd/14/768: using 4 queues, protocol 1 (x86_64-abi) persistent grants
>>>>> xen-blkback: backend/vbd/14/832: using 4 queues, protocol 1 (x86_64-abi) persistent grants
>>>>>
>>>>>
>>>>> Currently i have been running 4.19-rc5 with xen-next on top and commit
>>>>> a46b53672b2c reverted, for a couple of days. That seems to run stable
>>>>> for me (since it's a small box so i'm not hit by what a46b53672b2c
>>>>> tried to fix.
>>>>>
>>>>> If you can come up with a debug patch i can give that a spin tomorrow
>>>>> evening or in the weekend, so we are hopefully still in time for the
>>>>> 4.19 release.
>>>> At this late in the game, might make more sense to simply revert the
>>>> buggy commit.  Especially since what is currently out there doesn't fix
>>>> the issue for you.
>> Don't know if Boris or Juergen have a hunch about the issue, if not
>> perhaps a revert is the best.
> Anyone? Unless I hear otherwise, I'll revert the series tomorrow.

Juergen may have something to say by tomorrow, but from my perspective,
given that we are coming up on rc6 --- yes.

I looked at the patches again and didn't see anything obvious.

-boris




Re: [Xen-devel] [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer

2018-09-27 Thread Jens Axboe
On 9/27/18 2:33 PM, Sander Eikelenboom wrote:
> On 27/09/18 21:06, Boris Ostrovsky wrote:
>> On 9/27/18 2:56 PM, Jens Axboe wrote:
>>> On 9/27/18 12:52 PM, Sander Eikelenboom wrote:
>>>> On 27/09/18 16:26, Jens Axboe wrote:
>>>>> On 9/27/18 1:12 AM, Juergen Gross wrote:
>>>>>> On 22/09/18 21:55, Boris Ostrovsky wrote:
>>>>>>> Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>>>>>> added support for purging persistent grants when they are not in use. As
>>>>>>> part of the purge, the grants were removed from the grant buffer, This
>>>>>>> eventually causes the buffer to become empty, with BUG_ON triggered in
>>>>>>> get_free_grant(). This can be observed even on an idle system, within
>>>>>>> 20-30 minutes.
>>>>>>>
>>>>>>> We should keep the grants in the buffer when purging, and only free the
>>>>>>> grant ref.
>>>>>>>
>>>>>>> Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>>>>>> Signed-off-by: Boris Ostrovsky
>>>>>> Reviewed-by: Juergen Gross
>>>>> Since Konrad is out, I'm going to queue this up for 4.19.
>>>>>
>>>> Hi Boris/Juergen.
>>>>
>>>> Last week i tested a linux-4.19-rc4 kernel with xen-next and this patch from Boris pulled on top.
>>>> Unfortunately it made a VM hang (probably because it's rootFS is shuffled from under it's feet
>>
>> What do you mean by "rootFS is shuffled from under it's feet " ?
> 
> Assumption that block-front getting borked and either a kernel crash or 
> rootfs becoming mounted readonly. Didn't (try) to check though.
> 
>>>> and it gave these in dom0 dmesg:
>>>>
>>>> [ 9251.696090] xen-blkback: requesting a grant already in use
>>>> [ 9251.705861] xen-blkback: trying to add a gref that's already in the tree
>>>> [ 9251.715781] xen-blkback: requesting a grant already in use
>>>> [ 9251.725756] xen-blkback: trying to add a gref that's already in the tree
>>>> [ 9251.735698] xen-blkback: requesting a grant already in use
>>>> [ 9251.745573] xen-blkback: trying to add a gref that's already in the tree
>>>>
>>>> The VM was a HVM with 4 vcpu's and 2 phy disks:
>>>> xen-blkback: backend/vbd/14/768: using 4 queues, protocol 1 (x86_64-abi) persistent grants
>>>> xen-blkback: backend/vbd/14/832: using 4 queues, protocol 1 (x86_64-abi) persistent grants
>>>>
>>>>
>>>> Currently i have been running 4.19-rc5 with xen-next on top and commit
>>>> a46b53672b2c reverted, for a couple of days. That seems to run stable
>>>> for me (since it's a small box so i'm not hit by what a46b53672b2c
>>>> tried to fix.
>>>>
>>>> If you can come up with a debug patch i can give that a spin tomorrow
>>>> evening or in the weekend, so we are hopefully still in time for the
>>>> 4.19 release.
>>> At this late in the game, might make more sense to simply revert the
>>> buggy commit.  Especially since what is currently out there doesn't fix
>>> the issue for you.
>
> Don't know if Boris or Juergen have a hunch about the issue, if not
> perhaps a revert is the best.

Anyone? Unless I hear otherwise, I'll revert the series tomorrow.

-- 
Jens Axboe



Re: [Xen-devel] [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer

2018-09-27 Thread Sander Eikelenboom
On 27/09/18 21:06, Boris Ostrovsky wrote:
> On 9/27/18 2:56 PM, Jens Axboe wrote:
>> On 9/27/18 12:52 PM, Sander Eikelenboom wrote:
>>> On 27/09/18 16:26, Jens Axboe wrote:
>>>> On 9/27/18 1:12 AM, Juergen Gross wrote:
>>>>> On 22/09/18 21:55, Boris Ostrovsky wrote:
>>>>>> Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>>>>> added support for purging persistent grants when they are not in use. As
>>>>>> part of the purge, the grants were removed from the grant buffer, This
>>>>>> eventually causes the buffer to become empty, with BUG_ON triggered in
>>>>>> get_free_grant(). This can be observed even on an idle system, within
>>>>>> 20-30 minutes.
>>>>>>
>>>>>> We should keep the grants in the buffer when purging, and only free the
>>>>>> grant ref.
>>>>>>
>>>>>> Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>>>>> Signed-off-by: Boris Ostrovsky
>>>>> Reviewed-by: Juergen Gross
>>>> Since Konrad is out, I'm going to queue this up for 4.19.
>>>>
>>> Hi Boris/Juergen.
>>>
>>> Last week i tested a linux-4.19-rc4 kernel with xen-next and this patch 
>>> from Boris pulled on top. 
>>> Unfortunately it made a VM hang (probably because it's rootFS is shuffled 
>>> from under it's feet 
> 
> What do you mean by "rootFS is shuffled from under it's feet " ?

My assumption is that block-front got borked, leading to either a kernel crash
or the rootfs being remounted read-only. I didn't check, though.

>>> and it gave these in dom0 dmesg:
>>>
>>> [ 9251.696090] xen-blkback: requesting a grant already in use
>>> [ 9251.705861] xen-blkback: trying to add a gref that's already in the tree
>>> [ 9251.715781] xen-blkback: requesting a grant already in use
>>> [ 9251.725756] xen-blkback: trying to add a gref that's already in the tree
>>> [ 9251.735698] xen-blkback: requesting a grant already in use
>>> [ 9251.745573] xen-blkback: trying to add a gref that's already in the tree
>>>
>>> The VM was a HVM with 4 vcpu's and 2 phy disks:
>>> xen-blkback: backend/vbd/14/768: using 4 queues, protocol 1 (x86_64-abi) 
>>> persistent grants
>>> xen-blkback: backend/vbd/14/832: using 4 queues, protocol 1 (x86_64-abi) 
>>> persistent grants
>>>
>>>
>>> Currently i have been running 4.19-rc5 with xen-next on top and commit
>>> a46b53672b2c reverted, for a couple of days. That seems to run stable
>>> for me (since it's a small box so i'm not hit by what a46b53672b2c
>>> tried to fix.
>>>
>>> If you can come up with a debug patch i can give that a spin tomorrow
>>> evening or in the weekend, so we are hopefully still in time for the
>>> 4.19 release.
>> At this late in the game, might make more sense to simply revert the
>> buggy commit.  Especially since what is currently out there doesn't fix
>> the issue for you.
I don't know if Boris or Juergen have a hunch about the issue; if not, perhaps
a revert is best.

> If decision is to revert then I think the whole series needs to be
> reverted.
> 
> -boris
> 

For Boris and Juergen:
Would it make sense to have a "xen-next" branch in the xen-tip tree that:
- is based on the previous stable kernel;
- has the for-linus branches for the upcoming kernel release on top;
- has the patches for net(-next) and block changes on top (these don't go via
  the tree but only via mailing-list patches, which are scattered and difficult
  to track and use for automated testing);
- has dependency patches for the above, if necessary, to be able to build.

That way there is one branch that can be used to test ALL pending kernel-related
Xen patches, and which could be used in OSStest without as many potential false
alarms as linux-next will have?

--
Sander


Re: [Xen-devel] [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer

2018-09-27 Thread Jens Axboe
On 9/27/18 1:06 PM, Boris Ostrovsky wrote:
> On 9/27/18 2:56 PM, Jens Axboe wrote:
>> On 9/27/18 12:52 PM, Sander Eikelenboom wrote:
>>> On 27/09/18 16:26, Jens Axboe wrote:
>>>> On 9/27/18 1:12 AM, Juergen Gross wrote:
>>>>> On 22/09/18 21:55, Boris Ostrovsky wrote:
>>>>>> Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>>>>> added support for purging persistent grants when they are not in use. As
>>>>>> part of the purge, the grants were removed from the grant buffer, This
>>>>>> eventually causes the buffer to become empty, with BUG_ON triggered in
>>>>>> get_free_grant(). This can be observed even on an idle system, within
>>>>>> 20-30 minutes.
>>>>>>
>>>>>> We should keep the grants in the buffer when purging, and only free the
>>>>>> grant ref.
>>>>>>
>>>>>> Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>>>>> Signed-off-by: Boris Ostrovsky
>>>>> Reviewed-by: Juergen Gross
>>>> Since Konrad is out, I'm going to queue this up for 4.19.
>>>>
>>> Hi Boris/Juergen.
>>>
>>> Last week i tested a linux-4.19-rc4 kernel with xen-next and this patch 
>>> from Boris pulled on top. 
>>> Unfortunately it made a VM hang (probably because it's rootFS is shuffled 
>>> from under it's feet 
> 
> What do you mean by "rootFS is shuffled from under it's feet " ?
> 
>>> and it gave these in dom0 dmesg:
>>>
>>> [ 9251.696090] xen-blkback: requesting a grant already in use
>>> [ 9251.705861] xen-blkback: trying to add a gref that's already in the tree
>>> [ 9251.715781] xen-blkback: requesting a grant already in use
>>> [ 9251.725756] xen-blkback: trying to add a gref that's already in the tree
>>> [ 9251.735698] xen-blkback: requesting a grant already in use
>>> [ 9251.745573] xen-blkback: trying to add a gref that's already in the tree
>>>
>>> The VM was a HVM with 4 vcpu's and 2 phy disks:
>>> xen-blkback: backend/vbd/14/768: using 4 queues, protocol 1 (x86_64-abi) 
>>> persistent grants
>>> xen-blkback: backend/vbd/14/832: using 4 queues, protocol 1 (x86_64-abi) 
>>> persistent grants
>>>
>>>
>>> Currently i have been running 4.19-rc5 with xen-next on top and commit
>>> a46b53672b2c reverted, for a couple of days. That seems to run stable
>>> for me (since it's a small box so i'm not hit by what a46b53672b2c
>>> tried to fix.
>>>
>>> If you can come up with a debug patch i can give that a spin tomorrow
>>> evening or in the weekend, so we are hopefully still in time for the
>>> 4.19 release.
>> At this late in the game, might make more sense to simply revert the
>> buggy commit.  Especially since what is currently out there doesn't fix
>> the issue for you.
> 
> If decision is to revert then I think the whole series needs to be
> reverted.

Yes, definitely.

-- 
Jens Axboe



Re: [Xen-devel] [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer

2018-09-27 Thread Boris Ostrovsky
On 9/27/18 2:56 PM, Jens Axboe wrote:
> On 9/27/18 12:52 PM, Sander Eikelenboom wrote:
>> On 27/09/18 16:26, Jens Axboe wrote:
>>> On 9/27/18 1:12 AM, Juergen Gross wrote:
>>>> On 22/09/18 21:55, Boris Ostrovsky wrote:
>>>>> Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>>>> added support for purging persistent grants when they are not in use. As
>>>>> part of the purge, the grants were removed from the grant buffer, This
>>>>> eventually causes the buffer to become empty, with BUG_ON triggered in
>>>>> get_free_grant(). This can be observed even on an idle system, within
>>>>> 20-30 minutes.
>>>>>
>>>>> We should keep the grants in the buffer when purging, and only free the
>>>>> grant ref.
>>>>>
>>>>> Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>>>> Signed-off-by: Boris Ostrovsky
>>>> Reviewed-by: Juergen Gross
>>> Since Konrad is out, I'm going to queue this up for 4.19.
>>>
>> Hi Boris/Juergen.
>>
>> Last week i tested a linux-4.19-rc4 kernel with xen-next and this patch from 
>> Boris pulled on top. 
>> Unfortunately it made a VM hang (probably because it's rootFS is shuffled 
>> from under it's feet 

What do you mean by "rootFS is shuffled from under it's feet " ?

>> and it gave these in dom0 dmesg:
>>
>> [ 9251.696090] xen-blkback: requesting a grant already in use
>> [ 9251.705861] xen-blkback: trying to add a gref that's already in the tree
>> [ 9251.715781] xen-blkback: requesting a grant already in use
>> [ 9251.725756] xen-blkback: trying to add a gref that's already in the tree
>> [ 9251.735698] xen-blkback: requesting a grant already in use
>> [ 9251.745573] xen-blkback: trying to add a gref that's already in the tree
>>
>> The VM was a HVM with 4 vcpu's and 2 phy disks:
>> xen-blkback: backend/vbd/14/768: using 4 queues, protocol 1 (x86_64-abi) 
>> persistent grants
>> xen-blkback: backend/vbd/14/832: using 4 queues, protocol 1 (x86_64-abi) 
>> persistent grants
>>
>>
>> Currently i have been running 4.19-rc5 with xen-next on top and commit
>> a46b53672b2c reverted, for a couple of days. That seems to run stable
>> for me (since it's a small box so i'm not hit by what a46b53672b2c
>> tried to fix.
>>
>> If you can come up with a debug patch i can give that a spin tomorrow
>> evening or in the weekend, so we are hopefully still in time for the
>> 4.19 release.
> At this late in the game, might make more sense to simply revert the
> buggy commit.  Especially since what is currently out there doesn't fix
> the issue for you.

If the decision is to revert, then I think the whole series needs to be
reverted.

-boris



Re: [Xen-devel] [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer

2018-09-27 Thread Jens Axboe
On 9/27/18 12:52 PM, Sander Eikelenboom wrote:
> On 27/09/18 16:26, Jens Axboe wrote:
>> On 9/27/18 1:12 AM, Juergen Gross wrote:
>>> On 22/09/18 21:55, Boris Ostrovsky wrote:
>>>> Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>>> added support for purging persistent grants when they are not in use. As
>>>> part of the purge, the grants were removed from the grant buffer, This
>>>> eventually causes the buffer to become empty, with BUG_ON triggered in
>>>> get_free_grant(). This can be observed even on an idle system, within
>>>> 20-30 minutes.
>>>>
>>>> We should keep the grants in the buffer when purging, and only free the
>>>> grant ref.
>>>>
>>>> Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>>> Signed-off-by: Boris Ostrovsky
>>>
>>> Reviewed-by: Juergen Gross 
>>
>> Since Konrad is out, I'm going to queue this up for 4.19.
>>
> 
> Hi Boris/Juergen.
> 
> Last week i tested a linux-4.19-rc4 kernel with xen-next and this patch from 
> Boris pulled on top. 
> Unfortunately it made a VM hang (probably because it's rootFS is shuffled 
> from under it's feet 
> and it gave these in dom0 dmesg:
> 
> [ 9251.696090] xen-blkback: requesting a grant already in use
> [ 9251.705861] xen-blkback: trying to add a gref that's already in the tree
> [ 9251.715781] xen-blkback: requesting a grant already in use
> [ 9251.725756] xen-blkback: trying to add a gref that's already in the tree
> [ 9251.735698] xen-blkback: requesting a grant already in use
> [ 9251.745573] xen-blkback: trying to add a gref that's already in the tree
> 
> The VM was a HVM with 4 vcpu's and 2 phy disks:
> xen-blkback: backend/vbd/14/768: using 4 queues, protocol 1 (x86_64-abi) 
> persistent grants
> xen-blkback: backend/vbd/14/832: using 4 queues, protocol 1 (x86_64-abi) 
> persistent grants
> 
> 
> Currently i have been running 4.19-rc5 with xen-next on top and commit
> a46b53672b2c reverted, for a couple of days. That seems to run stable
> for me (since it's a small box so i'm not hit by what a46b53672b2c
> tried to fix.
> 
> If you can come up with a debug patch i can give that a spin tomorrow
> evening or in the weekend, so we are hopefully still in time for the
> 4.19 release.

This late in the game, it might make more sense to simply revert the
buggy commit, especially since what is currently out there doesn't fix
the issue for you.

-- 
Jens Axboe



Re: [Xen-devel] [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer

2018-09-27 Thread Sander Eikelenboom
On 27/09/18 16:26, Jens Axboe wrote:
> On 9/27/18 1:12 AM, Juergen Gross wrote:
>> On 22/09/18 21:55, Boris Ostrovsky wrote:
>>> Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>> added support for purging persistent grants when they are not in use. As
>>> part of the purge, the grants were removed from the grant buffer, This
>>> eventually causes the buffer to become empty, with BUG_ON triggered in
>>> get_free_grant(). This can be observed even on an idle system, within
>>> 20-30 minutes.
>>>
>>> We should keep the grants in the buffer when purging, and only free the
>>> grant ref.
>>>
>>> Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>> Signed-off-by: Boris Ostrovsky 
>>
>> Reviewed-by: Juergen Gross 
> 
> Since Konrad is out, I'm going to queue this up for 4.19.
> 

Hi Boris/Juergen.

Last week I tested a linux-4.19-rc4 kernel with xen-next and this patch from
Boris pulled on top.
Unfortunately it made a VM hang (probably because its rootFS is shuffled from
under its feet), and it gave these in dom0 dmesg:

[ 9251.696090] xen-blkback: requesting a grant already in use
[ 9251.705861] xen-blkback: trying to add a gref that's already in the tree
[ 9251.715781] xen-blkback: requesting a grant already in use
[ 9251.725756] xen-blkback: trying to add a gref that's already in the tree
[ 9251.735698] xen-blkback: requesting a grant already in use
[ 9251.745573] xen-blkback: trying to add a gref that's already in the tree

The VM was an HVM with 4 vCPUs and 2 phy disks:
xen-blkback: backend/vbd/14/768: using 4 queues, protocol 1 (x86_64-abi) persistent grants
xen-blkback: backend/vbd/14/832: using 4 queues, protocol 1 (x86_64-abi) persistent grants


Currently I have been running 4.19-rc5 with xen-next on top and commit
a46b53672b2c reverted for a couple of days. That seems to run stable for me
(since it's a small box, I'm not hit by what a46b53672b2c tried to fix).

If you can come up with a debug patch, I can give that a spin tomorrow evening
or over the weekend, so we are hopefully still in time for the 4.19 release.

--
Sander


Re: [Xen-devel] [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer

2018-09-27 Thread Jens Axboe
On 9/27/18 1:12 AM, Juergen Gross wrote:
> On 22/09/18 21:55, Boris Ostrovsky wrote:
>> Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>> added support for purging persistent grants when they are not in use. As
>> part of the purge, the grants were removed from the grant buffer, This
>> eventually causes the buffer to become empty, with BUG_ON triggered in
>> get_free_grant(). This can be observed even on an idle system, within
>> 20-30 minutes.
>>
>> We should keep the grants in the buffer when purging, and only free the
>> grant ref.
>>
>> Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>> Signed-off-by: Boris Ostrovsky 
> 
> Reviewed-by: Juergen Gross 

Since Konrad is out, I'm going to queue this up for 4.19.

-- 
Jens Axboe



Re: [Xen-devel] [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer

2018-09-27 Thread Juergen Gross
On 22/09/18 21:55, Boris Ostrovsky wrote:
> Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
> added support for purging persistent grants when they are not in use. As
> part of the purge, the grants were removed from the grant buffer, This
> eventually causes the buffer to become empty, with BUG_ON triggered in
> get_free_grant(). This can be observed even on an idle system, within
> 20-30 minutes.
> 
> We should keep the grants in the buffer when purging, and only free the
> grant ref.
> 
> Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
> Signed-off-by: Boris Ostrovsky 

Reviewed-by: Juergen Gross 


Juergen


[Xen-devel] [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer

2018-09-22 Thread Boris Ostrovsky
Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
added support for purging persistent grants when they are not in use. As
part of the purge, the grants were removed from the grant buffer. This
eventually causes the buffer to become empty, with BUG_ON triggered in
get_free_grant(). This can be observed even on an idle system, within
20-30 minutes.

We should keep the grants in the buffer when purging, and only free the
grant ref.

Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
Signed-off-by: Boris Ostrovsky 
---
 drivers/block/xen-blkfront.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index a71d817e900d..3b441fe69c0d 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -2667,11 +2667,9 @@ static void purge_persistent_grants(struct blkfront_info *info)
 			    gnttab_query_foreign_access(gnt_list_entry->gref))
 				continue;
 
-			list_del(&gnt_list_entry->node);
 			gnttab_end_foreign_access(gnt_list_entry->gref, 0, 0UL);
+			gnt_list_entry->gref = GRANT_INVALID_REF;
 			rinfo->persistent_gnts_c--;
-			__free_page(gnt_list_entry->page);
-			kfree(gnt_list_entry);
 		}
 
 		spin_unlock_irqrestore(&rinfo->ring_lock, flags);
-- 
2.17.0
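
The effect described in the commit message can be modelled in user space. The sketch below is only an illustration under stated assumptions, not driver code: `struct grant_slot`, `fill_pool`, `purge_remove`, `purge_keep`, and `pool_entries` are hypothetical names, a fixed array stands in for the driver's linked list, the `busy` flag stands in for gnttab_query_foreign_access(), and `INVALID_REF` is an assumed stand-in value for GRANT_INVALID_REF. It shows why unlinking entries during a purge can drain the buffer, while dropping only the reference leaves its size unchanged.

```c
#include <stdbool.h>
#include <stddef.h>

#define POOL_SIZE   4
#define INVALID_REF 0u   /* stand-in for GRANT_INVALID_REF; the value is an assumption */

/* Hypothetical, simplified stand-in for one entry of the per-ring grant buffer. */
struct grant_slot {
    bool in_pool;        /* false once the entry has been unlinked and freed */
    unsigned int gref;   /* INVALID_REF: no grant reference attached */
    bool busy;           /* models the backend still having the grant mapped */
};

/* Put every slot in the pool with a distinct, valid reference. */
void fill_pool(struct grant_slot *pool, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        pool[i].in_pool = true;
        pool[i].gref = (unsigned int)i + 1;
        pool[i].busy = false;
    }
}

/* Purge as described for a46b53672b2c: idle entries leave the pool entirely. */
void purge_remove(struct grant_slot *pool, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!pool[i].in_pool || pool[i].gref == INVALID_REF || pool[i].busy)
            continue;
        pool[i].gref = INVALID_REF;
        pool[i].in_pool = false;   /* models list_del() plus freeing the entry */
    }
}

/* Purge as proposed in this patch: drop only the reference, keep the entry. */
void purge_keep(struct grant_slot *pool, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!pool[i].in_pool || pool[i].gref == INVALID_REF || pool[i].busy)
            continue;
        pool[i].gref = INVALID_REF;
    }
}

/* Entries a later request could still take; zero models the BUG_ON case. */
size_t pool_entries(const struct grant_slot *pool, size_t n)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        if (pool[i].in_pool)
            count++;
    }
    return count;
}
```

Filling two pools and purging one with each variant leaves the first with no entries and the second with all of them, each marked `INVALID_REF`. The model deliberately says nothing about how entries are ordered or reused afterwards, which the real driver also has to get right.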

