Re: [PATCH] virtio_balloon: fix deadlock on OOM
On 10/13/2017 09:21 PM, Michael S. Tsirkin wrote:
> fill_balloon doing memory allocations under balloon_lock
> can cause a deadlock when leak_balloon is called from
> virtballoon_oom_notify and tries to take same lock.
>
> To fix, split page allocation and enqueue and do allocations outside the lock.
>
> Here's a detailed analysis of the deadlock by Tetsuo Handa:
>
> In leak_balloon(), mutex_lock(&vb->balloon_lock) is called in order to
> serialize against fill_balloon(). But in fill_balloon(),
> alloc_page(GFP_HIGHUSER[_MOVABLE] | __GFP_NOMEMALLOC | __GFP_NORETRY) is
> called with vb->balloon_lock mutex held. Since GFP_HIGHUSER[_MOVABLE]
> implies __GFP_DIRECT_RECLAIM | __GFP_IO | __GFP_FS, despite __GFP_NORETRY
> is specified, this allocation attempt might indirectly depend on somebody
> else's __GFP_DIRECT_RECLAIM memory allocation. And such indirect
> __GFP_DIRECT_RECLAIM memory allocation might call leak_balloon() via
> virtballoon_oom_notify() via blocking_notifier_call_chain() callback via
> out_of_memory() when it reached __alloc_pages_may_oom() and held oom_lock
> mutex. Since vb->balloon_lock mutex is already held by fill_balloon(), it
> will cause OOM lockup. Thus, do not wait for vb->balloon_lock mutex if
> leak_balloon() is called from out_of_memory().
>
>   Thread1                                       Thread2
>     fill_balloon()
>       takes a balloon_lock
>       balloon_page_enqueue()
>         alloc_page(GFP_HIGHUSER_MOVABLE)
>           direct reclaim (__GFP_FS context)       takes a fs lock
>             waits for that fs lock                  alloc_page(GFP_NOFS)
>                                                       __alloc_pages_may_oom()
>                                                         takes the oom_lock
>                                                         out_of_memory()
>                                                           blocking_notifier_call_chain()
>                                                             leak_balloon()
>                                                               tries to take that
>                                                               balloon_lock and deadlocks
>
> Reported-by: Tetsuo Handa
> Cc: Michal Hocko
> Cc: Wei Wang
> ---

The "virtio-balloon enhancement" series has a dependency on this patch. Could you send out a new version soon? Or I can include it in the series if you want.

Best,
Wei
Re: [PATCH] virtio_balloon: fix deadlock on OOM
On 10/19/2017 01:19 AM, Michael S. Tsirkin wrote:
> On Fri, Oct 13, 2017 at 11:06:23PM +0900, Tetsuo Handa wrote:
> > Michael S. Tsirkin wrote:
> > > This is a replacement for
> > > [PATCH] virtio: avoid possible OOM lockup at virtballoon_oom_notify()
> > > but unlike that patch it actually deflates on oom even in presence of
> > > lock contention.
> >
> > But Wei Wang is proposing VIRTIO_BALLOON_F_SG which will try to allocate
> > memory, isn't he?
>
> Hopefully that can be fixed by allocating outside the lock.

I think that would still have an issue even without the lock, because we can't do any memory allocation in the OOM code path.

Probably, we could write a separate function, leak_balloon_oom(), for the oom notifier, which puts the oom deflating pages to the vq one by one, and kicks when the vq is full. In this case, we would need to stop the normal leak_balloon while oom deflating starts.

However, a better optimization, I think, would be to do some kind of consolidation: since leak_balloon is already deflating, leak_balloon_oom can just count the number of pages that have been deflated by leak_balloon and return when it reaches oom_pages.

Best,
Wei
Re: [PATCH] virtio_balloon: fix deadlock on OOM
On Fri, Oct 13, 2017 at 11:06:23PM +0900, Tetsuo Handa wrote:
> Michael S. Tsirkin wrote:
> > This is a replacement for
> > [PATCH] virtio: avoid possible OOM lockup at virtballoon_oom_notify()
> > but unlike that patch it actually deflates on oom even in presence of
> > lock contention.
>
> But Wei Wang is proposing VIRTIO_BALLOON_F_SG which will try to allocate
> memory, isn't he?

Hopefully that can be fixed by allocating outside the lock.

> >  drivers/virtio/virtio_balloon.c    | 30 ++
> >  include/linux/balloon_compaction.h | 38 +-
> >  mm/balloon_compaction.c            | 27 +-
> >  3 files changed, 80 insertions(+), 15 deletions(-)
> >
> > diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> > index f0b3a0b..725e366 100644
> > --- a/drivers/virtio/virtio_balloon.c
> > +++ b/drivers/virtio/virtio_balloon.c
> > @@ -143,16 +143,14 @@ static void set_page_pfns(struct virtio_balloon *vb,
> >
> >  static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
> >  {
> > -        struct balloon_dev_info *vb_dev_info = &vb->vb_dev_info;
> >          unsigned num_allocated_pages;
> > +        unsigned num_pfns;
> > +        struct page *page;
> > +        LIST_HEAD(pages);
> >
> > -        /* We can only do one array worth at a time. */
> > -        num = min(num, ARRAY_SIZE(vb->pfns));
> > -

> I don't think moving this min() to later is correct, for
> "num" can be e.g. 1048576, can't it?

Good catch, will fix. Thanks!

> > -        mutex_lock(&vb->balloon_lock);
> > -        for (vb->num_pfns = 0; vb->num_pfns < num;
> > -             vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
> > -                struct page *page = balloon_page_enqueue(vb_dev_info);
> > +        for (num_pfns = 0; num_pfns < num;
> > +             num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
> > +                struct page *page = balloon_page_alloc();
> >
> >                  if (!page) {
> >                          dev_info_ratelimited(&vb->vdev->dev,
> > @@ -162,6 +160,22 @@ static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
> >                          msleep(200);
> >                          break;
> >                  }
> > +
> > +                balloon_page_push(&pages, page);
> > +        }

> If balloon_page_alloc() did not fail, it will queue "num"
> (e.g. 1048576) pages into pages list, won't it?

> > +
> > +        /* We can only do one array worth at a time. */
> > +        num = min(num, ARRAY_SIZE(vb->pfns));
> > +

> Now we cap "num" to VIRTIO_BALLOON_ARRAY_PFNS_MAX (which is 256), but

> > +        mutex_lock(&vb->balloon_lock);
> > +
> > +        vb->num_pfns = 0;
> > +
> > +        while ((page = balloon_page_pop(&pages))) {

> this loop will repeat for e.g. 1048576 times, and

> > +                balloon_page_enqueue(&vb->vb_dev_info, page);
> > +
> > +                vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE;
> > +

> we increment vb->num_pfns for e.g. 1048576 times which will go
> beyond VIRTIO_BALLOON_ARRAY_PFNS_MAX array index.

> >                  set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
> >                  vb->num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE;
> >                  if (!virtio_has_feature(vb->vdev,
Re: [PATCH] virtio_balloon: fix deadlock on OOM
Michael S. Tsirkin wrote:
> This is a replacement for
> [PATCH] virtio: avoid possible OOM lockup at virtballoon_oom_notify()
> but unlike that patch it actually deflates on oom even in presence of
> lock contention.

But Wei Wang is proposing VIRTIO_BALLOON_F_SG which will try to allocate
memory, isn't he?

>  drivers/virtio/virtio_balloon.c    | 30 ++
>  include/linux/balloon_compaction.h | 38 +-
>  mm/balloon_compaction.c            | 27 +-
>  3 files changed, 80 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index f0b3a0b..725e366 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -143,16 +143,14 @@ static void set_page_pfns(struct virtio_balloon *vb,
>
>  static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
>  {
> -        struct balloon_dev_info *vb_dev_info = &vb->vb_dev_info;
>          unsigned num_allocated_pages;
> +        unsigned num_pfns;
> +        struct page *page;
> +        LIST_HEAD(pages);
>
> -        /* We can only do one array worth at a time. */
> -        num = min(num, ARRAY_SIZE(vb->pfns));
> -

I don't think moving this min() to later is correct, for
"num" can be e.g. 1048576, can't it?

> -        mutex_lock(&vb->balloon_lock);
> -        for (vb->num_pfns = 0; vb->num_pfns < num;
> -             vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
> -                struct page *page = balloon_page_enqueue(vb_dev_info);
> +        for (num_pfns = 0; num_pfns < num;
> +             num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
> +                struct page *page = balloon_page_alloc();
>
>                  if (!page) {
>                          dev_info_ratelimited(&vb->vdev->dev,
> @@ -162,6 +160,22 @@ static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
>                          msleep(200);
>                          break;
>                  }
> +
> +                balloon_page_push(&pages, page);
> +        }

If balloon_page_alloc() did not fail, it will queue "num"
(e.g. 1048576) pages into pages list, won't it?

> +
> +        /* We can only do one array worth at a time. */
> +        num = min(num, ARRAY_SIZE(vb->pfns));
> +

Now we cap "num" to VIRTIO_BALLOON_ARRAY_PFNS_MAX (which is 256), but

> +        mutex_lock(&vb->balloon_lock);
> +
> +        vb->num_pfns = 0;
> +
> +        while ((page = balloon_page_pop(&pages))) {

this loop will repeat for e.g. 1048576 times, and

> +                balloon_page_enqueue(&vb->vb_dev_info, page);
> +
> +                vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE;
> +

we increment vb->num_pfns for e.g. 1048576 times which will go
beyond VIRTIO_BALLOON_ARRAY_PFNS_MAX array index.

>                  set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
>                  vb->num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE;
>                  if (!virtio_has_feature(vb->vdev,
Re: [PATCH] virtio_balloon: fix deadlock on OOM
On Fri 13-10-17 16:21:22, Michael S. Tsirkin wrote:
> fill_balloon doing memory allocations under balloon_lock
> can cause a deadlock when leak_balloon is called from
> virtballoon_oom_notify and tries to take same lock.
>
> To fix, split page allocation and enqueue and do allocations outside the lock.

OK, that sounds like a better fix. As long as there are no other
allocations or indirect waiting for an allocation this should work
correctly. Thanks!

> Here's a detailed analysis of the deadlock by Tetsuo Handa:
>
> In leak_balloon(), mutex_lock(&vb->balloon_lock) is called in order to
> serialize against fill_balloon(). But in fill_balloon(),
> alloc_page(GFP_HIGHUSER[_MOVABLE] | __GFP_NOMEMALLOC | __GFP_NORETRY) is
> called with vb->balloon_lock mutex held. Since GFP_HIGHUSER[_MOVABLE]
> implies __GFP_DIRECT_RECLAIM | __GFP_IO | __GFP_FS, despite __GFP_NORETRY
> is specified, this allocation attempt might indirectly depend on somebody
> else's __GFP_DIRECT_RECLAIM memory allocation. And such indirect
> __GFP_DIRECT_RECLAIM memory allocation might call leak_balloon() via
> virtballoon_oom_notify() via blocking_notifier_call_chain() callback via
> out_of_memory() when it reached __alloc_pages_may_oom() and held oom_lock
> mutex. Since vb->balloon_lock mutex is already held by fill_balloon(), it
> will cause OOM lockup. Thus, do not wait for vb->balloon_lock mutex if
> leak_balloon() is called from out_of_memory().
>
>   Thread1                                       Thread2
>     fill_balloon()
>       takes a balloon_lock
>       balloon_page_enqueue()
>         alloc_page(GFP_HIGHUSER_MOVABLE)
>           direct reclaim (__GFP_FS context)       takes a fs lock
>             waits for that fs lock                  alloc_page(GFP_NOFS)
>                                                       __alloc_pages_may_oom()
>                                                         takes the oom_lock
>                                                         out_of_memory()
>                                                           blocking_notifier_call_chain()
>                                                             leak_balloon()
>                                                               tries to take that
>                                                               balloon_lock and deadlocks
>
> Reported-by: Tetsuo Handa
> Cc: Michal Hocko
> Cc: Wei Wang
> ---
>
> This is a replacement for
> [PATCH] virtio: avoid possible OOM lockup at virtballoon_oom_notify()
> but unlike that patch it actually deflates on oom even in presence of
> lock contention.
>
>  drivers/virtio/virtio_balloon.c    | 30 ++
>  include/linux/balloon_compaction.h | 38 +-
>  mm/balloon_compaction.c            | 27 +-
>  3 files changed, 80 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index f0b3a0b..725e366 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -143,16 +143,14 @@ static void set_page_pfns(struct virtio_balloon *vb,
>
>  static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
>  {
> -        struct balloon_dev_info *vb_dev_info = &vb->vb_dev_info;
>          unsigned num_allocated_pages;
> +        unsigned num_pfns;
> +        struct page *page;
> +        LIST_HEAD(pages);
>
> -        /* We can only do one array worth at a time. */
> -        num = min(num, ARRAY_SIZE(vb->pfns));
> -
> -        mutex_lock(&vb->balloon_lock);
> -        for (vb->num_pfns = 0; vb->num_pfns < num;
> -             vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
> -                struct page *page = balloon_page_enqueue(vb_dev_info);
> +        for (num_pfns = 0; num_pfns < num;
> +             num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
> +                struct page *page = balloon_page_alloc();
>
>                  if (!page) {
>                          dev_info_ratelimited(&vb->vdev->dev,
> @@ -162,6 +160,22 @@ static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
>                          msleep(200);
>                          break;
>                  }
> +
> +                balloon_page_push(&pages, page);
> +        }
> +
> +        /* We can only do one array worth at a time. */
> +        num = min(num, ARRAY_SIZE(vb->pfns));
> +
> +        mutex_lock(&vb->balloon_lock);
> +
> +        vb->num_pfns = 0;
> +
> +        while ((page = balloon_page_pop(&pages))) {
> +                balloon_page_enqueue(&vb->vb_dev_info, page);
> +
> +                vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE;
> +
>                  set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
>                  vb->num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE;
>                  if (!virtio_has_feature(vb->vdev,
> diff --git a/include/linux/balloon_compaction.h b/include/linux/balloon_compaction.h
> index 79542b2..88cfac4 100644
> --- a/include/linux/balloon_compaction.h
> +++ b/include/linux/balloon_compaction.h
> @@ -49,6 +49,7 @@
>  #include
>  #include
>  #include
>
[PATCH] virtio_balloon: fix deadlock on OOM
fill_balloon doing memory allocations under balloon_lock
can cause a deadlock when leak_balloon is called from
virtballoon_oom_notify and tries to take same lock.

To fix, split page allocation and enqueue and do allocations outside the lock.

Here's a detailed analysis of the deadlock by Tetsuo Handa:

In leak_balloon(), mutex_lock(&vb->balloon_lock) is called in order to
serialize against fill_balloon(). But in fill_balloon(),
alloc_page(GFP_HIGHUSER[_MOVABLE] | __GFP_NOMEMALLOC | __GFP_NORETRY) is
called with vb->balloon_lock mutex held. Since GFP_HIGHUSER[_MOVABLE]
implies __GFP_DIRECT_RECLAIM | __GFP_IO | __GFP_FS, despite __GFP_NORETRY
is specified, this allocation attempt might indirectly depend on somebody
else's __GFP_DIRECT_RECLAIM memory allocation. And such indirect
__GFP_DIRECT_RECLAIM memory allocation might call leak_balloon() via
virtballoon_oom_notify() via blocking_notifier_call_chain() callback via
out_of_memory() when it reached __alloc_pages_may_oom() and held oom_lock
mutex. Since vb->balloon_lock mutex is already held by fill_balloon(), it
will cause OOM lockup. Thus, do not wait for vb->balloon_lock mutex if
leak_balloon() is called from out_of_memory().

  Thread1                                       Thread2
    fill_balloon()
      takes a balloon_lock
      balloon_page_enqueue()
        alloc_page(GFP_HIGHUSER_MOVABLE)
          direct reclaim (__GFP_FS context)       takes a fs lock
            waits for that fs lock                  alloc_page(GFP_NOFS)
                                                      __alloc_pages_may_oom()
                                                        takes the oom_lock
                                                        out_of_memory()
                                                          blocking_notifier_call_chain()
                                                            leak_balloon()
                                                              tries to take that
                                                              balloon_lock and deadlocks

Reported-by: Tetsuo Handa
Cc: Michal Hocko
Cc: Wei Wang
---

This is a replacement for
[PATCH] virtio: avoid possible OOM lockup at virtballoon_oom_notify()
but unlike that patch it actually deflates on oom even in presence of
lock contention.

 drivers/virtio/virtio_balloon.c    | 30 ++
 include/linux/balloon_compaction.h | 38 +-
 mm/balloon_compaction.c            | 27 +-
 3 files changed, 80 insertions(+), 15 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index f0b3a0b..725e366 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -143,16 +143,14 @@ static void set_page_pfns(struct virtio_balloon *vb,

 static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
 {
-        struct balloon_dev_info *vb_dev_info = &vb->vb_dev_info;
         unsigned num_allocated_pages;
+        unsigned num_pfns;
+        struct page *page;
+        LIST_HEAD(pages);

-        /* We can only do one array worth at a time. */
-        num = min(num, ARRAY_SIZE(vb->pfns));
-
-        mutex_lock(&vb->balloon_lock);
-        for (vb->num_pfns = 0; vb->num_pfns < num;
-             vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
-                struct page *page = balloon_page_enqueue(vb_dev_info);
+        for (num_pfns = 0; num_pfns < num;
+             num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
+                struct page *page = balloon_page_alloc();

                 if (!page) {
                         dev_info_ratelimited(&vb->vdev->dev,
@@ -162,6 +160,22 @@ static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
                         msleep(200);
                         break;
                 }
+
+                balloon_page_push(&pages, page);
+        }
+
+        /* We can only do one array worth at a time. */
+        num = min(num, ARRAY_SIZE(vb->pfns));
+
+        mutex_lock(&vb->balloon_lock);
+
+        vb->num_pfns = 0;
+
+        while ((page = balloon_page_pop(&pages))) {
+                balloon_page_enqueue(&vb->vb_dev_info, page);
+
+                vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE;
+
                 set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
                 vb->num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE;
                 if (!virtio_has_feature(vb->vdev,
diff --git a/include/linux/balloon_compaction.h b/include/linux/balloon_compaction.h
index 79542b2..88cfac4 100644
--- a/include/linux/balloon_compaction.h
+++ b/include/linux/balloon_compaction.h
@@ -49,6 +49,7 @@
 #include
 #include
 #include
+#include

 /*
  * Balloon device information descriptor.
@@ -66,9 +67,14 @@ struct balloon_dev_info {
         struct inode *inode;
 };

-extern struct page *balloon_page_enqueue(struct balloon_dev_info *b_dev_info);
+extern struct page *balloon_page_alloc(void);
+extern void balloon_page_enqueue(struct balloon_dev_info *b_dev_info,
+