On 2017/03/02 14:19, Xiong Zhou wrote:
> On Wed, Mar 01, 2017 at 04:37:31PM -0800, Christoph Hellwig wrote:
>> On Wed, Mar 01, 2017 at 12:46:34PM +0800, Xiong Zhou wrote:
>>> Hi,
>>>
>>> It's reproducible, though not every time. Ext4 works fine.
>>
>> On ext4 fsstress won't run bulkstat because it doesn't exist.  Either
>> way this smells like an MM issue to me as there were no XFS changes
>> in that area recently.
> 
> Yep.
> 
> First bad commit:
> 
> commit 5d17a73a2ebeb8d1c6924b91e53ab2650fe86ffb
> Author: Michal Hocko <mho...@suse.com>
> Date:   Fri Feb 24 14:58:53 2017 -0800
> 
>     vmalloc: back off when the current task is killed
> 
> Reverting this commit on top of
>   e5d56ef Merge tag 'watchdog-for-linus-v4.11'
> survives the tests.
> 

Looks like kmem_zalloc_greedy() is broken.
It retries until vzalloc() succeeds and has no exit path on failure.
If somebody (not limited to the OOM killer) sends SIGKILL and
vmalloc() backs off, every retry fails and kmem_zalloc_greedy()
loops forever.

----------------------------------------
void *
kmem_zalloc_greedy(size_t *size, size_t minsize, size_t maxsize)
{
        void            *ptr;
        size_t          kmsize = maxsize;

        while (!(ptr = vzalloc(kmsize))) {
                if ((kmsize >>= 1) <= minsize)
                        kmsize = minsize;
        }
        if (ptr)
                *size = kmsize;
        return ptr;
}
----------------------------------------

So, commit 5d17a73a2ebeb8d1 ("vmalloc: back off when the current task is
killed") in effect implemented a __GFP_KILLABLE flag and applied it
automatically to every vmalloc() caller. As a result, callers that are
not prepared for the allocation to fail upon SIGKILL are confused. ;-)
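
For illustration only, here is a userspace sketch of a bounded variant
of that loop: it still halves the request on failure, but returns NULL
once a request of minsize has failed too. The alloc_fn hook and the
names zalloc_greedy_bounded(), always_fail(), small_only() and
bounded_demo() are hypothetical stand-ins, not kernel interfaces, and
this is not a tested kernel patch. Callers would have to cope with a
NULL return.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical hook standing in for vzalloc(); may return NULL. */
typedef void *(*alloc_fn)(size_t);

/*
 * Bounded variant of the greedy loop: halve the request on failure,
 * clamp it to minsize, and give up once minsize itself has failed.
 */
void *
zalloc_greedy_bounded(alloc_fn alloc, size_t *size, size_t minsize,
                      size_t maxsize)
{
        size_t          kmsize = maxsize;
        void            *ptr;

        for (;;) {
                ptr = alloc(kmsize);
                if (ptr) {
                        *size = kmsize;
                        return ptr;
                }
                if (kmsize <= minsize)
                        return NULL;    /* caller must handle failure */
                kmsize >>= 1;
                if (kmsize < minsize)
                        kmsize = minsize;
        }
}

/* Stand-in that always fails, like an allocator that keeps backing off. */
void *always_fail(size_t n) { (void)n; return NULL; }

/* Stand-in that succeeds only for requests of at most 4096 bytes. */
void *small_only(size_t n) { return n <= 4096 ? calloc(1, n) : NULL; }

/* Usage: exercises both stand-ins; returns 0 if the asserts hold. */
int bounded_demo(void)
{
        size_t  got = 0;
        void    *p;

        /* Every attempt fails: the loop ends instead of spinning. */
        assert(zalloc_greedy_bounded(always_fail, &got, 1024, 65536) == NULL);

        /* 65536, 32768, 16384 and 8192 fail; 4096 succeeds. */
        p = zalloc_greedy_bounded(small_only, &got, 1024, 65536);
        assert(p != NULL && got == 4096);
        free(p);
        return 0;
}
```

With always_fail() the original loop would spin at kmsize == minsize
forever; the bounded variant gives up after the minsize attempt.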
