On Mon, Apr 01, 2019 at 11:29:56AM +0300, Nikolay Borisov wrote:
> David reported that following the async cow path refactoring that got into 
> misc-next he started seeing premature OOM kills due to large page allocations 
> (order 4). This was due to now the whole async context (alongisde ancillary 
> structures, describing chunks) being allocated in a single kmalloc call. The 
> first patch of this series aims to fix that by converting the kmalloc call to 
> kvmalloc. This will have virtually no perfromance impact for small writes and 
> for larger ones will fallback to vmalloc. 
> 
> However, while testing the first patch I observed BUG_ONs in the async
> csum calculation path to trigger. This is clearly caused by the fact that 
> the first patch introduces a fairly large memory allocation. Nevertheless, 
> having large allocations in the csum path without a fallback to some sort of 
> order 0 page allocation or at least graceful handling of memory errors is 
> gross.
> Patch 2 aims to rectify this by switching the allocation in btrfs_csum_one_bio
> to using kvmalloc, giving greated chance for success. Even if patch 1 is 
> ignored
> nothing prevents the csum calculation to trigger the bug on depending on the 
> memory 
> usage/fragmentation of the server so IMO it's beneficial either ways.

For the record: the patches that introduced the preallocation + followup
cleanups have been removed from misc-next until we're sure that the
supposed change in IO submission path does not introduce further
surprises.

The potential performance impact is about to be evaluated. The error
not-handling currently depends on the kmalloc behaviour that does not
fail for small allocations, so current code is safer for users.

The preallocation provides a nicer way to handle errors early and fails
gracefully, unfortunately this is out of the safe zone of kmalloc so
this introduces a new problem.

The io submission path is probably the biggest and worst BUG_ON error
handling misuse, but also the hardest one to fix. We'll get there, but
as the prealloc patches showed we need to be careful.

Reply via email to