On 28.03.19 at 16:11, David Sterba wrote:
> On Thu, Mar 28, 2019 at 02:49:30PM +0200, Nikolay Borisov wrote:
>>
>>
>> On 27.03.19 at 19:23, David Sterba wrote:
>>> On Tue, Mar 12, 2019 at 05:20:24PM +0200, Nikolay Borisov wrote:
>>>> @@ -1190,45 +1201,71 @@ static int cow_file_range_async(struct inode *inode, struct page *locked_page,
>>>>  				unsigned int write_flags)
>>>>  {
>>>>  	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
>>>> -	struct async_cow *async_cow;
>>>> +	struct async_cow *ctx;
>>>> +	struct async_chunk *async_chunk;
>>>>  	unsigned long nr_pages;
>>>>  	u64 cur_end;
>>>> +	u64 num_chunks = DIV_ROUND_UP(end - start, SZ_512K);
>>>> +	int i;
>>>> +	bool should_compress;
>>>>
>>>>  	clear_extent_bit(&BTRFS_I(inode)->io_tree, start, end, EXTENT_LOCKED,
>>>>  			 1, 0, NULL);
>>>> -	while (start < end) {
>>>> -		async_cow = kmalloc(sizeof(*async_cow), GFP_NOFS);
>>>> -		BUG_ON(!async_cow); /* -ENOMEM */
>>>> +
>>>> +	if (BTRFS_I(inode)->flags & BTRFS_INODE_NOCOMPRESS &&
>>>> +	    !btrfs_test_opt(fs_info, FORCE_COMPRESS)) {
>>>> +		num_chunks = 1;
>>>> +		should_compress = false;
>>>> +	} else {
>>>> +		should_compress = true;
>>>> +	}
>>>> +
>>>> +	ctx = kmalloc(struct_size(ctx, chunks, num_chunks), GFP_NOFS);
>>>
>>> This leads to OOM due to the high-order allocation. And this is worse
>>> than the previous state, where there were many small allocations that
>>> could potentially fail (but most likely would not, due to GFP_NOFS and
>>> size < PAGE_SIZE).
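>>>
>>> To put rough numbers on it (assuming struct async_chunk ends up on the
>>> order of a hundred bytes, which is only an estimate): a 1GiB delalloc
>>> range gives DIV_ROUND_UP(1G, 512K) = 2048 chunks, so struct_size()
>>> lands around 200KiB. With 4KiB pages, anything above order 3 (32KiB)
>>> is a "costly" allocation that the page allocator may fail or stall on
>>> under memory pressure.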
>>>
>>> So this needs to be reworked to avoid the costly allocations or reverted
>>> to the previous state.
>>
>> Right, makes sense. To keep the submission logic simple, I could rework
>> the allocation into a loop that allocates a single item for every
>> chunk, or alternatively switch to kvmalloc. I think the fact that
>> vmalloc'ed memory might not be physically contiguous is not critical
>> for the metadata structures in this case?
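>>
>> For the kvmalloc variant, I imagine something like this (untested
>> sketch; note that kvmalloc() only falls back to vmalloc for
>> GFP_KERNEL-compatible flags, so the NOFS part would have to come from
>> the scoped memalloc_nofs_save() API rather than the gfp mask):
>>
>> 	unsigned int nofs_flag;
>>
>> 	/* kvmalloc's vmalloc fallback needs GFP_KERNEL, so enter a NOFS scope */
>> 	nofs_flag = memalloc_nofs_save();
>> 	ctx = kvmalloc(struct_size(ctx, chunks, num_chunks), GFP_KERNEL);
>> 	memalloc_nofs_restore(nofs_flag);
>> 	if (!ctx)
>> 		return -ENOMEM;
>>
>> with the matching frees switched to kvfree(ctx).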
>
> kvmalloc would work here, though there is some overhead involved
> compared to bare kmalloc (the mappings). As the call happens on the
> writeout path, I'd be cautious about unnecessary overhead.
>
> If looping over the range works, then we can allocate the largest size
> that does not require kvmalloc (PAGE_ALLOC_COSTLY_ORDER) and then reuse
> it for each iteration.

I'm afraid I don't understand your hybrid suggestion. I.e. something
along the lines of:

if (struct_size(ctx, chunks, num_chunks) <=
    (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) {
	use kmalloc
} else {
	????
}
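
Or do you mean to cap the allocation at that boundary and process the
range in multiple batches, reusing the same buffer? Something along the
lines of (untested, and the names are made up):

	size_t max_size = PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER;
	int max_chunks = (max_size - sizeof(*ctx)) / sizeof(ctx->chunks[0]);

	ctx = kmalloc(max_size, GFP_NOFS);
	...
	while (start < end) {
		/* carve out at most max_chunks 512K chunks from the range */
		/* and submit them, then reuse ctx for the next batch */
	}

Though in that case, wouldn't we have to wait for the workers from the
previous batch to finish before the memory can be reused?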