On 2019/8/8 4:34 PM, Anand Jain wrote:
>
>
>> On 8 Aug 2019, at 1:55 PM, Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
>>
>> [...]
>>>>>
>>>>> Fundamentally, logical address space has no relevance in the user
>>>>> context, so I also don't understand your view on how anyone shall
>>>>> use the range::start even if there is no check?
>>>>
>>>> range::start == bg_bytenr, range::len == bg_len to trim only a bg.
>>>>
>>>
>>> Thanks for the effort in explaining.
>>>
>>> My point is: it should not be one single bg, but rather all bg(s) in
>>> the specified range [start, length], so %range.start=0 and
>>> %range.length=<U64MAX/total_bytes> should trim all the bg(s).
>>
>> That's the common usage, but it doesn't mean it's the only usage.
>
> Oh, you are right. The man page doesn't restrict range.start to be
> within super_total_bytes. It's only generic/260 that tries to enforce
> that.
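For concreteness, the "trim only a bg" usage above is just the FITRIM
ioctl with the block group's logical bytenr and length filled in. Below
is a minimal sketch, not part of the original mail; the bg_bytenr and
bg_len values on the command line are assumed to come from an external
tool, and the program name is hypothetical.

/*
 * Sketch only: trim a single block group by passing its logical
 * bytenr and length to the FITRIM ioctl.
 *
 * Usage: ./trim-bg <mountpoint> <bg_bytenr> <bg_len>
 */
#include <fcntl.h>
#include <linux/fs.h>		/* FITRIM, struct fstrim_range */
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>

int main(int argc, char **argv)
{
	struct fstrim_range range;
	int fd;

	if (argc != 4) {
		fprintf(stderr, "usage: %s <mnt> <bg_bytenr> <bg_len>\n",
			argv[0]);
		return 1;
	}
	range.start = strtoull(argv[2], NULL, 0);	/* logical bytenr */
	range.len = strtoull(argv[3], NULL, 0);		/* bg length */
	range.minlen = 0;	/* kernel clamps to discard granularity */

	fd = open(argv[1], O_RDONLY);
	if (fd < 0) {
		perror("open");
		return 1;
	}
	if (ioctl(fd, FITRIM, &range) < 0) {
		perror("FITRIM");
		return 1;
	}
	/* On return the kernel updates range.len to the bytes trimmed. */
	printf("trimmed %llu bytes\n", (unsigned long long)range.len);
	return 0;
}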
This reminds me again about the test case update for generic/260.
It looks like my previous update has not been merged yet...

Thanks,
Qu

>
>> The above bg range trim is also a valid use case.
>>
>>> Maybe your next question is: as we relocate the chunks, how would
>>> the user ever know the correct range.start to use? I don't have an
>>> answer for that, and the same question applies to your proposal of
>>> range.start=[0 to U64MAX] as well.
>>>
>>> So I am asking you again: even if you allow range.start=[0 to
>>> U64MAX], how will the user use it? Can you please explain?
>>
>> There are a lot of tools to show the bytenr and usage of each bg.
>> It isn't a problem at all.
>>
>
> External tools sound better than some logic within the kernel to
> perform such a transformation. Now I get your idea. My bad.
>
> I am withdrawing this patch.
>
> Thanks, Anand
>
>>>
>>>
>>>> And that bg_bytenr is at 128T, since the fs has gone through
>>>> several balances. But there is only one device, and its size is
>>>> only 1T.
>>>>
>>>> Please tell me how to trim that block group only?
>>>>
>>>
>>> Block groups are something internal; users don't have to worry
>>> about them. A range of [0 to total_bytes] for start and [0 to
>>> U64MAX] for len is fair.
>>
>> Nope, users sometimes care, especially about the usage of each bg.
>>
>> Furthermore, we have the vusage/vrange filters for balance, so the
>> user is not shielded from the whole bg concept.
>>
>>>
>>>>>
>>>>> As per the man page it's OK to adjust the range internally, and
>>>>> as length can be up to U64MAX, can we still trim beyond
>>>>> super_total_bytes?
>>>>
>>>> As I said already, super_total_bytes makes no sense in the logical
>>>> address space.
>>>
>>> But super_total_bytes makes sense in user land, whereas the logical
>>> address space you are trying to expose to user land does not make
>>> sense to me.
>>
>> Nope, super_total_bytes in fact makes no sense in most cases.
>> It doesn't even show the upper limit of usable space. (E.g. with all
>> RAID1 profiles, it's only half the space at most. Even with all
>> SINGLE profiles, it doesn't account for the 1M reserved space.)
>>
>> It's a good thing for detecting device list corruption, but beyond
>> that, it really doesn't make much sense.
>>
>> For the logical address space, as explained, we have tools (not in
>> btrfs-progs though) and an interface (the balance vrange filter) to
>> make use of it.
>>
>>>
>>>> As super_total_bytes is just the sum of all devices, it's a device
>>>> layer thing, nothing to do with the logical address space.
>>>>
>>>> You're mixing logical bytenr with something that is not even a
>>>> device physical offset; how can that be correct?
>>>>
>>>> Let me make it more clear: btrfs has its own logical address
>>>> space, unrelated to whatever the device mappings are.
>>>> It's always [0, U64_MAX], no matter how many devices there are.
>>>>
>>>> If btrfs were implemented using dm, it would be more clear:
>>>>
>>>> (single device btrfs)
>>>>      |
>>>> (dm linear, 0 ~ U64_MAX, virtual devices) <- that's the logical address space
>>>>  |   |   |   |
>>>>  |   |   |   \- (dm raid1, 1T ~ 1T + 128M, devid1 XXX, devid2 XXX)
>>>>  |   |   \------ (dm raid0, 2T ~ 2T + 1G, devid1 XXX, devid2 XXX)
>>>>  |   \---------- (dm raid1, 128G ~ 128G + 128M, devid1 XXX, devid2 XXX)
>>>>  \-------------- (dm raid0, 1M ~ 1M + 1G, devid1 XXX, devid2 XXX)
>>>>
>>>> If we're trimming such an fs layout, you tell me which offset you
>>>> should use.
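The "tools to show the bytenr and usage of each bg" mentioned above
read this information straight from the filesystem trees. As a rough
illustration (a sketch under assumptions, not how any particular tool
is implemented; the program name is hypothetical), each block group's
logical bytenr and length can be listed with the BTRFS_IOC_TREE_SEARCH
ioctl by walking BLOCK_GROUP_ITEM keys in the extent tree:

/*
 * Sketch only: list block group logical bytenr/length pairs via
 * BTRFS_IOC_TREE_SEARCH (needs CAP_SYS_ADMIN). Error handling and
 * corner cases are trimmed for brevity.
 *
 * Usage: ./list-bgs <mountpoint>
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/btrfs.h>
#include <linux/btrfs_tree.h>

int main(int argc, char **argv)
{
	struct btrfs_ioctl_search_args args;
	struct btrfs_ioctl_search_key *sk = &args.key;
	int fd;

	if (argc != 2)
		return 1;
	fd = open(argv[1], O_RDONLY);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	memset(&args, 0, sizeof(args));
	/*
	 * BLOCK_GROUP_ITEMs live in the extent tree with
	 * key = (bg logical bytenr, BLOCK_GROUP_ITEM, bg length).
	 */
	sk->tree_id = BTRFS_EXTENT_TREE_OBJECTID;
	sk->min_type = BTRFS_BLOCK_GROUP_ITEM_KEY;
	sk->max_type = BTRFS_BLOCK_GROUP_ITEM_KEY;
	sk->max_objectid = (__u64)-1;
	sk->max_offset = (__u64)-1;
	sk->max_transid = (__u64)-1;
	sk->nr_items = 4096;

	while (ioctl(fd, BTRFS_IOC_TREE_SEARCH, &args) == 0 &&
	       sk->nr_items > 0) {
		struct btrfs_ioctl_search_header sh;
		unsigned long off = 0;
		unsigned int i;

		for (i = 0; i < sk->nr_items; i++) {
			memcpy(&sh, args.buf + off, sizeof(sh));
			/*
			 * The min/max key bounds are lexicographic, so
			 * other extent tree items slip through; keep
			 * only the block group items.
			 */
			if (sh.type == BTRFS_BLOCK_GROUP_ITEM_KEY)
				printf("bg bytenr=%llu len=%llu\n",
				       (unsigned long long)sh.objectid,
				       (unsigned long long)sh.offset);
			off += sizeof(sh) + sh.len;
		}
		/* Continue right after the last key we saw. */
		sk->min_objectid = sh.objectid;
		sk->min_offset = sh.offset + 1;
		sk->nr_items = 4096;
	}
	close(fd);
	return 0;
}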
>>>>
>>>
>>> There is no perfect solution. The nearest solution I can think of
>>> is to map range.start and range.len to the physical disk range,
>>> then search for and discard free space in that range.
>>
>> Nope, that's way worse than the current behavior.
>> See the above example: how would you pass the devid? The above case
>> uses RAID0 and RAID1 on two devices; how do you handle that?
>> Furthermore, btrfs can have different device sizes for RAID
>> profiles; how do you handle them? Using super total bytes would
>> easily exceed every device's boundary.
>>
>> Yes, the current behavior is not the perfect solution either, but
>> you're attacking from the wrong direction.
>> In fact, for allocated bgs, the current behavior is the best
>> solution: you can choose to trim any range, and you have tools like
>> Hans' python-btrfs.
>>
>> The not-so-perfect part is the unallocated range.
>> IIRC, things like thin-provisioned LVM choose not to trim the
>> unallocated part, while btrfs chooses to trim all of it.
>>
>> If you're arguing about how btrfs handles unallocated space, I have
>> no words to defend it at all. But for the logical address part? I
>> can't spare any more words.
>>
>> Thanks,
>> Qu
>>
>>> This idea may be OK for raid/linear profiles, but again, as btrfs
>>> can relocate the chunks, it's not perfect.
>>>
>>> Thanks, Anand
>>>
>>>
>>>> Thanks,
>>>> Qu
>>>>
>>>>>
>>>>> Thanks, Anand
>>>>>
>>>>>
>>>>>> Thanks,
>>>>>> Qu
>>>>>>
>>>>>>>
>>>>>>> The change log is also vague to me; it doesn't explain why you
>>>>>>> are re-adding that check.
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>>>
>>>>>>>> /*
>>>>>>>>  * NOTE: Don't truncate the range using super->total_bytes. Bytenr of
>>>>>>>> --
>>>>>>>> 2.21.0 (Apple Git-120)
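As a usage-level footnote to the thread: existing tools already accept
logical ranges directly, so no new kernel logic is needed for the
"trim a specific bg" case. The commands below are a sketch; the 128T
bg bytenr comes from the example in the thread, but the 1G length and
the /mnt mount point are assumptions, so double-check the syntax
against your util-linux and btrfs-progs versions.

# Trim only the block group at 128T (assumed 1G long); util-linux
# fstrim takes the logical offset/length directly:
fstrim --offset 140737488355328 --length 1073741824 /mnt

# Balance only the block groups overlapping that same logical range,
# via the vrange balance filter:
btrfs balance start -dvrange=140737488355328..140738562097152 /mnt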