On 2019/8/8 4:34 PM, Anand Jain wrote:
>
>
>> On 8 Aug 2019, at 1:55 PM, Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
>>
>> [...]
>>>>>
>>>>> Fundamentally, logical address space has no relevance in the user
>>>>> context, so I also don't understand your view on how anyone shall
>>>>> use the range::start even if there is no check?
>>>>
>>>> range::start == bg_bytenr, range::len == bg_len to trim only a bg.
>>>>
>>>
>>> Thanks for the effort in explaining.
>>>
>>> My point is: it should not be one single bg, but rather all bg(s) in
>>> the specified range [start, length], so %range.start=0 and
>>> %range.length=<U64MAX/total_bytes> should trim all the bg(s).
>>
>> That's the common usage, but it doesn't mean it's the only usage.
>
> Oh, you are right. The man page doesn't restrict range.start to be
> within super_total_bytes. It's only generic/260 that tries to enforce
> that.
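For concreteness, the "trim only a bg" usage above is just the FITRIM
ioctl with the block group's logical bytenr and length filled in. Below
is a minimal sketch, not part of the original mail; the bg_bytenr and
bg_len values on the command line are assumed to come from an external
tool, and the program name is hypothetical.

/*
 * Sketch only: trim a single block group by passing its logical
 * bytenr and length to the FITRIM ioctl.
 *
 * Usage: ./trim-bg <mountpoint> <bg_bytenr> <bg_len>
 */
#include <fcntl.h>
#include <linux/fs.h>		/* FITRIM, struct fstrim_range */
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>

int main(int argc, char **argv)
{
	struct fstrim_range range;
	int fd;

	if (argc != 4) {
		fprintf(stderr, "usage: %s <mnt> <bg_bytenr> <bg_len>\n",
			argv[0]);
		return 1;
	}
	range.start = strtoull(argv[2], NULL, 0);	/* logical bytenr */
	range.len = strtoull(argv[3], NULL, 0);		/* bg length */
	range.minlen = 0;	/* kernel clamps to discard granularity */

	fd = open(argv[1], O_RDONLY);
	if (fd < 0) {
		perror("open");
		return 1;
	}
	if (ioctl(fd, FITRIM, &range) < 0) {
		perror("FITRIM");
		return 1;
	}
	/* On return the kernel updates range.len to the bytes trimmed. */
	printf("trimmed %llu bytes\n", (unsigned long long)range.len);
	return 0;
}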
This reminds me again about the test case update for generic/260.
It looks like my previous update has not been merged yet...

Thanks,
Qu

>
>> The above bg range trim is also a valid use case.
>>
>>> Maybe your next question is: as we relocate the chunks, how would
>>> the user ever know the correct range.start to use? I don't have an
>>> answer for that, and the same question applies to your proposal of
>>> range.start=[0 to U64MAX] as well.
>>>
>>> So I am asking you again: even if you allow range.start=[0 to
>>> U64MAX], how will the user use it? Can you please explain?
>>
>> There are a lot of tools to show the bytenr and usage of each bg.
>> It isn't a problem at all.
>>
>
> External tools sound better than some logic within the kernel to
> perform such a transformation. Now I get your idea. My bad.
>
> I am withdrawing this patch.
>
> Thanks, Anand
>
>>>
>>>
>>>> And that bg_bytenr is at 128T, since the fs has gone through
>>>> several balances. But there is only one device, and its size is
>>>> only 1T.
>>>>
>>>> Please tell me how to trim that block group only?
>>>>
>>>
>>> Block groups are something internal; users don't have to worry
>>> about them. A range of [0 to total_bytes] for start and [0 to
>>> U64MAX] for len is fair.
>>
>> Nope, users sometimes care, especially about the usage of each bg.
>>
>> Furthermore, we have the vusage/vrange filters for balance, so the
>> user is not shielded from the whole bg concept.
>>
>>>
>>>>>
>>>>> As per the man page it's OK to adjust the range internally, and
>>>>> as length can be up to U64MAX, can we still trim beyond
>>>>> super_total_bytes?
>>>>
>>>> As I said already, super_total_bytes makes no sense in the logical
>>>> address space.
>>>
>>> But super_total_bytes makes sense in user land, whereas the logical
>>> address space you are trying to expose to user land does not make
>>> sense to me.
>>
>> Nope, super_total_bytes in fact makes no sense in most cases.
>> It doesn't even show the upper limit of usable space. (E.g. with all
>> RAID1 profiles, it's only half the space at most. Even with all
>> SINGLE profiles, it doesn't account for the 1M reserved space.)
>>
>> It's a good thing for detecting device list corruption, but beyond
>> that, it really doesn't make much sense.
>>
>> For the logical address space, as explained, we have tools (not in
>> btrfs-progs though) and an interface (the balance vrange filter) to
>> make use of it.
>>
>>>
>>>> As super_total_bytes is just the sum of all devices, it's a device
>>>> layer thing, nothing to do with the logical address space.
>>>>
>>>> You're mixing logical bytenr with something that is not even a
>>>> device physical offset; how can that be correct?
>>>>
>>>> Let me make it more clear: btrfs has its own logical address
>>>> space, unrelated to whatever the device mappings are.
>>>> It's always [0, U64_MAX], no matter how many devices there are.
>>>>
>>>> If btrfs were implemented using dm, it would be more clear:
>>>>
>>>> (single device btrfs)
>>>>      |
>>>> (dm linear, 0 ~ U64_MAX, virtual devices) <- that's the logical address space
>>>>  |   |   |   |
>>>>  |   |   |   \- (dm raid1, 1T ~ 1T + 128M, devid1 XXX, devid2 XXX)
>>>>  |   |   \------ (dm raid0, 2T ~ 2T + 1G, devid1 XXX, devid2 XXX)
>>>>  |   \---------- (dm raid1, 128G ~ 128G + 128M, devid1 XXX, devid2 XXX)
>>>>  \-------------- (dm raid0, 1M ~ 1M + 1G, devid1 XXX, devid2 XXX)
>>>>
>>>> If we're trimming such an fs layout, you tell me which offset you
>>>> should use.
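The "tools to show the bytenr and usage of each bg" mentioned above
read this information straight from the filesystem trees. As a rough
illustration (a sketch under assumptions, not how any particular tool
is implemented; the program name is hypothetical), each block group's
logical bytenr and length can be listed with the BTRFS_IOC_TREE_SEARCH
ioctl by walking BLOCK_GROUP_ITEM keys in the extent tree:

/*
 * Sketch only: list block group logical bytenr/length pairs via
 * BTRFS_IOC_TREE_SEARCH (needs CAP_SYS_ADMIN). Error handling and
 * corner cases are trimmed for brevity.
 *
 * Usage: ./list-bgs <mountpoint>
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/btrfs.h>
#include <linux/btrfs_tree.h>

int main(int argc, char **argv)
{
	struct btrfs_ioctl_search_args args;
	struct btrfs_ioctl_search_key *sk = &args.key;
	int fd;

	if (argc != 2)
		return 1;
	fd = open(argv[1], O_RDONLY);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	memset(&args, 0, sizeof(args));
	/*
	 * BLOCK_GROUP_ITEMs live in the extent tree with
	 * key = (bg logical bytenr, BLOCK_GROUP_ITEM, bg length).
	 */
	sk->tree_id = BTRFS_EXTENT_TREE_OBJECTID;
	sk->min_type = BTRFS_BLOCK_GROUP_ITEM_KEY;
	sk->max_type = BTRFS_BLOCK_GROUP_ITEM_KEY;
	sk->max_objectid = (__u64)-1;
	sk->max_offset = (__u64)-1;
	sk->max_transid = (__u64)-1;
	sk->nr_items = 4096;

	while (ioctl(fd, BTRFS_IOC_TREE_SEARCH, &args) == 0 &&
	       sk->nr_items > 0) {
		struct btrfs_ioctl_search_header sh;
		unsigned long off = 0;
		unsigned int i;

		for (i = 0; i < sk->nr_items; i++) {
			memcpy(&sh, args.buf + off, sizeof(sh));
			/*
			 * The min/max key bounds are lexicographic, so
			 * other extent tree items slip through; keep
			 * only the block group items.
			 */
			if (sh.type == BTRFS_BLOCK_GROUP_ITEM_KEY)
				printf("bg bytenr=%llu len=%llu\n",
				       (unsigned long long)sh.objectid,
				       (unsigned long long)sh.offset);
			off += sizeof(sh) + sh.len;
		}
		/* Continue right after the last key we saw. */
		sk->min_objectid = sh.objectid;
		sk->min_offset = sh.offset + 1;
		sk->nr_items = 4096;
	}
	close(fd);
	return 0;
}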
>>>>
>>>
>>> There is no perfect solution. The nearest solution I can think of
>>> is to map range.start and range.len to the physical disk range,
>>> then search for and discard free space in that range.
>>
>> Nope, that's way worse than the current behavior.
>> See the above example: how would you pass the devid? The above case
>> uses RAID0 and RAID1 on two devices; how do you handle that?
>> Furthermore, btrfs can have different device sizes for RAID
>> profiles; how do you handle them? Using super total bytes would
>> easily exceed every device's boundary.
>>
>> Yes, the current behavior is not the perfect solution either, but
>> you're attacking from the wrong direction.
>> In fact, for allocated bgs, the current behavior is the best
>> solution: you can choose to trim any range, and you have tools like
>> Hans' python-btrfs.
>>
>> The not-so-perfect part is the unallocated range.
>> IIRC, things like thin-provisioned LVM choose not to trim the
>> unallocated part, while btrfs chooses to trim all of it.
>>
>> If you're arguing about how btrfs handles unallocated space, I have
>> no words to defend it at all. But for the logical address part? I
>> can't spare any more words.
>>
>> Thanks,
>> Qu
>>
>>> This idea may be OK for raid/linear profiles, but again, as btrfs
>>> can relocate the chunks, it's not perfect.
>>>
>>> Thanks, Anand
>>>
>>>
>>>> Thanks,
>>>> Qu
>>>>
>>>>>
>>>>> Thanks, Anand
>>>>>
>>>>>
>>>>>> Thanks,
>>>>>> Qu
>>>>>>
>>>>>>>
>>>>>>> The change log is also vague to me; it doesn't explain why you
>>>>>>> are re-adding that check.
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>>>
>>>>>>>> /*
>>>>>>>>  * NOTE: Don't truncate the range using super->total_bytes. Bytenr of
>>>>>>>> --
>>>>>>>> 2.21.0 (Apple Git-120)
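As a usage-level footnote to the thread: existing tools already accept
logical ranges directly, so no new kernel logic is needed for the
"trim a specific bg" case. The commands below are a sketch; the 128T
bg bytenr comes from the example in the thread, but the 1G length and
the /mnt mount point are assumptions, so double-check the syntax
against your util-linux and btrfs-progs versions.

# Trim only the block group at 128T (assumed 1G long); util-linux
# fstrim takes the logical offset/length directly:
fstrim --offset 140737488355328 --length 1073741824 /mnt

# Balance only the block groups overlapping that same logical range,
# via the vrange balance filter:
btrfs balance start -dvrange=140737488355328..140738562097152 /mnt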