On 12/9/19 12:37 AM, David Sterba wrote:
On Mon, Aug 26, 2019 at 05:04:36PM +0800, Anand Jain wrote:
Function call chain __btrfs_map_block()->find_live_mirror() uses
thread pid to determine the %mirror_num when the mirror_num=0.
This patch introduces a framework so that we can add policies to determine
the %mirror_num. And also adds the devid as the readmirror policy.
The new property is stored as an item in the device tree as show below.
(BTRFS_READMIRROR_OBJECTID, BTRFS_PERSISTENT_ITEM_KEY, devid)
To be able to set and get this new property also introduces new ioctls
BTRFS_IOC_GET_READMIRROR and BTRFS_IOC_SET_READMIRROR. The ioctl argument
is defined as
struct btrfs_ioctl_readmirror_args {
__u64 type; /* RW */
__u64 device_bitmap; /* RW */
}
I don't remember if there was a suggestion to use ioctls for read
mirror, but the property interface should be sufficient. Besides this
ioctl interafce is quite an anti-pattern: narrow use, non-extensible
structure, alternative and more convenient interface already available.
Extended attribute interface f(get/set)attr is inode bound, but
the readmirror property is filesystem bound. For the readmirror we
can still use the extended attribute, but it might be considered as
abuse which we haven't done so far, here below [1] is the list of
property with the interface it uses and where the property is saved.
[1]
property interface save-as
ro ioctl root-item
label ioctl super-block
compression xattr xattr
v1:readmirror xattr xattr
v2:readmirror ioctl dev-tree-item
You are asking for the combination of
property interface save-as
readmirror xattr dev-tree-item
I can give a try.
An usage example as follows:
btrfs property set /btrfs readmirror devid:1,3
btrfs property get /btrfs readmirror
readmirror devid:1 3
btrfs property set /btrfs readmirror ""
btrfs property get /btrfs readmirror
readmirror default
This patchset has been tested completely, however marked as RFC for the
following reasons and comments on them (or any other) are appreciated as
usual.
. The new objectid is defined as
#define BTRFS_READMIRROR_OBJECTID -1ULL
Need consent we are fine to use this value, and with this value it
shall be placed just before the DEV_STATS_OBJECTID item which is more
frequently used only during the device errors.
-1 can be considered a special value in other cases, not necessarily
here but if the ordering of items is the only reason I'd say no. The
keys/items will most likely land in the same node so there's no point
forcing the order.
Agreed. Any suggestion on the value for the BTRFS_READMIRROR_OBJECTID.
. I am using a u64 bitmap to represent the devices id, so the max device
id that we could represent is 63, its a kind of limitation which should
be addressed before integration, I wonder if there is any suggestion?
Uh 63, so now an implementation detail is posing a global limit? That
sounds backwards.
Yes. And I was thinking of u64 array but that doesn't scale as well.
Anyways I have in the list to try using xattr interface which may
address this issue.
Kindly note that, multiple ioctls with each time representing a set of
device(s) is not a choice because we need to make sure the readmirror
changes happens in a commit transaction.
I believe this can be guaranteed by the properties interface, ie. single
value gets processed at once and with some locking around the state of
devices can be updated atomically.
The open question is still how to store the readmirror property
per-device, it could be either an item or bit inside the device
structure.
We discussed that before here.
https://lore.kernel.org/linux-btrfs/8d31c3a2-3fb0-63af-3173-948ed0e84...@oracle.com/
Besides the interface, I'm kind of missing the usecase description what
is expected from the read mirror policies and how they could affect
various scenarios. Maybe it was in some of the previous iterations, it's
hard too track everything so this should be part of the cover letter (or
at leat linked if it's too much text).
Yep. My mistake I missed it link it. Sorry about that.
Thanks, Anand