于 2014年04月09日 12:33, Anand Jain 写道:
A bit of background of btrfs fi show.
As such original btrfs fi show had too many problems since btrfs-progs
wasn't much consulting btrfs-kernel to determine various
mounted device status. This was a serious problem sometime back.
Various patches brings btrfs-progs to communicate with kernel
and to tell the world the exact status as seen by the kernel.
And avoid any btrfs status fabrication in the user land for the
mounted disks.
The old code was move into the option "-d" and newer which
would print status as per the kernel comes under "-m"
which stands for mounted.
The cli btrfs fi show (with no option) would use both -m
and user-land (lblkid) for unmounted btrfs to show various
disks status.
But here this patch is again bring the old problem and idea into life,
that is it fabricates "missing" at the userland for the mounted btrfs.
As shown in the below testcase it has bad cascading consequences.
This patch must be taken out.
If your worry is about xfstests/btrfs/003 fails, yes it does, we
don't have the right solution as of now to fix it.
Yes, I'm still worry about the xfstests testcase.
Personally, I think you should try to remove the disk remove test in
btrfs/003 first.
(Though I think it could be much harder to archieve it)
If xfstests devs can be persuaded and NOTE added to btrfs-progs document,
I'll be completely OK for the revert of this patch.
On 09/04/2014 11:26, Qu Wenruo wrote:
Yes, the deleted device scan is still one of the deep problems yet.
But my patch is not intented to deal anything related to the problem.
For me, I am *only* going to deal with the *exit code* problem,
'btrfs fi show <device>' executes correctly(OK, only part of it
exactlly),
but exit code is still 1, which is the bug I'm trying to fix it.
Looks out of context, am I missing something ?
Sorry for getting mixed with the patch dealing with the exit code.
(Previous patch I sent)
IMO, the users/admins may never be interested in the inside mechanism
nor algorithm,
but the output and exit value things.
user trust btrfs-progs to tell them whats going on in the btrfs kernel.
IMO user trust btrfs-progs to give them right info on btrfs, not care
nor aware of the relationship
between kernel.
there is no point in fabricating things (like missing) at the user land
when btrfs kernel isn't aware of it.
From the respect of normal users/admins, they does not cares how it is
done but the result.
So if user finds missing devices still output in 'btrfs fi show', they
will be confused.
Though I think any sane users/admins will call 'btrfs dev del' first,
so the problem still comes to the btrfs/003 test case.
This bug is much like the previous 'btrfs fi show' bugs that breaks some
xfstests test case
I guess you are talking about the btrfs/003 test case, think the
correct-way that a xfstest should have determined if the disk is
missing is to dump the btrfs_fs_devices -> btrfs_device and to check
if flag missing is set. xfstest trust btrfs-progs but btrfs-progs don't
do it.
If there is real test program which would check btrfs disk status from
the btrfs kernel memory then again this bug (missing flag is not set)
would exist.
This bug is not the dev delete but the error message due to the
unimplemented
btrfs ioctl things below.
(always showing a error due to the scan_kernel_v2 function which calls a
unimplemented ioctl interface),
The problem is its matching kernel patch is not integrated.
if some patch breaks the old exit code or output, then it should be
fixed to maintain the old
output/exit code (except some big decision is made to change it).
Yes but fix it in the right way - notify btrfs kernel about the
missing disks, which would/should set missing flag in the btrfs_device
list.
As I mentioned, this patch is just a workaround for testcase btrfs/003.
No mean to fix the whole things.
As mentioned at the beginning, I prefer to remove the device remove test
in testcase, then the patch can be reverted without any complain from me.
Thanks,
Qu.
So for the output/exit code consistence, it should be fixed even the
patch may not means a cure but
only a workaround for you.
There is no short cut. As shown your patch has the cascading problem.
When btrfs-kernel fixes this issue, And if David and Josef agrees a
patch on top of btrfs-devlist patch will help to fix this problem
once for all.
Thanks, Anand
Thanks,
Qu.
于 2014年04月09日 11:04, Anand Jain 写道:
Below shows the bug cascading to this patch.
And now to fix this I think we shouldn't fix/workaround in the
btrfs-progs again!, fix it in the btrfs-kernel (or leave it open
until suitable fix is found, I tried and failed. but don't fix it
in a wrong way). If you want to help to fix this problem: Find out
if we could get kobject notification with in kernel when disks gets
disappeared.
I have been advocating btrfs-progs should _not_ add its intelligence
and it should be as transparent as possible in showing the kernel's
status. This should be seriously considered.
(-----------
For patches to take this approach the core problem here is different
and hope we could correct it..
First, we have a superficial and wrong measuring tape (xfstest) and
we are trying to fix the product using it And in between is
btrfs-progs
which is trying to add more superficial-ness.
2nd, btrfs Wiki has a theory and thus sets the direction that
btrfs-progs would copy code from btrfs-kernel, I seriously doubt
if that's a good idea.
If you want to make btrfs-progs as intelligent as btrfs-kernel
(which I don't understand why you should ? since the purpose of
btrfs-progs and btrfs-kernel are different) then first you need
develop a mini synchronization mechanism between btrfs-progs and btrfs
kernel which is as good as two active nodes FS which says from my
experience with Solaris/SAM-QFS. Developing a synchronization
mechanism is not in the plan here. Further from the End user
Application (DB) performance perspective calling sync at the need of
something like btrfs-progs is a very very bad idea. Applications would
experience jitters in their steady state performance. Once Solaris had
this issue and we fixed it.
-----------)
Have fun. ;-)
----------------------------------------------------------------
$ btrfs dev scan
Scanning for Btrfs filesystems
$ mount /dev/sdc /btrfs
$ btrfs fi show
Label: none uuid: dfbf136d-e8d2-489b-8ee1-be0d5999769d
Total devices 2 FS bytes used 663.81MiB
devid 1 size 1.10GiB used 1.10GiB path /dev/sdf
devid 2 size 1.10GiB used 1.08GiB path /dev/sdc
$ devmgt show
host0 sda
host1 sdf
host2 sdc
host3 sdd
host4 sde
$ devmgt detach /dev/sdf
-----/dev/kmsg----
sd 1:0:0:0: [sdf] Stopping disk
SUBSYSTEM=scsi
DEVICE=+scsi:1:0:0:0
ata2.00: disabled
------------------
detach /dev/sdf successful
(as a known bug btrfs kernel does not know device is missing, missing
flag isn't set, as shown below)
$ btrfs-devlist
fsid name uuid (seed_fsid sprout_fsid)
(fs_latest_devid fs_num_devices fs_open_devices fs_rw_devices
fs_missing_devices fs_total_devices) fs_total_rw_bytes
fs_num_can_discard fs_latest_trans
devid gen total_bytes disk_total_bytes bytes_used type io_align
io_width sector_size fmode
fs_flags
dev_flags
dfbf136d-e8d2-489b-8ee1-be0d5999769d /dev/sdf
13715cc5-3aeb-4523-b02c-a072fd427a00 (null null)
(2 2 2 2 0 2) 2363490304 0 7
1 5 1181745152 1181745152 1181745152 0 4096 4096 4096 0x83
fs_Mounted|not_fs_Seeding|fs_Rotating
Writable|MD|not_Missing|not_Discard|not_Replace_tgt|not_Run_pending|not_Nobarriers|Stat_valid|Stat_dirty|Bdev
dfbf136d-e8d2-489b-8ee1-be0d5999769d /dev/sdc
12ad34f7-8d58-44fa-95cf-b2bbc0cec69d (null null)
(2 2 2 2 0 2) 2363490304 0 7
2 7 1181745152 1181745152 1160773632 0 4096 4096 4096 0x83
fs_Mounted|not_fs_Seeding|fs_Rotating
Writable|MD|not_Missing|not_Discard|not_Replace_tgt|not_Run_pending|not_Nobarriers|Stat_valid|Stat_dirty|Bdev
(below btrfs-progs patch added intelligence to tell the world that
device is missing)
Ref:
~~~~~~~
commit 2ae6a037efd52ae0fa30052d456ad07f074f5d54
Author: Qu Wenruo <quwen...@cn.fujitsu.com>
Date: Fri Feb 7 15:07:19 2014 +0800
btrfs-progs: Add missing devices check for mounted btrfs.
~~~~~~~
$ btrfs fi show
Label: none uuid: dfbf136d-e8d2-489b-8ee1-be0d5999769d
Total devices 2 FS bytes used 663.81MiB
devid 2 size 1.10GiB used 1.08GiB path /dev/sdc
*** Some devices missing
$ btrfs dev add /dev/sde /btrfs
$ btrfs fi show
Label: none uuid: dfbf136d-e8d2-489b-8ee1-be0d5999769d
Total devices 3 FS bytes used 663.81MiB
devid 2 size 1.10GiB used 1.08GiB path /dev/sdc
devid 3 size 1.04GiB used 0.00 path /dev/sde
*** Some devices missing
Now the bug is delete missing fails. Since kernel don't
understand whats missing.
$ btrfs dev del missing /btrfs
ERROR: error removing the device 'missing' - no missing devices found
to remove
$
---------------------------------------------------------------------------
On 02/10/2014 08:36 AM, Qu Wenruo wrote:
On Fri, 07 Feb 2014 17:34:46 +0800, Anand Jain wrote:
IMO btrfs-progs shouldn't add its intelligence to know if disk
is missing. If btrfs-kernel doesn't know when disk is missing
that's a bug to fix in btrfs-kernel. yes that indeed true as
of now in btrfs-kernel. btrfs kernel has no idea when disk
goes missing, just -EIO doesn't tell btrfs that. I am trying
to fix this first.
But the problem is there isn't good way with in btrfs/FS
to know when disk goes missing. did I miss anything ?
Yes, kernel detection is the best way.
But since it has no better way to detect missing device, I think the
btrfs-progs way fix is good enough for now.
Since btrfs fi show with "-d" options will scan the /dev to find fs
and
check missing disks,
I think adds some user-land check even using the ioctl way is still
somewhat reasonable.
Thanks
Qu
Thanks, Anand
On 02/07/2014 02:45 PM, Qu Wenruo wrote:
In btrfs/003 of xfstest, it will check whether btrfs fi show can
find
missing devices.
But before the patch, btrfs-progs will not check whether device
missing
if given a mounted btrfs mountpoint/block device.
This patch fixes the bug and will pass btrfs/003.
Signed-off-by: Qu Wenruo <quwen...@cn.fujitsu.com>
Cc: Anand Jain <anand.j...@oracle.com>
---
cmds-filesystem.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/cmds-filesystem.c b/cmds-filesystem.c
index 384d1b9..4c9933d 100644
--- a/cmds-filesystem.c
+++ b/cmds-filesystem.c
@@ -363,6 +363,8 @@ static int print_one_fs(struct
btrfs_ioctl_fs_info_args *fs_info,
char *label, char *path)
{
int i;
+ int fd;
+ int missing;
char uuidbuf[BTRFS_UUID_UNPARSED_SIZE];
struct btrfs_ioctl_dev_info_args *tmp_dev_info;
int ret;
@@ -385,6 +387,14 @@ static int print_one_fs(struct
btrfs_ioctl_fs_info_args *fs_info,
for (i = 0; i < fs_info->num_devices; i++) {
tmp_dev_info = (struct btrfs_ioctl_dev_info_args
*)&dev_info[i];
+
+ /* Add check for missing devices even mounted */
+ fd = open((char *)tmp_dev_info->path, O_RDONLY);
+ if (fd < 0) {
+ missing = 1;
+ continue;
+ }
+ close(fd);
printf("\tdevid %4llu size %s used %s path %s\n",
tmp_dev_info->devid,
pretty_size(tmp_dev_info->total_bytes),
@@ -392,6 +402,8 @@ static int print_one_fs(struct
btrfs_ioctl_fs_info_args *fs_info,
tmp_dev_info->path);
}
+ if (missing)
+ printf("\t*** Some devices missing\n");
printf("\n");
return 0;
}
--
To unsubscribe from this list: send the line "unsubscribe
linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe
linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe
linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html