Re: [ceph-users] pg inconsistent and repair doesn't work

2017-10-25 Thread Wei Jin
I found it is similar to this bug: http://tracker.ceph.com/issues/21388.
I fixed it with a rados command.
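
Roughly, the workaround looks like this (a sketch, not the exact commands; the data pool name `cephfs_data` is an assumption, and overwriting with an empty object only makes sense here because all three replicas are already zero size):

# assumption: the cephfs data pool is named cephfs_data -- adjust as needed
# rewriting the object regenerates its object info (oi), so the recorded
# size matches the 0-byte on-disk replicas again
rados -p cephfs_data \
  -N fsvolumens_87c46348-9869-11e7-8525-3497f65a8415 \
  put 103528d.0058 /dev/null

# then re-scrub the pg so the error is re-evaluated
ceph pg deep-scrub 1.fcd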

The pg inconsistency info looks like the following; I hope this can be fixed in a future release.

root@n10-075-019:/var/lib/ceph/osd/ceph-27/current/1.fcd_head# rados
list-inconsistent-obj 1.fcd --format=json-pretty
{
    "epoch": 2373,
    "inconsistents": [
        {
            "object": {
                "name": "103528d.0058",
                "nspace": "fsvolumens_87c46348-9869-11e7-8525-3497f65a8415",
                "locator": "",
                "snap": "head",
                "version": 147490
            },
            "errors": [],
            "union_shard_errors": [
                "size_mismatch_oi"
            ],
            "selected_object_info": "1:b3f61048:fsvolumens_87c46348-9869-11e7-8525-3497f65a8415::103528d.0058:head(2401'147490 client.901549.1:33749 dirty|omap_digest s 3461120 uv 147490 od  alloc_hint [0 0])",
            "shards": [
                {
                    "osd": 27,
                    "errors": [
                        "size_mismatch_oi"
                    ],
                    "size": 0,
                    "omap_digest": "0x",
                    "data_digest": "0x"
                },
                {
                    "osd": 62,
                    "errors": [
                        "size_mismatch_oi"
                    ],
                    "size": 0,
                    "omap_digest": "0x",
                    "data_digest": "0x"
                },
                {
                    "osd": 133,
                    "errors": [
                        "size_mismatch_oi"
                    ],
                    "size": 0,
                    "omap_digest": "0x",
                    "data_digest": "0x"
                }
            ]
        }
    ]
}
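
After the workaround, re-running a deep scrub and listing inconsistent objects again should come back clean:

ceph pg deep-scrub 1.fcd
# once the scrub has finished:
rados list-inconsistent-obj 1.fcd --format=json-pretty
ceph health detail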

On Wed, Oct 25, 2017 at 12:05 PM, Wei Jin wrote:
> Hi, list,
>
> We ran into a pg deep-scrub error and tried to repair it with `ceph pg
> repair <pgid>`, but that didn't work. We also verified the object files
> and found that all 3 replicas are zero size. What's the problem? Is it
> a bug? And how can we fix the inconsistency? I haven't restarted the
> osds so far, as I am not sure whether that would help.
>
> ceph version: 10.2.9
> use case: cephfs
> kernel client: 4.4/4.9

[ceph-users] pg inconsistent and repair doesn't work

2017-10-24 Thread Wei Jin
Hi, list,

We ran into a pg deep-scrub error and tried to repair it with `ceph pg
repair <pgid>`, but that didn't work. We also verified the object files
and found that all 3 replicas are zero size. What's the problem? Is it
a bug? And how can we fix the inconsistency? I haven't restarted the
osds so far, as I am not sure whether that would help.

ceph version: 10.2.9
use case: cephfs
kernel client: 4.4/4.9
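
For reference, the repair attempt was roughly the following (`1.fcd` is the pg reported inconsistent by `ceph health detail`):

# find the inconsistent pg
ceph health detail | grep inconsistent

# this is what did not clear the errors
ceph pg repair 1.fcd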

Error info from primary osd:

root@n10-075-019:~# grep -Hn 'ERR' /var/log/ceph/ceph-osd.27.log.1
/var/log/ceph/ceph-osd.27.log.1:3038:2017-10-25 04:47:34.460536
7f39c4829700 -1 log_channel(cluster) log [ERR] : 1.fcd shard 27: soid
1:b3f61048:fsvolumens_87c46348-9869-11e7-8525-3497f65a8415::103528d.0058:head
size 0 != size 3461120 from auth oi
1:b3f61048:fsvolumens_87c46348-9869-11e7-8525-3497f65a8415::103528d.0058:head(2401'147490
client.901549.1:33749 dirty|omap_digest s 3461120 uv 147490 od
 alloc_hint [0 0])
/var/log/ceph/ceph-osd.27.log.1:3039:2017-10-25 04:47:34.460722
7f39c4829700 -1 log_channel(cluster) log [ERR] : 1.fcd shard 62: soid
1:b3f61048:fsvolumens_87c46348-9869-11e7-8525-3497f65a8415::103528d.0058:head
size 0 != size 3461120 from auth oi
1:b3f61048:fsvolumens_87c46348-9869-11e7-8525-3497f65a8415::103528d.0058:head(2401'147490
client.901549.1:33749 dirty|omap_digest s 3461120 uv 147490 od
 alloc_hint [0 0])
/var/log/ceph/ceph-osd.27.log.1:3040:2017-10-25 04:47:34.460725
7f39c4829700 -1 log_channel(cluster) log [ERR] : 1.fcd shard 133: soid
1:b3f61048:fsvolumens_87c46348-9869-11e7-8525-3497f65a8415::103528d.0058:head
size 0 != size 3461120 from auth oi
1:b3f61048:fsvolumens_87c46348-9869-11e7-8525-3497f65a8415::103528d.0058:head(2401'147490
client.901549.1:33749 dirty|omap_digest s 3461120 uv 147490 od
 alloc_hint [0 0])
/var/log/ceph/ceph-osd.27.log.1:3041:2017-10-25 04:47:34.460800
7f39c4829700 -1 log_channel(cluster) log [ERR] : 1.fcd soid
1:b3f61048:fsvolumens_87c46348-9869-11e7-8525-3497f65a8415::103528d.0058:head:
failed to pick suitable auth object
/var/log/ceph/ceph-osd.27.log.1:3042:2017-10-25 04:47:34.461458
7f39c4829700 -1 log_channel(cluster) log [ERR] : deep-scrub 1.fcd
1:b3f61048:fsvolumens_87c46348-9869-11e7-8525-3497f65a8415::103528d.0058:head
on disk size (0) does not match object info size (3461120) adjusted
for ondisk to (3461120)
/var/log/ceph/ceph-osd.27.log.1:3043:2017-10-25 04:47:44.645934
7f39c4829700 -1 log_channel(cluster) log [ERR] : 1.fcd deep-scrub 4
errors


Object file info:

root@n10-075-019:/var/lib/ceph/osd/ceph-27/current/1.fcd_head# find .
-name "103528d.0058__head_12086FCD*"
./DIR_D/DIR_C/DIR_F/DIR_6/DIR_8/103528d.0058__head_12086FCD_fsvolumens\u87c46348-9869-11e7-8525-3497f65a8415_1
root@n10-075-019:/var/lib/ceph/osd/ceph-27/current/1.fcd_head# ls -al
./DIR_D/DIR_C/DIR_F/DIR_6/DIR_8/103528d.0058__head_12086FCD*
-rw-r--r-- 1 ceph ceph 0 Oct 24 22:04
./DIR_D/DIR_C/DIR_F/DIR_6/DIR_8/103528d.0058__head_12086FCD_fsvolumens\u87c46348-9869-11e7-8525-3497f65a8415_1
root@n10-075-019:/var/lib/ceph/osd/ceph-27/current/1.fcd_head#


root@n10-075-028:/var/lib/ceph/osd/ceph-62/current/1.fcd_head# find .
-name "103528d.0058__head_12086FCD*"
./DIR_D/DIR_C/DIR_F/DIR_6/DIR_8/103528d.0058__head_12086FCD_fsvolumens\u87c46348-9869-11e7-8525-3497f65a8415_1
root@n10-075-028:/var/lib/ceph/osd/ceph-62/current/1.fcd_head# ls -al
./DIR_D/DIR_C/DIR_F/DIR_6/DIR_8/103528d.0058__head_12086FCD*
-rw-r--r-- 1 ceph ceph 0 Oct 24 22:04
./DIR_D/DIR_C/DIR_F/DIR_6/DIR_8/103528d.0058__head_12086FCD_fsvolumens\u87c46348-9869-11e7-8525-3497f65a8415_1
root@n10-075-028:/var/lib/ceph/osd/ceph-62/current/1.fcd_head#


root@n10-075-040:/var/lib/ceph/osd/ceph-133/current/1.fcd_head# find .
-name "103528d.0058__head_12086FCD*"
./DIR_D/DIR_C/DIR_F/DIR_6/DIR_8/103528d.0058__head_12086FCD_fsvolumens\u87c46348-9869-11e7-8525-3497f65a8415_1
root@n10-075-040:/var/lib/ceph/osd/ceph-133/current/1.fcd_head# ls -al
./DIR_D/DIR_C/DIR_F/DIR_6/DIR_8/103528d.0058__head_12086FCD*
-rw-r--r-- 1 ceph ceph 0 Oct 24 22:04
./DIR_D/DIR_C/DIR_F/DIR_6/DIR_8/103528d.0058__head_12086FCD_fsvolumens\u87c46348-9869-11e7-8525-3497f65a8415_1
root@n10-075-040:/var/lib/ceph/osd/ceph-133/current/1.fcd_head#
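
In case it is useful for comparing the recorded object info with the on-disk size: on filestore, the encoded object_info_t lives in the `user.ceph._` xattr of the object file and can be dumped directly (assuming the xattrs have not spilled over into omap):

# dump the object file's xattrs; user.ceph._ holds the encoded object
# info whose size field (3461120) disagrees with the 0-byte file
getfattr -d -m - -e hex \
  "./DIR_D/DIR_C/DIR_F/DIR_6/DIR_8/103528d.0058__head_12086FCD_fsvolumens\u87c46348-9869-11e7-8525-3497f65a8415_1"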