[ceph-users] Random health OSD_SCRUB_ERRORS on various OSDs, after pg repair back to HEALTH_OK

2018-02-28 Thread Marco Baldini - H.S. Amiata

Hello

I have a small Ceph cluster with 3 nodes, each with 3x 1TB HDDs and 
1x 240GB SSD. I created this cluster after the Luminous release, so all 
OSDs are BlueStore. In my CRUSH map I have two rules, one targeting the 
SSDs and one targeting the HDDs. I have 4 pools, one using the SSD rule 
and the others using the HDD rule; three pools are size=3 min_size=2, 
and one is size=2 min_size=1 (this one holds content that is OK to lose).
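For context, rules and pools like these can be created on Luminous roughly 
as follows. This is only a sketch with placeholder pool names, rule names 
and PG counts, and it assumes the OSDs already carry the automatic hdd/ssd 
device classes:

# One replicated CRUSH rule per device class
ceph osd crush rule create-replicated hdd_rule default host hdd
ceph osd crush rule create-replicated ssd_rule default host ssd

# Example pool on the HDD rule with size=3 min_size=2 (names/PG counts are placeholders)
ceph osd pool create rbd_hdd 128 128 replicated hdd_rule
ceph osd pool set rbd_hdd size 3
ceph osd pool set rbd_hdd min_size 2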


For the last 3 months I have been seeing a strange, random problem. I 
scheduled my OSD scrubs during the night (osd scrub begin hour = 20, osd 
scrub end hour = 7), when the office is closed, so the impact on users is 
low. Some mornings, when I check the cluster health, I find:


HEALTH_ERR X scrub errors; Possible data damage: Y pgs inconsistent
OSD_SCRUB_ERRORS X scrub errors
PG_DAMAGED Possible data damage: Y pg inconsistent

X and Y are sometimes 1, sometimes 2.

I issue a ceph health detail, check the damaged PGs, and run a ceph pg 
repair for each damaged PG; I get


instructing pg PG on osd.N to repair
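The procedure could even be scripted; a minimal sketch, assuming the 
ceph health detail output format shown below:

# Find the PGs flagged inconsistent and ask their primary OSD to repair them
ceph health detail | awk '$1 == "pg" && /inconsistent/ {print $2}' | while read -r pgid; do
    echo "Repairing pg $pgid"
    ceph pg repair "$pgid"
done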

The PG is different each time, the OSD that has to repair the PG is 
different, and even the node hosting the OSD is different; I made a list 
of all the PGs and OSDs. This morning is the most recent case:



ceph health detail

HEALTH_ERR 2 scrub errors; Possible data damage: 2 pgs inconsistent
OSD_SCRUB_ERRORS 2 scrub errors
PG_DAMAGED Possible data damage: 2 pgs inconsistent
pg 13.65 is active+clean+inconsistent, acting [4,2,6]
pg 14.31 is active+clean+inconsistent, acting [8,3,1]


ceph pg repair 13.65

instructing pg 13.65 on osd.4 to repair

(node-2)> tail /var/log/ceph/ceph-osd.4.log
2018-02-28 08:38:47.593447 7f112cf76700  0 log_channel(cluster) log [DBG] : 
13.65 repair starts
2018-02-28 08:39:37.573342 7f112cf76700  0 log_channel(cluster) log [DBG] : 
13.65 repair ok, 0 fixed


ceph pg repair 14.31

instructing pg 14.31 on osd.8 to repair

(node-3)> tail /var/log/ceph/ceph-osd.8.log
2018-02-28 08:52:37.297490 7f4dd0816700  0 log_channel(cluster) log [DBG] : 
14.31 repair starts
2018-02-28 08:53:00.704020 7f4dd0816700  0 log_channel(cluster) log [DBG] : 
14.31 repair ok, 0 fixed


Here is the list of when I got OSD_SCRUB_ERRORS, which PG was inconsistent, 
and which OSD had to repair it. Dates are dd/mm/yyyy.


21/12/2017   --  pg 14.29 is active+clean+inconsistent, acting [6,2,4]

18/01/2018   --  pg 14.5a is active+clean+inconsistent, acting [6,4,1]

22/01/2018   --  pg 9.3a is active+clean+inconsistent, acting [2,7]

29/01/2018   --  pg 13.3e is active+clean+inconsistent, acting [4,6,1]
 instructing pg 13.3e on osd.4 to repair

07/02/2018   --  pg 13.7e is active+clean+inconsistent, acting [8,2,5]
 instructing pg 13.7e on osd.8 to repair

09/02/2018   --  pg 13.30 is active+clean+inconsistent, acting [7,3,2]
 instructing pg 13.30 on osd.7 to repair

15/02/2018   --  pg 9.35 is active+clean+inconsistent, acting [1,8]
 instructing pg 9.35 on osd.1 to repair

 pg 13.3e is active+clean+inconsistent, acting [4,6,1]
 instructing pg 13.3e on osd.4 to repair

17/02/2018   --  pg 9.2d is active+clean+inconsistent, acting [7,5]
 instructing pg 9.2d on osd.7 to repair

22/02/2018   --  pg 9.24 is active+clean+inconsistent, acting [5,8]
 instructing pg 9.24 on osd.5 to repair

28/02/2018   --  pg 13.65 is active+clean+inconsistent, acting [4,2,6]
 instructing pg 13.65 on osd.4 to repair

 pg 14.31 is active+clean+inconsistent, acting [8,3,1]
 instructing pg 14.31 on osd.8 to repair



In case it is useful, my ceph.conf is here:

[global]
auth client required = none
auth cluster required = none
auth service required = none
fsid = 24d5d6bc-0943-4345-b44e-46c19099004b
cluster network = 10.10.10.0/24
public network = 10.10.10.0/24
keyring = /etc/pve/priv/$cluster.$name.keyring
mon allow pool delete = true
osd journal size = 5120
osd pool default min size = 2
osd pool default size = 3
bluestore_block_db_size = 64424509440

debug asok = 0/0
debug auth = 0/0
debug buffer = 0/0
debug client = 0/0
debug context = 0/0
debug crush = 0/0
debug filer = 0/0
debug filestore = 0/0
debug finisher = 0/0
debug heartbeatmap = 0/0
debug journal = 0/0
debug journaler = 0/0
debug lockdep = 0/0
debug mds = 0/0
debug mds balancer = 0/0
debug mds locker = 0/0
debug mds log = 0/0
debug mds log expire = 0/0
debug mds migrator = 0/0
debug mon = 0/0
debug monc = 0/0
debug ms = 0/0
debug objclass = 0/0
debug objectcacher = 0/0
debug objecter = 0/0
debug optracker = 0/0
debug osd = 0/0
debug paxos = 0/0
debug perfcounter = 0/0
debug rados = 0/0
debug rbd = 0/0
debug rgw = 0/0
debug throttle = 0/0
debug timer = 0/0
debug tp = 0/0


[osd]
keyring = /var/lib/ceph/osd/ceph-$id/keyring
osd max backfills = 1
osd recovery max active = 1

osd scrub begin hour = 20
osd scrub end hour = 7
osd scrub during recovery = false
osd scrub load threshold = 0.3

[client]
rbd cache = true
rbd cache size = 268435456  # 256 MiB

Re: [ceph-users] Random health OSD_SCRUB_ERRORS on various OSDs, after pg repair back to HEALTH_OK

2018-02-28 Thread Paul Emmerich
Hi,

might be http://tracker.ceph.com/issues/22464

Can you check the OSD log file to see if the reported checksum is 0x6706be76?
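For example, something like this on each node (a rough sketch; the path 
assumes the default log location used elsewhere in this thread):

# Search current and rotated OSD logs for the suspicious checksum
grep -l '0x6706be76' /var/log/ceph/ceph-osd.*.log
zgrep -l '0x6706be76' /var/log/ceph/ceph-osd.*.log.*.gz 2>/dev/null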


Paul


Re: [ceph-users] Random health OSD_SCRUB_ERRORS on various OSDs, after pg repair back to HEALTH_OK

2018-02-28 Thread Marco Baldini - H.S. Amiata

Hi

I read the bug tracker issue and it looks a lot like my problem, even if 
I can't check the reported checksum because it does not appear in my logs; 
perhaps that is because of debug osd = 0/0 in ceph.conf.


I just raised the OSD log level

ceph tell osd.* injectargs --debug-osd 5/5

I'll check the OSD logs in the coming days...
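For reference, a sketch of how the level can be dropped again afterwards, 
or made persistent, assuming the same injectargs / ceph.conf mechanism:

# Drop the OSD debug level back down once the investigation is done
ceph tell osd.* injectargs --debug-osd 0/0

# Or make it persistent across OSD restarts by adding it to the [osd] section of ceph.conf:
#   debug osd = 5/5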

Thanks




Re: [ceph-users] Random health OSD_SCRUB_ERRORS on various OSDs, after pg repair back to HEALTH_OK

2018-03-05 Thread Marco Baldini - H.S. Amiata

Hi

After some days with debug_osd 5/5, I found [ERR] entries on different 
days, in different PGs, on different OSDs, and on different hosts. This is 
what I get in the OSD logs:


*OSD.5 (host 3)*
2018-03-01 20:30:02.702269 7fdf4d515700  2 osd.5 pg_epoch: 16486 pg[9.1c( v 
16486'51798 (16431'50251,16486'51798] local-lis/les=16474/16475 n=3629 
ec=1477/1477 lis/c 16474/16474 les/c/f 16475/16477/0 16474/16474/16474) [5,6] 
r=0 lpr=16474 crt=16486'51798 lcod 16486'51797 mlcod 16486'51797 
active+clean+scrubbing+deep] 9.1c shard 6: soid 
9:3b157c56:::rbd_data.1526386b8b4567.1761:head candidate had a read 
error
2018-03-01 20:30:02.702278 7fdf4d515700 -1 log_channel(cluster) log [ERR] : 
9.1c shard 6: soid 9:3b157c56:::rbd_data.1526386b8b4567.1761:head 
candidate had a read error

*OSD.4 (host 3)*
2018-02-28 00:03:33.458558 7f112cf76700 -1 log_channel(cluster) log [ERR] : 
13.65 shard 2: soid 13:a719ecdf:::rbd_data.5f65056b8b4567.f8eb:head 
candidate had a read error

*OSD.8 (host 2)*
2018-02-27 23:55:15.100084 7f4dd0816700 -1 log_channel(cluster) log [ERR] : 
14.31 shard 1: soid 14:8cc6cd37:::rbd_data.30b15b6b8b4567.81a1:head 
candidate had a read error

I don't know what this error means, and as always a ceph pg repair 
fixes it. I don't think this is normal.
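To see how often this happens, the OSD logs on each node can be grepped 
for it; a simple sketch (default log path assumed):

# On each node: count how many read errors each OSD has logged
grep -c 'candidate had a read error' /var/log/ceph/ceph-osd.*.log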


Ideas?

Thanks



Re: [ceph-users] Random health OSD_SCRUB_ERRORS on various OSDs, after pg repair back to HEALTH_OK

2018-03-05 Thread Paul Emmerich
Hi,

yeah, the cluster that I'm seeing this on also has only one host that
reports that specific checksum. Two other hosts only report the same error
that you are seeing.

Could you post to the tracker issue that you are also seeing this?

Paul


Re: [ceph-users] Random health OSD_SCRUB_ERRORS on various OSDs, after pg repair back to HEALTH_OK

2018-03-05 Thread Vladimir Prokofev
> candidate had a read error
speaks for itself: while scrubbing, the OSD couldn't read the data.
I had a similar issue, and it was just an OSD dying (errors and reallocated
sectors in SMART); I simply replaced the disk. But in your case it seems that
the errors are on different OSDs? Are your OSDs all healthy?
You can use this command to see some details:
rados list-inconsistent-obj <pg.id> --format=json-pretty
where <pg.id> is the PG that is reported as inconsistent. My guess is that
you'll see read errors in this output, along with the number of the OSD that
hit the error. After that you have to check that OSD's health: SMART details,
etc.
It is not always the disk itself that causes the problem; for example, we had
read errors because of a faulty backplane interface in a server, and changing
the chassis resolved the issue.
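If you do not have the PG id handy, the inconsistent PGs of a pool can also
be listed directly (a sketch; the pool name is a placeholder):

# List the PGs currently flagged inconsistent in a given pool
rados list-inconsistent-pg <pool-name>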



Re: [ceph-users] Random health OSD_SCRUB_ERRORS on various OSDs, after pg repair back to HEALTH_OK

2018-03-05 Thread Marco Baldini - H.S. Amiata

Hi

I just posted my logs and my issue to the Ceph tracker.

Let's hope this gets fixed.

Thanks



Re: [ceph-users] Random health OSD_SCRUB_ERRORS on various OSDs, after pg repair back to HEALTH_OK

2018-03-05 Thread Marco Baldini - H.S. Amiata

Hi, and thanks for the reply

The OSDs are all healthy; in fact, after a ceph pg repair <pg.id> the 
cluster health is back to OK, and in the OSD log I see "repair ok, 0 fixed".


The SMART data of the 3 OSDs seems fine:

*OSD.5*

# ceph-disk list | grep osd.5
 /dev/sdd1 ceph data, active, cluster ceph, osd.5, block /dev/sdd2

# smartctl -a /dev/sdd
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.13.13-6-pve] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda 7200.14 (AF)
Device Model: ST1000DM003-1SB10C
Serial Number:Z9A1MA1V
LU WWN Device Id: 5 000c50 090c7028b
Firmware Version: CC43
User Capacity:1,000,204,886,016 bytes [1.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate:7200 rpm
Form Factor:  3.5 inches
Device is:In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:Mon Mar  5 16:17:22 2018 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status:  (   0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection:(0) seconds.
Offline data collection
capabilities:(0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off 
support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities:(0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability:(0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time:(   1) minutes.
Extended self-test routine
recommended polling time:( 109) minutes.
Conveyance self-test routine
recommended polling time:(   2) minutes.
SCT capabilities:  (0x1085) SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME  FLAG VALUE WORST THRESH TYPE  UPDATED  
WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate 0x000f   082   063   006Pre-fail  Always   
-   193297722
  3 Spin_Up_Time0x0003   097   097   000Pre-fail  Always   
-   0
  4 Start_Stop_Count0x0032   100   100   020Old_age   Always   
-   60
  5 Reallocated_Sector_Ct   0x0033   100   100   010Pre-fail  Always   
-   0
  7 Seek_Error_Rate 0x000f   091   060   045Pre-fail  Always   
-   1451132477
  9 Power_On_Hours  0x0032   085   085   000Old_age   Always   
-   13283
 10 Spin_Retry_Count0x0013   100   100   097Pre-fail  Always   
-   0
 12 Power_Cycle_Count   0x0032   100   100   020Old_age   Always   
-   61
183 Runtime_Bad_Block   0x0032   100   100   000Old_age   Always   
-   0
184 End-to-End_Error0x0032   100   100   099Old_age   Always   
-   0
187 Reported_Uncorrect  0x0032   100   100   000Old_age   Always   
-   0
188 Command_Timeout 0x0032   100   100   000Old_age   Always   
-   0 0 0
189 High_Fly_Writes 0x003a   086   086   000Old_age   Always   
-   14
190 Airflow_Temperature_Cel 0x0022   071   055   040Old_age   Always   
-   29 (Min/Max 23/32)
193 Load_Cycle_Count0x0032   100   100   000Old_age   Always   
-   607
194 Temperature_Celsius 0x0022   029   014   000Old_age   Always   
-   29 (0 14 0 0 0)
195 Hardware_ECC_Recovered  0x001a   004   001   000Old_age   Always   
-   193297722
197 Current_Pending_Sector  0x0012   100   100   000Old_age   Always   
-   0
198 Offline_Uncorrectable   0x0010   100   100   000Old_age   Offline  
-   0
199 UDM
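For completeness, the same key SMART fields can be pulled for every disk on 
a node with something like this (a sketch; the device glob may need adjusting):

# Show the health summary and key error counters for every data disk on this node
for dev in /dev/sd?; do
    echo "== $dev =="
    smartctl -H -A "$dev" | grep -E 'overall-health|Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable|UDMA_CRC_Error_Count'
done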

Re: [ceph-users] Random health OSD_SCRUB_ERRORS on various OSDs, after pg repair back to HEALTH_OK

2018-03-05 Thread Vladimir Prokofev
> always solved by ceph pg repair
That doesn't necessarily mean that there's no hardware issue. In my case
the repair also worked fine and returned the cluster to the OK state every
time, but eventually the faulty disk failed another scrub operation, and this
repeated multiple times before we replaced that disk.
One last thing to look into is dmesg on your OSD nodes. If there's a
hardware read error it will be logged in dmesg.
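For example, something along these lines on each OSD node (a sketch; the
patterns are just common kernel I/O error strings):

# Look for kernel-level disk errors with readable timestamps
dmesg -T | grep -iE 'i/o error|blk_update_request|ata[0-9].*error|medium error'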


Re: [ceph-users] Random health OSD_SCRUB_ERRORS on various OSDs, after pg repair back to HEALTH_OK

2018-03-05 Thread Marco Baldini - H.S. Amiata

Hi

I monitor dmesg on each of the 3 nodes, and no hardware issue is reported. 
The problem also happens with different OSDs on different nodes, so to me 
it is clear that this is not a hardware problem.


Thanks for reply




Re: [ceph-users] Random health OSD_SCRUB_ERRORS on various OSDs, after pg repair back to HEALTH_OK

2018-03-06 Thread Brad Hubbard

If you have osd_debug set to 25 or greater when you run the deep scrub, you
should get more information about the nature of the read error from the
ReplicatedBackend::be_deep_scrub() function (assuming this is a replicated
pool).

This may create large logs, so watch that they don't exhaust storage.
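For example, something like this, scoped to a single OSD and PG to limit the
log volume (a sketch; the OSD id and PG id are placeholders):

# Raise logging on the primary OSD of the suspect PG only
ceph tell osd.N injectargs --debug-osd 25/25

# Trigger a deep scrub of that PG and then inspect /var/log/ceph/ceph-osd.N.log
ceph pg deep-scrub <pg.id>

# Drop the level again when finished
ceph tell osd.N injectargs --debug-osd 0/0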


Re: [ceph-users] Random health OSD_SCRUB_ERRORS on various OSDs, after pg repair back to HEALTH_OK

2018-03-06 Thread Brad Hubbard
debug_osd that is... :)
