[ceph-users] Re: 5 pgs inactive, 5 pgs incomplete

2020-08-11 Thread Wido den Hollander



On 11/08/2020 20:41, Kevin Myers wrote:

A replica count of 2 is a surefire way to a crisis!



It is :-)


Sent from my iPad


On 11 Aug 2020, at 18:45, Martin Palma  wrote:

Hello,
after an unexpected power outage our production cluster has 5 PGs
inactive and incomplete. The OSDs on which these 5 PGs are located all
show "stuck requests are blocked":

  Reduced data availability: 5 pgs inactive, 5 pgs incomplete
  98 stuck requests are blocked > 4096 sec. Implicated osds 63,80,492,494

What is the best procedure to get these PGs back? These PGs are all of
pools with a replica of 2.


Are the OSDs online? Or do they refuse to boot?

Can you list the data with ceph-objectstore-tool on these OSDs?
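
For example, a minimal sketch, assuming a default package/systemd layout (stop
the OSD first, then point the tool at its data path):

# systemctl stop ceph-osd@63
# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-63 --op list-pgs
# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-63 --op list | head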

Wido



Best,
Martin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] It takes a long time for a newly added OSD to reach the up state due to heavy rocksdb activity

2020-08-11 Thread Jerry Pu
Hi All

We had a cluster (v13.2.4) with 32 OSDs in total. At first, an OSD (osd.18)
in the cluster was down, so we removed it and added a new one (osd.32) with a
new ID: we unplugged the disk (osd.18), plugged a new disk into the same slot,
and added osd.32 to the cluster. Then osd.32 started booting, but we found it
took a long time (around 18 minutes) for the OSD to change to the up state.
Diving into the osd.32 logs, we see a lot of rocksdb activity before osd.32
changes to the up state. Can anyone explain why this happened, or give me any
advice on how to prevent it? Thanks.


[osd.32 log]
2020-08-03 15:36:58.852 7f88021fa1c0  0 osd.32 0 done with init, starting
boot process
2020-08-03 15:36:58.852 7f88021fa1c0  1 osd.32 0 start_boot
2020-08-03 15:36:58.854 7f87db02b700 -1 osd.32 0 waiting for initial osdmap
2020-08-03 15:36:58.855 7f87e4ba0700 -1 osd.32 0 failed to load OSD map for
epoch 22010, got 0 bytes
2020-08-03 15:36:58.955 7f87e0836700  0 osd.32 22011 crush map has features
283675107524608, adjusting msgr requires for clients
2020-08-03 15:36:58.955 7f87e0836700  0 osd.32 22011 crush map has features
283675107524608 was 288232575208792577, adjusting msgr requires for mons
*2020-08-03 15:36:58.955* 7f87e0836700  0 osd.32 22011 crush map has
features 720859615486820352, adjusting msgr requires for osds
2020-08-03 15:37:31.182 7f87e1037700  4 rocksdb:
[/home/gitlab/rpmbuild/BUILD/ceph-13.2.4/src/rocksdb/db/db_impl_write.cc:1346]
[default] New memtable created with log file: #16. Immutable memtables: 0.

2020-08-03 15:37:31.285 7f87e8045700  4 rocksdb: (Original Log Time
2020/08/03-15:37:31.183995)
[/home/gitlab/rpmbuild/BUILD/ceph-13.2.4/src/rocksdb/db/db_impl_compaction_flush.cc:1396]
Calling FlushMemTableToOutputFile with column family [default], flush slots
available 1, compaction slots available 1, flush slots scheduled 1,
compaction slots scheduled 0
2020-08-03 15:37:31.285 7f87e8045700  4 rocksdb:
[/home/gitlab/rpmbuild/BUILD/ceph-13.2.4/src/rocksdb/db/flush_job.cc:300]
[default] [JOB 3] Flushing memtable with next log file: 16

----- lots of rocksdb activity -----

2020-08-03 15:54:21.704 7f87e8045700  4 rocksdb: (Original Log Time
2020/08/03-15:54:21.705680)
[/home/gitlab/rpmbuild/BUILD/ceph-13.2.4/src/rocksdb/db/memtable_list.cc:397]
[default] Level-0 commit table #112: memtable #1 done
2020-08-03 15:54:21.704 7f87e8045700  4 rocksdb: (Original Log Time
2020/08/03-15:54:21.705704) EVENT_LOG_v1 {"time_micros": 1596441261705697,
"job": 51, "event": "flush_finished", "output_compression":
"NoCompression", "lsm_state": [1, 3, 0, 0, 0, 0, 0], "immutable_memtables":
0}
2020-08-03 15:54:21.704 7f87e8045700  4 rocksdb: (Original Log Time
2020/08/03-15:54:21.705721)
[/home/gitlab/rpmbuild/BUILD/ceph-13.2.4/src/rocksdb/db/db_impl_compaction_flush.cc:172]
[default] Level summary: base level 1 max bytes base 268435456 files[1 3 0
0 0 0 0] max score 0.75

*2020-08-03 15:54:38.567* 7f87e0836700  1 osd.32 502096 state: booting ->
active
2020-08-03 15:54:38.567 7f87d5820700  1 osd.32 pg_epoch: 502096 pg[1.17e(
empty local-lis/les=0/0 n=0 ec=11627/16 lis/c 501703/501703 les/c/f
501704/501704/0 502096/502096/502096) [32,26,28] r=0 lpr=502096
pi=[501703,502096)/1 crt=0'0 mlcod 0'0 unknown mbc={}] state:
transitioning to Primary


Best
Jerry
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Remapped PGs

2020-08-11 Thread ceph
Hi,

I am not sure, but perhaps this could be an effect of the "balancer" module - if
you use it?

Hth
Mehmet

On 10 August 2020 17:28:27 CEST David Orman  wrote:
>We've gotten a bit further: after evaluating how this remapped count is
>determined (pg_temp), we've found the PGs counted as being remapped:
>
>root@ceph01:~# ceph osd dump |grep pg_temp
>pg_temp 3.7af [93,1,29]
>pg_temp 3.7bc [137,97,5]
>pg_temp 3.7d9 [72,120,18]
>pg_temp 3.7e8 [80,21,71]
>pg_temp 3.7fd [74,51,8]
>
>Looking at 3.7af:
>root@ceph01:~# ceph pg map 3.7af
>osdmap e15406 pg 3.7af (3.f) -> up [87,156,29] acting [87,156,29]
>
>I'm unclear why this is staying in pg_temp. Is there a way to clean this
>up? I would have expected it to be cleaned up as per the docs, but I might
>be missing something here.
>
>On Thu, Aug 6, 2020 at 2:40 PM David Orman  wrote:
>
>> Still haven't figured this out. We went ahead and upgraded the entire
>> cluster to Podman 2.0.4 and in the process did OS/Kernel upgrades and
>> rebooted every node, one at a time. We've still got 5 PGs stuck in
>> 'remapped' state, according to 'ceph -s', but 0 in the pg dump output in
>> that state. Does anybody have any suggestions on what to do about this?
>>
>> On Wed, Aug 5, 2020 at 10:54 AM David Orman  wrote:
>>
>>> Hi,
>>>
>>> We see that we have 5 'remapped' PGs, but are unclear why/what to do
>>> about it. We shifted some target ratios for the autobalancer and it
>>> resulted in this state. When adjusting ratios, we noticed two OSDs go down,
>>> but we just restarted the containers for those OSDs with podman, and they
>>> came back up. Here's the status output:
>>>
>>> ###
>>> root@ceph01:~# ceph status
>>> INFO:cephadm:Inferring fsid x
>>> INFO:cephadm:Inferring config x
>>> INFO:cephadm:Using recent ceph image docker.io/ceph/ceph:v15
>>>   cluster:
>>> id: 41bb9256-c3bf-11ea-85b9-9e07b0435492
>>> health: HEALTH_OK
>>>
>>>   services:
>>> mon: 5 daemons, quorum ceph01,ceph04,ceph02,ceph03,ceph05 (age 2w)
>>> mgr: ceph03.ytkuyr(active, since 2w), standbys: ceph01.aqkgbl,
>>> ceph02.gcglcg, ceph04.smbdew, ceph05.yropto
>>> osd: 168 osds: 168 up (since 2d), 168 in (since 2d); 5 remapped pgs
>>>
>>>   data:
>>> pools:   3 pools, 1057 pgs
>>> objects: 18.00M objects, 69 TiB
>>> usage:   119 TiB used, 2.0 PiB / 2.1 PiB avail
>>> pgs: 1056 active+clean
>>>  1    active+clean+scrubbing+deep
>>>
>>>   io:
>>> client:   859 KiB/s rd, 212 MiB/s wr, 644 op/s rd, 391 op/s wr
>>>
>>> root@ceph01:~#
>>>
>>> ###
>>>
>>> When I look at ceph pg dump, I don't see any marked as remapped:
>>>
>>> ###
>>> root@ceph01:~# ceph pg dump |grep remapped
>>> INFO:cephadm:Inferring fsid x
>>> INFO:cephadm:Inferring config x
>>> INFO:cephadm:Using recent ceph image docker.io/ceph/ceph:v15
>>> dumped all
>>> root@ceph01:~#
>>> ###
>>>
>>> Any idea what might be going on/how to recover? All OSDs are up. Health
>>> is 'OK'. This is Ceph 15.2.4 deployed using Cephadm in containers, on
>>> Podman 2.0.3.
>>>
>>
>___
>ceph-users mailing list -- ceph-users@ceph.io
>To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] v14.2.11 Nautilus released

2020-08-11 Thread Abhishek Lekshmanan

We're happy to announce the availability of the eleventh release in the
Nautilus series. This release brings a number of bugfixes across all
major components of Ceph. We recommend that all Nautilus users upgrade
to this release.

Notable Changes
---
* RGW: The `radosgw-admin` sub-commands dealing with orphans --
  `radosgw-admin orphans find`, `radosgw-admin orphans finish`,
  `radosgw-admin orphans list-jobs` -- have been deprecated. They
  have not been actively maintained and they store intermediate
  results on the cluster, which could fill a nearly-full cluster.
  They have been replaced by a tool, currently considered
  experimental, `rgw-orphan-list`.

* Now when noscrub and/or nodeep-scrub flags are set globally or per pool,
  scheduled scrubs of the type disabled will be aborted. All user initiated
  scrubs are NOT interrupted.

* Fixed a ceph-osd crash in _committed_osd_maps when there is a failure to
  encode the first incremental map. issue#46443:
  https://github.com/ceph/ceph/pull/46443

For the detailed changelog please refer to the blog entry at
https://ceph.io/releases/v14-2-11-nautilus-released/

Getting Ceph

* Git at git://github.com/ceph/ceph.git
* Tarball at http://download.ceph.com/tarballs/ceph-14.2.11.tar.gz
* For packages, see http://docs.ceph.com/docs/master/install/get-packages/
* Release git sha1: f7fdb2f52131f54b891a2ec99d8205561242cdaf

--
Abhishek Lekshmanan
SUSE Software Solutions Germany GmbH
GF: Felix Imendörffer, HRB 36809 (AG Nürnberg)
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: 5 pgs inactive, 5 pgs incomplete

2020-08-11 Thread Kevin Myers
A replica count of 2 is a surefire way to a crisis!

Sent from my iPad

> On 11 Aug 2020, at 18:45, Martin Palma  wrote:
> 
> Hello,
> after an unexpected power outage our production cluster has 5 PGs
> inactive and incomplete. The OSDs on which these 5 PGs are located all
> show "stuck requests are blocked":
> 
>  Reduced data availability: 5 pgs inactive, 5 pgs incomplete
>  98 stuck requests are blocked > 4096 sec. Implicated osds 63,80,492,494
> 
> What is the best procedure to get these PGs back? These PGs are all of
> pools with a replica of 2.
> 
> Best,
> Martin
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] 5 pgs inactive, 5 pgs incomplete

2020-08-11 Thread Martin Palma
Hello,
after an unexpected power outage our production cluster has 5 PGs
inactive and incomplete. The OSDs on which these 5 PGs are located all
show "stuck requests are blocked":

  Reduced data availability: 5 pgs inactive, 5 pgs incomplete
  98 stuck requests are blocked > 4096 sec. Implicated osds 63,80,492,494

What is the best procedure to get these PGs back? These PGs are all of
pools with a replica of 2.

Best,
Martin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: OSD memory leak?

2020-08-11 Thread Frank Schilder
Hi Mark,

here is a first collection of heap profiling data (valid 30 days):

https://files.dtu.dk/u/53HHic_xx5P1cceJ/heap_profiling-2020-08-03.tgz?l

This was collected with the following config settings:

  osd  dev  osd_memory_cache_min  805306368
  osd  basicosd_memory_target 2147483648

Setting the cache_min value seems to help keep cache space available.
Unfortunately, the above collection covers only 12 days. I needed to restart
the OSD and will need to restart it again soon. I hope I can then run a longer
sample. The profiling does cause slow ops, though.
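
For reference, a minimal sketch of how these values can be applied cluster-wide
via the centralized config (Mimic or later; they could equally go into
ceph.conf under [osd]):

  ceph config set osd osd_memory_cache_min 805306368
  ceph config set osd osd_memory_target 2147483648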

Maybe you can see something already? It seems to have collected some leaked 
memory. Unfortunately, it was a period of extremely low load; basically, from 
the day recording started, utilization dropped to almost zero.

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Frank Schilder 
Sent: 21 July 2020 12:57:32
To: Mark Nelson; Dan van der Ster
Cc: ceph-users
Subject: [ceph-users] Re: OSD memory leak?

Quick question: Is there a way to change the frequency of heap dumps? On this 
page http://goog-perftools.sourceforge.net/doc/heap_profiler.html a function 
HeapProfilerSetAllocationInterval() is mentioned, but no other way of 
configuring this. Is there a config parameter or a ceph daemon call to adjust 
this?

If not, can I change the dump path?

It's likely to overrun my log partition quickly if I cannot adjust either of the
two.
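
For reference, the profiler can at least be driven manually through the daemons
(a sketch; as far as I can tell these commands do not change the automatic dump
interval, but they allow dumping and releasing memory on demand):

  ceph tell osd.<id> heap stats
  ceph tell osd.<id> heap start_profiler
  ceph tell osd.<id> heap dump
  ceph tell osd.<id> heap stop_profiler
  ceph tell osd.<id> heap release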

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Frank Schilder 
Sent: 20 July 2020 15:19:05
To: Mark Nelson; Dan van der Ster
Cc: ceph-users
Subject: [ceph-users] Re: OSD memory leak?

Dear Mark,

thank you very much for the very helpful answers. I will raise 
osd_memory_cache_min, leave everything else alone and watch what happens. I 
will report back here.

Thanks also for raising this as an issue.

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Mark Nelson 
Sent: 20 July 2020 15:08:11
To: Frank Schilder; Dan van der Ster
Cc: ceph-users
Subject: Re: [ceph-users] Re: OSD memory leak?

On 7/20/20 3:23 AM, Frank Schilder wrote:
> Dear Mark and Dan,
>
> I'm in the process of restarting all OSDs and could use some quick advice on 
> bluestore cache settings. My plan is to set higher minimum values and deal 
> with accumulated excess usage via regular restarts. Looking at the 
> documentation 
> (https://docs.ceph.com/docs/mimic/rados/configuration/bluestore-config-ref/), 
> I find the following relevant options (with defaults):
>
> # Automatic Cache Sizing
> osd_memory_target {4294967296} # 4GB
> osd_memory_base {805306368} # 768MB
> osd_memory_cache_min {134217728} # 128MB
>
> # Manual Cache Sizing
> bluestore_cache_meta_ratio {.4} # 40% ?
> bluestore_cache_kv_ratio {.4} # 40% ?
> bluestore_cache_kv_max {512 * 1024*1024} # 512MB
>
> Q1) If I increase osd_memory_cache_min, should I also increase 
> osd_memory_base by the same or some other amount?


osd_memory_base is a hint at how much memory the OSD could consume
outside the cache once it's reached steady state.  It basically sets a
hard cap on how much memory the cache will use to avoid over-committing
memory and thrashing when we exceed the memory limit. It's not necessary
to get it right, it just helps smooth things out by making the automatic
memory tuning less aggressive.  IE if you have a 2 GB memory target and
a 512MB base, you'll never assign more than 1.5GB to the cache on the
assumption that the rest of the OSD will eventually need 512MB to
operate even if it's not using that much right now.  I think you can
probably just leave it alone.  What you and Dan appear to be seeing is
that this number isn't static in your case but increases over time
anyway.  Eventually I'm hoping that we can automatically account for more
and more of that memory by reading the data from the mempools.

> Q2) The cache ratio options are shown under the section "Manual Cache 
> Sizing". Do they also apply when cache auto tuning is enabled? If so, is it 
> worth changing these defaults for higher values of osd_memory_cache_min?


They actually do have an effect on the automatic cache sizing and
probably shouldn't only be under the manual section.  When you have the
automatic cache sizing enabled, those options will affect the "fair
share" values of the different caches at each cache priority level.  IE
at priority level 0, if both caches want more memory than is available,
those ratios will determine how much each cache gets.  If there is more
memory available than requested, each cache gets as much as they want
and we move on to the next priority level and do the same thing again.
So in this case the ratios end up being sort of more like fallback
settings for 

[ceph-users] Re: ceph orch host rm seems to just move daemons out of cephadm, not remove them

2020-08-11 Thread pixel fairy
tried removing the daemon first, and that kinda blew up.

ceph orch daemon rm --force mon.tempmon
ceph orch host rm tempmon

Now there are two problems.
1. ceph is still looking for it:

  services:
mon: 4 daemons, quorum ceph1,ceph2,ceph3 (age 3s), out of quorum: tempmon
mgr: ceph1.oqptlg(active, since 9h), standbys: ceph2.hezrvv
osd: 9 osds: 9 up (since 8h), 9 in (since 8h)

2. more worrying, the cephadm module is failing somewhere:
INFO:cephadm:Inferring fsid 09e9711e-db88-11ea-b8c2-791b9888d2f2
INFO:cephadm:Using recent ceph image ceph/ceph:v15
  cluster:
id: 09e9711e-db88-11ea-b8c2-791b9888d2f2
health: HEALTH_ERR
Module 'cephadm' has failed: must be str, not NoneType
1/4 mons down, quorum ceph1,ceph2,ceph3

root@ceph1:/home/vagrant# ceph health detail
INFO:cephadm:Inferring fsid 09e9711e-db88-11ea-b8c2-791b9888d2f2
INFO:cephadm:Using recent ceph image ceph/ceph:v15
HEALTH_ERR Module 'cephadm' has failed: must be str, not NoneType; 1/4 mons
down, quorum ceph1,ceph2,ceph3
[ERR] MGR_MODULE_ERROR: Module 'cephadm' has failed: must be str, not
NoneType
Module 'cephadm' has failed: must be str, not NoneType
[WRN] MON_DOWN: 1/4 mons down, quorum ceph1,ceph2,ceph3
mon.tempmon (rank 3) addr [v2:10.16.16.10:3300/0,v1:10.16.16.10:6789/0]
is down (out of quorum)

Is there another step that should be taken? I'd expect "ceph orch host rm"
to also take anything it managed out of the cluster.
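
One possible cleanup, assuming the temporary monitor really is gone for good
and quorum is healthy without it, might be to drop it from the monmap directly
(a sketch, not verified against the failing cephadm module):

  ceph mon remove tempmon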

On Mon, Aug 10, 2020 at 12:42 PM pixel fairy  wrote:

> Made a cluster of 2 OSD hosts and one temp monitor, then added another
> OSD host and did a "ceph orch host rm tempmon". This is all in Vagrant
> (libvirt), with the generic/ubuntu2004 box.
>
> INFO:cephadm:Inferring fsid 5426a59e-db33-11ea-8441-b913b695959d
> INFO:cephadm:Using recent ceph image ceph/ceph:v15
>   cluster:
> id: 5426a59e-db33-11ea-8441-b913b695959d
> health: HEALTH_WARN
> 2 stray daemons(s) not managed by cephadm
> 1 stray host(s) with 2 daemon(s) not managed by cephadm
>
> added 2 more osd hosts, and ceph -s gave me this,
>   services:
> mon: 6 daemons, quorum ceph5,ceph4,tempmon,ceph3,ceph2,ceph1 (age 33m)
> mgr: ceph5.erdofb(active, since 82m), standbys: tempmon.xkrlmm,
> ceph3.xjuecs
> osd: 15 osds: 15 up (since 33m), 15 in (since 33m)
>
> My guess is cephadm wanted 5 managed mons, so it did that, but it still never
> removed the mon on the removed host. It's still up. This is just a
> Vagrantfile, so I have two questions.
>
> 1. how do you remove that other host and its daemons from the cluster?
> 2. how would you recover from a host being destroyed?
>
> p.s. tried google
> Your search - "ceph orch host rm" "stray daemons(s) not manage by
> cephadm" - did not match any documents.
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Speeding up reconnection

2020-08-11 Thread William Edwards


> Hi,

> you can change the MDS setting to be less strict [1]:

> According to [1] the default is 300 seconds to be evicted. Maybe give  
> the less strict option a try?

Thanks for your reply. I already set mds_session_blacklist_on_timeout to false. 
This seems to have helped somewhat, but still, most of the time, the kernel 
client 'hangs'.
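
For reference, a minimal sketch of how that option can be set, assuming the
centralized config database is used (it could equally go into ceph.conf under
[mds]):

  ceph config set mds mds_session_blacklist_on_timeout false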

> Regards,
> Eugen



Quoting William Edwards :

> Hello,
>
> When the connection between the kernel client and the MDS is lost, a few things happen:
>
> 1.
> Caps become stale:
>
> Aug 11 11:08:14 admin-cap kernel: [308405.227718] ceph: mds0 caps stale
>
> 2.
> MDS evicts client for being unresponsive:
>
> MDS log: 2020-08-11 11:12:08.923 7fd1f45ae700  0  
> log_channel(cluster) log [WRN] : evicting unresponsive client  
> admin-cap.cf.ha.cyberfusion.cloud:DB0001-cap (144786749), after  
> 300.978 seconds
> Client log: Aug 11 11:12:11 admin-cap kernel: [308643.051006] ceph: mds0 hung
>
> 3.
> Socket is closed:
>
> Aug 11 11:22:57 admin-cap kernel: [309289.192705] libceph: mds0  
> [fdb7:b01e:7b8e:0:10:10:10:1]:6849 socket closed (con state OPEN)
>
> I am not sure whether the kernel client or MDS closes the  
> connection. I think the kernel client does so, because nothing is  
> logged at the MDS side at 11:22:57
>
> 4.
> Connection is reset by MDS:
>
> MDS log: 2020-08-11 11:22:58.831 7fd1f9e49700  0 --1-  
> [v2:[fdb7:b01e:7b8e:0:10:10:10:1]:6800/3619156441,v1:[fdb7:b01e:7b8e:0:10:10:10:1]:6849/3619156441]
>  >> v1:[fc00:b6d:cfc:951::7]:0/133007863 conn(0x55bfaf1c2880 0x55c16cb47000 
> :6849 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 
> l=0).handle_connect_message_2 accept we reset (peer sent cseq 1), sending  
> RESETSESSION
> Client log: Aug 11 11:22:58 admin-cap kernel: [309290.058222]  
> libceph: mds0 [fdb7:b01e:7b8e:0:10:10:10:1]:6849 connection reset
>
> 5.
> Kernel client reconnects:
>
> Aug 11 11:22:58 admin-cap kernel: [309290.058972] ceph: mds0 closed  
> our session
> Aug 11 11:22:58 admin-cap kernel: [309290.058973] ceph: mds0 reconnect start
> Aug 11 11:22:58 admin-cap kernel: [309290.069979] ceph: mds0 reconnect denied
> Aug 11 11:22:58 admin-cap kernel: [309290.069996] ceph: dropping  
> file locks for 6a23d9dd 1099625041446
> Aug 11 11:22:58 admin-cap kernel: [309290.071135] libceph: mds0  
> [fdb7:b01e:7b8e:0:10:10:10:1]:6849 socket closed (con state  
> NEGOTIATING)
>
> Question:
>
> As you can see, there's 10 minutes between losing the connection and  
> the reconnection attempt (11:12:08 - 11:22:58). I could not find any  
> settings related to the period after which reconnection is  
> attempted. I would like to change this value from 10 minutes to  
> something like 1 minute. I also tried searching the Ceph docs for  
> the string '600' (10 minutes), but did not find anything useful.
>
> Hope someone can help.
>
> Environment details:
>
> Client kernel: 4.19.0-10-amd64
> Ceph version: ceph version 14.2.9  
> (bed944f8c45b9c98485e99b70e11bbcec6f6659a) nautilus (stable)
>
>
> Kind regards,
>
> William Edwards
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Single node all-in-one install for testing

2020-08-11 Thread Richard W.M. Jones
I have one spare machine with a single 1TB disk in it, and I'd like to test
a local Ceph install.  This is just for testing; I don't care that it
won't have redundancy, failover, etc.  Is there any canonical
documentation for this case?

- - -

Longer story is this morning I found this documentation through many
Google searches:
https://medium.com/@balderscape/setting-up-a-virtual-single-node-ceph-storage-cluster-d86d6a6c658e

It kind of works up to a point.  I was able to get Ceph installed, get
both the command line and web interfaces up and running, and got as
far as:

# ceph status
  cluster:
id: 723e09aa-dbd3-11ea-8587-94c691189836
health: HEALTH_WARN
Reduced data availability: 1 pg inactive
OSD count 0 < osd_pool_default_size 3
 
  services:
mon: 1 daemons, quorum dev1 (age 77m)
mgr: dev1.ziibhq(active, since 76m)
osd: 0 osds: 0 up, 0 in
 
  data:
pools:   1 pools, 1 pgs
objects: 0 objects, 0 B
usage:   0 B used, 0 B / 0 B avail
pgs: 100.000% pgs unknown
 1 unknown

# ceph orch device ls
HOST  PATH  TYPE   SIZE  DEVICE AVAIL  REJECT 
REASONS
dev1  /dev/sdb  hdd   14.9G  Cruzer_Blade_4C530210050318117591  True
 
dev1  /dev/sda  hdd931G  WDC_WD10JFCX-68N_WD-WXD1AB73AUX5   False  LVM 
detected, locked  

(The USB stick is left over from the install and isn't suitable as an OSD).

However, I've been completely unable to add any OSDs using the local
disk.  The disk (/dev/sda3) is a PV and there is plenty of space.  I
even created some LVs that I was hoping to use as OSDs:

# lvs
  LV   VGAttr   LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync 
Convert
  osd1 rhel_dev1 -wi-a- 100.00g 
   
  osd2 rhel_dev1 -wi-a- 100.00g 
   
  osd3 rhel_dev1 -wi-a- 100.00g 
   
  root rhel_dev1 -wi-ao  50.00g 
   
  swap rhel_dev1 -wi-ao  <7.84g

but it simply doesn't do anything when I try to add them:

# ceph orch daemon add osd localhost:rhel_dev1/osd1
(no error, exit code 0)

So I guess the fact that the device is marked as "AVAIL = False" and
"locked" is bad somehow.  Can I add the OSDs anyway?

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Announcing go-ceph v0.5.0

2020-08-11 Thread John Mulligan
I'm happy to announce the another release of the go-ceph API 
bindings. This is a regular release following our every-two-months release 
cadence.

https://github.com/ceph/go-ceph/releases/tag/v0.5.0

The bindings aim to play a similar role to the "pybind" python bindings in the 
ceph tree but for the Go language. These API bindings require the use of cgo.  
There are already a few consumers of this library in the wild, including the 
ceph-csi project.


Specific questions, comments, bugs etc are best directed at our github issues 
tracker.


---
John Mulligan

phlogistonj...@asynchrono.us
jmulli...@redhat.com

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Speeding up reconnection

2020-08-11 Thread William Edwards

Hello,

When the connection between the kernel client and the MDS is lost, a few things happen:

1.
Caps become stale:

Aug 11 11:08:14 admin-cap kernel: [308405.227718] ceph: mds0 caps stale

2.
MDS evicts client for being unresponsive:

MDS log: 2020-08-11 11:12:08.923 7fd1f45ae700  0 log_channel(cluster) log [WRN] 
: evicting unresponsive client admin-cap.cf.ha.cyberfusion.cloud:DB0001-cap 
(144786749), after 300.978 seconds
Client log: Aug 11 11:12:11 admin-cap kernel: [308643.051006] ceph: mds0 hung

3.
Socket is closed:

Aug 11 11:22:57 admin-cap kernel: [309289.192705] libceph: mds0 
[fdb7:b01e:7b8e:0:10:10:10:1]:6849 socket closed (con state OPEN)

I am not sure whether the kernel client or MDS closes the connection. I think 
the kernel client does so, because nothing is logged at the MDS side at 11:22:57

4.
Connection is reset by MDS:

MDS log: 2020-08-11 11:22:58.831 7fd1f9e49700  0 --1- 
[v2:[fdb7:b01e:7b8e:0:10:10:10:1]:6800/3619156441,v1:[fdb7:b01e:7b8e:0:10:10:10:1]:6849/3619156441]
 >> v1:[fc00:b6d:cfc:951::7]:0/133007863 conn(0x55bfaf1c2880 0x55c16cb47000 
:6849 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 
l=0).handle_connect_message_2 accept we reset (peer sent cseq 1), sending 
RESETSESSION
Client log: Aug 11 11:22:58 admin-cap kernel: [309290.058222] libceph: mds0 
[fdb7:b01e:7b8e:0:10:10:10:1]:6849 connection reset

5.
Kernel client reconnects:

Aug 11 11:22:58 admin-cap kernel: [309290.058972] ceph: mds0 closed our session
Aug 11 11:22:58 admin-cap kernel: [309290.058973] ceph: mds0 reconnect start
Aug 11 11:22:58 admin-cap kernel: [309290.069979] ceph: mds0 reconnect denied
Aug 11 11:22:58 admin-cap kernel: [309290.069996] ceph: dropping file locks for 
6a23d9dd 1099625041446
Aug 11 11:22:58 admin-cap kernel: [309290.071135] libceph: mds0 
[fdb7:b01e:7b8e:0:10:10:10:1]:6849 socket closed (con state NEGOTIATING)

Question:

As you can see, there's 10 minutes between losing the connection and the 
reconnection attempt (11:12:08 - 11:22:58). I could not find any settings 
related to the period after which reconnection is attempted. I would like to 
change this value from 10 minutes to something like 1 minute. I also tried 
searching the Ceph docs for the string '600' (10 minutes), but did not find 
anything useful.

Hope someone can help.

Environment details:

Client kernel: 4.19.0-10-amd64
Ceph version: ceph version 14.2.9 (bed944f8c45b9c98485e99b70e11bbcec6f6659a) 
nautilus (stable)


Kind regards,

William Edwards

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: pg stuck in unknown state

2020-08-11 Thread Michael Thomas

On 8/11/20 2:52 AM, Wido den Hollander wrote:



On 11/08/2020 00:40, Michael Thomas wrote:
On my relatively new Octopus cluster, I have one PG that has been 
perpetually stuck in the 'unknown' state.  It appears to belong to the 
device_health_metrics pool, which was created automatically by the mgr 
daemon(?).


The OSDs that the PG maps to are all online and serving other PGs.  
But when I list the PGs that belong to the OSDs from 'ceph pg map', 
the offending PG is not listed.


# ceph pg dump pgs | grep ^1.0
dumped pgs
1.0    0   0 0  0    0 
0    0   0  0 0   unknown 
2020-08-08T09:30:33.251653-0500 0'0 0:0 []  
-1 []  -1  0'0 
2020-08-08T09:30:33.251653-0500  0'0 
2020-08-08T09:30:33.251653-0500  0


# ceph osd pool stats device_health_metrics
pool device_health_metrics id 1
   nothing is going on

# ceph pg map 1.0
osdmap e7199 pg 1.0 (1.0) -> up [41,40,2] acting [41,0]

What can be done to fix the PG?  I tried doing a 'ceph pg repair 1.0', 
but that didn't seem to do anything.


Is it safe to try to update the crush_rule for this pool so that the 
PG gets mapped to a fresh set of OSDs?


Yes, it would be. But still, it's weird. Mainly as the acting set is so 
different from the up-set.


You have different CRUSH rules I think?

Marking those OSDs down might work, but otherwise change the crush_rule 
and see how that goes.


Yes, I do have different crush rules to help map certain types of data 
to different classes of hardware (EC HDDs, replicated SSDs, replicated 
nvme).  The default crush rule for the device_health_metrics pool was to 
use replication across any storage device.  I changed it to use the 
replicated nvme crush rule, and now the map looks different:


# ceph pg map 1.0
osdmap e7256 pg 1.0 (1.0) -> up [24,22,12] acting [41,0]

However, the acting set of OSDs has not changed.
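
Following Wido's suggestion, the next thing to try might be marking the stale
acting OSDs down so the PG re-peers onto the new up set (a sketch; the OSDs
should rejoin on their own within seconds):

# ceph osd down 41 0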

--Mike
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Ceph not warning about clock skew on an OSD-only host?

2020-08-11 Thread Matthew Vernon

Hi,

Our production cluster runs Luminous.

Yesterday, one of our OSD-only hosts came up with its clock about 8 
hours wrong(!), having been out of the cluster for a week or so. 
Initially, Ceph seemed entirely happy, and then after an hour or so it 
all went south (OSDs started logging about bad authenticators, I/O paused, 
general sadness).


I know clock sync is important to Ceph, so "one system is 8 hours out, 
Ceph becomes sad" is not a surprise. It is perhaps a surprise that the 
OSDs were allowed in at all...


What _is_ a surprise, though, is that at no point in all this did Ceph 
raise a peep about clock skew. Normally it's pretty sensitive to this - 
our test cluster has had clock skew complaints when a mon is only 
slightly out, and here we had a node 8 hours wrong.


Is there some oddity like Ceph not warning on clock skew for OSD-only 
hosts? Or an upper bound on how large a discrepancy it will WARN about?


Regards,

Matthew

example output from mid-outage:

root@sto-3-1:~#  ceph -s
  cluster:
id: 049fc780-8998-45a8-be12-d3b8b6f30e69
health: HEALTH_ERR
40755436/2702185683 objects misplaced (1.508%)
Reduced data availability: 20 pgs inactive, 20 pgs peering
Degraded data redundancy: 367431/2702185683 objects 
degraded (0.014%), 4549 pgs degraded
481 slow requests are blocked > 32 sec. Implicated osds 
188,284,795,1278,1981,2061,2648,2697
644 stuck requests are blocked > 4096 sec. Implicated osds 
22,31,33,35,101,116,120,130,132,140,150,159,201,211,228,263,327,541,561,566,585,589,636,643,649,654,743,785,790,806,865,1037,1040,1090,1100,1104,1115,1134,1135,1166,1193,1275,1277,1292,1494,1523,1598,1638,1746,2055,2069,2191,2210,2358,2399,2486,2487,2562,2589,2613,2627,2656,2713,2720,2837,2839,2863,2888,2908,2920,2928,2929,2947,2948,2963,2969,2972


[...]


--
The Wellcome Sanger Institute is operated by Genome Research 
Limited, a charity registered in England with number 1021457 and a 
company registered in England with number 2742969, whose registered 
office is 215 Euston Road, London, NW1 2BE. 
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Speeding up reconnection

2020-08-11 Thread Eugen Block

Hi,

you can change the MDS setting to be less strict [1]:

It is possible to respond to slow clients by simply dropping their  
MDS sessions, but permit them to re-open sessions and permit them to  
continue talking to OSDs. To enable this mode, set  
mds_session_blacklist_on_timeout to false on your MDS nodes.


According to [1] the default is 300 seconds to be evicted. Maybe give  
the less strict option a try?


Regards,
Eugen

[1]  
https://docs.ceph.com/docs/master/cephfs/eviction/#advanced-configuring-blacklisting



Quoting William Edwards :


Hello,

When the connection between the kernel client and the MDS is lost, a few things happen:

1.
Caps become stale:

Aug 11 11:08:14 admin-cap kernel: [308405.227718] ceph: mds0 caps stale

2.
MDS evicts client for being unresponsive:

MDS log: 2020-08-11 11:12:08.923 7fd1f45ae700  0  
log_channel(cluster) log [WRN] : evicting unresponsive client  
admin-cap.cf.ha.cyberfusion.cloud:DB0001-cap (144786749), after  
300.978 seconds

Client log: Aug 11 11:12:11 admin-cap kernel: [308643.051006] ceph: mds0 hung

3.
Socket is closed:

Aug 11 11:22:57 admin-cap kernel: [309289.192705] libceph: mds0  
[fdb7:b01e:7b8e:0:10:10:10:1]:6849 socket closed (con state OPEN)


I am not sure whether the kernel client or MDS closes the  
connection. I think the kernel client does so, because nothing is  
logged at the MDS side at 11:22:57


4.
Connection is reset by MDS:

MDS log: 2020-08-11 11:22:58.831 7fd1f9e49700  0 --1-  
[v2:[fdb7:b01e:7b8e:0:10:10:10:1]:6800/3619156441,v1:[fdb7:b01e:7b8e:0:10:10:10:1]:6849/3619156441] >> v1:[fc00:b6d:cfc:951::7]:0/133007863 conn(0x55bfaf1c2880 0x55c16cb47000 :6849 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_message_2 accept we reset (peer sent cseq 1), sending  
RESETSESSION
Client log: Aug 11 11:22:58 admin-cap kernel: [309290.058222]  
libceph: mds0 [fdb7:b01e:7b8e:0:10:10:10:1]:6849 connection reset


5.
Kernel client reconnects:

Aug 11 11:22:58 admin-cap kernel: [309290.058972] ceph: mds0 closed  
our session

Aug 11 11:22:58 admin-cap kernel: [309290.058973] ceph: mds0 reconnect start
Aug 11 11:22:58 admin-cap kernel: [309290.069979] ceph: mds0 reconnect denied
Aug 11 11:22:58 admin-cap kernel: [309290.069996] ceph: dropping  
file locks for 6a23d9dd 1099625041446
Aug 11 11:22:58 admin-cap kernel: [309290.071135] libceph: mds0  
[fdb7:b01e:7b8e:0:10:10:10:1]:6849 socket closed (con state  
NEGOTIATING)


Question:

As you can see, there's 10 minutes between losing the connection and  
the reconnection attempt (11:12:08 - 11:22:58). I could not find any  
settings related to the period after which reconnection is  
attempted. I would like to change this value from 10 minutes to  
something like 1 minute. I also tried searching the Ceph docs for  
the string '600' (10 minutes), but did not find anything useful.


Hope someone can help.

Environment details:

Client kernel: 4.19.0-10-amd64
Ceph version: ceph version 14.2.9  
(bed944f8c45b9c98485e99b70e11bbcec6f6659a) nautilus (stable)



Kind regards,

William Edwards

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: pgs not deep scrubbed in time - false warning?

2020-08-11 Thread Dirk Sarpe
Of course I found the cause shortly after sending the message …

The scrubbing parameters need to move from the [osd] section to the [global]
section; see https://www.suse.com/support/kb/doc/?id=19621

Health is back to OK after restarting osds, mons and mgrs.
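
In other words, something like this in ceph.conf (a sketch using our values;
the same options could instead be set with "ceph config set global ..."):

[global]
osd_deep_scrub_interval = 2419200
osd_scrub_begin_hour = 19
osd_scrub_end_hour = 6

As I understand it, the monitors/managers evaluate the "not deep scrubbed in
time" warning as well, so they need to see the same deep scrub interval as the
OSDs; with the option only under [osd], they fall back to the 7-day default.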

Cheers,
Dirk


On Dienstag, 11. August 2020 09:34:36 CEST Dirk Sarpe wrote:
> Hi,
> 
> since some time (I think upgrade to nautilus) we get
> 
> X pgs not deep scrubbed in time
> 
> I deep-scrubbed the pgs when the error occurred and expected the cluster to
> recover over time, but no such luck. The warning comes up again and again.
> 
> In our spinning rust cluster we allow deep scrubbing only from 19:00 to
> 06:00 and changed the deep scrub interval to 28 days (detailed osd scrub
> config below). Looking at the pgs which are supposedly not deep scrubbed in
> time, reveals that their 28 day period is not over yet, example:
> 
> # date && ceph health detail | awk '$1=="pg" { print $0 }'
> Di 11. Aug 09:32:17 CEST 2020
> pg 0.787 not deep-scrubbed since 2020-07-30 03:08:24.899264
> pg 0.70c not deep-scrubbed since 2020-07-30 02:45:08.989329
> pg 0.6c1 not deep-scrubbed since 2020-07-30 03:01:15.199496
> pg 13.3 not deep-scrubbed since 2020-07-30 03:29:54.536825
> pg 0.d9 not deep-scrubbed since 2020-07-30 03:12:34.503586
> pg 0.41a not deep-scrubbed since 2020-07-30 03:01:23.514582
> pg 0.490 not deep-scrubbed since 2020-07-30 03:05:45.616100
> 
> 
> I wonder if I have missed or messed up some parameter that could cause this.
> 
> # ceph daemon osd.40 config show | grep scrub
> "mds_max_scrub_ops_in_progress": "5",
> "mon_scrub_inject_crc_mismatch": "0.00",
> "mon_scrub_inject_missing_keys": "0.00",
> "mon_scrub_interval": "86400",
> "mon_scrub_max_keys": "100",
> "mon_scrub_timeout": "300",
> "mon_warn_pg_not_deep_scrubbed_ratio": "0.75",
> "mon_warn_pg_not_scrubbed_ratio": "0.50",
> "osd_debug_deep_scrub_sleep": "0.00",
> "osd_deep_scrub_interval": "2419200.00",
> "osd_deep_scrub_keys": "1024",
> "osd_deep_scrub_large_omap_object_key_threshold": "20",
> "osd_deep_scrub_large_omap_object_value_sum_threshold": "1073741824",
> "osd_deep_scrub_randomize_ratio": "0.15",
> "osd_deep_scrub_stride": "1048576",
> "osd_deep_scrub_update_digest_min_age": "7200",
> "osd_max_scrubs": "1",
> "osd_op_queue_mclock_scrub_lim": "0.001000",
> "osd_op_queue_mclock_scrub_res": "0.00",
> "osd_op_queue_mclock_scrub_wgt": "1.00",
> "osd_requested_scrub_priority": "120",
> "osd_scrub_auto_repair": "false",
> "osd_scrub_auto_repair_num_errors": "5",
> "osd_scrub_backoff_ratio": "0.66",
> "osd_scrub_begin_hour": "19",
> "osd_scrub_begin_week_day": "0",
> "osd_scrub_chunk_max": "25",
> "osd_scrub_chunk_min": "5",
> "osd_scrub_cost": "52428800",
> "osd_scrub_during_recovery": "false",
> "osd_scrub_end_hour": "6",
> "osd_scrub_end_week_day": "7",
> "osd_scrub_interval_randomize_ratio": "0.50",
> "osd_scrub_invalid_stats": "true",
> "osd_scrub_load_threshold": "0.50",
> "osd_scrub_max_interval": "604800.00",
> "osd_scrub_max_preemptions": "5",
> "osd_scrub_min_interval": "172800.00",
> "osd_scrub_priority": "5",
> "osd_scrub_sleep": "0.10",
> 
> 
> ceph version 14.2.10
> 
> My ad hoc helper script
> 
> #!/bin/bash
> 
> # gently deep scrub all pgs which failed to deep scrub in set period
> 
> ceph health detail |
> awk '$1=="pg" { print $2 } ' |
> while read -r pg
> do
> ceph pg deep-scrub "$pg"
> sleep 60
> done
> 
> 
> Cheers,
> Dirk
> 
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io


-- 
general it-support unit

Phone  +49 341 97-33118
Email dirk.sa...@idiv.de 

German Centre for Integrative Biodiversity Research (iDiv) 
Halle-Jena-Leipzig
Deutscher Platz 5e 04103
Leipzig
Germany



iDiv is a research centre of the DFG - Deutsche Forschungsgemeinschaft




[ceph-users] Re: pg stuck in unknown state

2020-08-11 Thread Wido den Hollander



On 11/08/2020 00:40, Michael Thomas wrote:
On my relatively new Octopus cluster, I have one PG that has been 
perpetually stuck in the 'unknown' state.  It appears to belong to the 
device_health_metrics pool, which was created automatically by the mgr 
daemon(?).


The OSDs that the PG maps to are all online and serving other PGs.  But 
when I list the PGs that belong to the OSDs from 'ceph pg map', the 
offending PG is not listed.


# ceph pg dump pgs | grep ^1.0
dumped pgs
1.0    0   0 0  0    0  
0    0   0  0 0   unknown 
2020-08-08T09:30:33.251653-0500 0'0 0:0
[]  -1 []  -1  0'0  
2020-08-08T09:30:33.251653-0500  0'0 
2020-08-08T09:30:33.251653-0500  0


# ceph osd pool stats device_health_metrics
pool device_health_metrics id 1
   nothing is going on

# ceph pg map 1.0
osdmap e7199 pg 1.0 (1.0) -> up [41,40,2] acting [41,0]

What can be done to fix the PG?  I tried doing a 'ceph pg repair 1.0', 
but that didn't seem to do anything.


Is it safe to try to update the crush_rule for this pool so that the PG 
gets mapped to a fresh set of OSDs?


Yes, it would be. But still, it's weird. Mainly as the acting set is so 
different from the up-set.


You have different CRUSH rules I think?

Marking those OSDs down might work, but otherwise change the crush_rule 
and see how that goes.
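
For the crush_rule route, a minimal sketch (the rule name is just a placeholder
for one of your existing replicated rules):

# ceph osd pool set device_health_metrics crush_rule <replicated-rule-name>
# ceph pg map 1.0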


Wido



--Mike
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] pgs not deep scrubbed in time - false warning?

2020-08-11 Thread Dirk Sarpe
Hi,

For some time now (I think since the upgrade to Nautilus) we have been getting

X pgs not deep scrubbed in time

I deep-scrubbed the pgs when the error occurred and expected the cluster to 
recover over time, but no such luck. The warning comes up again and again.

In our spinning-rust cluster we allow deep scrubbing only from 19:00 to 06:00 
and changed the deep scrub interval to 28 days (detailed osd scrub config 
below). Looking at the pgs which are supposedly not deep scrubbed in time 
reveals that their 28-day period is not over yet. Example:

# date && ceph health detail | awk '$1=="pg" { print $0 }'
Di 11. Aug 09:32:17 CEST 2020
pg 0.787 not deep-scrubbed since 2020-07-30 03:08:24.899264
pg 0.70c not deep-scrubbed since 2020-07-30 02:45:08.989329
pg 0.6c1 not deep-scrubbed since 2020-07-30 03:01:15.199496
pg 13.3 not deep-scrubbed since 2020-07-30 03:29:54.536825
pg 0.d9 not deep-scrubbed since 2020-07-30 03:12:34.503586
pg 0.41a not deep-scrubbed since 2020-07-30 03:01:23.514582
pg 0.490 not deep-scrubbed since 2020-07-30 03:05:45.616100


I wonder if I have missed or messed up some parameter that could cause this.

# ceph daemon osd.40 config show | grep scrub
"mds_max_scrub_ops_in_progress": "5",
"mon_scrub_inject_crc_mismatch": "0.00",
"mon_scrub_inject_missing_keys": "0.00",
"mon_scrub_interval": "86400",
"mon_scrub_max_keys": "100",
"mon_scrub_timeout": "300",
"mon_warn_pg_not_deep_scrubbed_ratio": "0.75",
"mon_warn_pg_not_scrubbed_ratio": "0.50",
"osd_debug_deep_scrub_sleep": "0.00",
"osd_deep_scrub_interval": "2419200.00",
"osd_deep_scrub_keys": "1024",
"osd_deep_scrub_large_omap_object_key_threshold": "20",
"osd_deep_scrub_large_omap_object_value_sum_threshold": "1073741824",
"osd_deep_scrub_randomize_ratio": "0.15",
"osd_deep_scrub_stride": "1048576",
"osd_deep_scrub_update_digest_min_age": "7200",
"osd_max_scrubs": "1",
"osd_op_queue_mclock_scrub_lim": "0.001000",
"osd_op_queue_mclock_scrub_res": "0.00",
"osd_op_queue_mclock_scrub_wgt": "1.00",
"osd_requested_scrub_priority": "120",
"osd_scrub_auto_repair": "false",
"osd_scrub_auto_repair_num_errors": "5",
"osd_scrub_backoff_ratio": "0.66",
"osd_scrub_begin_hour": "19",
"osd_scrub_begin_week_day": "0",
"osd_scrub_chunk_max": "25",
"osd_scrub_chunk_min": "5",
"osd_scrub_cost": "52428800",
"osd_scrub_during_recovery": "false",
"osd_scrub_end_hour": "6",
"osd_scrub_end_week_day": "7",
"osd_scrub_interval_randomize_ratio": "0.50",
"osd_scrub_invalid_stats": "true",
"osd_scrub_load_threshold": "0.50",
"osd_scrub_max_interval": "604800.00",
"osd_scrub_max_preemptions": "5",
"osd_scrub_min_interval": "172800.00",
"osd_scrub_priority": "5",
"osd_scrub_sleep": "0.10",


ceph version 14.2.10

My ad hoc helper script

#!/bin/bash

# gently deep scrub all pgs which failed to deep scrub in set period

ceph health detail |
awk '$1=="pg" { print $2 } ' |
while read -r pg
do
ceph pg deep-scrub "$pg"
sleep 60
done


Cheers,
Dirk

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io