[ceph-users] MDS Performance and PG/PGP value

2022-10-05 Thread Yoann Moulin

Hello

As previously described here, we have a full-flash NVMe Ceph cluster (16.2.6)
with currently only the CephFS service configured.

The current setup is:

54 nodes with 1 NVMe each, 2 partitions per NVMe.
8 MDSs (7 active, 1 standby)
MDS cache memory limit set to 128 GB.
It's a hyperconverged K8s cluster; the OSDs are on K8s worker nodes, so I set
"osd memory target" to 16 GB.
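
(For reference, those two limits are usually applied with something like the
following; the byte values are my conversion of the figures above, not a
transcript of the actual commands used:)

  ceph config set mds mds_cache_memory_limit 137438953472   # 128 GiB
  ceph config set osd osd_memory_target 17179869184         # 16 GiB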

For the last couple of weeks we have had major slowdowns, with many
MDS_SLOW_REQUEST and MDS_SLOW_METADATA_IO warnings:

[WRN] Health check update: 7 MDSs report slow requests (MDS_SLOW_REQUEST)
[WRN] Health check update: 6 MDSs report slow metadata IOs 
(MDS_SLOW_METADATA_IO)
[WRN] [WRN] MDS_SLOW_METADATA_IO: 6 MDSs report slow metadata IOs
[WRN] mds.icadmin012(mds.4): 1 slow metadata IOs are blocked > 30 secs, 
oldest blocked for 164 secs
[WRN] mds.icadmin014(mds.6): 100+ slow metadata IOs are blocked > 30 secs, 
oldest blocked for 616 secs
[WRN] mds.icadmin015(mds.5): 100+ slow metadata IOs are blocked > 30 secs, 
oldest blocked for 145 secs
[WRN] mds.icadmin011(mds.2): 100+ slow metadata IOs are blocked > 30 secs, 
oldest blocked for 449 secs
[WRN] mds.icadmin013(mds.1): 100+ slow metadata IOs are blocked > 30 secs, 
oldest blocked for 650 secs
[WRN] mds.icadmin008(mds.0): 100+ slow metadata IOs are blocked > 30 secs, 
oldest blocked for 583 secs

We noticed that the cephfs_metadata pool had only 16 PGs. We set
autoscale_mode to off and increased the number of PGs to 256, and with this
change the number of SLOW messages has decreased drastically.
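
(For reference, that change boils down to something like the following; a
sketch of the commands rather than an exact transcript:)

  ceph osd pool set cephfs_metadata pg_autoscale_mode off
  ceph osd pool set cephfs_metadata pg_num 256
  ceph osd pool set cephfs_metadata pgp_num 256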



$ ceph osd pool ls detail
pool 1 'device_health_metrics' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on
 last_change 121056 lfor 0/16088/16086 flags hashpspool stripe_width 0 pg_num_min 1 application mgr_devicehealth
pool 2 'cephfs_data' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 4096 pgp_num 4096 autoscale_mode on 
 last_change 121056 lfor 0/0/213 flags hashpspool stripe_width 0 target_size_ratio 0.2 application cephfs
pool 3 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 256 autoscale_mode off 
 last_change 139312 lfor 0/92367/138900 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 256 recovery_priority 5 
 application cephfs



$ ceph -s
  cluster:
id: cc402f2e-2444-473e-adab-fe7b38d08546
health: HEALTH_OK
 
  services:

mon: 3 daemons, quorum icadmin006,icadmin007,icadmin008 (age 7w)
mgr: icadmin006(active, since 3M), standbys: icadmin007, icadmin008
mds: 7/7 daemons up, 1 standby
osd: 110 osds: 108 up (since 21h), 108 in (since 23h)
 
  data:

volumes: 1/1 healthy
pools:   3 pools, 4353 pgs
objects: 331.16M objects, 81 TiB
usage:   246 TiB used, 88 TiB / 334 TiB avail
pgs: 4350 active+clean
     2    active+clean+scrubbing+deep
     1    active+clean+scrubbing


Is there any mechanism to increase the number of PGs automatically in such a
situation? Or is this something to do manually?

Is 256 a good value in our case? We have 80 TB of data with more than 300M files.
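
(For what it's worth, the pool dump above already shows pg_num_min 256 on
cephfs_metadata; that is the knob that gives the autoscaler a floor, set with
something like:)

  ceph osd pool set cephfs_metadata pg_num_min 256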

Thank you for your help,

--
Yoann Moulin
EPFL IC-IT
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] rbd mirroring questions

2022-10-05 Thread John Ratliff
We're testing rbd mirroring so that we can replicate openstack volumes
to a new ceph cluster for use in a new openstack deployment.

We currently have one-way mirroring enabled on two of our test clusters
in pool mode.

How can I disable replication on the new cluster for a particular image
once I am done? If I turn journaling off, the volume appears to be
completely removed from the new cluster. I cannot disable the mirroring
on an image basis, because we enabled it on a pool basis.

At the moment, I cannot disable mirroring on the pool at all. There
seems to be a broken image that is causing problems.

$ sudo rbd mirror pool status volumes --verbose
health: ERROR
daemon health: WARNING
image health: ERROR
images: 2 total
1 error
1 replaying

DAEMONS
service 324099:
  instance_id:
  client_id: os-store1.uzudyj
  hostname: os-store1
  version: 15.2.13
  leader: false
  health: OK


IMAGES
volume-44975133-313a-4cfe-a74b-9f9d43d395c0:
  global_id:   c826e2e5-ce0b-43aa-ac94-0900b997841f
2022-10-05T20:34:28.046+ 7fd907fff700 -1
librbd::image::OpenRequest: failed to retrieve initial metadata: (2) No
such file or directory
2022-10-05T20:34:28.046+ 7fd9077fe700 -1 librbd::io::AioCompletion:
0x559b5cc50f60 fail: (2) No such file or directory
rbd: failed to open image volume-fa407405-56eb-460e-9ec9-aa3d73f24253:
(2) No such file or directory

$ sudo rbd mirror pool disable volumes
2022-10-05T20:43:49.558+ 7f4db5eb8380 -1 librbd::api::Mirror:
mode_set: mirror peers still registered

$ sudo rbd ls volumes
volume-44975133-313a-4cfe-a74b-9f9d43d395c0
volume-fa407405-56eb-460e-9ec9-aa3d73f24253

$ sudo rbd create --size 20 volumes/volume-fa407405-56eb-460e-9ec9-
aa3d73f24253
rbd: create error: (17) File exists
2022-10-05T20:58:35.457+ 7fe8816f3380 -1 librbd: rbd image volume-
fa407405-56eb-460e-9ec9-aa3d73f24253 already exists

What can I do to make ceph get rid of this "phantom" image? This is a
test cluster, so I don't care if it's destructive to the image or not.

If I changed the mirroring mode to image instead of pool, could I
disable mirroring on a specific image but not lose the image on the
mirror peer?
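
(For reference, in image mode the per-image controls would look roughly like
the following; a sketch assuming journal-based mirroring, with <image> as a
placeholder, not a tested recipe for this cluster:)

  rbd mirror pool enable volumes image             # switch the pool to image mode
  rbd mirror image enable volumes/<image> journal  # opt an image in
  rbd mirror image disable volumes/<image>         # stop mirroring just that image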

-- 
John Ratliff
Systems Automation Engineer 
GlobalNOC @ Indiana University
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Ceph Leadership Team Meeting Minutes - October 5, 2022

2022-10-05 Thread Neha Ojha
Hi everyone,

Here are the topics discussed in today's meeting.

- What changes with the announcement about IBM [1]? Nothing changes
for the upstream Ceph community. There will be more focus on
performance and scale testing.
- 17.2.4 was released last week, no major issues reported yet. This
release fixes a major PG log dups bug (includes online and offline
version of the fix). Users are encouraged to read the release notes
corresponding to this issue. New PG splits/merges are safe with the fix
applied.
- The next Pacific point release 16.2.11 is being planned.
https://tracker.ceph.com/issues/56488 will be prioritized for this
release.

Thanks,
Neha

[1] https://ceph.io/en/news/blog/2022/red-hats-ceph-team-is-moving-to-ibm/

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Trying to add NVMe CT1000P2SSD8

2022-10-05 Thread Murilo Morais
I've already tested the performance. Great performance, by the way, but this
anomaly keeps occurring: the OSDs start in an error state. I don't know how
to debug this problem.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Trying to add NVMe CT1000P2SSD8

2022-10-05 Thread Eneko Lacunza

Hi,

This is a consumer SSD. Did you test its performance first? Better get
a datacenter disk...
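
(A quick way to see the difference is a single-job sync write test, for
example with fio; the device path below is a placeholder and the test is
destructive to data on it:)

  fio --name=synctest --filename=/dev/nvmeXn1 --ioengine=libaio --direct=1 \
  --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based

Consumer drives typically do poorly on this workload compared to datacenter
drives with power-loss protection.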


Cheers

On 5/10/22 at 17:53, Murilo Morais wrote:

Nobody?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



Eneko Lacunza

Director Técnico | Zuzendari teknikoa

Binovo IT Human Project

943 569 206 

elacu...@binovo.es 

binovo.es 

Astigarragako Bidea, 2 - 2 izda. Oficina 10-11, 20180 Oiartzun


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs-top doesn't work

2022-10-05 Thread Jos Collin
Yes, you need perf stats version 2 for the latest cephfs-top UI to work.

On Wed, 5 Oct 2022 at 20:03, Vladimir Brik 
wrote:

> It looks like my cluster is too old. I am getting "perf
> stats version mismatch!"
>
> Vlad
>
> On 10/5/22 08:37, Jos Collin wrote:
> > This issue is fixed in
> > https://github.com/ceph/ceph/pull/48090. Could you please
> > check it out and let me know?
> >
> > Thanks.
> >
> > On Tue, 19 Apr 2022 at 01:14, Vladimir Brik wrote:
> >
> > Does anybody know why cephfs-top may only display header
> > lines (date, client types, metric names) but no actual data?
> >
> > When I run it, cephfs-top consumes quite a bit of the CPU
> > and generates quite a bit of network traffic, but it
> > doesn't
> > actually display the data.
> >
> > I poked around in the source code and it seems like it
> > might
> > be curses issue, but I am not sure.
> >
> >
> > Vlad
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> >
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Trying to add NVMe CT1000P2SSD8

2022-10-05 Thread Murilo Morais
Nobody?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: 17.2.4: mgr/cephadm/grafana_crt is ignored

2022-10-05 Thread Redouane Kachach Elhichou
Glad it helped you to fix the issue. I'll open a tracker to fix the docs.

On Wed, Oct 5, 2022 at 3:52 PM E Taka <0eta...@gmail.com> wrote:

> Thanks, Redouane, that helped! The documentation should of course also be
> updated in this context.
>
> Am Mi., 5. Okt. 2022 um 15:33 Uhr schrieb Redouane Kachach Elhichou <
> rkach...@redhat.com>:
>
> > Hello,
> >
> > As of this PR https://github.com/ceph/ceph/pull/47098 grafana cert/key
> are
> > now stored per-node. So instead of *mgr/cephadm/grafana_crt* they are
> > stored per-node as:
> >
> > *mgr/cephadm/{hostname}/grafana_crt*
> > *mgr/cephadm/{hostname}/grafana_key*
> >
> > In order to see the config entries that have been generated you can
> filter
> > by:
> >
> > > ceph config-key dump | grep grafana | grep crt
> >
> > I hope that helps,
> > Redo.
> >
> >
> >
> > On Wed, Oct 5, 2022 at 3:19 PM E Taka <0eta...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > since the last update from 17.2.3 to version 17.2.4, the
> > > mgr/cephadm/grafana_crt
> > > setting is ignored. The output of
> > >
> > > ceph config-key get mgr/cephadm/grafana_crt
> > > ceph config-key get mgr/cephadm/grafana_key
> > > ceph dashboard get-grafana-frontend-api-url
> > >
> > > is correct.
> > >
> > > Grafana and the Dashboard are re-applied, re-started, and re-configured
> > via
> > > "ceph orch", even the nodes are rebooted. The dashboard is
> {dis,en}abled
> > as
> > > documented via "ceph mgr module en/disable dashboard"
> > >
> > > But the Grafana-Dashboards still use a self signed certificate, and not
> > the
> > > provided one from mgr/cephadm/grafana_crt.
> > >
> > > Prior the update this was never a problem. What did I miss?
> > >
> > > Thanks,
> > > ___
> > > ceph-users mailing list -- ceph-users@ceph.io
> > > To unsubscribe send an email to ceph-users-le...@ceph.io
> > >
> > >
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> >
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs-top doesn't work

2022-10-05 Thread Vladimir Brik
It looks like my cluster is too old. I am getting "perf 
stats version mismatch!"


Vlad

On 10/5/22 08:37, Jos Collin wrote:
This issue is fixed in
https://github.com/ceph/ceph/pull/48090. Could you please
check it out and let me know?


Thanks.

On Tue, 19 Apr 2022 at 01:14, Vladimir Brik wrote:


Does anybody know why cephfs-top may only display header
lines (date, client types, metric names) but no actual data?

When I run it, cephfs-top consumes quite a bit of the CPU
and generates quite a bit of network traffic, but it
doesn't
actually display the data.

I poked around in the source code and it seems like it
might
be curses issue, but I am not sure.


Vlad
___
ceph-users mailing list -- ceph-users@ceph.io

To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: 17.2.4: mgr/cephadm/grafana_crt is ignored

2022-10-05 Thread E Taka
Thanks, Redouane, that helped! The documentation should of course also be
updated in this context.

Am Mi., 5. Okt. 2022 um 15:33 Uhr schrieb Redouane Kachach Elhichou <
rkach...@redhat.com>:

> Hello,
>
> As of this PR https://github.com/ceph/ceph/pull/47098 grafana cert/key are
> now stored per-node. So instead of *mgr/cephadm/grafana_crt* they are
> stored per-node as:
>
> *mgr/cephadm/{hostname}/grafana_crt*
> *mgr/cephadm/{hostname}/grafana_key*
>
> In order to see the config entries that have been generated you can filter
> by:
>
> > ceph config-key dump | grep grafana | grep crt
>
> I hope that helps,
> Redo.
>
>
>
> On Wed, Oct 5, 2022 at 3:19 PM E Taka <0eta...@gmail.com> wrote:
>
> > Hi,
> >
> > since the last update from 17.2.3 to version 17.2.4, the
> > mgr/cephadm/grafana_crt
> > setting is ignored. The output of
> >
> > ceph config-key get mgr/cephadm/grafana_crt
> > ceph config-key get mgr/cephadm/grafana_key
> > ceph dashboard get-grafana-frontend-api-url
> >
> > is correct.
> >
> > Grafana and the Dashboard are re-applied, re-started, and re-configured
> via
> > "ceph orch", even the nodes are rebooted. The dashboard is {dis,en}abled
> as
> > documented via "ceph mgr module en/disable dashboard"
> >
> > But the Grafana-Dashboards still use a self signed certificate, and not
> the
> > provided one from mgr/cephadm/grafana_crt.
> >
> > Prior the update this was never a problem. What did I miss?
> >
> > Thanks,
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> >
> >
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs-top doesn't work

2022-10-05 Thread Jos Collin
This issue is fixed in https://github.com/ceph/ceph/pull/48090. Could you
please check it out and let me know?

Thanks.

On Tue, 19 Apr 2022 at 01:14, Vladimir Brik 
wrote:

> Does anybody know why cephfs-top may only display header
> lines (date, client types, metric names) but no actual data?
>
> When I run it, cephfs-top consumes quite a bit of the CPU
> and generates quite a bit of network traffic, but it doesn't
> actually display the data.
>
> I poked around in the source code and it seems like it might
> be curses issue, but I am not sure.
>
>
> Vlad
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph tell setting ignored?

2022-10-05 Thread Nicola Mori

That's indeed the case:

# ceph config get osd osd_op_queue
mclock_scheduler

Thank you very much for this tip, I'll play with mclock parameters.
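
(For recovery/backfill tuning under mClock, the usual starting point is the
built-in profiles rather than the individual backfill knobs; a sketch, with
the profile names taken from the Quincy docs linked below:

  ceph config set osd osd_mclock_profile high_recovery_ops   # prioritise recovery
  ceph config set osd osd_mclock_profile high_client_ops     # prioritise client IO again

)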

On 05/10/22 13:11, Janne Johansson wrote:

# ceph tell osd.2 config get osd_max_backfills
{
  "osd_max_backfills": "1000"
}

makes little sense to me.



This means you have the mClock IO scheduler, and it gives back this
value since you are meant to change the mClock priorities and not the
number of backfills.

Some more info at

https://docs.ceph.com/en/quincy/rados/configuration/osd-config-ref/#dmclock-qos




--
Nicola Mori, Ph.D.
INFN sezione di Firenze
Via Bruno Rossi 1, 50019 Sesto F.no (Italy)
+390554572660
m...@fi.infn.it
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: 17.2.4: mgr/cephadm/grafana_crt is ignored

2022-10-05 Thread Redouane Kachach Elhichou
Hello,

As of this PR https://github.com/ceph/ceph/pull/47098 grafana cert/key are
now stored per-node. So instead of *mgr/cephadm/grafana_crt* they are
stored per-node as:

*mgr/cephadm/{hostname}/grafana_crt*
*mgr/cephadm/{hostname}/grafana_key*

In order to see the config entries that have been generated you can filter
by:

> ceph config-key dump | grep grafana | grep crt
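
(To get your own certificate picked up again, the per-host keys can be set
directly; a sketch, with <hostname> and the file paths as placeholders:)

> ceph config-key set mgr/cephadm/<hostname>/grafana_crt -i /path/to/grafana.crt
> ceph config-key set mgr/cephadm/<hostname>/grafana_key -i /path/to/grafana.key
> ceph orch reconfig grafana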

I hope that helps,
Redo.



On Wed, Oct 5, 2022 at 3:19 PM E Taka <0eta...@gmail.com> wrote:

> Hi,
>
> since the last update from 17.2.3 to version 17.2.4, the
> mgr/cephadm/grafana_crt
> setting is ignored. The output of
>
> ceph config-key get mgr/cephadm/grafana_crt
> ceph config-key get mgr/cephadm/grafana_key
> ceph dashboard get-grafana-frontend-api-url
>
> is correct.
>
> Grafana and the Dashboard are re-applied, re-started, and re-configured via
> "ceph orch", even the nodes are rebooted. The dashboard is {dis,en}abled as
> documented via "ceph mgr module en/disable dashboard"
>
> But the Grafana-Dashboards still use a self signed certificate, and not the
> provided one from mgr/cephadm/grafana_crt.
>
> Prior the update this was never a problem. What did I miss?
>
> Thanks,
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] 17.2.4: mgr/cephadm/grafana_crt is ignored

2022-10-05 Thread E Taka
Hi,

since the last update from 17.2.3 to version 17.2.4, the
mgr/cephadm/grafana_crt
setting is ignored. The output of

ceph config-key get mgr/cephadm/grafana_crt
ceph config-key get mgr/cephadm/grafana_key
ceph dashboard get-grafana-frontend-api-url

is correct.

Grafana and the Dashboard are re-applied, re-started, and re-configured via
"ceph orch", even the nodes are rebooted. The dashboard is {dis,en}abled as
documented via "ceph mgr module en/disable dashboard"

But the Grafana-Dashboards still use a self signed certificate, and not the
provided one from mgr/cephadm/grafana_crt.

Prior the update this was never a problem. What did I miss?

Thanks,
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph on kubernetes

2022-10-05 Thread Nico Schottelius

Hey Oğuz,

the typical recommendations of native ceph still uphold in k8s,
additionally something you need to consider:

- Hyperconverged setup or dedicated nodes - what is your workload and
  budget
- Similar to native ceph, think about where you want to place data, this
  influences the selector inside rook of which devices / nodes to add
- Inside & Outside consumption: rook is very good with in-cluster
  configurations, creating PVCs/PVs; however, you can also use rook-managed
  ceph for consumption outside the cluster
- mgr: usually we run 1+2 (standby) on native clusters, with k8s/rook it
  might be good enough to use 1 mgr, as k8s can take care of
  restarting/redeploying
- traffic separation: if that is a concern, you might want to go with
  multus in addition to your standard CNI
- Rook does not assign `resource` specs to OSD pods by default, if you
  hyperconverge you should be aware of that
- Always have the ceph-toolbox deployed - while you need it rarely, when
  you need it, you don't want to think about where to get the pod and
  how to access it (a quick example follows below)
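  A quick sketch, assuming the standard toolbox manifest with the default
  rook-ceph namespace and deployment name:
      kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph status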

Otherwise from our experience rook/ceph is probably the easiest in
regards to updates, easier than native handling and I suppose (*) easier
than cephadm as well.

Best regards,

Nico

(*) Can only judge from the mailing list comments, we cannot use cephadm
as our hosts are natively running Alpine Linux without systemd.

Oğuz Yarımtepe  writes:

> Hi,
>
> I am using Ceph on RKE2. Rook operator is installed on a rke2 cluster
> running on Azure vms. I would like to learn whether there are best
> practices for ceph on Kubernetes, like separating ceph nodes or pools or
> using some custom settings for Kubernetes environment. Will be great if
> anyone shares tips.
>
> Regards.


--
Sustainable and modern Infrastructures by ungleich.ch
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph tell setting ignored?

2022-10-05 Thread Stefan Kooman

On 10/5/22 12:09, Nicola Mori wrote:

Dear Ceph users,

I am trying to tune my cluster's recovery and backfill. On the web I 
found that I can set related tunables by e.g.:


ceph tell osd.* injectargs --osd-recovery-sleep-hdd=0.0 
--osd-max-backfills=8 --osd-recovery-max-active=8 
--osd-recovery-max-single-start=4


but I cannot find a way to be sure that the settings have been correctly 
applied and are being honored. Actually, when querying OSD settings I 
get e.g.:


# ceph config show osd.0 | grep osd_max_backfill
osd_max_backfills    1000


Login on the host where osd.0 resides. Ask the daemon:

ceph daemon osd.0 config get osd_max_backfills

That will give you the actual running state

You can do that for any daemon type / config setting.

Gr. Stefan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephadm automatic sizing of WAL/DB on SSD

2022-10-05 Thread Anh Phan Tuan
It seems the 17.2.4 release has fixed this.

ceph-volume: fix fast device alloc size on mulitple device (pr#47293,
> Arthur Outhenin-Chalandre)


Bug #56031: batch compute a lower size than what it should be for blockdb
with multiple fast device - ceph-volume - Ceph


Regards,
Anh Phan

On Fri, Sep 16, 2022 at 2:34 AM Christophe BAILLON  wrote:

> Hi
>
> The problem is still present in version 17.2.3,
> thanks for the trick to work around...
>
> Regards
>
> - Mail original -
> > De: "Anh Phan Tuan" 
> > À: "Calhoun, Patrick" 
> > Cc: "Arthur Outhenin-Chalandre" ,
> "ceph-users" 
> > Envoyé: Jeudi 11 Août 2022 10:14:17
> > Objet: [ceph-users] Re: cephadm automatic sizing of WAL/DB on SSD
>
> > Hi Patrick,
> >
> > I am also facing this bug when deploying a new cluster at the time 16.2.7
> > release.
> >
> > The bug relates to the way Ceph calculates the db size for each db disk.
> >
> > Instead of: slot db size = size of db disk / num slots per disk,
> > Ceph calculated: slot db size = size of one db disk / total number of
> > slots needed (number of OSDs prepared at that time).
> >
> > In your case you have 2 db disks, so it makes the db size only 50% of
> > the correct value.
> > In my case I have 4 db disks per host, so it makes the db size only 25%
> > of the correct value.
> >
> > This bug happens even when you deploy with the batch command.
> > At that time I worked around it by using the batch command but only
> > deploying the OSDs belonging to one db disk at a time; in that case Ceph
> > calculated the correct value.
> >
> > Cheers,
> > Anh Phan
> >
> >
> >
> > On Sat, Jul 30, 2022 at 12:31 AM Calhoun, Patrick 
> wrote:
> >
> >> Thanks, Arthur,
> >>
> >> I think you are right about that bug looking very similar to what I've
> >> observed. I'll try to remember to update the list once the fix is merged
> >> and released and I get a chance to test it.
> >>
> >> I'm hoping somebody can comment on what are ceph's current best
> practices
> >> for sizing WAL/DB volumes, considering rocksdb levels and compaction.
> >>
> >> -Patrick
> >>
> >> 
> >> From: Arthur Outhenin-Chalandre 
> >> Sent: Friday, July 29, 2022 2:11 AM
> >> To: ceph-users@ceph.io 
> >> Subject: [ceph-users] Re: cephadm automatic sizing of WAL/DB on SSD
> >>
> >> Hi Patrick,
> >>
> >> On 7/28/22 16:22, Calhoun, Patrick wrote:
> >> > In a new OSD node with 24 hdd (16 TB each) and 2 ssd (1.44 TB each),
> I'd
> >> like to have "ceph orch" allocate WAL and DB on the ssd devices.
> >> >
> >> > I use the following service spec:
> >> > spec:
> >> >   data_devices:
> >> > rotational: 1
> >> > size: '14T:'
> >> >   db_devices:
> >> > rotational: 0
> >> > size: '1T:'
> >> >   db_slots: 12
> >> >
> >> > This results in each OSD having a 60GB volume for WAL/DB, which
> equates
> >> to 50% total usage in the VG on each ssd, and 50% free.
> >> > I honestly don't know what size to expect, but exactly 50% of capacity
> >> makes me suspect this is due to a bug:
> >> > https://tracker.ceph.com/issues/54541
> >> > (In fact, I had run into this bug when specifying block_db_size rather
> >> than db_slots)
> >> >
> >> > Questions:
> >> >   Am I being bit by that bug?
> >> >   Is there a better approach, in general, to my situation?
> >> >   Are DB sizes still governed by the rocksdb tiering? (I thought that
> >> this was mostly resolved by https://github.com/ceph/ceph/pull/29687 )
> >> >   If I provision a DB/WAL logical volume size to 61GB, is that
> >> effectively a 30GB database, and 30GB of extra room for compaction?
> >>
> >> I don't use cephadm, but it's maybe related to this regression:
> >> https://tracker.ceph.com/issues/56031. At list the symptoms looks very
> >> similar...
> >>
> >> Cheers,
> >>
> >> --
> >> Arthur Outhenin-Chalandre
> >> ___
> >> ceph-users mailing list -- ceph-users@ceph.io
> >> To unsubscribe send an email to ceph-users-le...@ceph.io
> >> ___
> >> ceph-users mailing list -- ceph-users@ceph.io
> >> To unsubscribe send an email to ceph-users-le...@ceph.io
> >>
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
> --
> Christophe BAILLON
> Mobile :: +336 16 400 522
> Work :: https://eyona.com
> Twitter :: https://twitter.com/ctof
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] ceph on kubernetes

2022-10-05 Thread Oğuz Yarımtepe
Hi,

I am using Ceph on RKE2. Rook operator is installed on a rke2 cluster
running on Azure vms. I would like to learn whether there are best
practices for ceph on Kubernetes, like separating ceph nodes or pools or
using some custom settings for Kubernetes environment. Will be great if
anyone shares tips.

Regards.

-- 
Oğuz Yarımtepe
http://about.me/oguzy
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph tell setting ignored?

2022-10-05 Thread Wout van Heeswijk
Hi Nicola,

Maybe 'config diff' can be of use to you

ceph tell osd.2 config diff

It should tell you every value that is not 'default' and where the value(s) 
came from (File, mon, override).

Wout

-Oorspronkelijk bericht-
Van: Nicola Mori  
Verzonden: Wednesday, 5 October 2022 12:33
Aan: ceph-users@ceph.io
Onderwerp: [ceph-users] Re: ceph tell setting ignored?

But how can I check if the applied temporary value has been correctly set? 
Maybe I'm doing something wrong, but this:

# ceph tell osd.2 config set osd_max_backfills 8
{
 "success": "osd_max_backfills = '8' "
}
# ceph tell osd.2 config get osd_max_backfills
{
 "osd_max_backfills": "1000"
}

makes little sense to me.
On 05/10/22 12:13, Anthony D'Atri wrote:
> Injection modifies the running state of the specified daemons.  It does not 
> modify the central config database (saved / persistent state). Injected 
> values will go away when the daemon restarts.
> 
>> On Oct 5, 2022, at 6:10 AM, Nicola Mori  wrote:
>>
>> Dear Ceph users,
>>
>> I am trying to tune my cluster's recovery and backfill. On the web I found 
>> that I can set related tunables by e.g.:
>>
>> ceph tell osd.* injectargs --osd-recovery-sleep-hdd=0.0 
>> --osd-max-backfills=8 --osd-recovery-max-active=8 
>> --osd-recovery-max-single-start=4
>>
>> but I cannot find a way to be sure that the settings have been correctly 
>> applied and are being honored. Actually, when querying OSD settings I get 
>> e.g.:
>>
>> # ceph config show osd.0 | grep osd_max_backfill
>> osd_max_backfills    1000    override
>>
>> This does not change even if I try to set just a single property for a 
>> single osd:
>>
>> # ceph tell osd.0 injectargs --osd-max-backfills=8
>> {}
>> osd_max_backfills = '8' osd_recovery_max_active = '1000'
>> # ceph config show osd.0 | grep osd_max_backfills
>> osd_max_backfills    1000    override
>>
>> Am I overlooking or doing something wrong?
>> Thanks,
>>
>> Nicola
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an 
>> email to ceph-users-le...@ceph.io

--
Nicola Mori, Ph.D.
INFN sezione di Firenze
Via Bruno Rossi 1, 50019 Sesto F.no (Italy)
+390554572660
m...@fi.infn.it
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph tell setting ignored?

2022-10-05 Thread Janne Johansson
> # ceph tell osd.2 config get osd_max_backfills
> {
>  "osd_max_backfills": "1000"
> }
>
> makes little sense to me.


This means you have the mClock IO scheduler, and it gives back this
value since you are meant to change the mClock priorities and not the
number of backfills.

Some more info at

https://docs.ceph.com/en/quincy/rados/configuration/osd-config-ref/#dmclock-qos


-- 
May the most significant bit of your life be positive.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph tell setting ignored?

2022-10-05 Thread Nicola Mori
But how can I check if the applied temporary value has been correctly 
set? Maybe I'm doing something wrong, but this:


# ceph tell osd.2 config set osd_max_backfills 8
{
"success": "osd_max_backfills = '8' "
}
# ceph tell osd.2 config get osd_max_backfills
{
"osd_max_backfills": "1000"
}

makes little sense to me.
On 05/10/22 12:13, Anthony D'Atri wrote:

Injection modifies the running state of the specified daemons.  It does not 
modify the central config database (saved / persistent state). Injected values 
will go away when the daemon restarts.


On Oct 5, 2022, at 6:10 AM, Nicola Mori  wrote:

Dear Ceph users,

I am trying to tune my cluster's recovery and backfill. On the web I found that 
I can set related tunables by e.g.:

ceph tell osd.* injectargs --osd-recovery-sleep-hdd=0.0 --osd-max-backfills=8 
--osd-recovery-max-active=8 --osd-recovery-max-single-start=4

but I cannot find a way to be sure that the settings have been correctly 
applied and are being honored. Actually, when querying OSD settings I get e.g.:

# ceph config show osd.0 | grep osd_max_backfill
osd_max_backfills    1000    override

This does not change even if I try to set just a single property for a single 
osd:

# ceph tell osd.0 injectargs --osd-max-backfills=8
{}
osd_max_backfills = '8' osd_recovery_max_active = '1000'
# ceph config show osd.0 | grep osd_max_backfills
osd_max_backfills    1000    override

Am I overlooking or doing something wrong?
Thanks,

Nicola
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


--
Nicola Mori, Ph.D.
INFN sezione di Firenze
Via Bruno Rossi 1, 50019 Sesto F.no (Italy)
+390554572660
m...@fi.infn.it
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph tell setting ignored?

2022-10-05 Thread Anthony D'Atri
Injection modifies the running state of the specified daemons.  It does not 
modify the central config database (saved / persistent state). Injected values 
will go away when the daemon restarts.  
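
For a persistent change the central config database is the place, e.g. something like:

  ceph config set osd osd_max_backfills 8              # survives restarts
  ceph tell osd.* injectargs --osd-max-backfills=8     # runtime only

(A sketch; with the mClock scheduler active the backfill limit is managed
differently, as noted elsewhere in this thread.)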

> On Oct 5, 2022, at 6:10 AM, Nicola Mori  wrote:
> 
> Dear Ceph users,
> 
> I am trying to tune my cluster's recovery and backfill. On the web I found 
> that I can set related tunables by e.g.:
> 
> ceph tell osd.* injectargs --osd-recovery-sleep-hdd=0.0 --osd-max-backfills=8 
> --osd-recovery-max-active=8 --osd-recovery-max-single-start=4
> 
> but I cannot find a way to be sure that the settings have been correctly 
> applied and are being honored. Actually, when querying OSD settings I get 
> e.g.:
> 
> # ceph config show osd.0 | grep osd_max_backfill
> osd_max_backfills    1000    override
> 
> This does not change even if I try to set just a single property for a single 
> osd:
> 
> # ceph tell osd.0 injectargs --osd-max-backfills=8
> {}
> osd_max_backfills = '8' osd_recovery_max_active = '1000'
> # ceph config show osd.0 | grep osd_max_backfills
> osd_max_backfills    1000    override
> 
> Am I overlooking or doing something wrong?
> Thanks,
> 
> Nicola
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] ceph tell setting ignored?

2022-10-05 Thread Nicola Mori

Dear Ceph users,

I am trying to tune my cluster's recovery and backfill. On the web I 
found that I can set related tunables by e.g.:


ceph tell osd.* injectargs --osd-recovery-sleep-hdd=0.0 
--osd-max-backfills=8 --osd-recovery-max-active=8 
--osd-recovery-max-single-start=4


but I cannot find a way to be sure that the settings have been correctly 
applied and are being honored. Actually, when querying OSD settings I 
get e.g.:


# ceph config show osd.0 | grep osd_max_backfill
osd_max_backfills    1000    override

This does not change even if I try to set just a single property for a 
single osd:


# ceph tell osd.0 injectargs --osd-max-backfills=8
{}
osd_max_backfills = '8' osd_recovery_max_active = '1000'
# ceph config show osd.0 | grep osd_max_backfills
osd_max_backfills    1000    override

Am I overlooking or doing something wrong?
Thanks,

Nicola
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io