[ceph-users] Re: Remapping OSDs under a PG

2021-05-27 Thread 胡 玮文

On May 28, 2021, at 08:18, Jeremy Hansen wrote:


I’m very new to Ceph so if this question makes no sense, I apologize.  
Continuing to study but I thought an answer to this question would help me 
understand Ceph a bit more.

Using cephadm, I set up a cluster.  Cephadm automatically creates a pool for 
Ceph metrics.  It looks like one of my SSD OSDs was allocated for the PG.  I’d 
like to understand how to remap this PG so it’s not using the SSD OSDs.

ceph pg map 1.0
osdmap e205 pg 1.0 (1.0) -> up [28,33,10] acting [28,33,10]

OSD 28 is the SSD.

Is this possible?  Does this make any sense?  I’d like to reserve the SSDs for 
their own pool.

Yes, you can refer to the doc [1]. You need to create a new crush rule with HDD 
device class, and assign this new rule to that pool.

[1]: https://docs.ceph.com/en/latest/rados/operations/crush-map/#device-classes
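
For example, assuming the metrics pool is the default one named
device_health_metrics (adjust the pool and rule names if yours differ), it
would be roughly:

# ceph osd crush rule create-replicated replicated_hdd default host hdd
# ceph osd pool set device_health_metrics crush_rule replicated_hdd

The pool's data then migrates off the SSD OSDs on its own.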

Weiwen Hu

Thank you!
-jeremy
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Remapping OSDs under a PG

2021-05-27 Thread Jeremy Hansen

I’m very new to Ceph so if this question makes no sense, I apologize.  
Continuing to study but I thought an answer to this question would help me 
understand Ceph a bit more.

Using cephadm, I set up a cluster.  Cephadm automatically creates a pool for 
Ceph metrics.  It looks like one of my SSD OSDs was allocated for the PG.  I’d 
like to understand how to remap this PG so it’s not using the SSD OSDs.

ceph pg map 1.0
osdmap e205 pg 1.0 (1.0) -> up [28,33,10] acting [28,33,10]

OSD 28 is the SSD.

Is this possible?  Does this make any sense?  I’d like to reserve the SSDs for 
their own pool.

Thank you!
-jeremy


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Messed up placement of MDS

2021-05-27 Thread mabi
Hello,

I am trying to place the two MDS daemons for CephFS on dedicated nodes. For 
that purpose I tried out a few different "cephadm orch apply ..." commands with 
a label but at the end it looks like I messed up with the placement as I now 
have two mds service_types as you can see below:

# ceph orch ls --service-type mds --export
service_type: mds
service_id: ceph1fs
service_name: mds.ceph1fs
placement:
  count: 2
  hosts:
  - ceph1g
  - ceph1a
---
service_type: mds
service_id: label:mds
service_name: mds.label:mds
placement:
  count: 2


This second entry at the bottom seems wrong and I would like to remove it, but 
I haven't found out how to remove it completely. Any ideas?

Ideally I just want to place two MDS daemons on node ceph1a and ceph1g.
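
For reference, what I am considering (untested, names taken from the output
above) is something like:

# ceph orch rm mds.label:mds
# ceph orch apply mds ceph1fs --placement="2 ceph1a ceph1g"

The second command may not even be needed, since the mds.ceph1fs spec above
already lists those two hosts.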

Regards,
Mabi

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] cephfs auditing

2021-05-27 Thread Michael Thomas
Is there a way to log or track which cephfs files are being accessed? 
This would help us in planning where to place certain datasets based on 
popularity, e.g. on an EC HDD pool or a replicated SSD pool.


I know I can run inotify on the ceph clients, but I was hoping that the 
MDS would have a way to log this information centrally.
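
For reference, the client-side stopgap I have in mind is roughly the following
(assuming inotify-tools is installed and /mnt/cephfs is just an example mount
point); it of course only sees accesses made through that one client:

# inotifywait -m -r -e open,access /mnt/cephfs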


--Mike
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] XFS on RBD on EC painfully slow

2021-05-27 Thread Reed Dier
Hoping someone may be able to help point out where my bottleneck(s) may be.

I have an 80TB kRBD image on an EC8:2 pool, with an XFS filesystem on top of 
that.
This was not an ideal scenario, rather it was a rescue mission to dump a large, 
aging raid array before it was too late, so I'm working with the hand I was 
dealt.

To further complicate the issue, the main directory structure consists of lots 
and lots of small files and deep directories.

My goal is to try and rsync (or otherwise copy) data from the RBD to cephfs, but 
it's just unbearably slow and will take ~150 days to transfer ~35TB, which is far 
from ideal.

>  15.41G  79%4.36MB/s0:56:09 (xfr#23165, ir-chk=4061/27259)

> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>0.170.001.34   13.230.00   85.26
> 
> Devicer/s rMB/s   rrqm/s  %rrqm r_await rareq-sz w/s 
> wMB/s   wrqm/s  %wrqm w_await wareq-sz d/s dMB/s   drqm/s  %drqm 
> d_await dareq-sz  aqu-sz  %util
> rbd0   124.00  0.66 0.00   0.00   17.30 5.48   50.00  
> 0.17 0.00   0.00   31.70 3.490.00  0.00 0.00   0.00
> 0.00 0.003.39  96.40

Rsync progress and iostat (during the rsync) from the rbd to a local ssd, to 
remove any bottlenecks doubling back to cephfs.
About 16G in 1h, not exactly blazing; this is 5 of the ~7000 directories I'm 
looking to offload to cephfs.

Currently running 15.2.11, and the host is Ubuntu 20.04 (5.4.0-72-generic) with 
a single E5-2620, 64GB of memory, and 4x10GbT bond talking to ceph, iperf 
proves it out.
EC8:2, across about 16 hosts, 240 OSDs, with 24 of those being 8TB 7.2k SAS, 
and the other 216 being 2TB 7.2K SATA. So there are quite a few spindles in 
play here.
Only 128 PGs in this pool, but it's the only RBD image in the pool. The 
autoscaler recommends going to 512, but I was hoping to avoid the performance 
overhead of the PG splits if possible, given that performance is bad enough as is.

Examining the main directory structure it looks like there are 7000 files per 
directory, about 60% of which are <1MiB, and in all totaling nearly 5GiB per 
directory.

My fstab for this is:
> xfs   _netdev,noatime 0   0

I tried to increase the read_ahead_kb to 4M from 128K at 
/sys/block/rbd0/queue/read_ahead_kb to match the object/stripe size of the EC 
pool, but that doesn't appear to have had much of an impact.
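
For reference, that change was just the sysfs knob, with 4 MiB expressed in KiB:

# echo 4096 > /sys/block/rbd0/queue/read_ahead_kb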

The only thing I can think of that I could possibly try as a change would be to 
increase the queue depth in the rbdmap up from 128, so that's my next bullet to 
fire.
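
If I do go down that path, I assume it would look roughly like this, after
unmounting the filesystem (queue_depth is a krbd map option; the pool name is a
placeholder and 256 is just an arbitrary value to try):

# rbd device unmap /dev/rbd0
# rbd device map <pool>/rbd-image-name -o queue_depth=256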

Attaching xfs_info in case there are any useful nuggets:
> meta-data=/dev/rbd0  isize=256agcount=81, agsize=268435455 
> blks
>  =   sectsz=512   attr=2, projid32bit=0
>  =   crc=0finobt=0, sparse=0, rmapbt=0
>  =   reflink=0
> data =   bsize=4096   blocks=21483470848, imaxpct=5
>  =   sunit=0  swidth=0 blks
> naming   =version 2  bsize=4096   ascii-ci=0, ftype=0
> log  =internal log   bsize=4096   blocks=32768, version=2
>  =   sectsz=512   sunit=0 blks, lazy-count=0
> realtime =none   extsz=4096   blocks=0, rtextents=0

And rbd-info:
> rbd image 'rbd-image-name':
> size 85 TiB in 22282240 objects
> order 22 (4 MiB objects)
> snapshot_count: 0
> id: a09cac2b772af5
> data_pool: rbd-ec82-pool
> block_name_prefix: rbd_data.29.a09cac2b772af5
> format: 2
> features: layering, exclusive-lock, object-map, fast-diff, 
> deep-flatten, data-pool
> op_features:
> flags:
> create_timestamp: Mon Apr 12 18:44:38 2021
> access_timestamp: Mon Apr 12 18:44:38 2021
> modify_timestamp: Mon Apr 12 18:44:38 2021


Any other ideas or hints are greatly appreciated.

Thanks,
Reed

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph osd will not start.

2021-05-27 Thread Peter Childs
In the end it looks like I might be able to get the node up to about 30
OSDs before it stops creating any more.

Or rather, it formats the disks but freezes up starting the daemons.

I suspect I'm missing something I can tune to get it working better.

If I could see any error messages that might help, but I'm yet to spot
anything.
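
If I do try the "limit:" workaround mentioned below, my guess is the spec would
look something like this (the limit value of 10 is arbitrary, just a starting
point):

service_type: osd
service_id: drywood-disks
placement:
  host_pattern: 'drywood*'
spec:
  data_devices:
    size: "7TB:"
    limit: 10
  objectstore: bluestore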

Peter.

On Wed, 26 May 2021, 10:57 Eugen Block,  wrote:

> > If I add the osd daemons one at a time with
> >
> > ceph orch daemon add osd drywood12:/dev/sda
> >
> > It does actually work,
>
> Great!
>
> > I suspect what's happening is when my rule for creating osds run and
> > creates them all-at-once it ties the orch it overloads cephadm and it
> can't
> > cope.
>
> It's possible, I guess.
>
> > I suspect what I might need to do at least to work around the issue is
> set
> > "limit:" and bring it up until it stops working.
>
> It's worth a try, yes, although the docs state you should try to avoid
> it, it's possible that it doesn't work properly, in that case create a
> bug report. ;-)
>
> > I did work out how to get ceph-volume to nearly work manually.
> >
> > cephadm shell
> > ceph auth get client.bootstrap-osd -o
> > /var/lib/ceph/bootstrap-osd/ceph.keyring
> > ceph-volume lvm create --data /dev/sda --dmcrypt
> >
> > but given I've now got "add osd" to work, I suspect I just need to fine
> > tune my osd creation rules, so it does not try and create too many osds
> on
> > the same node at the same time.
>
> I agree, no need to do it manually if there is an automated way,
> especially if you're trying to bring up dozens of OSDs.
>
>
> Zitat von Peter Childs :
>
> > After a bit of messing around. I managed to get it somewhat working.
> >
> > If I add the osd daemons one at a time with
> >
> > ceph orch daemon add osd drywood12:/dev/sda
> >
> > It does actually work,
> >
> > I suspect what's happening is when my rule for creating osds run and
> > creates them all-at-once it ties the orch it overloads cephadm and it
> can't
> > cope.
> >
> > service_type: osd
> > service_name: osd.drywood-disks
> > placement:
> >   host_pattern: 'drywood*'
> > spec:
> >   data_devices:
> > size: "7TB:"
> >   objectstore: bluestore
> >
> > I suspect what I might need to do at least to work around the issue is
> set
> > "limit:" and bring it up until it stops working.
> >
> > I did work out how to get ceph-volume to nearly work manually.
> >
> > cephadm shell
> > ceph auth get client.bootstrap-osd -o
> > /var/lib/ceph/bootstrap-osd/ceph.keyring
> > ceph-volume lvm create --data /dev/sda --dmcrypt
> >
> > but given I've now got "add osd" to work, I suspect I just need to fine
> > tune my osd creation rules, so it does not try and create too many osds
> on
> > the same node at the same time.
> >
> >
> >
> > On Wed, 26 May 2021 at 08:25, Eugen Block  wrote:
> >
> >> Hi,
> >>
> >> I believe your current issue is due to a missing keyring for
> >> client.bootstrap-osd on the OSD node. But even after fixing that
> >> you'll probably still won't be able to deploy an OSD manually with
> >> ceph-volume because 'ceph-volume activate' is not supported with
> >> cephadm [1]. I just tried that in a virtual environment, it fails when
> >> activating the systemd-unit:
> >>
> >> ---snip---
> >> [2021-05-26 06:47:16,677][ceph_volume.process][INFO  ] Running
> >> command: /usr/bin/systemctl enable
> >> ceph-volume@lvm-8-1a8fc8ae-8f4c-4f91-b044-d5636bb52456
> >> [2021-05-26 06:47:16,692][ceph_volume.process][INFO  ] stderr Failed
> >> to connect to bus: No such file or directory
> >> [2021-05-26 06:47:16,693][ceph_volume.devices.lvm.create][ERROR ] lvm
> >> activate was unable to complete, while creating the OSD
> >> Traceback (most recent call last):
> >>File
> >> "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/create.py",
> >> line 32, in create
> >>  Activate([]).activate(args)
> >>File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py",
> >> line 16, in is_root
> >>  return func(*a, **kw)
> >>File
> >> "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/activate.py",
> >> line
> >> 294, in activate
> >>  activate_bluestore(lvs, args.no_systemd)
> >>File
> >> "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/activate.py",
> >> line
> >> 214, in activate_bluestore
> >>  systemctl.enable_volume(osd_id, osd_fsid, 'lvm')
> >>File
> >> "/usr/lib/python3.6/site-packages/ceph_volume/systemd/systemctl.py",
> >> line 82, in enable_volume
> >>  return enable(volume_unit % (device_type, id_, fsid))
> >>File
> >> "/usr/lib/python3.6/site-packages/ceph_volume/systemd/systemctl.py",
> >> line 22, in enable
> >>  process.run(['systemctl', 'enable', unit])
> >>File "/usr/lib/python3.6/site-packages/ceph_volume/process.py",
> >> line 153, in run
> >>  raise RuntimeError(msg)
> >> RuntimeError: command returned non-zero exit status: 1
> >> [2021-05-26 06:47:16,694][ceph_volume.devices.lvm.create][INFO  ] will
> >> rollback OSD ID creation
> 

[ceph-users] Re: How to add back stray OSD daemon after node re-installation

2021-05-27 Thread mabi
It works again, but I had to do a stop/start of the OSD from an admin node:

# ceph orch daemon stop osd.2
# ceph orch daemon start osd.2

What an adventure, thanks again so much for your help!

‐‐‐ Original Message ‐‐‐
On Thursday, May 27, 2021 3:37 PM, Eugen Block  wrote:

> That file is in the regular filesystem, you can copy it from a
> different osd directory, it just a minimal ceph.conf. The directory
> for the failing osd should now be present after the failed attempts.
>
> Zitat von mabi m...@protonmail.ch:
>
> > Nicely spotted about the missing file, it looks like I have the same
> > case as you can see below from the syslog:
> > May 27 15:33:12 ceph1f systemd[1]:
> > ceph-8d47792c-987d-11eb-9bb6-a5302e00e1fa@osd.2.service: Scheduled
> > restart job, restart counter is at 1.
> > May 27 15:33:12 ceph1f systemd[1]: Stopped Ceph osd.2 for
> > 8d47792c-987d-11eb-9bb6-a5302e00e1fa.
> > May 27 15:33:12 ceph1f systemd[1]: Starting Ceph osd.2 for
> > 8d47792c-987d-11eb-9bb6-a5302e00e1fa...
> > May 27 15:33:12 ceph1f kernel: [19332.481779] overlayfs:
> > unrecognized mount option "volatile" or missing value
> > May 27 15:33:13 ceph1f kernel: [19332.709205] overlayfs:
> > unrecognized mount option "volatile" or missing value
> > May 27 15:33:13 ceph1f kernel: [19332.933442] overlayfs:
> > unrecognized mount option "volatile" or missing value
> > May 27 15:33:13 ceph1f bash[64982]: Error: statfs
> > /var/lib/ceph/8d47792c-987d-11eb-9bb6-a5302e00e1fa/osd.2/config: no
> > such file or directory
> > May 27 15:33:13 ceph1f systemd[1]:
> > ceph-8d47792c-987d-11eb-9bb6-a5302e00e1fa@osd.2.service: Control
> > process exited, code=exited, status=125/n/a
> > So how do I go to generate/create that missing
> > /var/lib/ceph/8d47792c-987d-11eb-9bb6-a5302e00e1fa/osd.2/config file?
> > ‐‐‐ Original Message ‐‐‐
> > On Thursday, May 27, 2021 3:28 PM, Eugen Block ebl...@nde.ag wrote:
> >
> > > Can you try with both cluster and osd fsid? Something like this:
> > > pacific2:~ # cephadm deploy --name osd.2 --fsid
> > > acbb46d6-bde3-11eb-9cf2-fa163ebb2a74 --osd-fsid
> > > bc241cd4-e284-4c5a-aad2-5744632fc7fc
> > > I tried to reproduce a similar scenario and found a missing config
> > > file in the osd directory:
> > > Error: statfs
> > > /var/lib/ceph/acbb46d6-bde3-11eb-9cf2-fa163ebb2a74/osd.2/config: no
> > > such file or directory
> > > Check your syslog for more information why the osd start fails.
> > > Zitat von mabi m...@protonmail.ch:
> > >
> > > > You are right, I used the FSID of the OSD and not of the cluster in
> > > > the deploy command. So now I tried again with the cluster ID as FSID
> > > > but still it does not work as you can see below:
> > > > ubuntu@ceph1f:~$ sudo cephadm deploy --name osd.2 --fsid
> > > > 8d47792c-987d-11eb-9bb6-a5302e00e1fa
> > > > Deploy daemon osd.2 ...
> > > > Traceback (most recent call last):
> > > > File "/usr/local/sbin/cephadm", line 6223, in 
> > > > r = args.func()
> > > > File "/usr/local/sbin/cephadm", line 1440, in _default_image
> > > > return func()
> > > > File "/usr/local/sbin/cephadm", line 3457, in command_deploy
> > > > deploy_daemon(args.fsid, daemon_type, daemon_id, c, uid, gid,
> > > > File "/usr/local/sbin/cephadm", line 2193, in deploy_daemon
> > > > deploy_daemon_units(fsid, uid, gid, daemon_type, daemon_id, c,
> > > > File "/usr/local/sbin/cephadm", line 2255, in deploy_daemon_units
> > > > assert osd_fsid
> > > > AssertionError
> > > > In case that's of any help here is the output of the "cephadm
> > > > ceph-volume lvm list" command:
> > > > == osd.2 ===
> > > > [block]
> > >
> > > /dev/ceph-cca8abe6-cf9b-4c2f-ab81-ae0758585414/osd-block-91a86f20-8083-40b1-8bf1-fe35fac3d677
> > >
> > > >   block device
> > > >
> > >
> > > /dev/ceph-cca8abe6-cf9b-4c2f-ab81-ae0758585414/osd-block-91a86f20-8083-40b1-8bf1-fe35fac3d677
> > >
> > > > block uuid W3omTg-vami-RB0V-CkVb-cgpb-88Jy-pIK2Tz
> > > > cephx lockbox secret
> > > > cluster fsid 8d47792c-987d-11eb-9bb6-a5302e00e1fa
> > > > cluster name ceph
> > > > crush device class None
> > > > encrypted 0
> > > > osd fsid 91a86f20-8083-40b1-8bf1-fe35fac3d677
> > > > osd id 2
> > > > osdspec affinity all-available-devices
> > > > type block
> > > > vdo 0
> > > > devices /dev/sda
> > > > ‐‐‐ Original Message ‐‐‐
> > > > On Thursday, May 27, 2021 12:32 PM, Eugen Block ebl...@nde.ag wrote:
> > > >
> > > > > > ubuntu@ceph1f:~$ sudo cephadm deploy --name osd.2 --fsid
> > > > >
> > > > > > 91a86f20-8083-40b1-8bf1-fe35fac3d677
> > > > > > Deploy daemon osd.2 ...
> > > > >
> > > > > Which fsid is it, the cluster's or the OSD's? According to the
> > > > > 'cephadm deploy' help page it should be the cluster fsid.
> > > > > Zitat von mabi m...@protonmail.ch:
> > > > >
> > > > > > Hi Eugen,
> > > > > > What a good coincidence ;-)
> > > > > > So I ran "cephadm ceph-volume lvm list" on the OSD node which I
> > > > > > re-instaled and it saw my osd.2 OSD. So far so good, but the
> > > > > > following 

[ceph-users] Re: How to add back stray OSD daemon after node re-installation

2021-05-27 Thread mabi
Nicely spotted about the missing file; it looks like I have the same case, as 
you can see below from the syslog:

May 27 15:33:12 ceph1f systemd[1]: 
ceph-8d47792c-987d-11eb-9bb6-a5302e00e1fa@osd.2.service: Scheduled restart job, 
restart counter is at 1.
May 27 15:33:12 ceph1f systemd[1]: Stopped Ceph osd.2 for 
8d47792c-987d-11eb-9bb6-a5302e00e1fa.
May 27 15:33:12 ceph1f systemd[1]: Starting Ceph osd.2 for 
8d47792c-987d-11eb-9bb6-a5302e00e1fa...
May 27 15:33:12 ceph1f kernel: [19332.481779] overlayfs: unrecognized mount 
option "volatile" or missing value
May 27 15:33:13 ceph1f kernel: [19332.709205] overlayfs: unrecognized mount 
option "volatile" or missing value
May 27 15:33:13 ceph1f kernel: [19332.933442] overlayfs: unrecognized mount 
option "volatile" or missing value
May 27 15:33:13 ceph1f bash[64982]: Error: statfs 
/var/lib/ceph/8d47792c-987d-11eb-9bb6-a5302e00e1fa/osd.2/config: no such file 
or directory
May 27 15:33:13 ceph1f systemd[1]: 
ceph-8d47792c-987d-11eb-9bb6-a5302e00e1fa@osd.2.service: Control process 
exited, code=exited, status=125/n/a

So how do I go about generating/creating that missing 
/var/lib/ceph/8d47792c-987d-11eb-9bb6-a5302e00e1fa/osd.2/config file?
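
I am guessing something along these lines could work, generating the minimal
conf on an admin node and copying it into the path from the error above
(ownership and permissions probably need to match the other daemon directories):

# ceph config generate-minimal-conf > /tmp/osd.2.config      (on an admin node)
# scp /tmp/osd.2.config ceph1f:/tmp/
# sudo mv /tmp/osd.2.config /var/lib/ceph/8d47792c-987d-11eb-9bb6-a5302e00e1fa/osd.2/config   (on ceph1f)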


‐‐‐ Original Message ‐‐‐
On Thursday, May 27, 2021 3:28 PM, Eugen Block  wrote:

> Can you try with both cluster and osd fsid? Something like this:
>
> pacific2:~ # cephadm deploy --name osd.2 --fsid
> acbb46d6-bde3-11eb-9cf2-fa163ebb2a74 --osd-fsid
> bc241cd4-e284-4c5a-aad2-5744632fc7fc
>
> I tried to reproduce a similar scenario and found a missing config
> file in the osd directory:
>
> Error: statfs
> /var/lib/ceph/acbb46d6-bde3-11eb-9cf2-fa163ebb2a74/osd.2/config: no
> such file or directory
>
> Check your syslog for more information why the osd start fails.
>
> Zitat von mabi m...@protonmail.ch:
>
> > You are right, I used the FSID of the OSD and not of the cluster in
> > the deploy command. So now I tried again with the cluster ID as FSID
> > but still it does not work as you can see below:
> > ubuntu@ceph1f:~$ sudo cephadm deploy --name osd.2 --fsid
> > 8d47792c-987d-11eb-9bb6-a5302e00e1fa
> > Deploy daemon osd.2 ...
> > Traceback (most recent call last):
> > File "/usr/local/sbin/cephadm", line 6223, in 
> > r = args.func()
> > File "/usr/local/sbin/cephadm", line 1440, in _default_image
> > return func()
> > File "/usr/local/sbin/cephadm", line 3457, in command_deploy
> > deploy_daemon(args.fsid, daemon_type, daemon_id, c, uid, gid,
> > File "/usr/local/sbin/cephadm", line 2193, in deploy_daemon
> > deploy_daemon_units(fsid, uid, gid, daemon_type, daemon_id, c,
> > File "/usr/local/sbin/cephadm", line 2255, in deploy_daemon_units
> > assert osd_fsid
> > AssertionError
> > In case that's of any help here is the output of the "cephadm
> > ceph-volume lvm list" command:
> > == osd.2 ===
> > [block]
> > /dev/ceph-cca8abe6-cf9b-4c2f-ab81-ae0758585414/osd-block-91a86f20-8083-40b1-8bf1-fe35fac3d677
> >
> >   block device
> >
> >
> > /dev/ceph-cca8abe6-cf9b-4c2f-ab81-ae0758585414/osd-block-91a86f20-8083-40b1-8bf1-fe35fac3d677
> > block uuid W3omTg-vami-RB0V-CkVb-cgpb-88Jy-pIK2Tz
> > cephx lockbox secret
> > cluster fsid 8d47792c-987d-11eb-9bb6-a5302e00e1fa
> > cluster name ceph
> > crush device class None
> > encrypted 0
> > osd fsid 91a86f20-8083-40b1-8bf1-fe35fac3d677
> > osd id 2
> > osdspec affinity all-available-devices
> > type block
> > vdo 0
> > devices /dev/sda
> > ‐‐‐ Original Message ‐‐‐
> > On Thursday, May 27, 2021 12:32 PM, Eugen Block ebl...@nde.ag wrote:
> >
> > > > ubuntu@ceph1f:~$ sudo cephadm deploy --name osd.2 --fsid
> > >
> > > > 91a86f20-8083-40b1-8bf1-fe35fac3d677
> > > > Deploy daemon osd.2 ...
> > >
> > > Which fsid is it, the cluster's or the OSD's? According to the
> > > 'cephadm deploy' help page it should be the cluster fsid.
> > > Zitat von mabi m...@protonmail.ch:
> > >
> > > > Hi Eugen,
> > > > What a good coincidence ;-)
> > > > So I ran "cephadm ceph-volume lvm list" on the OSD node which I
> > > > re-instaled and it saw my osd.2 OSD. So far so good, but the
> > > > following suggested command does not work as you can see below:
> > > > ubuntu@ceph1f:~$ sudo cephadm deploy --name osd.2 --fsid
> > > > 91a86f20-8083-40b1-8bf1-fe35fac3d677
> > > > Deploy daemon osd.2 ...
> > > > Traceback (most recent call last):
> > > > File "/usr/local/sbin/cephadm", line 6223, in 
> > > > r = args.func()
> > > > File "/usr/local/sbin/cephadm", line 1440, in _default_image
> > > > return func()
> > > > File "/usr/local/sbin/cephadm", line 3457, in command_deploy
> > > > deploy_daemon(args.fsid, daemon_type, daemon_id, c, uid, gid,
> > > > File "/usr/local/sbin/cephadm", line 2193, in deploy_daemon
> > > > deploy_daemon_units(fsid, uid, gid, daemon_type, daemon_id, c,
> > > > File "/usr/local/sbin/cephadm", line 2255, in deploy_daemon_units
> > > > assert osd_fsid
> > > > AssertionError
> > > > Any ideas what is wrong here?
> > > > Regards,
> > > > Mabi
> > > > ‐‐‐ Original 

[ceph-users] Re: rebalancing after node more

2021-05-27 Thread Rok Jaklič
16.2.4

For some reason, when starting OSDs with systemctl on this "renewed" host,
they did not start after a while, but when doing it manually through the
console, they did.

Thanks anyway.

On Thu, 27 May 2021, 16:31 Eugen Block,  wrote:

> Yes, if your pool requires 5 chunks and you only have 5 hosts (with
> failure domain host) your PGs become undersized when a host fails and
> won't recover until the OSDs come back. Which ceph version is this?
>
>
> Zitat von Rok Jaklič :
>
> > For this pool I have set EC 3+2 (so in total I have 5 nodes) which one
> was
> > temporarily removed, but maybe this was the problem?
> >
> > On Thu, May 27, 2021 at 3:51 PM Rok Jaklič  wrote:
> >
> >> Hi, thanks for quick reply
> >>
> >> root@ctplmon1:~# ceph pg dump pgs_brief | grep undersized
> >> dumped pgs_brief
> >> 9.5  active+undersized+degraded   [72,85,54,120,2147483647]
> >> 72   [72,85,54,120,2147483647]  72
> >> 9.6  active+undersized+degraded  [101,47,113,74,2147483647]
> >>  101  [101,47,113,74,2147483647] 101
> >> 9.2  active+undersized+degraded   [86,118,74,2147483647,49]
> >> 86   [86,118,74,2147483647,49]  86
> >> 9.d  active+undersized+degraded   [49,136,83,90,2147483647]
> >> 49   [49,136,83,90,2147483647]  49
> >> 9.f  active+undersized+degraded  [55,103,81,128,2147483647]
> >> 55  [55,103,81,128,2147483647]  55
> >> 9.18 active+undersized+degraded   [115,50,61,89,2147483647]
> >>  115   [115,50,61,89,2147483647] 115
> >> 9.1d active+undersized+degraded   [61,90,31,2147483647,125]
> >> 61   [61,90,31,2147483647,125]  61
> >> 9.10 active+undersized+degraded   [46,2147483647,71,86,122]
> >> 46   [46,2147483647,71,86,122]  46
> >> 9.17 active+undersized+degraded   [60,95,114,2147483647,48]
> >> 60   [60,95,114,2147483647,48]  60
> >> 9.15 active+undersized+degraded  [121,76,30,101,2147483647]
> >>  121  [121,76,30,101,2147483647] 121
> >> root@ctplmon1:~# ceph osd tree
> >> ID   CLASS  WEIGHT TYPE NAME  STATUS  REWEIGHT  PRI-AFF
> >>  -1 764.11981  root default
> >>  -3 152.82378  host ctplosd1
> >>   0hdd5.45798  osd.0down 0  1.0
> >>   1hdd5.45799  osd.1down 0  1.0
> >>   2hdd5.45799  osd.2down 0  1.0
> >>   3hdd5.45799  osd.3down 0  1.0
> >>   4hdd5.45799  osd.4down 0  1.0
> >>   5hdd5.45799  osd.5down 0  1.0
> >>   6hdd5.45799  osd.6down 0  1.0
> >>   7hdd5.45799  osd.7down 0  1.0
> >>   8hdd5.45799  osd.8down 0  1.0
> >>   9hdd5.45799  osd.9down 0  1.0
> >>  10hdd5.45799  osd.10   down 0  1.0
> >>  11hdd5.45799  osd.11   down 0  1.0
> >>  12hdd5.45799  osd.12   down 0  1.0
> >>  13hdd5.45799  osd.13   down 0  1.0
> >>  14hdd5.45799  osd.14   down 0  1.0
> >>  15hdd5.45799  osd.15   down 0  1.0
> >>  16hdd5.45799  osd.16   down 0  1.0
> >>  17hdd5.45799  osd.17   down 0  1.0
> >>  18hdd5.45799  osd.18   down 0  1.0
> >>  19hdd5.45799  osd.19   down 0  1.0
> >>  20hdd5.45799  osd.20   down 0  1.0
> >>  21hdd5.45799  osd.21   down 0  1.0
> >>  22hdd5.45799  osd.22   down 0  1.0
> >>  23hdd5.45799  osd.23   down 0  1.0
> >>  24hdd5.45799  osd.24   down 0  1.0
> >>  25hdd5.45799  osd.25   down 0  1.0
> >>  26hdd5.45799  osd.26   down 0  1.0
> >>  27hdd5.45799  osd.27   down 0  1.0
> >> -11 152.82401  host ctplosd5
> >> 112hdd5.45799  osd.112up   1.0  1.0
> >> 113hdd5.45799  osd.113up   1.0  1.0
> >> 114hdd5.45799  osd.114up   1.0  1.0
> >> 115hdd5.45799  osd.115up   1.0  1.0
> >> 116hdd5.45799  osd.116up   1.0  1.0
> >> 117hdd5.45799  osd.117up   1.0  1.0
> >> 118hdd5.45799  osd.118up   1.0  1.0
> >> 119hdd5.45799  osd.119up   1.0  1.0
> >> 120hdd5.45799  osd.120up   1.0  1.0
> >> 121hdd5.45799  osd.121  

[ceph-users] Re: How to add back stray OSD daemon after node re-installation

2021-05-27 Thread mabi
I managed to remove that wrongly created cluster on the node by running:

sudo cephadm rm-cluster --fsid 91a86f20-8083-40b1-8bf1-fe35fac3d677 --force

So I am getting closer, but the osd.2 service on that node simply does not want 
to start, as you can see below:

# ceph orch daemon start osd.2
Scheduled to start osd.2 on host 'ceph1f'

# ceph orch ps|grep osd.2
osd.2  ceph1f  unknown2m ago -  
  

In the log files I see the following:

5/27/21 2:47:34 PM[ERR]`ceph1f: cephadm unit osd.2 start` failed Traceback 
(most recent call last): File "/usr/share/ceph/mgr/cephadm/module.py", line 
1451, in _daemon_action ['--name', name, a]) File 
"/usr/share/ceph/mgr/cephadm/module.py", line 1168, in _run_cephadm code, 
'\n'.join(err))) orchestrator._interface.OrchestratorError: cephadm exited with 
an error code: 1, stderr:stderr Job for 
ceph-8d47792c-987d-11eb-9bb6-a5302e00e1fa@osd.2.service failed because the 
control process exited with error code. stderr See "systemctl status 
ceph-8d47792c-987d-11eb-9bb6-a5302e00e1fa@osd.2.service" and "journalctl -xe" 
for details. Traceback (most recent call last): File "", line 6159, in 
 File "", line 1310, in _infer_fsid File "", line 3655, 
in command_unit File "", line 1072, in call_throws RuntimeError: Failed 
command: systemctl start ceph-8d47792c-987d-11eb-9bb6-a5302e00e1fa@osd.2

5/27/21 2:47:34 PM[ERR]cephadm exited with an error code: 1, stderr:stderr Job 
for ceph-8d47792c-987d-11eb-9bb6-a5302e00e1fa@osd.2.service failed because the 
control process exited with error code. stderr See "systemctl status 
ceph-8d47792c-987d-11eb-9bb6-a5302e00e1fa@osd.2.service" and "journalctl -xe" 
for details. Traceback (most recent call last): File "", line 6159, in 
 File "", line 1310, in _infer_fsid File "", line 3655, 
in command_unit File "", line 1072, in call_throws RuntimeError: Failed 
command: systemctl start ceph-8d47792c-987d-11eb-9bb6-a5302e00e1fa@osd.2 
Traceback (most recent call last): File 
"/usr/share/ceph/mgr/cephadm/module.py", line 1021, in _remote_connection yield 
(conn, connr) File "/usr/share/ceph/mgr/cephadm/module.py", line 1168, in 
_run_cephadm code, '\n'.join(err))) orchestrator._interface.OrchestratorError: 
cephadm exited with an error code: 1, stderr:stderr Job for 
ceph-8d47792c-987d-11eb-9bb6-a5302e00e1fa@osd.2.service failed because the 
control process exited with error code. stderr See "systemctl status 
ceph-8d47792c-987d-11eb-9bb6-a5302e00e1fa@osd.2.service" and "journalctl -xe" 
for details. Traceback (most recent call last): File "", line 6159, in 
 File "", line 1310, in _infer_fsid File "", line 3655, 
in command_unit File "", line 1072, in call_throws RuntimeError: Failed 
command: systemctl start ceph-8d47792c-987d-11eb-9bb6-a5302e00e1fa@osd.2

And finally the systemctl "status" of that osd.2 service on the OSD node:

ubuntu@ceph1f:/var/lib/ceph$ sudo systemctl status 
ceph-8d47792c-987d-11eb-9bb6-a5302e00e1fa@osd.2.service
● ceph-8d47792c-987d-11eb-9bb6-a5302e00e1fa@osd.2.service - Ceph osd.2 for 
8d47792c-987d-11eb-9bb6-a5302e00e1fa
 Loaded: loaded 
(/etc/systemd/system/ceph-8d47792c-987d-11eb-9bb6-a5302e00e1fa@.service; 
disabled; vendor preset: enabled)
 Active: failed (Result: exit-code) since Thu 2021-05-27 14:48:24 CEST; 20s 
ago
Process: 56163 ExecStartPre=/bin/rm -f 
//run/ceph-8d47792c-987d-11eb-9bb6-a5302e00e1fa@osd.2.service-pid 
//run/ceph-8d47792c-987d-11eb-9bb6-a5302e00e1fa@osd.2.service-cid (code=exited, 
status=0/SUCCESS)
Process: 56164 ExecStart=/bin/bash 
/var/lib/ceph/8d47792c-987d-11eb-9bb6-a5302e00e1fa/osd.2/unit.run (code=exited, 
status=127)
Process: 56165 ExecStopPost=/bin/bash 
/var/lib/ceph/8d47792c-987d-11eb-9bb6-a5302e00e1fa/osd.2/unit.poststop 
(code=exited, status=127)
Process: 56166 ExecStopPost=/bin/rm -f 
//run/ceph-8d47792c-987d-11eb-9bb6-a5302e00e1fa@osd.2.service-pid 
//run/ceph-8d47792c-987d-11eb-9bb6-a5302e00e1fa@osd.2.service-cid (code=exited, 
status=0/SUCCESS)

May 27 14:48:14 ceph1f systemd[1]: Failed to start Ceph osd.2 for 
8d47792c-987d-11eb-9bb6-a5302e00e1fa.
May 27 14:48:24 ceph1f systemd[1]: 
ceph-8d47792c-987d-11eb-9bb6-a5302e00e1fa@osd.2.service: Scheduled restart job, 
restart counter is at 5.
May 27 14:48:24 ceph1f systemd[1]: Stopped Ceph osd.2 for 
8d47792c-987d-11eb-9bb6-a5302e00e1fa.
May 27 14:48:24 ceph1f systemd[1]: 
ceph-8d47792c-987d-11eb-9bb6-a5302e00e1fa@osd.2.service: Start request repeated 
too quickly.
May 27 14:48:24 ceph1f systemd[1]: 
ceph-8d47792c-987d-11eb-9bb6-a5302e00e1fa@osd.2.service: Failed with result 
'exit-code'.
May 27 14:48:24 ceph1f systemd[1]: Failed to start Ceph osd.2 for 
8d47792c-987d-11eb-9bb6-a5302e00e1fa.


‐‐‐ Original Message ‐‐‐
On Thursday, May 27, 2021 2:22 PM, mabi  wrote:

> I am trying to run "cephadm shell" on that newly installed OSD node and it 
> seems that I have now unfortunately configured a new cluster ID as it shows:
>
> 

[ceph-users] Re: MDS stuck in up:stopping state

2021-05-27 Thread Martin Rasmus Lundquist Hansen
Hi Weiwen,

Amazing, that actually worked. So simple, thanks!


From: 胡 玮文
Sent: 27 May 2021 09:02
To: Martin Rasmus Lundquist Hansen; ceph-users@ceph.io
Subject: Re: MDS stuck in up:stopping state


Hi Martin,



You may have hit https://tracker.ceph.com/issues/50112, for which we have not
found the root cause yet. I resolved this by restarting rank 0. (I have only 2
active MDSs.)
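
In my case "restarting rank 0" just meant failing it over to a standby,
roughly like this (assuming the filesystem is named cephfs, as in your status
output, and a standby MDS is available to take over):

# ceph mds fail cephfs:0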



Weiwen Hu



Sent from Mail for Windows 10



From: Martin Rasmus Lundquist Hansen
Sent: 27 May 2021 14:26
To: ceph-users@ceph.io
Subject: [ceph-users] MDS stuck in up:stopping state



After scaling the number of MDS daemons down, we now have a daemon stuck in the
"up:stopping" state. The documentation says it can take several minutes to stop 
the
daemon, but it has been stuck in this state for almost a full day. According to
the "ceph fs status" output attached below, it still holds information about 2
inodes, which we assume is the reason why it cannot stop completely.

Does anyone know what we can do to finally stop it?


cephfs - 71 clients
==
RANK   STATEMDS ACTIVITY DNSINOS
 0 active   ceph-mon-01  Reqs:0 /s  15.7M  15.4M
 1 active   ceph-mon-02  Reqs:   48 /s  19.7M  17.1M
 2stopping  ceph-mon-030  2
  POOL TYPE USED  AVAIL
cephfs_metadata  metadata   652G   185T
  cephfs_data  data1637T   539T
   STANDBY MDS
ceph-mon-03-mds-2
MDS version: ceph version 15.2.11 (e3523634d9c2227df9af89a4eac33d16738c49cb) 
octopus (stable)
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




[ceph-users] Re: How to add back stray OSD daemon after node re-installation

2021-05-27 Thread mabi
I am trying to run "cephadm shell" on that newly installed OSD node and it 
seems that I have now unfortunately configured a new cluster ID as it shows:

ubuntu@ceph1f:~$ sudo cephadm shell
ERROR: Cannot infer an fsid, one must be specified: 
['8d47792c-987d-11eb-9bb6-a5302e00e1fa', '91a86f20-8083-40b1-8bf1-fe35fac3d677']

Maybe this is causing trouble... So is there any method with which I can remove 
the wrongly created new cluster ID 91a86f20-8083-40b1-8bf1-fe35fac3d677?


‐‐‐ Original Message ‐‐‐
On Thursday, May 27, 2021 12:58 PM, mabi  wrote:

> You are right, I used the FSID of the OSD and not of the cluster in the 
> deploy command. So now I tried again with the cluster ID as FSID but still it 
> does not work as you can see below:
>
> ubuntu@ceph1f:~$ sudo cephadm deploy --name osd.2 --fsid 
> 8d47792c-987d-11eb-9bb6-a5302e00e1fa
> Deploy daemon osd.2 ...
> Traceback (most recent call last):
> File "/usr/local/sbin/cephadm", line 6223, in 
>
> r = args.func()
>
>
> File "/usr/local/sbin/cephadm", line 1440, in _default_image
> return func()
> File "/usr/local/sbin/cephadm", line 3457, in command_deploy
> deploy_daemon(args.fsid, daemon_type, daemon_id, c, uid, gid,
> File "/usr/local/sbin/cephadm", line 2193, in deploy_daemon
> deploy_daemon_units(fsid, uid, gid, daemon_type, daemon_id, c,
> File "/usr/local/sbin/cephadm", line 2255, in deploy_daemon_units
> assert osd_fsid
> AssertionError
>
> In case that's of any help here is the output of the "cephadm ceph-volume lvm 
> list" command:
>
> == osd.2 ===
>
> [block] 
> /dev/ceph-cca8abe6-cf9b-4c2f-ab81-ae0758585414/osd-block-91a86f20-8083-40b1-8bf1-fe35fac3d677
>
> block device 
> /dev/ceph-cca8abe6-cf9b-4c2f-ab81-ae0758585414/osd-block-91a86f20-8083-40b1-8bf1-fe35fac3d677
> block uuid W3omTg-vami-RB0V-CkVb-cgpb-88Jy-pIK2Tz
> cephx lockbox secret
> cluster fsid 8d47792c-987d-11eb-9bb6-a5302e00e1fa
> cluster name ceph
> crush device class None
> encrypted 0
> osd fsid 91a86f20-8083-40b1-8bf1-fe35fac3d677
> osd id 2
> osdspec affinity all-available-devices
> type block
> vdo 0
> devices /dev/sda
>
> ‐‐‐ Original Message ‐‐‐
> On Thursday, May 27, 2021 12:32 PM, Eugen Block ebl...@nde.ag wrote:
>
> > > ubuntu@ceph1f:~$ sudo cephadm deploy --name osd.2 --fsid
> >
> > > 91a86f20-8083-40b1-8bf1-fe35fac3d677
> > > Deploy daemon osd.2 ...
> >
> > Which fsid is it, the cluster's or the OSD's? According to the
> > 'cephadm deploy' help page it should be the cluster fsid.
> > Zitat von mabi m...@protonmail.ch:
> >
> > > Hi Eugen,
> > > What a good coincidence ;-)
> > > So I ran "cephadm ceph-volume lvm list" on the OSD node which I
> > > re-instaled and it saw my osd.2 OSD. So far so good, but the
> > > following suggested command does not work as you can see below:
> > > ubuntu@ceph1f:~$ sudo cephadm deploy --name osd.2 --fsid
> > > 91a86f20-8083-40b1-8bf1-fe35fac3d677
> > > Deploy daemon osd.2 ...
> > > Traceback (most recent call last):
> > > File "/usr/local/sbin/cephadm", line 6223, in 
> > > r = args.func()
> > > File "/usr/local/sbin/cephadm", line 1440, in _default_image
> > > return func()
> > > File "/usr/local/sbin/cephadm", line 3457, in command_deploy
> > > deploy_daemon(args.fsid, daemon_type, daemon_id, c, uid, gid,
> > > File "/usr/local/sbin/cephadm", line 2193, in deploy_daemon
> > > deploy_daemon_units(fsid, uid, gid, daemon_type, daemon_id, c,
> > > File "/usr/local/sbin/cephadm", line 2255, in deploy_daemon_units
> > > assert osd_fsid
> > > AssertionError
> > > Any ideas what is wrong here?
> > > Regards,
> > > Mabi
> > > ‐‐‐ Original Message ‐‐‐
> > > On Thursday, May 27, 2021 12:13 PM, Eugen Block ebl...@nde.ag wrote:
> > >
> > > > Hi,
> > > > I posted a link to the docs [1], [2] just yesterday ;-)
> > > > You should see the respective OSD in the output of 'cephadm
> > > > ceph-volume lvm list' on that node. You should then be able to get it
> > > > back to cephadm with
> > > > cephadm deploy --name osd.x
> > > > But I haven't tried this yet myself, so please report back if that
> > > > works for you.
> > > > Regards,
> > > > Eugen
> > > > [1] https://tracker.ceph.com/issues/49159
> > > > [2] https://tracker.ceph.com/issues/46691
> > > > Zitat von mabi m...@protonmail.ch:
> > > >
> > > > > Hello,
> > > > > I have by mistake re-installed the OS of an OSD node of my Octopus
> > > > > cluster (managed by cephadm). Luckily the OSD data is on a separate
> > > > > disk and did not get affected by the re-install.
> > > > > Now I have the following state:
> > > > >
> > > > > health: HEALTH_WARN
> > > > > 1 stray daemon(s) not managed by cephadm
> > > > > 1 osds down
> > > > > 1 host (1 osds) down
> > > > >
> > > > >
> > > > > To fix that I tried to run:
> > > > > ceph orch daemon add osd ceph1f:/dev/sda
> > > > >
> > > > > =
> > > > >
> > > > > Created no osd(s) on host ceph1f; already created?
> > > > 

[ceph-users] Re: CRUSH rule for EC 6+2 on 6-node cluster

2021-05-27 Thread Dan van der Ster
Hi Fulvio,

I suggest removing only the upmaps which are clearly incorrect, and
then seeing if the upmap balancer re-creates them.
Perhaps they were created when they were not incorrect, when you had a
different crush rule?
Or perhaps you're running an old version of ceph which had a buggy
balancer implementation?
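
For the PG you mention below (116.453), dropping its upmap exception would be
something like:

# ceph osd rm-pg-upmap-items 116.453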

Cheers, Dan



On Thu, May 27, 2021 at 5:16 PM Fulvio Galeazzi  wrote:
>
> Hallo Dan, Nathan, thanks for your replies and apologies for my silence.
>
>Sorry I had made a typo... the rule is really 6+4. And to reply to
> Nathan's message, the rule was built like this in anticipation of
> getting additional servers, at which point in time I will relax the "2
> chunks per OSD" part.
>
> [cephmgr@cephAdmPA1.cephAdmPA1 ~]$ ceph osd pool get
> default.rgw.buckets.data erasure_code_profile
> erasure_code_profile: ec_6and4_big
> [cephmgr@cephAdmPA1.cephAdmPA1 ~]$ ceph osd erasure-code-profile get
> ec_6and4_big
> crush-device-class=big
> crush-failure-domain=osd
> crush-root=default
> jerasure-per-chunk-alignment=false
> k=6
> m=4
> plugin=jerasure
> technique=reed_sol_van
> w=8
>
> Indeed, Dan:
>
> [cephmgr@cephAdmPA1.cephAdmPA1 ~]$ ceph osd dump | grep upmap | grep 116.453
> pg_upmap_items 116.453 [76,49,129,108]
>
> Don't think I ever set such an upmap myself. Do you think it would be
> good to try and remove all upmaps, let the upmap balancer do its magic,
> and check again?
>
>Thanks!
>
> Fulvio
>
>
> On 20/05/2021 18:59, Dan van der Ster wrote:
> > Hold on: 8+4 needs 12 osds but you only show 10 there. Shouldn't you
> > choose 6 type host and then chooseleaf 2 type osd?
> >
> > .. Dan
> >
> >
> > On Thu, May 20, 2021, 1:30 PM Fulvio Galeazzi wrote:
> >
> > Hallo Dan, Bryan,
> >   I have a rule similar to yours, for an 8+4 pool, with only
> > difference that I replaced the second "choose" with "chooseleaf", which
> > I understand should make no difference:
> >
> > rule default.rgw.buckets.data {
> >   id 6
> >   type erasure
> >   min_size 3
> >   max_size 10
> >   step set_chooseleaf_tries 5
> >   step set_choose_tries 100
> >   step take default class big
> >   step choose indep 5 type host
> >   step chooseleaf indep 2 type osd
> >   step emit
> > }
> >
> > I am on Nautilus 14.2.16 and while performing a maintenance the
> > other
> > day, I noticed 2 PGs were incomplete and caused troubles to some users.
> > I then verified that (thanks Bryan for the command):
> >
> > [cephmgr@cephAdmCT1.cephAdmCT1 clusterCT]$ for osd in $(ceph pg map
> > 116.453 -f json | jq -r '.up[]'); do ceph osd find $osd | jq -r '.host'
> > ; done | sort | uniq -c | sort -n -k1
> > 2 r2srv07.ct1.box.garr
> > 2 r2srv10.ct1.box.garr
> > 2 r3srv07.ct1.box.garr
> > 4 r1srv02.ct1.box.garr
> >
> > You see that 4 PGs were put on r1srv02.
> > May be this happened due to some temporary unavailability of the
> > host at
> > some point? As all my servers are now up and running, is there a way to
> > force the placement rule to rerun?
> >
> > Thanks!
> >
> >  Fulvio
> >
> >
> > Il 5/16/2021 11:40 PM, Dan van der Ster ha scritto:
> >  > Hi Bryan,
> >  >
> >  > I had to do something similar, and never found a rule to place
> > "up to"
> >  > 2 chunks per host, so I stayed with the placement of *exactly* 2
> >  > chunks per host.
> >  >
> >  > But I did this slightly differently to what you wrote earlier: my
> > rule
> >  > chooses exactly 4 hosts, then chooses exactly 2 osds on each:
> >  >
> >  >  type erasure
> >  >  min_size 3
> >  >  max_size 10
> >  >  step set_chooseleaf_tries 5
> >  >  step set_choose_tries 100
> >  >  step take default class hdd
> >  >  step choose indep 4 type host
> >  >  step choose indep 2 type osd
> >  >  step emit
> >  >
> >  > If you really need the "up to 2" approach then maybe you can split
> >  > each host into two "host" crush buckets, with half the OSDs in each.
> >  > Then a normal host-wise rule should work.
> >  >
> >  > Cheers, Dan
> >  >
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: CRUSH rule for EC 6+2 on 6-node cluster

2021-05-27 Thread Fulvio Galeazzi

Hallo Dan, Nathan, thanks for your replies and apologies for my silence.

  Sorry I had made a typo... the rule is really 6+4. And to reply to 
Nathan's message, the rule was built like this in anticipation of 
getting additional servers, at which point in time I will relax the "2 
chunks per OSD" part.


[cephmgr@cephAdmPA1.cephAdmPA1 ~]$ ceph osd pool get 
default.rgw.buckets.data erasure_code_profile

erasure_code_profile: ec_6and4_big
[cephmgr@cephAdmPA1.cephAdmPA1 ~]$ ceph osd erasure-code-profile get 
ec_6and4_big

crush-device-class=big
crush-failure-domain=osd
crush-root=default
jerasure-per-chunk-alignment=false
k=6
m=4
plugin=jerasure
technique=reed_sol_van
w=8

Indeed, Dan:

[cephmgr@cephAdmPA1.cephAdmPA1 ~]$ ceph osd dump | grep upmap | grep 116.453
pg_upmap_items 116.453 [76,49,129,108]

Don't think I ever set such an upmap myself. Do you think it would be 
good to try and remove all upmaps, let the upmap balancer do its magic, 
and check again?


  Thanks!

Fulvio


On 20/05/2021 18:59, Dan van der Ster wrote:
Hold on: 8+4 needs 12 osds but you only show 10 there. Shouldn't you 
choose 6 type host and then chooseleaf 2 type osd?


.. Dan


On Thu, May 20, 2021, 1:30 PM Fulvio Galeazzi wrote:


Hallo Dan, Bryan,
      I have a rule similar to yours, for an 8+4 pool, with only
difference that I replaced the second "choose" with "chooseleaf", which
I understand should make no difference:

rule default.rgw.buckets.data {
          id 6
          type erasure
          min_size 3
          max_size 10
          step set_chooseleaf_tries 5
          step set_choose_tries 100
          step take default class big
          step choose indep 5 type host
          step chooseleaf indep 2 type osd
          step emit
}

    I am on Nautilus 14.2.16 and while performing a maintenance the
other
day, I noticed 2 PGs were incomplete and caused troubles to some users.
I then verified that (thanks Bryan for the command):

[cephmgr@cephAdmCT1.cephAdmCT1 clusterCT]$ for osd in $(ceph pg map
116.453 -f json | jq -r '.up[]'); do ceph osd find $osd | jq -r '.host'
; done | sort | uniq -c | sort -n -k1
        2 r2srv07.ct1.box.garr
        2 r2srv10.ct1.box.garr
        2 r3srv07.ct1.box.garr
        4 r1srv02.ct1.box.garr

    You see that 4 PGs were put on r1srv02.
May be this happened due to some temporary unavailability of the
host at
some point? As all my servers are now up and running, is there a way to
force the placement rule to rerun?

    Thanks!

                         Fulvio


Il 5/16/2021 11:40 PM, Dan van der Ster ha scritto:
 > Hi Bryan,
 >
 > I had to do something similar, and never found a rule to place
"up to"
 > 2 chunks per host, so I stayed with the placement of *exactly* 2
 > chunks per host.
 >
 > But I did this slightly differently to what you wrote earlier: my
rule
 > chooses exactly 4 hosts, then chooses exactly 2 osds on each:
 >
 >          type erasure
 >          min_size 3
 >          max_size 10
 >          step set_chooseleaf_tries 5
 >          step set_choose_tries 100
 >          step take default class hdd
 >          step choose indep 4 type host
 >          step choose indep 2 type osd
 >          step emit
 >
 > If you really need the "up to 2" approach then maybe you can split
 > each host into two "host" crush buckets, with half the OSDs in each.
 > Then a normal host-wise rule should work.
 >
 > Cheers, Dan
 >




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How to add back stray OSD daemon after node re-installation

2021-05-27 Thread mabi
You are right, I used the FSID of the OSD and not of the cluster in the deploy 
command. So now I tried again with the cluster ID as FSID but still it does not 
work as you can see below:

ubuntu@ceph1f:~$ sudo cephadm deploy --name osd.2 --fsid 
8d47792c-987d-11eb-9bb6-a5302e00e1fa
Deploy daemon osd.2 ...
Traceback (most recent call last):
  File "/usr/local/sbin/cephadm", line 6223, in 
r = args.func()
  File "/usr/local/sbin/cephadm", line 1440, in _default_image
return func()
  File "/usr/local/sbin/cephadm", line 3457, in command_deploy
deploy_daemon(args.fsid, daemon_type, daemon_id, c, uid, gid,
  File "/usr/local/sbin/cephadm", line 2193, in deploy_daemon
deploy_daemon_units(fsid, uid, gid, daemon_type, daemon_id, c,
  File "/usr/local/sbin/cephadm", line 2255, in deploy_daemon_units
assert osd_fsid
AssertionError

In case that's of any help here is the output of the "cephadm ceph-volume lvm 
list" command:

== osd.2 ===

  [block]   
/dev/ceph-cca8abe6-cf9b-4c2f-ab81-ae0758585414/osd-block-91a86f20-8083-40b1-8bf1-fe35fac3d677

  block device  
/dev/ceph-cca8abe6-cf9b-4c2f-ab81-ae0758585414/osd-block-91a86f20-8083-40b1-8bf1-fe35fac3d677
  block uuidW3omTg-vami-RB0V-CkVb-cgpb-88Jy-pIK2Tz
  cephx lockbox secret
  cluster fsid  8d47792c-987d-11eb-9bb6-a5302e00e1fa
  cluster name  ceph
  crush device classNone
  encrypted 0
  osd fsid  91a86f20-8083-40b1-8bf1-fe35fac3d677
  osd id2
  osdspec affinity  all-available-devices
  type  block
  vdo   0
  devices   /dev/sda

‐‐‐ Original Message ‐‐‐
On Thursday, May 27, 2021 12:32 PM, Eugen Block  wrote:

> > ubuntu@ceph1f:~$ sudo cephadm deploy --name osd.2 --fsid
>
> > 91a86f20-8083-40b1-8bf1-fe35fac3d677
> > Deploy daemon osd.2 ...
>
> Which fsid is it, the cluster's or the OSD's? According to the
> 'cephadm deploy' help page it should be the cluster fsid.
>
> Zitat von mabi m...@protonmail.ch:
>
> > Hi Eugen,
> > What a good coincidence ;-)
> > So I ran "cephadm ceph-volume lvm list" on the OSD node which I
> > re-instaled and it saw my osd.2 OSD. So far so good, but the
> > following suggested command does not work as you can see below:
> > ubuntu@ceph1f:~$ sudo cephadm deploy --name osd.2 --fsid
> > 91a86f20-8083-40b1-8bf1-fe35fac3d677
> > Deploy daemon osd.2 ...
> > Traceback (most recent call last):
> > File "/usr/local/sbin/cephadm", line 6223, in 
> > r = args.func()
> > File "/usr/local/sbin/cephadm", line 1440, in _default_image
> > return func()
> > File "/usr/local/sbin/cephadm", line 3457, in command_deploy
> > deploy_daemon(args.fsid, daemon_type, daemon_id, c, uid, gid,
> > File "/usr/local/sbin/cephadm", line 2193, in deploy_daemon
> > deploy_daemon_units(fsid, uid, gid, daemon_type, daemon_id, c,
> > File "/usr/local/sbin/cephadm", line 2255, in deploy_daemon_units
> > assert osd_fsid
> > AssertionError
> > Any ideas what is wrong here?
> > Regards,
> > Mabi
> > ‐‐‐ Original Message ‐‐‐
> > On Thursday, May 27, 2021 12:13 PM, Eugen Block ebl...@nde.ag wrote:
> >
> > > Hi,
> > > I posted a link to the docs [1], [2] just yesterday ;-)
> > > You should see the respective OSD in the output of 'cephadm
> > > ceph-volume lvm list' on that node. You should then be able to get it
> > > back to cephadm with
> > > cephadm deploy --name osd.x
> > > But I haven't tried this yet myself, so please report back if that
> > > works for you.
> > > Regards,
> > > Eugen
> > > [1] https://tracker.ceph.com/issues/49159
> > > [2] https://tracker.ceph.com/issues/46691
> > > Zitat von mabi m...@protonmail.ch:
> > >
> > > > Hello,
> > > > I have by mistake re-installed the OS of an OSD node of my Octopus
> > > > cluster (managed by cephadm). Luckily the OSD data is on a separate
> > > > disk and did not get affected by the re-install.
> > > > Now I have the following state:
> > > >
> > > > health: HEALTH_WARN
> > > > 1 stray daemon(s) not managed by cephadm
> > > > 1 osds down
> > > > 1 host (1 osds) down
> > > >
> > > >
> > > > To fix that I tried to run:
> > > > ceph orch daemon add osd ceph1f:/dev/sda
> > > > =
> > > > Created no osd(s) on host ceph1f; already created?
> > > > That did not work, so I tried:
> > > > ceph cephadm osd activate ceph1f
> > > > =
> > > > no valid command found; 10 closest matches:
> > > > ...
> > > > Error EINVAL: invalid command
> > > > Did not work either. So I wanted to ask how can I "adopt" back an
> > > > OSD disk to my cluster?
> > > > Thanks for your help.
> > > > Regards,
> > > > Mabi
> > > > ceph-users mailing list -- ceph-users@ceph.io
> > > > To unsubscribe send an email to ceph-users-le...@ceph.io
> > >
> > > ceph-users mailing list -- 

[ceph-users] Re: rebalancing after node more

2021-05-27 Thread Eugen Block
Yes, if your pool requires 5 chunks and you only have 5 hosts (with  
failure domain host) your PGs become undersized when a host fails and  
won't recover until the OSDs come back. Which ceph version is this?
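
You can double-check the rule and failure domain of the affected pool (pool 9
in your output) with something like this (the pool and rule names will differ
in your cluster):

# ceph osd pool ls detail | grep "^pool 9 "
# ceph osd crush rule dump <rule_name>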



Zitat von Rok Jaklič :


For this pool I have set EC 3+2 (so in total I have 5 nodes), of which one was
temporarily removed, but maybe this was the problem?

On Thu, May 27, 2021 at 3:51 PM Rok Jaklič  wrote:


Hi, thanks for quick reply

root@ctplmon1:~# ceph pg dump pgs_brief | grep undersized
dumped pgs_brief
9.5  active+undersized+degraded   [72,85,54,120,2147483647]
72   [72,85,54,120,2147483647]  72
9.6  active+undersized+degraded  [101,47,113,74,2147483647]
 101  [101,47,113,74,2147483647] 101
9.2  active+undersized+degraded   [86,118,74,2147483647,49]
86   [86,118,74,2147483647,49]  86
9.d  active+undersized+degraded   [49,136,83,90,2147483647]
49   [49,136,83,90,2147483647]  49
9.f  active+undersized+degraded  [55,103,81,128,2147483647]
55  [55,103,81,128,2147483647]  55
9.18 active+undersized+degraded   [115,50,61,89,2147483647]
 115   [115,50,61,89,2147483647] 115
9.1d active+undersized+degraded   [61,90,31,2147483647,125]
61   [61,90,31,2147483647,125]  61
9.10 active+undersized+degraded   [46,2147483647,71,86,122]
46   [46,2147483647,71,86,122]  46
9.17 active+undersized+degraded   [60,95,114,2147483647,48]
60   [60,95,114,2147483647,48]  60
9.15 active+undersized+degraded  [121,76,30,101,2147483647]
 121  [121,76,30,101,2147483647] 121
root@ctplmon1:~# ceph osd tree
ID   CLASS  WEIGHT TYPE NAME  STATUS  REWEIGHT  PRI-AFF
 -1 764.11981  root default
 -3 152.82378  host ctplosd1
  0hdd5.45798  osd.0down 0  1.0
  1hdd5.45799  osd.1down 0  1.0
  2hdd5.45799  osd.2down 0  1.0
  3hdd5.45799  osd.3down 0  1.0
  4hdd5.45799  osd.4down 0  1.0
  5hdd5.45799  osd.5down 0  1.0
  6hdd5.45799  osd.6down 0  1.0
  7hdd5.45799  osd.7down 0  1.0
  8hdd5.45799  osd.8down 0  1.0
  9hdd5.45799  osd.9down 0  1.0
 10hdd5.45799  osd.10   down 0  1.0
 11hdd5.45799  osd.11   down 0  1.0
 12hdd5.45799  osd.12   down 0  1.0
 13hdd5.45799  osd.13   down 0  1.0
 14hdd5.45799  osd.14   down 0  1.0
 15hdd5.45799  osd.15   down 0  1.0
 16hdd5.45799  osd.16   down 0  1.0
 17hdd5.45799  osd.17   down 0  1.0
 18hdd5.45799  osd.18   down 0  1.0
 19hdd5.45799  osd.19   down 0  1.0
 20hdd5.45799  osd.20   down 0  1.0
 21hdd5.45799  osd.21   down 0  1.0
 22hdd5.45799  osd.22   down 0  1.0
 23hdd5.45799  osd.23   down 0  1.0
 24hdd5.45799  osd.24   down 0  1.0
 25hdd5.45799  osd.25   down 0  1.0
 26hdd5.45799  osd.26   down 0  1.0
 27hdd5.45799  osd.27   down 0  1.0
-11 152.82401  host ctplosd5
112hdd5.45799  osd.112up   1.0  1.0
113hdd5.45799  osd.113up   1.0  1.0
114hdd5.45799  osd.114up   1.0  1.0
115hdd5.45799  osd.115up   1.0  1.0
116hdd5.45799  osd.116up   1.0  1.0
117hdd5.45799  osd.117up   1.0  1.0
118hdd5.45799  osd.118up   1.0  1.0
119hdd5.45799  osd.119up   1.0  1.0
120hdd5.45799  osd.120up   1.0  1.0
121hdd5.45799  osd.121up   1.0  1.0
122hdd5.45799  osd.122up   1.0  1.0
123hdd5.45799  osd.123up   1.0  1.0
124hdd5.45799  osd.124up   1.0  1.0
125hdd5.45799  osd.125up   1.0  1.0
126hdd5.45799  osd.126up   1.0  1.0
127hdd5.45799  osd.127up   1.0  1.0
128hdd5.45799  osd.128up   1.0  1.0
129hdd5.45799  osd.129up   1.0  1.0
130hdd

[ceph-users] Re: How to add back stray OSD daemon after node re-installation

2021-05-27 Thread mabi
Hi Eugen,

What a good coincidence ;-)

So I ran "cephadm ceph-volume lvm list" on the OSD node which I re-instaled and 
it saw my osd.2 OSD. So far so good, but the following suggested command does 
not work as you can see below:

ubuntu@ceph1f:~$ sudo cephadm deploy --name osd.2 --fsid 
91a86f20-8083-40b1-8bf1-fe35fac3d677
Deploy daemon osd.2 ...

Traceback (most recent call last):
  File "/usr/local/sbin/cephadm", line 6223, in 
r = args.func()
  File "/usr/local/sbin/cephadm", line 1440, in _default_image
return func()
  File "/usr/local/sbin/cephadm", line 3457, in command_deploy
deploy_daemon(args.fsid, daemon_type, daemon_id, c, uid, gid,
  File "/usr/local/sbin/cephadm", line 2193, in deploy_daemon
deploy_daemon_units(fsid, uid, gid, daemon_type, daemon_id, c,
  File "/usr/local/sbin/cephadm", line 2255, in deploy_daemon_units
assert osd_fsid
AssertionError

Any ideas what is wrong here?

Regards,
Mabi

‐‐‐ Original Message ‐‐‐
On Thursday, May 27, 2021 12:13 PM, Eugen Block  wrote:

> Hi,
>
> I posted a link to the docs [1], [2] just yesterday ;-)
>
> You should see the respective OSD in the output of 'cephadm
> ceph-volume lvm list' on that node. You should then be able to get it
> back to cephadm with
>
> cephadm deploy --name osd.x
>
> But I haven't tried this yet myself, so please report back if that
> works for you.
>
> Regards,
> Eugen
>
> [1] https://tracker.ceph.com/issues/49159
> [2] https://tracker.ceph.com/issues/46691
>
> Zitat von mabi m...@protonmail.ch:
>
> > Hello,
> > I have by mistake re-installed the OS of an OSD node of my Octopus
> > cluster (managed by cephadm). Luckily the OSD data is on a separate
> > disk and did not get affected by the re-install.
> > Now I have the following state:
> >
> > health: HEALTH_WARN
> > 1 stray daemon(s) not managed by cephadm
> > 1 osds down
> > 1 host (1 osds) down
> >
> >
> > To fix that I tried to run:
> >
> > ceph orch daemon add osd ceph1f:/dev/sda
> >
> > =
> >
> > Created no osd(s) on host ceph1f; already created?
> > That did not work, so I tried:
> >
> > ceph cephadm osd activate ceph1f
> >
> > =
> >
> > no valid command found; 10 closest matches:
> > ...
> > Error EINVAL: invalid command
> > Did not work either. So I wanted to ask how can I "adopt" back an
> > OSD disk to my cluster?
> > Thanks for your help.
> > Regards,
> > Mabi
> >
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephadm: How to replace failed HDD where DB is on SSD

2021-05-27 Thread Kai Stian Olstad

On 27.05.2021 11:53, Eugen Block wrote:

This test was on ceph version 15.2.8.

On Pacific (ceph version 16.2.4) this also works for me for initial
deployment of an entire host:

+-+-+--+--+--+-+
|SERVICE  |NAME |HOST  |DATA  |DB|WAL  |
+-+-+--+--+--+-+
|osd  |ssd-hdd-mix  |pacific1  |/dev/vdb  |/dev/vdd  |-|
|osd  |ssd-hdd-mix  |pacific1  |/dev/vdc  |/dev/vdd  |-|
+-+-+--+--+--+-+

But it doesn't work if I remove one OSD, just like you describe. This
is what ceph-volume reports:

---snip---
[ceph: root@pacific1 /]# ceph-volume lvm batch --report /dev/vdc
--db-devices /dev/vdd --block-db-size 3G
--> passed data devices: 1 physical, 0 LVM
--> relative data size: 1.0
--> passed block_db devices: 1 physical, 0 LVM
--> 1 fast devices were passed, but none are available

Total OSDs: 0

  TypePath
LV Size % of device
---snip---

I know that this has already worked in Octopus, I did test it
successfully not long ago.


Thank you for trying, so it looks like a bug.
Searching through the issue tracker I found a few issues related to 
replacing OSDs, but it doesn't look like they get much attention.



I tried to find a way to add the disk manually. I did not find any 
documentation about it, but by looking at the source code and some issues, 
and with some trial and error, I ended up with this.


Since the LV is deleted I recreated it with the same name.

# lvcreate -l 91570 -n osd-block-db-449bd001-eb32-46de-ab80-a1cbcd293d69 
ceph-block-dbs-563432b7-f52d-4cfe-b952-11542594843b


In "cephadm shell"
# cephadm shell
# ceph auth get client.bootstrap-osd 
>/var/lib/ceph/bootstrap-osd/ceph.keyring
# ceph-volume lvm prepare --bluestore --no-systemd --data /dev/sdt 
--block.db 
ceph-block-dbs-563432b7-f52d-4cfe-b952-11542594843b/osd-block-db-449bd001-eb32-46de-ab80-a1cbcd293d69



Need to have a json file for the "cephadm deploy"
# printf '{\n"config": "%s",\n"keyring": "%s"\n}\n' "$(ceph config 
generate-minimal-conf | sed -e ':a;N;$!ba;s/\n/\\n/g' -e 's/\t/\\t/g' -e 
's/$/\\n/')" "$(ceph auth get osd.178 | head -n 2 | sed -e 
':a;N;$!ba;s/\n/\\n/g' -e 's/\t/\\t/g' -e 's/$/\\n/')" 
>config-osd.178.json



Exit cephadm shell and run
# cephadm --image ceph:v15.2.9 deploy --fsid 
3614abcc-201c-11eb-995a-2794bcc75ae0 --config-json 
/var/lib/ceph/3614abcc-201c-11eb-995a-2794bcc75ae0/home/config-osd.178.json 
--osd-fsid 9227e8ae-92eb-429e-9c7f-d4a2b75afb8e



And the OSD is back, but the VG name on the HDD is missing "block" in its 
name; just a cosmetic thing, so I leave it as is.


  LVVG   
   Attr   LSize
  osd-block-9227e8ae-92eb-429e-9c7f-d4a2b75afb8e
ceph-46f42262-d3dc-4dc3-8952-eec3e4a2c178   -wi-ao   12.47t
  osd-block-2da790bc-a74c-41da-8772-3b8aac77001c
ceph-block-1b5ad7e7-2e24-4315-8a05-7439ab782b45 -wi-ao   12.47t


The first one is the new OSD and the second one is one that cephadm 
itself created.



--
Kai Stian Olstad
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rebalancing after node more

2021-05-27 Thread Rok Jaklič
For this pool I have set EC 3+2 (and in total I have 5 nodes), of which one was
temporarily removed, but maybe this was the problem?
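
If I understand it correctly, with a host failure domain an EC 3+2 pool needs 5
distinct hosts for its 5 chunks, so with one node removed CRUSH has nowhere to
place the fifth chunk -- the 2147483647 in the up/acting sets just means "no OSD
found". Something like this should confirm it (pool, rule and profile names below
are only examples):

ceph osd pool get <pool-name> crush_rule
ceph osd crush rule dump <rule-name>
ceph osd erasure-code-profile get <profile-name>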

On Thu, May 27, 2021 at 3:51 PM Rok Jaklič  wrote:

> Hi, thanks for quick reply
>
> root@ctplmon1:~# ceph pg dump pgs_brief | grep undersized
> dumped pgs_brief
> 9.5  active+undersized+degraded   [72,85,54,120,2147483647]
> 72   [72,85,54,120,2147483647]  72
> 9.6  active+undersized+degraded  [101,47,113,74,2147483647]
>  101  [101,47,113,74,2147483647] 101
> 9.2  active+undersized+degraded   [86,118,74,2147483647,49]
> 86   [86,118,74,2147483647,49]  86
> 9.d  active+undersized+degraded   [49,136,83,90,2147483647]
> 49   [49,136,83,90,2147483647]  49
> 9.f  active+undersized+degraded  [55,103,81,128,2147483647]
> 55  [55,103,81,128,2147483647]  55
> 9.18 active+undersized+degraded   [115,50,61,89,2147483647]
>  115   [115,50,61,89,2147483647] 115
> 9.1d active+undersized+degraded   [61,90,31,2147483647,125]
> 61   [61,90,31,2147483647,125]  61
> 9.10 active+undersized+degraded   [46,2147483647,71,86,122]
> 46   [46,2147483647,71,86,122]  46
> 9.17 active+undersized+degraded   [60,95,114,2147483647,48]
> 60   [60,95,114,2147483647,48]  60
> 9.15 active+undersized+degraded  [121,76,30,101,2147483647]
>  121  [121,76,30,101,2147483647] 121
> root@ctplmon1:~# ceph osd tree
> ID   CLASS  WEIGHT TYPE NAME  STATUS  REWEIGHT  PRI-AFF
>  -1 764.11981  root default
>  -3 152.82378  host ctplosd1
>   0hdd5.45798  osd.0down 0  1.0
>   1hdd5.45799  osd.1down 0  1.0
>   2hdd5.45799  osd.2down 0  1.0
>   3hdd5.45799  osd.3down 0  1.0
>   4hdd5.45799  osd.4down 0  1.0
>   5hdd5.45799  osd.5down 0  1.0
>   6hdd5.45799  osd.6down 0  1.0
>   7hdd5.45799  osd.7down 0  1.0
>   8hdd5.45799  osd.8down 0  1.0
>   9hdd5.45799  osd.9down 0  1.0
>  10hdd5.45799  osd.10   down 0  1.0
>  11hdd5.45799  osd.11   down 0  1.0
>  12hdd5.45799  osd.12   down 0  1.0
>  13hdd5.45799  osd.13   down 0  1.0
>  14hdd5.45799  osd.14   down 0  1.0
>  15hdd5.45799  osd.15   down 0  1.0
>  16hdd5.45799  osd.16   down 0  1.0
>  17hdd5.45799  osd.17   down 0  1.0
>  18hdd5.45799  osd.18   down 0  1.0
>  19hdd5.45799  osd.19   down 0  1.0
>  20hdd5.45799  osd.20   down 0  1.0
>  21hdd5.45799  osd.21   down 0  1.0
>  22hdd5.45799  osd.22   down 0  1.0
>  23hdd5.45799  osd.23   down 0  1.0
>  24hdd5.45799  osd.24   down 0  1.0
>  25hdd5.45799  osd.25   down 0  1.0
>  26hdd5.45799  osd.26   down 0  1.0
>  27hdd5.45799  osd.27   down 0  1.0
> -11 152.82401  host ctplosd5
> 112hdd5.45799  osd.112up   1.0  1.0
> 113hdd5.45799  osd.113up   1.0  1.0
> 114hdd5.45799  osd.114up   1.0  1.0
> 115hdd5.45799  osd.115up   1.0  1.0
> 116hdd5.45799  osd.116up   1.0  1.0
> 117hdd5.45799  osd.117up   1.0  1.0
> 118hdd5.45799  osd.118up   1.0  1.0
> 119hdd5.45799  osd.119up   1.0  1.0
> 120hdd5.45799  osd.120up   1.0  1.0
> 121hdd5.45799  osd.121up   1.0  1.0
> 122hdd5.45799  osd.122up   1.0  1.0
> 123hdd5.45799  osd.123up   1.0  1.0
> 124hdd5.45799  osd.124up   1.0  1.0
> 125hdd5.45799  osd.125up   1.0  1.0
> 126hdd5.45799  osd.126up   1.0  1.0
> 127hdd5.45799  osd.127up   1.0  1.0
> 128hdd5.45799  osd.128up   1.0  1.0
> 129hdd5.45799  osd.129up   1.0  1.0
> 130hdd5.45799  osd.130up   1.0  1.0
> 131hdd5.45799  

[ceph-users] Re: rebalancing after node more

2021-05-27 Thread Rok Jaklič
Hi, thanks for quick reply

root@ctplmon1:~# ceph pg dump pgs_brief | grep undersized
dumped pgs_brief
9.5  active+undersized+degraded   [72,85,54,120,2147483647]
72   [72,85,54,120,2147483647]  72
9.6  active+undersized+degraded  [101,47,113,74,2147483647]
 101  [101,47,113,74,2147483647] 101
9.2  active+undersized+degraded   [86,118,74,2147483647,49]
86   [86,118,74,2147483647,49]  86
9.d  active+undersized+degraded   [49,136,83,90,2147483647]
49   [49,136,83,90,2147483647]  49
9.f  active+undersized+degraded  [55,103,81,128,2147483647]
55  [55,103,81,128,2147483647]  55
9.18 active+undersized+degraded   [115,50,61,89,2147483647]
 115   [115,50,61,89,2147483647] 115
9.1d active+undersized+degraded   [61,90,31,2147483647,125]
61   [61,90,31,2147483647,125]  61
9.10 active+undersized+degraded   [46,2147483647,71,86,122]
46   [46,2147483647,71,86,122]  46
9.17 active+undersized+degraded   [60,95,114,2147483647,48]
60   [60,95,114,2147483647,48]  60
9.15 active+undersized+degraded  [121,76,30,101,2147483647]
 121  [121,76,30,101,2147483647] 121
root@ctplmon1:~# ceph osd tree
ID   CLASS  WEIGHT TYPE NAME  STATUS  REWEIGHT  PRI-AFF
 -1 764.11981  root default
 -3 152.82378  host ctplosd1
  0hdd5.45798  osd.0down 0  1.0
  1hdd5.45799  osd.1down 0  1.0
  2hdd5.45799  osd.2down 0  1.0
  3hdd5.45799  osd.3down 0  1.0
  4hdd5.45799  osd.4down 0  1.0
  5hdd5.45799  osd.5down 0  1.0
  6hdd5.45799  osd.6down 0  1.0
  7hdd5.45799  osd.7down 0  1.0
  8hdd5.45799  osd.8down 0  1.0
  9hdd5.45799  osd.9down 0  1.0
 10hdd5.45799  osd.10   down 0  1.0
 11hdd5.45799  osd.11   down 0  1.0
 12hdd5.45799  osd.12   down 0  1.0
 13hdd5.45799  osd.13   down 0  1.0
 14hdd5.45799  osd.14   down 0  1.0
 15hdd5.45799  osd.15   down 0  1.0
 16hdd5.45799  osd.16   down 0  1.0
 17hdd5.45799  osd.17   down 0  1.0
 18hdd5.45799  osd.18   down 0  1.0
 19hdd5.45799  osd.19   down 0  1.0
 20hdd5.45799  osd.20   down 0  1.0
 21hdd5.45799  osd.21   down 0  1.0
 22hdd5.45799  osd.22   down 0  1.0
 23hdd5.45799  osd.23   down 0  1.0
 24hdd5.45799  osd.24   down 0  1.0
 25hdd5.45799  osd.25   down 0  1.0
 26hdd5.45799  osd.26   down 0  1.0
 27hdd5.45799  osd.27   down 0  1.0
-11 152.82401  host ctplosd5
112hdd5.45799  osd.112up   1.0  1.0
113hdd5.45799  osd.113up   1.0  1.0
114hdd5.45799  osd.114up   1.0  1.0
115hdd5.45799  osd.115up   1.0  1.0
116hdd5.45799  osd.116up   1.0  1.0
117hdd5.45799  osd.117up   1.0  1.0
118hdd5.45799  osd.118up   1.0  1.0
119hdd5.45799  osd.119up   1.0  1.0
120hdd5.45799  osd.120up   1.0  1.0
121hdd5.45799  osd.121up   1.0  1.0
122hdd5.45799  osd.122up   1.0  1.0
123hdd5.45799  osd.123up   1.0  1.0
124hdd5.45799  osd.124up   1.0  1.0
125hdd5.45799  osd.125up   1.0  1.0
126hdd5.45799  osd.126up   1.0  1.0
127hdd5.45799  osd.127up   1.0  1.0
128hdd5.45799  osd.128up   1.0  1.0
129hdd5.45799  osd.129up   1.0  1.0
130hdd5.45799  osd.130up   1.0  1.0
131hdd5.45799  osd.131up   1.0  1.0
132hdd5.45799  osd.132up   1.0  1.0
133hdd5.45799  osd.133up   1.0  1.0
134hdd5.45799  osd.134up   1.0  1.0
135hdd5.45799  osd.135up   1.0  1.0
136hdd5.45799  

[ceph-users] Re: How to add back stray OSD daemon after node re-installation

2021-05-27 Thread Eugen Block
That file is in the regular filesystem; you can copy it from a  
different osd directory, it's just a minimal ceph.conf. The directory  
for the failing osd should now be present after the failed attempts.
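
For example (a sketch, adjust the cluster fsid and OSD ids to your setup):

cp /var/lib/ceph/<cluster-fsid>/osd.3/config /var/lib/ceph/<cluster-fsid>/osd.2/config

or regenerate it from the cluster:

ceph config generate-minimal-conf > /var/lib/ceph/<cluster-fsid>/osd.2/config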



Zitat von mabi :

Nicely spotted about the missing file, it looks like I have the same  
case as you can see below from the syslog:


May 27 15:33:12 ceph1f systemd[1]:  
ceph-8d47792c-987d-11eb-9bb6-a5302e00e1fa@osd.2.service: Scheduled  
restart job, restart counter is at 1.
May 27 15:33:12 ceph1f systemd[1]: Stopped Ceph osd.2 for  
8d47792c-987d-11eb-9bb6-a5302e00e1fa.
May 27 15:33:12 ceph1f systemd[1]: Starting Ceph osd.2 for  
8d47792c-987d-11eb-9bb6-a5302e00e1fa...
May 27 15:33:12 ceph1f kernel: [19332.481779] overlayfs:  
unrecognized mount option "volatile" or missing value
May 27 15:33:13 ceph1f kernel: [19332.709205] overlayfs:  
unrecognized mount option "volatile" or missing value
May 27 15:33:13 ceph1f kernel: [19332.933442] overlayfs:  
unrecognized mount option "volatile" or missing value
May 27 15:33:13 ceph1f bash[64982]: Error: statfs  
/var/lib/ceph/8d47792c-987d-11eb-9bb6-a5302e00e1fa/osd.2/config: no  
such file or directory
May 27 15:33:13 ceph1f systemd[1]:  
ceph-8d47792c-987d-11eb-9bb6-a5302e00e1fa@osd.2.service: Control  
process exited, code=exited, status=125/n/a


So how do I generate/create that missing  
/var/lib/ceph/8d47792c-987d-11eb-9bb6-a5302e00e1fa/osd.2/config file?



‐‐‐ Original Message ‐‐‐
On Thursday, May 27, 2021 3:28 PM, Eugen Block  wrote:


Can you try with both cluster and osd fsid? Something like this:

pacific2:~ # cephadm deploy --name osd.2 --fsid
acbb46d6-bde3-11eb-9cf2-fa163ebb2a74 --osd-fsid
bc241cd4-e284-4c5a-aad2-5744632fc7fc

I tried to reproduce a similar scenario and found a missing config
file in the osd directory:

Error: statfs
/var/lib/ceph/acbb46d6-bde3-11eb-9cf2-fa163ebb2a74/osd.2/config: no
such file or directory

Check your syslog for more information why the osd start fails.

Zitat von mabi m...@protonmail.ch:

> You are right, I used the FSID of the OSD and not of the cluster in
> the deploy command. So now I tried again with the cluster ID as FSID
> but still it does not work as you can see below:
> ubuntu@ceph1f:~$ sudo cephadm deploy --name osd.2 --fsid
> 8d47792c-987d-11eb-9bb6-a5302e00e1fa
> Deploy daemon osd.2 ...
> Traceback (most recent call last):
> File "/usr/local/sbin/cephadm", line 6223, in 
> r = args.func()
> File "/usr/local/sbin/cephadm", line 1440, in _default_image
> return func()
> File "/usr/local/sbin/cephadm", line 3457, in command_deploy
> deploy_daemon(args.fsid, daemon_type, daemon_id, c, uid, gid,
> File "/usr/local/sbin/cephadm", line 2193, in deploy_daemon
> deploy_daemon_units(fsid, uid, gid, daemon_type, daemon_id, c,
> File "/usr/local/sbin/cephadm", line 2255, in deploy_daemon_units
> assert osd_fsid
> AssertionError
> In case that's of any help here is the output of the "cephadm
> ceph-volume lvm list" command:
> == osd.2 ===
> [block]
>  
/dev/ceph-cca8abe6-cf9b-4c2f-ab81-ae0758585414/osd-block-91a86f20-8083-40b1-8bf1-fe35fac3d677

>
>   block device
>
>
>  
/dev/ceph-cca8abe6-cf9b-4c2f-ab81-ae0758585414/osd-block-91a86f20-8083-40b1-8bf1-fe35fac3d677

> block uuid W3omTg-vami-RB0V-CkVb-cgpb-88Jy-pIK2Tz
> cephx lockbox secret
> cluster fsid 8d47792c-987d-11eb-9bb6-a5302e00e1fa
> cluster name ceph
> crush device class None
> encrypted 0
> osd fsid 91a86f20-8083-40b1-8bf1-fe35fac3d677
> osd id 2
> osdspec affinity all-available-devices
> type block
> vdo 0
> devices /dev/sda
> ‐‐‐ Original Message ‐‐‐
> On Thursday, May 27, 2021 12:32 PM, Eugen Block ebl...@nde.ag wrote:
>
> > > ubuntu@ceph1f:~$ sudo cephadm deploy --name osd.2 --fsid
> >
> > > 91a86f20-8083-40b1-8bf1-fe35fac3d677
> > > Deploy daemon osd.2 ...
> >
> > Which fsid is it, the cluster's or the OSD's? According to the
> > 'cephadm deploy' help page it should be the cluster fsid.
> > Zitat von mabi m...@protonmail.ch:
> >
> > > Hi Eugen,
> > > What a good coincidence ;-)
> > > So I ran "cephadm ceph-volume lvm list" on the OSD node which I
> > re-installed and it saw my osd.2 OSD. So far so good, but the
> > > following suggested command does not work as you can see below:
> > > ubuntu@ceph1f:~$ sudo cephadm deploy --name osd.2 --fsid
> > > 91a86f20-8083-40b1-8bf1-fe35fac3d677
> > > Deploy daemon osd.2 ...
> > > Traceback (most recent call last):
> > > File "/usr/local/sbin/cephadm", line 6223, in 
> > > r = args.func()
> > > File "/usr/local/sbin/cephadm", line 1440, in _default_image
> > > return func()
> > > File "/usr/local/sbin/cephadm", line 3457, in command_deploy
> > > deploy_daemon(args.fsid, daemon_type, daemon_id, c, uid, gid,
> > > File "/usr/local/sbin/cephadm", line 2193, in deploy_daemon
> > > deploy_daemon_units(fsid, uid, gid, daemon_type, daemon_id, c,
> > > File "/usr/local/sbin/cephadm", line 2255, in deploy_daemon_units
> > > assert osd_fsid
> > > AssertionError
> > > Any ideas what is 

[ceph-users] Re: rebalancing after node more

2021-05-27 Thread Eugen Block

Hi,

this sounds like your crush rule(s) for one or more pools can't place  
the PGs because the host is missing. Please share


ceph pg dump pgs_brief | grep undersized
ceph osd tree
ceph osd pool ls detail

and the crush rule(s) for the affected pool(s).


Zitat von Rok Jaklič :


Hi,

I have removed one node, but now ceph seems to stuck in:
Degraded data redundancy: 67/2393 objects degraded (2.800%), 12 pgs
degraded, 12 pgs undersized

How to "force" rebalancing? Or should I just wait a little bit more?

Kind regards,
rok
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How to add back stray OSD daemon after node re-installation

2021-05-27 Thread Eugen Block

Can you try with both cluster and osd fsid? Something like this:

pacific2:~ # cephadm deploy --name osd.2 --fsid  
acbb46d6-bde3-11eb-9cf2-fa163ebb2a74 --osd-fsid  
bc241cd4-e284-4c5a-aad2-5744632fc7fc


I tried to reproduce a similar scenario and found a missing config  
file in the osd directory:


Error: statfs  
/var/lib/ceph/acbb46d6-bde3-11eb-9cf2-fa163ebb2a74/osd.2/config: no  
such file or directory


Check your syslog for more information why the osd start fails.



Zitat von mabi :

You are right, I used the FSID of the OSD and not of the cluster in  
the deploy command. So now I tried again with the cluster ID as FSID  
but still it does not work as you can see below:


ubuntu@ceph1f:~$ sudo cephadm deploy --name osd.2 --fsid  
8d47792c-987d-11eb-9bb6-a5302e00e1fa

Deploy daemon osd.2 ...
Traceback (most recent call last):
  File "/usr/local/sbin/cephadm", line 6223, in 
r = args.func()
  File "/usr/local/sbin/cephadm", line 1440, in _default_image
return func()
  File "/usr/local/sbin/cephadm", line 3457, in command_deploy
deploy_daemon(args.fsid, daemon_type, daemon_id, c, uid, gid,
  File "/usr/local/sbin/cephadm", line 2193, in deploy_daemon
deploy_daemon_units(fsid, uid, gid, daemon_type, daemon_id, c,
  File "/usr/local/sbin/cephadm", line 2255, in deploy_daemon_units
assert osd_fsid
AssertionError

In case that's of any help here is the output of the "cephadm  
ceph-volume lvm list" command:


== osd.2 ===

  [block]
/dev/ceph-cca8abe6-cf9b-4c2f-ab81-ae0758585414/osd-block-91a86f20-8083-40b1-8bf1-fe35fac3d677


  block device   
/dev/ceph-cca8abe6-cf9b-4c2f-ab81-ae0758585414/osd-block-91a86f20-8083-40b1-8bf1-fe35fac3d677

  block uuidW3omTg-vami-RB0V-CkVb-cgpb-88Jy-pIK2Tz
  cephx lockbox secret
  cluster fsid  8d47792c-987d-11eb-9bb6-a5302e00e1fa
  cluster name  ceph
  crush device classNone
  encrypted 0
  osd fsid  91a86f20-8083-40b1-8bf1-fe35fac3d677
  osd id2
  osdspec affinity  all-available-devices
  type  block
  vdo   0
  devices   /dev/sda

‐‐‐ Original Message ‐‐‐
On Thursday, May 27, 2021 12:32 PM, Eugen Block  wrote:


> ubuntu@ceph1f:~$ sudo cephadm deploy --name osd.2 --fsid

> 91a86f20-8083-40b1-8bf1-fe35fac3d677
> Deploy daemon osd.2 ...

Which fsid is it, the cluster's or the OSD's? According to the
'cephadm deploy' help page it should be the cluster fsid.

Zitat von mabi m...@protonmail.ch:

> Hi Eugen,
> What a good coincidence ;-)
> So I ran "cephadm ceph-volume lvm list" on the OSD node which I
> re-installed and it saw my osd.2 OSD. So far so good, but the
> following suggested command does not work as you can see below:
> ubuntu@ceph1f:~$ sudo cephadm deploy --name osd.2 --fsid
> 91a86f20-8083-40b1-8bf1-fe35fac3d677
> Deploy daemon osd.2 ...
> Traceback (most recent call last):
> File "/usr/local/sbin/cephadm", line 6223, in 
> r = args.func()
> File "/usr/local/sbin/cephadm", line 1440, in _default_image
> return func()
> File "/usr/local/sbin/cephadm", line 3457, in command_deploy
> deploy_daemon(args.fsid, daemon_type, daemon_id, c, uid, gid,
> File "/usr/local/sbin/cephadm", line 2193, in deploy_daemon
> deploy_daemon_units(fsid, uid, gid, daemon_type, daemon_id, c,
> File "/usr/local/sbin/cephadm", line 2255, in deploy_daemon_units
> assert osd_fsid
> AssertionError
> Any ideas what is wrong here?
> Regards,
> Mabi
> ‐‐‐ Original Message ‐‐‐
> On Thursday, May 27, 2021 12:13 PM, Eugen Block ebl...@nde.ag wrote:
>
> > Hi,
> > I posted a link to the docs [1], [2] just yesterday ;-)
> > You should see the respective OSD in the output of 'cephadm
> > ceph-volume lvm list' on that node. You should then be able to get it
> > back to cephadm with
> > cephadm deploy --name osd.x
> > But I haven't tried this yet myself, so please report back if that
> > works for you.
> > Regards,
> > Eugen
> > [1] https://tracker.ceph.com/issues/49159
> > [2] https://tracker.ceph.com/issues/46691
> > Zitat von mabi m...@protonmail.ch:
> >
> > > Hello,
> > > I have by mistake re-installed the OS of an OSD node of my Octopus
> > > cluster (managed by cephadm). Luckily the OSD data is on a separate
> > > disk and did not get affected by the re-install.
> > > Now I have the following state:
> > >
> > > health: HEALTH_WARN
> > > 1 stray daemon(s) not managed by cephadm
> > > 1 osds down
> > > 1 host (1 osds) down
> > >
> > >
> > > To fix that I tried to run:
> > > ceph orch daemon add osd ceph1f:/dev/sda
> > > =
> > > Created no osd(s) on host ceph1f; already created?
> > > That did not work, so I tried:
> > > ceph cephadm osd activate ceph1f
> > > =
> > > no valid command found; 10 closest matches:
> > 

[ceph-users] Re: MDS stuck in up:stopping state

2021-05-27 Thread 胡 玮文

> On 27 May 2021, at 19:11, Mark Schouten wrote:
> 
> On Thu, May 27, 2021 at 12:38:07PM +0200, Mark Schouten wrote:
>>> On Thu, May 27, 2021 at 06:25:44AM +, Martin Rasmus Lundquist Hansen 
>>> wrote:
>>> After scaling the number of MDS daemons down, we now have a daemon stuck in 
>>> the
>>> "up:stopping" state. The documentation says it can take several minutes to 
>>> stop the
>>> daemon, but it has been stuck in this state for almost a full day. 
>>> According to
>>> the "ceph fs status" output attached below, it still holds information 
>>> about 2
>>> inodes, which we assume is the reason why it cannot stop completely.
>>> 
>>> Does anyone know what we can do to finally stop it?
>> 
>> I have no clients, and it still does not want to stop rank1. Funny
>> thing is, while trying to fix this by restarting mdses, I sometimes see
>> a list of clients popping up in the dashboard, even though no clients
>> are connected..
> 
> Configuring debuglogging shows me the following:
> https://p.6core.net/p/rlMaunS8IM1AY5E58uUB6oy4

I think your case is different from mine. Your logs show “waiting for stray to 
migrate”. I didn’t see this.

> I have quite a lot of hardlinks on this filesystem, which I've seen
> cause 'No space left on device' issues. I have mds_bal_fragment_size_max
> set to 20 to mitigate that.
> 
> The message 'waiting for strays to migrate' makes me feel like I should
> push the MDS to migrate them somehow .. But how?
> 
> -- 
> Mark Schouten | Tuxis B.V.
> KvK: 74698818 | 
> http://www.tuxis.nl/
> T: +31 318 200208 | i...@tuxis.nl
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs:: store files on different pools?

2021-05-27 Thread Dietmar Rieder

On 5/27/21 2:33 PM, Adrian Sevcenco wrote:
Hi! Is it (technically) possible to instruct cephfs to store files < 
1 MiB on a (replicated) pool

and the other files on another (EC) pool?

And even more, is it possible to make the same kind of decision based on 
the path of the file?

(let's say that I want critical files with names like r"/critical_path/critical_.*" 
in a 6x replicated SSD pool)


you can pin a directory to a specific pool using setfattr

e.g.
setfattr -n ceph.dir.layout.pool -v critical-rep-data-pool 
/mnt/cephfs/critical
setfattr -n ceph.dir.layout.pool -v standard-ec-data-pool 
/mnt/cephfs/standard
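
You can verify the layout with getfattr, e.g.:

getfattr -n ceph.dir.layout /mnt/cephfs/critical

Note that a directory layout only applies to files created after it was set, and 
as far as I know there is no automatic placement based on file size, so the small 
files would have to be separated by path.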


HTH
  Dietmar

--
_
D i e t m a r  R i e d e r, Mag.Dr.
Head of HPC/Bioinformatics facility
Innsbruck Medical University
Biocenter - Institute of Bioinformatics
Innrain 80, 6020 Innsbruck
Phone: +43 512 9003 71402
Fax: +43 512 9003 73100
Email: dietmar.rie...@i-med.ac.at
Web:   http://www.icbi.at




OpenPGP_signature
Description: OpenPGP digital signature
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] cephfs:: store files on different pools?

2021-05-27 Thread Adrian Sevcenco

Hi! Is it (technically) possible to instruct cephfs to store files < 1 MiB on a 
(replicated) pool
and the other files on another (EC) pool?

And even more, is it possible to make the same kind of decision based on the path 
of the file?
(let's say that I want critical files with names like r"/critical_path/critical_.*" 
in a 6x replicated SSD pool)

Thank you!
Adrian



smime.p7s
Description: S/MIME Cryptographic Signature
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Python lib usage access permissions

2021-05-27 Thread Szabo, Istvan (Agoda)
Hi,

Is there a way to manage specific pools with the Python lib without the 
admin keyring?
I'm not sure why it only works with the admin keyring but not with a client keyring :/
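
For reference, this is roughly what I tried for the client keyring (the names here 
are just examples); I assume the caps have to allow the specific pool:

ceph auth get-or-create client.myapp mon 'allow r' osd 'allow rwx pool=mypool' -o /etc/ceph/ceph.client.myapp.keyring

and then pass name='client.myapp' and that keyring to rados.Rados() instead of the 
admin defaults.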

Istvan Szabo
Senior Infrastructure Engineer
---
Agoda Services Co., Ltd.
e: istvan.sz...@agoda.com
---



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: MDS stuck in up:stopping state

2021-05-27 Thread Mark Schouten
On Thu, May 27, 2021 at 12:38:07PM +0200, Mark Schouten wrote:
> On Thu, May 27, 2021 at 06:25:44AM +, Martin Rasmus Lundquist Hansen 
> wrote:
> > After scaling the number of MDS daemons down, we now have a daemon stuck in 
> > the
> > "up:stopping" state. The documentation says it can take several minutes to 
> > stop the
> > daemon, but it has been stuck in this state for almost a full day. 
> > According to
> > the "ceph fs status" output attached below, it still holds information 
> > about 2
> > inodes, which we assume is the reason why it cannot stop completely.
> > 
> > Does anyone know what we can do to finally stop it?
> 
> I have no clients, and it still does not want to stop rank1. Funny
> thing is, while trying to fix this by restarting mdses, I sometimes see
> a list of clients popping up in the dashboard, even though no clients
> are connected..

Configuring debuglogging shows me the following:
https://p.6core.net/p/rlMaunS8IM1AY5E58uUB6oy4


I have quite a lot of hardlinks on this filesystem, which I've seen
cause 'No space left on device' issues. I have mds_bal_fragment_size_max
set to 20 to mitigate that.

The message 'waiting for strays to migrate' makes me feel like I should
push the MDS to migrate them somehow .. But how?

-- 
Mark Schouten | Tuxis B.V.
KvK: 74698818 | http://www.tuxis.nl/
T: +31 318 200208 | i...@tuxis.nl
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: MDS stuck in up:stopping state

2021-05-27 Thread Mark Schouten
On Thu, May 27, 2021 at 06:25:44AM +, Martin Rasmus Lundquist Hansen wrote:
> After scaling the number of MDS daemons down, we now have a daemon stuck in 
> the
> "up:stopping" state. The documentation says it can take several minutes to 
> stop the
> daemon, but it has been stuck in this state for almost a full day. According 
> to
> the "ceph fs status" output attached below, it still holds information about 2
> inodes, which we assume is the reason why it cannot stop completely.
> 
> Does anyone know what we can do to finally stop it?

I have no clients, and it still does not want to stop rank1. Funny
thing is, while trying to fix this by restarting mdses, I sometimes see
a list of clients popping up in the dashboard, even though no clients
are connected..

-- 
Mark Schouten | Tuxis B.V.
KvK: 74698818 | http://www.tuxis.nl/
T: +31 318 200208 | i...@tuxis.nl
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How to add back stray OSD daemon after node re-installation

2021-05-27 Thread Eugen Block
ubuntu@ceph1f:~$ sudo cephadm deploy --name osd.2 --fsid  
91a86f20-8083-40b1-8bf1-fe35fac3d677

Deploy daemon osd.2 ...


Which fsid is it, the cluster's or the OSD's? According to the  
'cephadm deploy' help page it should be the cluster fsid.



Zitat von mabi :


Hi Eugen,

What a good coincidence ;-)

So I ran "cephadm ceph-volume lvm list" on the OSD node which I  
re-installed and it saw my osd.2 OSD. So far so good, but the  
following suggested command does not work as you can see below:


ubuntu@ceph1f:~$ sudo cephadm deploy --name osd.2 --fsid  
91a86f20-8083-40b1-8bf1-fe35fac3d677

Deploy daemon osd.2 ...

Traceback (most recent call last):
  File "/usr/local/sbin/cephadm", line 6223, in 
r = args.func()
  File "/usr/local/sbin/cephadm", line 1440, in _default_image
return func()
  File "/usr/local/sbin/cephadm", line 3457, in command_deploy
deploy_daemon(args.fsid, daemon_type, daemon_id, c, uid, gid,
  File "/usr/local/sbin/cephadm", line 2193, in deploy_daemon
deploy_daemon_units(fsid, uid, gid, daemon_type, daemon_id, c,
  File "/usr/local/sbin/cephadm", line 2255, in deploy_daemon_units
assert osd_fsid
AssertionError

Any ideas what is wrong here?

Regards,
Mabi

‐‐‐ Original Message ‐‐‐
On Thursday, May 27, 2021 12:13 PM, Eugen Block  wrote:


Hi,

I posted a link to the docs [1], [2] just yesterday ;-)

You should see the respective OSD in the output of 'cephadm
ceph-volume lvm list' on that node. You should then be able to get it
back to cephadm with

cephadm deploy --name osd.x

But I haven't tried this yet myself, so please report back if that
works for you.

Regards,
Eugen

[1] https://tracker.ceph.com/issues/49159
[2] https://tracker.ceph.com/issues/46691

Zitat von mabi m...@protonmail.ch:

> Hello,
> I have by mistake re-installed the OS of an OSD node of my Octopus
> cluster (managed by cephadm). Luckily the OSD data is on a separate
> disk and did not get affected by the re-install.
> Now I have the following state:
>
> health: HEALTH_WARN
> 1 stray daemon(s) not managed by cephadm
> 1 osds down
> 1 host (1 osds) down
>
>
> To fix that I tried to run:
>
> ceph orch daemon add osd ceph1f:/dev/sda
>
> =
>
> Created no osd(s) on host ceph1f; already created?
> That did not work, so I tried:
>
> ceph cephadm osd activate ceph1f
>
> =
>
> no valid command found; 10 closest matches:
> ...
> Error EINVAL: invalid command
> Did not work either. So I wanted to ask how can I "adopt" back an
> OSD disk to my cluster?
> Thanks for your help.
> Regards,
> Mabi
>
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] MDS stuck in up:stopping state

2021-05-27 Thread Martin Rasmus Lundquist Hansen
After scaling the number of MDS daemons down, we now have a daemon stuck in the
"up:stopping" state. The documentation says it can take several minutes to stop 
the
daemon, but it has been stuck in this state for almost a full day. According to
the "ceph fs status" output attached below, it still holds information about 2
inodes, which we assume is the reason why it cannot stop completely.

Does anyone know what we can do to finally stop it?


cephfs - 71 clients
==
RANK   STATEMDS ACTIVITY DNSINOS
 0 active   ceph-mon-01  Reqs:0 /s  15.7M  15.4M
 1 active   ceph-mon-02  Reqs:   48 /s  19.7M  17.1M
 2stopping  ceph-mon-030  2
  POOL TYPE USED  AVAIL
cephfs_metadata  metadata   652G   185T
  cephfs_data  data1637T   539T
   STANDBY MDS
ceph-mon-03-mds-2
MDS version: ceph version 15.2.11 (e3523634d9c2227df9af89a4eac33d16738c49cb) 
octopus (stable)
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How to add back stray OSD daemon after node re-installation

2021-05-27 Thread Eugen Block

Hi,

I posted a link to the docs [1], [2] just yesterday ;-)

You should see the respective OSD in the output of 'cephadm  
ceph-volume lvm list' on that node. You should then be able to get it  
back to cephadm with


cephadm deploy --name osd.x

But I haven't tried this yet myself, so please report back if that  
works for you.


Regards,
Eugen


[1] https://tracker.ceph.com/issues/49159
[2] https://tracker.ceph.com/issues/46691


Zitat von mabi :


Hello,

I have by mistake re-installed the OS of an OSD node of my Octopus  
cluster (managed by cephadm). Luckily the OSD data is on a separate  
disk and did not get affected by the re-install.


Now I have the following state:

health: HEALTH_WARN
1 stray daemon(s) not managed by cephadm
1 osds down
1 host (1 osds) down

To fix that I tried to run:

# ceph orch daemon add osd ceph1f:/dev/sda
Created no osd(s) on host ceph1f; already created?

That did not work, so I tried:

# ceph cephadm osd activate ceph1f
no valid command found; 10 closest matches:
...
Error EINVAL: invalid command

Did not work either. So I wanted to ask: how can I "adopt" an  
OSD disk back into my cluster?


Thanks for your help.

Regards,
Mabi
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephadm: How to replace failed HDD where DB is on SSD

2021-05-27 Thread Eugen Block

This test was on ceph version 15.2.8.

On Pacific (ceph version 16.2.4) this also works for me for initial  
deployment of an entire host:


+-+-+--+--+--+-+
|SERVICE  |NAME |HOST  |DATA  |DB|WAL  |
+-+-+--+--+--+-+
|osd  |ssd-hdd-mix  |pacific1  |/dev/vdb  |/dev/vdd  |-|
|osd  |ssd-hdd-mix  |pacific1  |/dev/vdc  |/dev/vdd  |-|
+-+-+--+--+--+-+

But it doesn't work if I remove one OSD, just like you describe. This  
is what ceph-volume reports:


---snip---
[ceph: root@pacific1 /]# ceph-volume lvm batch --report /dev/vdc  
--db-devices /dev/vdd --block-db-size 3G

--> passed data devices: 1 physical, 0 LVM
--> relative data size: 1.0
--> passed block_db devices: 1 physical, 0 LVM
--> 1 fast devices were passed, but none are available

Total OSDs: 0

  TypePath 
LV Size % of device

---snip---

I know that this has already worked in Octopus, I did test it  
successfully not long ago.



Zitat von Kai Stian Olstad :


On 27.05.2021 11:17, Eugen Block wrote:

That's not how it's supposed to work. I tried the same on an Octopus
cluster and removed all filters except:

data_devices:
 rotational: 1
db_devices:
 rotational: 0

My Octopus test osd nodes have two HDDs and one SSD, I removed all
OSDs and redeployed on one node. This spec file results in three
standalone OSDs! Without the other filters this won't work as
expected, it seems. I'll try again on Pacific with the same test and
see where that goes.


This spec did work for me when I initially deployed with Octopus 15.2.5.

--
Kai Stian Olstad
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: MDS cache tunning

2021-05-27 Thread Andres Rojas Guerrero

Thank you very much, very good explanation!!

On 27/5/21 at 9:42, Dan van der Ster wrote:

between 100-200%


--
***
Andrés Rojas Guerrero
Unidad Sistemas Linux
Area Arquitectura Tecnológica
Secretaría General Adjunta de Informática
Consejo Superior de Investigaciones Científicas (CSIC)
Pinar 19
28006 - Madrid
Tel: +34 915680059 -- Ext. 990059
email: a.ro...@csic.es
ID comunicate.csic.es: @50852720l:matrix.csic.es
***


OpenPGP_0x2DEE9321B16B4A68.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephadm: How to replace failed HDD where DB is on SSD

2021-05-27 Thread Kai Stian Olstad

On 27.05.2021 11:17, Eugen Block wrote:

That's not how it's supposed to work. I tried the same on an Octopus
cluster and removed all filters except:

data_devices:
  rotational: 1
db_devices:
  rotational: 0

My Octopus test osd nodes have two HDDs and one SSD, I removed all
OSDs and redeployed on one node. This spec file results in three
standalone OSDs! Without the other filters this won't work as
expected, it seems. I'll try again on Pacific with the same test and
see where that goes.


This spec did work for me when I initially deployed with Octopus 
15.2.5.


--
Kai Stian Olstad
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephadm: How to replace failed HDD where DB is on SSD

2021-05-27 Thread Eugen Block
That's not how it's supposed to work. I tried the same on an Octopus  
cluster and removed all filters except:


data_devices:
  rotational: 1
db_devices:
  rotational: 0

My Octopus test osd nodes have two HDDs and one SSD, I removed all  
OSDs and redeployed on one node. This spec file results in three  
standalone OSDs! Without the other filters this won't work as  
expected, it seems. I'll try again on Pacific with the same test and  
see where that goes.



Zitat von Kai Stian Olstad :


On 26.05.2021 22:14, David Orman wrote:

We've found that after doing the osd rm, you can use: "ceph-volume lvm
zap --osd-id 178 --destroy" on the server with that OSD as per:
https://docs.ceph.com/en/latest/ceph-volume/lvm/zap/#removing-devices
and it will clean things up so they work as expected.


With the help of Eugen I did run "cephadm ceph-volume lvm zap  
--destroy " and the LV is gone.
I think that is the same result as "ceph-volume lvm zap --osd-id 178  
--destroy" would give me?


I now have 357GB free space on the VG, but Cephadm doesn't find and  
use this space.

Above is the result of the zap command and it shows the LV is deleted.

$ sudo cephadm ceph-volume lvm zap --destroy  
/dev/ceph-block-dbs-563432b7-f52d-4cfe-b952-11542594843b/osd-block-db-449bd001-eb32-46de-ab80-a1cbcd293d69

INFO:cephadm:Inferring fsid 3614abcc-201c-11eb-995a-2794bcc75ae0
INFO:cephadm:Using recent ceph image ceph:v15.2.9
INFO:cephadm:/usr/bin/podman:stderr --> Zapping:  
/dev/ceph-block-dbs-563432b7-f52d-4cfe-b952-11542594843b/osd-block-db-449bd001-eb32-46de-ab80-a1cbcd293d69
INFO:cephadm:/usr/bin/podman:stderr Running command: /usr/bin/dd  
if=/dev/zero  
of=/dev/ceph-block-dbs-563432b7-f52d-4cfe-b952-11542594843b/osd-block-db-449bd001-eb32-46de-ab80-a1cbcd293d69 bs=1M count=10  
conv=fsync

INFO:cephadm:/usr/bin/podman:stderr  stderr: 10+0 records in
INFO:cephadm:/usr/bin/podman:stderr 10+0 records out
INFO:cephadm:/usr/bin/podman:stderr  stderr: 10485760 bytes (10 MB,  
10 MiB) copied, 0.0195532 s, 536 MB/s
INFO:cephadm:/usr/bin/podman:stderr --> More than 1 LV left in VG,  
will proceed to destroy LV only
INFO:cephadm:/usr/bin/podman:stderr --> Removing LV because  
--destroy was given:  
/dev/ceph-block-dbs-563432b7-f52d-4cfe-b952-11542594843b/osd-block-db-449bd001-eb32-46de-ab80-a1cbcd293d69
INFO:cephadm:/usr/bin/podman:stderr Running command:  
/usr/sbin/lvremove -v -f  
/dev/ceph-block-dbs-563432b7-f52d-4cfe-b952-11542594843b/osd-block-db-449bd001-eb32-46de-ab80-a1cbcd293d69
INFO:cephadm:/usr/bin/podman:stderr  stdout: Logical volume  
"osd-block-db-449bd001-eb32-46de-ab80-a1cbcd293d69" successfully  
removed
INFO:cephadm:/usr/bin/podman:stderr  stderr: Removing  
ceph--block--dbs--563432b7--f52d--4cfe--b952--11542594843b-osd--block--db--449bd001--eb32--46de--ab80--a1cbcd293d69  
(253:3)
INFO:cephadm:/usr/bin/podman:stderr  stderr: Archiving volume group  
"ceph-block-dbs-563432b7-f52d-4cfe-b952-11542594843b" metadata  
(seqno 61).
INFO:cephadm:/usr/bin/podman:stderr  stderr: Releasing logical  
volume "osd-block-db-449bd001-eb32-46de-ab80-a1cbcd293d69"
INFO:cephadm:/usr/bin/podman:stderr  stderr: Creating volume group  
backup  
"/etc/lvm/backup/ceph-block-dbs-563432b7-f52d-4cfe-b952-11542594843b" (seqno  
62).
INFO:cephadm:/usr/bin/podman:stderr --> Zapping successful for: /dev/ceph-block-dbs-563432b7-f52d-4cfe-b952-11542594843b/osd-block-db-449bd001-eb32-46de-ab80-a1cbcd293d69>



--
Kai Stian Olstad
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [Spam] Re: MDS stuck in up:stopping state

2021-05-27 Thread Mark Schouten
On Thu, May 27, 2021 at 10:37:33AM +0200, Mark Schouten wrote:
> On Thu, May 27, 2021 at 07:02:16AM +, 胡 玮文 wrote:
> > You may hit https://tracker.ceph.com/issues/50112, which we failed to find 
> > the root cause yet. I resolved this by restart rank 0. (I have only 2 
> > active MDSs)
> 
> I have this exact issue while trying to upgrade from 12.2 (which is
> pending this mds issue). I don't have any active clients, restarting
> rank0 does not help.

Since I have no active clients, can I just shut down all the MDSes,
upgrade them and expect the upgrade to fix this magically? Or would
upgrading possibly break the CephFS filesystem?

-- 
Mark Schouten | Tuxis B.V.
KvK: 74698818 | http://www.tuxis.nl/
T: +31 318 200208 | i...@tuxis.nl
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephadm: How to replace failed HDD where DB is on SSD

2021-05-27 Thread Eugen Block

Hi,

The VG has 357.74GB of free space out of a total of 5.24TB, so I actually  
tried different values like "30G:", "30G", "300G:", "300G", "357G".
I also tried some crazy high numbers and some ranges, but don't  
remember the values. But none of them worked.


the size parameter is filtering the disk size, not the size you want  
the db to have (that's block_db_size). Your SSD disk size is 1.8 TB so  
your specs could look something like this:


block_db_size: 360G
data_devices:
  size: "12T:"
  rotational: 1
db_devices:
  size: ":2T"
  rotational: 0
filter_logic: AND
...

But I was under the impression that this all should of course work  
with just the rotational flags, I'm confused that it doesn't. Can you  
try with these specs to see if you get the OSD deployed? I'll try  
again with Octopus to see if I see similar behaviour.



Zitat von Kai Stian Olstad :


On 26.05.2021 18:12, Eugen Block wrote:

Could you share the output of

lsblk -o name,rota,size,type

from the affected osd node?


# lsblk -o name,rota,size,type
NAME  
 ROTA   SIZE TYPE
loop0 
1  71.3M loop
loop1 
155M loop
loop2 
1  29.9M loop
sda   
1   223G disk
├─sda1
1   512M part
├─sda2
1 1G part
└─sda3
1 221.5G part
sdb   
1  12.5T disk
└─ceph--block--1b5ad7e7--2e24--4315--8a05--7439ab782b45-osd--block--2da790bc--a74c--41da--8772--3b8aac77001c 1  12.5T  
lvm
sdc   
1  12.5T disk
└─ceph--block--44ae73e8--726f--4556--978c--8e7d6570c867-osd--block--daeb5218--c10c--45e1--a864--2d60de44e594 1  12.5T  
lvm
sdd   
1  12.5T disk
└─ceph--block--38e361f5--257f--47a5--85dc--16dbdd5fb905-osd--block--a3e1511f--8644--4c1e--a3dd--f365fcb27fc6 1  12.5T  
lvm
sde   
0   1.8T disk
├─ceph--block--dbs--563432b7--f52d--4cfe--b952--11542594843b-osd--block--db--f5d822a2--d42a--4a2c--985f--c65977f4d0200 357.7G  
lvm
├─ceph--block--dbs--563432b7--f52d--4cfe--b952--11542594843b-osd--block--db--9ed8d8a4--11bc--4669--bb8f--d66806181a7d0 357.7G  
lvm
├─ceph--block--dbs--563432b7--f52d--4cfe--b952--11542594843b-osd--block--db--0fdff890--f5e4--4762--83f6--8d02ee63c3990 357.7G  
lvm
├─ceph--block--dbs--563432b7--f52d--4cfe--b952--11542594843b-osd--block--db--facaa59d--1f44--4605--bdb8--5a7d582713230 357.7G  
lvm
└─ceph--block--dbs--563432b7--f52d--4cfe--b952--11542594843b-osd--block--db--44fbc70f--4169--41aa--9c2f--6638b226065a0 357.7G  
lvm
sdf   
0   1.8T disk
├─ceph--block--dbs--563432b7--f52d--4cfe--b952--11542594843b-osd--block--db--7c82d5e2--6ce1--49c6--9691--2e156a8fd9c00 357.7G  
lvm
├─ceph--block--dbs--563432b7--f52d--4cfe--b952--11542594843b-osd--block--db--135169cd--e0c8--4949--a137--c5e7da12bc520 357.7G  
lvm
├─ceph--block--dbs--563432b7--f52d--4cfe--b952--11542594843b-osd--block--db--fcb5b6cc--ee95--4dfa--a978--0a768da8bc660 357.7G  
lvm
├─ceph--block--dbs--563432b7--f52d--4cfe--b952--11542594843b-osd--block--db--44b74f22--d2f1--485a--971b--d89a82849c6e0 357.7G  
lvm
└─ceph--block--dbs--563432b7--f52d--4cfe--b952--11542594843b-osd--block--db--1dd5ee57--e141--40aa--b4a6--5cb20dc51cc00 357.7G  
lvm
sdg   
0   1.8T disk
├─ceph--block--dbs--563432b7--f52d--4cfe--b952--11542594843b-osd--block--db--274c882c--da8b--4251--a195--309ee9cbc36f0 357.7G  
lvm
├─ceph--block--dbs--563432b7--f52d--4cfe--b952--11542594843b-osd--block--db--087b2305--2bbe--4583--b1ca--dda7416efefc0 357.7G  
lvm

[ceph-users] Re: cephadm: How to replace failed HDD where DB is on SSD

2021-05-27 Thread Kai Stian Olstad

On 26.05.2021 22:14, David Orman wrote:

We've found that after doing the osd rm, you can use: "ceph-volume lvm
zap --osd-id 178 --destroy" on the server with that OSD as per:
https://docs.ceph.com/en/latest/ceph-volume/lvm/zap/#removing-devices
and it will clean things up so they work as expected.


With the help of Eugen I did run "cephadm ceph-volume lvm zap --destroy 
" and the LV is gone.
I think that is the same result as "ceph-volume lvm zap --osd-id 178 
--destroy" would give me?


I now have 357GB free space on the VG, but Cephadm doesn't find and use 
this space.

Above is the result of the zap command and it shows the LV is deleted.

$ sudo cephadm ceph-volume lvm zap --destroy 
/dev/ceph-block-dbs-563432b7-f52d-4cfe-b952-11542594843b/osd-block-db-449bd001-eb32-46de-ab80-a1cbcd293d69

INFO:cephadm:Inferring fsid 3614abcc-201c-11eb-995a-2794bcc75ae0
INFO:cephadm:Using recent ceph image ceph:v15.2.9
INFO:cephadm:/usr/bin/podman:stderr --> Zapping: 
/dev/ceph-block-dbs-563432b7-f52d-4cfe-b952-11542594843b/osd-block-db-449bd001-eb32-46de-ab80-a1cbcd293d69
INFO:cephadm:/usr/bin/podman:stderr Running command: /usr/bin/dd 
if=/dev/zero 
of=/dev/ceph-block-dbs-563432b7-f52d-4cfe-b952-11542594843b/osd-block-db-449bd001-eb32-46de-ab80-a1cbcd293d69 
bs=1M count=10 conv=fsync

INFO:cephadm:/usr/bin/podman:stderr  stderr: 10+0 records in
INFO:cephadm:/usr/bin/podman:stderr 10+0 records out
INFO:cephadm:/usr/bin/podman:stderr  stderr: 10485760 bytes (10 MB, 10 
MiB) copied, 0.0195532 s, 536 MB/s
INFO:cephadm:/usr/bin/podman:stderr --> More than 1 LV left in VG, will 
proceed to destroy LV only
INFO:cephadm:/usr/bin/podman:stderr --> Removing LV because --destroy 
was given: 
/dev/ceph-block-dbs-563432b7-f52d-4cfe-b952-11542594843b/osd-block-db-449bd001-eb32-46de-ab80-a1cbcd293d69
INFO:cephadm:/usr/bin/podman:stderr Running command: /usr/sbin/lvremove 
-v -f 
/dev/ceph-block-dbs-563432b7-f52d-4cfe-b952-11542594843b/osd-block-db-449bd001-eb32-46de-ab80-a1cbcd293d69
INFO:cephadm:/usr/bin/podman:stderr  stdout: Logical volume 
"osd-block-db-449bd001-eb32-46de-ab80-a1cbcd293d69" successfully removed
INFO:cephadm:/usr/bin/podman:stderr  stderr: Removing 
ceph--block--dbs--563432b7--f52d--4cfe--b952--11542594843b-osd--block--db--449bd001--eb32--46de--ab80--a1cbcd293d69 
(253:3)
INFO:cephadm:/usr/bin/podman:stderr  stderr: Archiving volume group 
"ceph-block-dbs-563432b7-f52d-4cfe-b952-11542594843b" metadata (seqno 
61).
INFO:cephadm:/usr/bin/podman:stderr  stderr: Releasing logical volume 
"osd-block-db-449bd001-eb32-46de-ab80-a1cbcd293d69"
INFO:cephadm:/usr/bin/podman:stderr  stderr: Creating volume group 
backup 
"/etc/lvm/backup/ceph-block-dbs-563432b7-f52d-4cfe-b952-11542594843b" 
(seqno 62).
INFO:cephadm:/usr/bin/podman:stderr --> Zapping successful for: /dev/ceph-block-dbs-563432b7-f52d-4cfe-b952-11542594843b/osd-block-db-449bd001-eb32-46de-ab80-a1cbcd293d69>



--
Kai Stian Olstad
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [Spam] Re: MDS stuck in up:stopping state

2021-05-27 Thread Mark Schouten
On Thu, May 27, 2021 at 07:02:16AM +, 胡 玮文 wrote:
> You may hit https://tracker.ceph.com/issues/50112, which we failed to find 
> the root cause yet. I resolved this by restart rank 0. (I have only 2 active 
> MDSs)

I have this exact issue while trying to upgrade from 12.2 (which is
pending this mds issue). I don't have any active clients, restarting
rank0 does not help.

+--+--+---+---+---+---+
| Rank |  State   |MDS |Activity  |  dns  |  inos |
+--+--+---+---+---+---+
|  0   |  active  | osdnode05 | Reqs:0 /s | 2760k | 2760k |
|  1   | stopping | osdnode06 |   |   10  |   11  |
+--+--+---+---+---+---+


-- 
Mark Schouten | Tuxis B.V.
KvK: 74698818 | http://www.tuxis.nl/
T: +31 318 200208 | i...@tuxis.nl
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephadm: How to replace failed HDD where DB is on SSD

2021-05-27 Thread Kai Stian Olstad

On 26.05.2021 18:12, Eugen Block wrote:

Could you share the output of

lsblk -o name,rota,size,type

from the affected osd node?


# lsblk -o name,rota,size,type
NAME 
 ROTA   SIZE TYPE
loop0
1  71.3M loop
loop1
155M loop
loop2
1  29.9M loop
sda  
1   223G disk
├─sda1   
1   512M part
├─sda2   
1 1G part
└─sda3   
1 221.5G part
sdb  
1  12.5T disk
└─ceph--block--1b5ad7e7--2e24--4315--8a05--7439ab782b45-osd--block--2da790bc--a74c--41da--8772--3b8aac77001c 
1  12.5T lvm
sdc  
1  12.5T disk
└─ceph--block--44ae73e8--726f--4556--978c--8e7d6570c867-osd--block--daeb5218--c10c--45e1--a864--2d60de44e594 
1  12.5T lvm
sdd  
1  12.5T disk
└─ceph--block--38e361f5--257f--47a5--85dc--16dbdd5fb905-osd--block--a3e1511f--8644--4c1e--a3dd--f365fcb27fc6 
1  12.5T lvm
sde  
0   1.8T disk
├─ceph--block--dbs--563432b7--f52d--4cfe--b952--11542594843b-osd--block--db--f5d822a2--d42a--4a2c--985f--c65977f4d020 
   0 357.7G lvm
├─ceph--block--dbs--563432b7--f52d--4cfe--b952--11542594843b-osd--block--db--9ed8d8a4--11bc--4669--bb8f--d66806181a7d 
   0 357.7G lvm
├─ceph--block--dbs--563432b7--f52d--4cfe--b952--11542594843b-osd--block--db--0fdff890--f5e4--4762--83f6--8d02ee63c399 
   0 357.7G lvm
├─ceph--block--dbs--563432b7--f52d--4cfe--b952--11542594843b-osd--block--db--facaa59d--1f44--4605--bdb8--5a7d58271323 
   0 357.7G lvm
└─ceph--block--dbs--563432b7--f52d--4cfe--b952--11542594843b-osd--block--db--44fbc70f--4169--41aa--9c2f--6638b226065a 
   0 357.7G lvm
sdf  
0   1.8T disk
├─ceph--block--dbs--563432b7--f52d--4cfe--b952--11542594843b-osd--block--db--7c82d5e2--6ce1--49c6--9691--2e156a8fd9c0 
   0 357.7G lvm
├─ceph--block--dbs--563432b7--f52d--4cfe--b952--11542594843b-osd--block--db--135169cd--e0c8--4949--a137--c5e7da12bc52 
   0 357.7G lvm
├─ceph--block--dbs--563432b7--f52d--4cfe--b952--11542594843b-osd--block--db--fcb5b6cc--ee95--4dfa--a978--0a768da8bc66 
   0 357.7G lvm
├─ceph--block--dbs--563432b7--f52d--4cfe--b952--11542594843b-osd--block--db--44b74f22--d2f1--485a--971b--d89a82849c6e 
   0 357.7G lvm
└─ceph--block--dbs--563432b7--f52d--4cfe--b952--11542594843b-osd--block--db--1dd5ee57--e141--40aa--b4a6--5cb20dc51cc0 
   0 357.7G lvm
sdg  
0   1.8T disk
├─ceph--block--dbs--563432b7--f52d--4cfe--b952--11542594843b-osd--block--db--274c882c--da8b--4251--a195--309ee9cbc36f 
   0 357.7G lvm
├─ceph--block--dbs--563432b7--f52d--4cfe--b952--11542594843b-osd--block--db--087b2305--2bbe--4583--b1ca--dda7416efefc 
   0 357.7G lvm
├─ceph--block--dbs--563432b7--f52d--4cfe--b952--11542594843b-osd--block--db--3a97a94b--92bd--41d9--808a--b79aa210fe11 
   0 357.7G lvm
└─ceph--block--dbs--563432b7--f52d--4cfe--b952--11542594843b-osd--block--db--441521b4--52ce--4a03--b5b2--da9392d037bc 
   0 357.7G lvm
sdi  
1  12.5T disk
└─ceph--block--1c122cd0--b28a--4409--b17b--535d95029dda-osd--block--9a6456d9--4c23--4585--a6df--85eda27ae651 
1  12.5T lvm
sdj  
1  12.5T disk
└─ceph--block--4cbe20eb--6aa3--4e4a--bdda--d556144e2a83-osd--block--2fd1bc6c--0eb3--48aa--b495--db0423b5be28 
1  12.5T lvm
sdk  

[ceph-users] Re: best practice balance mode in HAproxy in front of RGW?

2021-05-27 Thread Boris Behrens
On Thu., 27 May 2021 at 07:47, Janne Johansson wrote:
>
> On Wed 26 May 2021 at 16:33, Boris Behrens wrote:
> >
> > Hi Janne,
> > do you know if there can be data duplication which leads to orphan objects?
> >
> > I am currently hunting strange errors (there is a lot more data in the
> > pool than is accessible via rgw) and want to be sure it doesn't come
> > from the HAproxy.
>
> No, I don't think the HAProxy (or any other load balancing setup) in
> itself would
> cause a lot of orphans. Or in reverse, the multipart stateless way S3
> acts always
> allows for half-uploads and broken connections which would leave orphans even
> if you did not have HAProxy in between, and in both cases you should
> periodically
> run the orphan finding commands and trim usage logs you no longer
> require and so on.
>
>
> --
> May the most significant bit of your life be positive.

Well, this drops a lot of pressure from my shoulders.
Is there a way to reduce the probability of creating orphan objects?
We use s3 for rbd backups (create snapshot, compress it and then copy
it to s3 via s3cmd) and we created 25m orphan objects in 4 weeks.
If there is any option / best practice I can apply, I will happily use it :)
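
One thing I will check on our side (not sure yet if it is the main source) is
incomplete multipart uploads left behind by aborted s3cmd transfers, e.g. (the
bucket name is just an example):

s3cmd multipart s3://my-backup-bucket
s3cmd abortmp s3://my-backup-bucket/<object> <upload-id>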
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: MDS cache tunning

2021-05-27 Thread Dan van der Ster
I don't think # clients alone is a good measure by which to decide to
deploy multiple MDSs -- idle clients create very little load, but just
a few badly behaving clients can use all the MDS performance. (If you
must hear a number, I can share that we have single MDSs with 2-3000
clients connected.)

To detect an overloaded MDS, in most cases the users will notice that
metadata ops are becoming slow -- so simple things like ls, mv, or
creating files will become slow. The MDS cpu usage will also be very
high -- (note that the MDS is not multithreaded, so at most it will
saturate somewhere between 100-200%). There are also some op latency
metrics in the mds perf dump you can observe -- if simple ops are
taking more than a few milliseconds then this is another indication
that the MDS load is too high.
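
For example, on the MDS host (a sketch; the daemon name depends on your deployment):

ceph daemon mds.<name> perf dump mds_server

or simply dump everything and look at the latency counters:

ceph daemon mds.<name> perf dump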

Next, if you do find that you need multiple MDSs, and if you
understand the client workload very well, it is best if you can use
subtree pinning to statically pin sub directories to a particular MDS
rank. See https://ceph.io/community/new-luminous-cephfs-subtree-pinning/
Without pinning, the MDSs will use a heuristic to move subtrees
between themselves -- this doesn't always work very well for all
workloads, and can sometimes cause more harm than good.

Cheers, Dan



On Thu, May 27, 2021 at 9:30 AM Andres Rojas Guerrero  wrote:
>
> Oh, very interesting!! I have reduced the number of MDS to one. Only one
> question more,  out of curiosity, from what number can we consider that
> there are many clients?
>
>
>
> > On 27/5/21 at 9:24, Dan van der Ster wrote:
> > On Thu, May 27, 2021 at 9:21 AM Andres Rojas Guerrero  
> > wrote:
> >>
> >>
> >>
> >> El 26/5/21 a las 16:51, Dan van der Ster escribió:
> >>> I see you have two active MDSs. Is your cluster more stable if you use
> >>> only one single active MDS?
> >>
> >> Good question!! I read form Ceph Doc:
> >>
> >> "You should configure multiple active MDS daemons when your metadata
> >> performance is bottlenecked on the single MDS that runs by default."
> >>
> >> "Workloads that typically benefit from a larger number of active MDS
> >> daemons are those with many clients, perhaps working on many separate
> >> directories."
> >>
> >> I have more or less 25 concurrent clients, but working in the same
> >> directory, Is that number a lot of clients?
> >>
> >> And I assumed that two are always better than one.
> >
> > 25 isn't many clients, but if they are operating in the same directory
> > it will create a lot of contention between the two MDSs, which might
> > explain some of the issues you observe.
> > I recommend that you reduce back to 1 active mds and observe the
> > system stability and performance.
> >
> > -- dan
> >
>
> --
> ***
> Andrés Rojas Guerrero
> Unidad Sistemas Linux
> Area Arquitectura Tecnológica
> Secretaría General Adjunta de Informática
> Consejo Superior de Investigaciones Científicas (CSIC)
> Pinar 19
> 28006 - Madrid
> Tel: +34 915680059 -- Ext. 990059
> email: a.ro...@csic.es
> ID comunicate.csic.es: @50852720l:matrix.csic.es
> ***
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: MDS cache tunning

2021-05-27 Thread Andres Rojas Guerrero
Oh, very interesting!! I have reduced the number of MDS to one. Only one 
more question, out of curiosity: at what number can we consider that 
there are many clients?




On 27/5/21 at 9:24, Dan van der Ster wrote:

On Thu, May 27, 2021 at 9:21 AM Andres Rojas Guerrero  wrote:




El 26/5/21 a las 16:51, Dan van der Ster escribió:

I see you have two active MDSs. Is your cluster more stable if you use
only one single active MDS?


Good question!! I read in the Ceph docs:

"You should configure multiple active MDS daemons when your metadata
performance is bottlenecked on the single MDS that runs by default."

"Workloads that typically benefit from a larger number of active MDS
daemons are those with many clients, perhaps working on many separate
directories."

I have roughly 25 concurrent clients, but they all work in the same
directory. Is that a lot of clients?

And I assumed that two are always better than one.


25 isn't many clients, but if they are operating in the same directory
it will create a lot of contention between the two MDSs, which might
explain some of the issues you observe.
I recommend that you reduce back to 1 active mds and observe the
system stability and performance.

-- dan



--
***
Andrés Rojas Guerrero
Unidad Sistemas Linux
Area Arquitectura Tecnológica
Secretaría General Adjunta de Informática
Consejo Superior de Investigaciones Científicas (CSIC)
Pinar 19
28006 - Madrid
Tel: +34 915680059 -- Ext. 990059
email: a.ro...@csic.es
ID comunicate.csic.es: @50852720l:matrix.csic.es
***


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: MDS cache tuning

2021-05-27 Thread Dan van der Ster
On Thu, May 27, 2021 at 9:21 AM Andres Rojas Guerrero  wrote:
>
>
>
> El 26/5/21 a las 16:51, Dan van der Ster escribió:
> > I see you have two active MDSs. Is your cluster more stable if you use
> > only one single active MDS?
>
> Good question!! I read in the Ceph docs:
>
> "You should configure multiple active MDS daemons when your metadata
> performance is bottlenecked on the single MDS that runs by default."
>
> "Workloads that typically benefit from a larger number of active MDS
> daemons are those with many clients, perhaps working on many separate
> directories."
>
> I have roughly 25 concurrent clients, but they all work in the same
> directory. Is that a lot of clients?
>
> And I assumed that two are always better than one.

25 isn't many clients, but if they are operating in the same directory
it will create a lot of contention between the two MDSs, which might
explain some of the issues you observe.
I recommend that you reduce back to 1 active mds and observe the
system stability and performance.
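
For reference, going back to a single active MDS is a one-liner (the
filesystem name is a placeholder):

ceph fs set <fs_name> max_mds 1

The extra rank then enters up:stopping and, once it has handed off its
subtrees, that daemon drops back into the standby pool.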

-- dan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: MDS cache tuning

2021-05-27 Thread Andres Rojas Guerrero



El 26/5/21 a las 16:51, Dan van der Ster escribió:

I see you have two active MDSs. Is your cluster more stable if you use
only one single active MDS?


Good question!! I read in the Ceph docs:

"You should configure multiple active MDS daemons when your metadata 
performance is bottlenecked on the single MDS that runs by default."


"Workloads that typically benefit from a larger number of active MDS 
daemons are those with many clients, perhaps working on many separate 
directories."


I have roughly 25 concurrent clients, but they all work in the same
directory. Is that a lot of clients?


And I assumed that two are always better than one.





--
***
Andrés Rojas Guerrero
Unidad Sistemas Linux
Area Arquitectura Tecnológica
Secretaría General Adjunta de Informática
Consejo Superior de Investigaciones Científicas (CSIC)
Pinar 19
28006 - Madrid
Tel: +34 915680059 -- Ext. 990059
email: a.ro...@csic.es
ID comunicate.csic.es: @50852720l:matrix.csic.es
***


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: MDS stuck in up:stopping state

2021-05-27 Thread 胡 玮文
Hi Martin,

You may be hitting https://tracker.ceph.com/issues/50112, for which we have
not yet found the root cause. I resolved this by restarting rank 0. (I have
only 2 active MDSs.)
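
(For anyone hitting the same issue: "restarting rank 0" just means
restarting whichever MDS daemon currently holds rank 0 in "ceph fs
status". The exact command depends on how the daemon was deployed; the
names below are placeholders:

ceph orch daemon restart mds.<fs>.<host>.<id>   # cephadm-managed cluster
systemctl restart ceph-mds@<id>                 # package install, run on the MDS host

A standby MDS should take over rank 0 while the restarted daemon comes
back up.)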

Weiwen Hu

Sent from Mail for Windows 10

From: Martin Rasmus Lundquist Hansen
Sent: 27 May 2021 14:26
To: ceph-users@ceph.io
Subject: [ceph-users] MDS stuck in up:stopping state

After scaling the number of MDS daemons down, we now have a daemon stuck in
the "up:stopping" state. The documentation says it can take several minutes
to stop the daemon, but it has been stuck in this state for almost a full
day. According to the "ceph fs status" output attached below, it still holds
information about 2 inodes, which we assume is the reason why it cannot stop
completely.

Does anyone know what we can do to finally stop it?


cephfs - 71 clients
==
RANK   STATEMDS ACTIVITY DNSINOS
 0 active   ceph-mon-01  Reqs:0 /s  15.7M  15.4M
 1 active   ceph-mon-02  Reqs:   48 /s  19.7M  17.1M
 2stopping  ceph-mon-030  2
  POOL TYPE USED  AVAIL
cephfs_metadata  metadata   652G   185T
  cephfs_data  data1637T   539T
   STANDBY MDS
ceph-mon-03-mds-2
MDS version: ceph version 15.2.11 (e3523634d9c2227df9af89a4eac33d16738c49cb) 
octopus (stable)
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io