[ceph-users] Re: Make Ceph available over VPN?

2022-11-08 Thread Robert Sander

On 08.11.22 00:17, Sagittarius-A Black Hole wrote:
All the ceph nodes are part of the vpn network, so all of the nodes can 
be reached: in tailscale, each host gets an additional vpn ip and can be 
reached over tailscale from the individual client systems (laptops) when 
out of the office.


Ah, OK. But AFAIK the Ceph components (MON, OSD, MGR, MDS, etc.) bind to 
one IP in the public network (OSDs additionally bind to an IP in the 
cluster network), and this is recorded in the cluster map. This way all 
the components find each other, because they get the cluster map from the MONs.


The clients also retrieve the cluster map from the MONs and use that 
information to talk to the OSDs.


All components only register one IP.

How would a client decide which IP to talk to if there were multiple IPs 
per MON or OSD and the client were not within one of these networks?
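
You can check what is actually recorded in the cluster map yourself (a 
quick, read-only look):

ceph mon dump                    # MON addresses as stored in the monmap
ceph osd dump | grep '^osd\.'    # the address each OSD has registered

Each daemon shows up there with exactly one public IP.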


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Best practice for removing failing host from cluster?

2022-11-09 Thread Robert Sander

On 10.11.22 03:19, Matt Larson wrote:


Should I use `ceph orch osd rm XX` for each of the OSDs of this host
or should I set the weights of each of the OSDs as 0?  Can I do this while
the host is offline, or should I bring it online first before setting
weights or using `ceph orch osd rm`?


I would set all OSDs of this host to "out" first.
This way the cluster still knows about them and is able to utilize them 
when doing the data movement to the other OSDs.


After they are really empty you can purge them and remove the host from 
the cluster.
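
A rough sketch of the sequence, assuming a cephadm-managed cluster (OSD 
IDs and host name are examples):

ceph osd out 11 12 13 14            # mark all OSDs of the failing host out
ceph osd safe-to-destroy 11         # repeat per OSD until it reports safe
ceph osd purge 11 --yes-i-really-mean-it
ceph orch host rm failinghost       # once all its OSDs are purged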


Regards
--
Robert Sander
Heinlein Support GmbH
Linux: Akademie - Support - Hosting
http://www.heinlein-support.de

Tel: 030-405051-43
Fax: 030-405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein  -- Sitz: Berlin

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Tuning CephFS on NVME for HPC / IO500

2022-11-30 Thread Robert Sander

Hi,

On 2022-12-01 8:26, Manuel Holtgrewe wrote:

The Ceph cluster nodes have 10x enterprise NVMEs each (all branded as 
"Dell enterprise disks"), 8 older nodes (last year) have "Dell Ent NVMe 
v2 AGN RI U.2 15.36TB" which are Samsung disks, 2 newer nodes (just 
delivered) have "Dell Ent NVMe CM6 RI 15.36TB" which are Kioxia disks.


Does the "RI" stand for read-intensive?

I think you need mixed-use flash storage for a Ceph cluster as it has 
many random write accesses.


Regards
--
Robert Sander
Heinlein Support GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 93818 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: OMAP data growth

2022-12-05 Thread Robert Sander

Am 02.12.22 um 21:09 schrieb Wyll Ingersoll:


   *   What is causing the OMAP data consumption to grow so fast and can it be 
trimmed/throttled?


S3 is a heavy user of OMAP data. RBD and CephFS not so much.
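
You can see where the OMAP data lives with a quick, read-only check:

ceph osd df tree    # the OMAP column shows the per-OSD omap usage

OSDs carrying the RGW index pool will typically stand out there.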

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 220009 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein -- Sitz: Berlin

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph Reweight Only works in same host?

2022-12-18 Thread Robert Sander

Hi,

Am 19.12.22 um 08:28 schrieb Isaiah Tang Yue Shun:


Currently, I have 3 OSDs in 3 different hosts:

# ceph osd tree:

ID  CLASS  WEIGHT   TYPE NAME       STATUS  REWEIGHT  PRI-AFF
-1         0.03908  root default
-3         0.00980      host node1
 0    hdd  0.00980          osd.0       up       1.0      1.0
-9               0      host node2
-5         0.00980      host node3
 1    hdd  0.00980          osd.1       up       1.0      1.0
-7         0.01949      host node4
 2    hdd  0.01949          osd.2       up       1.0      1.0

When I insert data into the pool, the data is equally distributed on 3 OSDs. (3 
replicas)
However, I want the osd.2 to hold more data since it has more space.


What additional data should OSD 2 hold?

The pool has a size of 3, i.e. all data objects have three copies.

The cluster only has 3 OSDs, so each OSD gets one copy.

In this configuration your available capacity is that of the smallest 
OSD. You cannot use additional space in larger OSDs.


Such a heterogeneous setup is only possible with a large number of OSDs 
where the placement groups can be assigned more flexibly to the OSDs.
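
You can verify this with the existing tools (pool name is an example):

ceph osd pool get mypool size    # shows the replica count
ceph df                          # MAX AVAIL of the pool is limited by the fullest OSD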


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 220009 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein -- Sitz: Berlin

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph filesystem

2022-12-19 Thread Robert Sander

Am 19.12.22 um 14:19 schrieb akshay sharma:


sudo ceph auth get-or-create client.user mon 'allow r' mds 'allow r, allow
rw path=/home/cephfs' osd 'allow rw pool=cephfs_data' -o
/etc/ceph/ceph.client.user.keyring


The path for this command is relative to the root of the CephFS, usually 
just /.


You should also switch to "ceph fs authorize" for creating cephx keys 
for CephFS usage.
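
A minimal sketch, assuming the filesystem is called "cephfs" and the 
directory /home/cephfs exists below the CephFS root:

ceph fs authorize cephfs client.user /home/cephfs rw

This creates a cephx key for client.user that is restricted to that subtree.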


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 220009 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein -- Sitz: Berlin

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph filesystem

2022-12-20 Thread Robert Sander

On 20.12.22 08:38, akshay sharma wrote:


Now, I'm able to copy files from the same machine.. basically copy file
from home to /mnt/cephfs is working but when copying from remote machine
using SFTP or SCP to /mnt/cephfs is not working.


What account are you using when locally copying files?
What account are you using when doing the same via SCP?

What POSIX access rights do these accounts have in the filesystem?

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph filesystem

2022-12-20 Thread Robert Sander

On 20.12.22 10:21, akshay sharma wrote:

With account you mean the user?


If yes, then we are using different user..the mds auth is created with 
client.user..


This is the cephx key that is only used when mounting the filesystem.

while copying we are logging in as user "test", but locally with
user "test" we are able to copy.
The remote machine is also using user "test".

ls -lrth
Drwxr-xr-x 1 root root mycephfs

Mount is changing the permission to root/root.


With a Unix account "test" you usually cannot write into a directory 
that is owned by "root" and only writable by "root".


You will need to chown or chgrp and chmod directories and/or files if 
you want to change them. This is basic POSIX permissions management.
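
A minimal sketch, run as root on a client where the CephFS is mounted at 
/mnt/cephfs ("test" is the user from this thread):

chown test:test /mnt/cephfs
chmod 755 /mnt/cephfs

After that the user "test" can create files below the mount point, also 
via SCP or SFTP.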


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Setting Prometheus retention_time

2023-01-08 Thread Robert Sander

Hi,

The Quincy documentation shows that we could set the Prometheus 
retention_time within a service specification:


https://docs.ceph.com/en/quincy/cephadm/services/monitoring/#setting-up-prometheus
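
The service specification in question would look roughly like this (the 
retention value is an example taken from the linked documentation):

service_type: prometheus
placement:
  count: 1
spec:
  retention_time: "1y"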

When trying this "ceph orch apply" only shows:

Error EINVAL: ServiceSpec: __init__() got an unexpected keyword argument 
'retention_time'


It looks like release 17.2.5 does not contain this code yet.

Why is the content of the documentation already online when 
https://github.com/ceph/ceph/pull/47943 has not been released yet?


Regards
--
Robert Sander
Heinlein Support GmbH
Linux: Akademie - Support - Hosting
http://www.heinlein-support.de

Tel: 030-405051-43
Fax: 030-405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein  -- Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Current min_alloc_size of OSD?

2023-01-11 Thread Robert Sander
Hi,

Ceph 16 Pacific introduced a new smaller default min_alloc_size of 4096 bytes 
for HDD and SSD OSDs.

How can I get the current min_alloc_size of OSDs that were created with older 
Ceph versions? Is there a command that shows this info from the on-disk format 
of a BlueStore OSD?

Regards
-- 
Robert Sander
Heinlein Support GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 93818 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Current min_alloc_size of OSD?

2023-01-12 Thread Robert Sander

On 11.01.23 23:47, Anthony D'Atri wrote:


It’s printed in the OSD log at startup.


But which info is it exactly?

This line looks like it reports the block_size of the device:

  bdev(0x55b50a2e5800 /var/lib/ceph/osd/ceph-0/block) open size 107369988096 
(0x18ffc00000, 100 GiB) block_size 4096 (4 KiB) non-rotational discard supported

Is it this line?

  bluefs _init_alloc shared, id 1, capacity 0x18ffc00000, block size 0x10000

  0x10000 equals 65536 aka 64K in decimal.

Is it this line?

  bluestore(/var/lib/ceph/osd/ceph-0) _open_super_meta min_alloc_size 0x1000

Or this one?

  bluestore(/var/lib/ceph/osd/ceph-0) _init_alloc loaded 100 GiB in 1 extents, 
allocator type hybrid, capacity 0x18ffc00000, block size 0x1000, free 
0x18ffbfd000, fragmentation 0


I don’t immediately see it in `ceph osd metadata` ; arguably it should be there.


An entry in ceph osd metadata would be great to have.


`config show` on the admin socket I suspect does not show the existing value.


This shows the value that is currently set in the configuration, not 
necessarily the value the OSD was created with.

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [solved] Current min_alloc_size of OSD?

2023-01-12 Thread Robert Sander

Hi,

On 12.01.23 11:11, Gerdriaan Mulder wrote:


On 12/01/2023 10.26, Robert Sander wrote:

Is it this line?

    bluestore(/var/lib/ceph/osd/ceph-0) _open_super_meta min_alloc_size
0x1000


That seems to be it:
https://github.com/ceph/ceph/blob/v15.2.17/src/os/bluestore/BlueStore.cc#L11754-L11755


Thanks for the confirmation.

So one can grep for "min_alloc_size" in the OSD's log output.
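
For example, on a cephadm-managed cluster (the OSD id is an example):

cephadm logs --name osd.0 | grep min_alloc_size

With classic packages you can grep /var/log/ceph/ceph-osd.0.log instead. 
The line is only written at OSD startup, so you may have to restart the 
OSD before it appears in the current log.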

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: CephFS: Questions regarding Namespaces, Subvolumes and Mirroring

2023-01-12 Thread Robert Sander

On 12.01.23 17:13, Jonas Schwab wrote:


rbd namespace ls  --format=json
      But the latter command just returns an empty list. Are the
namespaces used for rdb and CephFS different ones?


RBD and CephFS are different interfaces. You would need to use rados to 
list all objects and their namespaces. I have not found a way to only 
list namespaces of a pool.


root@cephtest20:~# rados -p .nfs --all ls
nfs01   rec-0049:nfs.nfs01.0
nfs01   export-1
nfs01   rec-002a:nfs.nfs01.1
nfs01   conf-nfs.nfs01
nfs01   rec-0049:nfs.nfs01.1
nfs01   rec-0013:nfs.nfs01.1
nfs01   grace
nfs01   rec-0008:nfs.nfs01.1
nfs01   rec-0049:nfs.nfs01.2
nfs01   rec-0029:nfs.nfs01.2
nfs01   rec-0010:nfs.nfs01.1
nfs01   rec-0011:nfs.nfs01.2
nfs01   rec-0009:nfs.nfs01.0
root@cephtest20:~# rbd namespace ls .nfs
root@cephtest20:~#

Where "nfs01" is a namespace in the pool .nfs

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Current min_alloc_size of OSD?

2023-01-13 Thread Robert Sander

Hi,

Am 13.01.23 um 14:35 schrieb Konstantin Shalygin:


ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-0/ get S min_alloc_size


This only works when the OSD is not running.
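
A rough sketch for a package-based (non-containerized) OSD, id 0 as an 
example:

systemctl stop ceph-osd@0
ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-0/ get S min_alloc_size
systemctl start ceph-osd@0

On a cephadm cluster you would stop the daemon with "ceph orch daemon 
stop osd.0" and run the tool inside "cephadm shell --name osd.0" instead.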

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 220009 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein -- Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Dashboard access to CephFS snapshots

2023-01-17 Thread Robert Sander

Hi,

The dashboard has a simple CephFS browser where we can set
quota and snapshots for the directories.

When a directory has the "other" permission bits unset, i.e.
only access for user and group, the dashboard displays an error:

Failed to execute CephFS
opendir failed at /path/to/dir/.snap: Permission denied [Errno 13]

It can be reproduced in Ceph 17.2.5 by creating the directory
and using "chmod o= /path/to/dir" to not allow "other".

How does the dashboard access the contents of the CephFS?
It looks like the MGR uses something like the nobody account.

Regards
--
Robert Sander
Heinlein Support GmbH
Linux: Akademie - Support - Hosting
http://www.heinlein-support.de

Tel: 030-405051-43
Fax: 030-405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein  -- Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] 17.2.5 ceph fs status: AssertionError

2023-01-18 Thread Robert Sander

Hi,

I have a healthy (test) cluster running 17.2.5:

root@cephtest20:~# ceph status
  cluster:
    id:     ba37db20-2b13-11eb-b8a9-871ba11409f6
    health: HEALTH_OK

  services:
    mon:         3 daemons, quorum cephtest31,cephtest41,cephtest21 (age 2d)
    mgr:         cephtest22.lqzdnk(active, since 4d), standbys: cephtest32.ybltym, cephtest42.hnnfaf
    mds:         1/1 daemons up, 1 standby, 1 hot standby
    osd:         48 osds: 48 up (since 4d), 48 in (since 4M)
    rgw:         2 daemons active (2 hosts, 1 zones)
    tcmu-runner: 6 portals active (3 hosts)

  data:
    volumes: 1/1 healthy
    pools:   17 pools, 513 pgs
    objects: 28.25k objects, 4.7 GiB
    usage:   26 GiB used, 4.7 TiB / 4.7 TiB avail
    pgs:     513 active+clean

  io:
    client:   4.3 KiB/s rd, 170 B/s wr, 5 op/s rd, 0 op/s wr

CephFS is mounted and can be used without any issue.

But I get an error when I when querying its status:

root@cephtest20:~# ceph fs status
Error EINVAL: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/mgr_module.py", line 1757, in _handle_command
return CLICommand.COMMANDS[cmd['prefix']].call(self, cmd, inbuf)
  File "/usr/share/ceph/mgr/mgr_module.py", line 462, in call
return self.func(mgr, **kwargs)
  File "/usr/share/ceph/mgr/status/module.py", line 159, in handle_fs_status
assert metadata
AssertionError


The dashboard's filesystem page shows no error and displays
all information about cephfs.

Where does this AssertionError come from?

Regards
--
Robert Sander
Heinlein Support GmbH
Linux: Akademie - Support - Hosting
http://www.heinlein-support.de

Tel: 030-405051-43
Fax: 030-405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein  -- Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: 17.2.5 ceph fs status: AssertionError

2023-01-18 Thread Robert Sander

Am 18.01.23 um 10:12 schrieb Robert Sander:


root@cephtest20:~# ceph fs status
Error EINVAL: Traceback (most recent call last):
   File "/usr/share/ceph/mgr/mgr_module.py", line 1757, in _handle_command
     return CLICommand.COMMANDS[cmd['prefix']].call(self, cmd, inbuf)
   File "/usr/share/ceph/mgr/mgr_module.py", line 462, in call
     return self.func(mgr, **kwargs)
   File "/usr/share/ceph/mgr/status/module.py", line 159, in 
handle_fs_status

     assert metadata
AssertionError


After restarting all MDS daemons the AssertionError is gone, ceph fs 
status shows the filesystem status again. Strange.
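
For reference, on a cephadm-managed cluster that restart can be done with 
(the service name is an example, it usually follows the filesystem name):

ceph orch restart mds.cephfs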


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 220009 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein -- Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Pools and classes

2023-01-23 Thread Robert Sander

Am 23.01.23 um 09:44 schrieb Massimo Sgaravatto:


This triggered the remapping of some pgs and therefore some data movement.
Is this normal/expected, since for the time being I have only hdd osds ?


This is expected behaviour as the cluster map has changed. Internally 
the device classes are represented through "shadow" trees of the cluster 
topology.
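
You can display these shadow trees with:

ceph osd crush tree --show-shadow

The device-class specific hierarchies show up with names like "default~hdd".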


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 220009 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein -- Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: deploying Ceph using FQDN for MON / MDS Services

2023-01-24 Thread Robert Sander

Hi,

On 24.01.23 15:02, Lokendra Rathour wrote:


My /etc/ceph/ceph.conf is as follows:

[global]
fsid = 7969b8a3-1df7-4eae-8ccf-2e5794de87fe
mon host = 
[v2:[abcd:abcd:abcd::21]:3300,v1:[abcd:abcd:abcd::21]:6789],[v2:[abcd:abcd:abcd::22]:3300,v1:[abcd:abcd:abcd::22]:6789],[v2:[abcd:abcd:abcd::23]:3300,v1:[abcd:abcd:abcd::23]:6789]


Does this ceph.conf also exist on the hosts that want to mount the 
filesystem? Then you do not need to specify a MON host or IP when 
mounting CephFS. Just do


mount -t ceph -o name=admin,secret=XXX :/ /backup

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: deploying Ceph using FQDN for MON / MDS Services

2023-01-24 Thread Robert Sander

Hi,

you can also use SRV records in DNS to publish the IPs of the MONs.

Read https://docs.ceph.com/en/quincy/rados/configuration/mon-lookup-dns/ 
for more info.


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph 16.2.10 cluster down

2023-01-26 Thread Robert Sander

Hi,

On 26.01.23 12:46, Jens Galsgaard wrote:


Setup is:
3 hosts with each 12 disks (osd/mon)
3 vm's with mon/mds/mgr

The vm's are unavailable at the moment and one of the hosts is online with 
osd/mon running.



You have only one out of six MONs running. This MON is unable to form a 
quorum.


Best would be to start the other MONs so that you have at least 4 
running. They could form a quorum and then the cluster will respond again.


If that is not possible and you want to recover from a catastrophic 
failure you need to manually edit the MON map (with monmaptool) and 
remove all but the running MON from it. Then this MON will only see 
itself as active in the cluster and form the quorum.


https://docs.ceph.com/en/quincy/rados/troubleshooting/troubleshooting-mon

https://docs.ceph.com/en/quincy/rados/operations/add-or-rm-mons/#removing-monitors
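
A rough sketch of that procedure for a non-containerized MON (the MON id 
and paths are examples, the MON has to be stopped while you do this):

ceph-mon -i mon1 --extract-monmap /tmp/monmap
monmaptool /tmp/monmap --print
monmaptool /tmp/monmap --rm mon2
monmaptool /tmp/monmap --rm mon3
ceph-mon -i mon1 --inject-monmap /tmp/monmap

Then start the MON again.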

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph 16.2.10 cluster down

2023-01-26 Thread Robert Sander

Hi Jens,

On 26.01.23 16:17, Jens Galsgaard wrote:


After removing the dead monitors with the monmaptool the mon container has 
vanished from podman.
So this somehow made things worse.


You have not mentioned that you are running Ceph in containers.

The procedure to repair the MON map may look a little bit different in 
that case. The documentation only shows the non-containerized procedure.


Do you still have the systemd unit for the last working MON?

Run "systemctl --all | grep mon" to look for it. If it is stop try to 
start it. Does the MON run?



Is it possible to create and add  new monitors? Re-bootstrap the cluster in 
lack of better terms?


It should be possible to restore the MON db using an existing OSD:

https://docs.ceph.com/en/pacific/rados/troubleshooting/troubleshooting-mon/#recovery-using-osds

This procedure would also have to be adapted for a containerized setup.

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Any ceph constants available?

2023-02-06 Thread Robert Sander

On 04.02.23 00:02, Thomas Cannon wrote:


Boreal-01 - the host - 17.2.5:



Boreal-02 - 15.2.6:



Boreal-03 - 15.2.8:



And the host I added - Boreal-04 - 17.2.5:


This is a wild mix of versions. Such a situation may exist during an 
upgrade but not when operating normally or extending the cluster.


Please show the output of "ceph versions".

Regards
--
Robert Sander
Heinlein Support GmbH
Linux: Akademie - Support - Hosting
http://www.heinlein-support.de

Tel: 030-405051-43
Fax: 030-405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein  -- Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Inconsistency in rados ls

2023-02-06 Thread Robert Sander

On 04.02.23 20:54, Ramin Najjarbashi wrote:


ceph df | grep mypoo

--- POOLS ---

POOL OBJECTS

mypool   1.11G

---

  and from this, I got 8.8M objects :

for item in `radosgw-admin user list | jq -r ".[]" | head`; do
B_OBJ=$(radosgw-admin user stats --uid $item 2>/dev/null | jq -r '.stats |
select(.num_objects > 0) | .num_objects'); SUM=$((SUM + B_OBJ)); done


You have mixed RADOS objects and S3 objects.

These are two different layers. Only small (< 4MB) S3 objects are stored 
in a single RADOS object. Larger S3 objects are split into multiple 4MB-sized 
RADOS objects by the rados-gateway.


This is why you see many more RADOS objects than S3 objects.
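As a rough example: a single 100 MiB S3 object ends up as about 
100 / 4 = 25 RADOS objects (one head object plus tail objects; the exact 
count depends on the configured stripe size).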

Regards
--
Robert Sander
Heinlein Support GmbH
Linux: Akademie - Support - Hosting
http://www.heinlein-support.de

Tel: 030-405051-43
Fax: 030-405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein  -- Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Removing Rados Gateway in ceph cluster

2023-02-06 Thread Robert Sander

On 06.02.23 13:48, Michel Niyoyita wrote:


root@ceph-mon1:~# ceph -v
ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894) pacific
(stable)


This is the version of the command line tool "ceph".

Please run "ceph versions" to show the version of the running Ceph daemons.

Regards
--
Robert Sander
Heinlein Support GmbH
Linux: Akademie - Support - Hosting
http://www.heinlein-support.de

Tel: 030-405051-43
Fax: 030-405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein  -- Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: recovery for node disaster

2023-02-13 Thread Robert Sander

Am 13.02.23 um 06:31 schrieb farhad kh:


Is it possible to recover data when two nodes with all physical disks are
lost for any reason?


You have one copy of each object on each node and each node runs a MON.

If two nodes fail then the cluster will cease to function as the 
remaining MON will not be able to gain quorum.


In this worst case you would need to manually edit the MON map and 
remove the two failed MONs. The remaining MON will then be "lonely" and 
will be able to reach quorum with itself. The cluster will work again.


At this moment the data will be available again, but read-only.

This is because there are fewer than "min_size" object copies available.

The next step would be to add new nodes (and MONs). Reduce min_size for 
each pool to 1 to tell the cluster that it should recover from the 
last remaining copy.


After that has been done increase min_size to 2 again.
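
The min_size changes are made per pool (the pool name is an example):

ceph osd pool set mypool min_size 1    # only for the duration of the recovery
ceph osd pool set mypool min_size 2    # back to a safe value afterwards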

While recovery runs there is an increased risk to lose data when a disk 
in the remaining node fails.



What is the maximum number of fault tolerance for the cluster?


Such a cluster can stand the loss of two nodes without data loss, 
provided no disk in the remaining node fails.


To increase fault tolerance you need to streamline your processes and 
replace a failed node immediately before the next one fails. In such 
small clusters each consecutive failure can lead to data loss.


Best would be to add more nodes.

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 220009 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein -- Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: how to sync data on two site CephFS

2023-02-16 Thread Robert Sander

Hi,

On 16.02.23 12:53, zxcs wrote:


we  already have a CephFS cluster, called A,  and now we want to setup another 
CephFS cluster(called B) in other site.
And we need to  synchronize data with each other for some directory(if all 
directory can synchronize , then very very good), Means when we write a file in 
A cluster, this file can auto sync to B cluster, and when we create a file or 
directory on B Cluster, this file or directory can auto sync to A Cluster.



Ceph has CephFS snapshot mirroring: 
https://docs.ceph.com/en/latest/cephfs/cephfs-mirroring/
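
Roughly, enabling it on the source cluster looks like this (filesystem 
name and path are examples; in addition a cephfs-mirror daemon has to be 
deployed and the target cluster added as a peer, see the linked 
documentation):

ceph mgr module enable mirroring
ceph fs snapshot mirror enable cephfs
ceph fs snapshot mirror add cephfs /some/directory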


But this is a one way mirror. It only supports A -> B.

You need a two way sync. There is software like unison available for 
that task: https://en.wikipedia.org/wiki/Unison_(software)


If you do not have too many or too large directories you could let 
unison run regularly. But it will bail on conflicts, meaning it has to 
ask what to do if a file has been changed on both sides.


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Undo "radosgw-admin bi purge"

2023-02-20 Thread Robert Sander

Hi,

There is an operation "radosgw-admin bi purge" that removes all bucket 
index objects for one bucket in the rados gateway.


What is the undo operation for this?

After this operation the bucket cannot be listed or removed any more.

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 220009 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein -- Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Undo "radosgw-admin bi purge"

2023-02-22 Thread Robert Sander

On 21.02.23 22:52, Richard Bade wrote:


A colleague and I ran into this a few weeks ago. The way we managed to
get access back to delete the bucket properly (using radosgw-admin
bucket rm) was to reshard the bucket.



This created a new bucket index and therefore it was then possible to delete it.
If you are looking to get access back to the objects, then as Eric
said there's no way to get those indexes back but the objects will
still be there in the pool.


Thanks for the answers so far.

The issue we faced was a corrupt bucket index object.

We thought about strategies to repair that but found none.

I tried different things on a test cluster in a test bucket, one of them 
was "bi purge". And then I thought: Why is there such an operation when 
there is no way to get the index back and a working bucket?


Resharding after a "bi prune" seems to work but as a result the bucket 
is empty when listing via S3. A bucket remove is successful but leaves 
all the RADOS objects in the index and data pools.


Why is there no operation to rebuild the index for a bucket based on the 
existing RADOS objects in the data pool?


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Undo "radosgw-admin bi purge"

2023-02-22 Thread Robert Sander

On 22.02.23 14:42, David Orman wrote:

If it's a test cluster, you could try:

root@ceph01:/# radosgw-admin bucket check -h |grep -A1 check-objects
--check-objects   bucket check: rebuilds bucket index according to
  actual objects state


After a "bi purge" a "bucket check" returns an error:

# radosgw-admin bi purge --bucket=testbucket --yes-i-really-mean-it
# radosgw-admin bi list --bucket=testbucket
ERROR: bi_list(): (2) No such file or directory
# radosgw-admin bucket check --bucket=testbucket --check-objects
2023-02-22T16:51:11.970+0100 7fdcc6093e40  0 int RGWRados::cls_bucket_list_ordered(const 
DoutPrefixProvider*, RGWBucketInfo&, int, const rgw_obj_index_key&, const string&, 
const string&, uint32_t, bool, uint16_t, RGWRados::ent_map_t&, bool*, bool*, 
rgw_obj_index_key*, optional_yield, RGWBucketListNameFilter): CLSRGWIssueBucketList for 
:testbucket[471f26a3-ff89-4b02-911a-0c89e2e295fa.104944180.1]) failed

Adding --fix does not change anything.

I can still download the one S3 object I put in the bucket
because I know its name, but:

# s3cmd ls s3://testbucket/
ERROR: S3 error: 404 (NoSuchKey)

A "bucket reshard" recreates index objects:

# radosgw-admin bucket reshard --bucket=testbucket --num-shards=12
tenant:
bucket name: testbucket
old bucket instance id: 471f26a3-ff89-4b02-911a-0c89e2e295fa.104944180.1
new bucket instance id: 471f26a3-ff89-4b02-911a-0c89e2e295fa.105128491.1
total entries: 0
2023-02-22T16:58:34.496+0100 7f52360dce40  1 execute INFO: reshard of bucket "testbucket" from 
"testbucket:471f26a3-ff89-4b02-911a-0c89e2e295fa.104944180.1" to 
"testbucket:471f26a3-ff89-4b02-911a-0c89e2e295fa.105128491.1" completed successfully

After that "bucket check" runs without error but cannot
fix the situation:

# radosgw-admin bucket check --bucket=testbucket --check-objects --fix
[]
{}
{
"existing_header": {
"usage": {}
},
"calculated_header": {
"usage": {}
}
}

"s3cmd ls s3://testbucket/" shows nothing.

"s3cmd rb s3://testbucket/" removes the bucket but the RADOS
objects of the S3 objects remain in the data pool.

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Undo "radosgw-admin bi purge"

2023-02-23 Thread Robert Sander

Hi,

On 22.02.23 17:45, J. Eric Ivancich wrote:

You also asked why there’s not a command to scan the data pool and 
recreate the bucket index. I think the concept would work as all head 
objects include the bucket marker in their names. There might be some 
corner cases where it’d partially fail, such as (possibly) transactional 
changes that were underway when the bucket index was purged. And there 
is metadata in the bucket index that’s not stored in the objects, so it 
would have to be recreated somehow. But no one has written it yet.


I am not in an urgent need to get such a feature.

How would the process look to get development started in this direction?

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: s3 compatible interface

2023-02-28 Thread Robert Sander

On 28.02.23 16:31, Marc wrote:


Anyone know of a s3 compatible interface that I can just run, and reads/writes 
files from a local file system and not from object storage?


Have a look at Minio:

https://min.io/product/overview#architecture

Regards
--
Robert Sander
Heinlein Support GmbH
Linux: Akademie - Support - Hosting
http://www.heinlein-support.de

Tel: 030-405051-43
Fax: 030-405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein  -- Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Theory about min_size and its implications

2023-03-03 Thread Robert Sander

On 02.03.23 09:16, stefan.pin...@bearingpoint.com wrote:


so if one room goes down/offline, around 50% of the PGs would be left with only 
1 replica making them read-only.


Most people forget the other half of the cluster in such a scenario.

For us humans it is obvious that one room is down, because we can see it 
from the outside.


The OSDs only see that they do not have connectivity to their peering 
partners. They do not know if this is because the other hosts are down 
or just the network in between.


It could be the case that just the line between both rooms is dead and 
then you have 2 copies running in one room and only one in the other.
If you now allow changes in the "smaller" room in addition to changes in 
the room with two copies you immediately get a conflict as soon as the 
network connection between both rooms is reestablished.


This is why min_size=1 is a really bad idea outside of a disaster 
scenario where the other two copies are completely lost to a fire.


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: unable to calc client keyring client.admin placement PlacementSpec(label='_admin'): Cannot place : No matching hosts for label _admin

2023-03-03 Thread Robert Sander

On 03.03.23 11:16, Jeremy Hansen wrote:
3/3/23 2:13:53 AM[WRN]unable to calc client keyring client.admin 
placement PlacementSpec(label='_admin'): Cannot place : No matching 
hosts for label _admin



I keep seeing this warning in the logs.  I’m not really sure what action 
to take to resolve this issue.


"No matching hosts for label _admin" means that there is no host which
has the label _admin.

The label _admin is a special label which instructs the orchestrator to
keep /etc/ceph/ceph.conf and /etc/ceph/ceph.client.admin.keyring on such
labeled hosts up to date.

You could just label one of the cluster hosts with _admin:

ceph orch host label add hostname _admin

https://docs.ceph.com/en/quincy/cephadm/host-management/#special-host-labels

https://docs.ceph.com/en/quincy/cephadm/operations/#client-keyrings-and-configs

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Error deploying Ceph Qunicy using ceph-ansible 7 on Rocky 9

2023-03-08 Thread Robert Sander

On 08.03.23 13:22, wodel youchi wrote:


I am trying to deploy Ceph Quincy using ceph-ansible on Rocky9. I am having
some problems and I don't know where to search for the reason.
The README.rst of the ceph-ansible project on 
https://github.com/ceph/ceph-ansible encourages you to move to cephadm 
as this is the recommended installation method.


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 220009 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein -- Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Can't install cephadm on HPC

2023-03-13 Thread Robert Sander

On 13.03.23 03:29, zyz wrote:

Hi:
   I encountered a problem when I install cephadm on Huawei Cloud EulerOS. When 
enter the following command, it raise an error. What should I do?



./cephadm add-repo --release quincy



<< ERROR: Distro hce version 2.0 not supported


There are just no upstream package repositories for your distribution 
and version available. Nobody has compiled Ceph packages for Huawei 
Cloud EulerOS.


But this does not matter as long as Podman/Docker, LVM, systemd and time 
synchronization via NTP are available.


You can still bootstrap a cephadm orchestrator managed Ceph cluster as 
everything runs in containers.


You just miss the Ceph command line clients from the upstream repos 
(commands ceph, rados, rbd etc.). Your distribution may package these or 
you can use "cephadm shell" to get a shell inside a Ceph container where 
all these CLI tools are available.
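
A minimal bootstrap with the cephadm binary you already have would be 
(the MON IP is a placeholder):

./cephadm bootstrap --mon-ip 192.0.2.10

Everything else then runs from the container images.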


Regards
--
Robert Sander
Heinlein Support GmbH
Linux: Akademie - Support - Hosting
http://www.heinlein-support.de

Tel: 030-405051-43
Fax: 030-405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein  -- Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Upgrade 16.2.11 -> 17.2.0 failed

2023-03-14 Thread Robert Sander

On 14.03.23 14:21, bbk wrote:

# ceph orch upgrade start --ceph-version 17.2.0


I would never recommend updating to a .0 release.

Why not go directly to the latest 17.2.5?
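
That would simply be:

ceph orch upgrade start --ceph-version 17.2.5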

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Upgrade 16.2.11 -> 17.2.0 failed

2023-03-16 Thread Robert Sander

On 14.03.23 15:22, b...@nocloud.ch wrote:

ah.. ok, it was not clear to me that skipping minor version when doing a major 
upgrade was supported.


You can even skip one major version when doing an upgrade.

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph cluster out of balance after adding OSDs

2023-03-27 Thread Robert Sander

On 27.03.23 16:04, Pat Vaughan wrote:


we looked at the number of PGs for that pool, and found that there was only
1 for the rgw.data and rgw.log pools, and "osd pool autoscale-status"
doesn't return anything, so it looks like that hasn't been working.


If you are in this situation, have a look at the crush rules of your 
pools. If the cluster has multiple device classes (hdd, ssd) then all 
pools need to use just one device class each.


The autoscaler currently does not work when one pool uses just one 
device class and another pool uses the default crush rule and therefore 
multiple device classes.


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph cluster out of balance after adding OSDs

2023-03-27 Thread Robert Sander

On 27.03.23 16:34, Pat Vaughan wrote:

Yes, all the OSDs are using the SSD device class.


Do you have multiple CRUSH rules by chance?
Are all pools using the same CRUSH rule?

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph cluster out of balance after adding OSDs

2023-03-28 Thread Robert Sander

On 27.03.23 23:13, Pat Vaughan wrote:
Looking at the pools, there are 2 crush rules. Only one pool has a 
meaningful amount of data, the  charlotte.rgw.buckets.data pool. This is 
the crush rule for that pool.


So that pool uses the device class ssd explicitly whereas the other pools 
do not care about the device class.


The autoscaler is not able to cope with this situation.

charlotte.rgw.buckets.data is an erasure coded pool, correct? And the 
rule was created automatically when you created the erasure coding profile.


You should create an erasure coding rule that does not care about the 
device class and assign it to the pool charlotte.rgw.buckets.data.
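
A rough sketch (profile and rule names are examples; k, m and the failure 
domain must match the pool's existing erasure coding profile):

ceph osd erasure-code-profile set ec-no-class k=4 m=2 crush-failure-domain=host
ceph osd crush rule create-erasure ec-rule-no-class ec-no-class
ceph osd pool set charlotte.rgw.buckets.data crush_rule ec-rule-no-class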

After that the autoscaler will be able to work again.

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Adding new server to existing ceph cluster - with separate block.db on NVME

2023-03-28 Thread Robert Sander

Hi,

On 28.03.23 05:42, Robert W. Eckert wrote:


I am trying to add a new server to an existing cluster, but cannot get the OSDs 
to create correctly
When I try
Cephadm ceph-volume lvm create, it returns nothing but the container info.



You are running a containerized cluster with the cephadm orchestrator?
Which version?

Have you tried

ceph orch daemon add osd 
host1:data_devices=/dev/sda,/dev/sdb,db_devices=/dev/nvme0

as shown on https://docs.ceph.com/en/quincy/cephadm/services/osd/ ?

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Adding new server to existing ceph cluster - with separate block.db on NVME

2023-03-29 Thread Robert Sander

On 29.03.23 01:09, Robert W. Eckert wrote:


I did miss seeing the db_devices part. For ceph orch apply -  that would have 
saved a lot of effort.  Does the osds_per_device create the partitions on the 
db device?


No, osds_per_device creates multiple OSDs on one data device. It can be 
useful for NVMe, do not use it on HDD.


The command automatically creates the number of db slots on the 
db_device based on how many data_devices you pass it.


If you want more slots for the RocksDB then pass it the db_slots parameter.


Also is there any way to disable --all-available-devices if it was turned on.

The
ceph orch apply osd --all-available-devices --unmanaged=true

command doesn't seem to disable the behavior of adding new drives.


You can set the service to unmanaged when exporting the specification.

ceph orch ls osd --export > osd.yml

Edit osd.yml and add "unmanaged: true" to the specification. After that

ceph orch apply -i osd.yml

Or you could just remove the specification with "ceph orch rm NAME".
The OSD service will be removed but the OSD will remain.

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: deploying Ceph using FQDN for MON / MDS Services

2023-04-17 Thread Robert Sander

On 14.04.23 12:17, Lokendra Rathour wrote:


*mount: /mnt/image: mount point does not exist.*


Have you created the mount point?

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 220009 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein -- Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: deploying Ceph using FQDN for MON / MDS Services

2023-04-18 Thread Robert Sander

On 18.04.23 06:12, Lokendra Rathour wrote:

but if I try mounting from a normal Linux machine with connectivity 
enabled between Ceph mon nodes, it gives the error as stated before.


Have you installed ceph-common on the "normal Linux machine"?

Regards
--
Robert Sander
Heinlein Support GmbH
Linux: Akademie - Support - Hosting
http://www.heinlein-support.de

Tel: 030-405051-43
Fax: 030-405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein  -- Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How to replace an HDD in a OSD with shared SSD for DB/WAL

2023-04-21 Thread Robert Sander

Hi,

On 21.04.23 05:44, Tao LIU wrote:


I build a Ceph Cluster with cephadm.
Every cehp node has 4 OSDs. These 4 OSD were build with 4 HDD (block) and 1
SDD (DB).
At present , one HDD is broken, and I am trying to replace the HDD,and
build the OSD with the new HDD and the free space of the SDD. I did the
follows:

#ceph osd stop osd.23
#ceph osd out osd.23
#ceph osd crush remove osd.23
#ceph osd rm osd.23
#ceph orch daemon rm osd.23 --force
#lvremove
/dev/ceph-ae21e618-601e-4273-9185-99180edb8453/osd-block-96eda371-1a3f-4139-9123-24ec1ba362c4
#wipefs -af /dev/sda
#lvremove
/dev/ceph-e50203a6-8b8e-480f-965c-790e21515395/osd-db-70f7a032-cf2c-4964-b979-2b90f43f2216
#ceph orch daemon add osd
compute11:data_devices=/dev/sda,db_devices=/dev/sdc,osds_per_device=1

The OSD can be built, but is always down.

Is there anyting that I missed during the building?


Assuming /dev/ceph-UUID/osd-db-UUID is the logical volume for the old OSD you 
could have run this:

ceph orch osd rm 23

replace the faulty HDD

ceph orch daemon add osd 
compute11:data_devices=/dev/sda,db_devices=ceph-UUID/osd-db-UUID

This will reuse the existing logical volume for the OSD DB.

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: OSD_TOO_MANY_REPAIRS on random OSDs causing clients to hang

2023-04-26 Thread Robert Sander

On 26.04.23 13:24, Thomas Hukkelberg wrote:


[WRN] OSD_TOO_MANY_REPAIRS: Too many repaired reads on 1 OSDs
 osd.34 had 9936 reads repaired


Are there any messages in the kernel log that indicate this device has 
read errors? Have you considered replacing the disk?
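
A quick check on the OSD host (the device path is an example):

dmesg -T | grep -i error
smartctl -a /dev/sdX

Look for medium errors, reallocated sectors or a growing pending sector 
count.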


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Encryption per user Howto

2023-05-23 Thread Robert Sander

On 23.05.23 08:42, huxia...@horebdata.cn wrote:

Indeed, the question is on  server-side encryption with keys managed by ceph on 
a per-user basis


What kind of security do you want to achieve with encryption keys stored 
on the server side?


Regards
--
Robert Sander
Heinlein Support GmbH
Linux: Akademie - Support - Hosting
http://www.heinlein-support.de

Tel: 030-405051-43
Fax: 030-405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein  -- Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Encryption per user Howto

2023-05-26 Thread Robert Sander

On 5/26/23 12:26, Frank Schilder wrote:


It may very well not serve any other purpose, but these are requests we get. If 
I could provide an encryption key to a ceph-fs kernel at mount time, this 
requirement could be solved very elegantly on a per-user (request) basis and 
only making users who want it pay with performance penalties.


I understand this use case. But this would still mean that the client 
encrypts the data. In your case the CephFS mount or with S3 the 
rados-gateway.


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Cluster without messenger v1, new MON still binds to port 6789

2023-06-01 Thread Robert Sander

Hi,

a cluster has ms_bind_msgr1 set to false in the config database.

Newly created MONs still listen on port 6789 and add themselves to the 
monmap as providing messenger v1.


How do I change that?

Shouldn't the MONs use the config for ms_bind_msgr1?

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph iSCSI GW not working with VMware VMFS and Windows Clustered Storage Volumes (CSV)

2023-06-19 Thread Robert Sander

On 19.06.23 13:47, Work Ceph wrote:


Recently, we had the need to add some VMWare clusters as clients for the
iSCSI GW and also Windows systems with the use of Clustered Storage Volumes
(CSV), and we are facing a weird situation. In windows for instance, the
iSCSI block can be mounted, formatted and consumed by all nodes, but when
we add in the CSV it fails with some generic exception. The same happens in
VMWare, when we try to use it with VMFS it fails.


The iSCSI target used does not support SCSI persistent group 
reservations when in multipath mode.


https://docs.ceph.com/en/quincy/rbd/iscsi-initiators/

AFAIK VMware uses these in VMFS.

Regards
--
Robert Sander
Heinlein Support GmbH
Linux: Akademie - Support - Hosting
http://www.heinlein-support.de

Tel: 030-405051-43
Fax: 030-405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein  -- Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: device class for nvme disk is ssd

2023-06-28 Thread Robert Sander

On 6/28/23 14:03, Boris Behrens wrote:


is it a problem that the device class for all my disks is SSD even all of
these disks are NVME disks? If it is just a classification for ceph, so I
can have pools on SSDs and NVMEs separated I don't care. But maybe ceph
handles NVME disks differently internally?


No. When creating the OSD ceph-volume looks at
/sys/class/block/DEVICE/queue/rotational
to determine if it's an HDD (file contains 1) or not (file contains 0).

If you need to distinguish between SSD and NVMe you can manually assign 
another device class to the OSDs.
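
For example (the OSD id is an example; the old class has to be removed 
first):

ceph osd crush rm-device-class osd.0
ceph osd crush set-device-class nvme osd.0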


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Are replicas 4 or 6 safe during network partition? Will there be split-brain?

2023-07-10 Thread Robert Sander

Hi,

On 07.07.23 16:52, jcic...@cloudflare.com wrote:


There are two sites, A and B. There are 5 mons, 2 in A, 3 in B. Looking at just 
one PG and 4 replicas, we have 2 replicas in site A and 2 replicas in site B. 
Site A holds the primary OSD for this PG. When a network split happens, I/O 
would still be working in site A since there are still 2 OSDs, even without mon 
quorum.


The site without MON quorum will stop to work completely.

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 220009 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein -- Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Per minor-version view on docs.ceph.com

2023-07-13 Thread Robert Sander

Hi,

On 7/12/23 05:44, Satoru Takeuchi wrote:


I have a request about docs.ceph.com. Could you provide per minor-version views
on docs.ceph.com?


I would like to second that. Sometimes the behaviour of Ceph changes a 
lot between point releases. If the documentation gets unreliable it does 
not shine a good light on the project.


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Ceph 17.2.6 alert-manager receives error 500 from inactive MGR

2023-07-26 Thread Robert Sander
.54.226.222:49904] [POST] [500] [0.002s] [513.0B] [a9b25e54-f1e1-42eb-90b2-af5aa22769cf] /api/prometheus_receiver
Jul 25 09:25:27 mgr002 ceph-mgr[1841]: [dashboard ERROR request] [b'{"status": "500 Internal Server Error", "detail": "The server encountered an unexpected condition which prevented it from fulfilling the request.", "request_id": "a9b25e54-f1e1-42eb-90b2-af5aa22769cf"}']
Jul 25 09:25:27 mgr002 ceph-mgr[1841]: [dashboard INFO request] [:::10.54.226.222:49904] [POST] [500] [0.002s] [513.0B] [a9b25e54-f1e1-42eb-90b2-af5aa22769cf] /api/prometheus_receiver
Jul 25 09:25:28 mgr002 ceph-mgr[1841]: mgr handle_mgr_map Activating!
Jul 25 09:25:28 mgr002 ceph-mgr[1841]: mgr handle_mgr_map I am now activating

We have a test cluster running also with version 17.2.6 where
this does not happen. In this test cluster the passive MGRs return an HTTP
code 204 when the alert-manager tries to request /api/prometheus_receiver.

What is happening here?

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph 17.2.6 alert-manager receives error 500 from inactive MGR

2023-07-27 Thread Robert Sander

On 7/27/23 13:27, Eugen Block wrote:


[2] https://github.com/ceph/ceph/pull/47011


This PR implements the 204 HTTP code that I see in my test cluster.

I wonder why in the same situation the other cluster returns a 500 here.

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] ceph orch upgrade stuck between 16.2.7 and 16.2.13

2023-08-15 Thread Robert Sander

Hi,

A healthy 16.2.7 cluster should get an upgrade to 16.2.13.

ceph orch upgrade start --ceph-version 16.2.13

did upgrade MONs, MGRs and 25% of the OSDs and is now stuck.

We tried several "ceph orch upgrade stop" and starts again.
We "failed" the active MGR but no progress.
We set the debug logging with "ceph config set mgr mgr/cephadm/log_to_cluster_level 
debug" but it only tells that it starts:

2023-08-15T09:05:58.548896+0200 mgr.cephmon01 [INF] Upgrade: Started with 
target quay.io/ceph/ceph:v16.2.13

How can we check what is happening (or not happening) here?
How do we get cephadm to complete the task?

Current status is:

# ceph orch upgrade status
{
"target_image": "quay.io/ceph/ceph:v16.2.13",
"in_progress": true,
"which": "Upgrading all daemon types on all hosts",
"services_complete": [],
"progress": "",
"message": "",
"is_paused": false
}

# ceph -s
  cluster:
    id:     3098199a-c7f5-4baf-901c-f178131be6f4
    health: HEALTH_WARN
            There are daemons running an older version of ceph

  services:
    mon: 5 daemons, quorum cephmon02,cephmon01,cephmon03,cephmon04,cephmon05 (age 4d)
    mgr: cephmon03(active, since 8d), standbys: cephmon01, cephmon02
    mds: 2/2 daemons up, 1 standby, 2 hot standby
    osd: 202 osds: 202 up (since 11d), 202 in (since 13d)
    rgw: 2 daemons active (2 hosts, 1 zones)

  data:
    volumes: 2/2 healthy
    pools:   11 pools, 4961 pgs
    objects: 98.84M objects, 347 TiB
    usage:   988 TiB used, 1.3 PiB / 2.3 PiB avail
    pgs:     4942 active+clean
             19   active+clean+scrubbing+deep

  io:
    client:   89 MiB/s rd, 598 MiB/s wr, 25 op/s rd, 157 op/s wr

  progress:
    Upgrade to quay.io/ceph/ceph:v16.2.13 (0s)
      []

# ceph versions
{
    "mon": {
        "ceph version 16.2.13 (5378749ba6be3a0868b51803968ee9cde4833a3e) pacific (stable)": 5
    },
    "mgr": {
        "ceph version 16.2.13 (5378749ba6be3a0868b51803968ee9cde4833a3e) pacific (stable)": 3
    },
    "osd": {
        "ceph version 16.2.13 (5378749ba6be3a0868b51803968ee9cde4833a3e) pacific (stable)": 48,
        "ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)": 154
    },
    "mds": {
        "ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)": 5
    },
    "rgw": {
        "ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)": 2
    },
    "overall": {
        "ceph version 16.2.13 (5378749ba6be3a0868b51803968ee9cde4833a3e) pacific (stable)": 56,
        "ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)": 161
    }
}

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph orch upgrade stuck between 16.2.7 and 16.2.13

2023-08-15 Thread Robert Sander

On 8/15/23 11:02, Eugen Block wrote:


I guess I would start looking on the nodes where it failed to upgrade
OSDs and check out the cephadm.log as well as syslog. Did you see
progress messages in the mgr log for the successfully updated OSDs (or
MON/MGR)?


The issue is that there is no information on which OSD cephadm tries to 
upgrade next. There is no failure reported. It seems to just sit there 
and wait for something.


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph orch upgrade stuck between 16.2.7 and 16.2.13

2023-08-15 Thread Robert Sander

On 8/15/23 11:16, Curt wrote:
Probably not the issue, but do all your osd servers have internet 
access?  I've had a similar experience when one of our osd servers 
default gateway got changed, so it was just waiting to download and took 
a bit to timeout.


Yes, all nodes can manually pull the image from quay.io.

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] cephadm orchestrator does not restart daemons [was: ceph orch upgrade stuck between 16.2.7 and 16.2.13]

2023-08-16 Thread Robert Sander

On 8/15/23 16:36, Adam King wrote:

with the log to cluster level already on debug, if you do a "ceph mgr 
fail" what does cephadm log to the cluster before it reports sleeping? 
It should at least be doing something if it's responsive at all. Also, 
in "ceph orch ps"  and "ceph orch device ls" are the REFRESHED columns 
reporting that they've refreshed the info recently (last 10 minutes for 
daemons, last 30 minutes for devices)?


They have been refreshed very recently.

The issue seems to be a bit larger than just the not working upgrade.

We are now not even able to restart a daemon.

When I issue the command

# ceph orch daemon restart crash.cephmon01

these two lines show up in the cephadm log but nothing else happens:

2023-08-16T10:35:41.640027+0200 mgr.cephmon01 [INF] Schedule restart daemon 
crash.cephmon01
2023-08-16T10:35:41.640497+0200 mgr.cephmon01 [DBG] _kick_serve_loop

The container for crash.cephmon01 does not get restarted.

It looks like the service loop does not get executed.

Can we see what jobs are in this queue and why they do not get executed?

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephadm orchestrator does not restart daemons [was: ceph orch upgrade stuck between 16.2.7 and 16.2.13]

2023-08-16 Thread Robert Sander

On 8/16/23 12:10, Eugen Block wrote:

I don't really have a good idea right now, but there was a thread [1]
about ssh sessions that are not removed, maybe that could have such an
impact? And if you crank up the debug level to 30, do you see anything
else?


It was something similar. There were leftover ceph-volume processes 
running on some of the OSD nodes. After killing them the cephadm 
orchestrator is now able to resume the upgrade.


As we also restarted the MGR processes (with systemctl restart 
CONTAINER) there were no leftover SSH sessions.


But the still running ceph-volume processes must have used a lock that 
blocked new cephadm commands.
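
For anyone running into the same symptom, checking for such stale processes 
is quick (hostnames are placeholders):

  for host in osd01 osd02 osd03; do
      ssh "$host" 'pgrep -af ceph-volume'    # list leftover ceph-volume runs
  done
  # once they are confirmed stale, kill them on the affected hosts:
  # pkill -f ceph-volume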


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Status of diskprediction MGR module?

2023-08-28 Thread Robert Sander

Hi,

Several years ago the diskprediction module was added to the MGR 
collecting SMART data from the OSDs.


There were local and cloud modes available claiming different 
accuracies. Now only the local mode remains.


What is the current status of that MGR module (diskprediction_local)?

We have a cluster where SMART data is available from the disks (tested 
with smartctl and visible in the Ceph dashboard), but even with an 
enabled diskprediction_local module no health and lifetime info is shown.
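
For reference, this is the sequence I would expect to work based on the 
documentation (the device id is a placeholder):

  ceph config set global device_failure_prediction_mode local
  ceph mgr module enable diskprediction_local
  ceph device ls                                  # list devices and their daemons
  ceph device get-health-metrics <devid>          # raw SMART data collected by Ceph
  ceph device predict-life-expectancy <devid>     # stays empty on this cluster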


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Status of diskprediction MGR module?

2023-08-28 Thread Robert Sander

On 8/28/23 13:26, Konstantin Shalygin wrote:


The module don't have new commits for more than two year


So diskprediction_local is unmaintained. Will it be removed?
It looks like a nice feature, but when you try to use it, it's useless.


I suggest to use smartctl_exporter [1] for monitoring drives health


I tried to deploy that with cephadm as a custom container.

Follow-up questions:

How do I tell cephadm that smartctl_exporter has to run in a privileged 
container as root with all the devices?


How do I tell the cephadm managed Prometheus that it can scrape these 
new exporters?


How do I add a dashboard in cephadm managed Grafana that shows the 
values from smartctl_exporter? Where do I get such a dashboard?


How do I add alerts to the cephadm managed Alert-Manager? Where do I get 
useful alert definitions for smartctl_exporter metrics?
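
For the first question, a rough sketch of a custom container spec might look 
like this (untested; the image name, the listen port and the privileged flag 
are assumptions that need to be verified against the cephadm version in use):

  service_type: container
  service_id: smartctl-exporter
  placement:
    host_pattern: '*'
  spec:
    image: quay.io/prometheuscommunity/smartctl-exporter:latest
    privileged: true       # the exporter needs root and raw access to the disk devices
    ports:
      - 9633               # assumed default listen port of smartctl_exporter

Applied with "ceph orch apply -i smartctl-exporter.yaml".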


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Separating Mons and OSDs in Ceph Cluster

2023-09-11 Thread Robert Sander

Hi,

On 9/9/23 09:34, Ramin Najjarbashi wrote:


The primary goal is to deploy new Monitors on different servers without
causing service interruptions or disruptions to data availability.


Just do that. New MONs will be added to the mon map which will be 
distributed to all running components. All OSDs will immediately know 
about the new MONs.


The same goes when removing an old MON.

After that you have to update the ceph.conf on each host to make the 
change "reboot safe".


No need to restart any other component including OSDs.
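
If the cluster is managed by cephadm, the whole procedure boils down to 
something like this (hostnames and the IP are placeholders):

  ceph orch host add newmon1 192.168.0.15
  ceph orch apply mon --placement="oldmon1,oldmon2,oldmon3,newmon1,newmon2"
  ceph mon stat        # wait until the new MONs have joined the quorum
  ceph orch apply mon --placement="newmon1,newmon2,newmon3"   # drops the old ones
  ceph mon stat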

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph services failing to start after OS upgrade

2023-09-13 Thread Robert Sander

On 12.09.23 14:51, hansen.r...@live.com.au wrote:


I have a ceph cluster running on my proxmox system and it all seemed to upgrade 
successfully however after the reboot my ceph-mon and my ceph-osd services are 
failing to start or are crashing by the looks of it.


You should ask that question on the Proxmox forum at 
https://forum.proxmox.com/ as they distribute their own Ceph packages.


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 220009 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein -- Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Status of IPv4 / IPv6 dual stack?

2023-09-15 Thread Robert Sander

Hi,

as the documentation sends mixed signals in

https://docs.ceph.com/en/latest/rados/configuration/network-config-ref/#ipv4-ipv6-dual-stack-mode

"Note

Binding to IPv4 is enabled by default, so if you just add the option to 
bind to IPv6 you’ll actually put yourself into dual stack mode."


and

https://docs.ceph.com/en/latest/rados/configuration/msgr2/#address-formats

"Note

The ability to bind to multiple ports has paved the way for dual-stack 
IPv4 and IPv6 support. That said, dual-stack operation is not yet 
supported as of Quincy v17.2.0."


just the quick questions:

Is a dual stacked networking with IPv4 and IPv6 now supported or not?
From which version on is it considered stable?
Are OSDs now able to register themselves with two IP addresses in the 
cluster map? MONs too?


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Status of IPv4 / IPv6 dual stack?

2023-09-19 Thread Robert Sander

On 9/18/23 11:19, Stefan Kooman wrote:


IIIRC, the "enable dual" stack PR's were more or less "accidentally"
merged


So this looks like a big NO on the dual stack support for Ceph.

I just need an answer, I do not need dual stack support.

It would be nice if the documentation was a little bit clearer on this topic.

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephadm configuration in git

2023-10-11 Thread Robert Sander

On 10/11/23 11:42, Kamil Madac wrote:


Is it possible to do it with cephadm? Is it possible to have some config
files in git and then apply  same cluster configuration on multiple
clusters? Or is this approach not aligned with cephadm and we should do it
different way?


You can export the service specifications with "ceph orch ls --export" 
and import the YAML file with "ceph orch apply -i …".


This does not cover the hosts in the cluster.
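
A minimal sketch of such a workflow (file and commit message are arbitrary):

  ceph orch ls --export > service-specs.yaml
  git add service-specs.yaml
  git commit -m "snapshot of cluster service specs"
  # on another cluster, or to re-apply after a change:
  ceph orch apply -i service-specs.yaml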

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How to deal with increasing HDD sizes ? 1 OSD for 2 LVM-packed HDDs ?

2023-10-18 Thread Robert Sander

On 10/18/23 09:25, Renaud Jean Christophe Miel wrote:

Hi,

Use case:
* Ceph cluster with old nodes having 6TB HDDs
* Add new node with new 12TB HDDs

Is it supported/recommended to pack 2 6TB HDDs handled by 2 old OSDs
into 1 12TB LVM disk handled by 1 new OSD ?


The 12 TB HDD will get double the IO compared to one of the 6 TB HDDs.
But it will still only be able to handle about 120 IOPS.
This makes the newer larger HDDs a bottleneck when run in the same pool.

If you are not planning to decommission the smaller HDDs it is 
recommended to use the larger ones in a separate pool for performance 
reasons.
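
One way to do that is to give the new drives their own device class and CRUSH 
rule (a sketch; OSD ids, class and pool names are placeholders):

  ceph osd crush rm-device-class osd.24 osd.25
  ceph osd crush set-device-class hdd12 osd.24 osd.25
  ceph osd crush rule create-replicated rule-hdd12 default host hdd12
  ceph osd pool create pool-hdd12 128 128 replicated rule-hdd12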


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Emergency, I lost 4 monitors but all osd disk are safe

2023-11-02 Thread Robert Sander

Hi,

On 11/2/23 11:28, Mohamed LAMDAOUAR wrote:


   I have 7 machines on CEPH cluster, the service ceph runs on a docker
container.
  Each machine has 4 hdd of data (available) and 2 nvme sssd (bricked)
   During a reboot, the ssd bricked on 4 machines, the data are available on
the HDD disk but the nvme is bricked and the system is not available. is it
possible to recover the data of the cluster (the data disk are all
available)


You can try to recover the MON db from the OSDs, as they keep a copy of it:

https://docs.ceph.com/en/reef/rados/troubleshooting/troubleshooting-mon/#monitor-store-failures
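
The core of that procedure is to replay the cluster maps from every OSD into 
a fresh MON store, roughly like this (a sketch of the documented loop; paths 
are examples and the OSDs have to be stopped):

  ms=/root/mon-store
  mkdir -p $ms
  for osd in /var/lib/ceph/osd/ceph-*; do
      ceph-objectstore-tool --data-path "$osd" --no-mon-config \
          --op update-mon-db --mon-store-path "$ms"
  done
  # repeat on every OSD host, collect the results, then rebuild the store:
  ceph-monstore-tool $ms rebuild -- --keyring /path/to/admin.keyring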

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Emergency, I lost 4 monitors but all osd disk are safe

2023-11-02 Thread Robert Sander

On 11/2/23 12:48, Mohamed LAMDAOUAR wrote:


I reinstalled the OS on a  new SSD disk. How can I rebuild my cluster with
only one mons.


If there is one MON still operating you can try to extract its monmap 
and remove all the other MONs from it with the monmaptool:


https://docs.ceph.com/en/latest/man/8/monmaptool/
https://docs.ceph.com/en/latest/rados/troubleshooting/troubleshooting-mon/#recovering-a-monitor-s-broken-monmap

This way the remaining MON will be the only one in the map and will have 
quorum and the cluster will work again.
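
Roughly (MON ids are placeholders, the commands come from the linked 
documentation; on a containerized deployment the ceph-mon commands have to be 
run inside the stopped MON's container or with cephadm shell):

  systemctl stop ceph-mon@survivor
  ceph-mon -i survivor --extract-monmap /tmp/monmap
  monmaptool /tmp/monmap --print                     # check the current members
  monmaptool /tmp/monmap --rm lostmon1 --rm lostmon2 --rm lostmon3
  ceph-mon -i survivor --inject-monmap /tmp/monmap
  systemctl start ceph-mon@survivor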


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Emergency, I lost 4 monitors but all osd disk are safe

2023-11-02 Thread Robert Sander

Hi,

On 11/2/23 13:05, Mohamed LAMDAOUAR wrote:


when I ran this command, I got this error (because the database of the 
osd was on the boot disk)


The RocksDB part of the OSD was on the failed SSD?

Then the OSD is lost and cannot be recovered.
The RocksDB contains the information where each object is stored on the 
OSD data partition and without it nobody knows where each object is. The 
data is lost.


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph storage pool error

2023-11-08 Thread Robert Sander

Hi,

On 11/7/23 12:35, necoe0...@gmail.com wrote:

Ceph 3 clusters are running and the 3rd cluster gave an error, it is currently 
offline. I want to get all the remaining data in 2 clusters. Instead of fixing 
ceph, I just want to save the data. How can I access this data and connect to 
the pool? Can you help me?1 and 2 clusters are working. I want to view my data 
from them and then transfer them to another place. How can I do this? I have 
never used Ceph before.


Please send the output of:

ceph -s
ceph health detail
ceph osd df tree

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Where is a simple getting started guide for a very basic cluster?

2023-11-26 Thread Robert Sander

Hi,

have you read https://docs.ceph.com/en/reef/cephadm/install/ ?

Bootstrapping a new cluster should be as easy as

 # cephadm bootstrap --mon-ip <mon-ip>

if the nodes fulfill the requirements:

- Python 3
- Systemd
- Podman or Docker for running containers
- Time synchronization (such as chrony or NTP)
- LVM2 for provisioning storage devices

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Where is a simple getting started guide for a very basic cluster?

2023-11-28 Thread Robert Sander

On 11/28/23 17:50, Leo28C wrote:
Problem is I don't have the cephadm command on every node. Do I need it 
on all nodes or just one of them? I tried installing it via curl but my 
ceph version is 14.2.22 which is not on the archive anymore so the curl 
command returns a 404 error html file. How do I get cephadm for 14.2?


There is no cephadm for Ceph 14 as the orchestrator was first introduced 
in version 15.


Why are you talking about version 14 now anyhow?
As long as your nodes fulfill the requirements for cephadm you can 
install the latest version of Ceph.


PS: Please reply to the list.

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph osd dump_historic_ops

2023-12-01 Thread Robert Sander

On 12/1/23 10:33, Phong Tran Thanh wrote:


ceph daemon osd.8 dump_historic_ops show the error, the command run on node
with osd.8
Can't get admin socket path: unable to get conf option admin_socket for
osd: b"error parsing 'osd': expected string of the form TYPE.ID, valid
types are: auth, mon, osd, mds, mgr, client\n"

I am running ceph cluster reef version by cephadmin install


When the daemons run in containers managed by the cephadm orchestrator 
the socket file has a different location and the command line tool ceph 
(run outside the container) does not find it automatically.


You can run

# ceph daemon /var/run/ceph/$FSID/ceph-osd.$OSDID.asok dump_historic_ops

to use the socket outside the container.

Or you enter the container with

# cephadm enter --name osd.$OSDID

and then execute

# ceph daemon osd.$OSDID dump_historic_ops

inside the container.

$FSID is the UUID of the Ceph cluster, $OSDID is the OSD id.

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: EC Profiles & DR

2023-12-05 Thread Robert Sander

On 12/5/23 10:01, duluxoz wrote:

Thanks David, I knew I had something wrong  :-)

Just for my own edification: Why is k=2, m=1 not recommended for
production? Considered to "fragile", or something else?


It is the same as a replicated pool with size=2. Only one host can go 
down. After that you risk losing data.


Erasure coding is possible with a cluster size of 10 nodes or more.
With smaller clusters you have to go with replicated pools.
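
For a sufficiently large cluster an EC pool would then be created along these 
lines (profile name and pg numbers are examples):

  ceph osd erasure-code-profile set ec-4-2 k=4 m=2 crush-failure-domain=host
  ceph osd pool create ecpool 128 128 erasure ec-4-2
  ceph osd pool set ecpool allow_ec_overwrites true   # needed for RBD/CephFS on EC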

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: EC Profiles & DR

2023-12-05 Thread Robert Sander

On 12/5/23 10:06, duluxoz wrote:


I'm confused - doesn't k4 m2 mean that you can loose any 2 out of the 6
osds?


Yes, but OSDs are not a good failure zone.
The host is the smallest failure zone that is practicable and safe 
against data loss.


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Building new cluster had a couple of questions

2023-12-21 Thread Robert Sander

Hi,

On 12/21/23 14:50, Drew Weaver wrote:

#1 cephadm or ceph-ansible for management?


cephadm.

The ceph-ansible project writes in its README:

NOTE: cephadm is the new official installer, you should consider 
migrating to cephadm.


https://github.com/ceph/ceph-ansible


#2 Since the whole... CentOS thing... what distro appears to be the most 
straightforward to use with Ceph?  I was going to try and deploy it on Rocky 9.


Any distribution with a recent systemd, podman, LVM2 and time 
synchronization is viable. I prefer Debian, others prefer RPM-based distributions.


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Building new cluster had a couple of questions

2023-12-21 Thread Robert Sander

Hi,

On 21.12.23 19:11, Albert Shih wrote:


What is the advantage of podman vs docker ? (I mean not in general but for
ceph).


Docker comes with the Docker daemon which, when it gets an update, has to 
be restarted and restarts all containers. Not the best procedure for a 
storage system.


Everything needed for the Ceph containers is provided by podman.

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 220009 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein -- Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Building new cluster had a couple of questions

2023-12-21 Thread Robert Sander

Hi,

On 21.12.23 15:13, Nico Schottelius wrote:


I would strongly recommend k8s+rook for new clusters, also allows
running Alpine Linux as the host OS.


Why would I want to learn Kubernetes before I can deploy a new Ceph 
cluster when I have no need for K8s at all?


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 220009 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein -- Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Building new cluster had a couple of questions

2023-12-22 Thread Robert Sander

On 21.12.23 22:27, Anthony D'Atri wrote:


It's been claimed to me that almost nobody uses podman in production, but I 
have no empirical data.


I even converted clusters from Docker to podman while they stayed online 
thanks to "ceph orch redeploy".


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 220009 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein -- Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Building new cluster had a couple of questions

2023-12-22 Thread Robert Sander

Hi,

On 22.12.23 11:41, Albert Shih wrote:


for n in 1-100
   Put off line osd on server n
   Uninstall docker on server n
   Install podman on server n
   redeploy on server n
end


Yep, that's basically the procedure.

But first try it on a test cluster.
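
A per-host sketch of what that loop can look like with the orchestrator 
(package names, host and daemon names are placeholders):

  ceph orch host maintenance enter osd-node-n
  # on osd-node-n: swap the container runtime
  apt-get remove --purge docker.io && apt-get install podman
  ceph orch host maintenance exit osd-node-n
  ceph orch ps osd-node-n                  # list the daemons on that host
  ceph orch daemon redeploy osd.12         # repeat for every daemon on the host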

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 220009 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein -- Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Building new cluster had a couple of questions

2023-12-22 Thread Robert Sander

On 22.12.23 11:46, Marc wrote:


Does podman have this still, what dockers has. That if you kill the docker 
daemon all tasks are killed?


Podman does not come with a daemon to start containers.

The Ceph orchestrator creates systemd units to start the daemons in 
podman containers.


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 220009 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein -- Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph Docs: active releases outdated

2024-01-03 Thread Robert Sander

Hi Eugen,

the release info is current only in the latest branch of the 
documentation: https://docs.ceph.com/en/latest/releases/


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephadm bootstrap on 3 network clusters

2024-01-03 Thread Robert Sander

Hi,

On 1/3/24 14:51, Luis Domingues wrote:


But when I bootstrap my cluster, I set my MON IP and CLUSTER NETWORK, and then 
the bootstrap process tries to add my bootstrap node using the MON IP.


IMHO the bootstrap process has to run directly on the first node.
The MON IP is local to this node. It is used to determine the public 
network.


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephadm bootstrap on 3 network clusters

2024-01-03 Thread Robert Sander

Hi Luis,

On 1/3/24 16:12, Luis Domingues wrote:


My issue is that mon1 cannot connect via SSH to itself using pub network, and 
bootstrap fail at the end when cephadm tries to add mon1 to the list of hosts.


Why? The public network should not have any restrictions between the 
Ceph nodes. Same with the cluster network.


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Cephadm orchestrator and special label _admin in 17.2.7

2024-01-18 Thread Robert Sander

Hi,

According to the documentation¹ the special host label _admin instructs 
the cephadm orchestrator to place a valid ceph.conf and the 
ceph.client.admin.keyring into /etc/ceph of the host.


I noticed that (at least) on 17.2.7 only the keyring file is placed in 
/etc/ceph, but not ceph.conf.


Both files are placed into the /var/lib/ceph/<fsid>/config directory.

Has something changed?

¹: 
https://docs.ceph.com/en/quincy/cephadm/host-management/#special-host-labels


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Cephadm orchestrator and special label _admin in 17.2.7

2024-01-18 Thread Robert Sander

Hi,

On 1/18/24 14:07, Eugen Block wrote:

I just tired that in my test cluster, removed the ceph.conf and admin 
keyring from /etc/ceph and then added the _admin label to the host via 
'ceph orch' and both were created immediately.


This is strange, I only get this:

2024-01-18T11:47:07.870079+0100 mgr.cephtest32.ybltym [INF] Added label _admin 
to host cephtest23
2024-01-18T11:47:07.878786+0100 mgr.cephtest32.ybltym [INF] Updating 
cephtest23:/var/lib/ceph/ba37db20-2b13-11eb-b8a9-871ba11409f6/config/ceph.conf
2024-01-18T11:47:08.045347+0100 mgr.cephtest32.ybltym [INF] Updating 
cephtest23:/etc/ceph/ceph.client.admin.keyring
2024-01-18T11:47:08.212303+0100 mgr.cephtest32.ybltym [INF] Updating 
cephtest23:/var/lib/ceph/ba37db20-2b13-11eb-b8a9-871ba11409f6/config/ceph.client.admin.keyring

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Cephadm orchestrator and special label _admin in 17.2.7

2024-01-18 Thread Robert Sander

On 1/18/24 14:28, Eugen Block wrote:

Is your admin keyring under management?


There is no issue with the admin keyring but with ceph.conf.

The config setting "mgr mgr/cephadm/manage_etc_ceph_ceph_conf" is set to 
true and "mgr mgr/cephadm/manage_etc_ceph_ceph_conf_hosts" was at "*", 
so all hosts. I have set that to "label:_admin".
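
For reference, these settings can be inspected and changed like this:

  ceph config get mgr mgr/cephadm/manage_etc_ceph_ceph_conf
  ceph config set mgr mgr/cephadm/manage_etc_ceph_ceph_conf true
  ceph config set mgr mgr/cephadm/manage_etc_ceph_ceph_conf_hosts "label:_admin"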


It still does not put ceph.conf into /etc/ceph when adding the label _admin.

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Cephadm orchestrator and special label _admin in 17.2.7

2024-01-19 Thread Robert Sander

Hi,

more strange behaviour:

When I isssue "ceph mgr fail" a backup MGR takes over and updates all 
config files on all hosts including /etc/ceph/ceph.conf.


At first I thought that this was the solution but when I now remove the 
_admin label and add it again the new MGR also does not update 
/etc/ceph/ceph.conf.


Only when I again do "ceph mgr fail" the new MGR will update 
/etc/ceph/ceph.conf on the hosts labeled with _admin.


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How many pool for cephfs

2024-01-24 Thread Robert Sander

Hi,

On 1/24/24 09:40, Albert Shih wrote:


Knowing I got two class of osd (hdd and ssd), and I have a need of ~ 20/30
cephfs (currently and that number will increase with time).


Why do you need 20 - 30 separate CephFS instances?


and put all my cephfs inside two of them. Or should I create for each
cephfs a couple of pool metadata/data ?


Each CephFS instance needs its own pools, at least two (data + 
metadata) per instance. And each CephFS needs at least one MDS running, 
better with an additional cold or even hot standby MDS.
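
With the orchestrator one such filesystem, its two pools and the MDS daemons 
can be created in one step; the manual way shows the pools explicitly (names 
are examples):

  ceph fs volume create projectA
  # or, manually:
  ceph osd pool create projectB_metadata
  ceph osd pool create projectB_data
  ceph fs new projectB projectB_metadata projectB_data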



Il will also need to have ceph S3 storage, same question, should I have a
designated pool for S3 storage or can/should I use the same
cephfs_data_replicated/erasure pool ?


No, S3 needs its own pools. It cannot re-use CephFS pools.

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How many pool for cephfs

2024-01-24 Thread Robert Sander

Hi,

On 1/24/24 10:08, Albert Shih wrote:


99.99% because I'm newbie with ceph and don't understand clearly how
the autorisation work with cephfs ;-)


I strongly recommend that you ask an experienced Ceph consultant to help 
you design and set up your storage cluster.


It looks like you are trying to make design decisions that will heavily 
influence the performance of the system.



If I say 20-30 it's because I currently have on my classic ZFS/NFS server
around 25 «datasets» exported to various server.


The next question is how the "consumers" would access the filesystem: 
via NFS or mounted directly. Even with the second option you can 
separate client access via CephX keys, as David already wrote.



Ok. I got for my ceph cluster two set of servers, first set are for
services (mgr,mon,etc.) with ssd and don't currently run any osd (but still
have 2 ssd not used), I also got a second set of server with HDD and 2 SSD. The 
data pool will be on
the second set (with HDD). Where should I run the MDS and on which osd ?


Do you intend to use the Ceph cluster only for archival storage?
How large is your second set of Ceph nodes, and how many HDDs are in each? Do 
you intend to use the SSDs for the OSDs' RocksDB?
Where do you plan to store the metadata pools for CephFS? They should be 
stored on fast media.


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Questions about the CRUSH details

2024-01-25 Thread Robert Sander

On 1/25/24 13:32, Janne Johansson wrote:


It doesn't take OSD usage into consideration except at creation time
or OSD in/out/reweighing (or manual displacements with upmap and so
forth), so this is why "ceph df" will tell you a pool has X free
space, where X is "smallest free space on the OSDs on which this pool
lies, times the number of OSDs". Given the pseudorandom placement of
objects to PGs, there is nothing to prevent you from having the worst
luck ever and all the objects you create end up on the OSD with least
free space.


This is why you need a decent number of PGs, so that you do not run into 
statistical edge cases.


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Understanding subvolumes

2024-02-02 Thread Robert Sander

On 01.02.24 00:20, Matthew Melendy wrote:
In our department we're getting starting with Ceph 'reef', using Ceph 
FUSE client for our Ubuntu workstations.


So far so good, except I can't quite figure out one aspect of subvolumes.


AFAIK subvolumes were introduced to be used with Kubernetes and other 
cloud technologies.


If you run a classical file service on top of CephFS you usually do not 
need subvolumes but can go with normal quotas on directories.
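
Quotas are set as extended attributes on the directories, e.g. (sizes and 
paths are examples):

  setfattr -n ceph.quota.max_bytes -v 500000000000 /mnt/cephfs/groupA   # ~500 GB
  setfattr -n ceph.quota.max_files -v 1000000 /mnt/cephfs/groupA
  getfattr -n ceph.quota.max_bytes /mnt/cephfs/groupA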


Regards
--
Robert Sander
Heinlein Support GmbH
Linux: Akademie - Support - Hosting
http://www.heinlein-support.de

Tel: 030-405051-43
Fax: 030-405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein  -- Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Some questions about cephadm

2024-02-26 Thread Robert Sander

Hi,

On 26.02.24 11:08, wodel youchi wrote:


Then I tried to deploy using this command on the admin node:
cephadm --image 192.168.2.36:4000/ceph/ceph:v17 bootstrap --mon-ip
10.1.0.23 --cluster-network 10.2.0.0/16

After the boot strap I found that it still downloads the images from the
internet, even the ceph image itself, I see two images one from my registry
the second from quay.


To quote the docs: you can run cephadm bootstrap -h to see all of cephadm’s 
available options.

These options are available:

  --registry-url REGISTRY_URL
url for custom registry
  --registry-username REGISTRY_USERNAME
username for custom registry
  --registry-password REGISTRY_PASSWORD
password for custom registry
  --registry-json REGISTRY_JSON
json file with custom registry login info (URL, 
Username, Password)
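
So the bootstrap from above could be repeated along these lines (the 
credentials are placeholders):

  cephadm --image 192.168.2.36:4000/ceph/ceph:v17 bootstrap \
      --mon-ip 10.1.0.23 \
      --cluster-network 10.2.0.0/16 \
      --registry-url 192.168.2.36:4000 \
      --registry-username myuser \
      --registry-password mysecret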

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 220009 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein -- Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Some questions about cephadm

2024-02-26 Thread Robert Sander

Hi,

On 2/26/24 13:22, wodel youchi wrote:


No didn't work, the bootstrap is still downloading the images from quay.


For the image locations of the monitoring stack you have to create an 
initial ceph.conf, as mentioned in the chapter you referred to earlier: 
https://docs.ceph.com/en/reef/cephadm/install/#deployment-in-an-isolated-environment
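
Such an initial config, passed to "cephadm bootstrap --config initial-ceph.conf", 
would contain the image locations of the monitoring stack, e.g. (registry path 
and tags are placeholders to adjust to your mirror):

  [mgr]
  mgr/cephadm/container_image_prometheus = 192.168.2.36:4000/prometheus/prometheus:v2.43.0
  mgr/cephadm/container_image_node_exporter = 192.168.2.36:4000/prometheus/node-exporter:v1.5.0
  mgr/cephadm/container_image_grafana = 192.168.2.36:4000/ceph/ceph-grafana:9.4.7
  mgr/cephadm/container_image_alertmanager = 192.168.2.36:4000/prometheus/alertmanager:v0.25.0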


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Cephadm and Ceph.conf

2024-02-26 Thread Robert Sander

On 2/26/24 14:24, Michael Worsham wrote:

I deployed a Ceph reef cluster using cephadm. When it comes to the ceph.conf 
file, which file should I be editing for making changes to the cluster - the 
one running under the docker container or the local one on the Ceph monitors?


Neither of them. You can adjust settings with "ceph config" or the 
Configuration tab of the Dashboard.
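
For example:

  ceph config set osd osd_max_backfills 2      # set an option for all OSDs
  ceph config get osd osd_max_backfills
  ceph config dump                             # show everything stored centrally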


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Cephadm and Ceph.conf

2024-02-26 Thread Robert Sander

On 2/26/24 15:24, Michael Worsham wrote:

So how would I be able to put in configurations like this into it?

[global]
 fsid = 46620486-b8a6-11ee-bf23-6510c4d9efa7
 mon_host = [v2:10.20.27.10:3300/0,v1:10.20.27.10:6789/0] 
[v2:10.20.27.11:3300/0,v1:10.20.27.11:6789/0]
 osd pool default size = 3
 osd pool default min size = 2
 osd pool default pg num = 256
 osd pool default pgp num = 256
 mon_max_pg_per_osd = 800
 osd max pg per osd hard ratio = 10
 mon allow pool delete = true
 auth cluster required = cephx
 auth service required = cephx
 auth client required = cephx
 ms_mon_client_mode = crc

[client.radosgw.mon1]
 host = ceph-mon1
 log_file = /var/log/ceph/client.radosgw.mon1.log
 rgw_dns_name = ceph-mon1
 rgw_frontends = "beast port=80 num_threads=500"
 rgw_crypt_require_ssl = false


ceph config assimilate-conf may be of help here.
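
A short sketch of that route:

  # feed the existing file into the central config database
  ceph config assimilate-conf -i /etc/ceph/ceph.conf
  ceph config dump        # review which options were taken over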

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Upgraded 16.2.14 to 16.2.15

2024-03-05 Thread Robert Sander

Hi,

On 3/5/24 08:57, Eugen Block wrote:


extra_entrypoint_args:
   - 
'--mon-rocksdb-options=write_buffer_size=33554432,compression=kLZ4Compression,level_compaction_dynamic_level_bytes=true,bottommost_compression=kLZ4HCCompression,max_background_jobs=4,max_subcompactions=2'


When I try this on my test cluster with Reef 18.2.1 the orchestrator tells me:

# ceph orch apply -i mon.yml
Error EINVAL: ServiceSpec: __init__() got an unexpected keyword argument 
'extra_entrypoint_args'

It's a documented feature:

https://docs.ceph.com/en/reef/cephadm/services/#cephadm-extra-entrypoint-args

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

