[ceph-users] Re: Create OSDs MANUALLY

2023-08-22 Thread Anh Phan Tuan
You don't need to create OSDs manually to get what you want.
Cephadm has two options to control that in the OSD specification.
OSD Service — Ceph Documentation

block_db_size: Union[int, str, None]
    Set (or override) the “bluestore_block_db_size” value, in bytes.

db_slots
    How many OSDs per DB device.

You can use these two options to control the size allocation on the SSD.
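For illustration, an OSD spec using them could look roughly like this (service name, device filters and sizes are only examples, not taken from your setup), applied with "ceph orch apply -i osd-spec.yaml":

service_type: osd
service_id: hdd_with_ssd_db
placement:
  host_pattern: '*'
spec:
  data_devices:
    rotational: 1
  db_devices:
    rotational: 0
  block_db_size: 120G   # fixed DB size per OSD instead of consuming the whole SSD
  db_slots: 4           # at most 4 DB slots carved out of each SSD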

regards,
Anh Phan

On Tue, Aug 22, 2023 at 11:56 PM Alfredo Rezinovsky 
wrote:

> SSD drives work awfully when full.
> Even if I set the DB to SSD for 4 OSDs and there are only 2, the dashboard daemon
> allocates the whole SSD.
>
> I want to partition only 70% of the SSD for DB/WAL and leave the rest for
> SSD manoeuvring.
>
> Is there a way to create an OSD by manually specifying the disks or partitions to use
> for data and DB (like the way I used to do it with ceph-deploy)?
>
>
> --
> Alfrenovsky


[ceph-users] ceph osd error log

2023-08-22 Thread Peter
Hi Ceph community,


My cluster is logging a lot of errors from ceph-osd. I am
encountering the following error message in the logs:

Aug 22 00:01:28 host008 ceph-osd[3877022]: 2023-08-22T00:01:28.347-0700 
7fef85251700 -1 Fail to open '/proc/3850681/cmdline' error = (2) No such file 
or directory


My cluster is healthy, and I am looking to gain a better understanding
of this error and its implications for the system's functioning, in order to
avoid potential issues in the future.


root@ceph001:~# ceph -v
ceph version 16.2.13 (b81a1d7f978c8d41cf452da7af14e190542d2ee2) pacific (stable)





[ceph-users] Create OSDs MANUALLY

2023-08-22 Thread Alfredo Rezinovsky
SSD drives work awfully when full.
Even if I set the DB to SSD for 4 OSDs and there are only 2, the dashboard daemon
allocates the whole SSD.

I want to partition only 70% of the SSD for DB/WAL and leave the rest for
SSD manoeuvring.

Is there a way to create an OSD by manually specifying the disks or partitions to use
for data and DB (like the way I used to do it with ceph-deploy)?


-- 
Alfrenovsky


[ceph-users] Client failing to respond to capability release

2023-08-22 Thread Frank Schilder
Hi all,

I have had this warning the whole day already (latest Octopus cluster):

HEALTH_WARN 4 clients failing to respond to capability release; 1 pgs not 
deep-scrubbed in time
[WRN] MDS_CLIENT_LATE_RELEASE: 4 clients failing to respond to capability 
release
mds.ceph-24(mds.1): Client sn352.hpc.ait.dtu.dk:con-fs2-hpc failing to 
respond to capability release client_id: 145698301
mds.ceph-24(mds.1): Client sn463.hpc.ait.dtu.dk:con-fs2-hpc failing to 
respond to capability release client_id: 189511877
mds.ceph-24(mds.1): Client sn350.hpc.ait.dtu.dk:con-fs2-hpc failing to 
respond to capability release client_id: 189511887
mds.ceph-24(mds.1): Client sn403.hpc.ait.dtu.dk:con-fs2-hpc failing to 
respond to capability release client_id: 231250695

If I look at the session info from mds.1 for these clients I see this:

# ceph tell mds.1 session ls | jq -c '[.[] | {id: .id, h: 
.client_metadata.hostname, addr: .inst, fs: .client_metadata.root, caps: 
.num_caps, req: .request_load_avg}]|sort_by(.caps)|.[]' | grep -e 145698301 -e 
189511877 -e 189511887 -e 231250695
{"id":189511887,"h":"sn350.hpc.ait.dtu.dk","addr":"client.189511887 
v1:192.168.57.221:0/4262844211","fs":"/hpc/groups","caps":2,"req":0}
{"id":231250695,"h":"sn403.hpc.ait.dtu.dk","addr":"client.231250695 
v1:192.168.58.18:0/1334540218","fs":"/hpc/groups","caps":3,"req":0}
{"id":189511877,"h":"sn463.hpc.ait.dtu.dk","addr":"client.189511877 
v1:192.168.58.78:0/3535879569","fs":"/hpc/groups","caps":4,"req":0}
{"id":145698301,"h":"sn352.hpc.ait.dtu.dk","addr":"client.145698301 
v1:192.168.57.223:0/2146607320","fs":"/hpc/groups","caps":7,"req":0}

We have mds_min_caps_per_client=4096, so it looks like the limit is well 
satisfied. Also, the file system is pretty idle at the moment.
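(For completeness: the effective limit can be double-checked with "ceph config get mds mds_min_caps_per_client".)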

Why and what exactly is the MDS complaining about here?

Thanks and best regards.
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


[ceph-users] Re: snaptrim number of objects

2023-08-22 Thread Mark Nelson

On 8/21/23 17:38, Angelo Höngens wrote:


On 21/08/2023 16:47, Manuel Lausch wrote:

Hello,

on my testcluster I played a bit with ceph quincy (17.2.6).
I also see slow ops while deleting snapshots. With the previous major
(pacific) this wasn't an issue.
In my case this is related to the new mclock scheduler, which is the
default in quincy. With "ceph config set global osd_op_queue wpq" the
issue is gone (after restarting the OSDs of course). wpq was the
previous default scheduler.

Maybe this will help you.

On the other hand, mclock shouldn't break down the cluster in this way.
At least not with "high_client_ops" which I used. Maybe someone should
have a look at this.


Manuel

Hey Manuel,

You made me a happy man (for now!)

In short: wpq indeed seems to do way better in my setup.

We did a lot of tuning with the mclock scheduler, tuned
osd_mclock_max_capacity_iops_hdd, tried a lot of different settings
for osd_snap_trim_sleep_hdd/ssd, etc., but it did not yet have the
desired effect. The only thing that prevented my cluster from going
down was setting osd_max_trimming_pgs to 0 on all disks, and setting it
to 1 or 2 for a few OSDs at a time. As soon as I enabled too many
OSDs, everything would bog down: slow ops everywhere, hanging cephfs
clients, etc. I think I could do a max of 100 objects/sec
snaptrimming.
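For anyone wanting to reproduce that throttling approach, it boils down to roughly this (the OSD IDs are just examples):

ceph config set osd osd_max_trimming_pgs 0      # stop snaptrim everywhere by default
ceph config set osd.12 osd_max_trimming_pgs 1   # then let a few OSDs trim at a time
ceph config set osd.37 osd_max_trimming_pgs 2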

I also played around with the different mclock profiles to speed up
recovery. I think with high_client_ops, we got 400-600MB/s client io,
and about 50MB/s recovery io (we had a few degraded objects and some
rebalancing). With the high_recovery_ops profile I think I was able to
get around 400-500MB/s client write, and 300MB/s recovery. As soon as
I enabled snaptrimming, stuff would get quite a bit slower.

At your suggestion I just changed the osd_op_queue to wpq, removed
almost all other osd config variables and restarted all osd's.

Now I see 400-600MB/s client i/o (normal) AND I see recovery at
1500MB/s(!) AND it's also snaptrimming at 250 objects/sec. And I
haven't seen the first slow op warning yet!

I'm still cautious, but for now, this looks very positive!

This also leads me to agree with you there's 'something wrong' with
the mclock scheduler. I was almost starting to suspect hardware issues
or something like that, I was at my wit's end.


Angelo.



If you have the inclination, I would also be very curious if enabling 
this helps:


"rocksdb_cf_compact_on_deletion"


This is a new feature we added in reef and backported to quincy/pacific 
in a disabled state that issues compaction if too many tombstones are 
encountered during iteration in RocksDB.  You can control how quickly to 
issue compactions using:



"rocksdb_cf_compact_on_deletion_trigger" (default is 16384, shrink to 
increase compaction frequency)


"rocksdb_cf_compact_on_deletion_sliding_window" (default is 32768, grow 
to increase compaction frequency)



The combination of these two parameters dictates how many X tombstones 
you must encounter over Y keys before triggering a compaction.  The 
default is pretty conservative so you may need to play with it if you 
are hitting too many tombstones.  If compactions are triggered too 
frequently you can increase the number of X allowed tombstones per Y keys.
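A minimal sketch of turning it on (values purely illustrative; the OSDs will likely need a restart for the RocksDB settings to take effect):

ceph config set osd rocksdb_cf_compact_on_deletion true
ceph config set osd rocksdb_cf_compact_on_deletion_trigger 8192         # lower than the 16384 default = compact sooner
ceph config set osd rocksdb_cf_compact_on_deletion_sliding_window 32768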


Mark




On Mon, Aug 21, 2023 at 4:49 PM Manuel Lausch  wrote:

Hello,

on my testcluster I played a bit with ceph quincy (17.2.6).
I also see slow ops while deleting snapshots. With the previous major
(pacific) this wasn't an issue.
In my case this is related to the new mclock scheduler, which is the
default in quincy. With "ceph config set global osd_op_queue wpq" the
issue is gone (after restarting the OSDs of course). wpq was the
previous default scheduler.

Maybe this will help you.

On the other hand, mclock shouldn't break down the cluster in this way.
At least not with "high_client_ops" which I used. Maybe someone should
have a look at this.


Manuel



On Fri, 4 Aug 2023 17:40:42 -0400
Angelo Höngens  wrote:


Hey guys,

I'm trying to figure out what's happening to my backup cluster that
often grinds to a halt when cephfs automatically removes snapshots.
Almost all OSD's go to 100% CPU, ceph complains about slow ops, and
CephFS stops doing client i/o.

I'm graphing the cumulative value of the snaptrimq_len value, and that
slowly decreases over time. One night it takes an hour, but other
days, like today, my cluster has been down for almost 20 hours, and I
think we're halfway through. The funny thing is that in both cases, the
snaptrimq_len value initially goes to the same value, around 3000, and
then slowly decreases, but my guess is that the number of objects that
need to be trimmed varies hugely every day.

Is there a way to show the size of cephfs snapshots, or get the number
of objects or bytes that need snaptrimming? Perhaps I can graph that
and see where the differences are.
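For reference, one rough way to get that cumulative value is a sketch like

ceph pg dump --format json 2>/dev/null | jq '[.pg_map.pg_stats[].snap_trimq_len] | add'

with the caveat that the exact JSON field name may differ between releases.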

That won't explain why my cluster bogs down, but at least it gives
some visibility. Running 17.2.6 everywhere by the way.

Angelo.

[ceph-users] Re: [quincy] Migrating ceph cluster to new network, bind OSDs to multiple public_network

2023-08-22 Thread Konstantin Shalygin
Hi,

This is how OSDs work. To change the network subnet you need to set up
reachability of both the old and the new network until the end of the migration.

k
Sent from my iPhone

> On 22 Aug 2023, at 10:43, Boris Behrens  wrote:
> 
> The OSDs are still only bound to one IP address.


[ceph-users] Windows 2016 RBD Driver install failure

2023-08-22 Thread Robert Ford
Hello,

We have been running into an issue installing the pacific windows rbd
driver on windows 2016. It has no issues with either 2019 or 2022. It
looks like it fails at checkpoint creation. We are installing it as
admin. Has anyone seen this before, or does anyone know of a solution?

The closest thing I can find as to why it won't install:

   *** Product: D:\software\ceph_pacific_beta.msi
   *** Action: INSTALL
   *** CommandLine: **
MSI (s) (CC:24) [12:31:30:315]: Machine policy value
'DisableUserInstalls' is 0
MSI (s) (CC:24) [12:31:30:315]: Note: 1: 2203 2:
C:\windows\Installer\inprogressinstallinfo.ipi 3: -2147287038 
MSI (s) (CC:24) [12:31:30:315]: Machine policy value
'LimitSystemRestoreCheckpointing' is 0
MSI (s) (CC:24) [12:31:30:315]: Note: 1: 1715 2: Ceph for Windows 
MSI (s) (CC:24) [12:31:30:315]: Calling SRSetRestorePoint API.
dwRestorePtType: 0, dwEventType: 102, llSequenceNumber: 0,
szDescription: "Installed Ceph for Windows".
MSI (s) (CC:24) [12:31:30:315]: The call to SRSetRestorePoint API
failed. Returned status: 0. GetLastError() returned: 127
  
--


Robert Ford
GoDaddy | SRE III
9519020587
Phoenix, AZ
rf...@godaddy.com


[ceph-users] Re: radosgw-admin sync error trim seems to do nothing

2023-08-22 Thread Matthew Darwin

Thanks Rich,

On quincy it seems that providing an end-date is an error. Any other
ideas from anyone?


$ radosgw-admin sync error trim --end-date="2023-08-20 23:00:00"
end-date not allowed.

On 2023-08-20 19:00, Richard Bade wrote:

Hi Matthew,
At least for nautilus (14.2.22) I have discovered through trial and
error that you need to specify a beginning or end date. Something like
this:
radosgw-admin sync error trim --end-date="2023-08-20 23:00:00"
--rgw-zone={your_zone_name}

I specify the zone as there's an error list for each zone.
Hopefully that helps.

Rich

--

Date: Sat, 19 Aug 2023 12:48:55 -0400
From: Matthew Darwin 
Subject: [ceph-users] radosgw-admin sync error trim seems to do
   nothing
To: Ceph Users 
Message-ID: <95e7edfd-ca29-fc0e-a30a-987f1c43e...@mdarwin.ca>
Content-Type: text/plain; charset=UTF-8; format=flowed

Hello all,

"radosgw-admin sync error list" returns errors from 2022.  I want to
clear those out.

I tried "radosgw-admin sync error trim" but it seems to do nothing.
The man page seems to offer no suggestions
https://protect-au.mimecast.com/s/26o0CzvkGRhLoOXfXjZR3?domain=docs.ceph.com

Any ideas what I need to do to remove old errors? (or at least I want
to see more recent errors)

ceph version 17.2.6 (quincy)

Thanks.


[ceph-users] When to use the auth profiles simple-rados-client and profile simple-rados-client-with-blocklist?

2023-08-22 Thread Christian Rohmann

Hey ceph-users,

1) When configuring Gnocchi to use Ceph storage (see 
https://gnocchi.osci.io/install.html#ceph-requirements)

I was wondering if one could use any of the auth profiles like
 * simple-rados-client
 * simple-rados-client-with-blocklist ?

Or are those for different use cases?

2) I was also wondering why the documentation mentions "(Monitor only)" 
but then it says

"Gives a user read-only permissions for monitor, OSD, and PG data."?

3) And are those profiles really for "read-only" users? Why don't they 
have "read-only" in their name like the rbd and the corresponding 
"rbd-read-only" profile?



Regards


Christian




[ceph-users] Re: [quincy] Migrating ceph cluster to new network, bind OSDs to multiple public_network

2023-08-22 Thread Boris Behrens
Yes, I did change the mon_host config in ceph.conf.
Then I restarted all ceph services with systemctl restart ceph.target.

After the restart nothing changed. I then rebooted the host, and now all
OSDs are attached to the new network.

I thought that OSDs could attach to different networks, or even to ALL IP
addresses.
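(For reference, the addresses an OSD has registered can be checked with e.g. "ceph osd dump | grep '^osd.0 '".)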

Am Di., 22. Aug. 2023 um 09:53 Uhr schrieb Eugen Block :

> Can you add some more details? Did you change the mon_host in
> ceph.conf and then rebooted? So the OSDs do work correctly now within
> the new network? OSDs do only bind to one public and one cluster IP,
> I'm not aware of a way to have them bind to multiple public IPs like
> the MONs can. You'll probably need to route the compute node traffic
> towards the new network. Please correct me if I misunderstood your
> response.
>
> Zitat von Boris Behrens :
>
> > The OSDs are still only bound to one IP address.
> > After a reboot, the OSDs switched to the new address and are now
> > unreachable from the compute nodes.
> >
> >
> >
> > Am Di., 22. Aug. 2023 um 09:17 Uhr schrieb Eugen Block :
> >
> >> You'll need to update the mon_host line as well. Not sure if it makes
> >> sense to have both old and new network in there, but I'd try on one
> >> host first and see if it works.
> >>
> >> Zitat von Boris Behrens :
> >>
> >> > We're working on the migration to cephadm, but it requires some
> >> > prerequisites that still needs planing.
> >> >
> >> > root@host:~# cat /etc/ceph/ceph.conf ; ceph config dump
> >> > [global]
> >> > fsid = ...
> >> > mon_host = [OLD_NETWORK::10], [OLD_NETWORK::11], [OLD_NETWORK::12]
> >> > #public_network = OLD_NETWORK::/64, NEW_NETWORK::/64
> >> > ms_bind_ipv6 = true
> >> > ms_bind_ipv4 = false
> >> > auth_cluster_required = none
> >> > auth_service_required = none
> >> > auth_client_required = none
> >> >
> >> > [client]
> >> > ms_mon_client_mode = crc
> >> > #debug_rgw = 20
> >> > rgw_frontends = beast endpoint=[OLD_NETWORK::12]:7480
> >> > rgw_region = ...
> >> > rgw_zone = ...
> >> > rgw_thread_pool_size = 512
> >> > rgw_dns_name = ...
> >> > rgw_dns_s3website_name = ...
> >> >
> >> > [mon-new]
> >> > public_addr = fNEW_NETWORK::12
> >> > public_bind_addr = NEW_NETWORK::12
> >> >
> >> > WHO   MASK  LEVEL OPTION
> >> > VALUE
> >> >RO
> >> > global  advanced  auth_client_required
> >> > none
> >> > *
> >> > global  advanced  auth_cluster_required
> >> >  none
> >> > *
> >> > global  advanced  auth_service_required
> >> >  none
> >> > *
> >> > global  advanced  mon_allow_pool_size_one
> >> >  true
> >> > global  advanced  ms_bind_ipv4
> >> > false
> >> > global  advanced  ms_bind_ipv6
> >> > true
> >> > global  advanced  osd_pool_default_pg_autoscale_mode
> >> > warn
> >> > global  advanced  public_network
> >> > OLD_NETWORK::/64, NEW_NETWORK::/64
> >> >   *
> >> > mon advanced
> auth_allow_insecure_global_id_reclaim
> >> >  false
> >> > mon advanced  mon_allow_pool_delete
> >> >  false
> >> > mgr advanced  mgr/balancer/active
> >> >  true
> >> > mgr advanced  mgr/balancer/mode
> >> >  upmap
> >> > mgr advanced  mgr/cephadm/migration_current
> >> 5
> >> >
> >> >  *
> >> > mgr advanced  mgr/orchestrator/orchestrator
> >> >  cephadm
> >> > mgr.0cc47a6df14ebasic container_image
> >> >
> >>
> quay.io/ceph/ceph@sha256:09e527353463993f0441ad3e86be98076c89c34552163e558a8c2f9bfb4a35f5
> >> >  *
> >> > mgr.0cc47aad8ce8basic container_image
> >> >
> >>
> quay.io/ceph/ceph@sha256:09e527353463993f0441ad3e86be98076c89c34552163e558a8c2f9bfb4a35f5
> >> >  *
> >> > osd.0   basic osd_mclock_max_capacity_iops_ssd
> >> > 13295.404086
> >> > osd.1   basic osd_mclock_max_capacity_iops_ssd
> >> > 14952.522452
> >> > osd.2   basic osd_mclock_max_capacity_iops_ssd
> >> > 13584.113025
> >> > osd.3   basic osd_mclock_max_capacity_iops_ssd
> >> > 16421.770356
> >> > osd.4   basic osd_mclock_max_capacity_iops_ssd
> >> > 15209.375302
> >> > osd.5   basic osd_mclock_max_capacity_iops_ssd
> >> > 15333.697366
> >> >
> >> > Am Mo., 21. Aug. 2023 um 14:20 Uhr schrieb Eugen Block  >:
> >> >
> >> >> Hi,
> >> >>
> >> >> > I don't have those configs. The cluster is not maintained via
> cephadm
> >> /
> >> >> > orchestrator.
> >> >>
> >> >> I just assumed that with Quincy it already would be managed by
> >> >> cephadm. So what does the ceph.conf currently look like on an OSD
> host
> >> >> (mask sensitive data)?
> >> >>
> >> >> Zitat von Boris Behrens :
> >> >>
> >> >> > Hey Eugen,
> >> >> > I don't have those configs. The cluster is not maintained via
> cephadm
> 

[ceph-users] CephFS: convert directory into subvolume

2023-08-22 Thread Eugen Block

Hi,

while writing a response to [1] I tried to convert an existing
directory within a single cephfs into a subvolume. According to [2]
that should be possible; I'm just wondering how to confirm that it
actually worked. Setting the xattr works fine, but the directory
just doesn't show up in the subvolume ls command. This is what I tried
(in Reef and Pacific):


# one "regular" subvolume already exists
$ ceph fs subvolume ls cephfs
[
{
"name": "subvol1"
}
]

# mounted / and created new subdir
$ mkdir /mnt/volumes/subvol2
$ setfattr -n ceph.dir.subvolume -v 1 /mnt/volumes/subvol2

# still only one subvolume
$ ceph fs subvolume ls cephfs
[
{
"name": "subvol1"
}
]

I also tried it directly underneath /mnt:

$ mkdir /mnt/subvol2
$ setfattr -n ceph.dir.subvolume -v 1 /mnt/subvol2

But still no subvolume2 available. What am I missing here?
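(Reading the flag back with "getfattr -n ceph.dir.subvolume /mnt/volumes/subvol2" confirms that the xattr itself is set, but that obviously doesn't tell me why 'subvolume ls' ignores it.)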

Thanks
Eugen

[1]  
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/G4ZWGGUPPFQIOVB4SFAIK73H3NLU2WRF/

[2] https://www.spinics.net/lists/ceph-users/msg72341.html



[ceph-users] Re: Path change for CephFS subvolume

2023-08-22 Thread Michal Strnad

Hi Eugen,

thank you for the message. I've already tried the process of converting
a classic directory into a subvolume, but it also didn't appear in the
list of subvolumes. Perhaps it's no longer supported?


Michal


On 8/22/23 12:56, Eugen Block wrote:

Hi,

I don't know if there's a way to change the path (I assume not except 
creating a new path and copy the data), but you could set up a directory 
the "old school" way (mount the root filesystem, create your 
subdirectory tree) and then convert the directory into a subvolume by 
setting the subvolume xattr (according to [1]):


# setfattr -n ceph.dir.subvolume -v 1 my/favorite/dir/is/now/a/subvol1

But I'm not sure if that works as expected, I tried with Pacific and 
Reef but the directory doesn't turn up in 'ceph fs subvolume ls 
'. Shouldn't it? Or am I misunderstanding how it works?


[1] https://www.spinics.net/lists/ceph-users/msg72341.html


Zitat von Michal Strnad :


Hi!

I'm trying to figure out how to specify the path for a CephFS 
subvolume, as it's intended to represent a user's home directory. By 
default, it's located at /volumes/_nogroup/$NAME/$UUID. Is it possible 
to change this path somehow, or is using symbolic links the only option?


Thank you
Michal Strnad





--
Michal Strnad
Oddeleni datovych ulozist
CESNET z.s.p.o.




[ceph-users] Re: EC pool degrades when adding device-class to crush rule

2023-08-22 Thread Lars Fenneberg
Hey Eugen!

Quoting Eugen Block (ebl...@nde.ag):

> >When a client writes an object to the primary OSD, the primary OSD
> >is responsible for writing the replicas to the replica OSDs. After
> >the primary OSD writes the object to storage, the PG will remain
> >in a degraded state until the primary OSD has received an
> >acknowledgement from the replica OSDs that Ceph created the
> >replica objects successfully.
> 
> Applying that to your situation where PGs are moved across nodes
> (well, they're not moved but recreated) it can take quite some time
> until they become fully available, depending on the PG size and the
> number of objects in it. So unless you don't have "inactive PGs"
> you're fine in a degraded state as long as it resolves eventually.
> Having degraded PGs is nothing unusual, e. g. during maintenance
> when a server is rebooted.

Thank you for your reply and reassurance.  I've changed the rule on my
production cluster now and there aren't any degraded PGs at all, only
misplaced objects as I had originally hoped.

The main difference seems to be that Ceph on the production cluster decided
to backfill all of the PGs.  On my test cluster Ceph started recovery
operations instead. I think this might have something to do with the number
of OSDs, which differs between test and production (14 versus 42), or with
the number of objects per PG. Probably the latter.

Kind regards,
LF.
-- 
Lars Fenneberg, l...@elemental.net


[ceph-users] Re: Path change for CephFS subvolume

2023-08-22 Thread Eugen Block

Hi,

I don't know if there's a way to change the path (I assume not except  
creating a new path and copy the data), but you could set up a  
directory the "old school" way (mount the root filesystem, create your  
subdirectory tree) and then convert the directory into a subvolume by  
setting the subvolume xattr (according to [1]):


# setfattr -n ceph.dir.subvolume -v 1 my/favorite/dir/is/now/a/subvol1

But I'm not sure if that works as expected, I tried with Pacific and  
Reef but the directory doesn't turn up in 'ceph fs subvolume ls  
'. Shouldn't it? Or am I misunderstanding how it works?


[1] https://www.spinics.net/lists/ceph-users/msg72341.html


Zitat von Michal Strnad :


Hi!

I'm trying to figure out how to specify the path for a CephFS  
subvolume, as it's intended to represent a user's home directory. By  
default, it's located at /volumes/_nogroup/$NAME/$UUID. Is it  
possible to change this path somehow, or is using symbolic links the  
only option?


Thank you
Michal Strnad





[ceph-users] Path change for CephFS subvolume

2023-08-22 Thread Michal Strnad

Hi!

I'm trying to figure out how to specify the path for a CephFS subvolume, 
as it's intended to represent a user's home directory. By default, it's 
located at /volumes/_nogroup/$NAME/$UUID. Is it possible to change this 
path somehow, or is using symbolic links the only option?
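For clarity, a short sketch of the layout I mean, with an example subvolume name:

$ ceph fs subvolume create cephfs homedir-user1
$ ceph fs subvolume getpath cephfs homedir-user1
/volumes/_nogroup/homedir-user1/<uuid>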


Thank you
Michal Strnad




[ceph-users] Re: [quincy] Migrating ceph cluster to new network, bind OSDs to multiple public_network

2023-08-22 Thread Eugen Block
Can you add some more details? Did you change the mon_host in  
ceph.conf and then rebooted? So the OSDs do work correctly now within  
the new network? OSDs do only bind to one public and one cluster IP,  
I'm not aware of a way to have them bind to multiple public IPs like  
the MONs can. You'll probably need to route the compute node traffic  
towards the new network. Please correct me if I misunderstood your  
response.


Zitat von Boris Behrens :


The OSDs are still only bound to one IP address.
After a reboot, the OSDs switched to the new address and are now
unreachable from the compute nodes.



Am Di., 22. Aug. 2023 um 09:17 Uhr schrieb Eugen Block :


You'll need to update the mon_host line as well. Not sure if it makes
sense to have both old and new network in there, but I'd try on one
host first and see if it works.

Zitat von Boris Behrens :

> We're working on the migration to cephadm, but it requires some
> prerequisites that still needs planing.
>
> root@host:~# cat /etc/ceph/ceph.conf ; ceph config dump
> [global]
> fsid = ...
> mon_host = [OLD_NETWORK::10], [OLD_NETWORK::11], [OLD_NETWORK::12]
> #public_network = OLD_NETWORK::/64, NEW_NETWORK::/64
> ms_bind_ipv6 = true
> ms_bind_ipv4 = false
> auth_cluster_required = none
> auth_service_required = none
> auth_client_required = none
>
> [client]
> ms_mon_client_mode = crc
> #debug_rgw = 20
> rgw_frontends = beast endpoint=[OLD_NETWORK::12]:7480
> rgw_region = ...
> rgw_zone = ...
> rgw_thread_pool_size = 512
> rgw_dns_name = ...
> rgw_dns_s3website_name = ...
>
> [mon-new]
> public_addr = fNEW_NETWORK::12
> public_bind_addr = NEW_NETWORK::12
>
> WHO   MASK  LEVEL OPTION
> VALUE
>RO
> global  advanced  auth_client_required
> none
> *
> global  advanced  auth_cluster_required
>  none
> *
> global  advanced  auth_service_required
>  none
> *
> global  advanced  mon_allow_pool_size_one
>  true
> global  advanced  ms_bind_ipv4
> false
> global  advanced  ms_bind_ipv6
> true
> global  advanced  osd_pool_default_pg_autoscale_mode
> warn
> global  advanced  public_network
> OLD_NETWORK::/64, NEW_NETWORK::/64
>   *
> mon advanced  auth_allow_insecure_global_id_reclaim
>  false
> mon advanced  mon_allow_pool_delete
>  false
> mgr advanced  mgr/balancer/active
>  true
> mgr advanced  mgr/balancer/mode
>  upmap
> mgr advanced  mgr/cephadm/migration_current
5
>
>  *
> mgr advanced  mgr/orchestrator/orchestrator
>  cephadm
> mgr.0cc47a6df14ebasic container_image
>
quay.io/ceph/ceph@sha256:09e527353463993f0441ad3e86be98076c89c34552163e558a8c2f9bfb4a35f5
>  *
> mgr.0cc47aad8ce8basic container_image
>
quay.io/ceph/ceph@sha256:09e527353463993f0441ad3e86be98076c89c34552163e558a8c2f9bfb4a35f5
>  *
> osd.0   basic osd_mclock_max_capacity_iops_ssd
> 13295.404086
> osd.1   basic osd_mclock_max_capacity_iops_ssd
> 14952.522452
> osd.2   basic osd_mclock_max_capacity_iops_ssd
> 13584.113025
> osd.3   basic osd_mclock_max_capacity_iops_ssd
> 16421.770356
> osd.4   basic osd_mclock_max_capacity_iops_ssd
> 15209.375302
> osd.5   basic osd_mclock_max_capacity_iops_ssd
> 15333.697366
>
> Am Mo., 21. Aug. 2023 um 14:20 Uhr schrieb Eugen Block :
>
>> Hi,
>>
>> > I don't have those configs. The cluster is not maintained via cephadm
/
>> > orchestrator.
>>
>> I just assumed that with Quincy it already would be managed by
>> cephadm. So what does the ceph.conf currently look like on an OSD host
>> (mask sensitive data)?
>>
>> Zitat von Boris Behrens :
>>
>> > Hey Eugen,
>> > I don't have those configs. The cluster is not maintained via cephadm
/
>> > orchestrator.
>> > The ceph.conf does not have IPaddresses configured.
>> > A grep in /var/lib/ceph show only binary matches on the mons
>> >
>> > I've restarted the whole host, which also did not work.
>> >
>> > Am Mo., 21. Aug. 2023 um 13:18 Uhr schrieb Eugen Block :
>> >
>> >> Hi,
>> >>
>> >> there have been a couple of threads wrt network change, simply
>> >> restarting OSDs is not sufficient. I still haven't had to do it
>> >> myself, but did you 'ceph orch reconfig osd' after adding the second
>> >> public network, then restart them? I'm not sure if the orchestrator
>> >> works as expected here, last year there was a thread [1] with the
same
>> >> intention. Can you check the local ceph.conf file
>> >> (/var/lib/ceph///config) of the OSDs (or start with
>> >> one) if it contains both public networks? I (still) expect the
>> >> orchestrator to update that config as well. Maybe it's worth a bug
>> >> report? If there's more to it than 

[ceph-users] Re: Global recovery event but HEALTH_OK

2023-08-22 Thread Eugen Block

Hi,

can you add 'ceph -s' output? Has the recovery finished and if not, do  
you see progress? Has the upgrade finished? You could try a 'ceph mgr  
fail'.



Zitat von Alfredo Daniel Rezinovsky :


I had a lot of movement in my cluster: a broken node, replacement, rebalancing.


Now I'm stuck in the upgrade to 18.2.0 (mgr and mon upgraded) and the
cluster is in "Global Recovery Event".


The health is OK

I don't know how to search for the problem





[ceph-users] Re: [quincy] Migrating ceph cluster to new network, bind OSDs to multiple public_network

2023-08-22 Thread Boris Behrens
The OSDs are still only bound to one IP address.
After a reboot, the OSDs switched to the new address and are now
unreachable from the compute nodes.



Am Di., 22. Aug. 2023 um 09:17 Uhr schrieb Eugen Block :

> You'll need to update the mon_host line as well. Not sure if it makes
> sense to have both old and new network in there, but I'd try on one
> host first and see if it works.
>
> Zitat von Boris Behrens :
>
> > We're working on the migration to cephadm, but it requires some
> > prerequisites that still needs planing.
> >
> > root@host:~# cat /etc/ceph/ceph.conf ; ceph config dump
> > [global]
> > fsid = ...
> > mon_host = [OLD_NETWORK::10], [OLD_NETWORK::11], [OLD_NETWORK::12]
> > #public_network = OLD_NETWORK::/64, NEW_NETWORK::/64
> > ms_bind_ipv6 = true
> > ms_bind_ipv4 = false
> > auth_cluster_required = none
> > auth_service_required = none
> > auth_client_required = none
> >
> > [client]
> > ms_mon_client_mode = crc
> > #debug_rgw = 20
> > rgw_frontends = beast endpoint=[OLD_NETWORK::12]:7480
> > rgw_region = ...
> > rgw_zone = ...
> > rgw_thread_pool_size = 512
> > rgw_dns_name = ...
> > rgw_dns_s3website_name = ...
> >
> > [mon-new]
> > public_addr = fNEW_NETWORK::12
> > public_bind_addr = NEW_NETWORK::12
> >
> > WHO   MASK  LEVEL OPTION
> > VALUE
> >RO
> > global  advanced  auth_client_required
> > none
> > *
> > global  advanced  auth_cluster_required
> >  none
> > *
> > global  advanced  auth_service_required
> >  none
> > *
> > global  advanced  mon_allow_pool_size_one
> >  true
> > global  advanced  ms_bind_ipv4
> > false
> > global  advanced  ms_bind_ipv6
> > true
> > global  advanced  osd_pool_default_pg_autoscale_mode
> > warn
> > global  advanced  public_network
> > OLD_NETWORK::/64, NEW_NETWORK::/64
> >   *
> > mon advanced  auth_allow_insecure_global_id_reclaim
> >  false
> > mon advanced  mon_allow_pool_delete
> >  false
> > mgr advanced  mgr/balancer/active
> >  true
> > mgr advanced  mgr/balancer/mode
> >  upmap
> > mgr advanced  mgr/cephadm/migration_current
> 5
> >
> >  *
> > mgr advanced  mgr/orchestrator/orchestrator
> >  cephadm
> > mgr.0cc47a6df14ebasic container_image
> >
> quay.io/ceph/ceph@sha256:09e527353463993f0441ad3e86be98076c89c34552163e558a8c2f9bfb4a35f5
> >  *
> > mgr.0cc47aad8ce8basic container_image
> >
> quay.io/ceph/ceph@sha256:09e527353463993f0441ad3e86be98076c89c34552163e558a8c2f9bfb4a35f5
> >  *
> > osd.0   basic osd_mclock_max_capacity_iops_ssd
> > 13295.404086
> > osd.1   basic osd_mclock_max_capacity_iops_ssd
> > 14952.522452
> > osd.2   basic osd_mclock_max_capacity_iops_ssd
> > 13584.113025
> > osd.3   basic osd_mclock_max_capacity_iops_ssd
> > 16421.770356
> > osd.4   basic osd_mclock_max_capacity_iops_ssd
> > 15209.375302
> > osd.5   basic osd_mclock_max_capacity_iops_ssd
> > 15333.697366
> >
> > Am Mo., 21. Aug. 2023 um 14:20 Uhr schrieb Eugen Block :
> >
> >> Hi,
> >>
> >> > I don't have those configs. The cluster is not maintained via cephadm
> /
> >> > orchestrator.
> >>
> >> I just assumed that with Quincy it already would be managed by
> >> cephadm. So what does the ceph.conf currently look like on an OSD host
> >> (mask sensitive data)?
> >>
> >> Zitat von Boris Behrens :
> >>
> >> > Hey Eugen,
> >> > I don't have those configs. The cluster is not maintained via cephadm
> /
> >> > orchestrator.
> >> > The ceph.conf does not have IPaddresses configured.
> >> > A grep in /var/lib/ceph show only binary matches on the mons
> >> >
> >> > I've restarted the whole host, which also did not work.
> >> >
> >> > Am Mo., 21. Aug. 2023 um 13:18 Uhr schrieb Eugen Block  >:
> >> >
> >> >> Hi,
> >> >>
> >> >> there have been a couple of threads wrt network change, simply
> >> >> restarting OSDs is not sufficient. I still haven't had to do it
> >> >> myself, but did you 'ceph orch reconfig osd' after adding the second
> >> >> public network, then restart them? I'm not sure if the orchestrator
> >> >> works as expected here, last year there was a thread [1] with the
> same
> >> >> intention. Can you check the local ceph.conf file
> >> >> (/var/lib/ceph///config) of the OSDs (or start with
> >> >> one) if it contains both public networks? I (still) expect the
> >> >> orchestrator to update that config as well. Maybe it's worth a bug
> >> >> report? If there's more to it than just updating the monmap I would
> >> >> like to see that added to the docs since moving monitors to a
> >> >> different network is already documented [2].
> >> >>
> >> >> Regards,
> >> >> Eugen
> >> 

[ceph-users] Re: [quincy] Migrating ceph cluster to new network, bind OSDs to multiple public_network

2023-08-22 Thread Eugen Block
You'll need to update the mon_host line as well. Not sure if it makes  
sense to have both old and new network in there, but I'd try on one  
host first and see if it works.


Zitat von Boris Behrens :


We're working on the migration to cephadm, but it requires some
prerequisites that still need planning.

root@host:~# cat /etc/ceph/ceph.conf ; ceph config dump
[global]
fsid = ...
mon_host = [OLD_NETWORK::10], [OLD_NETWORK::11], [OLD_NETWORK::12]
#public_network = OLD_NETWORK::/64, NEW_NETWORK::/64
ms_bind_ipv6 = true
ms_bind_ipv4 = false
auth_cluster_required = none
auth_service_required = none
auth_client_required = none

[client]
ms_mon_client_mode = crc
#debug_rgw = 20
rgw_frontends = beast endpoint=[OLD_NETWORK::12]:7480
rgw_region = ...
rgw_zone = ...
rgw_thread_pool_size = 512
rgw_dns_name = ...
rgw_dns_s3website_name = ...

[mon-new]
public_addr = fNEW_NETWORK::12
public_bind_addr = NEW_NETWORK::12

WHO   MASK  LEVEL OPTION
VALUE
   RO
global  advanced  auth_client_required
none
*
global  advanced  auth_cluster_required
 none
*
global  advanced  auth_service_required
 none
*
global  advanced  mon_allow_pool_size_one
 true
global  advanced  ms_bind_ipv4
false
global  advanced  ms_bind_ipv6
true
global  advanced  osd_pool_default_pg_autoscale_mode
warn
global  advanced  public_network
OLD_NETWORK::/64, NEW_NETWORK::/64
  *
mon advanced  auth_allow_insecure_global_id_reclaim
 false
mon advanced  mon_allow_pool_delete
 false
mgr advanced  mgr/balancer/active
 true
mgr advanced  mgr/balancer/mode
 upmap
mgr advanced  mgr/cephadm/migration_current  5

 *
mgr advanced  mgr/orchestrator/orchestrator
 cephadm
mgr.0cc47a6df14ebasic container_image
quay.io/ceph/ceph@sha256:09e527353463993f0441ad3e86be98076c89c34552163e558a8c2f9bfb4a35f5
 *
mgr.0cc47aad8ce8basic container_image
quay.io/ceph/ceph@sha256:09e527353463993f0441ad3e86be98076c89c34552163e558a8c2f9bfb4a35f5
 *
osd.0   basic osd_mclock_max_capacity_iops_ssd
13295.404086
osd.1   basic osd_mclock_max_capacity_iops_ssd
14952.522452
osd.2   basic osd_mclock_max_capacity_iops_ssd
13584.113025
osd.3   basic osd_mclock_max_capacity_iops_ssd
16421.770356
osd.4   basic osd_mclock_max_capacity_iops_ssd
15209.375302
osd.5   basic osd_mclock_max_capacity_iops_ssd
15333.697366

Am Mo., 21. Aug. 2023 um 14:20 Uhr schrieb Eugen Block :


Hi,

> I don't have those configs. The cluster is not maintained via cephadm /
> orchestrator.

I just assumed that with Quincy it already would be managed by
cephadm. So what does the ceph.conf currently look like on an OSD host
(mask sensitive data)?

Zitat von Boris Behrens :

> Hey Eugen,
> I don't have those configs. The cluster is not maintained via cephadm /
> orchestrator.
> The ceph.conf does not have IPaddresses configured.
> A grep in /var/lib/ceph show only binary matches on the mons
>
> I've restarted the whole host, which also did not work.
>
> Am Mo., 21. Aug. 2023 um 13:18 Uhr schrieb Eugen Block :
>
>> Hi,
>>
>> there have been a couple of threads wrt network change, simply
>> restarting OSDs is not sufficient. I still haven't had to do it
>> myself, but did you 'ceph orch reconfig osd' after adding the second
>> public network, then restart them? I'm not sure if the orchestrator
>> works as expected here, last year there was a thread [1] with the same
>> intention. Can you check the local ceph.conf file
>> (/var/lib/ceph///config) of the OSDs (or start with
>> one) if it contains both public networks? I (still) expect the
>> orchestrator to update that config as well. Maybe it's worth a bug
>> report? If there's more to it than just updating the monmap I would
>> like to see that added to the docs since moving monitors to a
>> different network is already documented [2].
>>
>> Regards,
>> Eugen
>>
>> [1] https://www.spinics.net/lists/ceph-users/msg75162.html
>> [2]
>>
>>
https://docs.ceph.com/en/quincy/cephadm/services/mon/#moving-monitors-to-a-different-network
>>
>> Zitat von Boris Behrens :
>>
>> > Hi,
>> > I need to migrate a storage cluster to a new network.
>> >
>> > I added the new network to the ceph config via:
>> > ceph config set global public_network "old_network/64, new_network/64"
>> > I've added a set of new mon daemons with IP addresses in the new
network
>> > and they are added to the quorum and seem to work as expected.
>> >
>> > But when I restart the OSD daemons, the do not bind to the new
>> addresses. I
>> > would have expected that the OSDs try to bind to all networks but they
>> are
>> > only bound 

[ceph-users] Upcoming change to fix "ceph config dump" output inconsistency.

2023-08-22 Thread Sridhar Seshasayee
Hello Everyone,

Recently, an issue related to an inconsistency in the output of
the "ceph config dump" command was reported. The inconsistency
is between the normal (non-pretty-print) and pretty-print
outputs. The non-pretty-print output displays the localized
option name whereas the pretty-print output displays the
normalized option name. For example:

Normalized: mgr/dashboard/ssl_server_port
Localized: mgr/dashboard/x/ssl_server_port

The fix ensures that the localized option name is shown in all
cases. The issue is tracked in https://tracker.ceph.com/issues/62379
and the fix is not yet merged.

This is to give a heads-up in case you have any kind of automation
that relies on the pretty-printed output (json, xml). The fix will soon
be made available in the upstream and downstream branches.
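For a quick check of what your automation currently sees, a sketch like the following (assuming jq) lists the option names from the JSON output:

ceph config dump --format json | jq -r '.[].name' | grep '^mgr/'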

If you have any concerns about this change, please let us know.

Thanks,
-Sridhar