[ceph-users] Problems with crash and k8sevents modules

2024-08-01 Thread Massimo Sgaravatto
Dear all


Today I realized that the syslog on the mon/mgr nodes was complaining
about missing credentials for the crash client.

So, following what is reported in:

https://docs.ceph.com/en/quincy/mgr/crash/

I did:

[root@ceph-mon-01 ~]# ceph auth get-or-create client.crash mon 'profile
crash' mgr 'profile crash'
[client.crash]
key = AQAqNKtmny3yNRAALoVRCgPju2Epc+BPfqOSdw==



and on the 3 mon/mgr nodes I created the file
/etc/ceph/ceph.client.crash.keyring

[root@ceph-mon-01 ~]# ll /etc/ceph/ceph.client.crash.keyring
-rw-r--r-- 1 ceph ceph 64 Aug  1 09:21 /etc/ceph/ceph.client.crash.keyring
[root@ceph-mon-01 ~]# cat /etc/ceph/ceph.client.crash.keyring
[client.crash]
key = AQAqNKtmny3yNRAALoVRCgPju2Epc+BPfqOSdw==

But it is still complaining in the syslog [*].
Do I also need to create a client.crash.$hostname key? I have to admit that
the doc is not fully clear to me.
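If that is what is missing, my guess (not verified) is that the per-host key
would be created like this on each node, writing the keyring to the exact
path that ceph-crash is searching for according to the log below:

ceph auth get-or-create client.crash.$(hostname -f) mon 'profile crash' mgr 'profile crash' \
  -o /etc/ceph/ceph.client.crash.$(hostname -f).keyring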

At any rate now ceph shows:

 [root@ceph-mon-01 ~]# ceph health detail
HEALTH_WARN 8 mgr modules have recently crashed
[WRN] RECENT_MGR_MODULE_CRASH: 8 mgr modules have recently crashed
mgr module k8sevents crashed in daemon mgr.ceph-mon-01 on host
ceph-mon-01.cloud.pd.infn.it at 2024-07-23T08:53:52.543451Z
mgr module k8sevents crashed in daemon mgr.ceph-mon-01 on host
ceph-mon-01.cloud.pd.infn.it at 2024-07-23T09:09:17.703402Z
mgr module k8sevents crashed in daemon mgr.ceph-mon-02 on host
ceph-mon-02.cloud.pd.infn.it at 2024-07-22T13:44:18.779525Z
mgr module k8sevents crashed in daemon mgr.ceph-mon-02 on host
ceph-mon-02.cloud.pd.infn.it at 2024-07-22T13:57:24.207309Z
mgr module k8sevents crashed in daemon mgr.ceph-mon-03 on host
ceph-mon-03.cloud.pd.infn.it at 2024-07-22T10:42:42.002880Z
mgr module k8sevents crashed in daemon mgr.ceph-mon-03 on host
ceph-mon-03.cloud.pd.infn.it at 2024-07-22T11:09:21.639576Z
mgr module k8sevents crashed in daemon mgr.ceph-mon-03 on host
ceph-mon-03.cloud.pd.infn.it at 2024-08-01T06:02:01.139099Z
mgr module k8sevents crashed in daemon mgr.ceph-mon-03 on host
ceph-mon-03.cloud.pd.infn.it at 2024-08-01T07:45:16.249664Z


but that module is disabled [**]. So I am a bit confused
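I suppose these are just old crash reports kept by the crash module; if so
(my assumption), archiving them should clear the warning, as long as the
crashes do not recur:

ceph crash ls
ceph crash archive-all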

Thanks, Massimo


[**]

[root@ceph-mon-01 ~]# ceph mgr module ls
MODULE
balancer           on (always on)
crash              on (always on)
devicehealth       on (always on)
orchestrator       on (always on)
pg_autoscaler      on (always on)
progress           on (always on)
rbd_support        on (always on)
status             on (always on)
telemetry          on (always on)
volumes            on (always on)
dashboard          on
restful            on
alerts             -
cephadm            -
influx             -
insights           -
iostat             -
k8sevents          -
localpool          -
mds_autoscaler     -
mirroring          -
nfs                -
osd_perf_query     -
osd_support        -
prometheus         -
rgw                -
rook               -
selftest           -
snap_schedule      -
stats              -
telegraf           -
test_orchestrator  -
zabbix             -
[root@ceph-mon-01 ~]# ceph mgr module disable k8sevents
module 'k8sevents' is already disabled
[root@ceph-mon-01 ~]#





[*]

Aug  1 09:26:39 ceph-mon-01 ceph-crash[1399]: WARNING:ceph-crash:post
/var/lib/ceph/crash/2024-07-23T08:53:52.543451Z_da7be91d-5025-4bd0-b6a9-8fe2a1d3b71d
as client.crash.ceph-mon-01.cloud.pd.infn.it failed:
2024-08-01T09:26:39.599+0200 7f44ba8ae640 -1 auth: unable to find a keyring
on
/etc/ceph/ceph.client.crash.ceph-mon-01.cloud.pd.infn.it.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin:
(2) No such file or directory
Aug  1 09:26:39 ceph-mon-01 ceph-crash[1399]: 2024-08-01T09:26:39.599+0200
7f44ba8ae640 -1 AuthRegistry(0x7f44b4060ac0) no keyring found at
/etc/ceph/ceph.client.crash.ceph-mon-01.cloud.pd.infn.it.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,
disabling cephx
Aug  1 09:26:39 ceph-mon-01 ceph-crash[1399]: 2024-08-01T09:26:39.604+0200
7f44ba8ae640 -1 auth: unable to find a keyring on
/etc/ceph/ceph.client.crash.ceph-mon-01.cloud.pd.infn.it.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin:
(2) No such file or directory
Aug  1 09:26:39 ceph-mon-01 ceph-crash[1399]: 2024-08-01T09:26:39.604+0200
7f44ba8ae640 -1 AuthRegistry(0x7f44b408c3d0) no keyring found at
/etc/ceph/ceph.client.crash.ceph-mon-01.cloud.pd.infn.it.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,
disabling cephx
Aug  1 09:26:39 ceph-mon-01 ceph-crash[1399]: 2024-08-01T09:26:39.604+0200
7f44ba8ae640 -1 auth: unable to find a keyring on
/etc/ceph/ceph.client.crash.ceph-mon-01.cloud.pd.infn.it.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin:
(2) No such file or directory
Aug  1 09:26:39 ceph-mon-01 ceph-crash[1399]: 2024-08-01T09:26:39.604+0200
7f44ba8ae640 -1 AuthRegistry(0x7f44ba8ad0c0) no keyring found at
/etc/ceph/ceph.client.crash.ceph-mon-01.cloud.pd.infn.it.keyring,/etc/ceph/ceph.keyring,/etc/ceph

[ceph-users] Re: Radosgw replicated -> EC pool

2024-01-02 Thread Massimo Sgaravatto
A while ago I moved from a replicated pool to an EC pool using this
procedure (downtime of the service during the data migration was acceptable
in my case):



# Stop rgw on all instances
systemctl stop ceph-radosgw.target

# Create the new EC pool for rgw data
ceph osd pool create cloudprod.rgw.buckets.data.new 32 32 erasure
profile-4-2

# copy data from old pool to the new one
rados cppool cloudprod.rgw.buckets.data cloudprod.rgw.buckets.data.new

# Rename the pools
ceph osd pool rename cloudprod.rgw.buckets.data
cloudprod.rgw.buckets.data.old
ceph osd pool rename cloudprod.rgw.buckets.data.new
cloudprod.rgw.buckets.data

# Application setting
 ceph osd pool application enable cloudprod.rgw.buckets.data rgw

# Delete old pool
ceph tell mon.\* injectargs '--mon-allow-pool-delete=true'
ceph osd pool delete cloudprod.rgw.buckets.data.old
cloudprod.rgw.buckets.data.old --yes-i-really-really-mean-it
ceph tell mon.\* injectargs '--mon-allow-pool-delete=false'

# Restart all rgw instances
systemctl start ceph-radosgw.target
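A sanity check that is probably worth adding before the deletion step (my
addition here, not something I actually ran at the time): compare the object
counts of the two pools, e.g. with

rados df | grep cloudprod.rgw.buckets.data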


Cheers, Massimo


On Tue, Jan 2, 2024 at 6:02 PM Jan Kasprzak  wrote:

> Hello, Ceph users,
>
> what is the best way how to change the storage layout of all buckets
> in radosgw?
>
> I have default.rgw.buckets.data pool as replicated, and I want to use
> an erasure-coded layout instead. One way is to use cache tiering
> as described here:
>
> https://cephnotes.ksperis.com/blog/2015/04/15/ceph-pool-migration/
>
> Could this be done under the running radosgw? If I read this correctly,
> it should be done, because radosgw is just another RADOS client.
>
> Another possible approach would be to create a new erasure-coded pool,
> a new zone placement, and set it as default. But how can I migrate
> the existing data? If I understand it correctly, the default placement
> applies only to new buckets.
>
> Something like this:
>
> ceph osd erasure-code-profile set k5m2 k=5 m=2
> ceph osd pool create default.rgw.buckets.ecdata erasure k5m2
> ceph osd pool application enable default.rgw.buckets.ecdata radosgw
>
> radosgw-admin zonegroup placement add --rgw-zonegroup default
> --placement-id ecdata-placement
> radosgw-admin zone placement add --rgw-zone default --placement-id
> ecdata-placement --data-pool default.rgw.buckets.ecdata --index-pool
> default.rgw.buckets.index --data-extra-pool default.rgw.buckets.non-ec
> radosgw-admin zonegroup placement default --rgw-zonegroup default
> --placement-id ecdata-placement
>
> How to continue from this point?
>
> And a secondary question: what purpose does the data-extra-pool serve?
>
> Thanks!
>
> -Yenya
>
> --
> | Jan "Yenya" Kasprzak 
> |
> | https://www.fi.muni.cz/~kas/    GPG: 4096R/A45477D5
> |
> We all agree on the necessity of compromise. We just can't agree on
> when it's necessary to compromise. --Larry Wall
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Minimum client version for Quincy

2023-03-03 Thread Massimo Sgaravatto
Thanks.
The question is indeed mainly for rbd. In particular, the old clients I was
referring to are a Proxmox cluster and an OpenStack installation.
Regards, Massimo


On Fri, Mar 3, 2023 at 15:53 Daniel Gryniewicz  wrote:

> I can't speak for RBD, but for RGW, as long as you upgrade all the RGWs
> themselves, clients will be fine, since they speak S3 to the RGWs, not
> RADOS.
>
> Daniel
>
> On 3/3/23 04:29, Massimo Sgaravatto wrote:
> > Dear all
> > I am going to update a ceph cluster (where I am using only rbd and rgw,
> > i.e. I didn't deploy cephfs) from Octopus to Quincy
> >
> > Before doing that I would like to understand if some old nautilus clients
> > (that I can't update for several reasons) will still be able to connect
> >
> > In general: I am not able to find this information in the documentation
> of
> > any ceph release
> >
> > Should I refer to get-require-min-compat-client ?
> >
> > Now in my Octopus cluster I see:
> >
> > [root@ceph-mon-01 ~]# ceph osd get-require-min-compat-client
> > luminous
> >
> >
> > but I have the feeling that this value is simply the one I set a while
> ago
> > to support the upmap feature
> >
> > Thanks, Massimo
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> >
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Minimum client version for Quincy

2023-03-03 Thread Massimo Sgaravatto
Dear all
I am going to update a ceph cluster (where I am using only rbd and rgw,
i.e. I didn't deploy cephfs) from Octopus to Quincy

Before doing that I would like to understand if some old nautilus clients
(that I can't update for several reasons) will still be able to connect

In general: I am not able to find this information in the documentation of
any ceph release

Should I refer to get-require-min-compat-client ?

Now in my Octopus cluster I see:

[root@ceph-mon-01 ~]# ceph osd get-require-min-compat-client
luminous


but I have the feeling that this value is simply the one I set a while ago
to support the upmap feature
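As an aside, I guess that "ceph features", which reports the release/feature
bits advertised by the currently connected clients, is a more direct check
(my assumption):

ceph features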

Thanks, Massimo
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Migrate a bucket from replicated pool to ec pool

2023-02-13 Thread Massimo Sgaravatto
Years ago I moved from a replicated pool to an EC pool, but with a downtime
of the service (and at that time I didn't have too much data).

Basically, after having stopped the radosgw services, I created the new pool
and moved the data from the old replicated pool to the new EC one:

ceph osd pool create cloudprod.rgw.buckets.data.new 32 32 erasure
profile-4-2
rados cppool cloudprod.rgw.buckets.data cloudprod.rgw.buckets.data.new
ceph osd pool rename cloudprod.rgw.buckets.data
cloudprod.rgw.buckets.data.old
ceph osd pool rename cloudprod.rgw.buckets.data.new
cloudprod.rgw.buckets.data
ceph osd pool application enable cloudprod.rgw.buckets.data rgw

Eventually I deleted the replicated pool:

ceph tell mon.\* injectargs '--mon-allow-pool-delete=true'
ceph osd pool delete cloudprod.rgw.buckets.data.old
cloudprod.rgw.buckets.data.old --yes-i-really-really-mean-it
ceph tell mon.\* injectargs '--mon-allow-pool-delete=false'

On Mon, Feb 13, 2023 at 4:43 PM Casey Bodley  wrote:

> On Mon, Feb 13, 2023 at 4:31 AM Boris Behrens  wrote:
> >
> > Hi Casey,
> >
> >> changes to the user's default placement target/storage class don't
> >> apply to existing buckets, only newly-created ones. a bucket's default
> >> placement target/storage class can't be changed after creation
> >
> >
> > so I can easily update the placement rules for this user and can migrate
> existing buckets one at a time. Very cool. Thanks
> >
> >>
> >> you might add the EC pool as a new storage class in the existing
> >> placement target, and use lifecycle transitions to move the objects.
> >> but the bucket's default storage class would still be replicated, so
> >> new uploads would go there unless the client adds a
> >> x-amz-storage-class header to override it. if you want to change those
> >> defaults, you'd need to create a new bucket and copy the objects over
> >
> >
> > Can you link me to documentation. It might be the monday, but I do not
> understand that totally.
>
> https://docs.ceph.com/en/octopus/radosgw/placement/#adding-a-storage-class
> should cover the addition of a new storage class for your EC pool
>
> >
> > Do you know how much more CPU/RAM EC takes, and when (putting, reading,
> deleting objects, recovering OSD failure)?
>
> i don't have any data on that myself. maybe others on the list can share
> theirs?
>
> >
> >
> > --
> > The self-help group "UTF-8 problems" is meeting in the large hall this
> > time, as an exception.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Problems with autoscaler (overlapping roots) after changing the pool class

2023-01-25 Thread Massimo Sgaravatto
I tried the following on a small testbed first:

ceph osd erasure-code-profile set profile-4-2-hdd k=4 m=2
crush-failure-domain=host crush-device-class=hdd
ceph osd crush rule create-erasure ecrule-4-2-hdd profile-4-2-hdd
ceph osd pool set ecpool-4-2 crush_rule ecrule-4-2-hdd

and indeed after having applied this change for all the EC pools, the
autoscaler doesn't complain anymore
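For completeness, a quick way to double-check (just the obvious commands,
nothing from the original exchange):

ceph osd pool get ecpool-4-2 crush_rule
ceph osd pool autoscale-status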

Thanks a lot !

Cheers, Massimo

On Tue, Jan 24, 2023 at 7:02 PM Eugen Block  wrote:

> Hi,
>
> what you can’t change with EC pools is the EC profile, the pool‘s
> ruleset you can change. The fix is the same as for the replicated
> pools: assign a ruleset with the hdd class, and after some data movement
> the autoscaler should not complain anymore.
>
> Regards
> Eugen
>
> Zitat von Massimo Sgaravatto :
>
> > Dear all
> >
> > I have just changed the crush rule for all the replicated pools in the
> > following way:
> >
> > ceph osd crush rule create-replicated replicated_hdd default host hdd
> > ceph osd pool set   crush_rule replicated_hdd
> >
> > See also this [*] thread
> > Before applying this change, these pools were all using
> > the replicated_ruleset rule where the class is not specified.
> >
> >
> >
> > I am noticing now a problem with the autoscaler: "ceph osd pool
> > autoscale-status" doesn't report any output and the mgr log complains
> about
> > overlapping roots:
> >
> >  [pg_autoscaler ERROR root] pool xyz has overlapping roots: {-18, -1}
> >
> >
> > Indeed:
> >
> > # ceph osd crush tree --show-shadow
> > ID   CLASS  WEIGHT  TYPE NAME
> > -18hdd  1329.26501  root default~hdd
> > -17hdd   329.14154  rack Rack11-PianoAlto~hdd
> > -15hdd54.56085  host ceph-osd-04~hdd
> >  30hdd 5.45609  osd.30
> >  31hdd 5.45609  osd.31
> > ...
> > ...
> >  -1 1329.26501  root default
> >  -7  329.14154  rack Rack11-PianoAlto
> >  -8   54.56085  host ceph-osd-04
> >  30hdd 5.45609  osd.30
> >  31hdd 5.45609  osd.31
> > ...
> >
> > I have already read about this behavior but  I have no clear ideas how to
> > fix the problem.
> >
> > I read somewhere that the problem happens when there are rules that force
> > some pools to only use one class and there are also pools which does not
> > make any distinction between device classes
> >
> >
> > All the replicated pools are using the replicated_hdd pool but I also
> have
> > some EC pools which are using a profile where the class is not specified.
> > As far I understand, I can't force these pools to use only the hdd class:
> > according to the doc I can't change this profile specifying the hdd class
> > (or at least the change wouldn't be applied to the existing EC pools)
> >
> > Any suggestions ?
> >
> > The crush map is available at https://cernbox.cern.ch/s/gIyjbQbmoTFHCrr,
> if
> > you want to have a look
> >
> > Many thanks, Massimo
> >
> > [*] https://www.mail-archive.com/ceph-users@ceph.io/msg18534.html
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Problems with autoscaler (overlapping roots) after changing the pool class

2023-01-24 Thread Massimo Sgaravatto
Dear all

I have just changed the crush rule for all the replicated pools in the
following way:

ceph osd crush rule create-replicated replicated_hdd default host hdd
ceph osd pool set   crush_rule replicated_hdd

See also this [*] thread
Before applying this change, these pools were all using
the replicated_ruleset rule where the class is not specified.



I am noticing now a problem with the autoscaler: "ceph osd pool
autoscale-status" doesn't report any output and the mgr log complains about
overlapping roots:

 [pg_autoscaler ERROR root] pool xyz has overlapping roots: {-18, -1}


Indeed:

# ceph osd crush tree --show-shadow
ID   CLASS  WEIGHT  TYPE NAME
-18hdd  1329.26501  root default~hdd
-17hdd   329.14154  rack Rack11-PianoAlto~hdd
-15hdd54.56085  host ceph-osd-04~hdd
 30hdd 5.45609  osd.30
 31hdd 5.45609  osd.31
...
...
 -1 1329.26501  root default
 -7  329.14154  rack Rack11-PianoAlto
 -8   54.56085  host ceph-osd-04
 30hdd 5.45609  osd.30
 31hdd 5.45609  osd.31
...

I have already read about this behavior, but I have no clear idea of how to
fix the problem.

I read somewhere that the problem happens when there are rules that force
some pools to use only one class and there are also pools which do not
make any distinction between device classes.


All the replicated pools are using the replicated_hdd rule, but I also have
some EC pools which are using a profile where the class is not specified.
As far as I understand, I can't force these pools to use only the hdd class:
according to the doc I can't change this profile to specify the hdd class
(or at least the change wouldn't be applied to the existing EC pools)

Any suggestions ?

The crush map is available at https://cernbox.cern.ch/s/gIyjbQbmoTFHCrr, if
you want to have a look

Many thanks, Massimo

[*] https://www.mail-archive.com/ceph-users@ceph.io/msg18534.html
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Pools and classes

2023-01-23 Thread Massimo Sgaravatto
Thanks a lot
Cheers, Massimo

On Mon, Jan 23, 2023 at 9:55 AM Robert Sander 
wrote:

> Am 23.01.23 um 09:44 schrieb Massimo Sgaravatto:
>
> >> This triggered the remapping of some pgs and therefore some data
> movement.
> >> Is this normal/expected, since for the time being I have only hdd osds ?
>
> This is expected behaviour as the cluster map has changed. Internally
> the device classes are represented through "shadow" trees of the cluster
> topology.
>
> Regards
> --
> Robert Sander
> Heinlein Consulting GmbH
> Schwedter Str. 8/9b, 10119 Berlin
>
> http://www.heinlein-support.de
>
> Tel: 030 / 405051-43
> Fax: 030 / 405051-19
>
> Zwangsangaben lt. §35a GmbHG:
> HRB 220009 B / Amtsgericht Berlin-Charlottenburg,
> Geschäftsführer: Peer Heinlein -- Sitz: Berlin
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Pools and classes

2023-01-23 Thread Massimo Sgaravatto
Any feedback ? I would just like to be sure that I am using the right
procedure ...

Thanks, Massimo

On Fri, Jan 20, 2023 at 11:28 AM Massimo Sgaravatto <
massimo.sgarava...@gmail.com> wrote:

> Dear all
>
> I have a ceph cluster where so far all OSDs have been rotational hdd disks
> (actually there are some SSDs, used only for block.db and wal.db)
>
> I now want to add some SSD disks to be used as OSD. My use case is:
>
> 1) for the existing pools keep using only hdd disks
> 2) create some new pools using only sdd disks
>
>
> Let's start with 1 (I didn't have added yet the ssd disks in the cluster)
>
> I have some replicated pools and some ec pools. The replicated pools are
> using a replicated_ruleset rule [*].
> I created a new "replicated_hdd" rule [**] using the command:
>
> ceph osd crush rule create-replicated replicated_hdd default host hdd
>
> I then changed the crush rule of a existing pool (that was using
> 'replicated_ruleset') using the command:
>
>
> ceph osd pool set   crush_rule replicated_hdd
>
> This triggered the remapping of some pgs and therefore some data movement.
> Is this normal/expected, since for the time being I have only hdd osds ?
>
> Thanks, Massimo
>
>
>
> [*]
> rule replicated_ruleset {
> id 0
> type replicated
> min_size 1
> max_size 10
> step take default
> step chooseleaf firstn 0 type host
> step emit
> }
>
> [**]
> rule replicated_hdd {
> id 7
> type replicated
> min_size 1
> max_size 10
> step take default class hdd
> step chooseleaf firstn 0 type host
> step emit
> }
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Pools and classes

2023-01-20 Thread Massimo Sgaravatto
Dear all

I have a ceph cluster where so far all OSDs have been rotational hdd disks
(actually there are some SSDs, used only for block.db and wal.db)

I now want to add some SSD disks to be used as OSD. My use case is:

1) for the existing pools keep using only hdd disks
2) create some new pools using only ssd disks (a tentative sketch for this
is at the end of this message)


Let's start with 1 (I haven't added the ssd disks to the cluster yet)

I have some replicated pools and some ec pools. The replicated pools are
using a replicated_ruleset rule [*].
I created a new "replicated_hdd" rule [**] using the command:

ceph osd crush rule create-replicated replicated_hdd default host hdd

I then changed the crush rule of an existing pool (that was using
'replicated_ruleset') using the command:


ceph osd pool set   crush_rule replicated_hdd

This triggered the remapping of some pgs and therefore some data movement.
Is this normal/expected, since for the time being I have only hdd osds ?
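A way to see what actually changed (my guess at the explanation) is to look
at the shadow hierarchy that the device classes create, since the new rule
now walks the "default~hdd" shadow tree instead of the plain "default" root:

ceph osd crush tree --show-shadow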

Thanks, Massimo



[*]
rule replicated_ruleset {
id 0
type replicated
min_size 1
max_size 10
step take default
step chooseleaf firstn 0 type host
step emit
}

[**]
rule replicated_hdd {
id 7
type replicated
min_size 1
max_size 10
step take default class hdd
step chooseleaf firstn 0 type host
step emit
}
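For point 2 (the SSD-only pools), once the SSD OSDs are in place, I assume
the symmetric approach applies; a tentative sketch (pool name and pg numbers
are placeholders):

ceph osd crush rule create-replicated replicated_ssd default host ssd
ceph osd pool create <new-pool> <pg_num> <pg_num> replicated replicated_ssd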
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph upgrade advice - Luminous to Pacific with OS upgrade

2022-12-06 Thread Massimo Sgaravatto
If it can help: I have recently updated my ceph cluster (composed of 3
mon-mgr nodes and n OSD nodes) from Nautilus on CentOS 7 to Pacific on
CentOS 8 Stream.

First I reinstalled the mon-mgr nodes with CentOS 8 Stream (removing them
from the cluster and then re-adding them with the new operating system).
This was needed because the mgr on Octopus runs only on RHEL 8 and its forks.

Then I migrated the cluster to Octopus (so mon-mgr nodes running CentOS 8
Stream and OSD nodes running CentOS 7).

Then I reinstalled each OSD node with CentOS 8 Stream, without draining the
node [*]

Then I migrated the cluster from Octopus to Pacific

[*]
ceph osd set noout
Reinstallation of the node with the C8stream
Installation of ceph
ceph-volume lvm activate --all
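# my addition, not part of the original recipe: once the node's OSDs are back up
ceph osd unset noout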


Cheers, Massimo


On Tue, Dec 6, 2022 at 3:58 PM David C  wrote:

> Hi All
>
> I'm planning to upgrade a Luminous 12.2.10 cluster to Pacific 16.2.10,
> cluster is primarily used for CephFS, mix of Filestore and Bluestore
> OSDs, mons/osds collocated, running on CentOS 7 nodes
>
> My proposed upgrade path is: Upgrade to Nautilus 14.2.22 -> Upgrade to
> EL8 on the nodes (probably Rocky) -> Upgrade to Pacific
>
> I assume the cleanest way to update the node OS would be to drain the
> node and remove from the cluster, install Rocky 8, add back to cluster
> as effectively a new node
>
> I have a relatively short maintenance window and was hoping to speed
> up OS upgrade with the following approach on each node:
>
> - back up ceph config/systemd files etc.
> - set noout etc.
> - deploy Rocky 8, being careful not to touch OSD block devices
> - install Nautilus binaries (ensuring I use same version as pre OS upgrade)
> - copy ceph config back over
>
> In theory I could then start up the daemons and they wouldn't care
> that we're now running on a different OS
>
> Does anyone see any issues with that approach? I plan to test on a dev
> cluster anyway but would be grateful for any thoughts
>
> Thanks,
> David
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Same location for wal.db and block.db

2022-10-01 Thread Massimo Sgaravatto
Thanks for confirming !
Cheers, Massimo

On Fri, Sep 30, 2022 at 9:19 AM Janne Johansson  wrote:

> > I used to create Bluestore OSDs using commands such as this one:
> >
> > ceph-volume lvm create --bluestore --data ceph-block-50/block-50
> --block.db
> > ceph-db-50-54/db-50
> > with the goal of having  block.db and wal.db co-located on the same LV
> > (ceph-db-50-54/db-5 in my example, which is on a SSD device).
> >
> > Is still still true that, if the wal location is not explicitly
> specified,
> > it goes together with
> > the db, as stated e.g. in:
>
> Yes, that should still be true.
>
> --
> May the most significant bit of your life be positive.
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Same location for wal.db and block.db

2022-09-29 Thread Massimo Sgaravatto
I used to create Bluestore OSDs using commands such as this one:

ceph-volume lvm create --bluestore --data ceph-block-50/block-50 --block.db
ceph-db-50-54/db-50

with the goal of having block.db and wal.db co-located on the same LV
(ceph-db-50-54/db-50 in my example, which is on an SSD device).

Is it still true that, if the wal location is not explicitly specified, it
goes together with the db, as stated e.g. in:

https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/CBMWLN3CJCMR2PGZETPKGZZFZJ4HP6N7/

?

I can't find this information in the latest doc (I am reading
https://docs.ceph.com/en/quincy/ceph-volume/lvm/prepare/#bluestore)
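For what it's worth, on an existing OSD I believe "ceph-volume lvm list"
shows the [block] and [db] devices associated with each OSD (and no separate
[wal] entry when the WAL is co-located with the DB):

ceph-volume lvm list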

Thanks, Massimo
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph on RHEL 9

2022-07-26 Thread Massimo Sgaravatto
Dear all

Are there any updates on this question ?

In particular, since I am planning how to update my infrastructure, I would
be interested to know if there are plans to provide packages for centos9
stream for Pacific and/or Quincy

Thanks, Massimo


On Fri, Jun 10, 2022 at 6:46 PM Gregory Farnum  wrote:

> We aren't building for Centos 9 yet, so I guess the python dependency
> declarations don't work with the versions in that release.
> I've put updating to 9 on the agenda for the next CLT.
>
> (Do note that we don't test upstream packages against RHEL, so if
> Centos Stream does something which doesn't match the RHEL release it
> still might get busted.)
> -Greg
>
> On Thu, Jun 9, 2022 at 6:57 PM Robert W. Eckert 
> wrote:
> >
> > Does anyone have any pointers to install CEPH on Rhel 9?
> >
> > -Original Message-
> > From: Robert W. Eckert 
> > Sent: Saturday, May 28, 2022 8:28 PM
> > To: ceph-users@ceph.io
> > Subject: [ceph-users] Ceph on RHEL 9
> >
> > Hi- I started to update my 3 host cluster to RHEL 9, but came across a
> bit of a stumbling block.
> >
> > The upgrade process uses the RHEL leapp process, which ran through a few
> simple things to clean up, and told me everything was hunky dory, but when
> I kicked off the first server, the server wouldn't boot because I had a
> ceph filesystem mounted in /etc/fstab, commenting it out, let the upgrade
> happen.
> >
> > Then I went to check on the ceph client which appears to be uninstalled.
> >
> > When I tried to install ceph,  I got:
> >
> > [root@story ~]# dnf install ceph
> > Updating Subscription Management repositories.
> > Last metadata expiration check: 0:07:58 ago on Sat 28 May 2022 08:06:52
> PM EDT.
> > Error:
> > Problem: package ceph-2:17.2.0-0.el8.x86_64 requires ceph-mgr =
> 2:17.2.0-0.el8, but none of the providers can be installed
> >   - conflicting requests
> >   - nothing provides libpython3.6m.so.1.0()(64bit) needed by
> ceph-mgr-2:17.2.0-0.el8.x86_64 (try to add '--skip-broken' to skip
> uninstallable packages or '--nobest' to use not only best candidate
> packages)
> >
> > This is the content of my /etc/yum.repos.d/ceph.conf
> >
> > [ceph]
> > name=Ceph packages for $basearch
> > baseurl=https://download.ceph.com/rpm-quincy/el8/$basearch
> > enabled=1
> > priority=2
> > gpgcheck=1
> > gpgkey=https://download.ceph.com/keys/release.asc
> >
> > [ceph-noarch]
> > name=Ceph noarch packages
> > baseurl=https://download.ceph.com/rpm-quincy/el8/noarch
> > enabled=1
> > priority=2
> > gpgcheck=1
> > gpgkey=https://download.ceph.com/keys/release.asc
> >
> > [ceph-source]
> > name=Ceph source packages
> > baseurl=https://download.ceph.com/rpm-quincy/el8/SRPMS
> > enabled=0
> > priority=2
> > gpgcheck=1
> > gpgkey=https://download.ceph.com/keys/release.asc
> > Is there anything I should change for el9 (I don't see el9 rpms out yet).
> >
> > Or should I  wait before updating the other two servers?
> >
> > Thanks,
> > Rob
> >
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an
> email to ceph-users-le...@ceph.io
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> >
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph packages for Rocky Linux

2021-08-23 Thread Massimo Sgaravatto
Isn't Rocky Linux 8 supposed to be binary-compatible with RHEL8 ?

Cheers, Massimo

On Tue, Aug 24, 2021 at 12:08 AM Kyriazis, George 
wrote:

> Hello,
>
> Are there client packages available for Rocky Linux (specifically 8.4) for
> Pacific?  If not, when can we expect them?
>
> I also looked at download.ceph.com and I couldn’t find anything relevant.
> I only saw rh7 and rh8 packages.
>
> Thank you!
>
> George
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] How can I check my rgw quota ?

2021-06-22 Thread Massimo Sgaravatto
Sorry for the very naive question:

I know how to set/check the rgw quota for a user (using  radosgw-admin)

But how can a radosgw user check the quota assigned to his/her
account, using the S3 and/or the Swift interface?

I don't get this information using "swift stat", and I can't find an s3cmd
quota-related command ...

Thanks, Massimo
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Updating a CentOS7-Nautilus cluster to CentOS8-Pacific

2021-04-26 Thread Massimo Sgaravatto
Hi

I have a ceph cluster running Nautilus. The ceph services are hosted on
CentOS7
servers.

Right now I have:
- 3 servers, each one running MON+MGR
- 10 servers running OSDs
- 2 servers running RGW


I need to update this cluster to CentOS8 (actually CentOS stream 8) and
Pacific.


What is the update path that you would suggest ?

I was thinking:

CentOS7-Nautilus --> CentOS8-Nautilus --> CentOS8-Pacific


but I don't know if this really the best solution.
For example, as far as I understand, ceph-deploy isn't available on CentOS8,
and there is no documentation on how to manually (i.e. without using
ceph-deploy) re-deploy RGW on CentOS8.
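For the record, my current guess at a minimal manual RGW deployment (heavily
hedged: the instance name, caps and frontend settings below are my own
assumptions, not taken from any doc):

# on the RGW host; "rgw.$(hostname -s)" is just a naming convention I assume
ceph auth get-or-create client.rgw.$(hostname -s) mon 'allow rw' osd 'allow rwx' \
  -o /etc/ceph/ceph.client.rgw.$(hostname -s).keyring
mkdir -p /var/lib/ceph/radosgw/ceph-rgw.$(hostname -s)
# add a [client.rgw.<hostname>] section to ceph.conf, e.g. with rgw_frontends = "beast port=8080"
systemctl enable --now ceph-radosgw@rgw.$(hostname -s)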

Thanks, Massimo
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Repo for Nautilus packages for CentOS8

2020-05-28 Thread Massimo Sgaravatto
Hi

What is the repo supposed to be used for Nautilus packages for CentOS8 ?

The documentation says to use  https://download.ceph.com/rpm-nautilus, but
https://download.ceph.com/rpm-nautilus/el8 is empty

I see some packages in:

http://mirror.centos.org/centos/8/storage/x86_64/ceph-nautilus/Packages/c/

but the latest rpms published there are for the 14.2.7 release (I can't
find 14.2.9 packages)

Thanks, Massimo
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: PGS INCONSISTENT - read_error - replace disk or pg repair then replace disk

2020-05-23 Thread Massimo Sgaravatto
When I see this problem, this is what I usually do:

- I run pg repair
- I remove the OSD from the cluster
- I replace the disk
- I recreate the OSD on the new disk
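In terms of commands that is roughly the following (pg id, OSD id and device
are placeholders; the OSD re-creation depends on the setup):

ceph pg repair <pgid>
ceph osd out <osd-id>
# wait for the recovery to finish, then:
ceph osd purge <osd-id> --yes-i-really-mean-it
# physically replace the disk and recreate the OSD, e.g.:
ceph-volume lvm create --bluestore --data /dev/sdX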

Cheers, Massimo

On Wed, May 20, 2020 at 9:41 PM Peter Lewis  wrote:

> Hello,
>
> I  came across a section of the documentation that I don't quite
> understand.  In the section about inconsistent PGs it says if one of the
> shards listed in `rados list-inconsistent-obj` has a read_error the disk is
> probably bad.
>
> Quote from documentation:
>
> https://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/#pgs-inconsistent
> `If read_error is listed in the errors attribute of a shard, the
> inconsistency is likely due to disk errors. You might want to check your
> disk used by that OSD.`
>
> I determined that the disk is bad by looking at the output of smartctl.  I
> would think that replacing the disk by removing the OSD from the cluster
> and allowing the cluster to recover would fix this inconsistency error
> without having to run `ceph pg repair`.
>
> Can I just replace the OSD and the inconsistency will be resolved by the
> recovery?  Or would it be better to run `ceph pg repair` and then replace
> the OSD associated with that bad disk?
>
> Thanks!
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: luminous -> nautilus upgrade path

2020-02-12 Thread Massimo Sgaravatto
We went directly from Luminous to Nautilus, skipping Mimic.
This is supported and documented.

On Wed, Feb 12, 2020 at 9:30 AM Eugen Block  wrote:

> Hi,
>
> we also skipped Mimic when upgrading from L --> N and it worked fine.
>
>
> Zitat von c...@elchaka.de:
>
> > Afaik you can migrate from 12 to 14 in a direct way. This is supported
> iirc.
> >
> > I will do that on a few month on my ceph Cluster.
> >
> > Hth
> > Mehmet
> >
> > Am 12. Februar 2020 09:19:53 MEZ schrieb Wolfgang Lendl
> > :
> >> hello,
> >>
> >> we plan to upgrade from luminous to nautilus.
> >> does it make sense to do the mimic step instead of going directly for
> >> nautilus?
> >>
> >> br
> >> wolfgang
> >> ___
> >> ceph-users mailing list -- ceph-users@ceph.io
> >> To unsubscribe send an email to ceph-users-le...@ceph.io
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Different memory usage on OSD nodes after update to Nautilus

2020-02-06 Thread Massimo Sgaravatto
Thanks for your feedback

The Ganglia graphs are available here:

https://cernbox.cern.ch/index.php/s/0xBDVwNkRqcoGdF

Replying to the other questions:

- Free Memory in ganglia is derived from "MemFree" in /proc/meminfo
- Memory Buffers in ganglia is derived from "Buffers" in /proc/meminfo
- On this host, the OSDs are 6TB. On other hosts we have 10TB OSDs
- "osd memory target" is set to ~ 4.5 GB (actually, while debugging this
issue, I have just lowered the value to 3.2 GB)
- "ceph tell osd.x heap stats" basically always reports 0 (or a very low
value) for "Bytes in page heap freelist" and a heap release doesn't change
the memory usage
- I can agree that swap is antiquated. But so far it was simply not used
and didn't cause any problems. At any rate I am now going to remove the
swap (or set the swappiness to 0); the exact commands are sketched below.
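For reference, the concrete knobs for the last two points, as I understand
them (the values are just examples):

ceph config set osd osd_memory_target 3221225472   # ~3 GiB, expressed in bytes
sysctl -w vm.swappiness=0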

Thanks again !

Cheers, Massimo




On Thu, Feb 6, 2020 at 6:28 PM Anthony D'Atri  wrote:

>  Attachments are usually filtered by mailing lists.  Yours did not come
> through.  A URL to Skitch or some other hosting works better.
>
> Your kernel version sounds like RHEL / CentOS?  I can say that memory
> accounting definitely did change between upstream 3.19 and 4.9
>
>
> osd04-cephstorage1-gsc:~ # head /proc/meminfo
> MemTotal:   197524684 kB
> MemFree:80388504 kB
> MemAvailable:   86055708 kB
> Buffers:  633768 kB
> Cached:  4705408 kB
> SwapCached:0 kB
>
> Specifically, node_memory_Active as reported by node_exporter changes
> dramatically, and MemAvailable is the more meaningful metric.  What is your
> “FreeMem” metric actually derived from?
>
> 64GB for 10 OSDs might be on the light side, how large are those OSDs?
>
> For sure swap is antiquated.  If your systems have any swap provisioned at
> all, you’re doing it wrong.  I’ve had good results setting it to 1.
>
> Do `ceph daemon osd.xx heap stats`, see if your OSD processes have much
> unused memory that has not been released to the OS.  If they do, “heap
> release” can be useful.
>
>
>
> > On Feb 6, 2020, at 9:08 AM, Massimo Sgaravatto <
> massimo.sgarava...@gmail.com> wrote:
> >
> > Dear all
> >
> > In the mid of January I updated my ceph cluster from Luminous to
> Nautilus.
> >
> > Attached you can see the memory metrics collected on one OSD node (I see
> > the very same behavior on all OSD hosts) graphed via Ganglia
> > This is Centos 7 node, with 64 GB of RAM, hosting 10 OSDs.
> >
> > So before the update there were about 20 GB of FreeMem.
> > Now FreeMem is basically 0, but I see 20 GB of Buffers,
> >
> > I guess this triggered some swapping, probably because I forgot to
> > set vm.swappiness to 0 (it was set to 60, the default value).
> >
> > I was wondering if this the expected behavior
> >
> > PS: Actually besides updating ceph, I also updated all the other packages
> > (yum update), so I am not sure that this different memory usage is
> because
> > of the ceph update
> > For the record in this update the kernel was updated from 3.10.0-1062.1.2
> > to 3.10.0-1062.9.1
> >
> > Thanks, Massimo
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Different memory usage on OSD nodes after update to Nautilus

2020-02-06 Thread Massimo Sgaravatto
Dear all

In the mid of January I updated my ceph cluster from Luminous to Nautilus.

Attached you can see the memory metrics collected on one OSD node (I see
the very same behavior on all OSD hosts) graphed via Ganglia
This is a CentOS 7 node, with 64 GB of RAM, hosting 10 OSDs.

So before the update there were about 20 GB of FreeMem.
Now FreeMem is basically 0, but I see 20 GB of Buffers.

I guess this triggered some swapping, probably because I forgot to
set vm.swappiness to 0 (it was set to 60, the default value).

I was wondering if this is the expected behavior.

PS: Actually besides updating ceph, I also updated all the other packages
(yum update), so I am not sure that this different memory usage is because
of the ceph update
For the record in this update the kernel was updated from 3.10.0-1062.1.2
to 3.10.0-1062.9.1

Thanks, Massimo
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Network performance checks

2020-01-31 Thread Massimo Sgaravatto
I am seeing very few such error messages in the mon logs (~ a couple per
day).
If I issue on every OSD the command "ceph daemon osd.$id dump_osd_network"
with the default 1000 ms threshold, I don't see any entries.
I guess this is because that command considers only the last (15 ?) minutes.

Am I supposed to see in some log files which are the problematic OSDs ?
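A quick way to poll all the OSDs local to a node (threshold 0 so that every
measured ping is reported, not only those above the threshold) is something
like:

for sock in /var/run/ceph/ceph-osd.*.asok; do
    echo "== $sock =="
    ceph daemon "$sock" dump_osd_network 0
done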

Thanks, Massimo


On Thu, Jan 30, 2020 at 11:13 AM Stefan Kooman  wrote:

> Hi,
>
> Quoting Massimo Sgaravatto (massimo.sgarava...@gmail.com):
> > Thanks for your answer
> >
> > MON-MGR hosts have a mgmt network and a public network.
> > OSD nodes have instead a mgmt network, a  public network. and a cluster
> > network
> > This is what I have in ceph.conf:
> >
> > public network = 192.168.61.0/24
> > cluster network = 192.168.222.0/24
> >
> >
> > public and cluster networks are 10 Gbps networks (actually there is a
> > single 10 Gbps NIC on each node used for both the public and the cluster
> > networks).
>
> In that case there is no advantage of using a seperate cluster network.
> As it would only be beneficial when replication data between OSDs is on
> a seperate interface. Is the cluster heavily loaded? Do you have metrics
> on bandwith usage / switch port statistics? If you have many "discards"
> (and / or errors) this might impact the ping times as well.
>
> > The mgmt network is a 1 Gbps network, but this one shouldn't be used for
> > such pings among the OSDs ...
>
> I doubt Ceph will use the mgmt network, but not sure if 't doing a
> lookup on hostname which might use mgmt network in your case or if it's
> using configured IPs for ceph
>
>
> You can dump osd network info per OSD on the storage nodes themselves by
> this command:
>
> ceph daemon osd.$id dump_osd_network
>
> You would have to do that for every OSD and see which ones report
> "entries".
>
> Gr. Stefan
>
> --
> | BIT BV  https://www.bit.nl/Kamer van Koophandel 09090351
> | GPG: 0xD14839C6   +31 318 648 688 / i...@bit.nl
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Network performance checks

2020-01-30 Thread Massimo Sgaravatto
Thanks for your answer

MON-MGR hosts have a mgmt network and a public network.
OSD nodes instead have a mgmt network, a public network, and a cluster
network.
This is what I have in ceph.conf:

public network = 192.168.61.0/24
cluster network = 192.168.222.0/24


public and cluster networks are 10 Gbps networks (actually there is a
single 10 Gbps NIC on each node used for both the public and the cluster
networks).
The mgmt network is a 1 Gbps network, but this one shouldn't be used for
such pings among the OSDs ...

Cheers, Massimo


On Thu, Jan 30, 2020 at 9:26 AM Stefan Kooman  wrote:

> Quoting Massimo Sgaravatto (massimo.sgarava...@gmail.com):
> > After having upgraded my ceph cluster from Luminous to Nautilus 14.2.6 ,
> > from time to time "ceph health detail" claims about some"Long heartbeat
> > ping times on front/back interface seen".
> >
> > As far as I can understand (after having read
> > https://docs.ceph.com/docs/nautilus/rados/operations/monitoring/), this
> > means that  the ping from one OSD to another one exceeded 1 s.
> >
> > I have some questions on these network performance checks
> >
> > 1) What is meant exactly with front and back interface ?
>
> Do you have a "public" and a "cluster" network? I would expect that the
> "back" interface is a "cluster" network interface.
>
> > 2) I can see the involved OSDs only in the output of "ceph health detail"
> > (when there is the problem) but I can't find this information  in the log
> > files. In the mon log file I can only see messages such as:
> >
> >
> > 2020-01-28 11:14:07.641 7f618e644700  0 log_channel(cluster) log [WRN] :
> > Health check failed: Long heartbeat ping times on back interface seen,
> > longest is 1416.618 msec (OSD_SLOW_PING_TIME_BACK)
> >
> > but the involved OSDs are not reported in this log.
> > Do I just need to increase the verbosity of the mon log ?
> >
> > 3) Is 1 s a reasonable value for this threshold ? How could this value be
> > changed ? What is the relevant configuration variable ?
>
> Not sure how much priority Ceph gives to this ping check. But if you're
> on a 10 Gb/s network I would start complaining when things take longer
> than 1 ms ... a ping should not take much longer than 0.05 ms so if it
> would take an order of magnitude longer than expected latency is not
> optimal.
>
> For Gigabit networks I would bump above values by an order of magnitude.
>
> Gr. Stefan
>
> --
> | BIT BV  https://www.bit.nl/Kamer van Koophandel 09090351
> | GPG: 0xD14839C6   +31 318 648 688 / i...@bit.nl
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Network performance checks

2020-01-29 Thread Massimo Sgaravatto
After having upgraded my ceph cluster from Luminous to Nautilus 14.2.6,
from time to time "ceph health detail" complains about some "Long heartbeat
ping times on front/back interface seen".

As far as I can understand (after having read
https://docs.ceph.com/docs/nautilus/rados/operations/monitoring/), this
means that  the ping from one OSD to another one exceeded 1 s.

I have some questions on these network performance checks

1) What is meant exactly with front and back interface ?

2) I can see the involved OSDs only in the output of "ceph health detail"
(when there is the problem) but I can't find this information  in the log
files. In the mon log file I can only see messages such as:


2020-01-28 11:14:07.641 7f618e644700  0 log_channel(cluster) log [WRN] :
Health check failed: Long heartbeat ping times on back interface seen,
longest is 1416.618 msec (OSD_SLOW_PING_TIME_BACK)

but the involved OSDs are not reported in this log.
Do I just need to increase the verbosity of the mon log ?

3) Is 1 s a reasonable value for this threshold ? How could this value be
changed ? What is the relevant configuration variable ?
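From the doc my guess (to be confirmed) is that the relevant options are
mon_warn_on_slow_ping_ratio and mon_warn_on_slow_ping_time, e.g.:

ceph config set global mon_warn_on_slow_ping_time 2000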

4) https://docs.ceph.com/docs/nautilus/rados/operations/monitoring/
suggests using the dump_osd_network command. I think there is an error on
that page: it says that the command should be issued on ceph-mgr.x.asok,
while I think that the ceph-osd.x.asok should be used instead.

I have another ceph cluster (running Nautilus 14.2.6 as well) where there
are no OSD_SLOW_PING_* error messages in the mon logs, but:

ceph daemon /var/run/ceph/ceph-osd..asok dump_osd_network 1

reports a lot of entries (i.e. pings exceeded 1 s). How can this be
explained ?


Thanks, Massimo
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Balancer active plan

2019-09-26 Thread Massimo Sgaravatto
I think you should see this info in the mgr log (with debug mgr = 4/5)
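e.g., assuming a recent enough release for the centralized config commands:

ceph config set mgr debug_mgr 4/5
ceph balancer status   # shows the mode, whether it is active, and the last optimization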

Cheers, Massimo

On Thu, Sep 26, 2019 at 3:43 PM Marc Roos  wrote:

>
> Suddenly I have objects misplaced, I assume this is because of the
> balancer being active. But
>
> 1. how can I see the currently/last executed plan of the balancer?
> 2. when it was activated?
>
>
>  7   active+remapped+backfill_wait
>  7   active+remapped+backfilling
>
>
>
>
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -.
> F1 Outsourcing Development Sp. z o.o.
> Poland
>
> t:  +48 (0)124466845
> f:  +48 (0)124466843
> e:  m...@f1-outsourcing.eu
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: 2 OpenStack environment, 1 Ceph cluster

2019-09-10 Thread Massimo Sgaravatto
We have a single ceph cluster used by 2 openstack installations.

We use different ceph pools for the 2 openstack clusters.
For nova, cinder and glance this is straightforward.

It was a bit more complicated for radosgw. In this case the setup I used was:

- creating 2 realms (one for each cloud)
- creating one zonegroup for each realm
- creating one zone for each zonegroup
- having one or more rgw instances for each zone
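In terms of commands, that boils down to something like the following, once
per cloud (the realm/zonegroup/zone names here are of course made up):

radosgw-admin realm create --rgw-realm=cloud1
radosgw-admin zonegroup create --rgw-zonegroup=cloud1-zg --rgw-realm=cloud1 --master
radosgw-admin zone create --rgw-zonegroup=cloud1-zg --rgw-zone=cloud1-zone --master
radosgw-admin period update --commit --rgw-realm=cloud1
# each rgw instance then gets rgw_realm / rgw_zonegroup / rgw_zone in its ceph.conf section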

I don't know if there are simpler approaches

Cheers, Massimo

On Tue, Sep 10, 2019 at 11:20 AM Wesley Peng  wrote:

>
>
> on 2019/9/10 17:14, vladimir franciz blando wrote:
> > I have 2 OpenStack environment that I want to integrate to an existing
> > ceph cluster.  I know technically it can be done but has anyone tried
> this?
> >
>
> Sure you can. Ceph could be deployed as separate storage service,
> openstack is just its customer. You can have one customer, but also can
> have multi-customers for Ceph service.
>
> regards.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How to map 2 different Openstack users belonging to the same project to 2 distinct radosgw users ?

2019-08-29 Thread Massimo Sgaravatto
Both (swift and S3)

S3 access would be done using the EC2 credentials available in OpenStack
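For clarity, by EC2 credentials I mean the per-user access/secret key pairs
that each user can generate on their own, e.g.:

openstack ec2 credentials create

and then use with s3cmd or any other S3 client.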

The main use case that I would like to address is to prevent users from
being able to delete the objects created by other users of the same project.

Thanks, Massimo

On Thu, Aug 29, 2019 at 4:41 PM Burkhard Linke <
burkhard.li...@computational.bio.uni-giessen.de> wrote:

> Hi,
>
>
> which protocol do you intend to use? Swift and S3 behave completely
> different with respect to users and keystone based authentication.
>
>
> Regards,
>
> Burkhard
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] How to map 2 different Openstack users belonging to the same project to 2 distinct radosgw users ?

2019-08-29 Thread Massimo Sgaravatto
I have a question regarding ceph-openstack integration for object storage.

Is there some configuration/hack/workaround that allows mapping 2 different
users belonging to the same OpenStack project to 2 distinct radosgw users?

I saw this old bug:

https://tracker.ceph.com/issues/20570

but it looks like it isn't going to be addressed in the near future

Thanks, Massimo
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io