[ceph-users] Re: Upgrade from v15.2.16 to v16.2.7 not starting

2022-05-18 Thread Lo Re Giuseppe

Hi,

I didn’t notice anything suspicious in the mgr logs, nor in cephadm.log
(attaching an extract of the latest).
What I have noticed is that the active mgr container gets restarted about
every 3 minutes (as reported by ceph -w):
"""
2022-05-18T15:30:49.883238+0200 mon.naret-monitor01 [INF] Active manager daemon 
naret-monitor01.tvddjv restarted
2022-05-18T15:30:49.889294+0200 mon.naret-monitor01 [INF] Activating manager 
daemon naret-monitor01.tvddjv
2022-05-18T15:30:50.832200+0200 mon.naret-monitor01 [INF] Manager daemon 
naret-monitor01.tvddjv is now available
2022-05-18T15:34:16.979735+0200 mon.naret-monitor01 [INF] Active manager daemon 
naret-monitor01.tvddjv restarted
2022-05-18T15:34:16.985531+0200 mon.naret-monitor01 [INF] Activating manager 
daemon naret-monitor01.tvddjv
2022-05-18T15:34:18.246784+0200 mon.naret-monitor01 [INF] Manager daemon 
naret-monitor01.tvddjv is now available
2022-05-18T15:37:34.576159+0200 mon.naret-monitor01 [INF] Active manager daemon 
naret-monitor01.tvddjv restarted
2022-05-18T15:37:34.582935+0200 mon.naret-monitor01 [INF] Activating manager 
daemon naret-monitor01.tvddjv
2022-05-18T15:37:35.821200+0200 mon.naret-monitor01 [INF] Manager daemon 
naret-monitor01.tvddjv is now available
2022-05-18T15:40:00.000148+0200 mon.naret-monitor01 [INF] overall HEALTH_OK
2022-05-18T15:40:52.456182+0200 mon.naret-monitor01 [INF] Active manager daemon 
naret-monitor01.tvddjv restarted
2022-05-18T15:40:52.461826+0200 mon.naret-monitor01 [INF] Activating manager 
daemon naret-monitor01.tvddjv
2022-05-18T15:40:53.787353+0200 mon.naret-monitor01 [INF] Manager daemon 
naret-monitor01.tvddjv is now available
"""
I'm also attaching the logs of the active mgr process.
The cluster is working fine, but I wonder whether this behaviour of mgr/cephadm is
itself wrong and might be what is stalling the upgrade.
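
In case it helps, my understanding is that the cephadm module's own activity can be
made more visible by raising its log level and watching the cluster log, roughly like
this (treat the exact commands as a sketch):

ceph config set mgr mgr/cephadm/log_to_cluster_level debug
ceph -W cephadm --watch-debug          # stream the module's debug messages
ceph log last 100 debug cephadm        # dump the most recent entries
ceph config set mgr mgr/cephadm/log_to_cluster_level info    # revert afterwards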

Thanks,

Giuseppe 
 

On 18.05.22, 14:19, "Eugen Block"  wrote:

Do you see anything suspicious in /var/log/ceph/cephadm.log? Also  
check the mgr logs for any hints.


Zitat von Lo Re  Giuseppe :

> Hi,
>
> We have happily tested the upgrade from v15.2.16 to v16.2.7 with  
> cephadm on a test cluster made of 3 nodes and everything went  
> smoothly.
> Today we started the very same operation on the production one (20  
> OSD servers, 720 HDDs) and the upgrade process doesn’t do anything  
> at all…
>
> To be more specific, we have issued the command
>
> ceph orch upgrade start --image quay.io/ceph/ceph:v16.2.7
>
> and soon after “ceph -s” reports
>
> Upgrade to quay.io/ceph/ceph:v16.2.7 (0s)
>   []
>
> But only for a few seconds; after that:
>
> [root@naret-monitor01 ~]# ceph -s
>   cluster:
> id: 63334166-d991-11eb-99de-40a6b72108d0
> health: HEALTH_OK
>
>   services:
> mon: 3 daemons, quorum  
> naret-monitor01,naret-monitor02,naret-monitor03 (age 7d)
> mgr: naret-monitor01.tvddjv(active, since 60s), standbys:  
> naret-monitor02.btynnb
> mds: cephfs:1 {0=cephfs.naret-monitor01.uvevbf=up:active} 2 up:standby
> osd: 760 osds: 760 up (since 6d), 760 in (since 2w)
> rgw: 3 daemons active (cscs-realm.naret-zone.naret-rgw01.qvhhbi,  
> cscs-realm.naret-zone.naret-rgw02.pduagk,  
> cscs-realm.naret-zone.naret-rgw03.aqdkkb)
>
>   task status:
>
>   data:
> pools:   30 pools, 16497 pgs
> objects: 833.14M objects, 3.1 PiB
> usage:   5.0 PiB used, 5.9 PiB / 11 PiB avail
> pgs: 16460 active+clean
>  37    active+clean+scrubbing+deep
>
>   io:
> client:   4.7 MiB/s rd, 4.0 MiB/s wr, 122 op/s rd, 47 op/s wr
>
>   progress:
> Removing image fulen-hdd/c991f6fdf41964 from trash (53s)
>   [] (remaining: 81m)
>
>
>
> The command “ceph orch upgrade status” says:
>
> {
> "target_image": "quay.io/ceph/ceph:v16.2.7",
> "in_progress": true,
> "services_complete": [],
> "message": ""
> }
>
> It doesn’t even pull the container image.
> I have tested that the podman pull command works, I was able to pull  
> quay.io/ceph/ceph:v16.2.7.
>
> “ceph -w” and “ceph -W cephadm” don’t report any activity related to  
> the upgrade.
>
>
> Has anyone seen anything similar?
> Do you have advice on how to work out what is keeping the upgrade
> process from actually starting?
>
> Thanks in advance,
>
> Giuseppe
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

_

[ceph-users] Re: S3 and RBD backup

2022-05-18 Thread Lo Re Giuseppe
Hi,

We are doing exactly the same: exporting the bucket as an NFS share and running our
backup software on it to get the data to tape.
Given the data volumes, replication to another disk-based S3 endpoint is not
viable for us.
Regards,

Giuseppe

On 18.05.22, 23:14, "stéphane chalansonnet"  wrote:

Hello,

In fact, S3 should be replicated to another region or AZ, and backups should
be managed with versioning on the bucket.

But in our case we needed to back up databases (on K8S) into
our external backup solution (EMC Networker).

We implemented Ganesha and created an NFS export pointing to the buckets of some
S3 users.
The NFS export was mounted on the storage backup node and backed up from there.

Not the simplest solution, but it works ;)

Regards,
Stephane



Le mer. 18 mai 2022 à 22:34, Sanjeev Jha  a écrit :

> Thanks Janne for the information in detail.
>
> We have RHCS 4.2 non-collocated setup in one DC only. There are few RBD
> volumes mapped to MariaDB Database.
> Also, S3 endpoint with bucket is being used to upload objects. There is no
> multisite zone has been implemented yet.
> My Requirement is to take backup of RBD images and database.
> How can S3 bucket backup and restore be possible?
> We are looking for many opensource tool like rclone for S3 and Benji for
> RBD but not able to make sure whether these tools would be enough to
> achieve backup goal.
> Your suggestion based on the above case would be much appreciated.
>
> Best,
> Sanjeev
>
> 
> From: Janne Johansson 
> Sent: Tuesday, May 17, 2022 1:01 PM
> To: Sanjeev Jha 
> Cc: ceph-users@ceph.io 
> Subject: Re: [ceph-users] S3 and RBD backup
>
> Den mån 16 maj 2022 kl 13:41 skrev Sanjeev Jha :
> > Could someone please let me know how to take S3 and RBD backup from Ceph
> side and possibility to take backup from Client/user side?
> > Which tool should I use for the backup?
>
> Backing data up, or replicating it is a choice between a lot of
> variables and options, and choosing something that has the least
> negative effects for your own environment and your own demands. Some
> options will cause a lot of network traffic, others will use a lot of
> CPU somewhere, others will waste disk on the destination for
> performance reasons and some will have long and complicated restore
> procedures. Some will be realtime copies but those might put extra
> load on the cluster while running, others will be asynchronous but
> might need a database at all times to keep track of what not to copy
> because it is already at the destination. Some synchronous options
> might even cause writes to be slower in order to guarantee that ALL
> copies are in place before sending clients an ACK, some will not and
> those might lose data that the client thought was delivered 100% ok.
>
> Without knowing what your demands are, or knowing what situation and
> environment you are in, it will be almost impossible to match the
> above into something that is good for you.
> Some might have a monetary cost, some may require a complete second
> cluster of equal size, some might have a cost in terms of setup work
> from clueful ceph admins that will take a certain amount of time and
> effort. Some options might require clients to change how they write
> data into the cluster in order to help the backup/replication system.
>
> There is unfortunately not a single best choice for all clusters,
> there might even not exist a good option just to cover both S3 and RBD
> since they are inherently very different.
> RBD will almost certainly be only full restores of a large complete
> image, S3 users might want to have the object
> foo/bar/MyImportantWriting.doc from last wednesday back only and not
> revert the whole bucket or the whole S3 setup.
>
> I'm quite certain that there will not be a single
> cheap,fast,efficient,scalable,unnoticeable,easy solution that solves
> all these problems at once, but rather you will have to focus on what
> the toughest limitations are (money, time, disk, rackspace, network
> capacity, client and IO demands?) and look for solutions (or products)
> that work well with those restrictions.
>
> --
> May the most significant bit of your life be positive.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] v16.2.9 Pacific released

2022-05-18 Thread David Galloway
16.2.9 is a hotfix release to address a bug in 16.2.8 that can cause the 
MGRs to deadlock.


See https://tracker.ceph.com/issues/55687.


Getting Ceph

* Git at git://github.com/ceph/ceph.git
* Tarball at https://download.ceph.com/tarballs/ceph-16.2.9.tar.gz
* Containers at https://quay.io/repository/ceph/ceph
* For packages, see https://docs.ceph.com/docs/master/install/get-packages/
* Release git sha1: 4c3647a322c0ff5a1dd2344e039859dcbd28c830
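
For cephadm-managed clusters the upgrade can be started with the orchestrator as
usual, for example:

ceph orch upgrade start --image quay.io/ceph/ceph:v16.2.9
ceph orch upgrade status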

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: osd_disk_thread_ioprio_class deprecated?

2022-05-18 Thread Richard Bade
> See this PR
> https://github.com/ceph/ceph/pull/19973

> Doing "git log -Sosd_disk_thread_ioprio_class -u
> src/common/options.cc" in the Ceph source indicates that they were
> removed in commit 3a331c8be28f59e2b9d952e5b5e864256429d9d5 which first
> appeared in Mimic.

Thanks Matthew and Josh for the info. That clears it up.
I'll pull those settings out of my config.

Rich
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Options for RADOS client-side write latency monitoring

2022-05-18 Thread stéphane chalansonnet
Hello,

In my opinion the better way is to deploy a batch fio pod (with a PVC volume on
your Rook Ceph cluster) on your K8S.
The IO profile depends on your workload, but you can try 8 KB (the PostgreSQL
default) random read/write and sequential runs.
This way you will be as close as possible to the client side.
Export the results as JSON and just graph them.
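
A rough sketch of such a job (the flags and paths are only an example; point
--directory at the PVC mount and adjust to your workload):

fio --name=pg-randwrite --directory=/mnt/pvc --size=1G \
    --rw=randwrite --bs=8k --ioengine=libaio --direct=1 \
    --iodepth=1 --numjobs=1 --runtime=60 --time_based \
    --output-format=json --output=/tmp/fio-randwrite.json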

Regards,

Stéphane


Le mer. 18 mai 2022 à 00:01,  a écrit :

> Greetings, all.
>
> I'm attempting to introduce client-side RADOS write latency monitoring on a
> (rook) Ceph cluster. The use case is a mixture of containers, serving file
> and
> database workloads (although my question my applies more broadly.)
>
> The aim here is to measure the average write latency as observed by a
> client,
> rather than relying entirely on the metrics reported by the OSDs
> (i.e ceph_osd_commit_latency_ms and ceph_osd_apply_latency_ms.)
>
> So far, I’ve tested using `rados bench` to produce some basic write latency
> monitoring using a shell script.
>
> The parameters I’m using:
> • Single thread
> • 64 KB block size
> • 10 seconds of benchmarking
>
> Essentially, the script parses output (average latency) from the following:
>
> rados bench --pool=xxx 10 write -t 1 -b 65536
>
> Questions:
>
> 1. Are the parameters outlined above optimal for this kind of performance
> monitoring (for example, would it be better to use a block size of 4KB, or
> even 1KB)?
>
> 2. Is there a better approach here (for example, using a ceph-manager
> plugin or other more standard approach)?
>
> Thanks!
>
> Best regards,
>
> Jules
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: S3 and RBD backup

2022-05-18 Thread stéphane chalansonnet
Hello,

In fact, S3 should be replicated to another region or AZ, and backups should
be managed with versioning on the bucket.

But in our case we needed to back up databases (on K8S) into
our external backup solution (EMC Networker).

We implemented Ganesha and created an NFS export pointing to the buckets of some
S3 users.
The NFS export was mounted on the storage backup node and backed up from there.

Not the simplest solution, but it works ;)
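
For anyone who wants to try the same, the export definition looks roughly like this
(an illustrative sketch only: bucket name, user and keys are placeholders, and the
exact directives can differ between nfs-ganesha versions):

EXPORT {
    Export_ID = 1;
    Path = "/backup-bucket";            # bucket to expose (placeholder name)
    Pseudo = "/backup-bucket";          # path seen by the NFS client
    Access_Type = RW;
    Squash = No_Root_Squash;
    SecType = "sys";
    FSAL {
        Name = RGW;                     # radosgw FSAL
        User_Id = "backupuser";         # placeholder RGW user
        Access_Key_Id = "<access-key>";
        Secret_Access_Key = "<secret-key>";
    }
}

RGW {
    ceph_conf = "/etc/ceph/ceph.conf";
    name = "client.rgw.ganesha";        # placeholder RGW instance name
    cluster = "ceph";
}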

Regards,
Stephane



Le mer. 18 mai 2022 à 22:34, Sanjeev Jha  a écrit :

> Thanks Janne for the information in detail.
>
> We have RHCS 4.2 non-collocated setup in one DC only. There are few RBD
> volumes mapped to MariaDB Database.
> Also, S3 endpoint with bucket is being used to upload objects. There is no
> multisite zone has been implemented yet.
> My Requirement is to take backup of RBD images and database.
> How can S3 bucket backup and restore be possible?
> We are looking for many opensource tool like rclone for S3 and Benji for
> RBD but not able to make sure whether these tools would be enough to
> achieve backup goal.
> Your suggestion based on the above case would be much appreciated.
>
> Best,
> Sanjeev
>
> 
> From: Janne Johansson 
> Sent: Tuesday, May 17, 2022 1:01 PM
> To: Sanjeev Jha 
> Cc: ceph-users@ceph.io 
> Subject: Re: [ceph-users] S3 and RBD backup
>
> Den mån 16 maj 2022 kl 13:41 skrev Sanjeev Jha :
> > Could someone please let me know how to take S3 and RBD backup from Ceph
> side and possibility to take backup from Client/user side?
> > Which tool should I use for the backup?
>
> Backing data up, or replicating it is a choice between a lot of
> variables and options, and choosing something that has the least
> negative effects for your own environment and your own demands. Some
> options will cause a lot of network traffic, others will use a lot of
> CPU somewhere, others will waste disk on the destination for
> performance reasons and some will have long and complicated restore
> procedures. Some will be realtime copies but those might put extra
> load on the cluster while running, others will be asynchronous but
> might need a database at all times to keep track of what not to copy
> because it is already at the destination. Some synchronous options
> might even cause writes to be slower in order to guarantee that ALL
> copies are in place before sending clients an ACK, some will not and
> those might lose data that the client thought was delivered 100% ok.
>
> Without knowing what your demands are, or knowing what situation and
> environment you are in, it will be almost impossible to match the
> above into something that is good for you.
> Some might have a monetary cost, some may require a complete second
> cluster of equal size, some might have a cost in terms of setup work
> from clueful ceph admins that will take a certain amount of time and
> effort. Some options might require clients to change how they write
> data into the cluster in order to help the backup/replication system.
>
> There is unfortunately not a single best choice for all clusters,
> there might even not exist a good option just to cover both S3 and RBD
> since they are inherently very different.
> RBD will almost certainly be only full restores of a large complete
> image, S3 users might want to have the object
> foo/bar/MyImportantWriting.doc from last wednesday back only and not
> revert the whole bucket or the whole S3 setup.
>
> I'm quite certain that there will not be a single
> cheap,fast,efficient,scalable,unnoticeable,easy solution that solves
> all these problems at once, but rather you will have to focus on what
> the toughest limitations are (money, time, disk, rackspace, network
> capacity, client and IO demands?) and look for solutions (or products)
> that work well with those restrictions.
>
> --
> May the most significant bit of your life be positive.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: MDS fails to start with error PurgeQueue.cc: 286: FAILED ceph_assert(readable)

2022-05-18 Thread Eugen Block

Hi,

I don’t know what could cause that error, but could you share more
details? You seem to have multiple active MDSs, is that correct? Could
they be overloaded? What happened exactly: did one MDS fail or all of
them? Do the standby MDSs report anything different?
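
If it comes to inspecting the purge queue itself, I believe cephfs-journal-tool can
address it directly; double-check the syntax for your Nautilus build and take an
export before touching anything (the fs name here is a placeholder):

cephfs-journal-tool --rank=<fsname>:0 --journal=purge_queue journal inspect
cephfs-journal-tool --rank=<fsname>:0 --journal=purge_queue journal export purge_queue.backup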


Zitat von Kuko Armas :

Hello, I've been having problems with my MDSs and they got stuck in the
up:replay state.
The journal was ok and everything seemed ok, so I reset the journal,
and now all MDSs fail to start with the following error:


 2022-05-18 12:27:40.092 7f8748561700 -1  
/home/abuild/rpmbuild/BUILD/ceph-14.2.16-402-g7d47dbaf4d/src/mds/PurgeQueue.cc: In function 'void PurgeQueue::_recover()' thread 7f8748561700 time 2022-05-18  
12:27:40.094406
/home/abuild/rpmbuild/BUILD/ceph-14.2.16-402-g7d47dbaf4d/src/mds/PurgeQueue.cc: 286: FAILED  
ceph_assert(readable)


 ceph version 14.2.16-402-g7d47dbaf4d  
(7d47dbaf4d0960a2e910628360ae36def84ed913) nautilus (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char  
const*)+0x152) [0x7f8756ca91a6]
 2: (ceph::__ceph_assertf_fail(char const*, char const*, int, char  
const*, char const*, ...)+0) [0x7f8756ca9381]

 3: (PurgeQueue::_recover()+0x4ad) [0x55f1666ad26d]
 4: (()+0x2b0353) [0x55f1666ad353]
 5: (FunctionContext::finish(int)+0x2c) [0x55f16652b63c]
 6: (Context::complete(int)+0x9) [0x55f166529339]
 7: (Finisher::finisher_thread_entry()+0x15e) [0x7f8756cf231e]
 8: (()+0x84f9) [0x7f87565a64f9]
 9: (clone()+0x3f) [0x7f87557affbf]

2022-05-18 12:27:40.092 7f8748561700 -1 *** Caught signal (Aborted) **
 in thread 7f8748561700 thread_name:PQ_Finisher

 ceph version 14.2.16-402-g7d47dbaf4d  
(7d47dbaf4d0960a2e910628360ae36def84ed913) nautilus (stable)


 It's a production cluster, so it's fairly urgent.

Salu2!
--
Miguel Armas
CanaryTek Consultoria y Sistemas SL
http://www.canarytek.com/

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: S3 and RBD backup

2022-05-18 Thread Sanjeev Jha
Thanks Janne for the information in detail.

We have an RHCS 4.2 non-collocated setup in one DC only. There are a few RBD volumes
mapped to a MariaDB database.
Also, an S3 endpoint with buckets is being used to upload objects. No multisite
zone has been implemented yet.
My requirement is to take backups of the RBD images and the database.
How can S3 bucket backup and restore be done?
We are looking at several open-source tools, like rclone for S3 and Benji for RBD,
but we are not sure whether these tools would be enough to achieve our backup
goal.
Your suggestions for the above case would be much appreciated.

Best,
Sanjeev


From: Janne Johansson 
Sent: Tuesday, May 17, 2022 1:01 PM
To: Sanjeev Jha 
Cc: ceph-users@ceph.io 
Subject: Re: [ceph-users] S3 and RBD backup

Den mån 16 maj 2022 kl 13:41 skrev Sanjeev Jha :
> Could someone please let me know how to take S3 and RBD backup from Ceph side 
> and possibility to take backup from Client/user side?
> Which tool should I use for the backup?

Backing data up, or replicating it is a choice between a lot of
variables and options, and choosing something that has the least
negative effects for your own environment and your own demands. Some
options will cause a lot of network traffic, others will use a lot of
CPU somewhere, others will waste disk on the destination for
performance reasons and some will have long and complicated restore
procedures. Some will be realtime copies but those might put extra
load on the cluster while running, others will be asynchronous but
might need a database at all times to keep track of what not to copy
because it is already at the destination. Some synchronous options
might even cause writes to be slower in order to guarantee that ALL
copies are in place before sending clients an ACK, some will not and
those might lose data that the client thought was delivered 100% ok.

Without knowing what your demands are, or knowing what situation and
environment you are in, it will be almost impossible to match the
above into something that is good for you.
Some might have a monetary cost, some may require a complete second
cluster of equal size, some might have a cost in terms of setup work
from clueful ceph admins that will take a certain amount of time and
effort. Some options might require clients to change how they write
data into the cluster in order to help the backup/replication system.

There is unfortunately not a single best choice for all clusters,
there might even not exist a good option just to cover both S3 and RBD
since they are inherently very different.
RBD will almost certainly be only full restores of a large complete
image, S3 users might want to have the object
foo/bar/MyImportantWriting.doc from last wednesday back only and not
revert the whole bucket or the whole S3 setup.

I'm quite certain that there will not be a single
cheap,fast,efficient,scalable,unnoticeable,easy solution that solves
all these problems at once, but rather you will have to focus on what
the toughest limitations are (money, time, disk, rackspace, network
capacity, client and IO demands?) and look for solutions (or products)
that work well with those restrictions.

--
May the most significant bit of your life be positive.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Best way to change disk in controller disk without affect cluster

2022-05-18 Thread Anthony D'Atri

First question:  why do you want to do this?

There are some deployment scenarios in which moving the drives will Just Work,
and others in which it won’t.  If you try, I suggest shutting the system down
all the way, exchanging just two drives, then powering back on — and seeing if all
is well before moving the rest.

On which Ceph release were these OSDs deployed? Containerized? Are you using
ceph-disk or ceph-volume? LVM?  Colocated journal/DB/WAL, or on a separate
device?

Try `ls -l /var/lib/ceph/someosd` or whatever you have, and look for symlinks that
reference device paths that may be stale if drives are swapped.
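
On an LVM/ceph-volume deployment, for example, you would typically see something
like this (output shape is illustrative):

ls -l /var/lib/ceph/osd/ceph-*/block
# ... block -> /dev/ceph-<vg-uuid>/osd-block-<osd-uuid>

Logical-volume paths like these stay valid if a drive changes slot, whereas older
setups may point at raw device names that won't.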

> 
> Hello,
> 
> Do I need to set any global flag for this operation?
> 
> Thanks!
> 
> De: Stefan Kooman 
> Enviado: miércoles, 18 de mayo de 2022 14:13
> Para: Jorge JP 
> Asunto: Re: [ceph-users] Best way to change disk in controller disk without 
> affect cluster
> 
> On 5/18/22 13:06, Jorge JP wrote:
>> Hello!
>> 
>> I have a cluster ceph with 6 nodes with 6 HDD disks in each one. The status 
>> of my cluster is OK and the pool 45.25% (95.55 TB of 211.14 TB). I don't 
>> have any problem.
>> 
>> I want change the position of a various disks in the disk controller of some 
>> nodes and I don't know what is the way.
>> 
>>  - Stop osd and move the disk of position (hotplug).
>> 
>>  - Reweight osd to 0 and move the pgs to other osds, stop osd and change 
>> position
>> 
>> I think first option is ok, the data not deleted and when I will changed the 
>> disk the server recognised again and I will can start osd without problems.
> 
> Order of the disks should not matter. First option is fine.
> 
> Gr. Stefan
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: osd_disk_thread_ioprio_class deprecated?

2022-05-18 Thread Matthew H
See this PR

https://github.com/ceph/ceph/pull/19973



From: Josh Baergen 
Sent: Wednesday, May 18, 2022 10:54 AM
To: Richard Bade 
Cc: Ceph Users 
Subject: [ceph-users] Re: osd_disk_thread_ioprio_class deprecated?

Hi Richard,

> Could anyone confirm this? And which release it was deprecated in?

Doing "git log -Sosd_disk_thread_ioprio_class -u
src/common/options.cc" in the Ceph source indicates that they were
removed in commit 3a331c8be28f59e2b9d952e5b5e864256429d9d5 which first
appeared in Mimic.

Josh
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Best way to change disk in controller disk without affect cluster

2022-05-18 Thread Jorge JP
Hello,

Do I need to set any global flag for this operation?

Thanks!

De: Stefan Kooman 
Enviado: miércoles, 18 de mayo de 2022 14:13
Para: Jorge JP 
Asunto: Re: [ceph-users] Best way to change disk in controller disk without 
affect cluster

On 5/18/22 13:06, Jorge JP wrote:
> Hello!
>
> I have a cluster ceph with 6 nodes with 6 HDD disks in each one. The status 
> of my cluster is OK and the pool 45.25% (95.55 TB of 211.14 TB). I don't have 
> any problem.
>
> I want change the position of a various disks in the disk controller of some 
> nodes and I don't know what is the way.
>
>   - Stop osd and move the disk of position (hotplug).
>
>   - Reweight osd to 0 and move the pgs to other osds, stop osd and change 
> position
>
> I think first option is ok, the data not deleted and when I will changed the 
> disk the server recognised again and I will can start osd without problems.

Order of the disks should not matter. First option is fine.

Gr. Stefan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: MDS upgrade to Quincy

2022-05-18 Thread Patrick Donnelly
Hi Jimmy,

On Fri, Apr 22, 2022 at 11:02 AM Jimmy Spets  wrote:
>
> Does cephadm automatically reduce ranks to 1 or does that have to be done
> manually?

Automatically.
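
For reference, on clusters not managed by cephadm the manual equivalent, as far as I
recall, is to reduce the ranks yourself before upgrading:

ceph fs set <fs_name> max_mds 1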

-- 
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat, Inc.
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] May Ceph Science Virtual User Group

2022-05-18 Thread Kevin Hrpcek

Hey all,

We will be having a Ceph science/research/big cluster call on Tuesday,
May 24th. Please note we're doing this on a Tuesday, not the usual
Wednesday as in the past. If anyone wants to discuss something
specific they can add it to the pad linked below. If you have questions
or comments you can contact me.


This is an informal open call of community members mostly from 
hpc/htc/research environments where we discuss whatever is on our minds 
regarding ceph. Updates, outages, features, maintenance, etc...there is 
no set presenter but I do attempt to keep the conversation lively.


https://pad.ceph.com/p/Ceph_Science_User_Group_20220524 



We try to keep it to an hour or less.

Ceph calendar event details:
May 24, 2022
14:00 UTC
4pm Central European
9am Central US

Description: Main pad for discussions: 
https://pad.ceph.com/p/Ceph_Science_User_Group_Index

Meetings will be recorded and posted to the Ceph Youtube channel.
To join the meeting on a computer or mobile phone: 
https://bluejeans.com/908675367?src=calendarLink

To join from a Red Hat Deskphone or Softphone, dial: 84336.
Connecting directly from a room system?
    1.) Dial: 199.48.152.152 or bjn.vc
    2.) Enter Meeting ID: 908675367
Just want to dial in on your phone?
    1.) Dial one of the following numbers: 408-915-6466 (US)
    See all numbers: https://www.redhat.com/en/conference-numbers
    2.) Enter Meeting ID: 908675367
    3.) Press #
Want to test your video connection? https://bluejeans.com/111


Kevin

--
Kevin Hrpcek
NASA VIIRS Atmosphere SIPS/TROPICS
Space Science & Engineering Center
University of Wisconsin-Madison
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Moving data between two mounts of the same CephFS

2022-05-18 Thread Magnus HAGDORN
Hi Mathias,
I have noticed in the past that moving directories within the same mount
point can take a very long time using the system mv command. I use a
Python script to archive old user directories by moving them to a
different part of the filesystem which is not exposed to the users. I
use the rename method of a Python Path object, which is atomic.
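The core of that approach is essentially a single rename; a minimal sketch (the
paths are made up for the example):

from pathlib import Path

src = Path("/cephfs/home/olduser")       # directory visible to users
dst = Path("/cephfs/archive/olduser")    # archive area on the same filesystem
dst.parent.mkdir(parents=True, exist_ok=True)
src.rename(dst)                          # one rename() call, atomic within a filesystem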

In your case I would expect a copy and unlink because the operation is
across different mount points.

Is this operation a one-off or a regular occurrence? If it is a one-off,
then I would do it as administrator. If it is a regular occurrence, I
would look into re-arranging the filesystem layout to make this
possible.

Regards
magnus


On Wed, 2022-05-18 at 13:34 +, Kuhring, Mathias wrote:
>
> Dear Ceph community,
>
> Let's say I want to make different sub-directories of my CephFS
> separately available on a client system,
> i.e. without exposing the parent directories (because it contains
> other
> sensitive data, for instance).
>
> I can simply mount specific different folders, as primitively
> illustrated here:
>
> CephFS root:
> - FolderA
> - FolderB
> - FolderC
>
> Client mounts:
> - MountA --> cephfs:/FolderA
> - MountB --> cephfs:/FolderB
>
> Now I'm wondering what actually happens in the background when I move
> (not copy) data from MountA to MountB.
> In particular, is CephFS by chance aware of this situation and
> actually
> performs an atomic move internally?
> Or is more like a copy and unlink operation via the client?
>
> I appreciate your thoughts.
>
> Best wishes,
> Mathias
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Moving data between two mounts of the same CephFS

2022-05-18 Thread Kuhring, Mathias
Dear Ceph community,

Let's say I want to make different sub-directories of my CephFS
separately available on a client system,
i.e. without exposing the parent directories (because they contain other
sensitive data, for instance).

I can simply mount specific different folders, as primitively 
illustrated here:

CephFS root:
- FolderA
- FolderB
- FolderC

Client mounts:
- MountA --> cephfs:/FolderA
- MountB --> cephfs:/FolderB

Now I'm wondering what actually happens in the background when I move
(not copy) data from MountA to MountB.
In particular, is CephFS by chance aware of this situation and does it actually
perform an atomic move internally?
Or is it more like a copy and unlink operation via the client?
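
One way to check this from the client side would be to trace the system calls mv
makes, assuming a coreutils mv (file names are made up):

strace -f -e trace=renameat2,rename,openat,unlinkat mv /MountA/bigfile /MountB/
# a cross-mount move should show rename*() failing with EXDEV, then the copy
# path (openat of source and destination) and an unlinkat() of the source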

I appreciate your thoughts.

Best wishes,
Mathias

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Upgrade from v15.2.16 to v16.2.7 not starting

2022-05-18 Thread Eugen Block
Do you see anything suspicious in /var/log/ceph/cephadm.log? Also  
check the mgr logs for any hints.



Zitat von Lo Re  Giuseppe :


Hi,

We have happily tested the upgrade from v15.2.16 to v16.2.7 with  
cephadm on a test cluster made of 3 nodes and everything went  
smoothly.
Today we started the very same operation on the production one (20  
OSD servers, 720 HDDs) and the upgrade process doesn’t do anything  
at all…


To be more specific, we have issued the command

ceph orch upgrade start --image quay.io/ceph/ceph:v16.2.7

and soon after “ceph -s” reports

Upgrade to quay.io/ceph/ceph:v16.2.7 (0s)
  []

But only for a few seconds; after that:

[root@naret-monitor01 ~]# ceph -s
  cluster:
id: 63334166-d991-11eb-99de-40a6b72108d0
health: HEALTH_OK

  services:
mon: 3 daemons, quorum  
naret-monitor01,naret-monitor02,naret-monitor03 (age 7d)
mgr: naret-monitor01.tvddjv(active, since 60s), standbys:  
naret-monitor02.btynnb

mds: cephfs:1 {0=cephfs.naret-monitor01.uvevbf=up:active} 2 up:standby
osd: 760 osds: 760 up (since 6d), 760 in (since 2w)
rgw: 3 daemons active (cscs-realm.naret-zone.naret-rgw01.qvhhbi,  
cscs-realm.naret-zone.naret-rgw02.pduagk,  
cscs-realm.naret-zone.naret-rgw03.aqdkkb)


  task status:

  data:
pools:   30 pools, 16497 pgs
objects: 833.14M objects, 3.1 PiB
usage:   5.0 PiB used, 5.9 PiB / 11 PiB avail
pgs: 16460 active+clean
 37    active+clean+scrubbing+deep

  io:
client:   4.7 MiB/s rd, 4.0 MiB/s wr, 122 op/s rd, 47 op/s wr

  progress:
Removing image fulen-hdd/c991f6fdf41964 from trash (53s)
  [] (remaining: 81m)



The command “ceph orch upgrade status” says:

{
"target_image": "quay.io/ceph/ceph:v16.2.7",
"in_progress": true,
"services_complete": [],
"message": ""
}

It doesn’t even pull the container image.
I have tested that the podman pull command works, I was able to pull  
quay.io/ceph/ceph:v16.2.7.


“ceph -w” and “ceph -W cephadm” don’t report any activity related to  
the upgrade.



Has anyone seen anything similar?
Do you have advice on how to work out what is keeping the upgrade
process from actually starting?


Thanks in advance,

Giuseppe
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] MDS fails to start with error PurgeQueue.cc: 286: FAILED ceph_assert(readable)

2022-05-18 Thread Kuko Armas


Hello, I've been having problems with my MDSs and they got stuck in the up:replay
state.
The journal was ok and everything seemed ok, so I reset the journal, and now all
MDSs fail to start with the following error:

 2022-05-18 12:27:40.092 7f8748561700 -1 
/home/abuild/rpmbuild/BUILD/ceph-14.2.16-402-g7d47dbaf4d/src/mds/PurgeQueue.cc: 
In function 'void PurgeQueue::_recover()' thread 7f8748561700 time 2022-05-18 
12:27:40.094406
/home/abuild/rpmbuild/BUILD/ceph-14.2.16-402-g7d47dbaf4d/src/mds/PurgeQueue.cc: 
286: FAILED ceph_assert(readable)

 ceph version 14.2.16-402-g7d47dbaf4d 
(7d47dbaf4d0960a2e910628360ae36def84ed913) nautilus (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
const*)+0x152) [0x7f8756ca91a6]
 2: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char 
const*, ...)+0) [0x7f8756ca9381]
 3: (PurgeQueue::_recover()+0x4ad) [0x55f1666ad26d]
 4: (()+0x2b0353) [0x55f1666ad353]
 5: (FunctionContext::finish(int)+0x2c) [0x55f16652b63c]
 6: (Context::complete(int)+0x9) [0x55f166529339]
 7: (Finisher::finisher_thread_entry()+0x15e) [0x7f8756cf231e]
 8: (()+0x84f9) [0x7f87565a64f9]
 9: (clone()+0x3f) [0x7f87557affbf]

2022-05-18 12:27:40.092 7f8748561700 -1 *** Caught signal (Aborted) **
 in thread 7f8748561700 thread_name:PQ_Finisher

 ceph version 14.2.16-402-g7d47dbaf4d 
(7d47dbaf4d0960a2e910628360ae36def84ed913) nautilus (stable)

 It's a production cluster, so it's fairly urgent.

Salu2!
--
Miguel Armas
CanaryTek Consultoria y Sistemas SL
http://www.canarytek.com/

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: No rebalance after ceph osd crush unlink

2022-05-18 Thread Dan van der Ster
Hi,

It's interesting that crushtool doesn't include the shadow tree -- I
am pretty sure that used to be included. I don't suggest editing the
crush map, compiling, then re-injecting -- I don't know what it will
do in this case.

What you could do instead is something like:
* ceph osd getcrushmap -o crush.map # backup the map
* ceph osd set norebalance # disable rebalancing while we experiment
* ceph osd crush reweight-all # see if this fixes the crush shadow tree

Then unset norebalance if the crush tree looks good. Or, if the crush
tree isn't what you expect, revert to your backup with `ceph osd
setcrushmap -i crush.map`.
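
Put together, that sequence would look roughly like this (the --show-shadow check is
only there to verify the result):

ceph osd getcrushmap -o crush.map        # backup the map
ceph osd set norebalance
ceph osd crush reweight-all
ceph osd crush tree --show-shadow        # confirm the shadow tree is fixed
ceph osd unset norebalance               # if it looks good
# otherwise revert: ceph osd setcrushmap -i crush.map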

-- dan



On Wed, May 18, 2022 at 12:47 PM Frank Schilder  wrote:
>
> Hi Dan,
>
> thanks for pointing me to this. Yes, it looks like a/the bug, the shadow tree 
> is not changed although it should be updated as well. This is not even shown 
> in the crush map I exported with getcrushmap. The option --show-shadow did 
> the trick.
>
> Will `ceph osd crush reweight-all` actually remove these shadow leafs or just 
> set the weight to 0? I need to link this host later again and I would like a 
> solution as clean as possible. What would, for example, happen if I edit the 
> crush map and execute setcrushmap? Will it recompile the correct crush map 
> from the textual definition, or will these hanging leafs persist?
>
> Thanks!
> =
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> 
> From: Dan van der Ster 
> Sent: 18 May 2022 12:04:07
> To: Frank Schilder
> Cc: ceph-users
> Subject: Re: [ceph-users] No rebalance after ceph osd crush unlink
>
> Hi Frank,
>
> Did you check the shadow tree (the one with tilde's in the name, seen
> with `ceph osd crush tree --show-shadow`)? Maybe the host was removed
> in the outer tree, but not the one used for device-type selection.
> There were bugs in this area before, e.g. 
> https://tracker.ceph.com/issues/48065
> In those cases, the way to make the crush tree consistent again was
> `ceph osd crush reweight-all`.
>
> Cheers, Dan
>
>
>
> On Wed, May 18, 2022 at 11:51 AM Frank Schilder  wrote:
> >
> > Dear all,
> >
> > I have a strange problem. I have some hosts linked under an additional 
> > logical data center and needed to unlink two of the hosts. After unlinking 
> > the first host with
> >
> > ceph osd crush unlink ceph-18 MultiSite
> >
> > the crush map for this data center is updated correctly:
> >
> > datacenter MultiSite {
> > id -148 # do not change unnecessarily
> > id -149 class hdd   # do not change unnecessarily
> > id -150 class ssd   # do not change unnecessarily
> > id -236 class rbd_meta  # do not change unnecessarily
> > id -200 class rbd_data  # do not change unnecessarily
> > id -320 class rbd_perf  # do not change unnecessarily
> > # weight 643.321
> > alg straw2
> > hash 0  # rjenkins1
> > item ceph-04 weight 79.691
> > item ceph-05 weight 81.474
> > item ceph-06 weight 79.691
> > item ceph-07 weight 79.691
> > item ceph-19 weight 81.695
> > item ceph-20 weight 81.695
> > item ceph-21 weight 79.691
> > item ceph-22 weight 79.691
> > }
> >
> > The host is gone. However, nothing happened. The pools with the crush rule
> >
> > rule ms-ssd {
> > id 12
> > type replicated
> > min_size 1
> > max_size 10
> > step take MultiSite class rbd_data
> > step chooseleaf firstn 0 type host
> > step emit
> > }
> >
> > should now move data away from OSDs on this host, but nothing is happening. 
> > A pool with crush rule ms-ssd is:
> >
> > # ceph osd pool get sr-rbd-meta-one all
> > size: 3
> > min_size: 2
> > pg_num: 128
> > pgp_num: 128
> > crush_rule: ms-ssd
> > hashpspool: true
> > nodelete: true
> > nopgchange: false
> > nosizechange: false
> > write_fadvise_dontneed: false
> > noscrub: false
> > nodeep-scrub: false
> > use_gmt_hitset: 1
> > auid: 0
> > fast_read: 0
> >
> > However, its happily keeping data on the OSDs of host ceph-18. For example, 
> > one of the OSDs on this host has ID 1076. There are 4 PGs using this OSD:
> >
> > # ceph pg ls-by-pool sr-rbd-meta-one | grep 1076
> > 1.33 2500 0   0 7561564817834125 
> > 3073 active+clean 2022-05-18 10:54:41.840097 757122'10112944  
> > 757122:84604327[574,286,1076]p574[574,286,1076]p574 2022-05-18 
> > 04:24:32.900261 2022-05-11 19:56:32.781889
> > 1.3d 2590 0   0 7962393603380 64 
> > 3006 active+clean 2022-05-18 10:54:41.749090 757122'24166942  
> > 757122:57010202 [1074,1076,1052]p1074 [1074,1076,1052]p1074 2022-05-18 
> > 06:16:35.605026 2022-05-16 19:37:56.829763
> > 1.4d 2490 0   0 7136789485690105 
> > 3070 active+clean 2022-05-18 10:54:41.738918  7571

[ceph-users] Best way to change disk in controller disk without affect cluster

2022-05-18 Thread Jorge JP
Hello!

I have a Ceph cluster with 6 nodes and 6 HDDs in each one. The status of
my cluster is OK and the pool is at 45.25% (95.55 TB of 211.14 TB). I don't have any
problems.

I want to change the position of various disks in the disk controller of some
nodes and I don't know what the best way is:

 - Stop the osd and move the disk to its new position (hotplug).

 - Reweight the osd to 0 and move the PGs to other osds, then stop the osd and change
its position.

I think the first option is OK: the data is not deleted, and when I have moved the
disk the server will recognise it again and I can start the osd without problems.

Which is the best option in your view? Could there be any problems?

Best regards!!
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: No rebalance after ceph osd crush unlink

2022-05-18 Thread Dan van der Ster
Hi Frank,

Did you check the shadow tree (the one with tilde's in the name, seen
with `ceph osd crush tree --show-shadow`)? Maybe the host was removed
in the outer tree, but not the one used for device-type selection.
There were bugs in this area before, e.g. https://tracker.ceph.com/issues/48065
In those cases, the way to make the crush tree consistent again was
`ceph osd crush reweight-all`.

Cheers, Dan



On Wed, May 18, 2022 at 11:51 AM Frank Schilder  wrote:
>
> Dear all,
>
> I have a strange problem. I have some hosts linked under an additional 
> logical data center and needed to unlink two of the hosts. After unlinking 
> the first host with
>
> ceph osd crush unlink ceph-18 MultiSite
>
> the crush map for this data center is updated correctly:
>
> datacenter MultiSite {
> id -148 # do not change unnecessarily
> id -149 class hdd   # do not change unnecessarily
> id -150 class ssd   # do not change unnecessarily
> id -236 class rbd_meta  # do not change unnecessarily
> id -200 class rbd_data  # do not change unnecessarily
> id -320 class rbd_perf  # do not change unnecessarily
> # weight 643.321
> alg straw2
> hash 0  # rjenkins1
> item ceph-04 weight 79.691
> item ceph-05 weight 81.474
> item ceph-06 weight 79.691
> item ceph-07 weight 79.691
> item ceph-19 weight 81.695
> item ceph-20 weight 81.695
> item ceph-21 weight 79.691
> item ceph-22 weight 79.691
> }
>
> The host is gone. However, nothing happened. The pools with the crush rule
>
> rule ms-ssd {
> id 12
> type replicated
> min_size 1
> max_size 10
> step take MultiSite class rbd_data
> step chooseleaf firstn 0 type host
> step emit
> }
>
> should now move data away from OSDs on this host, but nothing is happening. A 
> pool with crush rule ms-ssd is:
>
> # ceph osd pool get sr-rbd-meta-one all
> size: 3
> min_size: 2
> pg_num: 128
> pgp_num: 128
> crush_rule: ms-ssd
> hashpspool: true
> nodelete: true
> nopgchange: false
> nosizechange: false
> write_fadvise_dontneed: false
> noscrub: false
> nodeep-scrub: false
> use_gmt_hitset: 1
> auid: 0
> fast_read: 0
>
> However, its happily keeping data on the OSDs of host ceph-18. For example, 
> one of the OSDs on this host has ID 1076. There are 4 PGs using this OSD:
>
> # ceph pg ls-by-pool sr-rbd-meta-one | grep 1076
> 1.33 2500 0   0 7561564817834125 3073 
> active+clean 2022-05-18 10:54:41.840097 757122'10112944  757122:84604327
> [574,286,1076]p574[574,286,1076]p574 2022-05-18 04:24:32.900261 
> 2022-05-11 19:56:32.781889
> 1.3d 2590 0   0 7962393603380 64 3006 
> active+clean 2022-05-18 10:54:41.749090 757122'24166942  757122:57010202 
> [1074,1076,1052]p1074 [1074,1076,1052]p1074 2022-05-18 06:16:35.605026 
> 2022-05-16 19:37:56.829763
> 1.4d 2490 0   0 7136789485690105 3070 
> active+clean 2022-05-18 10:54:41.738918  757119'5861104  757122:45718157  
> [1072,262,1076]p1072  [1072,262,1076]p1072 2022-05-18 06:50:04.731194 
> 2022-05-18 06:50:04.731194
> 1.70 2720 0   0 8143173984591 76 3007 
> active+clean 2022-05-18 10:54:41.743604 757122'11849453  757122:72537747
> [268,279,1076]p268[268,279,1076]p268 2022-05-17 15:43:46.512941 
> 2022-05-17 15:43:46.512941
>
> I don't understand why these are not remapped and rebalancing. Any ideas?
>
> Version is mimic latest.
>
> Thanks and best regards,
> =
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Upgrade from v15.2.16 to v16.2.7 not starting

2022-05-18 Thread Lo Re Giuseppe
Hi,

We have happily tested the upgrade from v15.2.16 to v16.2.7 with cephadm on a 
test cluster made of 3 nodes and everything went smoothly.
Today we started the very same operation on the production one (20 OSD servers, 
720 HDDs) and the upgrade process doesn’t do anything at all…

To be more specific, we have issued the command

ceph orch upgrade start --image quay.io/ceph/ceph:v16.2.7

and soon after “ceph -s” reports

Upgrade to quay.io/ceph/ceph:v16.2.7 (0s)
  []

But only for a few seconds; after that:

[root@naret-monitor01 ~]# ceph -s
  cluster:
id: 63334166-d991-11eb-99de-40a6b72108d0
health: HEALTH_OK

  services:
mon: 3 daemons, quorum naret-monitor01,naret-monitor02,naret-monitor03 (age 
7d)
mgr: naret-monitor01.tvddjv(active, since 60s), standbys: 
naret-monitor02.btynnb
mds: cephfs:1 {0=cephfs.naret-monitor01.uvevbf=up:active} 2 up:standby
osd: 760 osds: 760 up (since 6d), 760 in (since 2w)
rgw: 3 daemons active (cscs-realm.naret-zone.naret-rgw01.qvhhbi, 
cscs-realm.naret-zone.naret-rgw02.pduagk, 
cscs-realm.naret-zone.naret-rgw03.aqdkkb)

  task status:

  data:
pools:   30 pools, 16497 pgs
objects: 833.14M objects, 3.1 PiB
usage:   5.0 PiB used, 5.9 PiB / 11 PiB avail
pgs: 16460 active+clean
 37    active+clean+scrubbing+deep

  io:
client:   4.7 MiB/s rd, 4.0 MiB/s wr, 122 op/s rd, 47 op/s wr

  progress:
Removing image fulen-hdd/c991f6fdf41964 from trash (53s)
  [] (remaining: 81m)



The command “ceph orch upgrade status” says:

{
"target_image": "quay.io/ceph/ceph:v16.2.7",
"in_progress": true,
"services_complete": [],
"message": ""
}

It doesn’t even pull the container image.
I have tested that the podman pull command works; I was able to pull
quay.io/ceph/ceph:v16.2.7 manually.

“ceph -w” and “ceph -W cephadm” don’t report any activity related to the 
upgrade.


Has anyone seen anything similar?
Do you have advice on how to work out what is keeping the upgrade process from
actually starting?

Thanks in advance,

Giuseppe
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io