[ceph-users] Re: CephFS: convert directory into subvolume

2023-10-10 Thread Eugen Block

Hi,

check out this thread [1] as well, where Anh Phan pointed out:
Not really sure what you want, but for simplicity, just move the folder  
to the following structure:


/volumes/[Sub Volume Group Name]/[Sub Volume Name]

ceph will recognize it (no extended attributes needed). If you use a  
subvolumegroup name other than "_nogroup", you must provide it  
in all subvolume commands [--group_name ]


You'll need an existing group (_nogroup won't work) to move your  
directory tree into. That worked as expected in my test.
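
For illustration, a minimal sketch of that sequence for the setup described  
below (the group name 'mygroup' is made up, and the CephFS root is assumed  
to be mounted at /mnt/tank):

  ceph fs subvolumegroup create tank mygroup                     # creates /volumes/mygroup in the 'tank' filesystem
  mv /mnt/tank/database /mnt/tank/volumes/mygroup/database       # move the existing tree into the group
  ceph fs subvolume ls tank --group_name mygroup                 # 'database' should now be listed as a subvolume
  ceph fs subvolume getpath tank database --group_name mygroup

Obviously quiesce any clients writing to the directory before the move.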


Regards,
Eugen

[1]  
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/G4ZWGGUPPFQIOVB4SFAIK73H3NLU2WRF/#HB3WW2ENNBBC2ODVSWX2DBGZO2KVB5VK



Zitat von jie.zha...@gmail.com:


Hello,

I'm following this thread and the original. I'm trying to convert  
directories into subvolumes. Where I'm stuck is how to move a  
directory into the subvolume root directory.


I have a volume 'tank' and it's mounted on the host as '/mnt/tank'.  
I have subfolders '/mnt/tank/database', '/mnt/tank/gitlab', etc...


I create a subvolume and getpath gives me:
/volumes/_nogroup/database/4a74

Questions:
1) How do I move /mnt/tank/database into /volumes/_nogroup/database/4a...74
2) Each of the directories have different pools associated with  
them, do I need to create the sub volume in the same pool?
3) Or can I just move '/mnt/tank/gitlab' -->  
/volumes/_nogroup/gitlab without first creating the volume?  This  
would skip question 2..


Thx!

Jie
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io





[ceph-users] Re: Ceph 18: Unable to delete image after incomplete migration "image being migrated"

2023-10-10 Thread Eugen Block

Hi,

then I misinterpreted your message and thought you were actually  
surprised about the trash image. Yeah I don't think messing with  
hexedit really helped here, but I'm not sure either. Anyway, let us  
know how it went.


Zitat von Rhys Goodwin :

Thanks again Eugen. Looking at my command history it does look like  
I did execute the migration but didn't commit it. I wasn't surprised  
to see it in the trash based on the doc you mentioned, I only tried  
the restore as a desperate measure to clean up my mess. It doesn't  
help that I messed around like this, including with hexedit :O. I  
should have reached out before messing around.


I'll proceed with the migrate/re-create and report back. I'm just  
crossing my fingers that I'll be allowed to delete the pool. It's a  
lesson to me to take more care of my wee cluster.


Cheers,
Rhys

--- Original Message ---
On Wednesday, October 11th, 2023 at 7:54 AM, Eugen Block  
 wrote:




Hi,

I just re-read the docs on rbd migration [1], haven't done that in a
while, and it states the following:

> Note that the source image will be moved to the RBD trash to avoid
> mistaken usage during the migration process


So it was expected that your source image was in the trash during the
migration, no need to restore. According to your history you also ran
the "execute" command, do you remember if ran successfully as well?
Did you "execute" after the prepare command completed? But you also
state that the target image isn't there anymore, so it's hard to tell
what exactly happened here. I'm not sure how to continue from here,
maybe migrating/re-creating is the only way now.

[1] https://docs.ceph.com/en/quincy/rbd/rbd-live-migration/

Zitat von Rhys Goodwin rhys.good...@proton.me:

> Thanks Eugen.
>
> root@hcn03:~# rbd status infra-pool/sophosbuild
> 2023-10-10T09:44:21.234+ 7f1675c524c0 -1 librbd::Migration:
> open_images: failed to open destination image images/65d188c5f5a34:
> (2) No such file or directory
> rbd: getting migration status failed: (2) No such file or directory
> Watchers: none
>
> I've checked over the other pools again, but they only contain
> Openstack images. There are only 42 images in total across all
> pools. In fact, the "infra-pool" pool only has 3 images, including
> the faulty one. So migrating/re-creating is not a big deal. It's
> more just that I'd like to learn more about how to resolve such
> issues, if possible.
>
> Good call on the history. I found this smoking gun with: 'history
> |grep "rbd migration":
> rbd migration prepare infra-pool/sophosbuild images/sophosbuild
> rbd migration execute images/sophosbuild
>
> But images/sophosbuild is definitely not there anymore, and not in
> the trash. It looks like I was missing the commit.
>
> Kind regards,
> Rhys
>
> --- Original Message ---
>
> Eugen Block Wrote:
>
> Hi, there are a couple of things I would check before migrating all
> images. What's the current 'rbd status infra-pool/sophosbuild'? You
> probably don't have an infinite number of pools so I would also
> check if any of the other pools contains an image with the same
> name, just in case you wanted to keep its original name and only
> change the pool. Even if you don't have the terminal output, maybe
> you find some of the commands in the history?
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io






___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Problem: Upgrading CEPH Pacific to Quincy resulted in the CEPH storage pool no longer functioning.

2023-10-10 Thread Konstantin Shalygin
Hi,

You need to revert your packages from Quincy to Pacific. The `dnf downgrade ceph-mon`
command should help with this.
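
A rough sketch of what that could look like on a mon host (the version string
below is only an example; pick whichever Pacific build your repositories still
carry, and in practice you would likely downgrade all ceph packages on the
host, not only ceph-mon):

  dnf --showduplicates list ceph-mon     # see which Pacific builds are still available
  dnf downgrade ceph-mon-16.2.13         # example version; use one from the list above
  systemctl restart ceph-mon.target      # start the mon again on the downgraded binary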

k
Sent from my iPhone

> On Oct 11, 2023, at 03:22, Waywatcher  wrote:
> 
> I am unable to get any of the current monitors to run.  They all fail to start
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Problem: Upgrading CEPH Pacific to Quincy resulted in the CEPH storage pool no longer functioning.

2023-10-10 Thread Dan Mulkiewicz
I think the problem is that there are no MONs running now in the cluster
because he upgraded them without heeding the warning to update the database
first. Are you suggesting he deploy new Pacific MONs using levelDB, then
update them to rocksdb after the CEPH cluster recovers?


A bit confused by your response...

On Mon, Oct 9, 2023 at 11:42 PM Konstantin Shalygin  wrote:

> Hi,
>
> For this upgrade you need at least some mon's up, then you can redeploy
> your pacific mon's to rocksdb
>
> k
> Sent from my iPhone
>
> > On Oct 10, 2023, at 02:01, Waywatcher  wrote:
> >
> > I upgraded my CEPH cluster without properly following the mon upgrade, so
> > the mons were still on leveldb.
> >
> > Proxmox and CEPH were updated to the latest for the current release.
> > https://pve.proxmox.com/wiki/Ceph_Pacific_to_Quincy
> >
> >   1. The upgrade to Quincy states a recommendation that Mons are using
> >   RocksDB.
> >   2. Leveldb support has been removed from quincy.
> >
> >
> > The monitors were still running as leveldb.
> >
> >   1. Does this mean the mons cannot work at all since they are levelDB?
> >
> >
> > I upgraded all nodes to the quincy release 17.2.6 and restarted the mons.
> >
> > At this point the cluster stopped responding.
> > `ceph` commands do not work since the service fails to start.
> >
> > Are there steps for recovery?
> >
> > 1) Roll back to Pacific without being able to use CEPH commands (ceph
> orch
> > upgrade start --ceph-version ).
> > 2) Rebuild the monitors using data from the OSDs while maintaining Quincy
> > release.
> > 3) Is this actually related to the bug about 17.2.6 (which is what
> > Proxmox/CEPH upgrades to) https://tracker.ceph.com/issues/58156 ?
> >
> >
> > I ran the upgrade on another cluster prior to this without issue. The
> Mons
> > were set with RocksDB and running on Quincy 17.2.6.
> >
> > I appreciate any suggestions.
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Hardware recommendations for a Ceph cluster

2023-10-10 Thread Christian Wuerdig
On Mon, 9 Oct 2023 at 14:24, Anthony D'Atri  wrote:

>
>
> > AFAIK the standing recommendation for all flash setups is to prefer fewer
> > but faster cores
>
> Hrm, I think this might depend on what you’re solving for.  This is the
> conventional wisdom for MDS for sure.  My sense is that OSDs can use
> multiple cores fairly well, so I might look at the cores * GHz product.
> Especially since this use-case sounds like long-tail performance probably
> isn’t worth thousands.  Only four OSD servers, Neutron, Kingston.  I don’t
> think the OP has stated any performance goals other than being more
> suitable to OpenStack instances than LFF spinners.
>

Well, the 75F3 seems to retail for less than the 7713P, so it should
technically be cheaper but then availability and supplier quotes are always
an important factor.


>
> > so something like a 75F3 might be yielding better latency.
> > Plus you probably want to experiment with partitioning the NVMEs and
> > running multiple OSDs per drive - either 2 or 4.
>
> Mark Nelson has authored a series of blog posts that explore this in great
> detail over a number of releases.  TL;DR: with Quincy or Reef, especially,
> my sense is that multiple OSDs per NVMe device is not the clear win that it
> once was, and just eats more RAM.  Mark has also authored detailed posts
> about OSD performance vs cores per OSD, though IIRC those are for one OSD
> in isolation.  In a real-world cluster, especially one this small, I
> suspect that replication and the network will be bottlenecks before either
> of the factors discussed above.
>
>
Thanks for reminding me of those. One thing I'm missing from
https://ceph.io/en/news/blog/2023/reef-osds-per-nvme/ is the NVMe
utilization - no point in buying NVMe drives that are blazingly fast (in terms
of sustained random 4K IOPS performance) if you have no chance to actually
utilize them.
In summary it seems that if you have many cores, multiple OSDs per NVMe
provide a benefit; with fewer cores, not so much. Still, it would also be
good to see the same benchmark with a faster CPU (but fewer cores) to see
what the actual difference is, but I guess duplicating the test setup with a
different CPU is a bit tricky budget-wise.
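
For anyone wanting to try the split layout, a hedged sketch of provisioning
two OSDs per device with ceph-volume (device paths are placeholders):

  ceph-volume lvm batch --osds-per-device 2 /dev/nvme0n1 /dev/nvme1n1

With cephadm-managed clusters the equivalent knob is the osds_per_device
field in an OSD service spec.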


> ymmv.
>
>
>
> >
> > On Sat, 7 Oct 2023 at 08:23, Gustavo Fahnle  wrote:
> >
> >> Hi,
> >>
> >> Currently, I have an OpenStack installation with a Ceph cluster
> consisting
> >> of 4 servers for OSD, each with 16TB SATA HDDs. My intention is to add a
> >> second, independent Ceph cluster to provide faster disks for OpenStack
> VMs.
> >> The idea for this second cluster is to exclusively provide RBD services
> to
> >> OpenStack. I plan to start with a cluster composed of 3 mon/mgr nodes
> >> similar to what we currently have (3 virtualized servers with VMware)
> with
> >> 4 cores, 8GB of memory, 80GB disk and 10GB network
> >> each server.
> >> In the current cluster, these nodes have low resource consumption, less
> >> than 10% CPU usage, 40% memory usage, and less than 100Mb/s of network
> >> usage.
> >>
> >> For the OSDs, I'm thinking of starting with 3 or 4 servers, specifically
> >> Supermicro AS-1114S-WN10RT, each with:
> >>
> >> 1 AMD EPYC 7713P Gen 3 processor (64 Core, 128 Threads, 2.0GHz)
> >> 256GB of RAM
> >> 2 x NVME 1TB for the operating system
> >> 10 x NVME Kingston DC1500M U.2 7.68TB for the OSDs
> >> Two Intel NIC E810-XXVDA2 25GbE Dual Port (2 x SFP28) PCIe 4.0 x8 cards
> >> Connected to 2 MikroTik CRS518-16XS-2XQ-RM switches at 100GbE per server
> >> Connection to OpenStack would be via 4 x 10GB to our core switch.
> >>
> >> I would like to hear opinions about this configuration, recommendations,
> >> criticisms, etc.
> >>
> >> If any of you have references or experience with any of the components
> in
> >> this initial configuration, they would be very welcome.
> >>
> >> Thank you very much in advance.
> >>
> >> Gustavo Fahnle
> >>
> >> ___
> >> ceph-users mailing list -- ceph-users@ceph.io
> >> To unsubscribe send an email to ceph-users-le...@ceph.io
> >>
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Hardware recommendations for a Ceph cluster

2023-10-10 Thread Gustavo Fahnle
Anthony,

Thank you very much for your comments; they were very helpful.
It made me reconsider some aspects of the configuration,
and it also helped me see that I wasn't too far off in general.

I'll respond to some of your suggestions, explaining my reasons.

>  Indeed, I know from experience that LFF spinners don't cut it for boot 
> drives.  Even with strawberries.

My experience with LFF spinners is the same; when I set up the first cluster, 
it was the only economically viable option.

> Do you strictly need a second cluster?  Or could you just constrain your 
> pools on the existing cluster based on deviceclass?

I want to set up a second cluster since the first one is on leased hardware, 
and I want to be prepared for when it expires.

> SMCI offers chassis that are NVMe-only I think.  The above I think comes with 
> an HBA you don't need or want.

The HBA is only for the operating system disks; the rest of the NVMe U.2 drives 
are connected to the PCIe bus.

> The Kingstons are cost-effective, but last I looked up the specs they were 
> kinda meh.  Beats spinners though.
> This is more CPU and more RAM than you need for 10xNVMe unless you're also 
> going to run RGW or other compute on them.

I know there are better drives, but these U.2 drives are more affordable, just 
like the server.
I did an exercise with U.3 drives that had double the capacity, and each server 
cost twice as much.
It's a good option, but with my current budget, it's not feasible.

>> Two Intel NIC E810-XXVDA2 25GbE Dual Port (2 x SFP28) PCIe 4.0 x8 cards

> Why two?

>> Connected to 2 MikroTik CRS518-16XS-2XQ-RM switches at 100GbE per server
>> Connection to OpenStack would be via 4 x 10GB to our core switch.

> Might 25GE be an alternative?

Again, for economic reasons,
I installed two dual-port 25GbE NICs to create a LAG and achieve a 100Gb connection.
The connection to the core switch is also through another LAG of 4 x 10Gb 
(and if needed, I can add more ports).
This is because our core switch doesn't have any free SFP ports.
For now, I can only purchase MikroTik switches due to their cost,
but in the future, when the leasing period ends, I'll consider other types of 
switches.


Thank you so much
Gustavo







De: Anthony D'Atri 
Enviado: viernes, 6 de octubre de 2023 16:52
Para: Gustavo Fahnle 
Cc: ceph-users@ceph.io 
Asunto: Re: [ceph-users] Hardware recommendations for a Ceph cluster


> Currently, I have an OpenStack installation with a Ceph cluster consisting of 
> 4 servers for OSD, each with 16TB SATA HDDs. My intention is to add a second, 
> independent Ceph cluster to provide faster disks for OpenStack VMs.

Indeed, I know from experience that LFF spinners don't cut it for boot drives.  
Even with strawberries.

> The idea for this second cluster is to exclusively provide RBD services to 
> OpenStack

Do you strictly need a second cluster?  Or could you just constrain your pools 
on the existing cluster based on deviceclass?
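
For completeness, a minimal sketch of the device-class approach (rule and pool
names are made up, and it assumes the new OSDs come up with the 'ssd' device
class):

  ceph osd crush rule create-replicated fast-ssd default host ssd   # rule restricted to ssd-class OSDs
  ceph osd pool create fast-rbd 128 128 replicated fast-ssd         # new RBD pool bound to that rule
  rbd pool init fast-rbd

Note that the existing pools would also need an hdd-restricted rule if they
currently use a class-less default rule, otherwise they would start placing
data on the new SSD OSDs as well.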

> For the OSDs, I'm thinking of starting with 3 or 4 servers, specifically 
> Supermicro AS-1114S-WN10RT,

SMCI offers chassis that are NVMe-only I think.  The above I think comes with 
an HBA you don't need or want.

> each with:
>
> 1 AMD EPYC 7713P Gen 3 processor (64 Core, 128 Threads, 2.0GHz)
> 256GB of RAM
> 2 x NVME 1TB for the operating system
> 10 x NVME Kingston DC1500M U.2 7.68TB for the OSDs

The Kingstons are cost-effective, but last I looked up the specs they were 
kinda meh.  Beats spinners though.
This is more CPU and more RAM than you need for 10xNVMe unless you're also 
going to run RGW or other compute on them.

> Two Intel NIC E810-XXVDA2 25GbE Dual Port (2 x SFP28) PCIe 4.0 x8 cards

Why two?

> Connected to 2 MikroTik CRS518-16XS-2XQ-RM switches at 100GbE per server
> Connection to OpenStack would be via 4 x 10GB to our core switch.

Might 25GE be an alternative?


>
> I would like to hear opinions about this configuration, recommendations, 
> criticisms, etc.
>
> If any of you have references or experience with any of the components in 
> this initial configuration, they would be very welcome.
>
> Thank you very much in advance.
>
> Gustavo Fahnle
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Unable to fix 1 Inconsistent PG

2023-10-10 Thread Wesley Dillingham
In case it's not obvious I forgot a space: "rados list-inconsistent-obj
15.f4f"

Respectfully,

*Wes Dillingham*
w...@wesdillingham.com
LinkedIn 


On Tue, Oct 10, 2023 at 4:55 PM Wesley Dillingham 
wrote:

> You likely have a failing disk, what does "rados
> list-inconsistent-obj15.f4f" return?
>
> It should identify the failing osd. Assuming "ceph osd ok-to-stop "
> returns in the affirmative for that osd, you likely need to stop the
> associated osd daemon, then mark it out "ceph osd out  wait for it
> to backfill the inconsistent PG and then re-issue the repair. Then turn to
> replacing the disk.
>
> Respectfully,
>
> *Wes Dillingham*
> w...@wesdillingham.com
> LinkedIn 
>
>
> On Tue, Oct 10, 2023 at 4:46 PM  wrote:
>
>> Hello All,
>> Greetings. We've a Ceph Cluster with the version
>> *ceph version 14.2.16-402-g7d47dbaf4d
>> (7d47dbaf4d0960a2e910628360ae36def84ed913) nautilus (stable)
>>
>>
>> ===
>>
>> Issues: 1 pg in inconsistent state and does not recover.
>>
>> # ceph -s
>>   cluster:
>> id: 30d6f7ee-fa02-4ab3-8a09-9321c8002794
>> health: HEALTH_ERR
>> 2 large omap objects
>> 1 pools have many more objects per pg than average
>> 159224 scrub errors
>> Possible data damage: 1 pg inconsistent
>> 2 pgs not deep-scrubbed in time
>> 2 pgs not scrubbed in time
>>
>> # ceph health detail
>>
>> HEALTH_ERR 2 large omap objects; 1 pools have many more objects per pg
>> than average; 159224 scrub errors; Possible data damage: 1 pg inconsistent;
>> 2 pgs not deep-scrubbed in time; 2 pgs not scrubbed in time
>> LARGE_OMAP_OBJECTS 2 large omap objects
>> 2 large objects found in pool 'default.rgw.log'
>> Search the cluster log for 'Large omap object found' for more details.
>> MANY_OBJECTS_PER_PG 1 pools have many more objects per pg than average
>> pool iscsi-images objects per pg (541376) is more than 14.9829 times
>> cluster average (36133)
>> OSD_SCRUB_ERRORS 159224 scrub errors
>> PG_DAMAGED Possible data damage: 1 pg inconsistent
>> pg 15.f4f is active+clean+inconsistent, acting
>> [238,106,402,266,374,498,590,627,684,73,66]
>> PG_NOT_DEEP_SCRUBBED 2 pgs not deep-scrubbed in time
>> pg 1.5c not deep-scrubbed since 2021-04-05 23:20:13.714446
>> pg 1.55 not deep-scrubbed since 2021-04-11 07:12:37.185074
>> PG_NOT_SCRUBBED 2 pgs not scrubbed in time
>> pg 1.5c not scrubbed since 2023-07-10 21:15:50.352848
>> pg 1.55 not scrubbed since 2023-06-24 10:02:10.038311
>>
>> ==
>>
>>
>> We have implemented below command to resolve it
>>
>> 1. We have ran pg repair command "ceph pg repair 15.f4f
>> 2. We have restarted associated  OSDs that is mapped to pg 15.f4f
>> 3. We tuned osd_max_scrubs value and set it to 9.
>> 4. We have done scrub and deep scrub by ceph pg scrub 15.4f4 & ceph pg
>> deep-scrub 15.f4f
>> 5. We also tried to ceph-objectstore-tool command to fix it
>> ==
>>
>> We have checked the logs of the primary OSD of the respective
>> inconsistent PG and found the below errors.
>> [ERR] : 15.f4fs0 shard 402(2)
>> 15:f2f3fff4:::94a51ddb-a94f-47bc-9068-509e8c09af9a.7862003.20_c%2f4%2fd61%2f885%2f49627697%2f192_1.ts:head
>> : missing
>> /var/log/ceph/ceph-osd.238.log:339:2023-10-06 00:37:06.410 7f65024cb700
>> -1 log_channel(cluster) log [ERR] : 15.f4fs0 shard 266(3)
>> 15:f2f2:::94a51ddb-a94f-47bc-9068-509e8c09af9a.11432468.3_TN8QHE_04.20.2020_08.41%2fCV_MAGNETIC%2fV_274396%2fCHUNK_2440801%2fSFILE_CONTAINER_031.FOLDER%2f3:head
>> : missing
>> /var/log/ceph/ceph-osd.238.log:340:2023-10-06 00:37:06.410 7f65024cb700
>> -1 log_channel(cluster) log [ERR] : 15.f4fs0 shard 402(2)
>> 15:f2f2:::94a51ddb-a94f-47bc-9068-509e8c09af9a.11432468.3_TN8QHE_04.20.2020_08.41%2fCV_MAGNETIC%2fV_274396%2fCHUNK_2440801%2fSFILE_CONTAINER_031.FOLDER%2f3:head
>> : missing
>> /var/log/ceph/ceph-osd.238.log:341:2023-10-06 00:37:06.410 7f65024cb700
>> -1 log_channel(cluster) log [ERR] : 15.f4fs0 shard 590(6)
>> 15:f2f2:::94a51ddb-a94f-47bc-9068-509e8c09af9a.11432468.3_TN8QHE_04.20.2020_08.41%2fCV_MAGNETIC%2fV_274396%2fCHUNK_2440801%2fSFILE_CONTAINER_031.FOLDER%2f3:head
>> : missing
>> ===
>> and also we noticed that the no. of scrub errors in ceph health status
>> are matching with the ERR log entries in the primary OSD logs of the
>> inconsistent PG as below
>> grep -Hn 'ERR' /var/log/ceph/ceph-osd.238.log|wc -l
>> 159226
>> 
>> Ceph is cleaning the scrub errors but rate of scrub repair is very slow
>> (avg of 200 scrub errors per day) ,we want to increase the rate of scrub
>> error repair to finish the cleanup of pending 159224 scrub errors.
>>
>> #ceph pg 15.f4f query
>>
>>
>> {
>> "state": "active+clean+inconsistent",
>> "snap_trimq": "[]",
>

[ceph-users] Re: Unable to fix 1 Inconsistent PG

2023-10-10 Thread Wesley Dillingham
You likely have a failing disk, what does "rados
list-inconsistent-obj15.f4f" return?

It should identify the failing osd. Assuming "ceph osd ok-to-stop "
returns in the affirmative for that osd, you likely need to stop the
associated osd daemon, then mark it out "ceph osd out  wait for it
to backfill the inconsistent PG and then re-issue the repair. Then turn to
replacing the disk.
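
Roughly, that sequence as commands (15.f4f is the PG from the report; OSD 238
is only a placeholder for whichever OSD list-inconsistent-obj actually blames):

  rados list-inconsistent-obj 15.f4f --format=json-pretty   # shows which shard/OSD holds the bad copies
  ceph osd ok-to-stop 238                                   # confirm stopping it won't make PGs unavailable
  systemctl stop ceph-osd@238                               # run on the host carrying that OSD
  ceph osd out 238                                          # let the PG backfill to a healthy OSD
  ceph pg repair 15.f4f                                     # re-issue the repair once backfill completes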

Respectfully,

*Wes Dillingham*
w...@wesdillingham.com
LinkedIn 


On Tue, Oct 10, 2023 at 4:46 PM  wrote:

> Hello All,
> Greetings. We've a Ceph Cluster with the version
> *ceph version 14.2.16-402-g7d47dbaf4d
> (7d47dbaf4d0960a2e910628360ae36def84ed913) nautilus (stable)
>
>
> ===
>
> Issues: 1 pg in inconsistent state and does not recover.
>
> # ceph -s
>   cluster:
> id: 30d6f7ee-fa02-4ab3-8a09-9321c8002794
> health: HEALTH_ERR
> 2 large omap objects
> 1 pools have many more objects per pg than average
> 159224 scrub errors
> Possible data damage: 1 pg inconsistent
> 2 pgs not deep-scrubbed in time
> 2 pgs not scrubbed in time
>
> # ceph health detail
>
> HEALTH_ERR 2 large omap objects; 1 pools have many more objects per pg
> than average; 159224 scrub errors; Possible data damage: 1 pg inconsistent;
> 2 pgs not deep-scrubbed in time; 2 pgs not scrubbed in time
> LARGE_OMAP_OBJECTS 2 large omap objects
> 2 large objects found in pool 'default.rgw.log'
> Search the cluster log for 'Large omap object found' for more details.
> MANY_OBJECTS_PER_PG 1 pools have many more objects per pg than average
> pool iscsi-images objects per pg (541376) is more than 14.9829 times
> cluster average (36133)
> OSD_SCRUB_ERRORS 159224 scrub errors
> PG_DAMAGED Possible data damage: 1 pg inconsistent
> pg 15.f4f is active+clean+inconsistent, acting
> [238,106,402,266,374,498,590,627,684,73,66]
> PG_NOT_DEEP_SCRUBBED 2 pgs not deep-scrubbed in time
> pg 1.5c not deep-scrubbed since 2021-04-05 23:20:13.714446
> pg 1.55 not deep-scrubbed since 2021-04-11 07:12:37.185074
> PG_NOT_SCRUBBED 2 pgs not scrubbed in time
> pg 1.5c not scrubbed since 2023-07-10 21:15:50.352848
> pg 1.55 not scrubbed since 2023-06-24 10:02:10.038311
>
> ==
>
>
> We have implemented below command to resolve it
>
> 1. We have ran pg repair command "ceph pg repair 15.f4f
> 2. We have restarted associated  OSDs that is mapped to pg 15.f4f
> 3. We tuned osd_max_scrubs value and set it to 9.
> 4. We have done scrub and deep scrub by ceph pg scrub 15.4f4 & ceph pg
> deep-scrub 15.f4f
> 5. We also tried to ceph-objectstore-tool command to fix it
> ==
>
> We have checked the logs of the primary OSD of the respective inconsistent
> PG and found the below errors.
> [ERR] : 15.f4fs0 shard 402(2)
> 15:f2f3fff4:::94a51ddb-a94f-47bc-9068-509e8c09af9a.7862003.20_c%2f4%2fd61%2f885%2f49627697%2f192_1.ts:head
> : missing
> /var/log/ceph/ceph-osd.238.log:339:2023-10-06 00:37:06.410 7f65024cb700 -1
> log_channel(cluster) log [ERR] : 15.f4fs0 shard 266(3)
> 15:f2f2:::94a51ddb-a94f-47bc-9068-509e8c09af9a.11432468.3_TN8QHE_04.20.2020_08.41%2fCV_MAGNETIC%2fV_274396%2fCHUNK_2440801%2fSFILE_CONTAINER_031.FOLDER%2f3:head
> : missing
> /var/log/ceph/ceph-osd.238.log:340:2023-10-06 00:37:06.410 7f65024cb700 -1
> log_channel(cluster) log [ERR] : 15.f4fs0 shard 402(2)
> 15:f2f2:::94a51ddb-a94f-47bc-9068-509e8c09af9a.11432468.3_TN8QHE_04.20.2020_08.41%2fCV_MAGNETIC%2fV_274396%2fCHUNK_2440801%2fSFILE_CONTAINER_031.FOLDER%2f3:head
> : missing
> /var/log/ceph/ceph-osd.238.log:341:2023-10-06 00:37:06.410 7f65024cb700 -1
> log_channel(cluster) log [ERR] : 15.f4fs0 shard 590(6)
> 15:f2f2:::94a51ddb-a94f-47bc-9068-509e8c09af9a.11432468.3_TN8QHE_04.20.2020_08.41%2fCV_MAGNETIC%2fV_274396%2fCHUNK_2440801%2fSFILE_CONTAINER_031.FOLDER%2f3:head
> : missing
> ===
> and also we noticed that the no. of scrub errors in ceph health status are
> matching with the ERR log entries in the primary OSD logs of the
> inconsistent PG as below
> grep -Hn 'ERR' /var/log/ceph/ceph-osd.238.log|wc -l
> 159226
> 
> Ceph is cleaning the scrub errors but rate of scrub repair is very slow
> (avg of 200 scrub errors per day) ,we want to increase the rate of scrub
> error repair to finish the cleanup of pending 159224 scrub errors.
>
> #ceph pg 15.f4f query
>
>
> {
> "state": "active+clean+inconsistent",
> "snap_trimq": "[]",
> "snap_trimq_len": 0,
> "epoch": 409009,
> "up": [
> 238,
> 106,
> 402,
> 266,
> 374,
> 498,
> 590,
> 627,
> 684,
> 73,
> 66
> ],
> "acting": [
> 238,
> 106,
> 402,
> 266,
> 374,
> 498,
> 590,
> 627,

[ceph-users] Re: CephFS: convert directory into subvolume

2023-10-10 Thread jie . zhang7
Hello,

I'm following this tread and the original.  I'm trying to convert directories 
into subvolumes.  Where I'm stuck is how you move a directory into the 
subvolume root directory.

I have a volume 'tank' and it's mounted on the host as '/mnt/tank'  I have 
subfolders '/mnt/tank/database', '/mnt/tank/gitlab', etc...

I create a subvolume and getpath gives me:
/volumes/_nogroup/database/4a74

Questions:
1) How do I move /mnt/tank/database into /volumes/_nogroup/database/4a...74
2) Each of the directories have different pools associated with them, do I need 
to create the sub volume in the same pool?
3) Or can I just move '/mnt/tank/gitlab' --> /volumes/_nogroup/gitlab without 
first creating the volume?  This would skip question 2..

Thx!

Jie
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph 18: Unable to delete image after incomplete migration "image being migrated"

2023-10-10 Thread rhys . goodwin
Thanks Eugen.

root@hcn03:~# rbd status infra-pool/sophosbuild
2023-10-10T09:44:21.234+ 7f1675c524c0 -1 librbd::Migration: open_images: 
failed to open destination image images/65d188c5f5a34: (2) No such file or 
directory
rbd: getting migration status failed: (2) No such file or directory
Watchers: none

I've checked over the other pools again, but they are all Openstack images. 
There are only 42 images in total across all pools. In fact, the infra-pool 
only has 3 images including the faulty one. So, migrating/re-creating is not a 
big deal. It's more just that I'd like to learn more about how to resolve such 
issues, if possible. 

Good call on the history. I found this smoking gun with: 'history |grep "rbd 
migration":
rbd migration prepare infra-pool/sophosbuild images/sophosbuild
rbd migration execute images/sophosbuild

But images/sophosbuild is not there anymore, and not in the trash. It looks 
like I was missing the commit.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Unable to fix 1 Inconsistent PG

2023-10-10 Thread samdto987
Hello All,
Greetings. We've a Ceph Cluster with the version
*ceph version 14.2.16-402-g7d47dbaf4d
(7d47dbaf4d0960a2e910628360ae36def84ed913) nautilus (stable)


===

Issues: 1 pg in inconsistent state and does not recover.

# ceph -s
  cluster:
id: 30d6f7ee-fa02-4ab3-8a09-9321c8002794
health: HEALTH_ERR
2 large omap objects
1 pools have many more objects per pg than average
159224 scrub errors
Possible data damage: 1 pg inconsistent
2 pgs not deep-scrubbed in time
2 pgs not scrubbed in time

# ceph health detail

HEALTH_ERR 2 large omap objects; 1 pools have many more objects per pg than 
average; 159224 scrub errors; Possible data damage: 1 pg inconsistent; 2 pgs 
not deep-scrubbed in time; 2 pgs not scrubbed in time
LARGE_OMAP_OBJECTS 2 large omap objects
2 large objects found in pool 'default.rgw.log'
Search the cluster log for 'Large omap object found' for more details.
MANY_OBJECTS_PER_PG 1 pools have many more objects per pg than average
pool iscsi-images objects per pg (541376) is more than 14.9829 times 
cluster average (36133)
OSD_SCRUB_ERRORS 159224 scrub errors
PG_DAMAGED Possible data damage: 1 pg inconsistent
pg 15.f4f is active+clean+inconsistent, acting 
[238,106,402,266,374,498,590,627,684,73,66]
PG_NOT_DEEP_SCRUBBED 2 pgs not deep-scrubbed in time
pg 1.5c not deep-scrubbed since 2021-04-05 23:20:13.714446
pg 1.55 not deep-scrubbed since 2021-04-11 07:12:37.185074
PG_NOT_SCRUBBED 2 pgs not scrubbed in time
pg 1.5c not scrubbed since 2023-07-10 21:15:50.352848
pg 1.55 not scrubbed since 2023-06-24 10:02:10.038311
 
==


We have tried the commands below to resolve it:

1. We ran the pg repair command "ceph pg repair 15.f4f"
2. We restarted the OSDs that are mapped to pg 15.f4f
3. We tuned the osd_max_scrubs value and set it to 9.
4. We ran a scrub and deep scrub via ceph pg scrub 15.4f4 & ceph pg 
deep-scrub 15.f4f
5. We also tried the ceph-objectstore-tool command to fix it 
==

We have checked the logs of the primary OSD of the respective inconsistent PG 
and found the below errors.
[ERR] : 15.f4fs0 shard 402(2) 
15:f2f3fff4:::94a51ddb-a94f-47bc-9068-509e8c09af9a.7862003.20_c%2f4%2fd61%2f885%2f49627697%2f192_1.ts:head
 : missing
/var/log/ceph/ceph-osd.238.log:339:2023-10-06 00:37:06.410 7f65024cb700 -1 
log_channel(cluster) log [ERR] : 15.f4fs0 shard 266(3) 
15:f2f2:::94a51ddb-a94f-47bc-9068-509e8c09af9a.11432468.3_TN8QHE_04.20.2020_08.41%2fCV_MAGNETIC%2fV_274396%2fCHUNK_2440801%2fSFILE_CONTAINER_031.FOLDER%2f3:head
 : missing
/var/log/ceph/ceph-osd.238.log:340:2023-10-06 00:37:06.410 7f65024cb700 -1 
log_channel(cluster) log [ERR] : 15.f4fs0 shard 402(2) 
15:f2f2:::94a51ddb-a94f-47bc-9068-509e8c09af9a.11432468.3_TN8QHE_04.20.2020_08.41%2fCV_MAGNETIC%2fV_274396%2fCHUNK_2440801%2fSFILE_CONTAINER_031.FOLDER%2f3:head
 : missing
/var/log/ceph/ceph-osd.238.log:341:2023-10-06 00:37:06.410 7f65024cb700 -1 
log_channel(cluster) log [ERR] : 15.f4fs0 shard 590(6) 
15:f2f2:::94a51ddb-a94f-47bc-9068-509e8c09af9a.11432468.3_TN8QHE_04.20.2020_08.41%2fCV_MAGNETIC%2fV_274396%2fCHUNK_2440801%2fSFILE_CONTAINER_031.FOLDER%2f3:head
 : missing
===
We also noticed that the number of scrub errors in the ceph health status 
matches the ERR log entries in the primary OSD's log for the inconsistent 
PG, as below:
grep -Hn 'ERR' /var/log/ceph/ceph-osd.238.log|wc -l
159226

Ceph is cleaning the scrub errors, but the rate of scrub repair is very slow 
(an average of 200 scrub errors per day). We want to increase the repair rate 
to finish the cleanup of the pending 159224 scrub errors.
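
(For reference, a hedged sketch of the scrub-related settings that typically
govern how fast repairs chew through errors; values are illustrative only, and
more aggressive scrubbing costs client latency:

  ceph config set osd osd_scrub_sleep 0                 # no pause between scrub chunks
  ceph config set osd osd_scrub_during_recovery true    # don't stall scrubs while PGs recover/backfill
  ceph config set osd osd_scrub_load_threshold 5        # don't skip scrubs on moderately loaded hosts
)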

#ceph pg 15.f4f query


{
"state": "active+clean+inconsistent",
"snap_trimq": "[]",
"snap_trimq_len": 0,
"epoch": 409009,
"up": [
238,
106,
402,
266,
374,
498,
590,
627,
684,
73,
66
],
"acting": [
238,
106,
402,
266,
374,
498,
590,
627,
684,
73,
66
],
"acting_recovery_backfill": [
"66(10)",
"73(9)",
"106(1)",
"238(0)",
"266(3)",
"374(4)",
"402(2)",
"498(5)",
"590(6)",
"627(7)",
"684(8)"
],
"info": {
"pgid": "15.f4fs0",
"last_update": "409009'7998",
"last_complete": "409009'7998",
"log_tail": "382701'4900",
"last_user_version": 592883,
"last_backfill": "MAX",
"last_backfill_bitwise": 0,
"purged_snaps": [],
"history": {
"epoch_created": 19813,
"epoch_pool_created": 16141,
"last_epoch_started": 407097,
"last_inter

[ceph-users] cephadm, cannot use ECDSA key with quincy

2023-10-10 Thread paul . jurco
Hi ceph users,
We have a few clusters with quincy 17.2.6 and we are preparing to migrate from 
ceph-deploy to cephadm for better management.
We are using Ubuntu20 with latest updates (latest openssh).
While testing the migration to cephadm on a test cluster with octopus (v16 
latest) we had no issues replacing ceph generated cert/key with our own CA 
signed certs (ECDSA).
After upgrading to quincy the test cluster and test again the migration we 
cannot add hosts due to the errors below, ssh access errors specified a while 
ago in a tracker.
We use the following type of certs:
Type: ecdsa-sha2-nistp384-cert-...@openssh.com user certificate
The certificate works everytime when using ssh client from shell to connect to 
all hosts in the cluster.
We do a ceph mgr fail every time we replace cert/key so they are restarted.
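
For anyone following along, the cephadm side of swapping the key material looks
roughly like this (paths are placeholders, and exactly which file carries the
CA-signed certificate depends on how you wire it):

  ceph cephadm set-priv-key -i /root/cephadm_ecdsa_key
  ceph cephadm set-pub-key -i /root/cephadm_ecdsa_key.pub    # or the signed user cert, depending on setup
  ceph cephadm set-ssh-config -i /root/cephadm_ssh_config    # optional custom ssh client config
  ceph mgr fail                                              # restart the active mgr so the new key is loaded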

- cephadm logs from mgr --
Oct 06 09:23:27 ceph-m2 bash[1363]: Log: Opening SSH connection to 
10.10.10.232, port 22
Oct 06 09:23:27 ceph-m2 bash[1363]: [conn=3] Connected to SSH server at 
10.10.10.232, port 22
Oct 06 09:23:27 ceph-m2 bash[1363]: [conn=3]   Local address: 10.10.12.160, 
port 51870
Oct 06 09:23:27 ceph-m2 bash[1363]: [conn=3]   Peer address: 10.10.10.232, port 
22
Oct 06 09:23:27 ceph-m2 bash[1363]: [conn=3] Beginning auth for user root
Oct 06 09:23:27 ceph-m2 bash[1363]: [conn=3] Auth failed for user root
Oct 06 09:23:27 ceph-m2 bash[1363]: [conn=3] Connection failure: Permission 
denied
Oct 06 09:23:27 ceph-m2 bash[1363]: [conn=3] Aborting connection
Oct 06 09:23:27 ceph-m2 bash[1363]: Traceback (most recent call last):
Oct 06 09:23:27 ceph-m2 bash[1363]:   File 
"/usr/share/ceph/mgr/cephadm/ssh.py", line 111, in redirect_log
Oct 06 09:23:27 ceph-m2 bash[1363]: yield
Oct 06 09:23:27 ceph-m2 bash[1363]:   File 
"/usr/share/ceph/mgr/cephadm/ssh.py", line 90, in _remote_connection
Oct 06 09:23:27 ceph-m2 bash[1363]: preferred_auth=['publickey'], 
options=ssh_options)
Oct 06 09:23:27 ceph-m2 bash[1363]:   File 
"/lib/python3.6/site-packages/asyncssh/connection.py", line 6804, in connect
Oct 06 09:23:27 ceph-m2 bash[1363]: 'Opening SSH connection to')
Oct 06 09:23:27 ceph-m2 bash[1363]:   File 
"/lib/python3.6/site-packages/asyncssh/connection.py", line 303, in _connect
Oct 06 09:23:27 ceph-m2 bash[1363]: await conn.wait_established()
Oct 06 09:23:27 ceph-m2 bash[1363]:   File 
"/lib/python3.6/site-packages/asyncssh/connection.py", line 2243, in 
wait_established
Oct 06 09:23:27 ceph-m2 bash[1363]: await self._waiter
Oct 06 09:23:27 ceph-m2 bash[1363]: asyncssh.misc.PermissionDenied: Permission 
denied
Oct 06 09:23:27 ceph-m2 bash[1363]: During handling of the above exception, 
another exception occurred:
Oct 06 09:23:27 ceph-m2 bash[1363]: Traceback (most recent call last):
Oct 06 09:23:27 ceph-m2 bash[1363]:   File 
"/usr/share/ceph/mgr/orchestrator/_interface.py", line 125, in wrapper
Oct 06 09:23:27 ceph-m2 bash[1363]: return OrchResult(f(*args, **kwargs))
Oct 06 09:23:27 ceph-m2 bash[1363]:   File 
"/usr/share/ceph/mgr/cephadm/module.py", line 2810, in apply
Oct 06 09:23:27 ceph-m2 bash[1363]: results.append(self._apply(spec))
Oct 06 09:23:27 ceph-m2 bash[1363]:   File 
"/usr/share/ceph/mgr/cephadm/module.py", line 2558, in _apply
Oct 06 09:23:27 ceph-m2 bash[1363]: return self._add_host(cast(HostSpec, 
spec))
Oct 06 09:23:27 ceph-m2 bash[1363]:   File 
"/usr/share/ceph/mgr/cephadm/module.py", line 1434, in _add_host
Oct 06 09:23:27 ceph-m2 bash[1363]: ip_addr = 
self._check_valid_addr(spec.hostname, spec.addr)
Oct 06 09:23:27 ceph-m2 bash[1363]:   File 
"/usr/share/ceph/mgr/cephadm/module.py", line 1415, in _check_valid_addr
Oct 06 09:23:27 ceph-m2 bash[1363]: error_ok=True, no_fsid=True))
Oct 06 09:23:27 ceph-m2 bash[1363]:   File 
"/usr/share/ceph/mgr/cephadm/module.py", line 615, in wait_async
Oct 06 09:23:27 ceph-m2 bash[1363]: return self.event_loop.get_result(coro)
Oct 06 09:23:27 ceph-m2 bash[1363]:   File 
"/usr/share/ceph/mgr/cephadm/ssh.py", line 56, in get_result
Oct 06 09:23:27 ceph-m2 bash[1363]: return 
asyncio.run_coroutine_threadsafe(coro, self._loop).result()
Oct 06 09:23:27 ceph-m2 bash[1363]:   File 
"/lib64/python3.6/concurrent/futures/_base.py", line 432, in result
Oct 06 09:23:27 ceph-m2 bash[1363]: return self.__get_result()
Oct 06 09:23:27 ceph-m2 bash[1363]:   File 
"/lib64/python3.6/concurrent/futures/_base.py", line 384, in __get_result
Oct 06 09:23:27 ceph-m2 bash[1363]: raise self._exception
Oct 06 09:23:27 ceph-m2 bash[1363]:   File 
"/usr/share/ceph/mgr/cephadm/serve.py", line 1361, in _run_cephadm
Oct 06 09:23:27 ceph-m2 bash[1363]: await 
self.mgr.ssh._remote_connection(host, addr)
Oct 06 09:23:27 ceph-m2 bash[1363]:   File 
"/usr/share/ceph/mgr/cephadm/ssh.py", line 96, in _remote_connection
Oct 06 09:23:27 ceph-m2 bash[1363]: raise
Oct 06 09:23:27 ceph-m2 bash[1363]:   File "/lib64/python3.6/contextlib.py", 
line 99, in __exit__
Oct 06 09:23:27 ceph-m2 bash[1363]:   

[ceph-users] Re: Remove empty orphaned PGs not mapped to a pool

2023-10-10 Thread Accounting Clyso GmbH

@Eugen

We saw the same problems 8 years ago. I can only recommend never 
using cache tiering in production.
This was part of my talk at Cephalocon, and as far as I remember, cache 
tiering will also disappear from Ceph soon.


Cache tiering has been deprecated in the Reef release as it has lacked a 
maintainer for a very long time. This does not mean it will be certainly 
removed, but we may choose to remove it without much further notice.


https://docs.ceph.com/en/latest/rados/operations/cache-tiering/

Regards, Joachim


Am 05.10.23 um 10:02 schrieb Eugen Block:
Which ceph version is this? I'm trying to understand how removing a 
pool leaves the PGs of that pool... Do you have any logs or something 
from when you removed the pool?
We'll have to deal with a cache tier in the foreseeable future, so this 
is quite relevant for us as well. Maybe I'll try to reproduce 
it in a test cluster first.
Are those SSDs exclusively for the cache tier or are they used by 
other pools as well? If they were used only for the cache tier you 
should be able to just remove them without any risk. But as I said, 
I'd rather try to understand before purging them.
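
One sanity check that might help narrow this down (a sketch; the leading '3' 
in the PG IDs is the numeric ID of the pool they belonged to):

  ceph osd pool ls detail | grep '^pool 3 '    # should print nothing if pool 3 is really gone
  ceph pg dump pgs_brief | grep '^3\.'         # the leftover PGs and the OSDs they still map to
  ceph pg ls-by-osd 23 | grep '^3\.'           # same view per remaining cache-tier OSD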



Zitat von Malte Stroem :


Hello Eugen,

yes, we followed the documentation and everything worked fine. The 
cache is gone.


Removing the pool worked well. Everything is clean.

The PGs are empty active+clean.

Possible solutions:

1.

ceph pg {pg-id} mark_unfound_lost delete

I do not think this is the right way since it is for PGs with status 
unfound. But it could work also.


2.

Set the following for the three disk:

ceph osd lost {osd-id}

I am not sure how the cluster will react to this.

3.

ceph-objectstore-tool --data-path /path/to/osd --op remove --pgid 3.0 
--force


Now, will the cluster accept the removed PG status?

4.

The three disks are still present in the crush map, class ssd, 
each single OSD under one host entry.


What if I remove them from crush?

Do you have a better idea, Eugen?

Best,
Malte

Am 04.10.23 um 09:21 schrieb Eugen Block:

Hi,

just for clarity, you're actually talking about the cache tier as 
described in the docs [1]? And you followed the steps until 'ceph 
osd tier remove cold-storage hot-storage' successfully? And the pool 
has been really deleted successfully ('ceph osd pool ls detail')?


[1] 
https://docs.ceph.com/en/latest/rados/operations/cache-tiering/#removing-a-cache-tier


Zitat von Malte Stroem :


Hello,

we removed an SSD cache tier and its pool.

The PGs for the pool do still exist.

The cluster is healthy.

The PGs are empty and they reside on the cache tier pool's SSDs.

We like to take out the disks but it is not possible. The cluster 
sees the PGs and answers with a HEALTH_WARN.


Because of the replication of three there are still 128 PGs on 
three of the 24 OSDs. We were able to remove the other OSDs.


Summary:

- pool removed
- 3 x 128 empty PGs still exist
- 3 of 24 OSDs still exist

How is it possible to remove these empty and healthy PGs?

The only way I found was something like:

ceph pg {pg-id} mark_unfound_lost delete

Is that the right way?

Some output of:

ceph pg ls-by-osd 23

PG   OBJECTS  DEGRADED  MISPLACED  UNFOUND  BYTES  OMAP_BYTES*  OMAP_KEYS*  LOG  STATE         SINCE  VERSION  REPORTED         UP            ACTING        SCRUB_STAMP                      DEEP_SCRUB_STAMP
3.0  0        0         0          0        0      0            0           0    active+clean  27h    0'0      2627265:196316   [15,6,23]p15  [15,6,23]p15  2023-09-28T12:41:52.982955+0200  2023-09-27T06:48:23.265838+0200
3.1  0        0         0          0        0      0            0           0    active+clean  9h     0'0      2627266:19330    [6,23,15]p6   [6,23,15]p6   2023-09-29T06:30:57.630016+0200  2023-09-27T22:58:21.992451+0200
3.2  0        0         0          0        0      0            0           0    active+clean  2h     0'0      2627265:1135185  [23,15,6]p23  [23,15,6]p23  2023-09-29T13:42:07.346658+0200  2023-09-24T14:31:52.844427+0200
3.3  0        0         0          0        0      0            0           0    active+clean  13h    0'0      2627266:193170   [6,15,23]p6   [6,15,23]p6   2023-09-29T01:56:54.517337+0200  2023-09-27T17:47:24.961279+0200
3.4  0        0         0          0        0      0            0           0    active+clean  14h    0'0      2627265:2343551  [23,6,15]p23  [23,6,15]p23  2023-09-29T00:47:47.548860+0200  2023-09-25T09:39:51.259304+0200
3.5  0        0         0          0        0      0            0           0    active+clean  2h     0'0      2627265:194111   [15,6,23]p15  [15,6,23]p15  2023-09-29T13:28:48.879959+0200  2023-09-26T15:35:44.217302+0200
3.6  0        0         0          0        0      0            0           0    active+clean  6h     0'0      2627265:2345717  [23,15,6]p23  [23,15,6]p23  2023-09-29T09:26:02.534825+0200  2023-09-27T21:56:57.500126+0200


Best regards,
Malte
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




[ceph-users] Re: snap_schedule works after 1 hour of scheduling

2023-10-10 Thread Kushagr Gupta
Hi Milind,

Thank you for your response.
Please find the logs attached, as instructed.

Thanks and Regards,
Kushagra Gupta


On Thu, Oct 5, 2023 at 12:09 PM Milind Changire  wrote:

> this is really odd
>
> Please run following commands and send over their outputs:
> # ceph status
> # ceph fs status
> # ceph report
> # ls -ld //volumes/subvolgrp/test
> # ls -l //volumes/subvolgrp/test/.snap
>
> On Thu, Oct 5, 2023 at 11:17 AM Kushagr Gupta
>  wrote:
> >
> > Hi Milind,Team
> >
> > Thank you for your response @Milind Changire
> >
> > >>The only thing I can think of is a stale mgr that wasn't restarted
> > >>after an upgrade.
> > >>Was an upgrade performed lately ?
> >
> > Yes an upgrade was performed after which we faced this. But we were
> facing this issue previously as well.
> > Another interesting thing which we observed was that even after the
> upgrade, the schedules that we created  before upgrade were still running.
> >
> > But to eliminate this, I installed a fresh cluster after purging the old
> one.
> > Commands used for as follows:
> > ```
> > ansible-playbook -i hosts infrastructure-playbooks/purge-cluster.yml
> > ansible-playbook -i hosts site.yml
> > ```
> >
> > After this kindly note the commands which we followed:
> > ```
> > [root@storagenode-1 ~]# ceph mgr module  enable snap_schedule
> > [root@storagenode-1 ~]# ceph config set mgr mgr/snap_schedule/log_level
> debug
> > [root@storagenode-1 ~]# sudo ceph fs subvolumegroup create cephfs
> subvolgrp
> > [root@storagenode-1 ~]# ceph fs subvolume create cephfs test subvolgrp
> > [root@storagenode-1 ~]# date
> > Thu Oct  5 04:23:09 UTC 2023
> > [root@storagenode-1 ~]# ceph fs snap-schedule add
> /volumes/subvolgrp/test 1h 2023-10-05T04:30:00
> > Schedule set for path /volumes/subvolgrp/test
> > [root@storagenode-1 ~]#  ceph fs snap-schedule list / --recursive=true
> > /volumes/subvolgrp/test 1h
> > [root@storagenode-1 ~]# ceph fs snap-schedule status
> /volumes/subvolgrp/test
> > {"fs": "cephfs", "subvol": null, "path": "/volumes/subvolgrp/test",
> "rel_path": "/volumes/subvolgrp/test", "schedule": "1h", "retention": {},
> "start": "2023-10-05T04:30:00", "created": "2023-10-05T04:23:39", "first":
> null, "last": null, "last_pruned": null, "created_count": 0,
> "pruned_count": 0, "active": true}
> > [root@storagenode-1 ~]# ceph fs subvolume info cephfs test subvolgrp
> > {
> > "atime": "2023-10-05 04:20:18",
> > "bytes_pcent": "undefined",
> > "bytes_quota": "infinite",
> > "bytes_used": 0,
> > "created_at": "2023-10-05 04:20:18",
> > "ctime": "2023-10-05 04:20:18",
> > "data_pool": "cephfs_data",
> > "features": [
> > "snapshot-clone",
> > "snapshot-autoprotect",
> > "snapshot-retention"
> > ],
> > "gid": 0,
> > "mode": 16877,
> > "mon_addrs": [
> > "[abcd:abcd:abcd::34]:6789",
> > "[abcd:abcd:abcd::35]:6789",
> > "[abcd:abcd:abcd::36]:6789"
> > ],
> > "mtime": "2023-10-05 04:20:18",
> > "path":
> "/volumes/subvolgrp/test/73d82b1a-6fb1-4160-a388-66b898967a85",
> > "pool_namespace": "",
> > "state": "complete",
> > "type": "subvolume",
> > "uid": 0
> > }
> > [root@storagenode-1 ~]#
> > [root@storagenode-1 ~]# ceph fs snap-schedule status
> /volumes/subvolgrp/test
> > {"fs": "cephfs", "subvol": null, "path": "/volumes/subvolgrp/test",
> "rel_path": "/volumes/subvolgrp/test", "schedule": "1h", "retention": {"h":
> 4}, "start": "2023-10-05T04:30:00", "created": "2023-10-05T04:23:39",
> "first": null, "last": null, "last_pruned": null, "created_count": 0,
> "pruned_count": 0, "active": true}
> > [root@storagenode-1 ~]# date
> > Thu Oct  5 05:31:20 UTC 2023
> > [root@storagenode-1 ~]#
> > ```
> >
> > Could you please help us. Are we doing something wrong? Because still
> the schedules are not getting created.
> >
> > Thanks and Regards,
> > Kushagra Gupta
> >
> > On Wed, Oct 4, 2023 at 9:33 PM Milind Changire 
> wrote:
> >>
> >> On Wed, Oct 4, 2023 at 7:19 PM Kushagr Gupta
> >>  wrote:
> >> >
> >> > Hi Milind,
> >> >
> >> > Thank you for your swift response.
> >> >
> >> > >>How many hours did you wait after the "start time" and decide to
> restart mgr ?
> >> > We waited for ~3 days before restarting the mgr-service.
> >>
> >> The only thing I can think of is a stale mgr that wasn't restarted
> >> after an upgrade.
> >> Was an upgrade performed lately ?
> >>
> >> Did the dir exist at the time the snapshot was scheduled to take place.
> >> If it didn't then the schedule gets disabled until explicitly enabled.
> >>
> >> >
> >> > There was one more instance where we waited for 2 hours and then
> re-started and in the third hour the schedule started working.
> >> >
> >> > Could you please guide us if we are doing anything wrong.
> >> > Kindly let us know if any logs are required.
> >> >
> >> > Thanks and Regards,
> >> > Kushagra Gupta
> >> >
> >> > On Wed, Oct 4, 2023 at 5:39 PM Milind Changire 
> wrote:
> >> >>
> >> >> On Wed, Oct 4

[ceph-users] Re: snap_schedule works after 1 hour of scheduling

2023-10-10 Thread Kushagr Gupta
Hi Milind,Team

Thank you for your response @Milind Changire 

>>The only thing I can think of is a stale mgr that wasn't restarted
>>after an upgrade.
>>Was an upgrade performed lately ?

Yes, an upgrade was performed, after which we faced this. But we were facing
this issue previously as well.
Another interesting thing we observed was that even after the upgrade, the
schedules we created before the upgrade were still running.

But to eliminate this, I installed a fresh cluster after purging the old
one.
The commands used are as follows:
```
ansible-playbook -i hosts infrastructure-playbooks/purge-cluster.yml
ansible-playbook -i hosts site.yml
```

After this kindly note the commands which we followed:
```
[root@storagenode-1 ~]# ceph mgr module  enable snap_schedule
[root@storagenode-1 ~]# ceph config set mgr mgr/snap_schedule/log_level
debug
[root@storagenode-1 ~]# sudo ceph fs subvolumegroup create cephfs subvolgrp
[root@storagenode-1 ~]# ceph fs subvolume create cephfs test subvolgrp
[root@storagenode-1 ~]# date
Thu Oct  5 04:23:09 UTC 2023
[root@storagenode-1 ~]# ceph fs snap-schedule add /volumes/subvolgrp/test
1h 2023-10-05T04:30:00
Schedule set for path /volumes/subvolgrp/test
[root@storagenode-1 ~]#  ceph fs snap-schedule list / --recursive=true
/volumes/subvolgrp/test 1h
[root@storagenode-1 ~]# ceph fs snap-schedule status /volumes/subvolgrp/test
{"fs": "cephfs", "subvol": null, "path": "/volumes/subvolgrp/test",
"rel_path": "/volumes/subvolgrp/test", "schedule": "1h", "retention": {},
"start": "2023-10-05T04:30:00", "created": "2023-10-05T04:23:39", "first":
null, "last": null, "last_pruned": null, "created_count": 0,
"pruned_count": 0, "active": true}
[root@storagenode-1 ~]# ceph fs subvolume info cephfs test subvolgrp
{
"atime": "2023-10-05 04:20:18",
"bytes_pcent": "undefined",
"bytes_quota": "infinite",
"bytes_used": 0,
"created_at": "2023-10-05 04:20:18",
"ctime": "2023-10-05 04:20:18",
"data_pool": "cephfs_data",
"features": [
"snapshot-clone",
"snapshot-autoprotect",
"snapshot-retention"
],
"gid": 0,
"mode": 16877,
"mon_addrs": [
"[abcd:abcd:abcd::34]:6789",
"[abcd:abcd:abcd::35]:6789",
"[abcd:abcd:abcd::36]:6789"
],
"mtime": "2023-10-05 04:20:18",
"path": "/volumes/subvolgrp/test/73d82b1a-6fb1-4160-a388-66b898967a85",
"pool_namespace": "",
"state": "complete",
"type": "subvolume",
"uid": 0
}
[root@storagenode-1 ~]#
[root@storagenode-1 ~]# ceph fs snap-schedule status /volumes/subvolgrp/test
{"fs": "cephfs", "subvol": null, "path": "/volumes/subvolgrp/test",
"rel_path": "/volumes/subvolgrp/test", "schedule": "1h", "retention": {"h":
4}, "start": "2023-10-05T04:30:00", "created": "2023-10-05T04:23:39",
"first": null, "last": null, "last_pruned": null, "created_count": 0,
"pruned_count": 0, "active": true}
[root@storagenode-1 ~]# date
Thu Oct  5 05:31:20 UTC 2023
[root@storagenode-1 ~]#
```

Could you please help us? Are we doing something wrong? The schedules are
still not getting created.
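
One more check that may be worth doing, given the note further down in the
quoted reply that a schedule gets deactivated if its directory is missing when
it fires (a sketch; the mount point is a placeholder):

  ceph fs snap-schedule status /volumes/subvolgrp/test      # look at the "active" field
  ceph fs snap-schedule activate /volumes/subvolgrp/test    # re-enable a deactivated schedule
  ls /mnt/cephfs/volumes/subvolgrp/test/.snap               # scheduled snapshots should appear here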

Thanks and Regards,
Kushagra Gupta

On Wed, Oct 4, 2023 at 9:33 PM Milind Changire  wrote:

> On Wed, Oct 4, 2023 at 7:19 PM Kushagr Gupta
>  wrote:
> >
> > Hi Milind,
> >
> > Thank you for your swift response.
> >
> > >>How many hours did you wait after the "start time" and decide to
> restart mgr ?
> > We waited for ~3 days before restarting the mgr-service.
>
> The only thing I can think of is a stale mgr that wasn't restarted
> after an upgrade.
> Was an upgrade performed lately ?
>
> Did the dir exist at the time the snapshot was scheduled to take place.
> If it didn't then the schedule gets disabled until explicitly enabled.
>
> >
> > There was one more instance where we waited for 2 hours and then
> re-started and in the third hour the schedule started working.
> >
> > Could you please guide us if we are doing anything wrong.
> > Kindly let us know if any logs are required.
> >
> > Thanks and Regards,
> > Kushagra Gupta
> >
> > On Wed, Oct 4, 2023 at 5:39 PM Milind Changire 
> wrote:
> >>
> >> On Wed, Oct 4, 2023 at 3:40 PM Kushagr Gupta
> >>  wrote:
> >> >
> >> > Hi Team,Milind
> >> >
> >> > Ceph-version: Quincy, Reef
> >> > OS: Almalinux 8
> >> >
> >> > Issue: snap_schedule works after 1 hour of schedule
> >> >
> >> > Description:
> >> >
> >> > We are currently working in a 3-node ceph cluster.
> >> > We are currently exploring the scheduled snapshot capability of the
> ceph-mgr module.
> >> > To enable/configure scheduled snapshots, we followed the following
> link:
> >> >
> >> >
> >> >
> >> > https://docs.ceph.com/en/quincy/cephfs/snap-schedule/
> >> >
> >> >
> >> >
> >> > We were able to create snap schedules for the subvolumes as suggested.
> >> > But we have observed two very strange behaviours:
> >> > 1. The snap_schedules only work when we restart the ceph-mgr service
> on the mgr node:
> >> > We then restarted the mgr-service on the active mgr node, and

[ceph-users] Re: Ceph 18: Unable to delete image after incomplete migration "image being migrated"

2023-10-10 Thread Rhys Goodwin
Thanks again Eugen. Looking at my command history it does look like I did 
execute the migration but didn't commit it. I wasn't surprised to see it in the 
trash based on the doc you mentioned, I only tried the restore as a desperate 
measure to clean up my mess. It doesn't help that I messed around like this, 
including with hexedit :O. I should have reached out before messing around.

I'll proceed with the migrate/re-create and report back. I'm just crossing my 
fingers that I'll be allowed to delete the pool. It's a lesson to me to take 
more care of my wee cluster.

Cheers,
Rhys

--- Original Message ---
On Wednesday, October 11th, 2023 at 7:54 AM, Eugen Block  wrote:


> Hi,
> 
> I just re-read the docs on rbd migration [1], haven't done that in a
> while, and it states the following:
> 
> > Note that the source image will be moved to the RBD trash to avoid
> > mistaken usage during the migration process
> 
> 
> So it was expected that your source image was in the trash during the
> migration, no need to restore. According to your history you also ran
> the "execute" command, do you remember if ran successfully as well?
> Did you "execute" after the prepare command completed? But you also
> state that the target image isn't there anymore, so it's hard to tell
> what exactly happened here. I'm not sure how to continue from here,
> maybe migrating/re-creating is the only way now.
> 
> [1] https://docs.ceph.com/en/quincy/rbd/rbd-live-migration/
> 
> Zitat von Rhys Goodwin rhys.good...@proton.me:
> 
> > Thanks Eugen.
> > 
> > root@hcn03:~# rbd status infra-pool/sophosbuild
> > 2023-10-10T09:44:21.234+ 7f1675c524c0 -1 librbd::Migration:
> > open_images: failed to open destination image images/65d188c5f5a34:
> > (2) No such file or directory
> > rbd: getting migration status failed: (2) No such file or directory
> > Watchers: none
> > 
> > I've checked over the other pools again, but they only contain
> > Openstack images. There are only 42 images in total across all
> > pools. In fact, the "infra-pool" pool only has 3 images, including
> > the faulty one. So migrating/re-creating is not a big deal. It's
> > more just that I'd like to learn more about how to resolve such
> > issues, if possible.
> > 
> > Good call on the history. I found this smoking gun with: 'history
> > |grep "rbd migration":
> > rbd migration prepare infra-pool/sophosbuild images/sophosbuild
> > rbd migration execute images/sophosbuild
> > 
> > But images/sophosbuild is definitely not there anymore, and not in
> > the trash. It looks like I was missing the commit.
> > 
> > Kind regards,
> > Rhys
> > 
> > --- Original Message ---
> > 
> > Eugen Block Wrote:
> > 
> > Hi, there are a couple of things I would check before migrating all
> > images. What's the current 'rbd status infra-pool/sophosbuild'? You
> > probably don't have an infinite number of pools so I would also
> > check if any of the other pools contains an image with the same
> > name, just in case you wanted to keep its original name and only
> > change the pool. Even if you don't have the terminal output, maybe
> > you find some of the commands in the history?
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> 
> 
> 
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph 18: Unable to delete image after incomplete migration "image being migrated"

2023-10-10 Thread Eugen Block

Hi,

I just re-read the docs on rbd migration [1], haven't done that in a  
while, and it states the following:


Note that the source image will be moved to the RBD trash to avoid  
mistaken usage during the migration process


So it was expected that your source image was in the trash during the  
migration, no need to restore. According to your history you also ran  
the "execute" command, do you remember if ran successfully as well?  
Did you "execute" after the prepare command completed? But you also  
state that the target image isn't there anymore, so it's hard to tell  
what exactly happened here. I'm not sure how to continue from here,  
maybe migrating/re-creating is the only way now.


[1] https://docs.ceph.com/en/quincy/rbd/rbd-live-migration/
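
For reference, the full sequence from the documentation linked above, using  
the image specs from this thread (a sketch of the intended flow, not a record  
of what was actually run), would be:

rbd migration prepare infra-pool/sophosbuild images/sophosbuild
rbd migration execute images/sophosbuild   # copies the block data to the target
rbd migration commit images/sophosbuild    # removes the trashed source image

and, to back out instead of committing:

rbd migration abort images/sophosbuild     # reverts the prepare and restores the source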

Zitat von Rhys Goodwin :


Thanks Eugen.

root@hcn03:~# rbd status infra-pool/sophosbuild
2023-10-10T09:44:21.234+ 7f1675c524c0 -1 librbd::Migration:  
open_images: failed to open destination image images/65d188c5f5a34:  
(2) No such file or directory

rbd: getting migration status failed: (2) No such file or directory
Watchers: none

I've checked over the other pools again, but they only contain  
Openstack images. There are only 42 images in total across all  
pools. In fact, the "infra-pool" pool only has 3 images, including  
the faulty one. So migrating/re-creating is not a big deal. It's  
more just that I'd like to learn more about how to resolve such  
issues, if possible.


Good call on the history. I found this smoking gun with 'history  
| grep "rbd migration"':

rbd migration prepare infra-pool/sophosbuild images/sophosbuild
rbd migration execute images/sophosbuild

But images/sophosbuild is definitely not there anymore, and not in  
the trash. It looks like I was missing the commit.


Kind regards,
Rhys

--- Original Message ---

Eugen Block Wrote:

Hi, there are a couple of things I would check before migrating all  
images. What's the current 'rbd status infra-pool/sophosbuild'? You  
probably don't have an infinite number of pools so I would also  
check if any of the other pools contains an image with the same  
name, just in case you wanted to keep its original name and only  
change the pool. Even if you don't have the terminal output, maybe  
you find some of the commands in the history?

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Announcing go-ceph v0.24.0

2023-10-10 Thread John Mulligan
We are happy to announce another release of the go-ceph API library.
This is a regular release following our every-two-months release
cadence.


https://github.com/ceph/go-ceph/releases/tag/v0.24.0

Changes include fixes to the rgw admin and rbd packages.
More details are available at the link above.

The library includes bindings that aim to play a similar role to the
"pybind" python bindings in the ceph tree but for the Go language. The
library also includes additional APIs that can be used to administer
cephfs, rbd, and rgw subsystems.

There are already a few consumers of this library in the wild,
including the ceph-csi project.


-- 
John Mulligan

phlogistonj...@asynchrono.us
jmulli...@redhat.com


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph 18: Unable to delete image after imcomplete migration "image being migrated"

2023-10-10 Thread Rhys Goodwin
Thanks Eugen.

root@hcn03:~# rbd status infra-pool/sophosbuild
2023-10-10T09:44:21.234+ 7f1675c524c0 -1 librbd::Migration: open_images: 
failed to open destination image images/65d188c5f5a34: (2) No such file or 
directory
rbd: getting migration status failed: (2) No such file or directory
Watchers: none

I've checked over the other pools again, but they only contain Openstack 
images. There are only 42 images in total across all pools. In fact, the 
"infra-pool" pool only has 3 images, including the faulty one. So 
migrating/re-creating is not a big deal. It's more just that I'd like to learn 
more about how to resolve such issues, if possible.

Good call on the history. I found this smoking gun with 'history | grep "rbd 
migration"':
rbd migration prepare infra-pool/sophosbuild images/sophosbuild
rbd migration execute images/sophosbuild

But images/sophosbuild is definitely not there anymore, and not in the trash. 
It looks like I was missing the commit.

Kind regards,
Rhys

--- Original Message ---

Eugen Block Wrote:

Hi, there are a couple of things I would check before migrating all images. 
What's the current 'rbd status infra-pool/sophosbuild'? You probably don't have 
an infinite number of pools so I would also check if any of the other pools 
contains an image with the same name, just in case you wanted to keep its 
original name and only change the pool. Even if you don't have the terminal 
output, maybe you find some of the commands in the history?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: slow recovery with Quincy

2023-10-10 Thread 胡 玮文
Hi Ben,

Please see this thread 
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/PWHG6QJ6N2TJEYD2U4AXJAJ23CRPJG4E/#7ZMBM23GXYFIGY52ZWJDY5NUSYSDSYL6
 for possible workaround.

Sent from my iPad
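
For reference (my assumption about what that thread suggests rather than a
summary of it): on Quincy the mClock scheduler throttles recovery in favour of
client I/O, and a common temporary workaround is to switch the mClock profile:

ceph config set osd osd_mclock_profile high_recovery_ops
ceph config rm osd osd_mclock_profile    # revert to the default profile once recovery is done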

On 2023-10-10 at 22:26, Ben  wrote:

Dear cephers,

with one OSD down (200 GB of 9.1 TB data), rebalancing has been running for 3
hours and is still in progress. Client bandwidth can go as high as 200 MB/s.
With little client request throughput, recovery goes at a couple of MB/s. I
wonder if there is configuration to tune for improvement. It runs quincy
17.2.5, deployed by cephadm. The slowness can do harm during peak usage hours.

Best wishes,

Ben
-
   volumes: 1/1 healthy
   pools:   8 pools, 209 pgs
   objects: 93.04M objects, 4.8 TiB
   usage:   15 TiB used, 467 TiB / 482 TiB avail
   pgs: 1206837/279121971 objects degraded (0.432%)
208 active+clean
1   active+undersized+degraded+remapped+backfilling

 io:
   client:   80 KiB/s rd, 420 KiB/s wr, 12 op/s rd, 29 op/s wr
   recovery: 6.2 MiB/s, 113 objects/s
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Nothing provides libthrift-0.14.0.so()(64bit)

2023-10-10 Thread Casey Bodley
we're tracking this in https://tracker.ceph.com/issues/61882. my
understanding is that we're just waiting for the next quincy point
release builds to resolve this

On Tue, Oct 10, 2023 at 11:07 AM Graham Derryberry
 wrote:
>
> I have just started adding a ceph client on a rocky 9 system to our ceph
> cluster (we're on quincy 17.2.6) and just discovered that epel 9 now
> provides thrift-0.15.0-2.el9 not thrift-0.14.0-7.el9 as of June 21 2023.
> So the Nothing provides libthrift-0.14.0.so()(64bit) error has returned!
> Recommendations?
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Copying big objects (>5GB) doesn't work after upgrade to Quincy on S3

2023-10-10 Thread Casey Bodley
hi Arvydas,

it looks like this change corresponds to
https://tracker.ceph.com/issues/48322 and
https://github.com/ceph/ceph/pull/38234. the intent was to enforce the
same limitation as AWS S3 and force clients to use multipart copy
instead. this limit is controlled by the config option
rgw_max_put_size which defaults to 5G. the same option controls other
operations like Put/PostObject, so i wouldn't recommend raising it as
a workaround for copy

this change really should have been mentioned in the release notes -
apologies for that omission
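
for SDK clients that don't split the copy automatically (the Ruby copy_object
call mentioned below, for example), the copy needs to be issued as an explicit
multipart copy. a rough sketch in Python with boto3 (endpoint, bucket names,
keys and part size below are placeholders, not values from this thread):

import boto3

s3 = boto3.client("s3", endpoint_url="http://rgw.example.com:8080")

SRC_BUCKET, SRC_KEY = "src-bucket", "big-object"
DST_BUCKET, DST_KEY = "dst-bucket", "big-object-copy"
PART_SIZE = 1024 ** 3  # 1 GiB parts, well under the 5 GiB per-part limit

size = s3.head_object(Bucket=SRC_BUCKET, Key=SRC_KEY)["ContentLength"]
upload = s3.create_multipart_upload(Bucket=DST_BUCKET, Key=DST_KEY)

parts = []
for part_number, offset in enumerate(range(0, size, PART_SIZE), start=1):
    last_byte = min(offset + PART_SIZE, size) - 1
    # server-side copy of one byte range from the source object
    resp = s3.upload_part_copy(
        Bucket=DST_BUCKET,
        Key=DST_KEY,
        UploadId=upload["UploadId"],
        PartNumber=part_number,
        CopySource={"Bucket": SRC_BUCKET, "Key": SRC_KEY},
        CopySourceRange=f"bytes={offset}-{last_byte}",
    )
    parts.append({"ETag": resp["CopyPartResult"]["ETag"],
                  "PartNumber": part_number})

s3.complete_multipart_upload(
    Bucket=DST_BUCKET,
    Key=DST_KEY,
    UploadId=upload["UploadId"],
    MultipartUpload={"Parts": parts},
)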

On Tue, Oct 10, 2023 at 10:58 AM Arvydas Opulskis  wrote:
>
> Hi all,
>
> after upgrading our cluster from Nautilus -> Pacific -> Quincy we noticed
> we can't copy bigger objects anymore via S3.
>
> An error we get:
> "Aws::S3::Errors::EntityTooLarge (Aws::S3::Errors::EntityTooLarge)"
>
> After some tests we have following findings:
> * Problems starts for objects bigger than 5 GB (multipart limit)
> * Issue starts after upgrading to Quincy (17.2.6). In latest Pacific
> (16.2.13) it works fine.
> * For Quincy it works ok with AWS S3 CLI "cp" command, but doesn't work
> using AWS Ruby3 SDK client with copy_object command.
> * For Pacific setup both clients work ok
> * From RGW logs seems like AWS S3 CLI client handles multipart copying
> "under the hood", so it is succesful.
>
> It is stated in AWS documentation, that for uploads (and copying) bigger
> than 5GB files we should use multi part API for AWS S3. For some reason it
> worked for years in Ceph and stopped working after Quincy release, even I
> couldn't find something in release notes addressing this change.
>
> So, is this change permanent and should be considered as bug fix?
>
> Both Pacific and Quincy clusters were running on Rocky 8.6 OS, using Beast
> frontend.
>
> Arvydas
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Nothing provides libthrift-0.14.0.so()(64bit)

2023-10-10 Thread Graham Derryberry
I have just started adding a ceph client on a rocky 9 system to our ceph
cluster (we're on quincy 17.2.6) and just discovered that epel 9 now
provides thrift-0.15.0-2.el9 not thrift-0.14.0-7.el9 as of June 21 2023.
So the Nothing provides libthrift-0.14.0.so()(64bit) error has returned!
Recommendations?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Copying big objects (>5GB) doesn't work after upgrade to Quincy on S3

2023-10-10 Thread Arvydas Opulskis
Hi all,

after upgrading our cluster from Nautilus -> Pacific -> Quincy we noticed
we can't copy bigger objects anymore via S3.

An error we get:
"Aws::S3::Errors::EntityTooLarge (Aws::S3::Errors::EntityTooLarge)"

After some tests we have the following findings:
* Problems start for objects bigger than 5 GB (the multipart limit)
* The issue starts after upgrading to Quincy (17.2.6). On the latest Pacific
(16.2.13) it works fine.
* On Quincy it works with the AWS S3 CLI "cp" command, but not with the AWS
Ruby3 SDK client using the copy_object command.
* On the Pacific setup both clients work fine
* From the RGW logs it seems the AWS S3 CLI client handles multipart copying
"under the hood", so it succeeds.

The AWS documentation states that uploads (and copies) of objects bigger than
5 GB should use the S3 multipart API. For some reason this worked for years in
Ceph and stopped working after the Quincy release, and I couldn't find anything
in the release notes addressing the change.

So, is this change permanent, and should it be considered a bug fix?

Both Pacific and Quincy clusters were running on Rocky 8.6 OS, using Beast
frontend.

Arvydas
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] slow recovery with Quincy

2023-10-10 Thread Ben
Dear cephers,

with one OSD down (200 GB of 9.1 TB data), rebalancing has been running for 3
hours and is still in progress. Client bandwidth can go as high as 200 MB/s.
With little client request throughput, recovery goes at a couple of MB/s. I
wonder if there is configuration to tune for improvement. It runs quincy
17.2.5, deployed by cephadm. The slowness can do harm during peak usage hours.

Best wishes,

Ben
-
volumes: 1/1 healthy
pools:   8 pools, 209 pgs
objects: 93.04M objects, 4.8 TiB
usage:   15 TiB used, 467 TiB / 482 TiB avail
pgs: 1206837/279121971 objects degraded (0.432%)
 208 active+clean
 1   active+undersized+degraded+remapped+backfilling

  io:
client:   80 KiB/s rd, 420 KiB/s wr, 12 op/s rd, 29 op/s wr
recovery: 6.2 MiB/s, 113 objects/s
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: outdated mds slow requests

2023-10-10 Thread Ben
Hi,
It got cleared by restarting the Ceph clients that had issues. It works. To do
that, you unmount the problematic CephFS volume and remount it. All Ceph
warnings were gone in a couple of minutes, and trimming works well now. Indeed,
I wouldn't restart the MDS unless I had to.

Many thanks for help,
Ben
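
For anyone hitting the same: on a plain kernel-mounted client the remount is
roughly the following (the clients here were k8s pods, where deleting and
rescheduling the pod achieves the same; mount point, monitor address and cephx
user below are placeholders):

umount /mnt/cephfs   # check 'fuser -vm /mnt/cephfs' first if the target is busy
mount -t ceph mon1.example.com:6789:/ /mnt/cephfs \
      -o name=myuser,secretfile=/etc/ceph/myuser.secret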

Eugen Block  wrote on Tue, 2023-10-10 at 15:44:

> Hi,
>
> > 2, restart problematic mds with trimming behind issue: 3,4,5: mds will
> > start up quickly, won't they? investigating...
>
> this one you should be able to answer better than the rest of us. You
> probably have restarted MDS daemons before, I would assume.
> Just don't restart them all at once but one after the other after
> everything has settled.
>
> Zitat von Ben :
>
> > Hi Eugen,
> >
> > Warnings continue to spam the cluster log. Actually, for the whole picture
> > of the issue please see:
> >
> > https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/VDL56J75FG5LO4ZECIWWGGBW4ULPZUIP/
> >
> > I was thinking about the following options:
> > 1, restart the problematic nodes (24, 32, 34, 36): need to schedule with
> > business partners
> > 2, restart the problematic mds with the trimming-behind issue (3, 4, 5):
> > mds will start up quickly, won't they? investigating...
> >
> > mds 3, 4 and 5 have a logsegment item in their segments list that is stuck
> > in expiring status, which breaks the trimming process. Growing segment
> > lists raise concerns over time.
> > Any other ideas?
> >
> > Thanks,
> > Ben
> >
> >
> > Eugen Block  wrote on Wed, 2023-10-04 at 16:44:
> >
> >> Hi,
> >>
> >> is this still an issue? If so, I would try to either evict the client
> >> via admin socket:
> >>
> >> ceph tell mds.5 client evict [...] --- Evict client
> >> session(s) based on a filter
> >>
> >> alternatively locally on the MDS:
> >> cephadm enter mds.<daemon-name>
> >> ceph daemon mds.<daemon-name> client evict id=<client-id>
> >>
> >> or restart the MDS which should also clear the client, I believe.
> >>
> >> Zitat von Ben :
> >>
> >> > Hi,
> >> > It is running 17.2.5. There are slow request warnings in the cluster log.
> >> >
> >> > ceph tell mds.5 dump_ops_in_flight,
> >> > get the following.
> >> >
> >> > These look outdated, and the clients were k8s pods. There are warnings
> >> > of this kind on other mds as well. How can they be cleared safely?
> >> >
> >> > Many thanks.
> >> >
> >> > {
> >> > "ops": [
> >> > {
> >> > "description": "peer_request(mds.3:5311742.0 authpin)",
> >> > "initiated_at": "2023-09-14T12:25:43.092201+",
> >> > "age": 926013.05098558997,
> >> > "duration": 926013.051015759,
> >> > "type_data": {
> >> > "flag_point": "dispatched",
> >> > "reqid": "mds.3:5311742",
> >> > "op_type": "peer_request",
> >> > "leader_info": {
> >> > "leader": "3"
> >> > },
> >> > "request_info": {
> >> > "attempt": 0,
> >> > "op_type": "authpin",
> >> > "lock_type": 0,
> >> > "object_info": "0x60001205d6d.head",
> >> > "srcdnpath": "",
> >> > "destdnpath": "",
> >> > "witnesses": "",
> >> > "has_inode_export": false,
> >> > "inode_export_v": 0,
> >> > "op_stamp": "0.00"
> >> > },
> >> > "events": [
> >> > {
> >> > "time": "2023-09-14T12:25:43.092201+",
> >> > "event": "initiated"
> >> > },
> >> > {
> >> > "time": "2023-09-14T12:25:43.092202+",
> >> > "event": "throttled"
> >> > },
> >> > {
> >> > "time": "2023-09-14T12:25:43.092201+",
> >> > "event": "header_read"
> >> > },
> >> > {
> >> > "time": "2023-09-14T12:25:43.092207+",
> >> > "event": "all_read"
> >> > },
> >> > {
> >> > "time": "2023-09-14T12:25:43.092218+",
> >> > "event": "dispatched"
> >> > }
> >> > ]
> >> > }
> >> > },
> >> > {
> >> > "description": "peer_request(mds.3:5311743.0 authpin)",
> >> > "initiated_at": "2023-09-14T12:25:43.092371+",
> >> > "age": 926013.05081614305,
> >> > "duration": 926013.05089185503,
> >> > "type_data": {
> >> > "flag_point": "dispatched",
> >> > "reqid": "mds.3:5311743",
> >> > "op_type": "peer_request",
> >> > "leader_info": {
> >> > "leader": "3"
> >> > },
> >> > "request_info": {
> >> > "attempt": 0,
> >> > "op_type": "authpin",
> >> > "lock_type": 0,
> >> > "object_info": "0x60001205d6d.head",
> >> > "srcdnpath": "",
> >> > "destdnpath": "",
> >> > "witnesses": "",
> >> > "has_inode_export": false,
> >> > "inode_export_v": 0,
> >> > "op_stamp": "0.00"
> >> > },
> >> > "events": [
> >> > {
> >> > "time": "2023-09-14T12:25:43.092371+",
> >> > "event": "initiated"
> >> > },
> >> > {
> >> > "time": "2023-09-14T12:25:43.092371+",
> >> > "event": "throttled"
> >> > },
> >> > {
> >> > "time": "2023-09-14T12:25:43.092371+",
> >> > "event": "header_read"
> >> > },
> >> > {
> >> > "time": "2023-09-14T12:25:43.092374+",
> >> > "event": "all_read"
> >> > },
> >> > {
> >> > "time": "2023-09-14T12:25:43.092381+",
> >> > "event": "dispatched"
> >> > }
> >> > ]
> >> > }
> >> > },
> >> > {
> >> > "description": "peer_request(mds.4:4503615.0 authpin)",
> >> > "initiated_at": "2023-09-14T13:40:25.150040+",
> >> > "age": 921530.99314722,
> >> > "duration": 921530.99326053297,
> >> > "type_data": {
> >> > "flag_point

[ceph-users] Re: cephadm, cannot use ECDSA key with quincy

2023-10-10 Thread Adam King
CA-signed keys working in pacific was sort of accidental. We found out earlier
this year that it was a working use case in pacific but not in quincy, which
resulted in this tracker: https://tracker.ceph.com/issues/62009. That has since
been implemented in main and backported to the reef branch (it wasn't in the
initial reef release, but will be in 18.2.1). It hasn't been backported to
quincy yet, though. I think it was decided that no more PRs go into 17.2.7,
which is about to come out, so the earliest the support could get to quincy is
17.2.8. I don't know of any workaround, unfortunately.

On Tue, Oct 10, 2023 at 7:57 AM Paul JURCO  wrote:

> Hi!
> If it is because the old ssh client was replaced with asyncssh (
> https://github.com/ceph/ceph/pull/51899) and only ported to reef, when will
> it be added to quincy?
> For us this is a blocker, as we can no longer move to cephadm as we had
> planned for Q4.
> Is there a workaround?
>
> Thank you for your efforts!
> Paul
>
>
> On Sat, Oct 7, 2023 at 12:03 PM Paul JURCO  wrote:
>
> > Resent due to moderation when using web interface.
> >
> > Hi ceph users,
> > We have a few clusters with quincy 17.2.6 and we are preparing to migrate
> > from ceph-deploy to cephadm for better management.
> > We are using Ubuntu 20 with the latest updates (latest openssh).
> > While testing the migration to cephadm on a test cluster with octopus (v16
> > latest) we had no issues replacing the ceph-generated cert/key with our own
> > CA-signed certs (ECDSA).
> > After upgrading the test cluster to quincy and testing the migration again,
> > we cannot add hosts due to the errors below: ssh access errors reported a
> > while ago in a tracker.
> > We use the following type of certs:
> > Type: ecdsa-sha2-nistp384-cert-...@openssh.com user certificate
> > The certificate works every time when using the ssh client from a shell to
> > connect to all hosts in the cluster.
> > We do a ceph mgr fail every time we replace cert/key so they are restarted.
> >
> > - cephadm logs from mgr --
> > Oct 06 09:23:27 ceph-m2 bash[1363]: Log: Opening SSH connection to
> > 10.10.10.232, port 22
> > Oct 06 09:23:27 ceph-m2 bash[1363]: [conn=3] Connected to SSH server at
> > 10.10.10.232, port 22
> > Oct 06 09:23:27 ceph-m2 bash[1363]: [conn=3]   Local address:
> > 10.10.12.160, port 51870
> > Oct 06 09:23:27 ceph-m2 bash[1363]: [conn=3]   Peer address:
> 10.10.10.232,
> > port 22
> > Oct 06 09:23:27 ceph-m2 bash[1363]: [conn=3] Beginning auth for user root
> > Oct 06 09:23:27 ceph-m2 bash[1363]: [conn=3] Auth failed for user root
> > Oct 06 09:23:27 ceph-m2 bash[1363]: [conn=3] Connection failure:
> > Permission denied
> > Oct 06 09:23:27 ceph-m2 bash[1363]: [conn=3] Aborting connection
> > Oct 06 09:23:27 ceph-m2 bash[1363]: Traceback (most recent call last):
> > Oct 06 09:23:27 ceph-m2 bash[1363]:   File
> > "/usr/share/ceph/mgr/cephadm/ssh.py", line 111, in redirect_log
> > Oct 06 09:23:27 ceph-m2 bash[1363]: yield
> > Oct 06 09:23:27 ceph-m2 bash[1363]:   File
> > "/usr/share/ceph/mgr/cephadm/ssh.py", line 90, in _remote_connection
> > Oct 06 09:23:27 ceph-m2 bash[1363]: preferred_auth=['publickey'],
> > options=ssh_options)
> > Oct 06 09:23:27 ceph-m2 bash[1363]:   File
> > "/lib/python3.6/site-packages/asyncssh/connection.py", line 6804, in
> connect
> > Oct 06 09:23:27 ceph-m2 bash[1363]: 'Opening SSH connection to')
> > Oct 06 09:23:27 ceph-m2 bash[1363]:   File
> > "/lib/python3.6/site-packages/asyncssh/connection.py", line 303, in
> _connect
> > Oct 06 09:23:27 ceph-m2 bash[1363]: await conn.wait_established()
> > Oct 06 09:23:27 ceph-m2 bash[1363]:   File
> > "/lib/python3.6/site-packages/asyncssh/connection.py", line 2243, in
> > wait_established
> > Oct 06 09:23:27 ceph-m2 bash[1363]: await self._waiter
> > Oct 06 09:23:27 ceph-m2 bash[1363]: asyncssh.misc.PermissionDenied:
> > Permission denied
> > Oct 06 09:23:27 ceph-m2 bash[1363]: During handling of the above
> > exception, another exception occurred:
> > Oct 06 09:23:27 ceph-m2 bash[1363]: Traceback (most recent call last):
> > Oct 06 09:23:27 ceph-m2 bash[1363]:   File
> > "/usr/share/ceph/mgr/orchestrator/_interface.py", line 125, in wrapper
> > Oct 06 09:23:27 ceph-m2 bash[1363]: return OrchResult(f(*args,
> > **kwargs))
> > Oct 06 09:23:27 ceph-m2 bash[1363]:   File
> > "/usr/share/ceph/mgr/cephadm/module.py", line 2810, in apply
> > Oct 06 09:23:27 ceph-m2 bash[1363]: results.append(self._apply(spec))
> > Oct 06 09:23:27 ceph-m2 bash[1363]:   File
> > "/usr/share/ceph/mgr/cephadm/module.py", line 2558, in _apply
> > Oct 06 09:23:27 ceph-m2 bash[1363]: return
> > self._add_host(cast(HostSpec, spec))
> > Oct 06 09:23:27 ceph-m2 bash[1363]:   File
> > "/usr/share/ceph/mgr/cephadm/module.py", line 1434, in _add_host
> > Oct 06 09:23:27 ceph-m2 bash[1363]: ip_addr =
> > self._check_valid_addr(spec.hostname, spec.addr)
> > Oct 06 09:23:27 ceph-m2 bash[1363]:   File
> > "/usr/share/ceph/mgr/cephadm/module.py", line 1415, in _check

[ceph-users] Re: cephadm, cannot use ECDSA key with quincy

2023-10-10 Thread Paul JURCO
Hi!
If it is because the old ssh client was replaced with asyncssh (
https://github.com/ceph/ceph/pull/51899) and only ported to reef, when will it
be added to quincy?
For us this is a blocker, as we can no longer move to cephadm as we had planned
for Q4.
Is there a workaround?

Thank you for your efforts!
Paul
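
For context, "replace cert/key" in the quoted message below refers to rotating
the SSH identity cephadm uses. With a plain key pair (i.e. without the
CA-signed certificate that quincy currently rejects) that is roughly the
following, with file paths as placeholders:

ceph cephadm set-priv-key -i /root/cephadm_ecdsa_key
ceph cephadm set-pub-key -i /root/cephadm_ecdsa_key.pub
ceph mgr fail   # restart the active mgr so it picks up the new identity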


On Sat, Oct 7, 2023 at 12:03 PM Paul JURCO  wrote:

> Resent due to moderation when using web interface.
>
> Hi ceph users,
> We have a few clusters with quincy 17.2.6 and we are preparing to migrate
> from ceph-deploy to cephadm for better management.
> We are using Ubuntu 20 with the latest updates (latest openssh).
> While testing the migration to cephadm on a test cluster with octopus (v16
> latest) we had no issues replacing the ceph-generated cert/key with our own
> CA-signed certs (ECDSA).
> After upgrading the test cluster to quincy and testing the migration again,
> we cannot add hosts due to the errors below: ssh access errors reported a
> while ago in a tracker.
> We use the following type of certs:
> Type: ecdsa-sha2-nistp384-cert-...@openssh.com user certificate
> The certificate works every time when using the ssh client from a shell to
> connect to all hosts in the cluster.
> We do a ceph mgr fail every time we replace cert/key so they are restarted.
>
> - cephadm logs from mgr --
> Oct 06 09:23:27 ceph-m2 bash[1363]: Log: Opening SSH connection to
> 10.10.10.232, port 22
> Oct 06 09:23:27 ceph-m2 bash[1363]: [conn=3] Connected to SSH server at
> 10.10.10.232, port 22
> Oct 06 09:23:27 ceph-m2 bash[1363]: [conn=3]   Local address:
> 10.10.12.160, port 51870
> Oct 06 09:23:27 ceph-m2 bash[1363]: [conn=3]   Peer address: 10.10.10.232,
> port 22
> Oct 06 09:23:27 ceph-m2 bash[1363]: [conn=3] Beginning auth for user root
> Oct 06 09:23:27 ceph-m2 bash[1363]: [conn=3] Auth failed for user root
> Oct 06 09:23:27 ceph-m2 bash[1363]: [conn=3] Connection failure:
> Permission denied
> Oct 06 09:23:27 ceph-m2 bash[1363]: [conn=3] Aborting connection
> Oct 06 09:23:27 ceph-m2 bash[1363]: Traceback (most recent call last):
> Oct 06 09:23:27 ceph-m2 bash[1363]:   File
> "/usr/share/ceph/mgr/cephadm/ssh.py", line 111, in redirect_log
> Oct 06 09:23:27 ceph-m2 bash[1363]: yield
> Oct 06 09:23:27 ceph-m2 bash[1363]:   File
> "/usr/share/ceph/mgr/cephadm/ssh.py", line 90, in _remote_connection
> Oct 06 09:23:27 ceph-m2 bash[1363]: preferred_auth=['publickey'],
> options=ssh_options)
> Oct 06 09:23:27 ceph-m2 bash[1363]:   File
> "/lib/python3.6/site-packages/asyncssh/connection.py", line 6804, in connect
> Oct 06 09:23:27 ceph-m2 bash[1363]: 'Opening SSH connection to')
> Oct 06 09:23:27 ceph-m2 bash[1363]:   File
> "/lib/python3.6/site-packages/asyncssh/connection.py", line 303, in _connect
> Oct 06 09:23:27 ceph-m2 bash[1363]: await conn.wait_established()
> Oct 06 09:23:27 ceph-m2 bash[1363]:   File
> "/lib/python3.6/site-packages/asyncssh/connection.py", line 2243, in
> wait_established
> Oct 06 09:23:27 ceph-m2 bash[1363]: await self._waiter
> Oct 06 09:23:27 ceph-m2 bash[1363]: asyncssh.misc.PermissionDenied:
> Permission denied
> Oct 06 09:23:27 ceph-m2 bash[1363]: During handling of the above
> exception, another exception occurred:
> Oct 06 09:23:27 ceph-m2 bash[1363]: Traceback (most recent call last):
> Oct 06 09:23:27 ceph-m2 bash[1363]:   File
> "/usr/share/ceph/mgr/orchestrator/_interface.py", line 125, in wrapper
> Oct 06 09:23:27 ceph-m2 bash[1363]: return OrchResult(f(*args,
> **kwargs))
> Oct 06 09:23:27 ceph-m2 bash[1363]:   File
> "/usr/share/ceph/mgr/cephadm/module.py", line 2810, in apply
> Oct 06 09:23:27 ceph-m2 bash[1363]: results.append(self._apply(spec))
> Oct 06 09:23:27 ceph-m2 bash[1363]:   File
> "/usr/share/ceph/mgr/cephadm/module.py", line 2558, in _apply
> Oct 06 09:23:27 ceph-m2 bash[1363]: return
> self._add_host(cast(HostSpec, spec))
> Oct 06 09:23:27 ceph-m2 bash[1363]:   File
> "/usr/share/ceph/mgr/cephadm/module.py", line 1434, in _add_host
> Oct 06 09:23:27 ceph-m2 bash[1363]: ip_addr =
> self._check_valid_addr(spec.hostname, spec.addr)
> Oct 06 09:23:27 ceph-m2 bash[1363]:   File
> "/usr/share/ceph/mgr/cephadm/module.py", line 1415, in _check_valid_addr
> Oct 06 09:23:27 ceph-m2 bash[1363]: error_ok=True, no_fsid=True))
> Oct 06 09:23:27 ceph-m2 bash[1363]:   File
> "/usr/share/ceph/mgr/cephadm/module.py", line 615, in wait_async
> Oct 06 09:23:27 ceph-m2 bash[1363]: return
> self.event_loop.get_result(coro)
> Oct 06 09:23:27 ceph-m2 bash[1363]:   File
> "/usr/share/ceph/mgr/cephadm/ssh.py", line 56, in get_result
> Oct 06 09:23:27 ceph-m2 bash[1363]: return
> asyncio.run_coroutine_threadsafe(coro, self._loop).result()
> Oct 06 09:23:27 ceph-m2 bash[1363]:   File
> "/lib64/python3.6/concurrent/futures/_base.py", line 432, in result
> Oct 06 09:23:27 ceph-m2 bash[1363]: return self.__get_result()
> Oct 06 09:23:27 ceph-m2 bash[1363]:   File
> "/lib64/python3.6/concurrent/futures/_base.py", line 384, in __get_result
> Oct 06 09:23:27 ce

[ceph-users] Re: Ceph 18: Unable to delete image after imcomplete migration "image being migrated"

2023-10-10 Thread Eugen Block

Hi,

there are a couple of things I would check before migrating all  
images. What's the current 'rbd status infra-pool/sophosbuild'?
You probably don't have an infinite number of pools so I would also  
check if any of the other pools contains an image with the same name,  
just in case you wanted to keep its original name and only change the  
pool. Even if you don't have the terminal output, maybe you find some  
of the commands in the history?


Zitat von Rhys Goodwin :


Hi Folks,

I'm running Ceph 18 with OpenStack for my lab (and home services) in  
a 3 node cluster on Ubuntu 22.04. I'm quite new to these platforms.  
Just learning. This is my build, for what it's worth:  
https://blog.rhysgoodwin.com/it/openstack-ceph-hyperconverged/


I got myself into some trouble as follows. This is the sequence of events:

I don't recall when, but at some stage I must have tried an image  
migration from one pool to another. The source pool/image is  
infra-pool/sophosbuild; I don't know what the target would have been.  
In any case, on my travels I found the infra-pool/sophosbuild image  
in the trash:

rhys@hcn03:/imagework# rbd trash ls --all infra-pool
65a87bb2472fe sophosbuild

I tried to delete it but got the following:

rhys@hcn03:/imagework# rbd trash rm infra-pool/65a87bb2472fe
2023-10-06T04:23:13.775+ 7f28bbfff640 -1  
librbd::image::RefreshRequest: image being migrated
2023-10-06T04:23:13.775+ 7f28bbfff640 -1  
librbd::image::OpenRequest: failed to refresh image: (30) Read-only  
file system
2023-10-06T04:23:13.775+ 7f28bbfff640 -1 librbd::ImageState:  
0x7f28a804b600 failed to open image: (30) Read-only file system
2023-10-06T04:23:13.775+ 7f28a2ffd640 -1  
librbd::image::RemoveRequest: 0x7f28a8000b90 handle_open_image:  
error opening image: (30) Read-only file system
rbd: remove error: (30) Read-only file systemRemoving image: 0%  
complete...failed.


Next, I tried to restore the image, and this also failed:
rhys@hcn03:/imagework:# rbd trash restore infra-pool/65a87bb2472fe
librbd::api::Trash: restore: Current trash source 'migration' does  
not match expected: user,mirroring,unknown (4)


Probably stupidly, I followed the steps in this post:  
https://www.spinics.net/lists/ceph-users/msg72786.html to change  
offset 07 in the omap value from 02 (TRASH_IMAGE_SOURCE_MIGRATION) to  
00 (TRASH_IMAGE_SOURCE_USER).
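
That kind of omap edit boils down to plain rados commands along these lines
(assuming the trash metadata lives in the pool's rbd_trash object, as the
linked post describes; the image id is the one from above):

rados -p infra-pool listomapvals rbd_trash
rados -p infra-pool getomapval rbd_trash id_65a87bb2472fe trash_entry.bin
# hexedit trash_entry.bin, then write it back with 'rados setomapval'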


After this I was able to restore the image successfully.
However, I still could not delete it:
rhys@hcn03:/imagework:# rbd rm infra-pool/sophosbuild
2023-10-06T05:52:30.708+ 7ff5937fe640 -1  
librbd::image::RefreshRequest: image being migrated
2023-10-06T05:52:30.708+ 7ff5937fe640 -1  
librbd::image::OpenRequest: failed to refresh image: (30) Read-only  
file system
2023-10-06T05:52:30.708+ 7ff5937fe640 -1 librbd::ImageState:  
0x564d3f83d680 failed to open image: (30) Read-only file system
Removing image: 0% complete...failed.rbd: delete error: (30)  
Read-only file system


I tried to abort the migration with: root@hcn03:/imagework# rbd  
migration abort infra-pool/sophosbuild

This took a few minutes but failed at 99% (sorry, terminal scrollback lost)

So now I'm stuck: I don't know how to get rid of this image, and  
while everything else in the cluster is healthy, the dashboard  
throws errors when it tries to enumerate the images in that pool.


I'm considering migrating the good images off this pool and deleting  
the pool. But I don't even know if I'll be allowed to delete the  
pool while this issue is present.


Any advice would be much appreciated.

Kind regards,
Rhys
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: outdated mds slow requests

2023-10-10 Thread Eugen Block

Hi,


2, restart problematic mds with trimming behind issue: 3,4,5: mds will
start up quickly, won't they? investigating...


this one you should be able to answer better than the rest of us. You  
probably have restarted MDS daemons before, I would assume.
Just don't restart them all at once but one after the other after  
everything has settled.


Zitat von Ben :


Hi Eugen,

Warnings continue to spam the cluster log. Actually, for the whole picture of
the issue please see:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/VDL56J75FG5LO4ZECIWWGGBW4ULPZUIP/

I was thinking about the following options:
1, restart the problematic nodes (24, 32, 34, 36): need to schedule with
business partners
2, restart the problematic mds with the trimming-behind issue (3, 4, 5): mds
will start up quickly, won't they? investigating...

mds 3, 4 and 5 have a logsegment item in their segments list that is stuck in
expiring status, which breaks the trimming process. Growing segment lists raise
concerns over time.
Any other ideas?

Thanks,
Ben


Eugen Block  wrote on Wed, 2023-10-04 at 16:44:


Hi,

is this still an issue? If so, I would try to either evict the client
via admin socket:

ceph tell mds.5 client evict [...] --- Evict client
session(s) based on a filter

alternatively locally on the MDS:
cephadm enter mds.<daemon-name>
ceph daemon mds.<daemon-name> client evict id=<client-id>

or restart the MDS which should also clear the client, I believe.
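
For example (just a sketch; list the sessions first to confirm the id, which
here is taken from the client_request visible in the dump quoted further down):

ceph tell mds.5 session ls                  # find the stuck client's session id
ceph tell mds.5 client evict id=460983      # evict that client session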

Zitat von Ben :

> Hi,
> It is running 17.2.5. There are slow request warnings in the cluster log.
>
> ceph tell mds.5 dump_ops_in_flight,
> get the following.
>
> These look outdated, and the clients were k8s pods. There are warnings of
> this kind on other mds as well. How can they be cleared safely?
>
> Many thanks.
>
> {
> "ops": [
> {
> "description": "peer_request(mds.3:5311742.0 authpin)",
> "initiated_at": "2023-09-14T12:25:43.092201+",
> "age": 926013.05098558997,
> "duration": 926013.051015759,
> "type_data": {
> "flag_point": "dispatched",
> "reqid": "mds.3:5311742",
> "op_type": "peer_request",
> "leader_info": {
> "leader": "3"
> },
> "request_info": {
> "attempt": 0,
> "op_type": "authpin",
> "lock_type": 0,
> "object_info": "0x60001205d6d.head",
> "srcdnpath": "",
> "destdnpath": "",
> "witnesses": "",
> "has_inode_export": false,
> "inode_export_v": 0,
> "op_stamp": "0.00"
> },
> "events": [
> {
> "time": "2023-09-14T12:25:43.092201+",
> "event": "initiated"
> },
> {
> "time": "2023-09-14T12:25:43.092202+",
> "event": "throttled"
> },
> {
> "time": "2023-09-14T12:25:43.092201+",
> "event": "header_read"
> },
> {
> "time": "2023-09-14T12:25:43.092207+",
> "event": "all_read"
> },
> {
> "time": "2023-09-14T12:25:43.092218+",
> "event": "dispatched"
> }
> ]
> }
> },
> {
> "description": "peer_request(mds.3:5311743.0 authpin)",
> "initiated_at": "2023-09-14T12:25:43.092371+",
> "age": 926013.05081614305,
> "duration": 926013.05089185503,
> "type_data": {
> "flag_point": "dispatched",
> "reqid": "mds.3:5311743",
> "op_type": "peer_request",
> "leader_info": {
> "leader": "3"
> },
> "request_info": {
> "attempt": 0,
> "op_type": "authpin",
> "lock_type": 0,
> "object_info": "0x60001205d6d.head",
> "srcdnpath": "",
> "destdnpath": "",
> "witnesses": "",
> "has_inode_export": false,
> "inode_export_v": 0,
> "op_stamp": "0.00"
> },
> "events": [
> {
> "time": "2023-09-14T12:25:43.092371+",
> "event": "initiated"
> },
> {
> "time": "2023-09-14T12:25:43.092371+",
> "event": "throttled"
> },
> {
> "time": "2023-09-14T12:25:43.092371+",
> "event": "header_read"
> },
> {
> "time": "2023-09-14T12:25:43.092374+",
> "event": "all_read"
> },
> {
> "time": "2023-09-14T12:25:43.092381+",
> "event": "dispatched"
> }
> ]
> }
> },
> {
> "description": "peer_request(mds.4:4503615.0 authpin)",
> "initiated_at": "2023-09-14T13:40:25.150040+",
> "age": 921530.99314722,
> "duration": 921530.99326053297,
> "type_data": {
> "flag_point": "dispatched",
> "reqid": "mds.4:4503615",
> "op_type": "peer_request",
> "leader_info": {
> "leader": "4"
> },
> "request_info": {
> "attempt": 0,
> "op_type": "authpin",
> "lock_type": 0,
> "object_info": "0x60001205c4f.head",
> "srcdnpath": "",
> "destdnpath": "",
> "witnesses": "",
> "has_inode_export": false,
> "inode_export_v": 0,
> "op_stamp": "0.00"
> },
> "events": [
> {
> "time": "2023-09-14T13:40:25.150040+",
> "event": "initiated"
> },
> {
> "time": "2023-09-14T13:40:25.150040+",
> "event": "throttled"
> },
> {
> "time": "2023-09-14T13:40:25.150040+",
> "event": "header_read"
> },
> {
> "time": "2023-09-14T13:40:25.150045+",
> "event": "all_read"
> },
> {
> "time": "2023-09-14T13:40:25.150053+",
> "event": "dispatched"
> }
> ]
> }
> },
> {
> "description": "client_request(client.460983:5731820 getattr pAsLsXsFs
> #0x60001205c4f 2023-09-14T13:40:25.144336+ caller_uid=0,
> caller_gid=0{})",
> "initiated_at": "2023-09-14T13:40:25.150176+",
> "age": 921530.99301089498,
> "duration": 921530.99316312897,
> "type_data": {
>