[ceph-users] [rbd mirror] integrity of journal-based image mirror
Hi, Say the source image is being updated while data is mirrored to the destination continuously. Suddenly, networking on the source goes down, and the destination is promoted and used to restore the VM. Is that going to cause any FS issues, such that, for example, fsck needs to be invoked to check and repair the FS? Is there any integrity check during journal-based mirroring to avoid a "partial" update caused by the networking issue? Any insight from devs or experiences from users is appreciated. Thanks! Tony ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: the image used size becomes 0 after export/import with snapshot
Hi Ilya, That explains it. Thank you for the clarification! Tony

From: Ilya Dryomov Sent: December 4, 2023 09:40 AM To: Tony Liu Cc: ceph-users@ceph.io; d...@ceph.io Subject: Re: [ceph-users] the image used size becomes 0 after export/import with snapshot

On Tue, Nov 28, 2023 at 8:18 AM Tony Liu wrote:
> Hi,
>
> I have an image with a snapshot and some changes after the snapshot.
> ```
> $ rbd du backup/f0408e1e-06b6-437b-a2b5-70e3751d0a26
> NAME                                                                                PROVISIONED  USED
> f0408e1e-06b6-437b-a2b5-70e3751d0a26@snapshot-eb085877-7557-4620-9c01-c5587b857029       10 GiB  2.4 GiB
> f0408e1e-06b6-437b-a2b5-70e3751d0a26                                                     10 GiB  2.4 GiB
>                                                                                          10 GiB  4.8 GiB
> ```
> If there are no changes after the snapshot, the image line will show 0 used.
>
> I did an export and import.
> ```
> $ rbd export --export-format 2 backup/f0408e1e-06b6-437b-a2b5-70e3751d0a26 - | rbd import --export-format 2 - backup/test
> Exporting image: 100% complete...done.
> Importing image: 100% complete...done.
> ```
>
> When checking the imported image, the image line shows 0 used.
> ```
> $ rbd du backup/test
> NAME                                                PROVISIONED  USED
> test@snapshot-eb085877-7557-4620-9c01-c5587b857029       10 GiB  2.4 GiB
> test                                                     10 GiB      0 B
>                                                          10 GiB  2.4 GiB
> ```
> Any clues how that happened? I'd expect the same du as the source.

Hi Tony,

The "rbd import" command does zero detection at 4k granularity by default. If the "after snapshot" changes just zeroed everything in the snapshot, such a discrepancy in the "rbd du" USED column is expected.

Thanks, Ilya
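Ilya's point about zero detection can be illustrated with a small sketch. This is not rbd's actual code, just a model of the behavior: the importer scans the stream in 4 KiB chunks and skips (leaves sparse) any chunk that is entirely zero, so a region fully overwritten with zeros after the snapshot contributes nothing to the imported image's USED size.

```python
# Sketch (not rbd's implementation): import-style zero detection at
# 4 KiB granularity. All-zero chunks are skipped instead of written,
# so they stay sparse and add 0 bytes to the imported image's usage.

CHUNK = 4096  # 4k detection granularity

def write_extents(data: bytes):
    """Return (offset, length) extents that actually need to be written."""
    extents = []
    for off in range(0, len(data), CHUNK):
        chunk = data[off:off + CHUNK]
        if chunk.count(0) != len(chunk):   # chunk has non-zero bytes
            extents.append((off, len(chunk)))
    return extents

# 8 KiB of zeros followed by 4 KiB of data: only the last chunk is written.
data = bytes(2 * CHUNK) + b"\x01" * CHUNK
print(write_extents(data))  # [(8192, 4096)]
```

With this model, an image whose head was entirely zeroed after the snapshot imports with zero allocated extents, matching the 0 B USED line in "rbd du".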
[ceph-users] the image used size becomes 0 after export/import with snapshot
Hi, I have an image with a snapshot and some changes after the snapshot.
```
$ rbd du backup/f0408e1e-06b6-437b-a2b5-70e3751d0a26
NAME                                                                                PROVISIONED  USED
f0408e1e-06b6-437b-a2b5-70e3751d0a26@snapshot-eb085877-7557-4620-9c01-c5587b857029       10 GiB  2.4 GiB
f0408e1e-06b6-437b-a2b5-70e3751d0a26                                                     10 GiB  2.4 GiB
                                                                                         10 GiB  4.8 GiB
```
If there are no changes after the snapshot, the image line will show 0 used.

I did an export and import.
```
$ rbd export --export-format 2 backup/f0408e1e-06b6-437b-a2b5-70e3751d0a26 - | rbd import --export-format 2 - backup/test
Exporting image: 100% complete...done.
Importing image: 100% complete...done.
```

When checking the imported image, the image line shows 0 used.
```
$ rbd du backup/test
NAME                                                PROVISIONED  USED
test@snapshot-eb085877-7557-4620-9c01-c5587b857029       10 GiB  2.4 GiB
test                                                     10 GiB      0 B
                                                         10 GiB  2.4 GiB
```
Any clues how that happened? I'd expect the same du as the source.

I tried another quick test. It works fine.
```
$ rbd create backup/test-src --size 10G
$ sudo rbd map backup/test-src
/dev/rbd0
$ echo "hello" | sudo tee /dev/rbd0
hello
$ rbd du backup/test-src
NAME      PROVISIONED  USED
test-src       10 GiB  4 MiB
$ rbd snap create backup/test-src@snap-1
Creating snap: 100% complete...done.
$ rbd du backup/test-src
NAME             PROVISIONED  USED
test-src@snap-1       10 GiB  4 MiB
test-src              10 GiB    0 B
                      10 GiB  4 MiB
$ echo "world" | sudo tee /dev/rbd0
world
$ rbd du backup/test-src
NAME             PROVISIONED  USED
test-src@snap-1       10 GiB  4 MiB
test-src              10 GiB  4 MiB
                      10 GiB  8 MiB
$ rbd export --export-format 2 backup/test-src - | rbd import --export-format 2 - backup/test-dst
Exporting image: 100% complete...done.
Importing image: 100% complete...done.
$ rbd du backup/test-dst
NAME             PROVISIONED  USED
test-dst@snap-1       10 GiB  4 MiB
test-dst              10 GiB  4 MiB
                      10 GiB  8 MiB
```
Thanks! Tony
[ceph-users] import/export with --export-format 2
Hi, src-image is 1GB (provisioned size). I did the following 3 tests.
1. rbd export src-image - | rbd import - dst-image
2. rbd export --export-format 2 src-image - | rbd import --export-format 2 - dst-image
3. rbd export --export-format 2 src-image - | rbd import - dst-image
With #1 and #2, the dst-image size (rbd info) is the same as src-image, which is expected. With #3, the dst-image size (rbd info) is close to the used size (rbd du), not the provisioned size of src-image. I'm not sure if this image is actually usable when writing to it. The question is: is #3 not supposed to be used at all? I checked the doc and didn't see anything like "--export-format 2 has to be used for importing an image which was exported with the --export-format 2 option". Any comments? Thanks! Tony
[ceph-users] Re: easy way to find out the number of allocated objects for a RBD image
Thank you Eugen! "rbd du" is it. The used_size from "rbd du" is object count times object size. That's the actual storage taken by the image in the backend. Export actually flattens and also sparsifies the image, so in the case of many small data pieces, the export size is smaller than the du size. Thanks! Tony

From: Eugen Block Sent: November 25, 2023 12:17 AM To: ceph-users@ceph.io Subject: [ceph-users] Re: easy way to find out the number of allocated objects for a RBD image

Maybe I misunderstand, but isn't "rbd du" what you're looking for?

Zitat von Tony Liu :
> Hi,
>
> Other than getting all objects of the pool and filtering by image ID,
> is there any easier way to get the number of allocated objects for
> an RBD image?
>
> What I really want to know is the actual usage of an image.
> An allocated object could be used partially, but that's fine,
> no need to be 100% accurate. Getting the object count and
> multiplying by object size should be sufficient.
>
> "rbd export" exports the actual used data, but getting the actual usage
> by exporting the image seems too much. This brings up another
> question: is there any way to know the export size before running it?
>
> Thanks!
> Tony
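The estimate described above (allocated object count times object size) can be sketched as follows. Data object names share the image's block_name_prefix reported by "rbd info" (e.g. "rbd_data.&lt;image-id&gt;"), so counting pool objects with that prefix approximates the allocation; the pool listing below is made up for illustration.

```python
# Rough used-size estimate for one RBD image: count the pool objects
# whose names start with the image's block_name_prefix, then multiply
# by the object size (4 MiB for the default order 22).

def used_size(object_names, block_name_prefix, object_size=4 << 20):
    """Count objects belonging to one image and multiply by object size."""
    count = sum(1 for name in object_names if name.startswith(block_name_prefix))
    return count * object_size

# Hypothetical pool listing: two objects for our image, one for another.
pool_objects = [
    "rbd_data.e6fce674350e55.0000000000000000",
    "rbd_data.e6fce674350e55.0000000000000001",
    "rbd_data.aaaaaaaaaaaaaa.0000000000000000",
]
print(used_size(pool_objects, "rbd_data.e6fce674350e55."))  # 8388608
```

As noted in the thread, this over-counts partially used objects, but it matches what "rbd du" reports at object granularity.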
[ceph-users] understand "extent"
Hi, The context is RBD on bluestore. I did check "extent" on the wiki. I see "extent" when talking about snapshots and export/import. For example, when creating a snapshot, we mark extents; when there is a write to a marked extent, we make a copy. I also know that user data on a block device maps to objects. How are "extent" and "object" related? Can I say an extent is a set of contiguous objects (with default stripe settings)? Thanks! Tony
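For what it's worth, with default striping an extent is best thought of as a byte range (offset, length) in the image's address space, which then maps onto one or more fixed-size objects rather than being a set of objects itself. A minimal sketch of that mapping, assuming stripe_unit equals the object size and stripe_count is 1:

```python
# Sketch of how a byte extent on an RBD image maps to object indices
# with default striping (stripe_unit == object_size, stripe_count == 1).
# Each 2^order-byte slice of the image's byte address space is backed
# by one RADOS object.

def extent_to_objects(offset, length, order=22):
    """Return the range of object indices covering the extent."""
    object_size = 1 << order            # order 22 -> 4 MiB objects
    first = offset // object_size
    last = (offset + length - 1) // object_size
    return list(range(first, last + 1))

# A 6 MiB extent starting at 3 MiB spans objects 0, 1 and 2 (4 MiB each).
mib = 1 << 20
print(extent_to_objects(3 * mib, 6 * mib))  # [0, 1, 2]
```

So a small extent can live inside a single object, and a large one spans several contiguous objects; non-default striping interleaves the mapping.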
[ceph-users] easy way to find out the number of allocated objects for a RBD image
Hi, Other than getting all objects of the pool and filtering by image ID, is there any easier way to get the number of allocated objects for an RBD image? What I really want to know is the actual usage of an image. An allocated object could be used partially, but that's fine, no need to be 100% accurate. Getting the object count and multiplying by object size should be sufficient. "rbd export" exports the actual used data, but getting the actual usage by exporting the image seems too much. This brings up another question: is there any way to know the export size before running it? Thanks! Tony
[ceph-users] Re: rbd export-diff/import-diff hangs
Figured it out. It's not an rbd issue. Sorry for the false alarm. Thanks! Tony

From: Tony Liu Sent: August 27, 2023 08:19 PM To: Eugen Block; ceph-users@ceph.io Subject: [ceph-users] Re: rbd export-diff/import-diff hangs

It's export-diff from an in-use image; both the from-snapshot and to-snapshot exist. The same from-snapshot exists in the import image; it's the to-snapshot from the last diff. export/import is used for local backup, rbd-mirroring for remote backup. Looking for options to get more info to troubleshoot. Thanks! Tony

From: Eugen Block Sent: August 27, 2023 11:53 AM To: ceph-users@ceph.io Subject: [ceph-users] Re: rbd export-diff/import-diff hangs

You mean the image is in use while you're exporting? Have you thought about creating snapshots and exporting those? Or setting up rbd mirroring?

Zitat von Tony Liu :
> To update, hanging happens when updating the local image, not the remote one, so networking
> is not a concern here. Any advice on how to look into it?
>
> Thanks!
> Tony
> ________
> From: Tony Liu
> Sent: August 26, 2023 10:43 PM
> To: d...@ceph.io; ceph-users@ceph.io
> Subject: [ceph-users] rbd export-diff/import-diff hangs
>
> Hi,
>
> I'm using rbd import and export to copy an image from one cluster to another,
> and import-diff and export-diff to update the image in the remote cluster.
> For example, "rbd --cluster local export-diff ... | rbd --cluster
> remote import-diff ...".
> Sometimes, the whole command is stuck. I can't tell which end of
> the pipe it's stuck on.
> I did some searching; [1] seems to be the same issue and [2] is also related.
>
> Wondering if there is any way to identify where it's stuck and get more
> debugging info.
> Given [2], I'd suspect the import-diff is stuck, because the rbd client is
> importing to the remote cluster. Could networking latency be involved here? Ping
> latency is 7~8 ms.
>
> Any comments are appreciated!
>
> [1] https://bugs.launchpad.net/cinder/+bug/2031897
> [2] https://stackoverflow.com/questions/69858763/ceph-rbd-import-hangs
>
> Thanks!
> Tony
[ceph-users] Re: rbd export-diff/import-diff hangs
It's export-diff from an in-use image; both the from-snapshot and to-snapshot exist. The same from-snapshot exists in the import image; it's the to-snapshot from the last diff. export/import is used for local backup, rbd-mirroring for remote backup. Looking for options to get more info to troubleshoot. Thanks! Tony

From: Eugen Block Sent: August 27, 2023 11:53 AM To: ceph-users@ceph.io Subject: [ceph-users] Re: rbd export-diff/import-diff hangs

You mean the image is in use while you're exporting? Have you thought about creating snapshots and exporting those? Or setting up rbd mirroring?

Zitat von Tony Liu :
> To update, hanging happens when updating the local image, not the remote one, so networking
> is not a concern here. Any advice on how to look into it?
>
> Thanks!
> Tony
> ________
> From: Tony Liu
> Sent: August 26, 2023 10:43 PM
> To: d...@ceph.io; ceph-users@ceph.io
> Subject: [ceph-users] rbd export-diff/import-diff hangs
>
> Hi,
>
> I'm using rbd import and export to copy an image from one cluster to another,
> and import-diff and export-diff to update the image in the remote cluster.
> For example, "rbd --cluster local export-diff ... | rbd --cluster
> remote import-diff ...".
> Sometimes, the whole command is stuck. I can't tell which end of
> the pipe it's stuck on.
> I did some searching; [1] seems to be the same issue and [2] is also related.
>
> Wondering if there is any way to identify where it's stuck and get more
> debugging info.
> Given [2], I'd suspect the import-diff is stuck, because the rbd client is
> importing to the remote cluster. Could networking latency be involved here? Ping
> latency is 7~8 ms.
>
> Any comments are appreciated!
>
> [1] https://bugs.launchpad.net/cinder/+bug/2031897
> [2] https://stackoverflow.com/questions/69858763/ceph-rbd-import-hangs
>
> Thanks!
> Tony
[ceph-users] Re: rbd export with export-format 2 exports all snapshots?
Thank you Alex for the confirmation! Tony

From: Alex Gorbachev Sent: August 27, 2023 05:29 PM To: Tony Liu Cc: d...@ceph.io; ceph-users@ceph.io Subject: Re: [ceph-users] rbd export with export-format 2 exports all snapshots?

Tony, From what I recall having worked with snapshots a while ago, you would want export-diff to achieve a differential export. "export" will always go for a full image. -- Alex Gorbachev https://alextelescope.blogspot.com

On Sun, Aug 27, 2023 at 8:03 PM Tony Liu <tonyliu0...@hotmail.com> wrote:
Hi, Say the source image has snapshots s1, s2 and s3. I expect "export" to behave the same as "deep cp": when specifying a snapshot with "--export-format 2", only the specified snapshot and all snapshots earlier than it will be exported. What I see is that, no matter which snapshot I specify, "export" with "--export-format 2" always exports the whole image with all snapshots. Is this expected? Could anyone help to clarify? Thanks! Tony
[ceph-users] rbd export with export-format 2 exports all snapshots?
Hi, Say the source image has snapshots s1, s2 and s3. I expect "export" to behave the same as "deep cp": when specifying a snapshot with "--export-format 2", only the specified snapshot and all snapshots earlier than it will be exported. What I see is that, no matter which snapshot I specify, "export" with "--export-format 2" always exports the whole image with all snapshots. Is this expected? Could anyone help to clarify? Thanks! Tony
[ceph-users] Re: rbd export-diff/import-diff hangs
To update, hanging happens when updating the local image, not the remote one, so networking is not a concern here. Any advice on how to look into it? Thanks! Tony

From: Tony Liu Sent: August 26, 2023 10:43 PM To: d...@ceph.io; ceph-users@ceph.io Subject: [ceph-users] rbd export-diff/import-diff hangs

Hi, I'm using rbd import and export to copy an image from one cluster to another, and import-diff and export-diff to update the image in the remote cluster. For example, "rbd --cluster local export-diff ... | rbd --cluster remote import-diff ...". Sometimes, the whole command is stuck. I can't tell which end of the pipe it's stuck on. I did some searching; [1] seems to be the same issue and [2] is also related. Wondering if there is any way to identify where it's stuck and get more debugging info. Given [2], I'd suspect the import-diff is stuck, because the rbd client is importing to the remote cluster. Could networking latency be involved here? Ping latency is 7~8 ms. Any comments are appreciated!

[1] https://bugs.launchpad.net/cinder/+bug/2031897
[2] https://stackoverflow.com/questions/69858763/ceph-rbd-import-hangs

Thanks! Tony
[ceph-users] rbd export-diff/import-diff hangs
Hi, I'm using rbd import and export to copy an image from one cluster to another, and import-diff and export-diff to update the image in the remote cluster. For example, "rbd --cluster local export-diff ... | rbd --cluster remote import-diff ...". Sometimes, the whole command is stuck. I can't tell which end of the pipe it's stuck on. I did some searching; [1] seems to be the same issue and [2] is also related. Wondering if there is any way to identify where it's stuck and get more debugging info. Given [2], I'd suspect the import-diff is stuck, because the rbd client is importing to the remote cluster. Could networking latency be involved here? Ping latency is 7~8 ms. Any comments are appreciated!

[1] https://bugs.launchpad.net/cinder/+bug/2031897
[2] https://stackoverflow.com/questions/69858763/ceph-rbd-import-hangs

Thanks! Tony
[ceph-users] Re: snapshot timestamp
Thank you Ilya for the confirmation! Tony

From: Ilya Dryomov Sent: August 4, 2023 04:51 AM To: Tony Liu Cc: d...@ceph.io; ceph-users@ceph.io Subject: Re: [ceph-users] snapshot timestamp

On Fri, Aug 4, 2023 at 7:49 AM Tony Liu wrote:
> Hi,
>
> We know a snapshot is a point in time. Is this point in time tracked internally by
> some sort of sequence number, the timestamp shown by "snap ls", or
> something else?

Hi Tony,

The timestamp in "rbd snap ls" output is the snapshot creation timestamp.

> I noticed that with "deep cp", the timestamps of all snapshots are changed to
> the copy time.

Correct -- exactly the same as the image creation timestamp (visible in "rbd info" output).

> Say I create a snapshot at 1PM and make a copy at 3PM; the timestamp of the snapshot in
> the copy is 3PM. If I roll back the copy to this snapshot, I'd assume it will actually bring me
> back to the state at 1PM. Is that correct?

Correct.

> If the above is true, I won't be able to rely on timestamps to track snapshots.
>
> Say I create a snapshot every hour and make a backup by copy at the end of the day.
> Then the original image is damaged and the backup is used to restore the work. On this
> backup image, how do I know which snapshot was taken at 1PM, which at 2PM, etc.?
> Any advice on tracking snapshots properly in such a case?

I would suggest embedding that info along with any additional metadata needed in the snapshot name.

Thanks, Ilya
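Ilya's suggestion of embedding the info in the snapshot name survives "deep cp" because the names are copied verbatim. A minimal sketch; the naming scheme here is made up, not a Ceph convention:

```python
# Encode the creation time in the snapshot name so it can be recovered
# later, regardless of what the copy's creation timestamps say.

from datetime import datetime, timezone

def snap_name(prefix: str, when: datetime) -> str:
    """Build a snapshot name carrying its creation timestamp."""
    return f"{prefix}-{when.strftime('%Y%m%dT%H%M%SZ')}"

def snap_time(name: str) -> datetime:
    """Recover the timestamp from a snapshot name built by snap_name()."""
    stamp = name.rsplit("-", 1)[1]
    return datetime.strptime(stamp, "%Y%m%dT%H%M%SZ").replace(tzinfo=timezone.utc)

t = datetime(2023, 8, 4, 13, 0, tzinfo=timezone.utc)
name = snap_name("hourly", t)
print(name)                   # hourly-20230804T130000Z
print(snap_time(name) == t)   # True
```

The same trick extends to any other metadata (host, job ID) as long as the name stays within RBD's snapshot-name limits.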
[ceph-users] Re: What's the max of snap ID?
Thank you Eugen and Nathan! uint64 is big enough, no concerns any more. Tony

From: Nathan Fish Sent: August 4, 2023 04:19 AM To: Eugen Block Cc: ceph-users@ceph.io Subject: [ceph-users] Re: What's the max of snap ID?

2^64 bytes ≈ 18446.74 petabytes. Assuming that a snapshot requires storing any data at all, which it must, nobody has a Ceph cluster that could store that much snapshot metadata, even for empty snapshots.

On Fri, Aug 4, 2023 at 7:05 AM Eugen Block wrote:
> I'm no programmer, but if I understand [1] correctly it's an unsigned long long:
>
> int ImageCtx::snap_set(uint64_t in_snap_id) {
>
> which means the max snap_id should be:
>
> 2^64 - 1 = 18446744073709551615
>
> Not sure if you can get your cluster to reach that limit, but I also
> don't know what would happen if you actually reached it. I also
> might be misunderstanding, so maybe someone with more knowledge can
> confirm or correct me.
>
> [1] https://github.com/ceph/ceph/blob/main/src/librbd/ImageCtx.cc#L328
>
> Zitat von Tony Liu :
> > Hi,
> >
> > There is a snap ID for each snapshot. How is this ID allocated? Sequentially?
> > Did some tests; it seems this ID is per pool, starting from 4 and
> > always going up.
> > Is that correct?
> > What's the max of this ID?
> > What's going to happen when the ID reaches the max? Does it go back
> > and start from 4 again?
> >
> > Thanks!
> > Tony
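Since the snap ID is a uint64_t, the numbers work out like this (a quick sanity check, not taken from Ceph's code):

```python
# snap_set() takes uint64_t, so the largest representable snap ID is:
max_snap_id = 2**64 - 1
print(max_snap_id)  # 18446744073709551615

# Even allocating one ID per nanosecond, monotonically, the ID space
# would last for centuries before running out:
ids_per_year = 10**9 * 60 * 60 * 24 * 365
print(max_snap_id // ids_per_year)  # 584
```

Which backs up the "no concerns" conclusion: the ID space cannot realistically be exhausted.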
[ceph-users] snapshot timestamp
Hi, We know a snapshot is a point in time. Is this point in time tracked internally by some sort of sequence number, the timestamp shown by "snap ls", or something else? I noticed that with "deep cp", the timestamps of all snapshots are changed to the copy time. Say I create a snapshot at 1PM and make a copy at 3PM; the timestamp of the snapshot in the copy is 3PM. If I roll back the copy to this snapshot, I'd assume it will actually bring me back to the state at 1PM. Is that correct? If the above is true, I won't be able to rely on timestamps to track snapshots. Say I create a snapshot every hour and make a backup by copy at the end of the day. Then the original image is damaged and the backup is used to restore the work. On this backup image, how do I know which snapshot was taken at 1PM, which at 2PM, etc.? Any advice on tracking snapshots properly in such a case? I can definitely build something else to help with this, but I'd like to know how much Ceph can support it. Thanks! Tony
[ceph-users] What's the max of snap ID?
Hi, There is a snap ID for each snapshot. How is this ID allocated? Sequentially? Did some tests; it seems this ID is per pool, starting from 4 and always going up. Is that correct? What's the max of this ID? What's going to happen when the ID reaches the max? Does it go back and start from 4 again? Thanks! Tony
[ceph-users] Re: [rbd-mirror] can't enable journal-based image mirroring
If the image has a parent, the parent image also needs to be mirrored. After enabling mirroring on the parent image, it works as expected. Thanks! Tony

From: Tony Liu Sent: July 31, 2023 08:13 AM To: ceph-users@ceph.io; d...@ceph.io Subject: [ceph-users] [rbd-mirror] can't enable journal-based image mirroring

Hi, The Ceph cluster is Pacific v16.2.10. "rbd mirror image enable journal" seems not to be working. Any clues what I'm missing? There are no error messages from the CLI. Any way to troubleshoot?
```
# rbd mirror pool info volume-ssd
Mode: image
Site Name: 35d050c0-77c0-11eb-9242-2cea7ff9d07c
Peer Sites:
UUID: 86eacc0f-6657-4742-8daf-2942ea23affd
Name: qa
Mirror UUID:
Direction: rx-tx
Client: client.infra
# rbd feature enable volume-ssd/volume-aceee005-265e-44ea-a591-b6dda639a76b journaling
# rbd mirror image enable volume-ssd/volume-aceee005-265e-44ea-a591-b6dda639a76b journal
# rbd info volume-ssd/volume-aceee005-265e-44ea-a591-b6dda639a76b
rbd image 'volume-aceee005-265e-44ea-a591-b6dda639a76b':
        size 40 GiB in 10240 objects
        order 22 (4 MiB objects)
        snapshot_count: 0
        id: e6fce674350e55
        block_name_prefix: rbd_data.e6fce674350e55
        format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten, journaling
        op_features:
        flags:
        create_timestamp: Sun Jul 30 22:21:09 2023
        access_timestamp: Mon Jul 31 00:36:20 2023
        modify_timestamp: Mon Jul 31 08:01:26 2023
        parent: image/801d1850-de2f-443b-a30c-71966e90c118@snap
        overlap: 10 GiB
        journal: e6fce674350e55
        mirroring state: disabled
# rbd mirror image status volume-ssd/volume-aceee005-265e-44ea-a591-b6dda639a76b
rbd: mirroring not enabled on the image
```
Thanks! Tony
[ceph-users] [rbd-mirror] can't enable journal-based image mirroring
Hi, The Ceph cluster is Pacific v16.2.10. "rbd mirror image enable journal" seems not to be working. Any clues what I'm missing? There are no error messages from the CLI. Any way to troubleshoot?
```
# rbd mirror pool info volume-ssd
Mode: image
Site Name: 35d050c0-77c0-11eb-9242-2cea7ff9d07c
Peer Sites:
UUID: 86eacc0f-6657-4742-8daf-2942ea23affd
Name: qa
Mirror UUID:
Direction: rx-tx
Client: client.infra
# rbd feature enable volume-ssd/volume-aceee005-265e-44ea-a591-b6dda639a76b journaling
# rbd mirror image enable volume-ssd/volume-aceee005-265e-44ea-a591-b6dda639a76b journal
# rbd info volume-ssd/volume-aceee005-265e-44ea-a591-b6dda639a76b
rbd image 'volume-aceee005-265e-44ea-a591-b6dda639a76b':
        size 40 GiB in 10240 objects
        order 22 (4 MiB objects)
        snapshot_count: 0
        id: e6fce674350e55
        block_name_prefix: rbd_data.e6fce674350e55
        format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten, journaling
        op_features:
        flags:
        create_timestamp: Sun Jul 30 22:21:09 2023
        access_timestamp: Mon Jul 31 00:36:20 2023
        modify_timestamp: Mon Jul 31 08:01:26 2023
        parent: image/801d1850-de2f-443b-a30c-71966e90c118@snap
        overlap: 10 GiB
        journal: e6fce674350e55
        mirroring state: disabled
# rbd mirror image status volume-ssd/volume-aceee005-265e-44ea-a591-b6dda639a76b
rbd: mirroring not enabled on the image
```
Thanks! Tony
[ceph-users] Re: configure rgw
When deploying the rgw container, cephadm adds rgw_frontends into the config db at the daemon level. I was adding settings at the node level; that's why I didn't see my setting take effect. I need to put rgw_frontends at the daemon level after deployment. Thanks! Tony

From: Tony Liu Sent: July 29, 2023 11:44 PM To: ceph-users@ceph.io; d...@ceph.io Subject: [ceph-users] Re: configure rgw

A few updates.
1. "radosgw-admin --show-config -n client.rgw.qa.ceph-1.hzfrwq" doesn't show the actual running config.
2. "ceph --admin-daemon /var/run/ceph/69f94d08-2811-11ee-9ab1-089204adfafa/ceph-client.rgw.qa.ceph-1.hzfrwq.7.94254493095368.asok config show" shows the actual running config.
3. All settings in client.rgw are applied to the rgw running config, except for rgw_frontends.
```
# ceph config get client.rgw rgw_frontends
beast port=8086
# ceph --admin-daemon /var/run/ceph/69f94d08-2811-11ee-9ab1-089204adfafa/ceph-client.rgw.qa.ceph-1.hzfrwq.7.94254493095368.asok config get rgw_frontends
{
    "rgw_frontends": "beast endpoint=10.250.80.100:80"
}
```
The only place I see "10.250.80.100" and "80" is unit.meta. How is that applied? Found a workaround: remove rgw_frontends from the config and restart rgw; rgw_frontends goes back to the default "port=7480". Add it back to the config and restart rgw; now rgw_frontends is what I expect. The logic doesn't make much sense to me. I'd assume that unit.meta has something to do with this; hopefully someone can shed light here. Thanks! Tony

From: Tony Liu Sent: July 29, 2023 10:40 PM To: ceph-users@ceph.io; d...@ceph.io Subject: [ceph-users] configure rgw

Hi, I'm using the Pacific v16.2.10 container image, deployed by cephadm. I used to manually build the config file for rgw: deploy rgw, put the config file in place and restart rgw. That works fine. Now, I'd like to put the rgw config into the config db. I tried with client.rgw, but the config is not taken by rgw. Also "config show" doesn't work; it always says "no config state".
```
# ceph orch ps | grep rgw
rgw.qa.ceph-1.hzfrwq  ceph-1  10.250.80.100:80  running (10m)  10m ago  53m  51.4M  -  16.2.10  32214388de9d  13169a213bc5
# ceph config get client.rgw | grep frontends
client.rgw  basic  rgw_frontends  beast port=8086  *
# ceph config show rgw.qa.ceph-1.hzfrwq
Error ENOENT: no config state for daemon rgw.qa.ceph-1.hzfrwq
# ceph config show client.rgw.qa.ceph-1.hzfrwq
Error ENOENT: no config state for daemon client.rgw.qa.ceph-1.hzfrwq
# radosgw-admin --show-config -n client.rgw.qa.ceph-1.hzfrwq | grep frontends
rgw_frontends = beast port=7480
```
Any clues what I am missing here? Thanks! Tony
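The shadowing behavior described in the resolution can be modeled with a toy most-specific-wins lookup. This is only an illustration of config-db precedence (a daemon-level entry beats the client.rgw section, which beats global), not Ceph's actual resolution code; the values mirror the ones from the thread.

```python
# Toy model (not Ceph's implementation) of config-db resolution: the
# most specific section wins, which is why a daemon-level rgw_frontends
# written by cephadm shadows one set at the client.rgw level.

config_db = {
    "global": {},
    "client.rgw": {"rgw_frontends": "beast port=8086"},
    # entry cephadm wrote at deploy time, at daemon level:
    "client.rgw.qa.ceph-1.hzfrwq": {"rgw_frontends": "beast endpoint=10.250.80.100:80"},
}

def resolve(daemon: str, option: str):
    """Walk from the most specific section to the least; first hit wins."""
    section = daemon
    while section:
        if option in config_db.get(section, {}):
            return config_db[section][option]
        section = section.rpartition(".")[0]   # drop one name component
    return config_db["global"].get(option)

print(resolve("client.rgw.qa.ceph-1.hzfrwq", "rgw_frontends"))
# beast endpoint=10.250.80.100:80  (daemon level shadows client.rgw)
```

Under this model, removing and re-adding the option only appears to fix things because the shadowing daemon-level entry gets rewritten along the way.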
[ceph-users] Re: configure rgw
A few updates.
1. "radosgw-admin --show-config -n client.rgw.qa.ceph-1.hzfrwq" doesn't show the actual running config.
2. "ceph --admin-daemon /var/run/ceph/69f94d08-2811-11ee-9ab1-089204adfafa/ceph-client.rgw.qa.ceph-1.hzfrwq.7.94254493095368.asok config show" shows the actual running config.
3. All settings in client.rgw are applied to the rgw running config, except for rgw_frontends.
```
# ceph config get client.rgw rgw_frontends
beast port=8086
# ceph --admin-daemon /var/run/ceph/69f94d08-2811-11ee-9ab1-089204adfafa/ceph-client.rgw.qa.ceph-1.hzfrwq.7.94254493095368.asok config get rgw_frontends
{
    "rgw_frontends": "beast endpoint=10.250.80.100:80"
}
```
The only place I see "10.250.80.100" and "80" is unit.meta. How is that applied? Found a workaround: remove rgw_frontends from the config and restart rgw; rgw_frontends goes back to the default "port=7480". Add it back to the config and restart rgw; now rgw_frontends is what I expect. The logic doesn't make much sense to me. I'd assume that unit.meta has something to do with this; hopefully someone can shed light here. Thanks! Tony

From: Tony Liu Sent: July 29, 2023 10:40 PM To: ceph-users@ceph.io; d...@ceph.io Subject: [ceph-users] configure rgw

Hi, I'm using the Pacific v16.2.10 container image, deployed by cephadm. I used to manually build the config file for rgw: deploy rgw, put the config file in place and restart rgw. That works fine. Now, I'd like to put the rgw config into the config db. I tried with client.rgw, but the config is not taken by rgw. Also "config show" doesn't work; it always says "no config state".
```
# ceph orch ps | grep rgw
rgw.qa.ceph-1.hzfrwq  ceph-1  10.250.80.100:80  running (10m)  10m ago  53m  51.4M  -  16.2.10  32214388de9d  13169a213bc5
# ceph config get client.rgw | grep frontends
client.rgw  basic  rgw_frontends  beast port=8086  *
# ceph config show rgw.qa.ceph-1.hzfrwq
Error ENOENT: no config state for daemon rgw.qa.ceph-1.hzfrwq
# ceph config show client.rgw.qa.ceph-1.hzfrwq
Error ENOENT: no config state for daemon client.rgw.qa.ceph-1.hzfrwq
# radosgw-admin --show-config -n client.rgw.qa.ceph-1.hzfrwq | grep frontends
rgw_frontends = beast port=7480
```
Any clues what I am missing here? Thanks! Tony
[ceph-users] configure rgw
Hi, I'm using the Pacific v16.2.10 container image, deployed by cephadm. I used to manually build the config file for rgw: deploy rgw, put the config file in place and restart rgw. That works fine. Now, I'd like to put the rgw config into the config db. I tried with client.rgw, but the config is not taken by rgw. Also "config show" doesn't work; it always says "no config state".
```
# ceph orch ps | grep rgw
rgw.qa.ceph-1.hzfrwq  ceph-1  10.250.80.100:80  running (10m)  10m ago  53m  51.4M  -  16.2.10  32214388de9d  13169a213bc5
# ceph config get client.rgw | grep frontends
client.rgw  basic  rgw_frontends  beast port=8086  *
# ceph config show rgw.qa.ceph-1.hzfrwq
Error ENOENT: no config state for daemon rgw.qa.ceph-1.hzfrwq
# ceph config show client.rgw.qa.ceph-1.hzfrwq
Error ENOENT: no config state for daemon client.rgw.qa.ceph-1.hzfrwq
# radosgw-admin --show-config -n client.rgw.qa.ceph-1.hzfrwq | grep frontends
rgw_frontends = beast port=7480
```
Any clues what I am missing here? Thanks! Tony
[ceph-users] Re: resume RBD mirror on another host
Super! Thanks Ilya! Tony From: Ilya Dryomov Sent: July 13, 2023 01:30 PM To: Tony Liu Cc: d...@ceph.io; ceph-users@ceph.io Subject: Re: [ceph-users] resume RBD mirror on another host On Thu, Jul 13, 2023 at 10:23 PM Ilya Dryomov wrote: > > On Thu, Jul 13, 2023 at 6:16 PM Tony Liu wrote: > > > > Hi, > > > > How RBD mirror tracks mirroring process, on local storage? > > Say, RBD mirror is running on host-1, when host-1 goes down, > > start RBD mirror on host-2. In that case, is RBD mirror on host-2 > > going to continue the mirroring? > > Hi Tony, > > No, it's tracked on the RBD image itself -- meaning in RADOS. To be clear, "no" here was meant to answer the "How RBD mirror tracks mirroring process, on local storage?" question. To answer the second question explicitly: yes, rbd-mirror daemon on host-2 will continue mirroring from where rbd-mirror daemon on host-1 left off. Thanks, Ilya ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] resume RBD mirror on another host
Hi, does rbd-mirror track mirroring progress on local storage? Say rbd-mirror is running on host-1; when host-1 goes down, rbd-mirror is started on host-2. In that case, will rbd-mirror on host-2 continue the mirroring? Thanks! Tony ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] librbd Python asyncio
Hi, Wondering if there is a librbd Python binding that supports asyncio, or any plan to add one? Thanks! Tony ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
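Until a native asyncio binding exists, one common pattern is to bridge a completion-callback API (like librbd's aio_* methods) into an asyncio future. The sketch below uses a fake read function in place of the real binding; the real librbd signatures may differ, so treat this as the shape of the bridge, not as librbd's actual API:

```python
import asyncio

def fake_aio_read(offset, length, oncomplete):
    # Stand-in for an aio-style call: the backend invokes the completion
    # callback with the result (here, immediately, with zero-filled data).
    oncomplete(b"\0" * length)

def awaitable(aio_call, *args):
    # Bridge a callback-style call into an awaitable Future.
    loop = asyncio.get_running_loop()
    fut = loop.create_future()
    def oncomplete(result):
        # Completions may fire on a foreign thread; hop back to the loop.
        loop.call_soon_threadsafe(fut.set_result, result)
    aio_call(*args, oncomplete)
    return fut

async def main():
    return await awaitable(fake_aio_read, 0, 4096)

data = asyncio.run(main())
print(len(data))  # → 4096
```

The key detail is `call_soon_threadsafe`: librbd completions run on librbd's own threads, so resolving the future must be marshalled back onto the event loop.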
[ceph-users] Re: import OSD after host OS reinstallation
Thank you Eugen for looking into it! In short, it works. I'm using 16.2.10. What I did wrong was to remove the OSD, which makes no sense. Tony From: Eugen Block Sent: April 28, 2023 06:46 AM To: ceph-users@ceph.io Subject: [ceph-users] Re: import OSD after host OS reinstallation I chatted with Mykola who helped me get the OSDs back up. My test cluster was on 16.2.5 (and still mostly is), after upgrading only the MGRs to a more recent version (16.2.10) the activate command worked successfully and the existing OSDs got back up. Not sure if that's a bug or something else, but which exact versions are you using? Zitat von Eugen Block : > I found a small two-node cluster to test this on pacific, I can > reproduce it. After reinstalling the host (VM) most of the other > services are redeployed (mon, mgr, mds, crash), but not the OSDs. I > will take a closer look. > > Zitat von Tony Liu : > >> Tried [1] already, but got error. >> Created no osd(s) on host ceph-4; already created? >> >> The error is from [2] in deploy_osd_daemons_for_existing_osds(). >> >> Not sure what's missing. >> Should OSD be removed, or removed with --replace, or untouched >> before host reinstallation? >> >> [1] >> https://docs.ceph.com/en/pacific/cephadm/services/osd/#activate-existing-osds >> [2] >> https://github.com/ceph/ceph/blob/0a5b3b373b8a5ba3081f1f110cec24d82299cac8/src/pybind/mgr/cephadm/services/osd.py#L196 >> >> Thanks! >> Tony >> >> From: Tony Liu >> Sent: April 27, 2023 10:20 PM >> To: ceph-users@ceph.io; d...@ceph.io >> Subject: [ceph-users] import OSD after host OS reinstallation >> >> Hi, >> >> The cluster is with Pacific and deployed by cephadm on container. >> The case is to import OSDs after host OS reinstallation. >> All OSDs are SSD who has DB/WAL and data together. >> Did some research, but not able to find a working solution. >> Wondering if anyone has experiences in this? >> What needs to be done before host OS reinstallation and what's after? >> >> >> Thanks! 
>> Tony >> ___ ceph-users mailing list -- ceph-users@ceph.io >> To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: import OSD after host OS reinstallation
Tried [1] already, but got an error: Created no osd(s) on host ceph-4; already created? The error is from [2] in deploy_osd_daemons_for_existing_osds(). Not sure what's missing. Should the OSD be removed, removed with --replace, or left untouched before host reinstallation? [1] https://docs.ceph.com/en/pacific/cephadm/services/osd/#activate-existing-osds [2] https://github.com/ceph/ceph/blob/0a5b3b373b8a5ba3081f1f110cec24d82299cac8/src/pybind/mgr/cephadm/services/osd.py#L196 Thanks! Tony From: Tony Liu Sent: April 27, 2023 10:20 PM To: ceph-users@ceph.io; d...@ceph.io Subject: [ceph-users] import OSD after host OS reinstallation Hi, The cluster is on Pacific, deployed by cephadm in containers. The case is to import OSDs after host OS reinstallation. All OSDs are SSDs with DB/WAL and data colocated. Did some research, but was not able to find a working solution. Wondering if anyone has experience with this? What needs to be done before host OS reinstallation, and what after? Thanks! Tony ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] import OSD after host OS reinstallation
Hi, The cluster is on Pacific, deployed by cephadm in containers. The case is to import OSDs after host OS reinstallation. All OSDs are SSDs with DB/WAL and data colocated. Did some research, but was not able to find a working solution. Wondering if anyone has experience with this? What needs to be done before host OS reinstallation, and what after? Thanks! Tony ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: rbd cp vs. rbd clone + rbd flatten
Thank you Ilya! Tony From: Ilya Dryomov Sent: March 27, 2023 10:28 AM To: Tony Liu Cc: ceph-users@ceph.io; d...@ceph.io Subject: Re: [ceph-users] rbd cp vs. rbd clone + rbd flatten On Wed, Mar 22, 2023 at 10:51 PM Tony Liu wrote: > > Hi, > > I want > 1) to copy a snapshot to an image, > 2) no need to copy snapshots, > 3) no dependency after the copy, > 4) all in image format 2. > In that case, is rbd cp the same as rbd clone + rbd flatten? > I ran some tests and it seems like it, but I want to confirm, in case I'm missing > anything. Hi Tony, Yes, at a high level it should be the same. > Also, it seems cp is a bit faster than clone + flatten; is that true? I can't think of anything that would make "rbd cp" faster. I would actually expect it to be slower since "rbd cp" also attempts to sparsify the destination image (see --sparse-size option), making it more space efficient. Thanks, Ilya ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
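The sparsification Ilya mentions (and the 4K zero detection "rbd import" does, per the other thread) boils down to scanning the data in fixed-size chunks and skipping chunks that are entirely zero, so they consume no space on the destination. A small sketch of the idea:

```python
# Sketch of zero-detection at a fixed granularity -- the idea behind
# rbd's sparsification (--sparse-size; rbd import defaults to 4K).
SPARSE_SIZE = 4096

def nonzero_chunks(buf, sparse_size=SPARSE_SIZE):
    """Yield (offset, chunk) only for chunks containing a nonzero byte."""
    zero = bytes(sparse_size)
    for off in range(0, len(buf), sparse_size):
        chunk = buf[off:off + sparse_size]
        if chunk != zero[:len(chunk)]:
            yield off, chunk

# 12K of data where only the middle 4K chunk is nonzero: only that
# chunk would be written, so the rest shows up as 0 B used.
data = b"\0" * 4096 + b"x" * 4096 + b"\0" * 4096
written = list(nonzero_chunks(data))
print(len(written), written[0][0])  # → 1 4096
```

This is also why a destination can report less "USED" than the source: extents that were written with zeros on the source are simply never written on the destination.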
[ceph-users] rbd cp vs. rbd clone + rbd flatten
Hi, I want 1) to copy a snapshot to an image, 2) no need to copy snapshots, 3) no dependency after the copy, 4) all in image format 2. In that case, is rbd cp the same as rbd clone + rbd flatten? I ran some tests and it seems like it, but I want to confirm, in case I'm missing anything. Also, it seems cp is a bit faster than clone + flatten; is that true? Thanks! Tony ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Is it a bug that OSD crashed when it's full?
Thank you Igor! Tony From: Igor Fedotov Sent: November 1, 2022 04:34 PM To: Tony Liu; ceph-users@ceph.io; d...@ceph.io Subject: Re: [ceph-users] Re: Is it a bug that OSD crashed when it's full? Hi Tony, first of all let me share my understanding of the issue you're facing. This recalls an upstream ticket, and I presume my root cause analysis from there (https://tracker.ceph.com/issues/57672#note-9) is applicable in your case as well. Generally speaking, your OSD isn't 100% full - from the log output one can see that 0x57acbc000 of 0x6fc840 bytes are free. But there are not enough contiguous 64K chunks for BlueFS to keep operating. As a result, the OSD managed to slip past all the *full* sentries and reached the state where it crashed - those safety measures just weren't designed to take the additional free-space fragmentation factor into account... Similarly, the lack of available 64K chunks prevents the OSD from starting up - it needs to write out some more data to BlueFS during startup recovery. I'm currently working on enabling BlueFS to function with the default main-device allocation unit (=4K), which will hopefully fix the above issue. Meanwhile, you might work around the current OSD's state by setting bluefs_shared_alloc_size to 32K - this might have some operational and performance effects, but the OSD should most likely be able to start up afterwards. Please do not use 4K for now - it's known to cause more problems in some circumstances. And I'd highly recommend redeploying the OSD ASAP once you have drained all the data off it - I presume that's the reason why you want to bring it up instead of letting the cluster recover using the regular means applied on OSD loss. Alternative approach would be to add a standalone DB volume and migrate BlueFS there - ceph-volume should be able to do that even in the current OSD state.
Expanding main volume (if backed by LVM and extra spare space is available) is apparently a valid option too Thanks, Igor On 11/1/2022 8:09 PM, Tony Liu wrote: > The actual question is that, is crash expected when OSD is full? > My focus is more on how to prevent this from happening. > My expectation is that OSD rejects write request when it's full, but not > crash. > Otherwise, no point to have ratio threshold. > Please let me know if this is the design or a bug. > > Thanks! > Tony > ________ > From: Tony Liu > Sent: October 31, 2022 05:46 PM > To: ceph-users@ceph.io; d...@ceph.io > Subject: [ceph-users] Is it a bug that OSD crashed when it's full? > > Hi, > > Based on doc, Ceph prevents you from writing to a full OSD so that you don’t > lose data. > In my case, with v16.2.10, OSD crashed when it's full. Is this expected or > some bug? > I'd expect write failure instead of OSD crash. It keeps crashing when tried > to bring it up. > Is there any way to bring it back? > > -7> 2022-10-31T22:52:57.426+ 7fe37fd94200 4 rocksdb: EVENT_LOG_v1 > {"time_micros": 1667256777427646, "job": 1, "event": "recovery_started", > "log_files": [23300]} > -6> 2022-10-31T22:52:57.426+ 7fe37fd94200 4 rocksdb: > [db_impl/db_impl_open.cc:760] Recovering log #23300 mode 2 > -5> 2022-10-31T22:52:57.529+ 7fe37fd94200 3 rocksdb: > [le/block_based/filter_policy.cc:584] Using legacy Bloom filter with high > (20) bits/key. Dramatic filter space and/or accuracy improvement is available > with format_version>=5. 
> -4> 2022-10-31T22:52:57.592+ 7fe37fd94200 1 bluefs _allocate unable > to allocate 0x9 on bdev 1, allocator name block, allocator type hybrid, > capacity, block size 0x1000, free 0x57acbc000, fragmentation 0.359784, > allocated 0x0 > -3> 2022-10-31T22:52:57.592+ 7fe37fd94200 -1 bluefs _allocate > allocation failed, needed 0x8064a > -2> 2022-10-31T22:52:57.592+ 7fe37fd94200 -1 bluefs _flush_range > allocated: 0x0 offset: 0x0 length: 0x8064a > -1> 2022-10-31T22:52:57.604+ 7fe37fd94200 -1 > /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.10/rpm/el8/BUILD/ceph-16.2.10/src/os/bluestore/BlueFS.cc: > In function 'int BlueFS::_flush_range(BlueFS::FileWriter*, uint64_t, > uint64_t)' thread 7fe37fd94200 time 2022-10-31T22:52:57.593873+ > /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.10/rpm/el8/BUILD/ceph-16.2.10/src/os/bluestore/BlueFS.cc: > 2768: ceph_abort_msg("bluefs enospc") > > ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific > (stable) > 1: (ceph::__ceph_abort(cha
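Igor's fragmentation point can be made concrete with a small sketch: the total free space can be large while the largest contiguous free extent is still smaller than the 64K BlueFS wants, so the allocation fails even though the OSD is "not full":

```python
# Sketch of the failure mode described above: plenty of free space in
# total, but no contiguous run of free 4K blocks big enough for a 64K
# BlueFS allocation.
BLOCK = 4096

def max_contiguous_free(bitmap):
    """bitmap[i] is True if 4K block i is free; return longest free run in bytes."""
    best = run = 0
    for free in bitmap:
        run = run + 1 if free else 0
        best = max(best, run)
    return best * BLOCK

# Alternate free/used blocks: 50% free overall, yet the largest
# contiguous free extent is a single 4K block, so 64K cannot be allocated.
bitmap = [i % 2 == 0 for i in range(1024)]
free_bytes = sum(bitmap) * BLOCK
print(free_bytes, max_contiguous_free(bitmap) >= 65536)  # → 2097152 False
```

This is exactly why a smaller bluefs_shared_alloc_size can get such an OSD started: a 32K (or 4K) request is far more likely to find a contiguous run in fragmented free space.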
[ceph-users] Re: Is it a bug that OSD crashed when it's full?
The actual question is: is a crash expected when an OSD is full? My focus is more on how to prevent this from happening. My expectation is that the OSD rejects write requests when it's full, rather than crashing. Otherwise, there is no point in having the ratio thresholds. Please let me know if this is the design or a bug. Thanks! Tony From: Tony Liu Sent: October 31, 2022 05:46 PM To: ceph-users@ceph.io; d...@ceph.io Subject: [ceph-users] Is it a bug that OSD crashed when it's full? Hi, Based on the doc, Ceph prevents you from writing to a full OSD so that you don't lose data. In my case, with v16.2.10, the OSD crashed when it was full. Is this expected or a bug? I'd expect a write failure instead of an OSD crash. It keeps crashing when I try to bring it up. Is there any way to bring it back? -7> 2022-10-31T22:52:57.426+ 7fe37fd94200 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1667256777427646, "job": 1, "event": "recovery_started", "log_files": [23300]} -6> 2022-10-31T22:52:57.426+ 7fe37fd94200 4 rocksdb: [db_impl/db_impl_open.cc:760] Recovering log #23300 mode 2 -5> 2022-10-31T22:52:57.529+ 7fe37fd94200 3 rocksdb: [le/block_based/filter_policy.cc:584] Using legacy Bloom filter with high (20) bits/key. Dramatic filter space and/or accuracy improvement is available with format_version>=5. 
-4> 2022-10-31T22:52:57.592+ 7fe37fd94200 1 bluefs _allocate unable to allocate 0x9 on bdev 1, allocator name block, allocator type hybrid, capacity 0x6fc840, block size 0x1000, free 0x57acbc000, fragmentation 0.359784, allocated 0x0 -3> 2022-10-31T22:52:57.592+ 7fe37fd94200 -1 bluefs _allocate allocation failed, needed 0x8064a -2> 2022-10-31T22:52:57.592+ 7fe37fd94200 -1 bluefs _flush_range allocated: 0x0 offset: 0x0 length: 0x8064a -1> 2022-10-31T22:52:57.604+ 7fe37fd94200 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.10/rpm/el8/BUILD/ceph-16.2.10/src/os/bluestore/BlueFS.cc: In function 'int BlueFS::_flush_range(BlueFS::FileWriter*, uint64_t, uint64_t)' thread 7fe37fd94200 time 2022-10-31T22:52:57.593873+ /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.10/rpm/el8/BUILD/ceph-16.2.10/src/os/bluestore/BlueFS.cc: 2768: ceph_abort_msg("bluefs enospc") ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable) 1: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string, std::allocator > const&)+0xe5) [0x55858d7e2e7c] 2: (BlueFS::_flush_range(BlueFS::FileWriter*, unsigned long, unsigned long)+0x1131) [0x55858dee8cc1] 3: (BlueFS::_flush(BlueFS::FileWriter*, bool, bool*)+0x90) [0x55858dee8fa0] 4: (BlueFS::_flush(BlueFS::FileWriter*, bool, std::unique_lock&)+0x32) [0x55858defa0b2] 5: (BlueRocksWritableFile::Append(rocksdb::Slice const&)+0x11b) [0x55858df129eb] 6: (rocksdb::LegacyWritableFileWrapper::Append(rocksdb::Slice const&, rocksdb::IOOptions const&, rocksdb::IODebugContext*)+0x1f) [0x55858e3ae55f] 7: (rocksdb::WritableFileWriter::WriteBuffered(char const*, unsigned long)+0x58a) [0x55858e4c02aa] 8: (rocksdb::WritableFileWriter::Append(rocksdb::Slice const&)+0x2d0) [0x55858e4c1700] 9: 
(rocksdb::BlockBasedTableBuilder::WriteRawBlock(rocksdb::Slice const&, rocksdb::CompressionType, rocksdb::BlockHandle*, bool)+0xb6) [0x55858e5dce86] 10: (rocksdb::BlockBasedTableBuilder::WriteBlock(rocksdb::Slice const&, rocksdb::BlockHandle*, bool)+0x26c) [0x55858e5dd7cc] 11: (rocksdb::BlockBasedTableBuilder::WriteBlock(rocksdb::BlockBuilder*, rocksdb::BlockHandle*, bool)+0x3c) [0x55858e5ddecc] 12: (rocksdb::BlockBasedTableBuilder::Flush()+0x6d) [0x55858e5ddf5d] 13: (rocksdb::BlockBasedTableBuilder::Add(rocksdb::Slice const&, rocksdb::Slice const&)+0x2b8) [0x55858e5e13c8] 14: (rocksdb::BuildTable(std::__cxx11::basic_string, std::allocator > const&, rocksdb::Env*, rocksdb::FileSystem*, rocksdb::ImmutableCFOptions const&, rocksdb::MutableCFOptions const&, rocksdb::FileOptions const&, rocksdb::TableCache*, rocksdb::InternalIteratorBase*, std::vector >, std::allocator > > >, rocksdb::FileMetaData*, rocksdb::InternalKeyComparator const&, std::vector >, std::allocator > > > const*, unsigned int, std::__cxx11::basic_string, std::allocator > const&, std::vector >, unsigned long, rocksdb::SnapshotChecker*, rocksdb::CompressionType, unsigned long, rocksdb::CompressionOptions const&, bool, rocksdb::InternalStats*, rocksdb::TableFileCreationReason, rocksdb::EventLogger*, int, rocksdb::Env::IOPriority, rocksdb::TableProperties*, int, unsigned long, unsigned long, rocksdb::Env::Wri
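The ratio thresholds Tony refers to are intended to stop client writes well before the device is exhausted; the defaults below are the ones documented for Ceph (nearfull 0.85, backfillfull 0.90, full 0.95), though clusters can override them. A sketch of the intended staircase, which fragmentation bypassed in this incident:

```python
# Sketch of the OSD full-ratio staircase (documented Ceph defaults):
# writes are supposed to be refused at full_ratio, before the device is
# physically exhausted. Free-space fragmentation slipped past these
# checks in the crash discussed above.
def osd_state(used_ratio, nearfull=0.85, backfillfull=0.90, full=0.95):
    if used_ratio >= full:
        return "full"          # client writes refused
    if used_ratio >= backfillfull:
        return "backfillfull"  # backfill into this OSD refused
    if used_ratio >= nearfull:
        return "nearfull"      # health warning only
    return "ok"

print(osd_state(0.80), osd_state(0.87), osd_state(0.93), osd_state(0.96))
# → ok nearfull backfillfull full
```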
[ceph-users] Re: Is it a bug that OSD crashed when it's full?
Hi Zizon, I know I ran out of space. I thought the full ratio would prevent me from getting here. I tried a few of the ceph-*-tools; they crash the same way. I guess they need rocksdb to start? Any recommendations on how I can restore it, copy the data out, or copy the volume to another, bigger disk? Thanks! Tony From: Zizon Qiu Sent: October 31, 2022 08:13 PM To: Tony Liu Cc: ceph-users@ceph.io; d...@ceph.io Subject: Re: Is it a bug that OSD crashed when it's full? 15: (rocksdb::DBImpl::WriteLevel0TableForRecovery(int, rocksdb::ColumnFamilyData*, rocksdb::MemTable*, rocksdb::VersionEdit*)+0xcf5) [0x55858e3f0ea5] 16: (rocksdb::DBImpl::RecoverLogFiles(std::vector > const&, unsigned long*, bool, bool*)+0x1c2e) [0x55858e3f35de] 17: (rocksdb::DBImpl::Recover(std::vector > const&, bool, bool, bool, unsigned long*)+0xae8) [0x55858e3f4938] 18: (rocksdb::DBImpl::Open(rocksdb::DBOptions const&, std::__cxx11::basic_string, std::allocator > const&, std::vector > const&, std::vector >*, rocksdb::DB**, bool, bool)+0x59d) [0x55858e3ee65d] I think you should manage to make room for at least this to finish. RocksDB cannot even start up without enough disk space. On Tue, Nov 1, 2022 at 8:49 AM Tony Liu mailto:tonyliu0...@hotmail.com>> wrote: Hi, Based on doc, Ceph prevents you from writing to a full OSD so that you don't lose data. In my case, with v16.2.10, OSD crashed when it's full. Is this expected or some bug? I'd expect write failure instead of OSD crash. It keeps crashing when I try to bring it up. Is there any way to bring it back? -7> 2022-10-31T22:52:57.426+ 7fe37fd94200 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1667256777427646, "job": 1, "event": "recovery_started", "log_files": [23300]} -6> 2022-10-31T22:52:57.426+ 7fe37fd94200 4 rocksdb: [db_impl/db_impl_open.cc:760] Recovering log #23300 mode 2 -5> 2022-10-31T22:52:57.529+ 7fe37fd94200 3 rocksdb: [le/block_based/filter_policy.cc:584] Using legacy Bloom filter with high (20) bits/key. 
Dramatic filter space and/or accuracy improvement is available with format_version>=5. -4> 2022-10-31T22:52:57.592+ 7fe37fd94200 1 bluefs _allocate unable to allocate 0x9 on bdev 1, allocator name block, allocator type hybrid, capacity 0x6fc840, block size 0x1000, free 0x57acbc000, fragmentation 0.359784, allocated 0x0 -3> 2022-10-31T22:52:57.592+ 7fe37fd94200 -1 bluefs _allocate allocation failed, needed 0x8064a -2> 2022-10-31T22:52:57.592+ 7fe37fd94200 -1 bluefs _flush_range allocated: 0x0 offset: 0x0 length: 0x8064a -1> 2022-10-31T22:52:57.604+ 7fe37fd94200 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.10/rpm/el8/BUILD/ceph-16.2.10/src/os/bluestore/BlueFS.cc: In function 'int BlueFS::_flush_range(BlueFS::FileWriter*, uint64_t, uint64_t)' thread 7fe37fd94200 time 2022-10-31T22:52:57.593873+ /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.10/rpm/el8/BUILD/ceph-16.2.10/src/os/bluestore/BlueFS.cc: 2768: ceph_abort_msg("bluefs enospc") ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable) 1: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string, std::allocator > const&)+0xe5) [0x55858d7e2e7c] 2: (BlueFS::_flush_range(BlueFS::FileWriter*, unsigned long, unsigned long)+0x1131) [0x55858dee8cc1] 3: (BlueFS::_flush(BlueFS::FileWriter*, bool, bool*)+0x90) [0x55858dee8fa0] 4: (BlueFS::_flush(BlueFS::FileWriter*, bool, std::unique_lock&)+0x32) [0x55858defa0b2] 5: (BlueRocksWritableFile::Append(rocksdb::Slice const&)+0x11b) [0x55858df129eb] 6: (rocksdb::LegacyWritableFileWrapper::Append(rocksdb::Slice const&, rocksdb::IOOptions const&, rocksdb::IODebugContext*)+0x1f) [0x55858e3ae55f] 7: (rocksdb::WritableFileWriter::WriteBuffered(char const*, unsigned long)+0x58a) [0x55858e4c02aa] 8: 
(rocksdb::WritableFileWriter::Append(rocksdb::Slice const&)+0x2d0) [0x55858e4c1700] 9: (rocksdb::BlockBasedTableBuilder::WriteRawBlock(rocksdb::Slice const&, rocksdb::CompressionType, rocksdb::BlockHandle*, bool)+0xb6) [0x55858e5dce86] 10: (rocksdb::BlockBasedTableBuilder::WriteBlock(rocksdb::Slice const&, rocksdb::BlockHandle*, bool)+0x26c) [0x55858e5dd7cc] 11: (rocksdb::BlockBasedTableBuilder::WriteBlock(rocksdb::BlockBuilder*, rocksdb::BlockHandle*, bool)+0x3c) [0x55858e5ddecc] 12: (rocksdb::BlockBasedTableBuilder::Flush()+0x6d) [0x55858e5ddf5d] 13: (rocksdb::BlockBasedTableBuilder::Add(rocksdb::Slice const&, rocksdb::Slice const&)+0x2b8) [0x55858e5e13c8] 14: (rocksdb::BuildTable(std::__cxx11::basic_string, std
[ceph-users] Re: Is it a bug that OSD crashed when it's full?
std::allocator > const&)+0x10c1) [0x56102ee39c41] 21: (BlueStore::_open_db(bool, bool, bool)+0x8c7) [0x56102ec9de17] 22: (BlueStore::_open_db_and_around(bool, bool)+0x2f7) [0x56102ed0beb7] 23: (BlueStore::_mount()+0x204) [0x56102ed0ed74] 24: main() 25: __libc_start_main() 26: _start() *** Caught signal (Aborted) ** in thread 7f31699b1000 thread_name:ceph-objectstor ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable) 1: /lib64/libpthread.so.0(+0x12ce0) [0x7f315ec91ce0] 2: gsignal() 3: abort() 4: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string, std::allocator > const&)+0x1b6) [0x7f315f84cf5d] 5: (BlueFS::_flush_range(BlueFS::FileWriter*, unsigned long, unsigned long)+0x1131) [0x56102ed93a71] 6: (BlueFS::_flush(BlueFS::FileWriter*, bool, bool*)+0x90) [0x56102ed93d50] 7: (BlueFS::_flush(BlueFS::FileWriter*, bool, std::unique_lock&)+0x32) [0x56102eda4e62] 8: (BlueRocksWritableFile::Append(rocksdb::Slice const&)+0x11b) [0x56102edbd79b] 9: (rocksdb::LegacyWritableFileWrapper::Append(rocksdb::Slice const&, rocksdb::IOOptions const&, rocksdb::IODebugContext*)+0x1f) [0x56102ee80ebf] 10: (rocksdb::WritableFileWriter::WriteBuffered(char const*, unsigned long)+0x58a) [0x56102ef9360a] 11: (rocksdb::WritableFileWriter::Append(rocksdb::Slice const&)+0x2d0) [0x56102ef94a60] 12: (rocksdb::BlockBasedTableBuilder::WriteRawBlock(rocksdb::Slice const&, rocksdb::CompressionType, rocksdb::BlockHandle*, bool)+0xb6) [0x56102f0b0bc6] 13: (rocksdb::BlockBasedTableBuilder::WriteBlock(rocksdb::Slice const&, rocksdb::BlockHandle*, bool)+0x26c) [0x56102f0b150c] 14: (rocksdb::BlockBasedTableBuilder::WriteBlock(rocksdb::BlockBuilder*, rocksdb::BlockHandle*, bool)+0x3c) [0x56102f0b1c0c] 15: (rocksdb::BlockBasedTableBuilder::Flush()+0x6d) [0x56102f0b1c9d] 16: (rocksdb::BlockBasedTableBuilder::Add(rocksdb::Slice const&, rocksdb::Slice const&)+0x2b8) [0x56102f0b5108] 17: (rocksdb::BuildTable(std::__cxx11::basic_string,
std::allocator > const&, rocksdb::Env*, rocksdb::FileSystem*, rocksdb::ImmutableCFOptions const&, rocksdb::MutableCFOptions const&, rocksdb::FileOptions const&, rocksdb::TableCache*, rocksdb::InternalIteratorBase*, std::vector >, std::allocator > > >, rocksdb::FileMetaData*, rocksdb::InternalKeyComparator const&, std::vector >, std::allocator > > > const*, unsigned int, std::__cxx11::basic_string, std::allocator > const&, std::vector >, unsigned long, rocksdb::SnapshotChecker*, rocksdb::CompressionType, unsigned long, rocksdb::CompressionOptions const&, bool, rocksdb::InternalStats*, rocksdb::TableFileCreationReason, rocksdb::EventLogger*, int, rocksdb::Env::IOPriority, rocksdb::TableProperties*, int, unsigned long, unsigned long, rocksdb::Env::WriteLifeTimeHint, unsigned long)+0xa45) [0x56102f05fad5] 18: (rocksdb::DBImpl::WriteLevel0TableForRecovery(int, rocksdb::ColumnFamilyData*, rocksdb::MemTable*, rocksdb::VersionEdit*)+0xcf5) [0x56102eec39f5] 19: (rocksdb::DBImpl::RecoverLogFiles(std::vector > const&, unsigned long*, bool, bool*)+0x1c2e) [0x56102eec612e] 20: (rocksdb::DBImpl::Recover(std::vector > const&, bool, bool, bool, unsigned long*)+0xae8) [0x56102eec7488] 21: (rocksdb::DBImpl::Open(rocksdb::DBOptions const&, std::__cxx11::basic_string, std::allocator > const&, std::vector > const&, std::vector >*, rocksdb::DB**, bool, bool)+0x59d) [0x56102eec11ad] 22: (rocksdb::DB::Open(rocksdb::DBOptions const&, std::__cxx11::basic_string, std::allocator > const&, std::vector > const&, std::vector >*, rocksdb::DB**)+0x15) [0x56102eec2545] 23: (RocksDBStore::do_open(std::ostream&, bool, bool, std::__cxx11::basic_string, std::allocator > const&)+0x10c1) [0x56102ee39c41] 24: (BlueStore::_open_db(bool, bool, bool)+0x8c7) [0x56102ec9de17] 25: (BlueStore::_open_db_and_around(bool, bool)+0x2f7) [0x56102ed0beb7] 26: (BlueStore::_mount()+0x204) [0x56102ed0ed74] 27: main() 28: __libc_start_main() 29: _start() Aborted (core dumped) Any clues? Thanks again! 
Tony From: Steven Umbehocker Sent: October 31, 2022 07:07 PM To: Tony Liu; ceph-users@ceph.io; d...@ceph.io Subject: Re: Is it a bug that OSD crashed when it's full? Hi Tony, Once an OSD is wedged like that you have to manually remove some objects from it to get below 100% full before you can start it. You can use the ceph-objectstore-tool to get a list of all the objects in the OSD where $OSDID is the ID of your 100% full OSD to scan. You us
[ceph-users] Is it a bug that OSD crashed when it's full?
Hi, Based on doc, Ceph prevents you from writing to a full OSD so that you don’t lose data. In my case, with v16.2.10, OSD crashed when it's full. Is this expected or some bug? I'd expect write failure instead of OSD crash. It keeps crashing when tried to bring it up. Is there any way to bring it back? -7> 2022-10-31T22:52:57.426+ 7fe37fd94200 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1667256777427646, "job": 1, "event": "recovery_started", "log_files": [23300]} -6> 2022-10-31T22:52:57.426+ 7fe37fd94200 4 rocksdb: [db_impl/db_impl_open.cc:760] Recovering log #23300 mode 2 -5> 2022-10-31T22:52:57.529+ 7fe37fd94200 3 rocksdb: [le/block_based/filter_policy.cc:584] Using legacy Bloom filter with high (20) bits/key. Dramatic filter space and/or accuracy improvement is available with format_version>=5. -4> 2022-10-31T22:52:57.592+ 7fe37fd94200 1 bluefs _allocate unable to allocate 0x9 on bdev 1, allocator name block, allocator type hybrid, capacity 0x6fc840, block size 0x1000, free 0x57acbc000, fragmentation 0.359784, allocated 0x0 -3> 2022-10-31T22:52:57.592+ 7fe37fd94200 -1 bluefs _allocate allocation failed, needed 0x8064a -2> 2022-10-31T22:52:57.592+ 7fe37fd94200 -1 bluefs _flush_range allocated: 0x0 offset: 0x0 length: 0x8064a -1> 2022-10-31T22:52:57.604+ 7fe37fd94200 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.10/rpm/el8/BUILD/ceph-16.2.10/src/os/bluestore/BlueFS.cc: In function 'int BlueFS::_flush_range(BlueFS::FileWriter*, uint64_t, uint64_t)' thread 7fe37fd94200 time 2022-10-31T22:52:57.593873+ /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.10/rpm/el8/BUILD/ceph-16.2.10/src/os/bluestore/BlueFS.cc: 2768: ceph_abort_msg("bluefs enospc") ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable) 1: (ceph::__ceph_abort(char const*, 
int, char const*, std::__cxx11::basic_string, std::allocator > const&)+0xe5) [0x55858d7e2e7c] 2: (BlueFS::_flush_range(BlueFS::FileWriter*, unsigned long, unsigned long)+0x1131) [0x55858dee8cc1] 3: (BlueFS::_flush(BlueFS::FileWriter*, bool, bool*)+0x90) [0x55858dee8fa0] 4: (BlueFS::_flush(BlueFS::FileWriter*, bool, std::unique_lock&)+0x32) [0x55858defa0b2] 5: (BlueRocksWritableFile::Append(rocksdb::Slice const&)+0x11b) [0x55858df129eb] 6: (rocksdb::LegacyWritableFileWrapper::Append(rocksdb::Slice const&, rocksdb::IOOptions const&, rocksdb::IODebugContext*)+0x1f) [0x55858e3ae55f] 7: (rocksdb::WritableFileWriter::WriteBuffered(char const*, unsigned long)+0x58a) [0x55858e4c02aa] 8: (rocksdb::WritableFileWriter::Append(rocksdb::Slice const&)+0x2d0) [0x55858e4c1700] 9: (rocksdb::BlockBasedTableBuilder::WriteRawBlock(rocksdb::Slice const&, rocksdb::CompressionType, rocksdb::BlockHandle*, bool)+0xb6) [0x55858e5dce86] 10: (rocksdb::BlockBasedTableBuilder::WriteBlock(rocksdb::Slice const&, rocksdb::BlockHandle*, bool)+0x26c) [0x55858e5dd7cc] 11: (rocksdb::BlockBasedTableBuilder::WriteBlock(rocksdb::BlockBuilder*, rocksdb::BlockHandle*, bool)+0x3c) [0x55858e5ddecc] 12: (rocksdb::BlockBasedTableBuilder::Flush()+0x6d) [0x55858e5ddf5d] 13: (rocksdb::BlockBasedTableBuilder::Add(rocksdb::Slice const&, rocksdb::Slice const&)+0x2b8) [0x55858e5e13c8] 14: (rocksdb::BuildTable(std::__cxx11::basic_string, std::allocator > const&, rocksdb::Env*, rocksdb::FileSystem*, rocksdb::ImmutableCFOptions const&, rocksdb::MutableCFOptions const&, rocksdb::FileOptions const&, rocksdb::TableCache*, rocksdb::InternalIteratorBase*, std::vector >, std::allocator > > >, rocksdb::FileMetaData*, rocksdb::InternalKeyComparator const&, std::vector >, std::allocator > > > const*, unsigned int, std::__cxx11::basic_string, std::allocator > const&, std::vector >, unsigned long, rocksdb::SnapshotChecker*, rocksdb::CompressionType, unsigned long, rocksdb::CompressionOptions const&, bool, rocksdb::InternalStats*, 
rocksdb::TableFileCreationReason, rocksdb::EventLogger*, int, rocksdb::Env::IOPriority, rocksdb::TableProperties*, int, unsigned long, unsigned long,
[ceph-users] Re: Ceph configuration for rgw
You can always "config get" what was set by "config set", because that's just writing and reading a KV pair to and from the configuration DB. To "config show" what was set by "config set" requires support for the mgr to connect to the service daemon and fetch its running config. I see such support for mgr, mon and osd, but not rgw. The case I am asking about is the latter: for rgw, after "config set", I can't see it with "config show". I'd like to know if this is expected. Also, the config in the configuration DB doesn't seem to be applied to rgw, even after restarting the service. I also noticed that, when cephadm deploys rgw, it tries to add a firewall rule for the open port. In my case, the port is not in the "public" zone, and I don't see a way to set the zone or disable this action. Thanks! Tony From: Eugen Block Sent: September 26, 2022 12:08 AM To: ceph-users@ceph.io Subject: [ceph-users] Re: Ceph configuration for rgw Just adding this: ses7-host1:~ # ceph config set client.rgw.ebl-rgw rgw_frontends "beast port=8080" This change is visible in the config get output: client.rgw.ebl-rgw  basic  rgw_frontends  beast port=8080 Zitat von Eugen Block : > Hi, > > the docs [1] show how to specify the rgw configuration via yaml > file (similar to OSDs). > If you applied it with ceph orch you should see your changes in the > 'ceph config dump' output, or like this: > > ---snip--- > ses7-host1:~ # ceph orch ls | grep rgw > rgw.ebl-rgw  ?:80  2/2  33s ago  3M  ses7-host3;ses7-host4 > > ses7-host1:~ # ceph config get client.rgw.ebl-rgw > WHO  MASK  LEVEL  OPTION  VALUE  RO > global  basic  container_image  registry.fqdn:5000/ses/7.1/ceph/ceph@sha256:...  * > client.rgw.ebl-rgw  basic  rgw_frontends  beast port=80  * > client.rgw.ebl-rgw  advanced  rgw_realm  ebl-rgw  * > client.rgw.ebl-rgw  advanced  rgw_zone  ebl-zone > ---snip--- > > As you see the RGWs are clients so you need to consider that when > you request the current configuration. 
But what I find strange is > that apparently it only shows the config initially applied, it > doesn't show the changes after running 'ceph orch apply -i rgw.yaml' > although the changes are applied to the containers after restarting > them. I don't know if this is intended but sounds like a bug to me > (I haven't checked). > >> 1) When start rgw with cephadm ("orch apply -i "), I have >> to start the daemon >>then update configuration file and restart. I don't find a way >> to achieve this by single step. > > I haven't played around too much yet, but you seem to be right, > changing the config isn't applied immediately, but only after a > service restart ('ceph orch restart rgw.ebl-rgw'). Maybe that's on > purpose? So you can change your config now and apply it later when a > service interruption is not critical. > > > [1] https://docs.ceph.com/en/pacific/cephadm/services/rgw/ > > Zitat von Tony Liu : > >> Hi, >> >> The cluster is Pacific 16.2.10 with containerized service and >> managed by cephadm. >> >> "config show" shows running configuration. Who is supported? >> mon, mgr and osd all work, but rgw doesn't. Is this expected? >> I tried with client. and >> without "client", >> neither works. >> >> When issue "config show", who connects the daemon and retrieves >> running config? >> Is it mgr or mon? >> >> Config update by "config set" will be populated to the service. >> Which services are >> supported by this? I know mon, mgr and osd work, but rgw doesn't. >> Is this expected? >> I assume this is similar to "config show", this support needs the >> capability of mgr/mon >> to connect to service daemon? >> >> To get running config from rgw, I always do >> "docker exec ceph daemon config show". >> Is that the only way? I assume it's the same to get running config >> from all services. >> Just the matter of supported by mgr/mon or not? >> >> I've been configuring rgw by configuration file. Is that the >> recommended way? 
>> I tried with configuration db, like "config set", it doesn't seem working. >> Is this expected? >> >> I see two cons with configuration file for rgw. >> 1) When start rgw with cephadm ("orch apply -i "), I have >> to start the daemon >>then update
[ceph-users] Ceph configuration for rgw
Hi, The cluster is Pacific 16.2.10 with containerized services managed by cephadm. "config show" shows the running configuration. Which daemons support it? mon, mgr and osd all work, but rgw doesn't. Is this expected? I tried with the client. prefix and without "client"; neither works. When "config show" is issued, who connects to the daemon and retrieves the running config? Is it mgr or mon? A config update made with "config set" is supposed to be propagated to the service. Which services support this? I know mon, mgr and osd work, but rgw doesn't. Is this expected? I assume this is similar to "config show": it needs the capability of mgr/mon to connect to the service daemon? To get the running config from rgw, I always do "docker exec ceph daemon config show". Is that the only way? I assume it's the same for getting the running config from all services; is it just a matter of whether mgr/mon support the daemon type? I've been configuring rgw by configuration file. Is that the recommended way? I tried with the configuration DB, like "config set", but it doesn't seem to work. Is this expected? I see two cons with a configuration file for rgw. 1) When starting rgw with cephadm ("orch apply -i "), I have to start the daemon, then update the configuration file and restart. I can't find a way to do this in a single step. 2) On "orch daemon redeploy" or upgrade, the configuration file is re-generated and I have to update it again. Is this all how it's supposed to work, or am I missing something? Thanks! Tony ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
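For what it's worth, the two paths discussed in this thread can be sketched as below. This is a sketch only; the container name, fsid and rgw id are made-up examples, and on Pacific the "config show" path for rgw may not work via mgr, which is exactly what the thread observes, so the admin-socket route inside the container is the fallback.

```shell
# Read back what was written to the configuration DB (always works,
# it is just a KV read from the mon config store):
ceph config get client.rgw.myrgw rgw_frontends

# Read the daemon's actual running config via its admin socket inside
# the container (names below are illustrative examples):
docker exec ceph-<fsid>-rgw-myrgw-host1 \
    ceph daemon /var/run/ceph/ceph-client.rgw.myrgw.host1.asok \
    config show | grep rgw_frontends
```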
[ceph-users] host disk used by osd container
Hi, "df -h" on the OSD host shows 187G is being used. "du -sh /" shows 36G. bluefs_buffered_io is enabled here. What's taking that 150G disk space, cache? Then where is that cache file? Any way to configure it smaller? # free -h totalusedfree shared buff/cache available Mem: 187Gi28Gi 4.4Gi 4.1Gi 154Gi 152Gi Swap: 8.0Gi82Mi 7.9Gi # df -h FilesystemSize Used Avail Use% Mounted on devtmpfs 94G 0 94G 0% /dev tmpfs 94G 0 94G 0% /dev/shm tmpfs 94G 4.2G 90G 5% /run tmpfs 94G 0 94G 0% /sys/fs/cgroup /dev/mapper/vg0-root 215G 187G 29G 87% / /dev/sdk2 239M 150M 72M 68% /boot /dev/sdk1 250M 6.9M 243M 3% /boot/efi overlay 215G 187G 29G 87% /var/lib/docker/overlay2/bc4904d8da14dd9ab0fbc49ae60f20ba4a3cbf8f361c0ed13e818e0d65e22531/merged overlay 215G 187G 29G 87% /var/lib/docker/overlay2/617494be5e05d5f91d1d08aad6b6ace8f335a346ca9ea868dc2bc7fd07906901/merged overlay 215G 187G 29G 87% /var/lib/docker/overlay2/3b039d5daffaf212d3384afc30b5bf75353fd215b238101b9bfba4050638eab5/merged overlay 215G 187G 29G 87% /var/lib/docker/overlay2/507c24e65c7cd075b5e1ab4901f8e198263c85265b3e4610606dc3dfd4dad0b5/merged overlay 215G 187G 29G 87% /var/lib/docker/overlay2/6856e867322a73cb1d0e203a2c12f8516bd76fa3866a945b199e477396704f76/merged overlay 215G 187G 29G 87% /var/lib/docker/overlay2/5bb197c41ba584981d1767e377bff84cd13750476a63f26206f58b274d854739/merged overlay 215G 187G 29G 87% /var/lib/docker/overlay2/1c2b16f94ffda06fc277c6906e6df8bd150de16c80a1ba7f113f0774ad8a5de1/merged overlay 215G 187G 29G 87% /var/lib/docker/overlay2/58de1f5b8e3638e94cbc55f02f690937295e8714dfea44f155271df70093a69f/merged overlay 215G 187G 29G 87% /var/lib/docker/overlay2/be5d6ac02ab83436b18c475f43df48732c0b2b5c73732237064631deb2d5243f/merged overlay 215G 187G 29G 87% /var/lib/docker/overlay2/e59810bd48f0667bd3f91dcc65ec1b51227314754dfbcc7ba8dee376bdcd4c0a/merged overlay 215G 187G 29G 87% /var/lib/docker/overlay2/78326ce8e1cf36680eaa56e744b4ea97f1b358adac17eacaf67b88937dd5e876/merged overlay 215G 187G 29G 87% 
/var/lib/docker/overlay2/6a53cf14e33b69c418794514fbd35f5257c553f5a9b0ead62e03b76163112de4/merged overlay 215G 187G 29G 87% /var/lib/docker/overlay2/efec8e5382be117acdbfc81e9d9a9fbc62e289c2a9fcdfa4c53868de50faf420/merged overlay 215G 187G 29G 87% /var/lib/docker/overlay2/eca247de3d54f43372961146b84b485d7c5715d1784afae83e44763717ecf552/merged tmpfs 19G 0 19G 0% /run/user/0 overlay 215G 187G 29G 87% /var/lib/docker/overlay2/bfdf90bdc15c9059d9436caddb1d927788ae9eeff15df631ed150cae966528eb/merged ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
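One note on the numbers above, offered as an assumption rather than a definitive answer: the 154Gi "buff/cache" in `free` is kernel page cache (memory, not disk), which is where bluefs_buffered_io reads land; it has no backing file and is reclaimed under memory pressure. A sketch for inspecting and, if desired, disabling it:

```shell
# Check whether OSDs route BlueFS reads through the kernel page cache:
ceph config get osd bluefs_buffered_io

# The page cache is the reclaimable "buff/cache" column in `free`;
# it can be dropped to verify it is cache, not leaked memory:
sync && echo 1 > /proc/sys/vm/drop_caches

# Disabling buffered I/O shrinks page-cache usage, possibly at a read
# performance cost:
ceph config set osd bluefs_buffered_io false
```

The df-vs-du gap on the root filesystem is a separate, on-disk question (e.g. files held open after deletion, or paths du cannot traverse) and is not explained by the page cache.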
[ceph-users] Re: Ceph 15 and Podman compatability
I hit a couple of issues with Podman when deploying Ceph 16.2. https://www.spinics.net/lists/ceph-users/msg71367.html https://www.spinics.net/lists/ceph-users/msg71346.html After switching back to Docker, all works fine. Thanks! Tony From: Robert Sander Sent: May 19, 2022 05:43 AM To: ceph-users@ceph.io Subject: [ceph-users] Ceph 15 and Podman compatability Hi, the table on https://docs.ceph.com/en/quincy/cephadm/compatibility/ tells us that to run Ceph 15 one needs Podman 2.0 or 2.1 and not 3.0. While planning to upgrade an installation of Ceph 14 on Ubuntu 18 I only found Podman 3 for that distribution version (or any newer Ubuntu version). Is it possible to run current Ceph 15 containers with Podman 3.0 today? The only way I found is to use .deb packages to upgrade to Ceph 15, then upgrade the distribution to Ubuntu 20, upgrade to Ceph 17 (.deb packages available for Ubuntu 20) and then containerize it with Podman 3. Regards -- Robert Sander Heinlein Consulting GmbH Schwedter Str. 8/9b, 10119 Berlin http://www.heinlein-support.de Tel: 030 / 405051-43 Fax: 030 / 405051-19 Zwangsangaben lt. §35a GmbHG: HRB 220009 B / Amtsgericht Berlin-Charlottenburg, Geschäftsführer: Peer Heinlein -- Sitz: Berlin ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] rbd mirror between clusters with private "public" network
Hi, I understand that, for rbd mirror to work, the rbd-mirror service requires connectivity to all nodes in both clusters. In my case, for security purposes, the "public" network is actually a private network, which is not routable externally. All internal RBD clients are on that private network. I also put HAProxy there for accessing the dashboard and radosgw from outside. I wonder if there is any way to use rbd-mirror in this case? Using some sort of proxy? Thanks! Tony ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: the easiest way to copy image to another cluster
Thank you Anthony! I agree that rbd-mirror is more reliable and manageable, and it's not that complicated to use. I will try both and see which works better for me. Tony From: Anthony D'Atri Sent: April 21, 2022 09:02 PM To: Tony Liu Cc: ceph-users@ceph.io; d...@ceph.io Subject: Re: [ceph-users] the easiest way to copy image to another cluster As someone noted, rbd export / import work. I’ve also used rbd-mirror for capacity management; it works well for moving attached as well as unattached images. When using rbd-mirror to move 1-2 images at a time, adjusting the default parameters speeds progress substantially. It’s easy to see when src and dst are synced, then flip primary / secondary, disable mirroring, and rm the src. I’ve used this technique (via an in-house wrapper service) to move hundreds of images. It can even handle snaps, with the right execution. > On Apr 21, 2022, at 8:40 PM, Tony Liu wrote: > > Hi, > > I want to copy an image, which is not being used, to another cluster. > rbd-mirror would do it, but rbd-mirror is designed to handle an image > which is being used/updated, to ensure the mirrored image is always > consistent with the source. I wonder if there is any easier way to copy > an image without worrying about the update/sync, like copying a snapshot > or a backup image. > > > Thanks! > Tony > ___ > ceph-users mailing list -- ceph-users@ceph.io > To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: the easiest way to copy image to another cluster
Thank you Mart! Pipe is indeed easier. I found this blog. Will give it a try. https://machinenix.com/ceph/how-to-export-a-ceph-rbd-image-from-one-cluster-to-another-without-using-a-bridge-server Tony From: Mart van Santen Sent: April 21, 2022 08:52 PM To: Tony Liu Cc: ceph-users@ceph.io; d...@ceph.io Subject: Re: [ceph-users] the easiest way to copy image to another cluster Hi Tony, Have a look at rbd export and rbd import, they dump the image to a file or stdout. You can pipe the rbd export directly into an rbd import assuming you have a host which has access to both ceph clusters. Hope this helps! Mart From mobile > On Apr 22, 2022, at 11:42, Tony Liu wrote: > > Hi, > > I want to copy an image, which is not being used, to another cluster. > rbd-mirror would do it, but rbd-mirror is designed to handle image > which is being used/updated, to ensure the mirrored image is always > consistent with the source. I wonder if there is any easier way to copy > an image without worrying about the update/sync, like copy a snapshot > or a backup image. > > > Thanks! > Tony > ___ > ceph-users mailing list -- ceph-users@ceph.io > To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
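The pipe Mart describes can be run from any host that has conf files and keyrings for both clusters; a sketch, where the cluster names (selecting /etc/ceph/src.conf and /etc/ceph/dst.conf) and image specs are illustrative:

```shell
# Stream the image from the source cluster straight into the destination
# cluster with no intermediate file; --export-format 2 on both ends also
# carries the image's snapshots across.
rbd --cluster src export --export-format 2 pool/image - \
  | rbd --cluster dst import --export-format 2 - pool/image
```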
[ceph-users] the easiest way to copy image to another cluster
Hi, I want to copy an image, which is not being used, to another cluster. rbd-mirror would do it, but rbd-mirror is designed to handle image which is being used/updated, to ensure the mirrored image is always consistent with the source. I wonder if there is any easier way to copy an image without worrying about the update/sync, like copy a snapshot or a backup image. Thanks! Tony ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: logging with container
Thank you Adam! After "orch daemon redeploy", all works as expected. Tony From: Adam King Sent: March 24, 2022 11:50 AM To: Tony Liu Cc: ceph-users@ceph.io; d...@ceph.io Subject: Re: [ceph-users] Re: logging with container Hmm, I'm assuming from "Setting "log_to_stderr" doesn't help" you've already tried all the steps in https://docs.ceph.com/en/latest/cephadm/operations/#disabling-logging-to-journald. That's meant to be the steps for stopping cluster logs from going to the container logs. From my personal testing, just setting the global config options made it work for all the daemons without needing to redeploy or set any of the values at runtime. I verified locally after setting log to file to true as well as the steps in the posted link new logs were getting put in /var/log/ceph//mon.host1 file but the journal had no new logs after when I changed the settings. Perhaps because you've modified the values directly at runtime for the daemons it isn't picking up the set config options as runtime changes override config options? It could be worth trying just redeploying the daemons after having all 6 of the relevant config options set properly. I'll also note that I have been using podman. Not sure if there is some major logging difference between podman and docker. Thanks, - Adam King On Thu, Mar 24, 2022 at 1:00 PM Tony Liu mailto:tonyliu0...@hotmail.com>> wrote: Any comments on this? Thanks! Tony ________ From: Tony Liu mailto:tonyliu0...@hotmail.com>> Sent: March 21, 2022 10:01 PM To: Adam King Cc: ceph-users@ceph.io<mailto:ceph-users@ceph.io>; d...@ceph.io<mailto:d...@ceph.io> Subject: [ceph-users] Re: logging with container Hi Adam, When I do "ceph tell mon.ceph-1 config set log_to_file true", I see the log file is created. That confirms that those options in command line can only be override by runtime config change. Could you check mon and mgr logging on your setup? 
Can we remove those options in command line and let logging to be controlled by cluster configuration or configuration file? Another issue is that, log keeps going to /var/lib/docker/containers//-json.log, which keeps growing up and it's not under logrotate management. How can I stop logging to container stdout/stderr? Setting "log_to_stderr" doesn't help. Thanks! Tony From: Tony Liu mailto:tonyliu0...@hotmail.com>> Sent: March 21, 2022 09:41 PM To: Adam King Cc: ceph-users@ceph.io<mailto:ceph-users@ceph.io>; d...@ceph.io<mailto:d...@ceph.io> Subject: [ceph-users] Re: logging with container Hi Adam, # ceph config get mgr log_to_file true # ceph config get mgr log_file /var/log/ceph/$cluster-$name.log # ceph config get osd log_to_file true # ceph config get osd log_file /var/log/ceph/$cluster-$name.log # ls /var/log/ceph/fa771070-a975-11ec-86c7-e4434be9cb2e/ ceph-osd.10.log ceph-osd.13.log ceph-osd.16.log ceph-osd.19.log ceph-osd.1.log ceph-osd.22.log ceph-osd.4.log ceph-osd.7.log ceph-volume.log # ceph version ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable) "log_to_file" and "log_file" are set the same for mgr and osd, but why there is osd log only, but no mgr log? Thanks! Tony ________ From: Adam King mailto:adk...@redhat.com>> Sent: March 21, 2022 08:26 AM To: Tony Liu Cc: ceph-users@ceph.io<mailto:ceph-users@ceph.io>; d...@ceph.io<mailto:d...@ceph.io> Subject: Re: [ceph-users] logging with container Hi Tony, Afaik those container flags just set the defaults and the config options override them. Setting the necessary flags (https://docs.ceph.com/en/latest/cephadm/operations/#logging-to-files) seemed to work for me. 
[ceph: root@vm-00 /]# ceph config get osd.0 log_to_file false [ceph: root@vm-00 /]# ceph config show osd.0 log_to_file false [ceph: root@vm-00 /]# ceph config set global log_to_file true [ceph: root@vm-00 /]# ceph config set global mon_cluster_log_to_file true [ceph: root@vm-00 /]# ceph config get osd.0 log_to_file true [ceph: root@vm-00 /]# ceph config show osd.0 log_to_file true [ceph: root@vm-00 /]# ceph version ceph version 16.2.7-601-g179a7bca (179a7bca8a84771b0dde09e26f7a2146a985df90) pacific (stable) [ceph: root@vm-00 /]# exit exit [root@vm-00 ~]# ls /var/log/ceph/413f7ec8-a91e-11ec-9b02-52540092b5a3/ ceph.audit.log ceph.cephadm.log ceph.log ceph-mgr.vm-00.ukcctb.log ceph-mon.vm-00.log ceph-osd.0.log ceph-osd.10.log ceph-osd.2.log ceph-osd.4.log ceph-osd.6.log ceph-osd.8.log ceph-volume.log On Mon, Mar 21, 2022 at 1:06 AM Tony Liu mailto:tonyliu0...@hotmail.com><mailto:tonyliu0...@hotmail.com<mailto:tonyliu0...@hotmail.com>>> wrote: Hi, After reading through doc, it's still not very clear to me how logging works wi
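The steps behind the docs link Adam posted boil down to six config options; as a sketch (and per this thread, a daemon redeploy or restart may be needed before the container command-line defaults stop masking them):

```shell
# Send daemon and cluster logs to files under /var/log/ceph/<fsid>/ ...
ceph config set global log_to_file true
ceph config set global mon_cluster_log_to_file true

# ... and stop duplicating them to stderr/journald, which is what ends
# up in `podman logs` / Docker's json-file:
ceph config set global log_to_stderr false
ceph config set global mon_cluster_log_to_stderr false
ceph config set global log_to_journald false
ceph config set global mon_cluster_log_to_journald false
```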
[ceph-users] Re: logging with container
Any comments on this? Thanks! Tony From: Tony Liu Sent: March 21, 2022 10:01 PM To: Adam King Cc: ceph-users@ceph.io; d...@ceph.io Subject: [ceph-users] Re: logging with container Hi Adam, When I do "ceph tell mon.ceph-1 config set log_to_file true", I see the log file is created. That confirms that those options in command line can only be override by runtime config change. Could you check mon and mgr logging on your setup? Can we remove those options in command line and let logging to be controlled by cluster configuration or configuration file? Another issue is that, log keeps going to /var/lib/docker/containers//-json.log, which keeps growing up and it's not under logrotate management. How can I stop logging to container stdout/stderr? Setting "log_to_stderr" doesn't help. Thanks! Tony ________ From: Tony Liu Sent: March 21, 2022 09:41 PM To: Adam King Cc: ceph-users@ceph.io; d...@ceph.io Subject: [ceph-users] Re: logging with container Hi Adam, # ceph config get mgr log_to_file true # ceph config get mgr log_file /var/log/ceph/$cluster-$name.log # ceph config get osd log_to_file true # ceph config get osd log_file /var/log/ceph/$cluster-$name.log # ls /var/log/ceph/fa771070-a975-11ec-86c7-e4434be9cb2e/ ceph-osd.10.log ceph-osd.13.log ceph-osd.16.log ceph-osd.19.log ceph-osd.1.log ceph-osd.22.log ceph-osd.4.log ceph-osd.7.log ceph-volume.log # ceph version ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable) "log_to_file" and "log_file" are set the same for mgr and osd, but why there is osd log only, but no mgr log? Thanks! Tony From: Adam King Sent: March 21, 2022 08:26 AM To: Tony Liu Cc: ceph-users@ceph.io; d...@ceph.io Subject: Re: [ceph-users] logging with container Hi Tony, Afaik those container flags just set the defaults and the config options override them. Setting the necessary flags (https://docs.ceph.com/en/latest/cephadm/operations/#logging-to-files) seemed to work for me. 
[ceph: root@vm-00 /]# ceph config get osd.0 log_to_file false [ceph: root@vm-00 /]# ceph config show osd.0 log_to_file false [ceph: root@vm-00 /]# ceph config set global log_to_file true [ceph: root@vm-00 /]# ceph config set global mon_cluster_log_to_file true [ceph: root@vm-00 /]# ceph config get osd.0 log_to_file true [ceph: root@vm-00 /]# ceph config show osd.0 log_to_file true [ceph: root@vm-00 /]# ceph version ceph version 16.2.7-601-g179a7bca (179a7bca8a84771b0dde09e26f7a2146a985df90) pacific (stable) [ceph: root@vm-00 /]# exit exit [root@vm-00 ~]# ls /var/log/ceph/413f7ec8-a91e-11ec-9b02-52540092b5a3/ ceph.audit.log ceph.cephadm.log ceph.log ceph-mgr.vm-00.ukcctb.log ceph-mon.vm-00.log ceph-osd.0.log ceph-osd.10.log ceph-osd.2.log ceph-osd.4.log ceph-osd.6.log ceph-osd.8.log ceph-volume.log On Mon, Mar 21, 2022 at 1:06 AM Tony Liu mailto:tonyliu0...@hotmail.com>> wrote: Hi, After reading through doc, it's still not very clear to me how logging works with container. This is with Pacific v16.2 container. In OSD container, I see this. ``` /usr/bin/ceph-osd -n osd.16 -f --setuser ceph --setgroup ceph --default-log-to-file=false --default-log-to-stderr=true --default-log-stderr-prefix=debug ``` When check ceph configuration. ``` # ceph config get osd.16 log_file /var/log/ceph/$cluster-$name.log # ceph config get osd.16 log_to_file true # ceph config show osd.16 log_to_file false ``` Q1, what's the intention of those log settings in command line? It's high priority and overrides configuration in file and mon. Is there any option not doing that when deploy the container? Q2, since log_to_file is set to false by command line, why there is still loggings in log_file? The same for mgr and mon. What I want is to have everything in log file and minimize the stdout and stderr from container. Because log file is managed by logrotate, it unlikely blow up disk space. But stdout and stderr from container is stored in a single file, not managed by logrotate. 
It may grow up to huge file. Also, it's easier to check log file by vi than "podman logs". And log file is also collected and stored by ELK for central management. Any comments how I can achieve what I want? Runtime override may not be the best option, cause it's not persistent. Thanks! Tony ___ ceph-users mailing list -- ceph-users@ceph.io<mailto:ceph-users@ceph.io> To unsubscribe send an email to ceph-users-le...@ceph.io<mailto:ceph-users-le...@ceph.io> ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io ___
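Separately from Ceph's own logging options, Docker can cap and rotate the per-container json log itself, which addresses the "single file not managed by logrotate" concern even if some output still reaches stderr. A sketch of /etc/docker/daemon.json (restart dockerd after changing it; the size and count values are arbitrary examples):

```json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "5"
  }
}
```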
[ceph-users] Re: logging with container
Hi Adam, When I do "ceph tell mon.ceph-1 config set log_to_file true", I see the log file is created. That confirms that those options on the command line can only be overridden by a runtime config change. Could you check mon and mgr logging on your setup? Can we remove those options from the command line and let logging be controlled by the cluster configuration or configuration file? Another issue is that the log keeps going to /var/lib/docker/containers//-json.log, which keeps growing and is not under logrotate management. How can I stop logging to container stdout/stderr? Setting "log_to_stderr" doesn't help. Thanks! Tony ________ From: Tony Liu Sent: March 21, 2022 09:41 PM To: Adam King Cc: ceph-users@ceph.io; d...@ceph.io Subject: [ceph-users] Re: logging with container Hi Adam, # ceph config get mgr log_to_file true # ceph config get mgr log_file /var/log/ceph/$cluster-$name.log # ceph config get osd log_to_file true # ceph config get osd log_file /var/log/ceph/$cluster-$name.log # ls /var/log/ceph/fa771070-a975-11ec-86c7-e4434be9cb2e/ ceph-osd.10.log ceph-osd.13.log ceph-osd.16.log ceph-osd.19.log ceph-osd.1.log ceph-osd.22.log ceph-osd.4.log ceph-osd.7.log ceph-volume.log # ceph version ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable) "log_to_file" and "log_file" are set the same for mgr and osd, but why is there only an osd log and no mgr log? Thanks! Tony From: Adam King Sent: March 21, 2022 08:26 AM To: Tony Liu Cc: ceph-users@ceph.io; d...@ceph.io Subject: Re: [ceph-users] logging with container Hi Tony, Afaik those container flags just set the defaults and the config options override them. Setting the necessary flags (https://docs.ceph.com/en/latest/cephadm/operations/#logging-to-files) seemed to work for me. 
[ceph: root@vm-00 /]# ceph config get osd.0 log_to_file false [ceph: root@vm-00 /]# ceph config show osd.0 log_to_file false [ceph: root@vm-00 /]# ceph config set global log_to_file true [ceph: root@vm-00 /]# ceph config set global mon_cluster_log_to_file true [ceph: root@vm-00 /]# ceph config get osd.0 log_to_file true [ceph: root@vm-00 /]# ceph config show osd.0 log_to_file true [ceph: root@vm-00 /]# ceph version ceph version 16.2.7-601-g179a7bca (179a7bca8a84771b0dde09e26f7a2146a985df90) pacific (stable) [ceph: root@vm-00 /]# exit exit [root@vm-00 ~]# ls /var/log/ceph/413f7ec8-a91e-11ec-9b02-52540092b5a3/ ceph.audit.log ceph.cephadm.log ceph.log ceph-mgr.vm-00.ukcctb.log ceph-mon.vm-00.log ceph-osd.0.log ceph-osd.10.log ceph-osd.2.log ceph-osd.4.log ceph-osd.6.log ceph-osd.8.log ceph-volume.log On Mon, Mar 21, 2022 at 1:06 AM Tony Liu mailto:tonyliu0...@hotmail.com>> wrote: Hi, After reading through doc, it's still not very clear to me how logging works with container. This is with Pacific v16.2 container. In OSD container, I see this. ``` /usr/bin/ceph-osd -n osd.16 -f --setuser ceph --setgroup ceph --default-log-to-file=false --default-log-to-stderr=true --default-log-stderr-prefix=debug ``` When check ceph configuration. ``` # ceph config get osd.16 log_file /var/log/ceph/$cluster-$name.log # ceph config get osd.16 log_to_file true # ceph config show osd.16 log_to_file false ``` Q1, what's the intention of those log settings in command line? It's high priority and overrides configuration in file and mon. Is there any option not doing that when deploy the container? Q2, since log_to_file is set to false by command line, why there is still loggings in log_file? The same for mgr and mon. What I want is to have everything in log file and minimize the stdout and stderr from container. Because log file is managed by logrotate, it unlikely blow up disk space. But stdout and stderr from container is stored in a single file, not managed by logrotate. 
It may grow up to huge file. Also, it's easier to check log file by vi than "podman logs". And log file is also collected and stored by ELK for central management. Any comments how I can achieve what I want? Runtime override may not be the best option, cause it's not persistent. Thanks! Tony ___ ceph-users mailing list -- ceph-users@ceph.io<mailto:ceph-users@ceph.io> To unsubscribe send an email to ceph-users-le...@ceph.io<mailto:ceph-users-le...@ceph.io> ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: logging with container
Hi Adam, # ceph config get mgr log_to_file true # ceph config get mgr log_file /var/log/ceph/$cluster-$name.log # ceph config get osd log_to_file true # ceph config get osd log_file /var/log/ceph/$cluster-$name.log # ls /var/log/ceph/fa771070-a975-11ec-86c7-e4434be9cb2e/ ceph-osd.10.log ceph-osd.13.log ceph-osd.16.log ceph-osd.19.log ceph-osd.1.log ceph-osd.22.log ceph-osd.4.log ceph-osd.7.log ceph-volume.log # ceph version ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable) "log_to_file" and "log_file" are set the same for mgr and osd, but why there is osd log only, but no mgr log? Thanks! Tony From: Adam King Sent: March 21, 2022 08:26 AM To: Tony Liu Cc: ceph-users@ceph.io; d...@ceph.io Subject: Re: [ceph-users] logging with container Hi Tony, Afaik those container flags just set the defaults and the config options override them. Setting the necessary flags (https://docs.ceph.com/en/latest/cephadm/operations/#logging-to-files) seemed to work for me. [ceph: root@vm-00 /]# ceph config get osd.0 log_to_file false [ceph: root@vm-00 /]# ceph config show osd.0 log_to_file false [ceph: root@vm-00 /]# ceph config set global log_to_file true [ceph: root@vm-00 /]# ceph config set global mon_cluster_log_to_file true [ceph: root@vm-00 /]# ceph config get osd.0 log_to_file true [ceph: root@vm-00 /]# ceph config show osd.0 log_to_file true [ceph: root@vm-00 /]# ceph version ceph version 16.2.7-601-g179a7bca (179a7bca8a84771b0dde09e26f7a2146a985df90) pacific (stable) [ceph: root@vm-00 /]# exit exit [root@vm-00 ~]# ls /var/log/ceph/413f7ec8-a91e-11ec-9b02-52540092b5a3/ ceph.audit.log ceph.cephadm.log ceph.log ceph-mgr.vm-00.ukcctb.log ceph-mon.vm-00.log ceph-osd.0.log ceph-osd.10.log ceph-osd.2.log ceph-osd.4.log ceph-osd.6.log ceph-osd.8.log ceph-volume.log On Mon, Mar 21, 2022 at 1:06 AM Tony Liu mailto:tonyliu0...@hotmail.com>> wrote: Hi, After reading through doc, it's still not very clear to me how logging works with container. 
This is with Pacific v16.2 container. In OSD container, I see this. ``` /usr/bin/ceph-osd -n osd.16 -f --setuser ceph --setgroup ceph --default-log-to-file=false --default-log-to-stderr=true --default-log-stderr-prefix=debug ``` When check ceph configuration. ``` # ceph config get osd.16 log_file /var/log/ceph/$cluster-$name.log # ceph config get osd.16 log_to_file true # ceph config show osd.16 log_to_file false ``` Q1, what's the intention of those log settings in command line? It's high priority and overrides configuration in file and mon. Is there any option not doing that when deploy the container? Q2, since log_to_file is set to false by command line, why there is still loggings in log_file? The same for mgr and mon. What I want is to have everything in log file and minimize the stdout and stderr from container. Because log file is managed by logrotate, it unlikely blow up disk space. But stdout and stderr from container is stored in a single file, not managed by logrotate. It may grow up to huge file. Also, it's easier to check log file by vi than "podman logs". And log file is also collected and stored by ELK for central management. Any comments how I can achieve what I want? Runtime override may not be the best option, cause it's not persistent. Thanks! Tony ___ ceph-users mailing list -- ceph-users@ceph.io<mailto:ceph-users@ceph.io> To unsubscribe send an email to ceph-users-le...@ceph.io<mailto:ceph-users-le...@ceph.io> ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: bind monitoring service to specific network and port
It's probably related to podman again. After switching back to Docker, this works fine. Thanks! Tony From: Tony Liu Sent: March 20, 2022 06:31 PM To: ceph-users@ceph.io; d...@ceph.io Subject: [ceph-users] bind monitoring service to specific network and port Hi, https://docs.ceph.com/en/pacific/cephadm/services/monitoring/#networks-and-ports When I try that with the Pacific v16.2 image, the port works, but the network doesn't. No matter which network is specified in the yaml file, orch apply always binds the service to *. Is this a known issue or am I missing something? Could anyone point me to the code for this? Thanks! Tony ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: orch apply failed to use insecure private registry
It's a podman issue. https://github.com/containers/podman/issues/11933 Switched back to Docker. Thanks! Tony From: Eugen Block Sent: March 21, 2022 06:11 AM To: ceph-users@ceph.io Subject: [ceph-users] Re: orch apply failed to use insecure private registry Hi, > Setting mgr/cephadm/registry_insecure to false doesn't help. if you want to use an insecure registry you would need to set this option to true, not false. > I am using podman and /etc/containers/registries.conf is set with > that insecure private registry. Can you paste the whole content? It's been two years or so since I tested a setup with an insecure registry, I believe the registries.conf also requires a line with "insecure = true". I'm not sure if this will be enough, though. Did you successfully log in to the registry from all nodes? ceph cephadm registry-login my_url my_username my_password Zitat von Tony Liu : > Hi, > > I am using the Pacific v16.2 container image. I put the images on an insecure > private registry. > I am using podman and /etc/containers/registries.conf is set with > that insecure private registry. > "cephadm bootstrap" works fine to pull the image and set up the first node. > When "ceph orch apply -i service.yaml" deploys services on all > nodes, "ceph log last cephadm" > shows a failure to ping the private registry with SSL. > Setting mgr/cephadm/registry_insecure to false doesn't help. > I have to manually pull all images on all nodes, then "orch apply" > continues and all services are deployed. > Is this a known issue or are there some settings I am missing? > Could anyone point me to the cephadm code that pulls the container image? > > > Thanks! > Tony > ___ > ceph-users mailing list -- ceph-users@ceph.io > To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
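For reference, the v2-format registries.conf entry Eugen alludes to looks roughly like this (the registry host and port are illustrative examples):

```toml
# /etc/containers/registries.conf
[[registry]]
location = "registry.example.com:5000"
insecure = true
```

This needs to be in place on every node that pulls images, not just the bootstrap host.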
[ceph-users] logging with container
Hi, After reading through the docs, it's still not very clear to me how logging works with containers. This is with the Pacific v16.2 container. In the OSD container, I see this. ``` /usr/bin/ceph-osd -n osd.16 -f --setuser ceph --setgroup ceph --default-log-to-file=false --default-log-to-stderr=true --default-log-stderr-prefix=debug ``` When checking the ceph configuration: ``` # ceph config get osd.16 log_file /var/log/ceph/$cluster-$name.log # ceph config get osd.16 log_to_file true # ceph config show osd.16 log_to_file false ``` Q1: what is the intention of those log settings on the command line? They have high priority and override configuration from file and mon. Is there any option to avoid that when deploying the container? Q2: since log_to_file is set to false on the command line, why are there still log entries in log_file? The same applies to mgr and mon. What I want is to have everything in the log file and to minimize stdout and stderr from the container. Because the log file is managed by logrotate, it's unlikely to blow up disk space. But stdout and stderr from the container are stored in a single file, not managed by logrotate, so it may grow into a huge file. Also, it's easier to check a log file with vi than with "podman logs", and the log file is also collected and stored by ELK for central management. Any comments on how I can achieve this? A runtime override may not be the best option, because it's not persistent. Thanks! Tony ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
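One persistent approach, sketched below: store the logging preferences in the mon config database instead of using runtime overrides. The `--default-log-to-*` flags that cephadm puts on the command line set defaults, so values stored with `ceph config set` are expected to take precedence over them (unlike plain defaults); the exact interaction may vary by release, so verify with `ceph config show` afterwards.

```shell
# Persist file logging and quiet the container's stderr stream for all
# daemons, via the centralized config database (survives restarts).
ceph config set global log_to_file true
ceph config set global mon_cluster_log_to_file true
ceph config set global log_to_stderr false
ceph config set global mon_cluster_log_to_stderr false

# Verify what a daemon actually runs with:
ceph config show osd.16 log_to_file
```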
[ceph-users] bind monitoring service to specific network and port
Hi, https://docs.ceph.com/en/pacific/cephadm/services/monitoring/#networks-and-ports When I try that with the Pacific v16.2 image, the port works but the network doesn't. No matter which network is specified in the yaml file, orch apply always binds the service to *. Is this a known issue or something I am missing? Could anyone point me to the code for this? Thanks! Tony ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
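For reference, a service spec following the linked doc's `networks` mechanism might look like the sketch below; the subnet and port are placeholders, and the `networks` field must be a subnet the host actually has an address on, otherwise cephadm falls back to binding on *.

```shell
# Hypothetical prometheus spec; 192.168.100.0/24 and port 9095 are examples.
cat <<EOF | ceph orch apply -i -
service_type: prometheus
placement:
  count: 1
networks:
- 192.168.100.0/24
spec:
  port: 9095
EOF
```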
[ceph-users] orch apply failed to use insecure private registry
Hi, I am using the Pacific v16.2 container image. I put images on an insecure private registry. I am using podman and /etc/containers/registries.conf is set with that insecure private registry. "cephadm bootstrap" works fine to pull the image and set up the first node. When running "ceph orch apply -i service.yaml" to deploy services on all nodes, "ceph log last cephadm" shows the failure to ping the private registry with SSL. Setting mgr/cephadm/registry_insecure to false doesn't help. I have to manually pull all images on all nodes, then "orch apply" continues and all services are deployed. Is this a known issue or am I missing some settings? Could anyone point me to the cephadm code that pulls the container image? Thanks! Tony ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
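As Eugen suggests in the reply above, podman itself needs to be told the registry is insecure, on every host. A registries.conf entry in podman's v2 TOML format could look like this (the registry address is a placeholder):

```shell
# Mark a private registry as insecure for podman on each host.
cat >> /etc/containers/registries.conf <<'EOF'
[[registry]]
location = "registry.example.local:5000"
insecure = true
EOF
```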
[ceph-users] [rgw][dashboard] dashboard can't access rgw behind proxy
Hi, I have 3 rgw services behind HAProxy. rgw-api-host and rgw-api-port are set properly to the VIP and port. "curl http://:" works fine. But the dashboard complains that it can't find an rgw service on that vip:port. If I set rgw-api-host directly to a node, it also works fine. I ran tcpdump on the active mgr node and I don't see any traffic going out to that VIP at all. Am I missing anything here? Does the dashboard need to resolve rgw-api-host to some name? Thanks! Tony ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
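For completeness, the dashboard RGW endpoint settings in the Octopus/Pacific era are set like this (addresses and port are placeholders); if HAProxy terminates TLS with a self-signed certificate, certificate verification also has to be relaxed:

```shell
# Point the dashboard at the RGW endpoint behind the VIP.
ceph dashboard set-rgw-api-host 10.0.0.100
ceph dashboard set-rgw-api-port 8080
# Only needed when the proxy serves a self-signed certificate:
ceph dashboard set-rgw-api-ssl-verify False
```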
[ceph-users] Re: [Ceph-community] Why MON,MDS,MGR are on Public network?
Is there any measurement of how much bandwidth is taken by private traffic vs. public/client traffic when they are on the same network? I currently have two 2x10G bonds for public and private; the intention is to provide 2x10G of bandwidth for clients. I do understand the overhead caused by more networks, but I think it's more critical to guarantee client bandwidth, especially when there is more private traffic during maintenance, rebalancing, etc. Thanks! Tony From: Anthony D'Atri Sent: November 29, 2021 02:14 PM To: ceph-us...@ceph.com Subject: [ceph-users] Re: [Ceph-community] Why MON,MDS,MGR are on Public network? > >> I don't trust the public network and afraid of if mons goes down due to this >> problem? So to be more secure and faster I need to understand the reason; 3- >> Why Mon,Mds,Mgr >should be >> on public network? > Remember that the clients need to reach the mons and any MDS the cluster has. It is not unusual for a separate replication/private/backend network to not have a default route or otherwise be unreachable from non-OSD nodes. > The idea to separate OSD<->OSD traffic probably comes from the fact > that replication means data gets multiplied over the network, so if a > client writes 1G data to a pool with replication=3, then two more > copies of that 1G needs to be sent, and if you do that on the "public" > network, you might starve it with replication (or repair/backfill) > traffic. Indeed, one of the rationales is to prevent client/mon traffic and OSD replication traffic from DoSing each other. > Many run with only one network, using as fast a network as you can > afford, but if two separate networks at moderate speed is cheaper than > one super fast, it might be worth considering, otherwise just scale > the one single network to your needs. Notably sometimes we see nodes with only two network ports. 
One could run separate public/client and private/replication networks, without redundancy — or use bonding / EQR for redundancy but no dedicated replication network. The two-network strategy dates from a time when 1 Gb/s networking was common and 10 Gb/s was cutting edge. With today’s faster networks and Ceph’s multiple improvements in recovery/backfill, the equation and tradeoffs are different from where they were ten years ago. Ceph is pretty good these days at detecting when an entire node is down, and with scrub randomization, reporters settings and a wise mon_osd_down_out_subtree_limit setting, thundering herds of backfill/recovery are much less of a problem than they used to be. Switches, patch panels, crossconnects take up RUs and cost OpEx. Sometimes the RUs saved by not having two networks means you can fit another node or two into each rack. Having a replication network can result in certain flapping situations that can be tricky to troubleshoot, that’s what finally led me to embrace the single network architecture. ymmv. Also when you have five network connections to a given node, it’s super easy during maintenance to not get them plugged back in correctly, no matter how laboriously one labels the cables. (dual public, dual private, BMC/netmgt). Admittedly this probably isn’t a gating factor, but it still happens ;) — aad ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: [EXTERNAL] Re: Why you might want packages not containers for Ceph deployments
Instead of complaining, taking some time to learn more about containers would help. Tony From: Marc Sent: November 18, 2021 10:50 AM To: Pickett, Neale T; Hans van den Bogert; ceph-users@ceph.io Subject: [ceph-users] Re: [EXTERNAL] Re: Why you might want packages not containers for Ceph deployments > We also use containers for ceph and love it. If for some reason we > couldn't run ceph this way any longer, we would probably migrate > everything to a different solution. We are absolutely committed to > containerization. I wonder if you are really using containers. Are you not just using ceph-adm? If you were really using containers you would have selected your OC already, and would be pissed about how the current containers are being developed and having to use a 2nd system. ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Ceph cluster Sync
For PR-DR case, I am using RGW multi-site support to replicate backup image. Tony From: Manuel Holtgrewe Sent: October 12, 2021 11:40 AM To: dhils...@performair.com Cc: mico...@gmail.com; ceph-users Subject: [ceph-users] Re: Ceph cluster Sync To chime in here, there is https://github.com/45Drives/cephgeorep That allows cephfs replication pre pacific. There is a mail thread somewhere on the list where a ceph developer warns about semantics issues of recursive mtime even on pacific. However, according to 45 drives they have never had an issue so YMMD. HTH schrieb am Di., 12. Okt. 2021, 18:55: > Michel; > > I am neither a Ceph evangelist, nor a Ceph expert, but here is my current > understanding: > Ceph clusters do not have in-built cross cluster synchronization. That > said, there are several things which might meet your needs. > > 1) If you're just planning your Ceph deployment, then the latest release > (Pacific) introduced the concept of a stretch cluster, essentially a > cluster which is stretched across datacenters (i.e. a relatively > low-bandwidth, high-latency link)[1]. > > 2) RADOSGW allows for uni-directional as well as bi-directional > synchronization of the data that it handles.[2] > > 3) RBD provides mirroring functionality for the data it handles.[3] > > Thank you, > > Dominic L. Hilsbos, MBA > Vice President - Information Technology > Perform Air International Inc. > dhils...@performair.com > www.PerformAir.com > > [1] https://docs.ceph.com/en/latest/rados/operations/stretch-mode/ > [2] https://docs.ceph.com/en/latest/radosgw/sync-modules/ > [3] https://docs.ceph.com/en/latest/rbd/rbd-mirroring/ > > > -Original Message- > From: Michel Niyoyita [mailto:mico...@gmail.com] > Sent: Tuesday, October 12, 2021 8:35 AM > To: ceph-users > Subject: [ceph-users] Ceph cluster Sync > > Dear team > > I want to build two different cluster: one for primary site and the second > for DR site. 
I would like to ask if these two cluster can > communicate(synchronized) each other and data written to the PR site be > synchronized to the DR site , if once we got trouble for the PR site the > DR automatically takeover. > > Please help me for the solution or advise me how to proceed > > Best Regards > ___ > ceph-users mailing list -- ceph-users@ceph.io > To unsubscribe send an email to ceph-users-le...@ceph.io > > ___ > ceph-users mailing list -- ceph-users@ceph.io > To unsubscribe send an email to ceph-users-le...@ceph.io > ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] etcd support
Hi, I wonder if anyone could share some experience with running etcd on Ceph. My users build Kubernetes clusters in VMs on OpenStack with Ceph. With an HDD volume (DB/WAL on SSD), the etcd performance test sometimes fails because of latency. With an all-SSD volume, it works fine. I wonder if there is anything I can improve for the HDD volume, or does it have to be an SSD volume to support etcd? Thanks! Tony ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
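etcd's disk requirement is essentially write-ahead-log fsync latency, so a quick way to compare the HDD-backed and SSD-backed volumes from inside a VM is to measure small-write-plus-fdatasync latency directly. A minimal sketch (the record size and iteration count are arbitrary choices, not etcd's exact parameters):

```python
import os
import statistics
import tempfile
import time

def fdatasync_latencies(n=200, size=2048):
    """Write a small record and fdatasync after each one,
    roughly the I/O pattern of etcd's write-ahead log."""
    fd, path = tempfile.mkstemp()
    try:
        lats = []
        for _ in range(n):
            os.write(fd, os.urandom(size))
            t0 = time.perf_counter()
            os.fdatasync(fd)
            lats.append(time.perf_counter() - t0)
        return lats
    finally:
        os.close(fd)
        os.unlink(path)

if __name__ == "__main__":
    lats = sorted(fdatasync_latencies())
    p99 = lats[int(len(lats) * 0.99) - 1]
    print(f"median {statistics.median(lats) * 1000:.2f} ms, "
          f"p99 {p99 * 1000:.2f} ms")
```

Run it inside the test file's directory on each volume type; a p99 well into the tens of milliseconds on the HDD volume would explain the failing etcd tests.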
[ceph-users] Re: Cannot create a container, mandatory "Storage Policy" dropdown field is empty
Update /usr/lib/python3.6/site-packages/swiftclient/client.py and restart the horizon container. This fixes the error on the dashboard when it tries to retrieve the policy list. -parsed = urlparse(urljoin(url, '/info')) +parsed = urlparse(urljoin(url, '/swift/info')) Tony From: Michel Niyoyita Sent: September 13, 2021 01:08 AM To: ceph-users Subject: [ceph-users] Cannot create a container, mandatory "Storage Policy" dropdown field is empty Hello team, I am replacing Swift with Ceph RADOS Gateway, and I can successfully create containers through the OpenStack and Ceph CLIs, but when trying to create one through the horizon dashboard I get errors: *Error:* Unable to fetch the policy details., Unable to get the Swift service info, and Unable to get the Swift container listing. Has anyone faced the same issue and can help? I deployed OpenStack Wallaby using kolla-ansible running on Ubuntu 20.04 and Ceph Pacific using ansible running on Ubuntu 20.04. Michel ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: debug RBD timeout issue
Here it is. [global] fsid = 35d050c0-77c0-11eb-9242-2cea7ff9d07c mon_host = [v2:10.250.50.80:3300/0,v1:10.250.50.80:6789/0] [v2:10.250.50.81:3300/0,v1:10.250.50.81:6789/0] [v2:10.250.50.82:3300/0,v1:10.250.50.82:6789/0] Thanks! Tony From: Konstantin Shalygin Sent: September 8, 2021 08:29 AM To: Tony Liu Cc: ceph-users@ceph.io; d...@ceph.io Subject: Re: [ceph-users] debug RBD timeout issue What is ceph.conf for this rbd client? k Sent from my iPhone > On 7 Sep 2021, at 19:54, Tony Liu wrote: > > > I have OpenStack Ussuri and Ceph Octopus. Sometimes, I see timeout when create > or delete volumes. I can see RBD timeout from cinder-volume. Has anyone seen > such > issue? I'd like to see what happens on Ceph. Which service should I look > into? Is it stuck > with mon or any OSD? Any option to enable debugging to get more details? ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: debug RBD timeout issue
That's what I am trying to figure out: "what exactly could cause a timeout". The user creates 10 VMs (boot-from-volume plus an attached volume) with Terraform, then destroys them. Repeating the same sequence works fine most of the time; the timeout happens occasionally at different places, during volume creation or volume deletion. Since Terraform manages resources in parallel (10 by default), I'm not sure if it matters how cinder-volume handles those requests. I doubt I can reproduce it with rbd directly. I will enable debug logging in cinder-volume to get more info. In the meantime, I wonder how I can get more info from Ceph to understand such a timeout better. Thanks! Tony From: Eugen Block Sent: September 8, 2021 01:05 AM To: ceph-users@ceph.io Subject: [ceph-users] Re: debug RBD timeout issue Hi, from an older cloud version I remember having to increase these settings: [DEFAULT] block_device_allocate_retries = 300 block_device_allocate_retries_interval = 10 block_device_creation_timeout = 300 The question is what exactly could cause a timeout. You write that you only see these timeouts from time to time, then you should try to find out what the difference is between successful and failing volumes. Is it the size or anything else? Which glance stores are enabled? Can you reproduce it, for example 'rbd create...' with the cinder user? Then you could increase 'debug_rbd' and see if that reveals anything. Zitat von Tony Liu : > Hi, > > I have OpenStack Ussuri and Ceph Octopus. Sometimes, I see timeout > when create > or delete volumes. I can see RBD timeout from cinder-volume. Has > anyone seen such > issue? I'd like to see what happens on Ceph. Which service should I > look into? Is it stuck > with mon or any OSD? Any option to enable debugging to get more details? 
> > oslo_messaging.rpc.server [req-7802dea8-15f6-4177-b07c-e5241615b777 > d0dddad1fc7a4adf8ef5b185567e1842 b9adeeb6dbd54710a0b033ee49045b54 - > default default] Exception during message handling: rbd.Timeout: > [errno 110] error removing image > oslo_messaging.rpc.server Traceback (most recent call last): > oslo_messaging.rpc.server File > "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/server.py", > line 165, in _process_incoming > oslo_messaging.rpc.server res = self.dispatcher.dispatch(message) > oslo_messaging.rpc.server File > "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", > line 276, in dispatch > oslo_messaging.rpc.server return self._do_dispatch(endpoint, > method, ctxt, args) > oslo_messaging.rpc.server File > "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", > line 196, in _do_dispatch > oslo_messaging.rpc.server result = func(ctxt, **new_args) > oslo_messaging.rpc.server File > "", > line 2, in delete_volume > oslo_messaging.rpc.server File > "/usr/lib/python3.6/site-packages/cinder/coordination.py", line 151, > in _synchronized > oslo_messaging.rpc.server return f(*a, **k) > oslo_messaging.rpc.server File > "", > line 2, in delete_volume > oslo_messaging.rpc.server File > "/usr/lib/python3.6/site-packages/cinder/objects/cleanable.py", line > 212, in wrapper > oslo_messaging.rpc.server result = f(*args, **kwargs) > oslo_messaging.rpc.server File > "/usr/lib/python3.6/site-packages/cinder/volume/manager.py", line > 917, in delete_volume > oslo_messaging.rpc.server new_status) > oslo_messaging.rpc.server File > "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, > in __exit__ > oslo_messaging.rpc.server self.force_reraise() > oslo_messaging.rpc.server File > "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, > in force_reraise > oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb) > oslo_messaging.rpc.server File > "/usr/lib/python3.6/site-packages/six.py", 
line 703, in reraise > oslo_messaging.rpc.server raise value > oslo_messaging.rpc.server File > "/usr/lib/python3.6/site-packages/cinder/volume/manager.py", line > 899, in delete_volume > oslo_messaging.rpc.server self.driver.delete_volume(volume) > oslo_messaging.rpc.server File > "/usr/lib/python3.6/site-packages/cinder/volume/drivers/rbd.py", > line 1160, in delete_volume > oslo_messaging.rpc.server _try_remove_volume(client, volume_name) > oslo_messaging.rpc.server File > "/usr/lib/python3.6/site-packages/cinder/utils.py", line 696, in > _wrapper > oslo_messaging.rpc.server return r.call(f, *args, **kwargs) > oslo_messaging.rpc.server File > "/usr/lib/python3.6/site-packages/retrying.py", line 223, in call > oslo_messaging.r
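On the Ceph side, client (librbd/librados) debug logging can be enabled in the ceph.conf that the cinder-volume process reads — a sketch; the log path and debug levels are examples, and level 20 is very verbose:

```shell
# Enable client-side librbd/librados debug logging for cinder-volume.
cat >> /etc/ceph/ceph.conf <<'EOF'
[client]
debug rbd = 20
debug rados = 20
log file = /var/log/ceph/client.$name.$pid.log
EOF
```

With this in place, the client log should show whether the hang is in mon communication or in I/O to a specific OSD before the rbd.Timeout fires.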
[ceph-users] debug RBD timeout issue
Hi, I have OpenStack Ussuri and Ceph Octopus. Sometimes, I see timeout when create or delete volumes. I can see RBD timeout from cinder-volume. Has anyone seen such issue? I'd like to see what happens on Ceph. Which service should I look into? Is it stuck with mon or any OSD? Any option to enable debugging to get more details? oslo_messaging.rpc.server [req-7802dea8-15f6-4177-b07c-e5241615b777 d0dddad1fc7a4adf8ef5b185567e1842 b9adeeb6dbd54710a0b033ee49045b54 - default default] Exception during message handling: rbd.Timeout: [errno 110] error removing image oslo_messaging.rpc.server Traceback (most recent call last): oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/server.py", line 165, in _process_incoming oslo_messaging.rpc.server res = self.dispatcher.dispatch(message) oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", line 276, in dispatch oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args) oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", line 196, in _do_dispatch oslo_messaging.rpc.server result = func(ctxt, **new_args) oslo_messaging.rpc.server File "", line 2, in delete_volume oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/cinder/coordination.py", line 151, in _synchronized oslo_messaging.rpc.server return f(*a, **k) oslo_messaging.rpc.server File "", line 2, in delete_volume oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/cinder/objects/cleanable.py", line 212, in wrapper oslo_messaging.rpc.server result = f(*args, **kwargs) oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/cinder/volume/manager.py", line 917, in delete_volume oslo_messaging.rpc.server new_status) oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__ oslo_messaging.rpc.server self.force_reraise() oslo_messaging.rpc.server 
File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb) oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise oslo_messaging.rpc.server raise value oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/cinder/volume/manager.py", line 899, in delete_volume oslo_messaging.rpc.server self.driver.delete_volume(volume) oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/cinder/volume/drivers/rbd.py", line 1160, in delete_volume oslo_messaging.rpc.server _try_remove_volume(client, volume_name) oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/cinder/utils.py", line 696, in _wrapper oslo_messaging.rpc.server return r.call(f, *args, **kwargs) oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/retrying.py", line 223, in call oslo_messaging.rpc.server return attempt.get(self._wrap_exception) oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/retrying.py", line 261, in get oslo_messaging.rpc.server six.reraise(self.value[0], self.value[1], self.value[2]) oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise oslo_messaging.rpc.server raise value oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/retrying.py", line 217, in call oslo_messaging.rpc.server attempt = Attempt(fn(*args, **kwargs), attempt_number, False) oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/cinder/volume/drivers/rbd.py", line 1155, in _try_remove_volume oslo_messaging.rpc.server self.RBDProxy().remove(client.ioctx, volume_name) oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/eventlet/tpool.py", line 190, in doit oslo_messaging.rpc.server result = proxy_call(self._autowrap, f, *args, **kwargs) oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/eventlet/tpool.py", line 148, in proxy_call 
oslo_messaging.rpc.server rv = execute(f, *args, **kwargs) oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/eventlet/tpool.py", line 129, in execute oslo_messaging.rpc.server six.reraise(c, e, tb) oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise oslo_messaging.rpc.server raise value oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/eventlet/tpool.py", line 83, in tworker oslo_messaging.rpc.server rv = meth(*args, **kwargs) oslo_messaging.rpc.server File "rbd.pyx", line 1283, in rbd.RBD.remove oslo_messaging.rpc.server rbd.Timeout: [errno 110] error removing image Thanks! Tony ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: rbd object mapping
Thank you Konstantin! Tony From: Konstantin Shalygin Sent: August 9, 2021 01:20 AM To: Tony Liu Cc: ceph-users; d...@ceph.io Subject: Re: [ceph-users] rbd object mapping On 8 Aug 2021, at 20:10, Tony Liu mailto:tonyliu0...@hotmail.com>> wrote: That's what I thought. I am confused by this. # ceph osd map vm fcb09c9c-4cd9-44d8-a20b-8961c6eedf8e_disk osdmap e18381 pool 'vm' (4) object 'fcb09c9c-4cd9-44d8-a20b-8961c6eedf8e_disk' -> pg 4.c7a78d40 (4.0) -> up ([4,17,6], p4) acting ([4,17,6], p4) It calls RBD image "object" and it shows the whole image maps to a single PG, while the image is actually split into many objects each of which maps to a PG. How am I supposed to understand the output of this command? You can execute `ceph osd map vm nonexist` and you will see mapping for 'nonexist' object. Future mapping... To achieve mappings for each object of your image, you need to find all objects by rbd_header and iterate over this list. k ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: rbd object mapping
>> There are two types of "object", RBD-image-object and 8MiB-block-object. >> When create a RBD image, a RBD-image-object is created and 12800 >> 8MiB-block-objects >> are allocated. That whole RBD-image-object is mapped to a single PG, which >> is mapped >> to 3 OSDs (replica 3). That means, all user data on that RBD image is stored >> in those >> 3 OSDs. Is my understanding correct? > > RBD image is not a object, is a bunch of objects as block device abstraction. > Nope, each object of image may be placed to pseudo random placement. For > example if you > have 1 osds and 10GiB image with 4MiB objects your image may be placed to > 2560 > different PGs on 100-1000-2560 OSDs... That's what I thought. I am confused by this. # ceph osd map vm fcb09c9c-4cd9-44d8-a20b-8961c6eedf8e_disk osdmap e18381 pool 'vm' (4) object 'fcb09c9c-4cd9-44d8-a20b-8961c6eedf8e_disk' -> pg 4.c7a78d40 (4.0) -> up ([4,17,6], p4) acting ([4,17,6], p4) It calls RBD image "object" and it shows the whole image maps to a single PG, while the image is actually split into many objects each of which maps to a PG. How am I supposed to understand the output of this command? Thanks! Tony ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: rbd object mapping
There are two types of "object": the RBD-image-object and 8MiB-block-objects. When creating an RBD image, a RBD-image-object is created and 12800 8MiB-block-objects are allocated. That whole RBD-image-object is mapped to a single PG, which is mapped to 3 OSDs (replica 3). That means all user data on that RBD image is stored on those 3 OSDs. Is my understanding correct? I doubt it because, for example, in a Ceph cluster with a bunch of 2TB drives, the user wouldn't be able to create an RBD image bigger than 2TB. I don't believe that's true. So, what am I missing here? Thanks! Tony From: Konstantin Shalygin Sent: August 7, 2021 11:35 AM To: Tony Liu Cc: ceph-users; d...@ceph.io Subject: Re: [ceph-users] rbd object mapping The object map shows where an object with any object name will be placed in the defined pool with your crush map, and which OSDs will serve that PG. You can type any object name and see the future placement, or the placement of an existing object - that's how the algorithm works. 12800 means that your 100GiB image consists of 12800 objects of 8 MiB in pool vm. All these objects are prefixed with the rbd header (block_name_prefix seems to be the modern name for this). Cheers, k On 7 Aug 2021, at 21:27, Tony Liu mailto:tonyliu0...@hotmail.com>> wrote: This shows one RBD image is treated as one object, and it's mapped to one PG. "object" here means a RBD image. # ceph osd map vm fcb09c9c-4cd9-44d8-a20b-8961c6eedf8e_disk osdmap e18381 pool 'vm' (4) object 'fcb09c9c-4cd9-44d8-a20b-8961c6eedf8e_disk' -> pg 4.c7a78d40 (4.0) -> up ([4,17,6], p4) acting ([4,17,6], p4) When showing the info of this image, what does "12800 objects" mean? And what does "order 23 (8 MiB objects)" mean? What are "objects" here? 
# rbd info vm/fcb09c9c-4cd9-44d8-a20b-8961c6eedf8e_disk rbd image 'fcb09c9c-4cd9-44d8-a20b-8961c6eedf8e_disk': size 100 GiB in 12800 objects order 23 (8 MiB objects) snapshot_count: 0 id: affa8fb94beb7e block_name_prefix: rbd_data.affa8fb94beb7e format: 2 ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] rbd object mapping
Hi, This shows that one RBD image is treated as one object, and it's mapped to one PG; "object" here appears to mean the RBD image. # ceph osd map vm fcb09c9c-4cd9-44d8-a20b-8961c6eedf8e_disk osdmap e18381 pool 'vm' (4) object 'fcb09c9c-4cd9-44d8-a20b-8961c6eedf8e_disk' -> pg 4.c7a78d40 (4.0) -> up ([4,17,6], p4) acting ([4,17,6], p4) When showing the info of this image, what does "12800 objects" mean? And what does "order 23 (8 MiB objects)" mean? What are "objects" here? # rbd info vm/fcb09c9c-4cd9-44d8-a20b-8961c6eedf8e_disk rbd image 'fcb09c9c-4cd9-44d8-a20b-8961c6eedf8e_disk': size 100 GiB in 12800 objects order 23 (8 MiB objects) snapshot_count: 0 id: affa8fb94beb7e block_name_prefix: rbd_data.affa8fb94beb7e format: 2 Thanks! Tony ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
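The arithmetic behind the "rbd info" output in this thread can be sketched as follows: order 23 means 2**23-byte (8 MiB) RADOS objects, data objects are named block_name_prefix plus a zero-padded hex index, and each such object (not the whole image) is placed on a PG independently; "ceph osd map" simply hashes whatever name you give it, existing or not.

```python
# Reproduce the numbers from the "rbd info" output above.
GiB = 2**30
size = 100 * GiB
order = 23
obj_size = 2**order                 # 8 MiB per RADOS object
num_objects = size // obj_size      # matches "in 12800 objects"

prefix = "rbd_data.affa8fb94beb7e"  # block_name_prefix from rbd info

def object_name(byte_offset):
    """RADOS object that holds the given byte offset of the image."""
    return f"{prefix}.{byte_offset // obj_size:016x}"

print(num_objects)            # 12800
print(object_name(0))         # rbd_data.affa8fb94beb7e.0000000000000000
print(object_name(size - 1))  # rbd_data.affa8fb94beb7e.00000000000031ff
```

Each of those 12800 names, passed to "ceph osd map vm <name>", would map to its own PG — which is why the image's data spreads across many OSDs, not just three.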
[ceph-users] Re: [cinder-backup][ceph] replicate volume between sites
Found a way to import a volume [1]. Will validate it. Still looking into object store multi-site vs. RBD mirror for replicating volumes between sites. Any comments are welcome. [1] https://docs.openstack.org/python-cinderclient/latest/cli/details.html#cinder-manage Thanks! Tony From: Tony Liu Sent: July 30, 2021 09:16 PM To: openstack-discuss; ceph-users Subject: [ceph-users] [cinder-backup][ceph] replicate volume between sites Hi, I have two sites with OpenStack Victoria deployed by Kolla and Ceph Octopus deployed by cephadm. As far as I know, either Swift (implemented by RADOSGW) or RBD is supported as the backend for cinder-backup. My intention is to use one of those options to replicate Cinder volumes from one site to another, based on RADOSGW multi-site support or RBD mirroring. I wonder if anyone has done this and could share some opinions: pros, cons, which way is better for which case? One specific thing I am not clear about is how to import an RBD volume into Cinder when it gets to the other site. There used to be a way, but it was deprecated. Any comments are appreciated. Thanks! Tony ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
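Based on the cinder-manage link above, adopting an existing RBD image into Cinder on the DR site could look roughly like the sketch below; the backend host string and the image name are placeholders, and by default the identifier is interpreted as a source name:

```shell
# Hypothetical: adopt an already-mirrored RBD image as a Cinder volume.
cinder manage --name restored-volume \
    cinder-volume-host@rbd-backend#rbd-backend \
    volume-f0408e1e-06b6-437b-a2b5-70e3751d0a26
```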
[ceph-users] [cinder-backup][ceph] replicate volume between sites
Hi, I have two sites with OpenStack Victoria deployed by Kolla and Ceph Octopus deployed by cephadm. As far as I know, either Swift (implemented by RADOSGW) or RBD is supported as the backend for cinder-backup. My intention is to use one of those options to replicate Cinder volumes from one site to another, based on RADOSGW multi-site support or RBD mirroring. I wonder if anyone has done this and could share some opinions: pros, cons, which way is better for which case? One specific thing I am not clear about is how to import an RBD volume into Cinder when it gets to the other site. There used to be a way, but it was deprecated. Any comments are appreciated. Thanks! Tony ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Do I need to update ceph.conf and restart each OSD after adding more MONs?
Thank you Stefan and Josh! Tony From: Josh Baergen Sent: March 28, 2021 08:28 PM To: Tony Liu Cc: ceph-users@ceph.io Subject: Re: [ceph-users] Re: Do I need to update ceph.conf and restart each OSD after adding more MONs? As was mentioned in this thread, all of the mon clients (OSDs included) learn about other mons through monmaps, which are distributed when mon membership and election changes. Thus, your OSDs should already know about the new mons. mon_host indicates the list of mons that mon clients should try to contact at boot. Thus, it's important to have correct in the config but doesn't need to be updated after the process starts. At least that's how I understand it; the config docs aren't terribly clear on this behaviour. Josh On Sat., Mar. 27, 2021, 2:07 p.m. Tony Liu, mailto:tonyliu0...@hotmail.com>> wrote: Just realized that all config files (/var/lib/ceph///config) on all nodes are already updated properly. It must be handled as part of adding MONs. But "ceph config show" shows only single host. mon_host [v2:10.250.50.80:3300/0,v1:10.250.50.80:6789/0<http://10.250.50.80:3300/0,v1:10.250.50.80:6789/0>] file That means I still need to restart all services to apply the update, right? Is this supposed to be part of adding MONs as well, or additional manual step? Thanks! Tony ________ From: Tony Liu mailto:tonyliu0...@hotmail.com>> Sent: March 27, 2021 12:53 PM To: Stefan Kooman; ceph-users@ceph.io<mailto:ceph-users@ceph.io> Subject: [ceph-users] Re: Do I need to update ceph.conf and restart each OSD after adding more MONs? 
# ceph config set osd.0 mon_host [v2:10.250.50.80:3300/0,v1:10.250.50.80:6789/0,v2:10.250.50.81:3300/0,v1:10.250.50.81:6789/0,v2:10.250.50.82:3300/0,v1:10.250.50.82:6789/0] Error EINVAL: mon_host is special and cannot be stored by the mon It seems that the only option is to update ceph.conf and restart the service. Tony ________ From: Tony Liu <tonyliu0...@hotmail.com> Sent: March 27, 2021 12:20 PM To: Stefan Kooman; ceph-users@ceph.io Subject: [ceph-users] Re: Do I need to update ceph.conf and restart each OSD after adding more MONs? I expanded MON from 1 to 3 by updating the orch service "ceph orch apply". "mon_host" in all services (MON, MGR, OSDs) is not updated. It's still a single host from source "file". What's the guidance here to update "mon_host" for all services? I am talking about Ceph services, not the client side. Should I update ceph.conf for all services and restart all of them? Or can I update it on-the-fly by "ceph config set"? In the latter case, where is the updated configuration stored? Is it going to be overridden by ceph.conf when the service restarts? Thanks! Tony ____ From: Stefan Kooman <ste...@bit.nl> Sent: March 26, 2021 12:22 PM To: Tony Liu; ceph-users@ceph.io Subject: Re: [ceph-users] Do I need to update ceph.conf and restart each OSD after adding more MONs? On 3/26/21 6:06 PM, Tony Liu wrote: > Hi, > > Do I need to update ceph.conf and restart each OSD after adding more MONs? This should not be necessary, as the OSDs should learn about these changes through monmaps. Updating the ceph.conf after the mons have been updated is advised. > This is with 15.2.8 deployed by cephadm. > > When adding MON, "mon_host" should be updated accordingly. 
> Given [1], is that update "the monitor cluster’s centralized configuration > database" or "runtime overrides set by an administrator"? No need to put that in the centralized config database. I *think* they mean ceph.conf file on the clients and hosts. At least, that's what you would normally do (if not using DNS). Gr. Stefan ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: memory consumption by osd
I don't see any problems yet. All OSDs are working fine. Just that 1.8GB free memory concerns me. I know 256GB memory for 10 OSDs (16TB HDD) is a lot, I am planning to reduce it or increase osd_memory_target (if that's what you meant) to boost performance. But before doing that, I'd like to understand what's taking so much buff/cache and if there is any option to control it. Thanks! Tony From: Anthony D'Atri Sent: March 27, 2021 07:27 PM To: ceph-users Subject: [ceph-users] Re: memory consumption by osd Depending on your kernel version, MemFree can be misleading. Attend to the value of MemAvailable instead. Your OSDs all look to be well below the target, I wouldn’t think you have any problems. In fact 256GB for just 10 OSDs is an embarrassment of riches. What type of drives are you using, and what’s the cluster used for? If anything I might advise *raising* the target. You might check tcmalloc usage https://ceph-devel.vger.kernel.narkive.com/tYp0KkIT/ceph-daemon-memory-utilization-heap-release-drops-use-by-50 but I doubt this is an issue for you. > What's taking that much buffer? > # free -h > total used free shared buff/cache available > Mem: 251Gi 31Gi 1.8Gi 1.6Gi 217Gi > 215Gi > > # cat /proc/meminfo > MemTotal: 263454780 kB > MemFree: 2212484 kB > MemAvailable: 226842848 kB > Buffers: 219061308 kB > Cached: 2066532 kB > SwapCached: 928 kB > Active: 142272648 kB > Inactive: 109641772 kB > .. > > > Thanks! > Tony > > From: Tony Liu > Sent: March 27, 2021 01:25 PM > To: ceph-users > Subject: [ceph-users] memory consumption by osd > > Hi, > > Here is a snippet from top on a node with 10 OSDs. > === > MiB Mem : 257280.1 total, 2070.1 free, 31881.7 used, 223328.3 buff/cache > MiB Swap: 128000.0 total, 126754.7 free, 1245.3 used. 
221608.0 avail Mem > > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > 30492 167 20 0 4483384 2.9g 16696 S 6.0 1.2 707:05.25 ceph-osd > 35396 167 20 0 952 2.8g 16468 S 5.0 1.1 815:58.52 ceph-osd > 33488 167 20 0 4161872 2.8g 16580 S 4.7 1.1 496:07.94 ceph-osd > 36371 167 20 0 4387792 3.0g 16748 S 4.3 1.2 762:37.64 ceph-osd > 39185 167 20 0 5108244 3.1g 16576 S 4.0 1.2 998:06.73 ceph-osd > 38729 167 20 0 4748292 2.8g 16580 S 3.3 1.1 895:03.67 ceph-osd > 34439 167 20 0 4492312 2.8g 16796 S 2.0 1.1 921:55.50 ceph-osd > 31473 167 20 0 4314500 2.9g 16684 S 1.3 1.2 680:48.09 ceph-osd > 32495 167 20 0 4294196 2.8g 16552 S 1.0 1.1 545:14.53 ceph-osd > 37230 167 20 0 4586020 2.7g 16620 S 1.0 1.1 844:12.23 ceph-osd > === > Does it look OK with 2GB free? > I can't tell how that 220GB is used for buffer/cache. > Is that used by OSDs? Is it controlled by configuration or auto scaling based > on physical memory? Any clarifications would be helpful. > > > Thanks! > Tony > ___ > ceph-users mailing list -- ceph-users@ceph.io > To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: memory consumption by osd
Restarting the OSD frees buff/cache memory. What kind of data is there? Is there any configuration to control this memory allocation? Thanks! Tony From: Tony Liu Sent: March 27, 2021 06:10 PM To: ceph-users Subject: [ceph-users] Re: memory consumption by osd To clarify, to avoid the PG log taking too much memory, I already set osd_max_pg_log_entries from the default 10000 to 1000. I checked PG log sizes. They are all under 1100. ceph pg dump -f json | jq '.pg_map.pg_stats[]' | grep ondisk_log_size I also checked each OSD. The total is only a few hundred MB. ceph daemon osd. dump_mempools And osd_memory_target stays at the default 4GB. What's taking that much buffer? # free -h total used free shared buff/cache available Mem: 251Gi 31Gi 1.8Gi 1.6Gi 217Gi 215Gi # cat /proc/meminfo MemTotal: 263454780 kB MemFree: 2212484 kB MemAvailable: 226842848 kB Buffers: 219061308 kB Cached: 2066532 kB SwapCached: 928 kB Active: 142272648 kB Inactive: 109641772 kB .. Thanks! Tony From: Tony Liu Sent: March 27, 2021 01:25 PM To: ceph-users Subject: [ceph-users] memory consumption by osd Hi, Here is a snippet from top on a node with 10 OSDs. === MiB Mem : 257280.1 total, 2070.1 free, 31881.7 used, 223328.3 buff/cache MiB Swap: 128000.0 total, 126754.7 free, 1245.3 used. 221608.0 avail Mem PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 30492 167 20 0 4483384 2.9g 16696 S 6.0 1.2 707:05.25 ceph-osd 35396 167 20 0 952 2.8g 16468 S 5.0 1.1 815:58.52 ceph-osd 33488 167 20 0 4161872 2.8g 16580 S 4.7 1.1 496:07.94 ceph-osd 36371 167 20 0 4387792 3.0g 16748 S 4.3 1.2 762:37.64 ceph-osd 39185 167 20 0 5108244 3.1g 16576 S 4.0 1.2 998:06.73 ceph-osd 38729 167 20 0 4748292 2.8g 16580 S 3.3 1.1 895:03.67 ceph-osd 34439 167 20 0 4492312 2.8g 16796 S 2.0 1.1 921:55.50 ceph-osd 31473 167 20 0 4314500 2.9g 16684 S 1.3 1.2 680:48.09 ceph-osd 32495 167 20 0 4294196 2.8g 16552 S 1.0 1.1 545:14.53 ceph-osd 37230 167 20 0 4586020 2.7g 16620 S 1.0 1.1 844:12.23 ceph-osd === Does it look OK with 2GB free? 
I can't tell how that 220GB is used for buffer/cache. Is that used by OSDs? Is it controlled by configuration or auto scaling based on physical memory? Any clarifications would be helpful. Thanks! Tony ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: memory consumption by osd
To clarify, to avoid the PG log taking too much memory, I already set osd_max_pg_log_entries from the default 10000 to 1000. I checked PG log sizes. They are all under 1100. ceph pg dump -f json | jq '.pg_map.pg_stats[]' | grep ondisk_log_size I also checked each OSD. The total is only a few hundred MB. ceph daemon osd. dump_mempools And osd_memory_target stays at the default 4GB. What's taking that much buffer? # free -h total used free shared buff/cache available Mem: 251Gi 31Gi 1.8Gi 1.6Gi 217Gi 215Gi # cat /proc/meminfo MemTotal: 263454780 kB MemFree: 2212484 kB MemAvailable: 226842848 kB Buffers: 219061308 kB Cached: 2066532 kB SwapCached: 928 kB Active: 142272648 kB Inactive: 109641772 kB .. Thanks! Tony From: Tony Liu Sent: March 27, 2021 01:25 PM To: ceph-users Subject: [ceph-users] memory consumption by osd Hi, Here is a snippet from top on a node with 10 OSDs. === MiB Mem : 257280.1 total, 2070.1 free, 31881.7 used, 223328.3 buff/cache MiB Swap: 128000.0 total, 126754.7 free, 1245.3 used. 221608.0 avail Mem PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 30492 167 20 0 4483384 2.9g 16696 S 6.0 1.2 707:05.25 ceph-osd 35396 167 20 0 952 2.8g 16468 S 5.0 1.1 815:58.52 ceph-osd 33488 167 20 0 4161872 2.8g 16580 S 4.7 1.1 496:07.94 ceph-osd 36371 167 20 0 4387792 3.0g 16748 S 4.3 1.2 762:37.64 ceph-osd 39185 167 20 0 5108244 3.1g 16576 S 4.0 1.2 998:06.73 ceph-osd 38729 167 20 0 4748292 2.8g 16580 S 3.3 1.1 895:03.67 ceph-osd 34439 167 20 0 4492312 2.8g 16796 S 2.0 1.1 921:55.50 ceph-osd 31473 167 20 0 4314500 2.9g 16684 S 1.3 1.2 680:48.09 ceph-osd 32495 167 20 0 4294196 2.8g 16552 S 1.0 1.1 545:14.53 ceph-osd 37230 167 20 0 4586020 2.7g 16620 S 1.0 1.1 844:12.23 ceph-osd === Does it look OK with 2GB free? I can't tell how that 220GB is used for buffer/cache. Is that used by OSDs? Is it controlled by configuration or auto scaling based on physical memory? Any clarifications would be helpful. Thanks! 
Tony ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] memory consumption by osd
Hi, Here is a snippet from top on a node with 10 OSDs. === MiB Mem : 257280.1 total, 2070.1 free, 31881.7 used, 223328.3 buff/cache MiB Swap: 128000.0 total, 126754.7 free, 1245.3 used. 221608.0 avail Mem PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 30492 167 20 0 4483384 2.9g 16696 S 6.0 1.2 707:05.25 ceph-osd 35396 167 20 0 952 2.8g 16468 S 5.0 1.1 815:58.52 ceph-osd 33488 167 20 0 4161872 2.8g 16580 S 4.7 1.1 496:07.94 ceph-osd 36371 167 20 0 4387792 3.0g 16748 S 4.3 1.2 762:37.64 ceph-osd 39185 167 20 0 5108244 3.1g 16576 S 4.0 1.2 998:06.73 ceph-osd 38729 167 20 0 4748292 2.8g 16580 S 3.3 1.1 895:03.67 ceph-osd 34439 167 20 0 4492312 2.8g 16796 S 2.0 1.1 921:55.50 ceph-osd 31473 167 20 0 4314500 2.9g 16684 S 1.3 1.2 680:48.09 ceph-osd 32495 167 20 0 4294196 2.8g 16552 S 1.0 1.1 545:14.53 ceph-osd 37230 167 20 0 4586020 2.7g 16620 S 1.0 1.1 844:12.23 ceph-osd === Does it look OK with 2GB free? I can't tell how that 220GB is used for buffer/cache. Is that used by OSDs? Is it controlled by configuration or auto scaling based on physical memory? Any clarifications would be helpful. Thanks! Tony ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
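To answer my own question partly: the split in /proc/meminfo is the key. Almost all of the buff/cache is in Buffers (the kernel's page cache for block devices, here the OSD disks), which is reclaimed on demand and which MemAvailable already counts as usable. A small sketch using the numbers quoted in this thread (saved to a sample file so the arithmetic is reproducible):

```shell
# Recompute the picture from the meminfo values quoted above: Buffers
# dominates buff/cache, and MemAvailable (not MemFree) is the realistic
# "free" figure.
cat > /tmp/meminfo.sample <<'EOF'
MemTotal:       263454780 kB
MemFree:          2212484 kB
MemAvailable:   226842848 kB
Buffers:        219061308 kB
Cached:           2066532 kB
EOF
awk '/^(MemFree|MemAvailable|Buffers):/ {printf "%s %.1f GiB\n", $1, $2/1048576}' /tmp/meminfo.sample
# MemFree: 2.1 GiB
# MemAvailable: 216.3 GiB
# Buffers: 208.9 GiB
```

So the ~220GB is block-device page cache, not memory held by the ceph-osd processes themselves (their RES is ~3GB each, matching osd_memory_target).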
[ceph-users] Re: Do I need to update ceph.conf and restart each OSD after adding more MONs?
Just realized that all config files (/var/lib/ceph///config) on all nodes are already updated properly. It must be handled as part of adding MONs. But "ceph config show" shows only single host. mon_host [v2:10.250.50.80:3300/0,v1:10.250.50.80:6789/0] file That means I still need to restart all services to apply the update, right? Is this supposed to be part of adding MONs as well, or additional manual step? Thanks! Tony ____ From: Tony Liu Sent: March 27, 2021 12:53 PM To: Stefan Kooman; ceph-users@ceph.io Subject: [ceph-users] Re: Do I need to update ceph.conf and restart each OSD after adding more MONs? # ceph config set osd.0 mon_host [v2:10.250.50.80:3300/0,v1:10.250.50.80:6789/0,v2:10.250.50.81:3300/0,v1:10.250.50.81:6789/0,v2:10.250.50.82:3300/0,v1:10.250.50.82:6789/0] Error EINVAL: mon_host is special and cannot be stored by the mon It seems that the only option is to update ceph.conf and restart service. Tony ____ From: Tony Liu Sent: March 27, 2021 12:20 PM To: Stefan Kooman; ceph-users@ceph.io Subject: [ceph-users] Re: Do I need to update ceph.conf and restart each OSD after adding more MONs? I expanded MON from 1 to 3 by updating orch service "ceph orch apply". "mon_host" in all services (MON, MGR, OSDs) is not updated. It's still single host from source "file". What's the guidance here to update "mon_host" for all services? I am talking about Ceph services, not client side. Should I update ceph.conf for all services and restart all of them? Or I can update it on-the-fly by "ceph config set"? In the latter case, where the updated configuration is stored? Is it going to be overridden by ceph.conf when restart service? Thanks! Tony From: Stefan Kooman Sent: March 26, 2021 12:22 PM To: Tony Liu; ceph-users@ceph.io Subject: Re: [ceph-users] Do I need to update ceph.conf and restart each OSD after adding more MONs? On 3/26/21 6:06 PM, Tony Liu wrote: > Hi, > > Do I need to update ceph.conf and restart each OSD after adding more MONs? 
This should not be necessary, as the OSDs should learn about these changes through monmaps. Updating the ceph.conf after the mons have been updated is advised. > This is with 15.2.8 deployed by cephadm. > > When adding MON, "mon_host" should be updated accordingly. > Given [1], is that update "the monitor cluster’s centralized configuration > database" or "runtime overrides set by an administrator"? No need to put that in the centralized config database. I *think* they mean ceph.conf file on the clients and hosts. At least, that's what you would normally do (if not using DNS). Gr. Stefan ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Do I need to update ceph.conf and restart each OSD after adding more MONs?
# ceph config set osd.0 mon_host [v2:10.250.50.80:3300/0,v1:10.250.50.80:6789/0,v2:10.250.50.81:3300/0,v1:10.250.50.81:6789/0,v2:10.250.50.82:3300/0,v1:10.250.50.82:6789/0] Error EINVAL: mon_host is special and cannot be stored by the mon It seems that the only option is to update ceph.conf and restart service. Tony From: Tony Liu Sent: March 27, 2021 12:20 PM To: Stefan Kooman; ceph-users@ceph.io Subject: [ceph-users] Re: Do I need to update ceph.conf and restart each OSD after adding more MONs? I expanded MON from 1 to 3 by updating orch service "ceph orch apply". "mon_host" in all services (MON, MGR, OSDs) is not updated. It's still single host from source "file". What's the guidance here to update "mon_host" for all services? I am talking about Ceph services, not client side. Should I update ceph.conf for all services and restart all of them? Or I can update it on-the-fly by "ceph config set"? In the latter case, where the updated configuration is stored? Is it going to be overridden by ceph.conf when restart service? Thanks! Tony From: Stefan Kooman Sent: March 26, 2021 12:22 PM To: Tony Liu; ceph-users@ceph.io Subject: Re: [ceph-users] Do I need to update ceph.conf and restart each OSD after adding more MONs? On 3/26/21 6:06 PM, Tony Liu wrote: > Hi, > > Do I need to update ceph.conf and restart each OSD after adding more MONs? This should not be necessary, as the OSDs should learn about these changes through monmaps. Updating the ceph.conf after the mons have been updated is advised. > This is with 15.2.8 deployed by cephadm. > > When adding MON, "mon_host" should be updated accordingly. > Given [1], is that update "the monitor cluster’s centralized configuration > database" or "runtime overrides set by an administrator"? No need to put that in the centralized config database. I *think* they mean ceph.conf file on the clients and hosts. At least, that's what you would normally do (if not using DNS). Gr. 
Stefan ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Do I need to update ceph.conf and restart each OSD after adding more MONs?
I expanded MON from 1 to 3 by updating orch service "ceph orch apply". "mon_host" in all services (MON, MGR, OSDs) is not updated. It's still single host from source "file". What's the guidance here to update "mon_host" for all services? I am talking about Ceph services, not client side. Should I update ceph.conf for all services and restart all of them? Or I can update it on-the-fly by "ceph config set"? In the latter case, where the updated configuration is stored? Is it going to be overridden by ceph.conf when restart service? Thanks! Tony From: Stefan Kooman Sent: March 26, 2021 12:22 PM To: Tony Liu; ceph-users@ceph.io Subject: Re: [ceph-users] Do I need to update ceph.conf and restart each OSD after adding more MONs? On 3/26/21 6:06 PM, Tony Liu wrote: > Hi, > > Do I need to update ceph.conf and restart each OSD after adding more MONs? This should not be necessary, as the OSDs should learn about these changes through monmaps. Updating the ceph.conf after the mons have been updated is advised. > This is with 15.2.8 deployed by cephadm. > > When adding MON, "mon_host" should be updated accordingly. > Given [1], is that update "the monitor cluster’s centralized configuration > database" or "runtime overrides set by an administrator"? No need to put that in the centralized config database. I *think* they mean ceph.conf file on the clients and hosts. At least, that's what you would normally do (if not using DNS). Gr. Stefan ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Do I need to update ceph.conf and restart each OSD after adding more MONs?
Hi, Do I need to update ceph.conf and restart each OSD after adding more MONs? This is with 15.2.8 deployed by cephadm. When adding MON, "mon_host" should be updated accordingly. Given [1], is that update "the monitor cluster’s centralized configuration database" or "runtime overrides set by an administrator"? [1] https://docs.ceph.com/en/latest/rados/configuration/ceph-conf/#config-sources Thanks! Tony ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
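A quick way to check this behaviour on a live cluster, assuming a standard cephadm deployment (osd.0 is just an example daemon name):

```shell
# Running daemons track mons through the monmap, not ceph.conf:
ceph mon dump                    # the current monmap epoch should list every mon
# mon_host only matters at daemon startup; "ceph config show" also reports
# where the value came from (a "file" source means the local ceph.conf):
ceph config show osd.0 mon_host
```

If `ceph mon dump` already lists the new mons, the daemons know about them even while their mon_host from "file" still shows the old single host.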
[ceph-users] Re: ceph octopus mysterious OSD crash
Are you sure the OSD is with DB/WAL on SSD? Tony From: Philip Brown Sent: March 19, 2021 02:49 PM To: Eugen Block Cc: ceph-users Subject: [ceph-users] Re: [BULK] Re: Re: ceph octopus mysterious OSD crash Wow. My expectations have been adjusted. Thank you for detailing your experience, so I had motivation to try again. Explicit steps I took: 1. went into "cephadm shell" and did a vgremove on the HDD 2. ceph-volume zap /dev/(hdd) 3. lvremove (the matching old lv). This meant that the VG on the SSD had 25% space available. At this point, "ceph-volume inventory" shows the HDD as "available=True", but the shared SSD as false. 4. on my actual admin node, "ceph orch apply osd -i osd.deployspec.yml" and after a few minutes... it DID actually pick up the disk and make the OSD. (I had previously "ceph osd rm"'d the id, so it used the prior ID) SO... there's still the concern about why the thing mysteriously crashed in the first place :-/ (on TWO osd's!) But at least I know how to rebuild a single disk. - Original Message - From: "Eugen Block" To: "Stefan Kooman" Cc: "ceph-users" , "Philip Brown" Sent: Friday, March 19, 2021 2:19:55 PM Subject: [BULK] Re: [ceph-users] Re: ceph octopus mysterious OSD crash I am quite sure that this case is covered by cephadm already. A few months ago I tested it after a major rework of ceph-volume. I don’t have any links right now. But I had a lab environment with multiple OSDs per node with rocksDB on SSD and after wiping both HDD and DB LV cephadm automatically redeployed the OSD according to my drive group file. Zitat von Stefan Kooman : > On 3/19/21 7:47 PM, Philip Brown wrote: > > I see. > >> >> I dont think it works when 7/8 devices are already configured, and >> the SSD is already mostly sliced. > > OK. If it is a test cluster you might just blow it all away. By > doing this you are simulating an "SSD" failure taking down all HDDs > with it. It sure isn't pretty. 
I would say the situation you ended > up with is not a corner case by any means. I am afraid I would > really need to set up a test cluster with cephadm to help you > further at this point, besides the suggestion above. > > Gr. Stefan > ___ > ceph-users mailing list -- ceph-users@ceph.io > To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
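Condensing Philip's numbered steps into commands, as a sketch only: the OSD id (12), device path, and VG/LV names below are placeholders for whatever the failed OSD actually used:

```shell
# 1-3. Inside "cephadm shell": wipe the failed HDD and drop its stale DB LV
#      from the shared SSD's VG, freeing that slice for the rebuild.
ceph-volume lvm zap /dev/sdX --destroy
lvremove ceph-db-vg/osd-12-db
# 4. On the admin node: re-apply the drive group; cephadm picks up the now
#    "available" HDD and rebuilds the OSD, reusing the id freed earlier
#    with "ceph osd rm 12".
ceph orch apply osd -i osd.deployspec.yml
```

Note that `ceph-volume lvm zap --destroy` removes the VG/LV on the wiped device, which covers the vgremove from step 1.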
[ceph-users] Re: ceph orch daemon add , separate db
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/EC45YMDJZD3T6TQINGM222H2H4RZABJ4/ From: Philip Brown Sent: March 19, 2021 08:59 AM To: ceph-users Subject: [ceph-users] ceph orch daemon add , separate db I was having difficulty doing this myself, and I came across this semi-recent thread: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/T4R76XJN2NE442GQJ5P2KRJN6HXPMYKL/ " I've tried adding OSDs with ceph orch daemon add ... but it's pretty limited. ...you can't [have] a separate db device. " Has this been fixed yet? Is it GOING to be fixed? -- Philip Brown| Sr. Linux System Administrator | Medata, Inc. 5 Peters Canyon Rd Suite 250 Irvine CA 92606 Office 714.918.1310| Fax 714.918.1325 pbr...@medata.com| www.medata.com ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Networking Idea/Question
"but you may see significant performance improvement with a second "cluster" network in a large cluster." "does not usually have a significant impact on overall performance." The above two statements look contradictory to me and cause confusion. What's the purpose of the "cluster" network, simply increasing total bandwidth, or some kind of isolation? For example, 1 network on 1 bonding with 2 x 40GB ports vs. 2 networks on 2 bondings, each with 2 x 20GB ports. They have the same total bandwidth of 80GB, so they will support the same performance, right? Thanks! Tony > -Original Message- > From: Andrew Walker-Brown > Sent: Tuesday, March 16, 2021 9:18 AM > To: Tony Liu ; Stefan Kooman ; > Dave Hall ; ceph-users > Subject: RE: [ceph-users] Re: Networking Idea/Question > > https://docs.ceph.com/en/latest/rados/configuration/network-config-ref/ > > > > Sent from Mail for > Windows 10 > > > > From: Tony Liu <tonyliu0...@hotmail.com> > Sent: 16 March 2021 16:16 > To: Stefan Kooman <ste...@bit.nl> ; Dave Hall > <kdh...@binghamton.edu> ; ceph-users <ceph-users@ceph.io> > Subject: [ceph-users] Re: Networking Idea/Question > > > > > -Original Message- > > From: Stefan Kooman > > Sent: Tuesday, March 16, 2021 4:10 AM > > To: Dave Hall ; ceph-users > > Subject: [ceph-users] Re: Networking Idea/Question > > > > On 3/15/21 5:34 PM, Dave Hall wrote: > > > Hello, > > > > > > If anybody out there has tried this or thought about it, I'd like to > > > know... > > > > > > I've been thinking about ways to squeeze as much performance as > > > possible from the NICs on a Ceph OSD node. The nodes in our > cluster > > > (6 x OSD, 3 x MGR/MON/MDS/RGW) currently have 2 x 10GB ports. > > > Currently, one port is assigned to the front-side network, and one > to > > > the back-side network. However, there are times when the traffic on > > > one side or the other is more intense and might benefit from a bit > > more bandwidth. 
> > > > What is (are) the reason(s) to choose a separate cluster and public > > network? > > That used to be the recommendation to separate client traffic and > cluster traffic. I heard it's not true any more as the latest. > It would be good if someone can point to the right link of such > recommendation. > > > Thanks! > Tony > > ___ > ceph-users mailing list -- ceph-users@ceph.io > To unsubscribe send an email to ceph-users-le...@ceph.io > > ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Networking Idea/Question
> -Original Message- > From: Stefan Kooman > Sent: Tuesday, March 16, 2021 4:10 AM > To: Dave Hall ; ceph-users > Subject: [ceph-users] Re: Networking Idea/Question > > On 3/15/21 5:34 PM, Dave Hall wrote: > > Hello, > > > > If anybody out there has tried this or thought about it, I'd like to > > know... > > > > I've been thinking about ways to squeeze as much performance as > > possible from the NICs on a Ceph OSD node. The nodes in our cluster > > (6 x OSD, 3 x MGR/MON/MDS/RGW) currently have 2 x 10GB ports. > > Currently, one port is assigned to the front-side network, and one to > > the back-side network. However, there are times when the traffic on > > one side or the other is more intense and might benefit from a bit > more bandwidth. > > What is (are) the reason(s) to choose a separate cluster and public > network? That used to be the recommendation, to separate client traffic from cluster traffic. I've heard that's no longer the case in the latest guidance. It would be good if someone could point to the right link for such a recommendation. Thanks! Tony ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: ceph orch and mixed SSD/rotating disks
It may help if you could share how you added those OSDs. This guide works for me. https://docs.ceph.com/en/latest/cephadm/drivegroups/ Tony From: Philip Brown Sent: February 17, 2021 09:30 PM To: ceph-users Subject: [ceph-users] ceph orch and mixed SSD/rotating disks I'm coming back to trying mixed SSD+spinning disks after maybe a year. It was my vague recollection, that if you told ceph "go auto configure all the disks", it would actually automatically carve up the SSDs into the appropriate number of LVM segments, and use them as WAL devices for each hdd based OSD on the system. Was I wrong? Because when I tried to bring up a brand new cluster (Octopus, cephadm bootstrapped), with multiple nodes and multiple disks per node... it seemed to bring up the SSDS as just another set of OSDs. it clearly recognized them as ssd. The output of "ceph orch device ls" showed them as ssd vs hdd for the others. It just...didnt use them as I expected. ? Maybe I was thinking of ceph ansible. Is there not a nice way to do this with the new cephadm based "ceph orch"? I would rather not have to go write json files or whatever by hand, when a computer should be perfectly capable of auto generating this stuff itself -- Philip Brown| Sr. Linux System Administrator | Medata, Inc. 5 Peters Canyon Rd Suite 250 Irvine CA 92606 Office 714.918.1310| Fax 714.918.1325 pbr...@medata.com| www.medata.com ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
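For reference, a minimal drive-group spec of the kind that guide describes, which makes cephadm put data on rotational devices and carve the SSDs into DB volumes. The service id, host pattern, and filters here are illustrative, not taken from Philip's cluster:

```shell
# Write a drive-group spec and apply it; cephadm then matches devices on
# every host covered by the placement and builds the OSDs itself.
cat > osd-spec.yml <<'EOF'
service_type: osd
service_id: hdd-data-ssd-db
placement:
  host_pattern: '*'
spec:
  data_devices:
    rotational: 1      # HDDs become OSD data devices
  db_devices:
    rotational: 0      # SSDs are split into block.db LVs
EOF
ceph orch apply osd -i osd-spec.yml
```

With plain "apply all-available-devices" there is no such pairing, which matches what you saw: every disk, SSDs included, becomes its own OSD.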
[ceph-users] can't remove osd service by "ceph orch rm "
Hi, This is with v15.2 and v15.2.8. Once an OSD service is applied, it can't be removed. It always shows up from "ceph orch ls". "ceph orch rm " only marks it "unmanaged" but does not actually remove it. Is this expected? Thanks! Tony ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
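What I ran, roughly (the service name "osd.osd-spec" is an example):

```shell
ceph orch ls osd            # the applied spec is listed, e.g. osd.osd-spec
ceph orch rm osd.osd-spec   # only flips the service to unmanaged...
ceph orch ls osd            # ...it, and its daemons, still show up here
```

So "rm" stops cephadm from creating new OSDs from the spec, but the existing service entry and daemons remain until the OSDs themselves are removed.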
[ceph-users] Re: Is replacing OSD whose data is on HDD and DB is on SSD supported?
To update and close this thread, what I am looking for is not supported yet. "ceph-volume lvm batch" requires a clean device. It doesn't work to reuse a DB LV or create a new DB LV. Followed https://tracker.ceph.com/issues/46691 with "ceph-volume lvm prepare" to make this work. Thanks! Tony ________ From: Tony Liu Sent: February 14, 2021 02:01 PM To: ceph-users@ceph.io; dev Subject: [ceph-users] Is replacing OSD whose data is on HDD and DB is on SSD supported? Hi, I've been trying with v15.2 and v15.2.8, no luck. Wondering if this is actually supported or ever worked for anyone? Here is what I've done. 1) Create a cluster with 1 controller (mon and mgr) and 3 OSD nodes, each of which is with 1 SSD for DB and 8 HDDs for data. 2) OSD service spec.
service_type: osd
service_id: osd-spec
placement:
  hosts:
  - ceph-osd-1
  - ceph-osd-2
  - ceph-osd-3
spec:
  block_db_size: 92341796864
  data_devices:
    model: ST16000NM010G
  db_devices:
    model: KPM5XRUG960G
3) Add OSD hosts and apply OSD service spec. 8 OSDs (data on HDD and DB on SSD) are created on each host properly. 4) Run "orch osd rm 1 --replace --force". OSD is marked "destroyed" and reweight is set to 0 in "osd tree". "pg dump" shows no PG on that OSD. "orch ps" shows no daemon running for that OSD. 5) Run "orch device zap ". VG and LV for HDD are removed. LV for DB stays. "orch device ls" shows HDD device is available. 6) Cephadm finds OSD claims and applies OSD spec on the host. Here is the message. cephadm [INF] Found osd claims -> {'ceph-osd-1': ['1']} cephadm [INF] Found osd claims for drivegroup osd-spec -> {'ceph-osd-1': ['1']} cephadm [INF] Applying osd-spec on host ceph-osd-1... cephadm [INF] Applying osd-spec on host ceph-osd-2... cephadm [INF] Applying osd-spec on host ceph-osd-3... 
cephadm [INF] ceph-osd-1: lvm batch --no-auto /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj --db-devices /dev/sdb --block-db-size 92341796864 --osd-ids 1 --yes --no-systemd code: 0 out: [''] err: ['/bin/docker:stderr --> passed data devices: 8 physical, 0 LVM', '/bin/docker:stderr --> relative data size: 1.0', '/bin/docker:stderr --> passed block_db devices: 1 physical, 0 LVM', '/bin/docker:stderr --> 1 fast devices were passed, but none are available'] Q1. Is DB LV on SSD supposed to be deleted or not, when replacing an OSD whose data is on HDD and DB is on SSD? Q2. If yes from Q1, is a new DB LV supposed to be created on SSD as long as there is sufficient free space, when building the new OSD? Q3. If no from Q1, since it's replacing, is the old DB LV going to be reused for the new OSD? Again, is this actually supposed to work? Am I missing anything or just trying on some unsupported feature? Thanks! Tony ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
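For anyone hitting the same wall, the shape of the workaround from the tracker issue, as I understand it; the VG/LV names and OSD id are placeholders:

```shell
# "lvm batch" refuses the half-used SSD ("1 fast devices were passed, but
# none are available"), but "lvm prepare" accepts an existing DB LV and a
# destroyed OSD id directly:
ceph-volume lvm prepare --osd-id 1 --data /dev/sdc \
    --block.db ceph-db-vg/osd-1-db
```

This reuses the surviving DB LV rather than asking ceph-volume to slice the SSD again, which is exactly the case batch mode cannot handle.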
[ceph-users] Re: reinstalling node with orchestrator/cephadm
Never mind, the OSD daemon shows up in "orch ps" after a while. Thanks! Tony ____ From: Tony Liu Sent: February 14, 2021 09:47 PM To: Kenneth Waegeman; ceph-users Subject: [ceph-users] Re: reinstalling node with orchestrator/cephadm I followed https://tracker.ceph.com/issues/46691 to bring up the OSD. "ceph osd tree" shows it's up. "ceph pg dump" shows PGs are remapped. How can I make cephadm aware of it (so it shows up in "ceph orch ps")? Because "ceph status" complains "1 stray daemons(s) not managed by cephadm". Thanks! Tony From: Kenneth Waegeman Sent: February 12, 2021 05:14 AM To: ceph-users Subject: [ceph-users] Re: reinstalling node with orchestrator/cephadm On 08/02/2021 16:52, Kenneth Waegeman wrote: > Hi Eugen, all, > > Thanks for sharing your results! Since we have multiple clusters and > clusters with +500 OSDs, this solution is not feasible for us. > > In the meantime I created an issue for this : > > https://tracker.ceph.com/issues/49159 Hi all, For those who would have same/similar issues/questions, the ticket has been updated. It actually breaks down into two parts: - ceph-volume documentation (https://docs.ceph.com/en/latest/ceph-volume/lvm/activate/#activate) notes that activate means: 'This activation process enables a systemd unit that persists the OSD ID and its UUID (also called fsid in Ceph CLI tools), so that at boot time it can understand what OSD is enabled and needs to be mounted.' -> This is not true/does not work for use with cephadm, ceph-volume can't make the osd directories/files like unit.run (yet) for osds that should run with cephadm - there is yet no way (documented) that existing OSD disks could be discovered by cephadm/ceph orch on reinstalling a node like it used to be with running ceph-volume activate --all. 
The workaround I see for now is running ceph-volume activate --all, followed by:

for id in `ls -1 /var/lib/ceph/osd`; do echo cephadm adopt --style legacy --name ${id/ceph-/osd.}; done

This removes the ceph-volume units again and creates the cephadm ones :)

As pointed out by Sebastian Wagner: 'Please verify that the container image used is consistent across the cluster after running the adoption process.'

And thanks @Sebastian for making 'cephadm ceph-volume activate' a feature request!

Kenneth

> We would need this especially to migrate/reinstall all our clusters to
> Rhel8 (without destroying/recreating all osd disks), so I really hope
> there is another solution :)
>
> Thanks again!
>
> Kenneth
>
> On 05/02/2021 16:11, Eugen Block wrote:
>> Hi Kenneth,
>>
>> I managed to succeed with this just now. It's a lab environment and
>> the OSDs are not encrypted, but I was able to get the OSDs up again.
>> The ceph-volume commands also worked (just activation didn't), so I
>> had the required information about those OSDs.
>>
>> What I did was:
>> - collect the OSD data (fsid, keyring)
>> - create directories for osd daemons under /var/lib/ceph/<cluster-uuid>/osd.<id>
>>   - note that the directory with the ceph uuid already existed, since
>>     the crash container had been created after bringing the node back
>>     into the cluster
>> - create the content for that OSD by copying the required files
>>   from a different host, and change the contents of:
>>   - fsid
>>   - keyring
>>   - whoami
>>   - unit.run
>>   - unit.poststop
>> - create the symlinks to the OSD devices:
>>   - ln -s /dev/ceph-<vg>/osd-block-<uuid> block
>>   - ln -s /dev/ceph-<vg>/osd-block-<uuid> block.db
>> - change ownership to ceph:
>>   - chown -R ceph.ceph /var/lib/ceph/<cluster-uuid>/osd.<id>/
>> - start the systemd unit:
>>   - systemctl start ceph-<cluster-uuid>@osd.<id>.service
>>
>> I repeated this for all OSDs on that host; now all OSDs are online
>> and the cluster is happy. I'm not sure what else is necessary in case
>> of encrypted OSDs, but maybe this procedure helps you.
>> I don't know if there's a smoother or even automated way, I don't
>> think there currently is. Maybe someone is working on it though.
>>
>> Regards,
>> Eugen
>>
>> Zitat von Kenneth Waegeman:
>>
>>> Hi all,
>>>
>>> I'm running a 15.2.8 cluster using ceph orch with all daemons
>>> adopted to cephadm.
>>>
>>> I tried reinstalling an OSD node. Is there a way to make ceph
>>> orch/cephadm activate the devices on this node again, ideally
>>> automatically?
>>>
>>> I tried running `cephadm ceph-volume -- lvm activate --all` but this
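Kenneth's adoption loop quoted above can be written out as a sketch. Assumption: legacy ceph-volume OSD mounts live under /var/lib/ceph/osd as ceph-<id> directories; the echo keeps this a dry run.

```shell
# Sketch of the workaround quoted above (assumption: legacy ceph-volume
# OSD directories are named ceph-<id> under /var/lib/ceph/osd).
# The echo keeps this a dry run; drop it once the printed commands look right.
for id in $(ls -1 /var/lib/ceph/osd 2>/dev/null); do
    # Bash substitution: "${id/ceph-/osd.}" turns e.g. "ceph-3" into "osd.3",
    # the daemon name cephadm expects.
    echo cephadm adopt --style legacy --name "${id/ceph-/osd.}"
done
```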
[ceph-users] Re: share haproxy config for radosgw [EXT]
You can use BGP-ECMP to multiple HAProxy instances for active-active mode, instead of keepalived for active-backup mode, if the amount of traffic does require multiple HAProxy instances.

Tony

From: Graham Allan
Sent: February 14, 2021 01:31 PM
To: Matthew Vernon
Cc: ceph-users
Subject: [ceph-users] Re: share haproxy config for radosgw [EXT]

On Tue, Feb 9, 2021 at 11:00 AM Matthew Vernon wrote:
> On 07/02/2021 22:19, Marc wrote:
> >
> > I was wondering if someone could post a config for haproxy. Is there
> something specific to configure? Like binding clients to a specific backend
> server, client timeouts, security specific to rgw etc.
>
> Ours is templated out by ceph-ansible; to try and condense out just the
> interesting bits:
>
> (snipped the config...)
>
> The aim is to use all available CPU on the RGWs at peak load, but to
> also try and prevent one user overwhelming the service for everyone else
> - hence the dropping of idle connections and soft (and then hard) limits
> on per-IP connections.

Can I ask a followup question to this: how many haproxy instances do you then run - one on each of your gateways, with keepalived to manage which is active?

I ask because, since before I was involved with our ceph object store, it has been load-balanced between multiple rgw servers directly using bgp-ecmp. It doesn't sound like this is common practice in the ceph community, and I'm wondering what the pros and cons are.

The bgp-ecmp load balancing has the flaw that it's not truly fault tolerant, at least without additional checks to shut down the local quagga instance if rgw isn't responding - it's only fault tolerant in the case of an entire server going down. That meets our original goal of rolling maintenance/updates, but not the case of a radosgw process going unresponsive.
In addition, I think we have always seen some background level of clients being sent "connection reset by peer" errors, which I have never tracked down within radosgw; I wonder if these might be masked by an HAProxy frontend?

The converse is that all client gateway traffic must generally pass through a single HAProxy instance, while bgp-ecmp distributes the connections across all nodes. Perhaps HAProxy is lightweight and efficient enough that this makes little difference to performance?

Graham
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
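For reference, a minimal HAProxy sketch of the setup discussed in this thread. Hedged: the names, addresses, ports, and health-check path below are placeholders, not either poster's actual config. The httpchk health check is the piece BGP-ECMP alone lacks: it ejects a radosgw that is up but unresponsive.

```
frontend rgw_front
    mode http
    bind *:80
    timeout client 30s
    default_backend rgw_back

backend rgw_back
    mode http
    balance leastconn
    # Any cheap radosgw endpoint works as a health check; this path is
    # an example, adjust to your deployment.
    option httpchk GET /
    timeout server 30s
    server rgw1 192.0.2.11:8080 check inter 2s fall 3 rise 2
    server rgw2 192.0.2.12:8080 check inter 2s fall 3 rise 2
```

With checks in place, a hung radosgw process is marked down after "fall" consecutive failures and traffic shifts to the remaining servers, which addresses the unresponsive-process case described above.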
[ceph-users] Is replacing OSD whose data is on HDD and DB is on SSD supported?
Hi,

I've been trying with v15.2 and v15.2.8, no luck. Wondering if this is actually supported, or has ever worked for anyone? Here is what I've done.

1) Create a cluster with 1 controller (mon and mgr) and 3 OSD nodes, each of which has 1 SSD for DB and 8 HDDs for data.

2) OSD service spec:

service_type: osd
service_id: osd-spec
placement:
  hosts:
  - ceph-osd-1
  - ceph-osd-2
  - ceph-osd-3
spec:
  block_db_size: 92341796864
  data_devices:
    model: ST16000NM010G
  db_devices:
    model: KPM5XRUG960G

3) Add the OSD hosts and apply the OSD service spec. 8 OSDs (data on HDD and DB on SSD) are created on each host properly.

4) Run "orch osd rm 1 --replace --force". The OSD is marked "destroyed" and its reweight is set to 0 in "osd tree". "pg dump" shows no PG on that OSD. "orch ps" shows no daemon running for that OSD.

5) Run "orch device zap <host> <path>". The VG and LV for the HDD are removed, but the LV for the DB stays. "orch device ls" shows the HDD device as available.

6) Cephadm finds OSD claims and applies the OSD spec on the host. Here are the messages:

cephadm [INF] Found osd claims -> {'ceph-osd-1': ['1']}
cephadm [INF] Found osd claims for drivegroup osd-spec -> {'ceph-osd-1': ['1']}
cephadm [INF] Applying osd-spec on host ceph-osd-1...
cephadm [INF] Applying osd-spec on host ceph-osd-2...
cephadm [INF] Applying osd-spec on host ceph-osd-3...
cephadm [INF] ceph-osd-1: lvm batch --no-auto /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj --db-devices /dev/sdb --block-db-size 92341796864 --osd-ids 1 --yes --no-systemd
code: 0 out: ['']
err: ['/bin/docker:stderr --> passed data devices: 8 physical, 0 LVM', '/bin/docker:stderr --> relative data size: 1.0', '/bin/docker:stderr --> passed block_db devices: 1 physical, 0 LVM', '/bin/docker:stderr --> 1 fast devices were passed, but none are available']

Q1. Is the DB LV on the SSD supposed to be deleted or not, when replacing an OSD whose data is on HDD and DB is on SSD?

Q2.
If yes from Q1, is a new DB LV supposed to be created on the SSD when building the new OSD, as long as there is sufficient free space?

Q3. If no from Q1, since it's a replacement, is the old DB LV going to be reused for the new OSD?

Again, is this actually supposed to work? Am I missing anything, or am I trying an unsupported feature?

Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
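As a sanity check on the spec above, the chosen block_db_size divides the shared SSD cleanly. A back-of-envelope sketch; the 960 GB figure is read off the KPM5XRUG960G model name, and real usable capacity is slightly less:

```shell
# block_db_size 92341796864 bytes is exactly 86 GiB:
echo $((92341796864 / 1024 / 1024 / 1024))    # prints 86
# Eight DB LVs per host then need 8 * 86 GiB = 688 GiB,
# which fits on a nominal 960 GB (~894 GiB) SSD:
echo $((8 * 86))                              # prints 688
```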
[ceph-users] Re: db_devices doesn't show up in exported osd service spec
/dev/sdb is the SSD holding DB LVs for multiple HDDs. What I expect is that, as long as there is sufficient space on the db_devices specified in the service spec, an LV should be created. Now, circling back to the original question, how does OSD replacement work? I've been trying for a few weeks and hitting different issues, no luck. Thanks!
Tony

From: Jens Hyllegaard (Soft Design A/S)
Sent: February 10, 2021 11:54 PM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: db_devices doesn't show up in exported osd service spec

According to your "pvs" you still have a VG on your sdb device. As long as that is on there, it will not be available to ceph. I have had to do an lvremove, like this:

lvremove ceph-78c78efb-af86-427c-8be1-886fa1d54f8a osd-db-72784b7a-b5c0-46e6-8566-74758c297adc

Do an lvs command to see the right parameters.

Regards
Jens

-----Original Message-----
From: Tony Liu
Sent: 10. februar 2021 22:59
To: David Orman
Cc: Jens Hyllegaard (Soft Design A/S); ceph-users@ceph.io
Subject: Re: [ceph-users] Re: db_devices doesn't show up in exported osd service spec

Hi David,

===
# pvs
PV         VG                                                   Fmt  Attr PSize    PFree
/dev/sda3  vg0                                                  lvm2 a--  1.09t    0
/dev/sdb   ceph-block-dbs-f8d28f1f-2dd3-47d0-9110-959e88405112  lvm2 a--  <447.13g 127.75g
/dev/sdc   ceph-block-8f85121e-98bf-4466-aaf3-d888bcc938f6      lvm2 a--  2.18t    0
/dev/sde   ceph-block-0b47f685-a60b-42fb-b679-931ef763b3c8      lvm2 a--  2.18t    0
/dev/sdf   ceph-block-c526140d-c75f-4b0d-8c63-fbb2a8abfaa2      lvm2 a--  2.18t    0
/dev/sdg   ceph-block-52b422f7-900a-45ff-a809-69fadabe12fa      lvm2 a--  2.18t    0
/dev/sdh   ceph-block-da269f0d-ae11-4178-bf1e-6441b8800336      lvm2 a--  2.18t    0
===

After "orch osd rm", which doesn't clean up the DB LV on the OSD node, I manually clean it up by running "ceph-volume lvm zap --osd-id 12", which does the cleanup. Is "orch device ls" supposed to show the SSD device as available if there is free space? That could be another issue. Thanks!
Tony

From: David Orman
Sent: February 10, 2021 01:19 PM
To: Tony Liu
Cc: Jens Hyllegaard (Soft Design A/S); ceph-users@ceph.io
Subject: Re: [ceph-users] Re: db_devices doesn't show up in exported osd service spec

It's displaying sdb (what I assume you want to be used as a DB device) as unavailable. What does "pvs" output look like on that "ceph-osd-1" host? Perhaps it is full.

I see the other email you sent regarding replacement; I suspect the pre-existing LV from your previous OSD is not re-used. You may need to delete it, then the service specification should re-create it along with the OSD. If I remember correctly, I stopped the automatic application of the service spec (ceph orch rm osd.servicespec) when I had to replace a failed OSD, removed the OSD, nuked the LV on the db device in question, put in the new drive, then re-enabled the service spec (ceph orch apply osd -i) and the OSD + DB/WAL were created appropriately. I don't remember the exact sequence, and it may depend on the ceph version. I'm also unsure whether "orch osd rm --replace [--force]" will preserve the db/wal mapping; it might be worth looking at in the future.

On Wed, Feb 10, 2021 at 2:22 PM Tony Liu <tonyliu0...@hotmail.com> wrote:

Hi David,

Request info is below.
# ceph orch device ls ceph-osd-1
HOST        PATH      TYPE  SIZE   DEVICE_ID                           MODEL            VENDOR   ROTATIONAL  AVAIL  REJECT REASONS
ceph-osd-1  /dev/sdd  hdd   2235G  SEAGATE_DL2400MM0159_WBM2VL2G       DL2400MM0159     SEAGATE  1           True
ceph-osd-1  /dev/sda  hdd   1117G  SEAGATE_ST1200MM0099_WFK4NNDY       ST1200MM0099     SEAGATE  1           False  LVM detected, Insufficient space (<5GB) on vgs, locked
ceph-osd-1  /dev/sdb  ssd   447G   ATA_MZ7KH480HAHQ0D3_S5CNNA0N305738  MZ7KH480HAHQ0D3  ATA      0           False  LVM detected, locked
ceph-osd-1  /dev/sdc  hdd   2235G  SEAGATE_DL2400MM0159_WBM2WNSE       DL2400MM0159     SEAGATE  1           False  LVM detected, Insufficient space (<5GB) on vgs, locked
ceph-osd-1  /dev/sde  hdd   2235G  SEAGATE_DL2400MM0159_WBM2WP2S       DL2400MM0159     SEAGATE  1           False  LVM detected, Insufficient space (<5GB) on vgs, locked
ceph-osd-1  /dev/sdf  hdd   2235G  SEAGATE_DL2400MM0159_WBM2VK99       DL2400MM0159     SEAGATE  1           False  LVM detected, Insufficient space (<5GB) on vgs, locked
ceph-osd-1  /dev/sdg  hdd   2235G  SEAGATE_DL2400MM0159_WBM2VJBT       DL2400MM0159     SEAGATE  1           False  LVM detected, Insufficient space (<5GB) on vgs, locked
ceph-osd-1  /dev/sdh  hdd   2235G  SEAGATE_DL2400MM0159_WBM2VM
[ceph-users] Re: db_devices doesn't show up in exported osd service spec
Hi David,

===
# pvs
PV         VG                                                   Fmt  Attr PSize    PFree
/dev/sda3  vg0                                                  lvm2 a--  1.09t    0
/dev/sdb   ceph-block-dbs-f8d28f1f-2dd3-47d0-9110-959e88405112  lvm2 a--  <447.13g 127.75g
/dev/sdc   ceph-block-8f85121e-98bf-4466-aaf3-d888bcc938f6      lvm2 a--  2.18t    0
/dev/sde   ceph-block-0b47f685-a60b-42fb-b679-931ef763b3c8      lvm2 a--  2.18t    0
/dev/sdf   ceph-block-c526140d-c75f-4b0d-8c63-fbb2a8abfaa2      lvm2 a--  2.18t    0
/dev/sdg   ceph-block-52b422f7-900a-45ff-a809-69fadabe12fa      lvm2 a--  2.18t    0
/dev/sdh   ceph-block-da269f0d-ae11-4178-bf1e-6441b8800336      lvm2 a--  2.18t    0
===

After "orch osd rm", which doesn't clean up the DB LV on the OSD node, I manually clean it up by running "ceph-volume lvm zap --osd-id 12", which does the cleanup. Is "orch device ls" supposed to show the SSD device as available if there is free space? That could be another issue. Thanks!
Tony

From: David Orman
Sent: February 10, 2021 01:19 PM
To: Tony Liu
Cc: Jens Hyllegaard (Soft Design A/S); ceph-users@ceph.io
Subject: Re: [ceph-users] Re: db_devices doesn't show up in exported osd service spec

It's displaying sdb (what I assume you want to be used as a DB device) as unavailable. What does "pvs" output look like on that "ceph-osd-1" host? Perhaps it is full. I see the other email you sent regarding replacement; I suspect the pre-existing LV from your previous OSD is not re-used. You may need to delete it, then the service specification should re-create it along with the OSD. If I remember correctly, I stopped the automatic application of the service spec (ceph orch rm osd.servicespec) when I had to replace a failed OSD, removed the OSD, nuked the LV on the db device in question, put in the new drive, then re-enabled the service spec (ceph orch apply osd -i) and the OSD + DB/WAL were created appropriately. I don't remember the exact sequence, and it may depend on the ceph version. I'm also unsure whether "orch osd rm --replace [--force]" will preserve the db/wal mapping; it might be worth looking at in the future.
On Wed, Feb 10, 2021 at 2:22 PM Tony Liu <tonyliu0...@hotmail.com> wrote:

Hi David,

Request info is below.

# ceph orch device ls ceph-osd-1
HOST        PATH      TYPE  SIZE   DEVICE_ID                           MODEL            VENDOR   ROTATIONAL  AVAIL  REJECT REASONS
ceph-osd-1  /dev/sdd  hdd   2235G  SEAGATE_DL2400MM0159_WBM2VL2G       DL2400MM0159     SEAGATE  1           True
ceph-osd-1  /dev/sda  hdd   1117G  SEAGATE_ST1200MM0099_WFK4NNDY       ST1200MM0099     SEAGATE  1           False  LVM detected, Insufficient space (<5GB) on vgs, locked
ceph-osd-1  /dev/sdb  ssd   447G   ATA_MZ7KH480HAHQ0D3_S5CNNA0N305738  MZ7KH480HAHQ0D3  ATA      0           False  LVM detected, locked
ceph-osd-1  /dev/sdc  hdd   2235G  SEAGATE_DL2400MM0159_WBM2WNSE       DL2400MM0159     SEAGATE  1           False  LVM detected, Insufficient space (<5GB) on vgs, locked
ceph-osd-1  /dev/sde  hdd   2235G  SEAGATE_DL2400MM0159_WBM2WP2S       DL2400MM0159     SEAGATE  1           False  LVM detected, Insufficient space (<5GB) on vgs, locked
ceph-osd-1  /dev/sdf  hdd   2235G  SEAGATE_DL2400MM0159_WBM2VK99       DL2400MM0159     SEAGATE  1           False  LVM detected, Insufficient space (<5GB) on vgs, locked
ceph-osd-1  /dev/sdg  hdd   2235G  SEAGATE_DL2400MM0159_WBM2VJBT       DL2400MM0159     SEAGATE  1           False  LVM detected, Insufficient space (<5GB) on vgs, locked
ceph-osd-1  /dev/sdh  hdd   2235G  SEAGATE_DL2400MM0159_WBM2VMFK       DL2400MM0159     SEAGATE  1           False  LVM detected, Insufficient space (<5GB) on vgs, locked

# cat osd-spec.yaml
service_type: osd
service_id: osd-spec
placement:
  hosts:
  - ceph-osd-1
spec:
  objectstore: bluestore
  #block_db_size: 32212254720
  block_db_size: 64424509440
  data_devices:
    #rotational: 1
    paths:
    - /dev/sdd
  db_devices:
    #rotational: 0
    size: ":1T"
  #unmanaged: true

# ceph orch apply osd -i osd-spec.yaml --dry-run
+---------+----------+------------+----------+-----+-----+
|SERVICE  |NAME      |HOST        |DATA      |DB   |WAL  |
+---------+----------+------------+----------+-----+-----+
|osd      |osd-spec  |ceph-osd-1  |/dev/sdd  |-    |-    |
+---------+----------+------------+----------+-----+-----+

Thanks!
Tony
________
From: David Orman <orma...@corenode.com>
Sent: February 10, 2021 11:02 AM
To: Tony Liu
Cc: Jens Hyllegaard (Soft Design A/S); ceph-users@ceph.io
Subject: Re: [ceph-users] Re: db_devices doesn't show up in exported osd
[ceph-users] Re: Device is not available after zap
To update: the OSD had data on HDD and DB on SSD. After "ceph orch osd rm 12 --replace --force", and waiting until rebalancing was done and the daemon was stopped, I ran "ceph orch device zap ceph-osd-2 /dev/sdd" to zap the device. It cleared the PV, VG and LV for the data device, but not the DB device. The DB device issue is being discussed in another thread. Eventually, I restarted the active mgr, and then the device showed up as available. Not sure what was stuck in the mgr. Thanks!
Tony

From: Marc
Sent: February 10, 2021 12:21 PM
To: Philip Brown; Matt Wilder
Cc: ceph-users
Subject: [ceph-users] Re: Device is not available after zap

I had something similar a while ago; can't remember how I solved it, sorry, but it is not an lvm bug. Also posted it here. Too bad this is still not fixed.

> -----Original Message-----
> Cc: ceph-users
> Subject: [ceph-users] Re: Device is not available after zap
>
> I've always run it against the block dev
>
> ----- Original Message -----
> From: "Matt Wilder"
> To: "Philip Brown"
> Cc: "ceph-users"
> Sent: Wednesday, February 10, 2021 12:06:55 PM
> Subject: Re: [ceph-users] Re: Device is not available after zap
>
> Are you running zap on the lvm volume, or the underlying block device?
>
> If you are running it against the lvm volume, it sounds like you need to
> run it against the block device so it wipes the lvm volumes as well.
> (Disclaimer: I don't run Ceph in this configuration)
>
> On Wed, Feb 10, 2021 at 10:24 AM Philip Brown wrote:
>
> > Sorry, not much to say other than a "me too".
> > I spent a week testing ceph configurations; it should have only been 2
> > days, but a huge amount of my time was wasted because I needed to do a full
> > reboot on the hardware.
> >
> > On a related note: sometimes "zap" didn't fully clean things up. I had to
> > manually go in and clean up vgs, or pvs, or sometimes wipefs -a
> >
> > So, in theory, this could be a linux LVM bug.
> > But if I recall, I was doing this with ceph octopus and centos 7.9
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: db_devices doesn't show up in exported osd service spec
Hi David,

Request info is below.

# ceph orch device ls ceph-osd-1
HOST        PATH      TYPE  SIZE   DEVICE_ID                           MODEL            VENDOR   ROTATIONAL  AVAIL  REJECT REASONS
ceph-osd-1  /dev/sdd  hdd   2235G  SEAGATE_DL2400MM0159_WBM2VL2G       DL2400MM0159     SEAGATE  1           True
ceph-osd-1  /dev/sda  hdd   1117G  SEAGATE_ST1200MM0099_WFK4NNDY       ST1200MM0099     SEAGATE  1           False  LVM detected, Insufficient space (<5GB) on vgs, locked
ceph-osd-1  /dev/sdb  ssd   447G   ATA_MZ7KH480HAHQ0D3_S5CNNA0N305738  MZ7KH480HAHQ0D3  ATA      0           False  LVM detected, locked
ceph-osd-1  /dev/sdc  hdd   2235G  SEAGATE_DL2400MM0159_WBM2WNSE       DL2400MM0159     SEAGATE  1           False  LVM detected, Insufficient space (<5GB) on vgs, locked
ceph-osd-1  /dev/sde  hdd   2235G  SEAGATE_DL2400MM0159_WBM2WP2S       DL2400MM0159     SEAGATE  1           False  LVM detected, Insufficient space (<5GB) on vgs, locked
ceph-osd-1  /dev/sdf  hdd   2235G  SEAGATE_DL2400MM0159_WBM2VK99       DL2400MM0159     SEAGATE  1           False  LVM detected, Insufficient space (<5GB) on vgs, locked
ceph-osd-1  /dev/sdg  hdd   2235G  SEAGATE_DL2400MM0159_WBM2VJBT       DL2400MM0159     SEAGATE  1           False  LVM detected, Insufficient space (<5GB) on vgs, locked
ceph-osd-1  /dev/sdh  hdd   2235G  SEAGATE_DL2400MM0159_WBM2VMFK       DL2400MM0159     SEAGATE  1           False  LVM detected, Insufficient space (<5GB) on vgs, locked

# cat osd-spec.yaml
service_type: osd
service_id: osd-spec
placement:
  hosts:
  - ceph-osd-1
spec:
  objectstore: bluestore
  #block_db_size: 32212254720
  block_db_size: 64424509440
  data_devices:
    #rotational: 1
    paths:
    - /dev/sdd
  db_devices:
    #rotational: 0
    size: ":1T"
  #unmanaged: true

# ceph orch apply osd -i osd-spec.yaml --dry-run
+---------+----------+------------+----------+-----+-----+
|SERVICE  |NAME      |HOST        |DATA      |DB   |WAL  |
+---------+----------+------------+----------+-----+-----+
|osd      |osd-spec  |ceph-osd-1  |/dev/sdd  |-    |-    |
+---------+----------+------------+----------+-----+-----+

Thanks!
Tony

From: David Orman
Sent: February 10, 2021 11:02 AM
To: Tony Liu
Cc: Jens Hyllegaard (Soft Design A/S); ceph-users@ceph.io
Subject: Re: [ceph-users] Re: db_devices doesn't show up in exported osd service spec

What does "ceph orch device ls" look like? And please show us the specification that you've used.
Jens was correct; his example is how we worked around this problem, pending a patch/new release.

On Wed, Feb 10, 2021 at 12:05 AM Tony Liu <tonyliu0...@hotmail.com> wrote:

With db_devices.size, db_devices shows up from "orch ls --export", but no DB device/LV is created for the OSD. Any clues? Thanks!
Tony

From: Jens Hyllegaard (Soft Design A/S) <jens.hyllega...@softdesign.dk>
Sent: February 9, 2021 01:16 AM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: db_devices doesn't show up in exported osd service spec

Hi Tony.

I assume they used a size constraint instead of rotational. So if all your SSDs are 1TB or less, and all HDDs are more than that, you could use:

spec:
  objectstore: bluestore
  data_devices:
    rotational: true
  filter_logic: AND
  db_devices:
    size: ':1TB'

It was usable in my test environment, and seems to work.

Regards
Jens

-----Original Message-----
From: Tony Liu <tonyliu0...@hotmail.com>
Sent: 9. februar 2021 02:09
To: David Orman <orma...@corenode.com>
Cc: ceph-users@ceph.io
Subject: [ceph-users] Re: db_devices doesn't show up in exported osd service spec

Hi David,

Could you show me an example of an OSD service spec YAML to work around it by specifying size? Thanks!
Tony

From: David Orman <orma...@corenode.com>
Sent: February 8, 2021 04:06 PM
To: Tony Liu
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: db_devices doesn't show up in exported osd service spec

Adding ceph-users: We ran into this same issue, and we used a size specification to work around it for now.

Bug and patch:
https://tracker.ceph.com/issues/49014
https://github.com/ceph/ceph/pull/39083

Backport to Octopus:
https://github.com/ceph/ceph/pull/39171

On Sat, Feb 6, 2021 at 7:05 PM Tony Liu <tonyliu0...@hotmail.com> wrote:

Add dev to comment. With 15.2.8, when applying an OSD service spec, db_devices is gone.
Here is the service spec file.
==
service_type: osd
service_id: osd-spec
placement:
  hosts:
  - ceph-osd-1
spec:
  objectstore: bluestore
  data_devices:
    rota
[ceph-users] Re: db_devices doesn't show up in exported osd service spec
With db_devices.size, db_devices shows up from "orch ls --export", but no DB device/LV is created for the OSD. Any clues? Thanks!
Tony

From: Jens Hyllegaard (Soft Design A/S)
Sent: February 9, 2021 01:16 AM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: db_devices doesn't show up in exported osd service spec

Hi Tony.

I assume they used a size constraint instead of rotational. So if all your SSDs are 1TB or less, and all HDDs are more than that, you could use:

spec:
  objectstore: bluestore
  data_devices:
    rotational: true
  filter_logic: AND
  db_devices:
    size: ':1TB'

It was usable in my test environment, and seems to work.

Regards
Jens

-----Original Message-----
From: Tony Liu
Sent: 9. februar 2021 02:09
To: David Orman
Cc: ceph-users@ceph.io
Subject: [ceph-users] Re: db_devices doesn't show up in exported osd service spec

Hi David,

Could you show me an example of an OSD service spec YAML to work around it by specifying size? Thanks!
Tony

From: David Orman
Sent: February 8, 2021 04:06 PM
To: Tony Liu
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: db_devices doesn't show up in exported osd service spec

Adding ceph-users: We ran into this same issue, and we used a size specification to work around it for now.

Bug and patch:
https://tracker.ceph.com/issues/49014
https://github.com/ceph/ceph/pull/39083

Backport to Octopus:
https://github.com/ceph/ceph/pull/39171

On Sat, Feb 6, 2021 at 7:05 PM Tony Liu wrote:

Add dev to comment. With 15.2.8, when applying an OSD service spec, db_devices is gone. Here is the service spec file.
==
service_type: osd
service_id: osd-spec
placement:
  hosts:
  - ceph-osd-1
spec:
  objectstore: bluestore
  data_devices:
    rotational: 1
  db_devices:
    rotational: 0
==

Here is the logging from mon. The message with "Tony" was added by me in the mgr to confirm. The audit from mon shows db_devices is gone. Is there anything in mon that filters that out based on host info? How can I trace it?
== audit 2021-02-07T00:45:38.106171+ mgr.ceph-control-1.nxjnzz (mgr.24142551) 4020 : audit [DBG] from='client.24184218 -' entity='client.admin' cmd=[{"prefix": "orch apply osd", "target": ["mon-mgr", ""]}]: dispatch cephadm 2021-02-07T00:45:38.108546+ mgr.ceph-control-1.nxjnzz (mgr.24142551) 4021 : cephadm [INF] Marking host: ceph-osd-1 for OSDSpec preview refresh. cephadm 2021-02-07T00:45:38.108798+ mgr.ceph-control-1.nxjnzz (mgr.24142551) 4022 : cephadm [INF] Saving service osd.osd-spec spec with placement ceph-osd-1 cephadm 2021-02-07T00:45:38.108893+ mgr.ceph-control-1.nxjnzz (mgr.24142551) 4023 : cephadm [INF] Tony: spec: placement=PlacementSpec(hosts=[HostPlacementSpec(hostname='ceph-osd-1', network='', name='')]), service_id='osd-spec', service_type='osd', data_devices=DeviceSelection(rotational=1, all=False), db_devices=DeviceSelection(rotational=0, all=False), osd_id_claims={}, unmanaged=False, filter_logic='AND', preview_only=False)> audit 2021-02-07T00:45:38.109782+ mon.ceph-control-3 (mon.2) 25 : audit [INF] from='mgr.24142551 10.6.50.30:0/2838166251<http://10.6.50.30:0/2838166251>' entity='mgr.ceph-control-1.nxjnzz' cmd=[{"prefix":"config-key set","key":"mgr/cephadm/spec.osd.osd-spec","val":"{\"created\": \"2021-02-07T00:45:38.108810\", \"spec\": {\"plac ement\": {\"hosts\": [\"ceph-osd-1\"]}, \"service_id\": \"osd-spec\", \"service_name\": \"osd.osd-spec\", \"service_type\": \"osd\", \"spec\": {\"data_devices\": {\"rotational\": 1}, \"filter_logic\": \"AND\", \"objectstore\": \"bluestore\"}}}"}]: dispatch audit 2021-02-07T00:45:38.110133+ mon.ceph-control-1 (mon.0) 107 : audit [INF] from='mgr.24142551 ' entity='mgr.ceph-control-1.nxjnzz' cmd=[{"prefix":"config-key set","key":"mgr/cephadm/spec.osd.osd-spec","val":"{\"created\": \"2021-02-07T00:45:38.108810\", \"spec\": {\"placement\": {\"hosts\": [\"ceph-osd-1\"]}, \"service_id\": \"osd-spec\", \"service_name\": \"osd.osd-spec\", \"service_type\": \"osd\", \"spec\": {\"data_devices\": {\"rotational\": 
1}, \"filter_logic\": \"AND\", \"objectstore\": \"bluestore\"}}}"}]: dispatch audit 2021-02-07T00:45:38.152756+ mon.ceph-control-1 (mo
[ceph-users] How is DB handled when remove/replace and add OSD?
Hi,

I'd like to know how the DB device is expected to be handled by "orch osd rm". What I see is that the DB device on SSD is untouched when the OSD on HDD is removed or replaced. "orch device zap" removes the PV, VG and LV of the data device; it doesn't touch the DB LV on SSD.

To remove an OSD permanently, do I need to manually clean up the DB LV on the SSD? To replace an OSD, is the old DB LV going to be reused for the new OSD, or will a new DB LV be created?

I am asking because, to replace an OSD, when the OSD was removed, I manually removed the DB LV on the SSD. Now I try to add the new OSD, but --dry-run doesn't show a DB device.
```
# cat osd-spec.yaml
service_type: osd
service_id: osd-spec
placement:
  hosts:
  - ceph-osd-1
spec:
  #objectstore: bluestore
  #block_db_size: 32212254720
  #block_db_size: 64424509440
  data_devices:
    rotational: 1
  db_devices:
    #rotational: 0
    size: ":500GB"
  #unmanaged: true

# ceph orch apply osd -i osd-spec.yaml --dry-run
+---------+----------+------------+----------+-----+-----+
|SERVICE  |NAME      |HOST        |DATA      |DB   |WAL  |
+---------+----------+------------+----------+-----+-----+
|osd      |osd-spec  |ceph-osd-1  |/dev/sdd  |-    |-    |
+---------+----------+------------+----------+-----+-----+
```
Any clues?

Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
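For what it's worth, a drive-group size filter written as ":500GB" is a range with only an upper bound. A toy sketch of that matching rule, as an illustration of the documented semantics rather than ceph's actual implementation:

```shell
# Toy model of the drive-group size filter ":500GB": no lower bound,
# upper bound 500 GB. Sizes are plain integers in GB here.
matches_size_filter() {   # $1 = device size in GB; filter fixed at ":500GB"
    [ "$1" -le 500 ]
}

matches_size_filter 447  && echo "447G SSD matches :500GB"    # DB candidate
matches_size_filter 2235 || echo "2235G HDD does not match"   # data device
```

Under that reading the 447G SSD should match the db_devices filter, so a rejection like "LVM detected, locked" on the device itself, rather than the filter, would explain the empty DB column in the dry run.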
[ceph-users] Re: db_devices doesn't show up in exported osd service spec
Hi David,

Could you show me an example of an OSD service spec YAML to work around it by specifying size? Thanks!
Tony

From: David Orman
Sent: February 8, 2021 04:06 PM
To: Tony Liu
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: db_devices doesn't show up in exported osd service spec

Adding ceph-users: We ran into this same issue, and we used a size specification to work around it for now.

Bug and patch:
https://tracker.ceph.com/issues/49014
https://github.com/ceph/ceph/pull/39083

Backport to Octopus:
https://github.com/ceph/ceph/pull/39171

On Sat, Feb 6, 2021 at 7:05 PM Tony Liu <tonyliu0...@hotmail.com> wrote:

Add dev to comment. With 15.2.8, when applying an OSD service spec, db_devices is gone. Here is the service spec file.
==
service_type: osd
service_id: osd-spec
placement:
  hosts:
  - ceph-osd-1
spec:
  objectstore: bluestore
  data_devices:
    rotational: 1
  db_devices:
    rotational: 0
==

Here is the logging from mon. The message with "Tony" was added by me in the mgr to confirm. The audit from mon shows db_devices is gone. Is there anything in mon that filters that out based on host info? How can I trace it?
==
audit 2021-02-07T00:45:38.106171+ mgr.ceph-control-1.nxjnzz (mgr.24142551) 4020 : audit [DBG] from='client.24184218 -' entity='client.admin' cmd=[{"prefix": "orch apply osd", "target": ["mon-mgr", ""]}]: dispatch
cephadm 2021-02-07T00:45:38.108546+ mgr.ceph-control-1.nxjnzz (mgr.24142551) 4021 : cephadm [INF] Marking host: ceph-osd-1 for OSDSpec preview refresh.
cephadm 2021-02-07T00:45:38.108798+ mgr.ceph-control-1.nxjnzz (mgr.24142551) 4022 : cephadm [INF] Saving service osd.osd-spec spec with placement ceph-osd-1 cephadm 2021-02-07T00:45:38.108893+ mgr.ceph-control-1.nxjnzz (mgr.24142551) 4023 : cephadm [INF] Tony: spec: placement=PlacementSpec(hosts=[HostPlacementSpec(hostname='ceph-osd-1', network='', name='')]), service_id='osd-spec', service_type='osd', data_devices=DeviceSelection(rotational=1, all=False), db_devices=DeviceSelection(rotational=0, all=False), osd_id_claims={}, unmanaged=False, filter_logic='AND', preview_only=False)> audit 2021-02-07T00:45:38.109782+ mon.ceph-control-3 (mon.2) 25 : audit [INF] from='mgr.24142551 10.6.50.30:0/2838166251<http://10.6.50.30:0/2838166251>' entity='mgr.ceph-control-1.nxjnzz' cmd=[{"prefix":"config-key set","key":"mgr/cephadm/spec.osd.osd-spec","val":"{\"created\": \"2021-02-07T00:45:38.108810\", \"spec\": {\"placement\": {\"hosts\": [\"ceph-osd-1\"]}, \"service_id\": \"osd-spec\", \"service_name\": \"osd.osd-spec\", \"service_type\": \"osd\", \"spec\": {\"data_devices\": {\"rotational\": 1}, \"filter_logic\": \"AND\", \"objectstore\": \"bluestore\"}}}"}]: dispatch audit 2021-02-07T00:45:38.110133+ mon.ceph-control-1 (mon.0) 107 : audit [INF] from='mgr.24142551 ' entity='mgr.ceph-control-1.nxjnzz' cmd=[{"prefix":"config-key set","key":"mgr/cephadm/spec.osd.osd-spec","val":"{\"created\": \"2021-02-07T00:45:38.108810\", \"spec\": {\"placement\": {\"hosts\": [\"ceph-osd-1\"]}, \"service_id\": \"osd-spec\", \"service_name\": \"osd.osd-spec\", \"service_type\": \"osd\", \"spec\": {\"data_devices\": {\"rotational\": 1}, \"filter_logic\": \"AND\", \"objectstore\": \"bluestore\"}}}"}]: dispatch audit 2021-02-07T00:45:38.152756+ mon.ceph-control-1 (mon.0) 108 : audit [INF] from='mgr.24142551 ' entity='mgr.ceph-control-1.nxjnzz' cmd='[{"prefix":"config-key set","key":"mgr/cephadm/spec.osd.osd-spec","val":"{\"created\": \"2021-02-07T00:45:38.108810\", \"spec\": {\"placement\": 
{\"hosts\": [\"ceph-osd-1\"]}, \"service_id\": \"osd-spec\", \"service_name\": \"osd.osd-spec\", \"service_type\": \"osd\", \"spec\": {\"data_devices\": {\"rotational\": 1}, \"filter_logic\": \"AND\", \"objectstore\": \"bluestore\"}}}"}]': finished == Thanks! Tony > -Original Message- > From: Jens Hyllegaard (Soft Design A/S) > mailto:jens.hyllega...@softdesign.dk>> >
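The config-key payload in the audit lines above can be checked directly. Reconstructing it with the backslash escaping removed shows that data_devices survived serialization while db_devices is already absent by the time the mgr writes the spec:

```shell
# The "val" below is the JSON written by the mgr, copied from the audit
# log above with the backslash escaping removed.
val='{"created": "2021-02-07T00:45:38.108810", "spec": {"placement": {"hosts": ["ceph-osd-1"]}, "service_id": "osd-spec", "service_name": "osd.osd-spec", "service_type": "osd", "spec": {"data_devices": {"rotational": 1}, "filter_logic": "AND", "objectstore": "bluestore"}}}'

printf '%s\n' "$val" | grep -q '"data_devices"' && echo "data_devices present"
printf '%s\n' "$val" | grep -q '"db_devices"'   || echo "db_devices already dropped"
```

This localizes the bug to the mgr-side spec serialization (consistent with the tracker issue and patch linked earlier in the thread), not to anything the mon does afterwards.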
[ceph-users] Re: Device is not available after zap
I built a new cluster from scratch; everything works fine. Could anyone help to find out what is stuck here? Another issue, that devices don't show up after adding a host, could have the same cause. Any details about the workflow would be helpful too, like how the mon gets devices when a host is added: is it pushed by something (mgr?) or pulled by the mon? Thanks!
Tony

> -----Original Message-----
> From: Tony Liu
> Sent: Sunday, February 7, 2021 5:32 PM
> To: ceph-users
> Subject: [ceph-users] Re: Device is not available after zap
>
> I checked pvscan, vgscan, lvscan and "ceph-volume lvm list" on the OSD
> node; that zapped device doesn't show up anywhere.
> Anything missing?
>
> Thanks!
> Tony
> ____
> From: Tony Liu
> Sent: February 7, 2021 05:27 PM
> To: ceph-users
> Subject: [ceph-users] Device is not available after zap
>
> Hi,
>
> With v15.2.8, after zapping a device on an OSD node, it's still not available.
> The reason is "locked, LVM detected". If I reboot the whole OSD node,
> then the device will be available. There must be something not being
> cleaned up. Any clues?
>
> Thanks!
> Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Device is not available after zap
I checked pvscan, vgscan, lvscan and "ceph-volume lvm list" on the OSD node; that zapped device doesn't show up anywhere. Anything missing?

Thanks!
Tony
____
From: Tony Liu
Sent: February 7, 2021 05:27 PM
To: ceph-users
Subject: [ceph-users] Device is not available after zap

Hi,

With v15.2.8, after zapping a device on an OSD node, it is still not available. The reason given is "locked, LVM detected". If I reboot the whole OSD node, the device becomes available again. There must be something not being cleaned up. Any clues?

Thanks!
Tony
[ceph-users] Device is not available after zap
Hi,

With v15.2.8, after zapping a device on an OSD node, it is still not available. The reason given is "locked, LVM detected". If I reboot the whole OSD node, the device becomes available again. There must be something not being cleaned up. Any clues?

Thanks!
Tony
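The "locked, LVM detected" state usually means the device still carries LVM metadata, and possibly a stale device-mapper mapping, that zapping did not remove. As a rough illustration (a sketch, not ceph-volume's actual code): LVM2 marks a physical volume by writing the ASCII magic "LABELONE" into one of the first four 512-byte sectors, so any scanner that finds it treats the device as LVM-owned.

```python
SECTOR = 512

def has_lvm_label(first_sectors: bytes) -> bool:
    """Scan the first four 512-byte sectors for the LVM2 PV label magic."""
    for i in range(4):
        if first_sectors[i * SECTOR:(i + 1) * SECTOR][:8] == b"LABELONE":
            return True
    return False

# Synthetic buffers stand in for a real block device; on a live system this
# would be something like open("/dev/sdX", "rb").read(4 * SECTOR), where
# /dev/sdX is a hypothetical device name.
clean = bytes(4 * SECTOR)                                        # all zeros, no label
leftover = bytes(SECTOR) + b"LABELONE" + bytes(3 * SECTOR - 8)   # label in sector 1

print(has_lvm_label(clean))     # False
print(has_lvm_label(leftover))  # True
```

In practice, running wipefs -a on the device and removing any stale device-mapper entry (dmsetup remove) typically clears this state without a reboot.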
[ceph-users] Re: db_devices doesn't show up in exported osd service spec
Adding dev to the thread for comment.

With 15.2.8, when applying the OSD service spec, db_devices is gone. Here is the service spec file.
==
service_type: osd
service_id: osd-spec
placement:
  hosts:
  - ceph-osd-1
spec:
  objectstore: bluestore
  data_devices:
    rotational: 1
  db_devices:
    rotational: 0
==
Here is the logging from the mon. The message tagged "Tony" was added by me in the mgr to confirm that db_devices is still present at that point. The audit entries from the mon show that db_devices is gone. Is there anything in the mon that filters it out based on host info? How can I trace this?
==
audit 2021-02-07T00:45:38.106171+ mgr.ceph-control-1.nxjnzz (mgr.24142551) 4020 : audit [DBG] from='client.24184218 -' entity='client.admin' cmd=[{"prefix": "orch apply osd", "target": ["mon-mgr", ""]}]: dispatch
cephadm 2021-02-07T00:45:38.108546+ mgr.ceph-control-1.nxjnzz (mgr.24142551) 4021 : cephadm [INF] Marking host: ceph-osd-1 for OSDSpec preview refresh.
cephadm 2021-02-07T00:45:38.108798+ mgr.ceph-control-1.nxjnzz (mgr.24142551) 4022 : cephadm [INF] Saving service osd.osd-spec spec with placement ceph-osd-1
cephadm 2021-02-07T00:45:38.108893+ mgr.ceph-control-1.nxjnzz (mgr.24142551) 4023 : cephadm [INF] Tony: spec: placement=PlacementSpec(hosts=[HostPlacementSpec(hostname='ceph-osd-1', network='', name='')]), service_id='osd-spec', service_type='osd', data_devices=DeviceSelection(rotational=1, all=False), db_devices=DeviceSelection(rotational=0, all=False), osd_id_claims={}, unmanaged=False, filter_logic='AND', preview_only=False)>
audit 2021-02-07T00:45:38.109782+ mon.ceph-control-3 (mon.2) 25 : audit [INF] from='mgr.24142551 10.6.50.30:0/2838166251' entity='mgr.ceph-control-1.nxjnzz' cmd=[{"prefix":"config-key set","key":"mgr/cephadm/spec.osd.osd-spec","val":"{\"created\": \"2021-02-07T00:45:38.108810\", \"spec\": {\"placement\": {\"hosts\": [\"ceph-osd-1\"]}, \"service_id\": \"osd-spec\", \"service_name\": \"osd.osd-spec\", \"service_type\": \"osd\", \"spec\": {\"data_devices\": {\"rotational\": 1}, \"filter_logic\": \"AND\", \"objectstore\": \"bluestore\"}}}"}]: dispatch
audit 2021-02-07T00:45:38.110133+ mon.ceph-control-1 (mon.0) 107 : audit [INF] from='mgr.24142551 ' entity='mgr.ceph-control-1.nxjnzz' cmd=[{"prefix":"config-key set","key":"mgr/cephadm/spec.osd.osd-spec","val":"{\"created\": \"2021-02-07T00:45:38.108810\", \"spec\": {\"placement\": {\"hosts\": [\"ceph-osd-1\"]}, \"service_id\": \"osd-spec\", \"service_name\": \"osd.osd-spec\", \"service_type\": \"osd\", \"spec\": {\"data_devices\": {\"rotational\": 1}, \"filter_logic\": \"AND\", \"objectstore\": \"bluestore\"}}}"}]: dispatch
audit 2021-02-07T00:45:38.152756+ mon.ceph-control-1 (mon.0) 108 : audit [INF] from='mgr.24142551 ' entity='mgr.ceph-control-1.nxjnzz' cmd='[{"prefix":"config-key set","key":"mgr/cephadm/spec.osd.osd-spec","val":"{\"created\": \"2021-02-07T00:45:38.108810\", \"spec\": {\"placement\": {\"hosts\": [\"ceph-osd-1\"]}, \"service_id\": \"osd-spec\", \"service_name\": \"osd.osd-spec\", \"service_type\": \"osd\", \"spec\": {\"data_devices\": {\"rotational\": 1}, \"filter_logic\": \"AND\", \"objectstore\": \"bluestore\"}}}"}]': finished
==
Thanks!
Tony

> -Original Message-
> From: Jens Hyllegaard (Soft Design A/S)
> Sent: Thursday, February 4, 2021 6:31 AM
> To: ceph-users@ceph.io
> Subject: [ceph-users] Re: db_devices doesn't show up in exported osd service spec
>
> Hi.
>
> I have the same situation. Running 15.2.8, I created a specification that
> looked just like it, with rotational in the data and non-rotational in
> the db.
>
> The first use applied fine. Afterwards it only uses the hdd, and not the ssd.
> Also, is there a way to remove an unused osd service?
> I managed to create osd.all-available-devices when I tried to stop the
> auto-creation of OSDs, using: ceph orch apply osd --all-available-devices
> --unmanaged=true
>
> I created the original OSD using the web interface.
>
> Regards
>
> Jens
>
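To make the discrepancy concrete: the mgr debug line tagged "Tony" still carries db_devices=DeviceSelection(rotational=0, ...), while the JSON saved via "config-key set" has no db_devices key at all. A quick way to confirm exactly which key was dropped is to diff the two spec dicts. This is only a sketch: the submitted dict is reconstructed by hand from the debug line, and the saved JSON is copied from the audit entry.

```python
import json

# Spec fields as reported by the mgr debug ("Tony:") line, rebuilt by hand:
submitted_spec = {
    "data_devices": {"rotational": 1},
    "db_devices": {"rotational": 0},
    "filter_logic": "AND",
    "objectstore": "bluestore",
}

# The value actually written by "config-key set", from the audit log:
saved = json.loads(
    '{"placement": {"hosts": ["ceph-osd-1"]}, '
    '"service_id": "osd-spec", "service_name": "osd.osd-spec", '
    '"service_type": "osd", '
    '"spec": {"data_devices": {"rotational": 1}, '
    '"filter_logic": "AND", "objectstore": "bluestore"}}'
)

# Keys present when the spec left the mgr but absent from what the mon stored:
missing = sorted(set(submitted_spec) - set(saved["spec"]))
print(missing)  # ['db_devices']
```

The same diff can be run against "ceph config-key get mgr/cephadm/spec.osd.osd-spec" on a live cluster to see whether the loss happens before or after the config-key write.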
[ceph-users] Re: replace OSD failed
Here is the issue: https://tracker.ceph.com/issues/47758

Thanks!
Tony

> -Original Message-
> From: Tony Liu
> Sent: Thursday, February 4, 2021 8:46 PM
> To: ceph-users@ceph.io
> Subject: [ceph-users] Re: replace OSD failed
>
> Here is the log from ceph-volume.
> ```
> [2021-02-05 04:03:17,000][ceph_volume.process][INFO ] Running command: /usr/sbin/vgcreate --force --yes ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64 /dev/sdd
> [2021-02-05 04:03:17,134][ceph_volume.process][INFO ] stdout Physical volume "/dev/sdd" successfully created.
> [2021-02-05 04:03:17,166][ceph_volume.process][INFO ] stdout Volume group "ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64" successfully created
> [2021-02-05 04:03:17,189][ceph_volume.process][INFO ] Running command: /usr/sbin/vgs --noheadings --readonly --units=b --nosuffix --separator=";" -S vg_name=ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64 -o vg_name,pv_count,lv_count,vg_attr,vg_extent_count,vg_free_count,vg_extent_size
> [2021-02-05 04:03:17,229][ceph_volume.process][INFO ] stdout ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64";"1";"0";"wz--n-";"572317";"572317";"4194304
> [2021-02-05 04:03:17,229][ceph_volume.api.lvm][DEBUG ] size was passed: 2.18 TB -> 572318
> [2021-02-05 04:03:17,235][ceph_volume.process][INFO ] Running command: /usr/sbin/lvcreate --yes -l 572318 -n osd-block-b05c3c90-b7d5-4f13-8a58-f72761c1971b ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64
> [2021-02-05 04:03:17,244][ceph_volume.process][INFO ] stderr Volume group "ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64" has insufficient free space (572317 extents): 572318 required.
> ```
> size was passed: 2.18 TB -> 572318
> How is this calculated?
>
> Thanks!
> Tony
> > -Original Message-
> > From: Tony Liu
> > Sent: Thursday, February 4, 2021 8:34 PM
> > To: ceph-users@ceph.io
> > Subject: [ceph-users] replace OSD failed
> >
> > Hi,
> >
> > With 15.2.8, run "ceph orch rm osd 12 --replace --force", PGs on
> > osd.12 are remapped, osd.12 is removed from "ceph osd tree", the
> > daemon is removed from "ceph orch ps", the device is "available"
> > in "ceph orch device ls". Everything seems good at this point.
> >
> > Then dry-run the service spec.
> > ```
> > # cat osd-spec.yaml
> > service_type: osd
> > service_id: osd-spec
> > placement:
> >   hosts:
> >   - ceph-osd-1
> > data_devices:
> >   rotational: 1
> > db_devices:
> >   rotational: 0
> >
> > # ceph orch apply osd -i osd-spec.yaml --dry-run
> > +-+--++--+--+-+
> > |SERVICE |NAME |HOST|DATA |DB|WAL |
> > +-+--++--+--+-+
> > |osd |osd-spec |ceph-osd-3 |/dev/sdd |/dev/sdb |-|
> > +-+--++--+--+-+
> > ```
> > It looks as expected.
> >
> > Then "ceph orch apply osd -i osd-spec.yaml".
> > Here is the log of cephadm.
> > ```
> > /bin/docker:stderr --> relative data size: 1.0
> > /bin/docker:stderr --> passed block_db devices: 1 physical, 0 LVM
> > /bin/docker:stderr Running command: /usr/bin/ceph-authtool --gen-print-key
> > /bin/docker:stderr Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd tree -f json
> > /bin/docker:stderr Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new b05c3c90-b7d5-4f13-8a58-f72761c1971b 12
> > /bin/docker:stderr Running command: /usr/sbin/vgcreate --force --yes ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64 /dev/sdd
> > /bin/docker:stderr stdout: Physical volume "/dev/sdd" successfully created.
> > /bin/docker:stderr stdout: Volume group "ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64" successfully created
> > /bin/docker:stderr Running command: /usr/sbin/lvcreate --yes -l 572318 -n osd-block-b05c3c90-b7d5-4f13-8a58-f72761c1971b ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64
> > /bin/docker:stderr stderr: Volume group "ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64" has insufficient free space (572317 extents): 572318 required.
> > /bin/docker:stderr --> Was unable to complete a new OSD, will rollback changes
> > ```
> > Q1, why VG na
[ceph-users] Re: replace OSD failed
Here is the log from ceph-volume.
```
[2021-02-05 04:03:17,000][ceph_volume.process][INFO ] Running command: /usr/sbin/vgcreate --force --yes ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64 /dev/sdd
[2021-02-05 04:03:17,134][ceph_volume.process][INFO ] stdout Physical volume "/dev/sdd" successfully created.
[2021-02-05 04:03:17,166][ceph_volume.process][INFO ] stdout Volume group "ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64" successfully created
[2021-02-05 04:03:17,189][ceph_volume.process][INFO ] Running command: /usr/sbin/vgs --noheadings --readonly --units=b --nosuffix --separator=";" -S vg_name=ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64 -o vg_name,pv_count,lv_count,vg_attr,vg_extent_count,vg_free_count,vg_extent_size
[2021-02-05 04:03:17,229][ceph_volume.process][INFO ] stdout ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64";"1";"0";"wz--n-";"572317";"572317";"4194304
[2021-02-05 04:03:17,229][ceph_volume.api.lvm][DEBUG ] size was passed: 2.18 TB -> 572318
[2021-02-05 04:03:17,235][ceph_volume.process][INFO ] Running command: /usr/sbin/lvcreate --yes -l 572318 -n osd-block-b05c3c90-b7d5-4f13-8a58-f72761c1971b ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64
[2021-02-05 04:03:17,244][ceph_volume.process][INFO ] stderr Volume group "ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64" has insufficient free space (572317 extents): 572318 required.
```
size was passed: 2.18 TB -> 572318
How is this calculated?

Thanks!
Tony

> -Original Message-
> From: Tony Liu
> Sent: Thursday, February 4, 2021 8:34 PM
> To: ceph-users@ceph.io
> Subject: [ceph-users] replace OSD failed
>
> Hi,
>
> With 15.2.8, run "ceph orch rm osd 12 --replace --force", PGs on osd.12
> are remapped, osd.12 is removed from "ceph osd tree", the daemon is
> removed from "ceph orch ps", the device is "available"
> in "ceph orch device ls". Everything seems good at this point.
>
> Then dry-run the service spec.
> ```
> # cat osd-spec.yaml
> service_type: osd
> service_id: osd-spec
> placement:
>   hosts:
>   - ceph-osd-1
> data_devices:
>   rotational: 1
> db_devices:
>   rotational: 0
>
> # ceph orch apply osd -i osd-spec.yaml --dry-run
> +-+--++--+--+-+
> |SERVICE |NAME |HOST|DATA |DB|WAL |
> +-+--++--+--+-+
> |osd |osd-spec |ceph-osd-3 |/dev/sdd |/dev/sdb |-|
> +-+--++--+--+-+
> ```
> It looks as expected.
>
> Then "ceph orch apply osd -i osd-spec.yaml".
> Here is the log of cephadm.
> ```
> /bin/docker:stderr --> relative data size: 1.0
> /bin/docker:stderr --> passed block_db devices: 1 physical, 0 LVM
> /bin/docker:stderr Running command: /usr/bin/ceph-authtool --gen-print-key
> /bin/docker:stderr Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd tree -f json
> /bin/docker:stderr Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new b05c3c90-b7d5-4f13-8a58-f72761c1971b 12
> /bin/docker:stderr Running command: /usr/sbin/vgcreate --force --yes ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64 /dev/sdd
> /bin/docker:stderr stdout: Physical volume "/dev/sdd" successfully created.
> /bin/docker:stderr stdout: Volume group "ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64" successfully created
> /bin/docker:stderr Running command: /usr/sbin/lvcreate --yes -l 572318 -n osd-block-b05c3c90-b7d5-4f13-8a58-f72761c1971b ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64
> /bin/docker:stderr stderr: Volume group "ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64" has insufficient free space (572317 extents): 572318 required.
> /bin/docker:stderr --> Was unable to complete a new OSD, will rollback changes
> ```
> Q1, why VG name (ceph-) is different from others (ceph-block-)?
> Q2, where is that 572318 from? Since all HDDs are the same model, VG
> "Total PE" of all HDDs is 572317.
> Has anyone seen similar issues? Anything I am missing?
>
> Thanks!
> Tony
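On the 572318-vs-572317 question: the vgs output in the log reports 572317 free extents of 4194304 bytes (4 MiB) each. If a requested byte size is rounded up to whole extents, then any size even one byte past a whole-extent multiple asks for one extra extent, which is exactly the off-by-one lvcreate rejects. This sketch only illustrates that arithmetic; where ceph-volume derives the slightly oversized request is what the tracker issue in this thread covers.

```python
# Numbers taken from the vgs output in the ceph-volume log.
EXTENT_SIZE = 4194304      # vg_extent_size: 4 MiB
VG_FREE_EXTENTS = 572317   # vg_free_count

def extents_for(size_bytes: int) -> int:
    # Round a byte size up to whole extents (ceiling division).
    return -(-size_bytes // EXTENT_SIZE)

vg_free_bytes = VG_FREE_EXTENTS * EXTENT_SIZE

print(extents_for(vg_free_bytes))      # 572317: an exact multiple fits
print(extents_for(vg_free_bytes + 1))  # 572318: one byte over costs a whole extra extent
```

So a size computed from anything slightly larger than the VG's usable space (for example, the raw device size, which exceeds it because LVM reserves room for metadata) rounds up to 572318 extents and fails against 572317 free.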
[ceph-users] replace OSD failed
Hi,

With 15.2.8, run "ceph orch rm osd 12 --replace --force": PGs on osd.12 are remapped, osd.12 is removed from "ceph osd tree", the daemon is removed from "ceph orch ps", and the device shows "available" in "ceph orch device ls". Everything seems good at this point.

Then dry-run the service spec.
```
# cat osd-spec.yaml
service_type: osd
service_id: osd-spec
placement:
  hosts:
  - ceph-osd-1
data_devices:
  rotational: 1
db_devices:
  rotational: 0

# ceph orch apply osd -i osd-spec.yaml --dry-run
+-+--++--+--+-+
|SERVICE |NAME |HOST|DATA |DB|WAL |
+-+--++--+--+-+
|osd |osd-spec |ceph-osd-3 |/dev/sdd |/dev/sdb |-|
+-+--++--+--+-+
```
It looks as expected.

Then "ceph orch apply osd -i osd-spec.yaml". Here is the log of cephadm.
```
/bin/docker:stderr --> relative data size: 1.0
/bin/docker:stderr --> passed block_db devices: 1 physical, 0 LVM
/bin/docker:stderr Running command: /usr/bin/ceph-authtool --gen-print-key
/bin/docker:stderr Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd tree -f json
/bin/docker:stderr Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new b05c3c90-b7d5-4f13-8a58-f72761c1971b 12
/bin/docker:stderr Running command: /usr/sbin/vgcreate --force --yes ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64 /dev/sdd
/bin/docker:stderr stdout: Physical volume "/dev/sdd" successfully created.
/bin/docker:stderr stdout: Volume group "ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64" successfully created
/bin/docker:stderr Running command: /usr/sbin/lvcreate --yes -l 572318 -n osd-block-b05c3c90-b7d5-4f13-8a58-f72761c1971b ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64
/bin/docker:stderr stderr: Volume group "ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64" has insufficient free space (572317 extents): 572318 required.
/bin/docker:stderr --> Was unable to complete a new OSD, will rollback changes
```
Q1, why is the VG name (ceph-) different from the others (ceph-block-)?
Q2, where is that 572318 from? All HDDs are the same model, and the VG "Total PE" of every HDD is 572317.
Has anyone seen similar issues? Anything I am missing?

Thanks!
Tony