[ceph-users] [rbd mirror] integrity of journal-based image mirror

2024-05-27 Thread Tony Liu
Hi,

Say the source image is being updated and data is continuously mirrored to the
destination. Suddenly, networking on the source goes down, and the destination
is promoted and used to restore the VM. Is that going to cause any FS issues;
for example, would fsck need to be invoked to check and repair the FS? Is there
any integrity check during journal-based mirroring to avoid a "partial" update
caused by the networking issue?

Any insight from developers or experience from users is appreciated.

Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: the image used size becomes 0 after export/import with snapshot

2023-12-04 Thread Tony Liu
Hi Ilya,

That explains it. Thank you for clarification!

Tony

From: Ilya Dryomov 
Sent: December 4, 2023 09:40 AM
To: Tony Liu
Cc: ceph-users@ceph.io; d...@ceph.io
Subject: Re: [ceph-users] the image used size becomes 0 after export/import with snapshot

On Tue, Nov 28, 2023 at 8:18 AM Tony Liu  wrote:
>
> Hi,
>
> I have an image with a snapshot and some changes after snapshot.
> ```
> $ rbd du backup/f0408e1e-06b6-437b-a2b5-70e3751d0a26
> NAME                                                                                PROVISIONED  USED
> f0408e1e-06b6-437b-a2b5-70e3751d0a26@snapshot-eb085877-7557-4620-9c01-c5587b857029       10 GiB  2.4 GiB
> f0408e1e-06b6-437b-a2b5-70e3751d0a26                                                     10 GiB  2.4 GiB
>                                                                                          10 GiB  4.8 GiB
> ```
> If there are no changes after the snapshot, the image line will show 0 used.
>
> I did export and import.
> ```
> $ rbd export --export-format 2 backup/f0408e1e-06b6-437b-a2b5-70e3751d0a26 - | rbd import --export-format 2 - backup/test
> Exporting image: 100% complete...done.
> Importing image: 100% complete...done.
> ```
>
> When checking the imported image, the image line shows 0 used.
> ```
> $ rbd du backup/test
> NAMEPROVISIONED  USED
> test@snapshot-eb085877-7557-4620-9c01-c5587b857029   10 GiB  2.4 GiB
> test 10 GiB  0 B
>   10 GiB  2.4 GiB
> ```
> Any clues how that happened? I'd expect the same du as the source.

Hi Tony,

"rbd import" command does zero detection at 4k granularity by default.
If the "after snapshot" changes just zeroed everything in the snapshot,
such a discrepancy in "rbd du" USED column is expected.
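The zero-detection behavior described here can be sketched roughly as follows. This is a simplified illustration, not the actual librbd code; the chunk size constant and the write callback are assumptions for the sketch:

```python
ZERO_CHUNK = 4096  # import's default zero-detection granularity (4 KiB)

def write_sparse(image_write, data: bytes, offset: int = 0) -> int:
    """Write only the non-zero 4 KiB chunks of `data`; skipped chunks stay
    unallocated, so they don't count toward the "rbd du" USED column."""
    written = 0
    for pos in range(0, len(data), ZERO_CHUNK):
        chunk = data[pos:pos + ZERO_CHUNK]
        if chunk.count(0) != len(chunk):      # chunk has at least one non-zero byte
            image_write(offset + pos, chunk)  # hypothetical write callback
            written += len(chunk)
    return written

# If everything written after the snapshot was zeros, nothing gets
# re-written on import, and the image head shows 0 B used:
allocated = []
write_sparse(lambda off, buf: allocated.append((off, len(buf))), bytes(1 << 20))
print(allocated)  # []
```

So an image head whose post-snapshot changes were all zeros ends up with no allocated chunks after import, which matches the 0 B shown by "rbd du".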

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] the image used size becomes 0 after export/import with snapshot

2023-11-27 Thread Tony Liu
Hi,

I have an image with a snapshot and some changes after snapshot.
```
$ rbd du backup/f0408e1e-06b6-437b-a2b5-70e3751d0a26
NAME                                                                                PROVISIONED  USED
f0408e1e-06b6-437b-a2b5-70e3751d0a26@snapshot-eb085877-7557-4620-9c01-c5587b857029       10 GiB  2.4 GiB
f0408e1e-06b6-437b-a2b5-70e3751d0a26                                                     10 GiB  2.4 GiB
                                                                                         10 GiB  4.8 GiB
```
If there are no changes after the snapshot, the image line will show 0 used.

I did export and import.
```
$ rbd export --export-format 2 backup/f0408e1e-06b6-437b-a2b5-70e3751d0a26 - | rbd import --export-format 2 - backup/test
Exporting image: 100% complete...done.
Importing image: 100% complete...done.
```

When checking the imported image, the image line shows 0 used.
```
$ rbd du backup/test
NAMEPROVISIONED  USED   
test@snapshot-eb085877-7557-4620-9c01-c5587b857029   10 GiB  2.4 GiB
test 10 GiB  0 B
  10 GiB  2.4 GiB
```
Any clues how that happened? I'd expect the same du as the source.

I tried another quick test. It works fine.
```
$ rbd create backup/test-src --size 10G
$ sudo rbd map backup/test-src
/dev/rbd0
$ echo "hello" | sudo tee /dev/rbd0
hello
$ rbd du backup/test-src
NAME  PROVISIONED  USED 
test-src   10 GiB  4 MiB
$ rbd snap create backup/test-src@snap-1
Creating snap: 100% complete...done.
$ rbd du backup/test-src  
NAME PROVISIONED  USED 
test-src@snap-1   10 GiB  4 MiB
test-src  10 GiB0 B
   10 GiB  4 MiB
$ echo "world" | sudo tee /dev/rbd0
world
$ rbd du backup/test-src
NAME PROVISIONED  USED 
test-src@snap-1   10 GiB  4 MiB
test-src  10 GiB  4 MiB
   10 GiB  8 MiB
$ rbd export --export-format 2 backup/test-src - | rbd import --export-format 2 - backup/test-dst
Exporting image: 100% complete...done.
Importing image: 100% complete...done.
$ rbd du backup/test-dst
NAME PROVISIONED  USED 
test-dst@snap-1   10 GiB  4 MiB
test-dst  10 GiB  4 MiB
   10 GiB  8 MiB
```

Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] import/export with --export-format 2

2023-11-25 Thread Tony Liu
Hi,

src-image is 1GB (provisioned size). I did the following 3 tests.

1. rbd export src-image - | rbd import - dst-image
2. rbd export --export-format 2 src-image - | rbd import --export-format 2 - dst-image
3. rbd export --export-format 2 src-image - | rbd import - dst-image

With #1 and #2, the dst-image size (rbd info) is the same as src-image, which is
expected. With #3, the dst-image size (rbd info) is close to the used size (rbd du),
not the provisioned size of src-image. I'm not sure if this image is actually usable
when writing into it.

The question is: is #3 not supposed to be used at all?
I checked the docs and didn't see anything like "--export-format 2 has to be used
for importing an image that was exported with the --export-format 2 option".
Any comments?


Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: easy way to find out the number of allocated objects for a RBD image

2023-11-25 Thread Tony Liu
Thank you Eugen! "rbd du" is it.
The used size from "rbd du" is the object count times the object size.
That's the actual storage taken by the image in the backend.

For export, it actually flattens and also sparsifies the image.
In the case of many small data pieces, the export size is smaller than the du size.
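The two accounting rules above can be written down as a small sketch (illustrative only; the function names are made up): "rbd du" charges a whole object for any object that is touched, while a sparse export only carries the actual data extents.

```python
OBJECT_SIZE = 4 << 20  # default RBD object size: 4 MiB

def du_used_size(extents, object_size=OBJECT_SIZE):
    """'rbd du'-style usage: number of touched objects times object size."""
    touched = set()
    for offset, length in extents:  # written byte ranges within the image
        first = offset // object_size
        last = (offset + length - 1) // object_size
        touched.update(range(first, last + 1))
    return len(touched) * object_size

def export_size(extents):
    """A sparse export carries roughly just the data, i.e. the sum of extents."""
    return sum(length for _, length in extents)

# 100 small 10-byte pieces, each landing in its own 4 MiB object:
pieces = [(i * OBJECT_SIZE, 10) for i in range(100)]
print(du_used_size(pieces) // (1 << 20))  # 400 (MiB charged by "du")
print(export_size(pieces))                # 1000 (bytes of actual data)
```

This is why, with many small scattered writes, the export can be far smaller than the du size.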

 
Thanks!
Tony

From: Eugen Block 
Sent: November 25, 2023 12:17 AM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: easy way to find out the number of allocated objects for a RBD image

Maybe I misunderstand, but isn't "rbd du" what you're looking for?

Zitat von Tony Liu :

> Hi,
>
> Other than getting all the objects in the pool and filtering by image ID,
> is there any easier way to get the number of allocated objects for
> an RBD image?
>
> What I really want to know is the actual usage of an image.
> An allocated object could be used partially, but that's fine;
> it doesn't need to be 100% accurate. Getting the object count and
> multiplying by the object size should be sufficient.
>
> "rbd export" exports the actual used data, but getting the actual usage
> by exporting the image seems like too much. This brings up another
> question: is there any way to know the export size before running it?
>
>
> Thanks!
> Tony
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] understand "extent"

2023-11-24 Thread Tony Liu
Hi,

The context is RBD on BlueStore. I did check "extent" on the wiki.
I see "extent" when talking about snapshots and export/import.
For example, when creating a snapshot, we mark extents. When
there is a write to marked extents, we make a copy.
I also know that user data on a block device maps to objects.
How are "extent" and "object" related?
Can I say an extent is a set of contiguous objects (with default stripe settings)?
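One way to look at the relationship (a sketch under the assumption of default striping, where the stripe unit equals the object size): an extent is an arbitrary contiguous byte range of the image, and it maps onto objects simply by dividing offsets by the object size. So an extent is not "a set of objects"; it may cover part of one object or span several.

```python
OBJECT_SIZE = 4 << 20  # default order 22: 4 MiB objects

def objects_for_extent(offset: int, length: int, object_size: int = OBJECT_SIZE):
    """Object numbers that an extent (a contiguous byte range of the image)
    touches, assuming default striping (stripe unit == object size)."""
    if length <= 0:
        return []
    first = offset // object_size
    last = (offset + length - 1) // object_size
    return list(range(first, last + 1))

# A 6 MiB extent starting at 3 MiB spans parts of objects 0, 1 and 2:
print(objects_for_extent(3 << 20, 6 << 20))  # [0, 1, 2]
```

With non-default striping the mapping is more involved (stripe units round-robin across a stripe count of objects), but the basic idea that extents are byte ranges and objects are fixed-size backing chunks stays the same.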


Thanks!
Tony


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] easy way to find out the number of allocated objects for a RBD image

2023-11-24 Thread Tony Liu
Hi,

Other than getting all the objects in the pool and filtering by image ID,
is there any easier way to get the number of allocated objects for
an RBD image?

What I really want to know is the actual usage of an image.
An allocated object could be used partially, but that's fine;
it doesn't need to be 100% accurate. Getting the object count and
multiplying by the object size should be sufficient.

"rbd export" exports the actual used data, but getting the actual usage
by exporting the image seems like too much. This brings up another
question: is there any way to know the export size before running it?
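As for estimating the export size up front, one rough approach is to sum the data extents that "rbd diff" reports for the image. A sketch only; the three-column "offset length type" output format is an assumption, and non-matching lines (e.g. a header) are simply skipped:

```python
def estimate_export_size(diff_lines):
    """Sum the lengths of "data" extents from `rbd diff pool/image` output.

    Assumes lines of the form "<offset> <length> <data|zero>"; anything
    else is ignored.
    """
    total = 0
    for line in diff_lines:
        parts = line.split()
        if len(parts) == 3 and parts[2] == "data" and parts[1].isdigit():
            total += int(parts[1])
    return total

sample = [
    "0 4194304 data",
    "8388608 4194304 data",
    "12582912 4194304 zero",
]
print(estimate_export_size(sample))  # 8388608
```

Note this only estimates the data payload; an actual export (especially with --export-format 2) also carries headers and snapshot records, so the real file will be somewhat larger.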


Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd export-diff/import-diff hangs

2023-08-28 Thread Tony Liu
Figured it out. It's not an rbd issue. Sorry for the false alarm.

Thanks!
Tony

From: Tony Liu 
Sent: August 27, 2023 08:19 PM
To: Eugen Block; ceph-users@ceph.io
Subject: [ceph-users] Re: rbd export-diff/import-diff hangs

It's an export-diff from an in-use image; both the from-snapshot and to-snapshot exist.
The same from-snapshot exists in the import image; it is the to-snapshot from the last diff.
export/import is used for local backup; rbd-mirroring is used for remote backup.
I'm looking for options to get more info to troubleshoot.


Thanks!
Tony

From: Eugen Block 
Sent: August 27, 2023 11:53 AM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: rbd export-diff/import-diff hangs

You mean the image is in use while you're exporting? Have you thought
about creating snapshots and exporting those? Or setting up rbd mirroring?

Zitat von Tony Liu :

> To update: the hanging happens when updating the local image, not the remote one,
> so networking is not a concern here. Any advice on how to look into it?
>
> Thanks!
> Tony
> ________
> From: Tony Liu 
> Sent: August 26, 2023 10:43 PM
> To: d...@ceph.io; ceph-users@ceph.io
> Subject: [ceph-users] rbd export-diff/import-diff hangs
>
> Hi,
>
> I'm using rbd import and export to copy an image from one cluster to another,
> and import-diff and export-diff to update the image in the remote cluster.
> For example: "rbd --cluster local export-diff ... | rbd --cluster
> remote import-diff ...".
> Sometimes the whole command gets stuck, and I can't tell which end of
> the pipe it's stuck on.
> I did some searching; [1] seems to be the same issue and [2] is also related.
>
> I wonder if there is any way to identify where it's stuck and get more
> debugging info.
> Given [2], I'd suspect import-diff is stuck, because the rbd client is
> importing to the remote cluster. Could networking latency be involved here?
> Ping latency is 7~8 ms.
>
> Any comments are appreciated!
>
> [1] https://bugs.launchpad.net/cinder/+bug/2031897
> [2] https://stackoverflow.com/questions/69858763/ceph-rbd-import-hangs
>
> Thanks!
> Tony
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd export-diff/import-diff hangs

2023-08-27 Thread Tony Liu
It's an export-diff from an in-use image; both the from-snapshot and to-snapshot exist.
The same from-snapshot exists in the import image; it is the to-snapshot from the last diff.
export/import is used for local backup; rbd-mirroring is used for remote backup.
I'm looking for options to get more info to troubleshoot.


Thanks!
Tony

From: Eugen Block 
Sent: August 27, 2023 11:53 AM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: rbd export-diff/import-diff hangs

You mean the image is in use while you're exporting? Have you thought
about creating snapshots and exporting those? Or setting up rbd mirroring?

Zitat von Tony Liu :

> To update: the hanging happens when updating the local image, not the remote one,
> so networking is not a concern here. Any advice on how to look into it?
>
> Thanks!
> Tony
> ________
> From: Tony Liu 
> Sent: August 26, 2023 10:43 PM
> To: d...@ceph.io; ceph-users@ceph.io
> Subject: [ceph-users] rbd export-diff/import-diff hangs
>
> Hi,
>
> I'm using rbd import and export to copy an image from one cluster to another,
> and import-diff and export-diff to update the image in the remote cluster.
> For example: "rbd --cluster local export-diff ... | rbd --cluster
> remote import-diff ...".
> Sometimes the whole command gets stuck, and I can't tell which end of
> the pipe it's stuck on.
> I did some searching; [1] seems to be the same issue and [2] is also related.
>
> I wonder if there is any way to identify where it's stuck and get more
> debugging info.
> Given [2], I'd suspect import-diff is stuck, because the rbd client is
> importing to the remote cluster. Could networking latency be involved here?
> Ping latency is 7~8 ms.
>
> Any comments are appreciated!
>
> [1] https://bugs.launchpad.net/cinder/+bug/2031897
> [2] https://stackoverflow.com/questions/69858763/ceph-rbd-import-hangs
>
> Thanks!
> Tony
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd export with export-format 2 exports all snapshots?

2023-08-27 Thread Tony Liu
Thank you Alex for confirmation!

Tony

From: Alex Gorbachev 
Sent: August 27, 2023 05:29 PM
To: Tony Liu
Cc: d...@ceph.io; ceph-users@ceph.io
Subject: Re: [ceph-users] rbd export with export-format 2 exports all snapshots?

Tony,

From what I recall having worked with snapshots a while ago, you would want export-diff to achieve a differential export. "export" will always go for a full image.

--
Alex Gorbachev
https://alextelescope.blogspot.com



On Sun, Aug 27, 2023 at 8:03 PM Tony Liu <tonyliu0...@hotmail.com> wrote:
Hi,

Say the source image has snapshots s1, s2 and s3.

I expect "export" to behave the same as "deep cp": when specifying a snapshot
with "--export-format 2", only the specified snapshot and the snapshots
earlier than it would be exported.

What I see is that, no matter which snapshot I specify, "export" with
"--export-format 2" always exports the whole image with all snapshots.
Is this expected?

Could anyone help to clarify?


Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] rbd export with export-format 2 exports all snapshots?

2023-08-27 Thread Tony Liu
Hi,

Say the source image has snapshots s1, s2 and s3.

I expect "export" to behave the same as "deep cp": when specifying a snapshot
with "--export-format 2", only the specified snapshot and the snapshots
earlier than it would be exported.

What I see is that, no matter which snapshot I specify, "export" with
"--export-format 2" always exports the whole image with all snapshots.
Is this expected?

Could anyone help to clarify?


Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd export-diff/import-diff hangs

2023-08-27 Thread Tony Liu
To update: the hanging happens when updating the local image, not the remote one,
so networking is not a concern here. Any advice on how to look into it?

Thanks!
Tony

From: Tony Liu 
Sent: August 26, 2023 10:43 PM
To: d...@ceph.io; ceph-users@ceph.io
Subject: [ceph-users] rbd export-diff/import-diff hangs

Hi,

I'm using rbd import and export to copy an image from one cluster to another,
and import-diff and export-diff to update the image in the remote cluster.
For example: "rbd --cluster local export-diff ... | rbd --cluster remote import-diff ...".
Sometimes the whole command gets stuck, and I can't tell which end of the pipe
it's stuck on.
I did some searching; [1] seems to be the same issue and [2] is also related.

I wonder if there is any way to identify where it's stuck and get more debugging info.
Given [2], I'd suspect import-diff is stuck, because the rbd client is importing
to the remote cluster. Could networking latency be involved here? Ping latency
is 7~8 ms.

Any comments are appreciated!

[1] https://bugs.launchpad.net/cinder/+bug/2031897
[2] https://stackoverflow.com/questions/69858763/ceph-rbd-import-hangs

Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] rbd export-diff/import-diff hangs

2023-08-26 Thread Tony Liu
Hi,

I'm using rbd import and export to copy an image from one cluster to another,
and import-diff and export-diff to update the image in the remote cluster.
For example: "rbd --cluster local export-diff ... | rbd --cluster remote import-diff ...".
Sometimes the whole command gets stuck, and I can't tell which end of the pipe
it's stuck on.
I did some searching; [1] seems to be the same issue and [2] is also related.

I wonder if there is any way to identify where it's stuck and get more debugging info.
Given [2], I'd suspect import-diff is stuck, because the rbd client is importing
to the remote cluster. Could networking latency be involved here? Ping latency
is 7~8 ms.

Any comments are appreciated!

[1] https://bugs.launchpad.net/cinder/+bug/2031897
[2] https://stackoverflow.com/questions/69858763/ceph-rbd-import-hangs
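One way to narrow down which end of such a pipe is stuck is to replace the shell pipe with a small relay that counts the bytes flowing through it. This is only a sketch (the rbd command lines in the trailing comment are illustrative, not verified); if the relay blocks in read(), the producer side is stuck, and if it blocks in write(), the consumer side is:

```python
import subprocess
import sys
import time

def pipe_with_progress(producer_cmd, consumer_cmd, chunk=1 << 20):
    """Relay producer stdout -> consumer stdin, counting bytes as they move.

    Attaching strace or py-spy to this process shows whether it is blocked
    in read() (producer stuck) or write() (consumer stuck).
    """
    prod = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
    cons = subprocess.Popen(consumer_cmd, stdin=subprocess.PIPE)
    total = 0
    start = time.monotonic()
    while True:
        data = prod.stdout.read(chunk)   # hangs here => producer is stuck
        if not data:
            break
        cons.stdin.write(data)           # hangs here => consumer is stuck
        total += len(data)
        print(f"{total} bytes after {time.monotonic() - start:.1f}s",
              file=sys.stderr)
    cons.stdin.close()
    prod.wait()
    cons.wait()
    return total

# Hypothetical usage (command lines are only illustrative):
# pipe_with_progress(
#     ["rbd", "--cluster", "local", "export-diff", "--from-snap", "s1",
#      "pool/img@s2", "-"],
#     ["rbd", "--cluster", "remote", "import-diff", "-", "pool/img"])
```

The progress lines on stderr also tell you whether data is still trickling (slow but alive) or has stopped entirely.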

Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: snapshot timestamp

2023-08-04 Thread Tony Liu
Thank you Ilya for confirmation!

Tony

From: Ilya Dryomov 
Sent: August 4, 2023 04:51 AM
To: Tony Liu
Cc: d...@ceph.io; ceph-users@ceph.io
Subject: Re: [ceph-users] snapshot timestamp

On Fri, Aug 4, 2023 at 7:49 AM Tony Liu  wrote:
>
> Hi,
>
> We know a snapshot is at a point in time. Is this point in time tracked internally
> by some sort of sequence number, the timestamp shown by "snap ls", or something else?

Hi Tony,

The timestamp in "rbd snap ls" output is the snapshot creation
timestamp.

>
> I noticed that with "deep cp", the timestamps of all snapshots are changed to the copy time.

Correct -- exactly the same as the image creation timestamp (visible in
"rbd info" output).

> Say I create a snapshot at 1 PM and make a copy at 3 PM; the timestamp of the
> snapshot in the copy is 3 PM. If I roll back the copy to this snapshot, I'd
> assume it will actually bring me back to the state at 1 PM. Is that correct?

Correct.

>
> If the above is true, I won't be able to rely on timestamps to track snapshots.
>
> Say I create a snapshot every hour and make a backup by copy at the end of the day.
> Then the original image is damaged and the backup is used to restore the work. On
> this backup image, how do I know which snapshot was at 1 PM, which was at 2 PM, etc.?
> Any advice on how to track snapshots properly in such a case?

I would suggest embedding that info along with any additional metadata
needed in the snapshot name.
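One way to embed the creation time in the snapshot name is a fixed naming scheme like the following. This is just a sketch; the prefix and timestamp format are made up, and any scheme that survives "deep cp" (which resets snapshot timestamps to the copy time) works:

```python
from datetime import datetime, timezone

PREFIX = "backup"

def snap_name(ts: datetime) -> str:
    """Embed the creation time in the snapshot name so it survives a copy."""
    return f"{PREFIX}-{ts.strftime('%Y%m%dT%H%M%SZ')}"

def snap_time(name: str) -> datetime:
    """Recover the original creation time from the snapshot name."""
    stamp = name.split("-", 1)[1]
    return datetime.strptime(stamp, "%Y%m%dT%H%M%SZ").replace(tzinfo=timezone.utc)

one_pm = datetime(2023, 8, 3, 13, 0, 0, tzinfo=timezone.utc)
name = snap_name(one_pm)
print(name)  # backup-20230803T130000Z
assert snap_time(name) == one_pm
```

The snapshot would then be created as, e.g., "rbd snap create pool/image@backup-20230803T130000Z", and the hourly schedule can be reconstructed from the names alone on any copy of the image.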

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: What's the max of snap ID?

2023-08-04 Thread Tony Liu
Thank you Eugen and Nathan!
uint64 is big enough, no concerns any more.

Tony

From: Nathan Fish 
Sent: August 4, 2023 04:19 AM
To: Eugen Block
Cc: ceph-users@ceph.io
Subject: [ceph-users] Re: What's the max of snap ID?

2^64 bytes in petabytes

= 18446.744073709551616 PB

Assuming that a snapshot requires storing any data at all, which it
must, nobody has a Ceph cluster that could store that much snapshot
metadata, even for empty snapshots.
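The point can also be checked with quick arithmetic: even allocating snap IDs absurdly fast, a uint64 ID space effectively never wraps. Back-of-the-envelope only; the allocation rate below is made up:

```python
MAX_IDS = 2 ** 64  # snap_id is a uint64_t

SECONDS_PER_YEAR = 365 * 24 * 3600
rate = 1_000_000  # assume an absurd one million snapshots per second
years = MAX_IDS / (rate * SECONDS_PER_YEAR)
print(round(years))  # roughly 585,000 years to exhaust the ID space
```

At any realistic snapshot rate the exhaustion horizon is many orders of magnitude longer still.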

On Fri, Aug 4, 2023 at 7:05 AM Eugen Block  wrote:
>
> I'm no programmer, but if I understand [1] correctly it's an unsigned
> 64-bit integer:
>
> >  int ImageCtx::snap_set(uint64_t in_snap_id) {
>
> which means the max snap_id should be:
>
> 2^64 - 1 = 18446744073709551615
>
> Not sure if you can get your cluster to reach that limit, but I also
> don't know what would happen if you actually would reach it. I also
> might be misunderstanding, so maybe someone with more knowledge can
> confirm or correct me.
>
> [1] https://github.com/ceph/ceph/blob/main/src/librbd/ImageCtx.cc#L328
>
> Zitat von Tony Liu :
>
> > Hi,
> >
> > There is a snap ID for each snapshot. How is this ID allocated, sequentially?
> > From some tests, it seems this ID is per pool, starting from 4 and always going up.
> > Is that correct?
> > What's the max of this ID?
> > What's going to happen when the ID reaches the max? Does it go back to starting
> > from 4 again?
> >
> >
> > Thanks!
> > Tony
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] snapshot timestamp

2023-08-03 Thread Tony Liu
Hi,

We know a snapshot is at a point in time. Is this point in time tracked internally
by some sort of sequence number, the timestamp shown by "snap ls", or something else?

I noticed that with "deep cp", the timestamps of all snapshots are changed to the copy time.
Say I create a snapshot at 1 PM and make a copy at 3 PM; the timestamp of the snapshot
in the copy is 3 PM. If I roll back the copy to this snapshot, I'd assume it will
actually bring me back to the state at 1 PM. Is that correct?

If the above is true, I won't be able to rely on timestamps to track snapshots.

Say I create a snapshot every hour and make a backup by copy at the end of the day.
Then the original image is damaged and the backup is used to restore the work. On
this backup image, how do I know which snapshot was at 1 PM, which was at 2 PM, etc.?
Any advice on how to track snapshots properly in such a case?

I can definitely build something else to help with this, but I'd like to know how
much Ceph can support it.


Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] What's the max of snap ID?

2023-08-03 Thread Tony Liu
Hi,

There is a snap ID for each snapshot. How is this ID allocated, sequentially?
From some tests, it seems this ID is per pool, starting from 4 and always going up.
Is that correct?
What's the max of this ID?
What's going to happen when the ID reaches the max? Does it go back to starting
from 4 again?


Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [rbd-mirror] can't enable journal-based image mirroring

2023-07-31 Thread Tony Liu
In case the image has parent, the parent image also needs to be mirrored.
After enabling the mirroring on parent image, it works as expected.


Thanks!
Tony

From: Tony Liu 
Sent: July 31, 2023 08:13 AM
To: ceph-users@ceph.io; d...@ceph.io
Subject: [ceph-users] [rbd-mirror] can't enable journal-based image mirroring

Hi,

The Ceph cluster is with Pacific v16.2.10.
"rbd mirror image enable  journal" seems not working.
Any clues what I'm missing? There is no error messages from the CLI.
Any way to troubleshooting?

```
# rbd mirror pool info volume-ssd
Mode: image
Site Name: 35d050c0-77c0-11eb-9242-2cea7ff9d07c

Peer Sites:

UUID: 86eacc0f-6657-4742-8daf-2942ea23affd
Name: qa
Mirror UUID:
Direction: rx-tx
Client: client.infra

# rbd feature enable volume-ssd/volume-aceee005-265e-44ea-a591-b6dda639a76b journaling

# rbd mirror image enable volume-ssd/volume-aceee005-265e-44ea-a591-b6dda639a76b journal

# rbd info volume-ssd/volume-aceee005-265e-44ea-a591-b6dda639a76b
rbd image 'volume-aceee005-265e-44ea-a591-b6dda639a76b':
size 40 GiB in 10240 objects
order 22 (4 MiB objects)
snapshot_count: 0
id: e6fce674350e55
block_name_prefix: rbd_data.e6fce674350e55
format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten, journaling
op_features:
flags:
create_timestamp: Sun Jul 30 22:21:09 2023
access_timestamp: Mon Jul 31 00:36:20 2023
modify_timestamp: Mon Jul 31 08:01:26 2023
parent: image/801d1850-de2f-443b-a30c-71966e90c118@snap
overlap: 10 GiB
journal: e6fce674350e55
mirroring state: disabled

# rbd mirror image status volume-ssd/volume-aceee005-265e-44ea-a591-b6dda639a76b
rbd: mirroring not enabled on the image
```

Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] [rbd-mirror] can't enable journal-based image mirroring

2023-07-31 Thread Tony Liu
Hi,

The Ceph cluster is with Pacific v16.2.10.
"rbd mirror image enable  journal" seems not working.
Any clues what I'm missing? There is no error messages from the CLI.
Any way to troubleshooting?

```
# rbd mirror pool info volume-ssd
Mode: image
Site Name: 35d050c0-77c0-11eb-9242-2cea7ff9d07c

Peer Sites: 

UUID: 86eacc0f-6657-4742-8daf-2942ea23affd
Name: qa
Mirror UUID: 
Direction: rx-tx
Client: client.infra

# rbd feature enable volume-ssd/volume-aceee005-265e-44ea-a591-b6dda639a76b journaling

# rbd mirror image enable volume-ssd/volume-aceee005-265e-44ea-a591-b6dda639a76b journal

# rbd info volume-ssd/volume-aceee005-265e-44ea-a591-b6dda639a76b
rbd image 'volume-aceee005-265e-44ea-a591-b6dda639a76b':
size 40 GiB in 10240 objects
order 22 (4 MiB objects)
snapshot_count: 0
id: e6fce674350e55
block_name_prefix: rbd_data.e6fce674350e55
format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten, journaling
op_features: 
flags: 
create_timestamp: Sun Jul 30 22:21:09 2023
access_timestamp: Mon Jul 31 00:36:20 2023
modify_timestamp: Mon Jul 31 08:01:26 2023
parent: image/801d1850-de2f-443b-a30c-71966e90c118@snap
overlap: 10 GiB
journal: e6fce674350e55
mirroring state: disabled

# rbd mirror image status volume-ssd/volume-aceee005-265e-44ea-a591-b6dda639a76b
rbd: mirroring not enabled on the image
```

Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: configure rgw

2023-07-30 Thread Tony Liu
When deploying the rgw container, cephadm adds rgw_frontends into the config db at
the daemon level, while I was adding settings at the node level. That's why I didn't
see my setting take effect. I need to set rgw_frontends at the daemon level after
deployment.

Thanks!
Tony

From: Tony Liu 
Sent: July 29, 2023 11:44 PM
To: ceph-users@ceph.io; d...@ceph.io
Subject: [ceph-users] Re: configure rgw

A few updates.
1. "radosgw-admin --show-config -n client.rgw.qa.ceph-1.hzfrwq" doesn't show the actual running config.

2. "ceph --admin-daemon /var/run/ceph/69f94d08-2811-11ee-9ab1-089204adfafa/ceph-client.rgw.qa.ceph-1.hzfrwq.7.94254493095368.asok config show" shows the actual running config.

3. All settings in client.rgw are applied to the rgw running config, except for rgw_frontends.
```
# ceph config get client.rgw rgw_frontends
beast port=8086
# ceph --admin-daemon /var/run/ceph/69f94d08-2811-11ee-9ab1-089204adfafa/ceph-client.rgw.qa.ceph-1.hzfrwq.7.94254493095368.asok config get rgw_frontends
{
"rgw_frontends": "beast endpoint=10.250.80.100:80"
}
```
The only place I see "10.250.80.100" and "80" is unit.meta. How is that applied?

I found a workaround: remove rgw_frontends from the config and restart rgw, and
rgw_frontends goes back to the default "port=7480"; add it back to the config and
restart rgw, and now rgw_frontends is what I expect. The logic doesn't make much
sense to me. I'd assume that unit.meta has something to do with this; hopefully
someone can shed light here.


Thanks!
Tony


From: Tony Liu 
Sent: July 29, 2023 10:40 PM
To: ceph-users@ceph.io; d...@ceph.io
Subject: [ceph-users] configure rgw

Hi,

I'm using Pacific v16.2.10 container image, deployed by cephadm.
I used to manually build the config file for rgw, deploy rgw, put the config file
in place, and restart rgw. That works fine.

Now I'd like to put the rgw config into the config db. I tried with client.rgw,
but the config is not taken by rgw. Also, "config show" doesn't work; it always
says "no config state".

```
# ceph orch ps | grep rgw
rgw.qa.ceph-1.hzfrwq  ceph-1  10.250.80.100:80  running (10m)  10m ago  53m  51.4M  -  16.2.10  32214388de9d  13169a213bc5
# ceph config get client.rgw | grep frontends
client.rgw  basic  rgw_frontends  beast port=8086  *
# ceph config show rgw.qa.ceph-1.hzfrwq
Error ENOENT: no config state for daemon rgw.qa.ceph-1.hzfrwq
# ceph config show client.rgw.qa.ceph-1.hzfrwq
Error ENOENT: no config state for daemon client.rgw.qa.ceph-1.hzfrwq
# radosgw-admin --show-config -n client.rgw.qa.ceph-1.hzfrwq | grep frontends
rgw_frontends = beast port=7480
```

Any clues what I am missing here?


Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: configure rgw

2023-07-30 Thread Tony Liu
A few updates.
1. "radosgw-admin --show-config -n client.rgw.qa.ceph-1.hzfrwq" doesn't show the actual running config.

2. "ceph --admin-daemon /var/run/ceph/69f94d08-2811-11ee-9ab1-089204adfafa/ceph-client.rgw.qa.ceph-1.hzfrwq.7.94254493095368.asok config show" shows the actual running config.

3. All settings in client.rgw are applied to the rgw running config, except for rgw_frontends.
```
# ceph config get client.rgw rgw_frontends
beast port=8086
# ceph --admin-daemon /var/run/ceph/69f94d08-2811-11ee-9ab1-089204adfafa/ceph-client.rgw.qa.ceph-1.hzfrwq.7.94254493095368.asok config get rgw_frontends
{
"rgw_frontends": "beast endpoint=10.250.80.100:80"
}
```
The only place I see "10.250.80.100" and "80" is unit.meta. How is that applied?

I found a workaround: remove rgw_frontends from the config and restart rgw, and
rgw_frontends goes back to the default "port=7480"; add it back to the config and
restart rgw, and now rgw_frontends is what I expect. The logic doesn't make much
sense to me. I'd assume that unit.meta has something to do with this; hopefully
someone can shed light here.


Thanks!
Tony


From: Tony Liu 
Sent: July 29, 2023 10:40 PM
To: ceph-users@ceph.io; d...@ceph.io
Subject: [ceph-users] configure rgw

Hi,

I'm using Pacific v16.2.10 container image, deployed by cephadm.
I used to manually build the config file for rgw, deploy rgw, put the config file
in place, and restart rgw. That works fine.

Now I'd like to put the rgw config into the config db. I tried with client.rgw,
but the config is not taken by rgw. Also, "config show" doesn't work; it always
says "no config state".

```
# ceph orch ps | grep rgw
rgw.qa.ceph-1.hzfrwq  ceph-1  10.250.80.100:80  running (10m)  10m ago  53m  51.4M  -  16.2.10  32214388de9d  13169a213bc5
# ceph config get client.rgw | grep frontends
client.rgw  basic  rgw_frontends  beast port=8086  *
# ceph config show rgw.qa.ceph-1.hzfrwq
Error ENOENT: no config state for daemon rgw.qa.ceph-1.hzfrwq
# ceph config show client.rgw.qa.ceph-1.hzfrwq
Error ENOENT: no config state for daemon client.rgw.qa.ceph-1.hzfrwq
# radosgw-admin --show-config -n client.rgw.qa.ceph-1.hzfrwq | grep frontends
rgw_frontends = beast port=7480
```

Any clues what I am missing here?


Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] configure rgw

2023-07-29 Thread Tony Liu
Hi,

I'm using Pacific v16.2.10 container image, deployed by cephadm.
I used to manually build a config file for rgw, deploy rgw, put the config file in 
place and restart rgw. That works fine.

Now I'd like to put the rgw config into the config DB. I tried with client.rgw, but 
the config is not picked up by rgw. Also, "config show" doesn't work; it always says 
"no config state".

```
# ceph orch ps | grep rgw
rgw.qa.ceph-1.hzfrwq  ceph-1  10.250.80.100:80 running (10m)10m ago 
 53m51.4M-  16.2.10  32214388de9d  13169a213bc5  
# ceph config get client.rgw | grep frontends
client.rgwbasic rgw_frontendsbeast port=8086

  * 
# ceph config show rgw.qa.ceph-1.hzfrwq
Error ENOENT: no config state for daemon rgw.qa.ceph-1.hzfrwq
# ceph config show client.rgw.qa.ceph-1.hzfrwq
Error ENOENT: no config state for daemon client.rgw.qa.ceph-1.hzfrwq
# radosgw-admin --show-config -n client.rgw.qa.ceph-1.hzfrwq | grep frontends
rgw_frontends = beast port=7480
```

Any clues what I am missing here?


Thanks!
Tony


[ceph-users] Re: resume RBD mirror on another host

2023-07-13 Thread Tony Liu
Super! Thanks Ilya!

Tony

From: Ilya Dryomov 
Sent: July 13, 2023 01:30 PM
To: Tony Liu
Cc: d...@ceph.io; ceph-users@ceph.io
Subject: Re: [ceph-users] resume RBD mirror on another host

On Thu, Jul 13, 2023 at 10:23 PM Ilya Dryomov  wrote:
>
> On Thu, Jul 13, 2023 at 6:16 PM Tony Liu  wrote:
> >
> > Hi,
> >
> > How does RBD mirror track mirroring progress? Is it on local storage?
> > Say RBD mirror is running on host-1; when host-1 goes down,
> > RBD mirror is started on host-2. In that case, will the RBD mirror on host-2
> > continue the mirroring?
>
> Hi Tony,
>
> No, it's tracked on the RBD image itself -- meaning in RADOS.

To be clear, the "no" was meant to answer the "is it on local
storage?" part of the question.

To answer the second question explicitly: yes, rbd-mirror daemon on
host-2 will continue mirroring from where rbd-mirror daemon on host-1
left off.
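The behavior described here (mirroring state living with the image in RADOS, not with the daemon) can be illustrated with a toy model. This is a conceptual sketch with invented class and field names, not the real librbd journal format:

```python
# Toy model: the replay position is stored next to the journal itself,
# so any daemon (on any host) can resume where the previous one stopped.
class Journal:
    def __init__(self):
        self.entries = []   # appended by the primary image
        self.committed = 0  # replay position, persisted with the journal

class MirrorDaemon:
    def __init__(self, name):
        self.name = name    # the daemon itself holds no replay state

    def replay(self, journal, batch=2):
        # Resume from the shared committed position, not from local state.
        start = journal.committed
        end = min(start + batch, len(journal.entries))
        applied = journal.entries[start:end]
        journal.committed = end
        return applied

journal = Journal()
journal.entries = ["write-1", "write-2", "write-3", "write-4"]

host1 = MirrorDaemon("host-1")
print(host1.replay(journal))   # ['write-1', 'write-2']

host2 = MirrorDaemon("host-2")  # host-1 dies; host-2 takes over
print(host2.replay(journal))   # ['write-3', 'write-4'] - nothing repeated
```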

Thanks,

Ilya


[ceph-users] resume RBD mirror on another host

2023-07-13 Thread Tony Liu
Hi,

How does RBD mirror track mirroring progress? Is it on local storage?
Say RBD mirror is running on host-1; when host-1 goes down,
RBD mirror is started on host-2. In that case, will the RBD mirror on host-2
continue the mirroring?


Thanks!
Tony


[ceph-users] librbd Python asyncio

2023-07-09 Thread Tony Liu
Hi,

Wondering if there is a version of the librbd Python bindings that supports
asyncio, or any plan to add one?
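For what it's worth, a common workaround is to push the blocking librbd calls into a thread executor so the event loop stays responsive. Below is a minimal stdlib-only sketch; `blocking_read` stands in for a real synchronous call such as `rbd.Image.read()`, which would need an actual cluster connection:

```python
import asyncio

def blocking_read(offset, length):
    # Stand-in for a synchronous librbd call such as Image.read().
    return b"x" * length

async def read_async(offset, length):
    # Run the blocking call in the default thread pool so the event
    # loop can keep servicing other tasks in the meantime.
    return await asyncio.to_thread(blocking_read, offset, length)

async def main():
    # Issue several reads concurrently.
    chunks = await asyncio.gather(
        *(read_async(i * 4096, 4096) for i in range(4)))
    return sum(len(c) for c in chunks)

total = asyncio.run(main())
print(total)  # 16384
```

asyncio.to_thread() requires Python 3.9+; on older versions, loop.run_in_executor() does the same job.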


Thanks!
Tony


[ceph-users] Re: import OSD after host OS reinstallation

2023-04-28 Thread Tony Liu
Thank you Eugen for looking into it!
In short, it works. I'm using 16.2.10.
What I did wrong was to remove the OSD, which makes no sense.

Tony

From: Eugen Block 
Sent: April 28, 2023 06:46 AM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: import OSD after host OS reinstallation

I chatted with Mykola who helped me get the OSDs back up. My test
cluster was on 16.2.5 (and still mostly is), after upgrading only the
MGRs to a more recent version (16.2.10) the activate command worked
successfully and the existing OSDs got back up. Not sure if that's a
bug or something else, but which exact versions are you using?

Zitat von Eugen Block :

> I found a small two-node cluster to test this on pacific, I can
> reproduce it. After reinstalling the host (VM) most of the other
> services are redeployed (mon, mgr, mds, crash), but not the OSDs. I
> will take a closer look.
>
> Zitat von Tony Liu :
>
>> Tried [1] already, but got error.
>> Created no osd(s) on host ceph-4; already created?
>>
>> The error is from [2] in deploy_osd_daemons_for_existing_osds().
>>
>> Not sure what's missing.
>> Should the OSD be removed, removed with --replace, or left untouched
>> before host reinstallation?
>>
>> [1]
>> https://docs.ceph.com/en/pacific/cephadm/services/osd/#activate-existing-osds
>> [2]
>> https://github.com/ceph/ceph/blob/0a5b3b373b8a5ba3081f1f110cec24d82299cac8/src/pybind/mgr/cephadm/services/osd.py#L196
>>
>> Thanks!
>> Tony
>> 
>> From: Tony Liu 
>> Sent: April 27, 2023 10:20 PM
>> To: ceph-users@ceph.io; d...@ceph.io
>> Subject: [ceph-users] import OSD after host OS reinstallation
>>
>> Hi,
>>
>> The cluster is with Pacific and deployed by cephadm on container.
>> The case is to import OSDs after host OS reinstallation.
>> All OSDs are SSDs with DB/WAL and data colocated.
>> Did some research, but was not able to find a working solution.
>> Wondering if anyone has experience with this?
>> What needs to be done before host OS reinstallation and what's after?
>>
>>
>> Thanks!
>> Tony




[ceph-users] Re: import OSD after host OS reinstallation

2023-04-27 Thread Tony Liu
Tried [1] already, but got error.
Created no osd(s) on host ceph-4; already created?

The error is from [2] in deploy_osd_daemons_for_existing_osds().

Not sure what's missing.
Should the OSD be removed, removed with --replace, or left untouched before host 
reinstallation?

[1] 
https://docs.ceph.com/en/pacific/cephadm/services/osd/#activate-existing-osds
[2] 
https://github.com/ceph/ceph/blob/0a5b3b373b8a5ba3081f1f110cec24d82299cac8/src/pybind/mgr/cephadm/services/osd.py#L196

Thanks!
Tony

From: Tony Liu 
Sent: April 27, 2023 10:20 PM
To: ceph-users@ceph.io; d...@ceph.io
Subject: [ceph-users] import OSD after host OS reinstallation

Hi,

The cluster is with Pacific and deployed by cephadm on container.
The case is to import OSDs after host OS reinstallation.
All OSDs are SSDs with DB/WAL and data colocated.
Did some research, but was not able to find a working solution.
Wondering if anyone has experience with this?
What needs to be done before host OS reinstallation and what's after?


Thanks!
Tony


[ceph-users] import OSD after host OS reinstallation

2023-04-27 Thread Tony Liu
Hi,

The cluster is with Pacific and deployed by cephadm on container.
The case is to import OSDs after host OS reinstallation.
All OSDs are SSDs with DB/WAL and data colocated.
Did some research, but was not able to find a working solution.
Wondering if anyone has experience with this?
What needs to be done before host OS reinstallation and what's after?


Thanks!
Tony


[ceph-users] Re: rbd cp vs. rbd clone + rbd flatten

2023-03-27 Thread Tony Liu
Thank you Ilya!

Tony

From: Ilya Dryomov 
Sent: March 27, 2023 10:28 AM
To: Tony Liu
Cc: ceph-users@ceph.io; d...@ceph.io
Subject: Re: [ceph-users] rbd cp vs. rbd clone + rbd flatten

On Wed, Mar 22, 2023 at 10:51 PM Tony Liu  wrote:
>
> Hi,
>
> I want
> 1) copy a snapshot to an image,
> 2) no need to copy snapshots,
> 3) no dependency after copy,
> 4) all same image format 2.
> In that case, is rbd cp the same as rbd clone + rbd flatten?
> I ran some tests and it seems so, but want to confirm, in case I'm missing
> anything.

Hi Tony,

Yes, at a high level it should be the same.

> Also, it seems cp is a bit faster than clone + flatten; is that true?

I can't think of anything that would make "rbd cp" faster.  I would
actually expect it to be slower since "rbd cp" also attempts to sparsify
the destination image (see --sparse-size option), making it more space
efficient.

Thanks,

Ilya


[ceph-users] rbd cp vs. rbd clone + rbd flatten

2023-03-22 Thread Tony Liu
Hi,

I want
1) copy a snapshot to an image,
2) no need to copy snapshots,
3) no dependency after copy,
4) all same image format 2.
In that case, is rbd cp the same as rbd clone + rbd flatten?
I ran some tests and it seems so, but want to confirm, in case I'm missing 
anything.
Also, it seems cp is a bit faster than clone + flatten; is that true?


Thanks!
Tony



[ceph-users] Re: Is it a bug that OSD crashed when it's full?

2022-11-01 Thread Tony Liu
Thank you Igor!
Tony

From: Igor Fedotov 
Sent: November 1, 2022 04:34 PM
To: Tony Liu; ceph-users@ceph.io; d...@ceph.io
Subject: Re: [ceph-users] Re: Is it a bug that OSD crashed when it's full?

Hi Tony,

first of all, let me share my understanding of the issue you're facing.
It recalls an upstream ticket, and I presume my root-cause analysis
from there (https://tracker.ceph.com/issues/57672#note-9) applies
in your case as well.

So generally speaking, your OSD isn't 100% full - from the log output one
can see that 0x57acbc000 of 0x6fc840 bytes are free. But there are
not enough contiguous 64K chunks for BlueFS to keep operating.

As a result, the OSD managed to slip past all the *full* safeguards and reached
the state where it crashed - those safety measures just weren't designed to
take this additional free-space fragmentation factor into account...
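The effect can be shown with a small free-extent model: plenty of bytes free in total, yet no single 64K extent for BlueFS to allocate. The numbers below are hypothetical, not taken from the log above:

```python
ALLOC_UNIT = 64 * 1024  # BlueFS allocation unit discussed above

# Free extents as (offset, length): lots of free space, all in 32K pieces.
free_extents = [(i * 128 * 1024, 32 * 1024) for i in range(1000)]

total_free = sum(length for _, length in free_extents)
largest = max(length for _, length in free_extents)

print(total_free)             # 32768000 bytes (~31 MiB) free in total
print(largest >= ALLOC_UNIT)  # False: no 64K-contiguous extent, so the
                              # allocation fails despite the free space
```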

Similarly, the lack of available 64K chunks prevents the OSD from starting up
- it needs to write out some more data to BlueFS during startup recovery.

I'm currently working on enabling BlueFS functioning with default main
device allocation unit (=4K) which will hopefully fix the above issue.


Meanwhile, you might want to work around the current OSD state by
setting bluefs_shared_alloc_size to 32K - this might have some
operational and performance effects, but the OSD should most likely be able
to start up afterwards. Please do not use 4K for now - it's known to
cause more problems in some circumstances. And I'd highly recommend
redeploying the OSD ASAP once you have drained all the data off it - I presume
that's the reason you want to bring it up instead of letting the
cluster recover using the regular means applied on OSD loss.

An alternative approach would be to add a standalone DB volume and migrate
BlueFS there - ceph-volume should be able to do that even in the current
OSD state. Expanding the main volume (if it's backed by LVM and extra spare
space is available) is apparently a valid option too.


Thanks,

Igor


On 11/1/2022 8:09 PM, Tony Liu wrote:
> The actual question is: is a crash expected when the OSD is full?
> My focus is more on how to prevent this from happening.
> My expectation is that the OSD rejects write requests when it's full, but does
> not crash.
> Otherwise there's no point in having the ratio thresholds.
> Please let me know if this is the design or a bug.
>
> Thanks!
> Tony
> ________
> From: Tony Liu 
> Sent: October 31, 2022 05:46 PM
> To: ceph-users@ceph.io; d...@ceph.io
> Subject: [ceph-users] Is it a bug that OSD crashed when it's full?
>
> Hi,
>
> Based on the doc, Ceph prevents you from writing to a full OSD so that you don't
> lose data.
> In my case, with v16.2.10, the OSD crashed when it filled up. Is this expected or
> some bug?
> I'd expect a write failure instead of an OSD crash. It keeps crashing when I try
> to bring it up.
> Is there any way to bring it back?
>
>  -7> 2022-10-31T22:52:57.426+ 7fe37fd94200  4 rocksdb: EVENT_LOG_v1 
> {"time_micros": 1667256777427646, "job": 1, "event": "recovery_started", 
> "log_files": [23300]}
>  -6> 2022-10-31T22:52:57.426+ 7fe37fd94200  4 rocksdb: 
> [db_impl/db_impl_open.cc:760] Recovering log #23300 mode 2
>  -5> 2022-10-31T22:52:57.529+ 7fe37fd94200  3 rocksdb: 
> [le/block_based/filter_policy.cc:584] Using legacy Bloom filter with high 
> (20) bits/key. Dramatic filter space and/or accuracy improvement is available 
> with format_version>=5.
>  -4> 2022-10-31T22:52:57.592+ 7fe37fd94200  1 bluefs _allocate unable 
> to allocate 0x9 on bdev 1, allocator name block, allocator type hybrid, 
> capacity, block size 0x1000, free 0x57acbc000, fragmentation 0.359784, 
> allocated 0x0
>  -3> 2022-10-31T22:52:57.592+ 7fe37fd94200 -1 bluefs _allocate 
> allocation failed, needed 0x8064a
>  -2> 2022-10-31T22:52:57.592+ 7fe37fd94200 -1 bluefs _flush_range 
> allocated: 0x0 offset: 0x0 length: 0x8064a
>  -1> 2022-10-31T22:52:57.604+ 7fe37fd94200 -1 
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.10/rpm/el8/BUILD/ceph-16.2.10/src/os/bluestore/BlueFS.cc:
>  In function 'int BlueFS::_flush_range(BlueFS::FileWriter*, uint64_t, 
> uint64_t)' thread 7fe37fd94200 time 2022-10-31T22:52:57.593873+
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.10/rpm/el8/BUILD/ceph-16.2.10/src/os/bluestore/BlueFS.cc:
>  2768: ceph_abort_msg("bluefs enospc")
>
>   ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific 
> (stable)
>   1: (ceph::__ceph_abort(cha

[ceph-users] Re: Is it a bug that OSD crashed when it's full?

2022-11-01 Thread Tony Liu
The actual question is: is a crash expected when the OSD is full?
My focus is more on how to prevent this from happening.
My expectation is that the OSD rejects write requests when it's full, but does not 
crash.
Otherwise there's no point in having the ratio thresholds.
Please let me know if this is the design or a bug.
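For reference, the intent of the ratio thresholds can be modeled as follows: writes should be refused once usage crosses the full ratio, well before the device is physically exhausted. This is a simplified illustration assuming the common 0.85/0.95 defaults; the real checks involve the monitors and per-OSD statistics:

```python
NEARFULL_RATIO = 0.85  # warn only (mon_osd_nearfull_ratio default)
FULL_RATIO     = 0.95  # refuse writes (mon_osd_full_ratio default)

def admit_write(used_bytes, capacity_bytes):
    # Reject writes past the full ratio instead of letting the
    # device fill up and the daemon crash.
    usage = used_bytes / capacity_bytes
    if usage >= FULL_RATIO:
        return "ENOSPC"       # reject, do not crash
    if usage >= NEARFULL_RATIO:
        return "accept+warn"  # accepted, but cluster goes HEALTH_WARN
    return "accept"

print(admit_write(80, 100))  # accept
print(admit_write(90, 100))  # accept+warn
print(admit_write(96, 100))  # ENOSPC
```

As Igor's analysis in this thread shows, a simple ratio check like this does not account for free-space fragmentation, which is how an OSD can still hit "bluefs enospc" below the threshold.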

Thanks!
Tony

From: Tony Liu 
Sent: October 31, 2022 05:46 PM
To: ceph-users@ceph.io; d...@ceph.io
Subject: [ceph-users] Is it a bug that OSD crashed when it's full?

Hi,

Based on the doc, Ceph prevents you from writing to a full OSD so that you don't 
lose data.
In my case, with v16.2.10, the OSD crashed when it filled up. Is this expected or 
some bug?
I'd expect a write failure instead of an OSD crash. It keeps crashing when I try to 
bring it up.
Is there any way to bring it back?

-7> 2022-10-31T22:52:57.426+ 7fe37fd94200  4 rocksdb: EVENT_LOG_v1 
{"time_micros": 1667256777427646, "job": 1, "event": "recovery_started", 
"log_files": [23300]}
-6> 2022-10-31T22:52:57.426+ 7fe37fd94200  4 rocksdb: 
[db_impl/db_impl_open.cc:760] Recovering log #23300 mode 2
-5> 2022-10-31T22:52:57.529+ 7fe37fd94200  3 rocksdb: 
[le/block_based/filter_policy.cc:584] Using legacy Bloom filter with high (20) 
bits/key. Dramatic filter space and/or accuracy improvement is available with 
format_version>=5.
-4> 2022-10-31T22:52:57.592+ 7fe37fd94200  1 bluefs _allocate unable to 
allocate 0x9 on bdev 1, allocator name block, allocator type hybrid, 
capacity 0x6fc840, block size 0x1000, free 0x57acbc000, fragmentation 
0.359784, allocated 0x0
-3> 2022-10-31T22:52:57.592+ 7fe37fd94200 -1 bluefs _allocate 
allocation failed, needed 0x8064a
-2> 2022-10-31T22:52:57.592+ 7fe37fd94200 -1 bluefs _flush_range 
allocated: 0x0 offset: 0x0 length: 0x8064a
-1> 2022-10-31T22:52:57.604+ 7fe37fd94200 -1 
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.10/rpm/el8/BUILD/ceph-16.2.10/src/os/bluestore/BlueFS.cc:
 In function 'int BlueFS::_flush_range(BlueFS::FileWriter*, uint64_t, 
uint64_t)' thread 7fe37fd94200 time 2022-10-31T22:52:57.593873+
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.10/rpm/el8/BUILD/ceph-16.2.10/src/os/bluestore/BlueFS.cc:
 2768: ceph_abort_msg("bluefs enospc")

 ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific 
(stable)
 1: (ceph::__ceph_abort(char const*, int, char const*, 
std::__cxx11::basic_string, std::allocator > 
const&)+0xe5) [0x55858d7e2e7c]
 2: (BlueFS::_flush_range(BlueFS::FileWriter*, unsigned long, unsigned 
long)+0x1131) [0x55858dee8cc1]
 3: (BlueFS::_flush(BlueFS::FileWriter*, bool, bool*)+0x90) [0x55858dee8fa0]
 4: (BlueFS::_flush(BlueFS::FileWriter*, bool, 
std::unique_lock&)+0x32) [0x55858defa0b2]
 5: (BlueRocksWritableFile::Append(rocksdb::Slice const&)+0x11b) 
[0x55858df129eb]
 6: (rocksdb::LegacyWritableFileWrapper::Append(rocksdb::Slice const&, 
rocksdb::IOOptions const&, rocksdb::IODebugContext*)+0x1f) [0x55858e3ae55f]
 7: (rocksdb::WritableFileWriter::WriteBuffered(char const*, unsigned 
long)+0x58a) [0x55858e4c02aa]
 8: (rocksdb::WritableFileWriter::Append(rocksdb::Slice const&)+0x2d0) 
[0x55858e4c1700]
 9: (rocksdb::BlockBasedTableBuilder::WriteRawBlock(rocksdb::Slice const&, 
rocksdb::CompressionType, rocksdb::BlockHandle*, bool)+0xb6) [0x55858e5dce86]
 10: (rocksdb::BlockBasedTableBuilder::WriteBlock(rocksdb::Slice const&, 
rocksdb::BlockHandle*, bool)+0x26c) [0x55858e5dd7cc]
 11: (rocksdb::BlockBasedTableBuilder::WriteBlock(rocksdb::BlockBuilder*, 
rocksdb::BlockHandle*, bool)+0x3c) [0x55858e5ddecc]
 12: (rocksdb::BlockBasedTableBuilder::Flush()+0x6d) [0x55858e5ddf5d]
 13: (rocksdb::BlockBasedTableBuilder::Add(rocksdb::Slice const&, 
rocksdb::Slice const&)+0x2b8) [0x55858e5e13c8]
 14: (rocksdb::BuildTable(std::__cxx11::basic_string, std::allocator > const&, rocksdb::Env*, 
rocksdb::FileSystem*, rocksdb::ImmutableCFOptions const&, 
rocksdb::MutableCFOptions const&, rocksdb::FileOptions const&, 
rocksdb::TableCache*, rocksdb::InternalIteratorBase*, 
std::vector >, 
std::allocator > > >, 
rocksdb::FileMetaData*, rocksdb::InternalKeyComparator const&, 
std::vector >, 
std::allocator > > > const*, unsigned 
int, std::__cxx11::basic_string, 
std::allocator > const&, std::vector >, unsigned long, rocksdb::SnapshotChecker*, 
rocksdb::CompressionType, unsigned long, rocksdb::CompressionOptions const&, 
bool, rocksdb::InternalStats*, rocksdb::TableFileCreationReason, 
rocksdb::EventLogger*, int, rocksdb::Env::IOPriority, 
rocksdb::TableProperties*, int, unsigned long, unsigned long, 
rocksdb::Env::Wri

[ceph-users] Re: Is it a bug that OSD crashed when it's full?

2022-10-31 Thread Tony Liu
Hi Zizon,

I know I ran out of space. I thought that full ratio would prevent me from 
being here.
I tried a few ceph-*-tool, they crash the same way. I guess they need rockdb to 
start?
Any recommendations how I can restore it or copy data out, or copy the volume to
another bigger disk?


Thanks!
Tony

From: Zizon Qiu 
Sent: October 31, 2022 08:13 PM
To: Tony Liu
Cc: ceph-users@ceph.io; d...@ceph.io
Subject: Re: Is it a bug that OSD crashed when it's full?


 15: (rocksdb::DBImpl::WriteLevel0TableForRecovery(int, 
rocksdb::ColumnFamilyData*, rocksdb::MemTable*, rocksdb::VersionEdit*)+0xcf5) 
[0x55858e3f0ea5]
 16: (rocksdb::DBImpl::RecoverLogFiles(std::vector > const&, unsigned long*, bool, bool*)+0x1c2e) 
[0x55858e3f35de]
 17: (rocksdb::DBImpl::Recover(std::vector > const&, bool, bool, bool, 
unsigned long*)+0xae8) [0x55858e3f4938]
 18: (rocksdb::DBImpl::Open(rocksdb::DBOptions const&, 
std::__cxx11::basic_string, std::allocator > 
const&, std::vector > const&, 
std::vector >*, rocksdb::DB**, bool, 
bool)+0x59d) [0x55858e3ee65d]

I think you should manage to make room for at least this to finish.
RocksDB cannot even start up without enough disk space.

On Tue, Nov 1, 2022 at 8:49 AM Tony Liu 
mailto:tonyliu0...@hotmail.com>> wrote:
Hi,

Based on the doc, Ceph prevents you from writing to a full OSD so that you don't 
lose data.
In my case, with v16.2.10, the OSD crashed when it filled up. Is this expected or 
some bug?
I'd expect a write failure instead of an OSD crash. It keeps crashing when I try to 
bring it up.
Is there any way to bring it back?

-7> 2022-10-31T22:52:57.426+ 7fe37fd94200  4 rocksdb: EVENT_LOG_v1 
{"time_micros": 1667256777427646, "job": 1, "event": "recovery_started", 
"log_files": [23300]}
-6> 2022-10-31T22:52:57.426+ 7fe37fd94200  4 rocksdb: 
[db_impl/db_impl_open.cc:760] Recovering log #23300 mode 2
-5> 2022-10-31T22:52:57.529+ 7fe37fd94200  3 rocksdb: 
[le/block_based/filter_policy.cc:584] Using legacy Bloom filter with high (20) 
bits/key. Dramatic filter space and/or accuracy improvement is available with 
format_version>=5.
-4> 2022-10-31T22:52:57.592+ 7fe37fd94200  1 bluefs _allocate unable to 
allocate 0x9 on bdev 1, allocator name block, allocator type hybrid, 
capacity 0x6fc840, block size 0x1000, free 0x57acbc000, fragmentation 
0.359784, allocated 0x0
-3> 2022-10-31T22:52:57.592+ 7fe37fd94200 -1 bluefs _allocate 
allocation failed, needed 0x8064a
-2> 2022-10-31T22:52:57.592+ 7fe37fd94200 -1 bluefs _flush_range 
allocated: 0x0 offset: 0x0 length: 0x8064a
-1> 2022-10-31T22:52:57.604+ 7fe37fd94200 -1 
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.10/rpm/el8/BUILD/ceph-16.2.10/src/os/bluestore/BlueFS.cc:
 In function 'int BlueFS::_flush_range(BlueFS::FileWriter*, uint64_t, 
uint64_t)' thread 7fe37fd94200 time 2022-10-31T22:52:57.593873+
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.10/rpm/el8/BUILD/ceph-16.2.10/src/os/bluestore/BlueFS.cc:
 2768: ceph_abort_msg("bluefs enospc")

 ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific 
(stable)
 1: (ceph::__ceph_abort(char const*, int, char const*, 
std::__cxx11::basic_string, std::allocator > 
const&)+0xe5) [0x55858d7e2e7c]
 2: (BlueFS::_flush_range(BlueFS::FileWriter*, unsigned long, unsigned 
long)+0x1131) [0x55858dee8cc1]
 3: (BlueFS::_flush(BlueFS::FileWriter*, bool, bool*)+0x90) [0x55858dee8fa0]
 4: (BlueFS::_flush(BlueFS::FileWriter*, bool, 
std::unique_lock&)+0x32) [0x55858defa0b2]
 5: (BlueRocksWritableFile::Append(rocksdb::Slice const&)+0x11b) 
[0x55858df129eb]
 6: (rocksdb::LegacyWritableFileWrapper::Append(rocksdb::Slice const&, 
rocksdb::IOOptions const&, rocksdb::IODebugContext*)+0x1f) [0x55858e3ae55f]
 7: (rocksdb::WritableFileWriter::WriteBuffered(char const*, unsigned 
long)+0x58a) [0x55858e4c02aa]
 8: (rocksdb::WritableFileWriter::Append(rocksdb::Slice const&)+0x2d0) 
[0x55858e4c1700]
 9: (rocksdb::BlockBasedTableBuilder::WriteRawBlock(rocksdb::Slice const&, 
rocksdb::CompressionType, rocksdb::BlockHandle*, bool)+0xb6) [0x55858e5dce86]
 10: (rocksdb::BlockBasedTableBuilder::WriteBlock(rocksdb::Slice const&, 
rocksdb::BlockHandle*, bool)+0x26c) [0x55858e5dd7cc]
 11: (rocksdb::BlockBasedTableBuilder::WriteBlock(rocksdb::BlockBuilder*, 
rocksdb::BlockHandle*, bool)+0x3c) [0x55858e5ddecc]
 12: (rocksdb::BlockBasedTableBuilder::Flush()+0x6d) [0x55858e5ddf5d]
 13: (rocksdb::BlockBasedTableBuilder::Add(rocksdb::Slice const&, 
rocksdb::Slice const&)+0x2b8) [0x55858e5e13c8]
 14: (rocksdb::BuildTable(std::__cxx11::basic_string, std

[ceph-users] Re: Is it a bug that OSD crashed when it's full?

2022-10-31 Thread Tony Liu
 std::allocator > const&)+0x10c1) [0x56102e
e39c41]
 21: (BlueStore::_open_db(bool, bool, bool)+0x8c7) [0x56102ec9de17]
[40/1932] 22: (BlueStore::_open_db_and_around(bool, bool)+0x2f7) 
[0x56102ed0beb7]   
 23: (BlueStore::_mount()+0x204) [0x56102ed0ed74]  
 24: main()
 25: __libc_start_main()
 26: _start()
*** Caught signal (Aborted) **
 in thread 7f31699b1000 thread_name:ceph-objectstor
 ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable
)
 1: /lib64/libpthread.so.0(+0x12ce0) [0x7f315ec91ce0]  
 2: gsignal()
 3: abort()
 4: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_strin
g, std::allocator > const&)+0x1b6) [0x7f315f8
4cf5d]
 5: (BlueFS::_flush_range(BlueFS::FileWriter*, unsigned long, unsigned long)+0x1
131) [0x56102ed93a71]
 6: (BlueFS::_flush(BlueFS::FileWriter*, bool, bool*)+0x90) [0x56102ed93d50]   
 7: (BlueFS::_flush(BlueFS::FileWriter*, bool, std::unique_lock&)+0x
32) [0x56102eda4e62]
 8: (BlueRocksWritableFile::Append(rocksdb::Slice const&)+0x11b) [0x56102edbd79b
]
 9: (rocksdb::LegacyWritableFileWrapper::Append(rocksdb::Slice const&, rocksdb::
IOOptions const&, rocksdb::IODebugContext*)+0x1f) [0x56102ee80ebf] 
 10: (rocksdb::WritableFileWriter::WriteBuffered(char const*, unsigned long)+0x5
8a) [0x56102ef9360a]
 11: (rocksdb::WritableFileWriter::Append(rocksdb::Slice const&)+0x2d0) [0x56102
ef94a60]
 12: (rocksdb::BlockBasedTableBuilder::WriteRawBlock(rocksdb::Slice const&, rock
sdb::CompressionType, rocksdb::BlockHandle*, bool)+0xb6) [0x56102f0b0bc6]  
 13: (rocksdb::BlockBasedTableBuilder::WriteBlock(rocksdb::Slice const&, rocksdb
::BlockHandle*, bool)+0x26c) [0x56102f0b150c]  
 14: (rocksdb::BlockBasedTableBuilder::WriteBlock(rocksdb::BlockBuilder*, rocksd
b::BlockHandle*, bool)+0x3c) [0x56102f0b1c0c]  
 15: (rocksdb::BlockBasedTableBuilder::Flush()+0x6d) [0x56102f0b1c9d]  
 16: (rocksdb::BlockBasedTableBuilder::Add(rocksdb::Slice const&, rocksdb::Slice
 const&)+0x2b8) [0x56102f0b5108]
 17: (rocksdb::BuildTable(std::__cxx11::basic_string, std::allocator > const&, rocksdb::Env*, 
rocksdb::FileSystem*, rocksdb::
ImmutableCFOptions const&, rocksdb::MutableCFOptions const&, rocksdb::FileOption
s const&, rocksdb::TableCache*, rocksdb::InternalIteratorBase*,
std::vector >, std::allocator > > >, rocksdb::FileMetaData*, rocksdb::Intern
alKeyComparator const&, std::vector >, std::alloca
tor > > > const*, unsigned int, std::__cxx11::basi
c_string, std::allocator > const&, std::vecto
r >, unsigned long, rocksdb::Snapsh
otChecker*, rocksdb::CompressionType, unsigned long, rocksdb::CompressionOptions
 const&, bool, rocksdb::InternalStats*, rocksdb::TableFileCreationReason, rocksd
b::EventLogger*, int, rocksdb::Env::IOPriority, rocksdb::TableProperties*, int,
unsigned long, unsigned long, rocksdb::Env::WriteLifeTimeHint, unsigned long)+0x
a45) [0x56102f05fad5]
 18: (rocksdb::DBImpl::WriteLevel0TableForRecovery(int, rocksdb::ColumnFamilyDat
a*, rocksdb::MemTable*, rocksdb::VersionEdit*)+0xcf5) [0x56102eec39f5] 
 19: (rocksdb::DBImpl::RecoverLogFiles(std::vector > const&, unsigned long*, bool, bool*)+0x1c2e) [0x56102eec612e]
 20: (rocksdb::DBImpl::Recover(std::vector > const&, bool, bool, bool, unsigned
 long*)+0xae8) [0x56102eec7488]
 21: (rocksdb::DBImpl::Open(rocksdb::DBOptions const&, std::__cxx11::basic_strin
g, std::allocator > const&, std::vector > co
nst&, std::vector >*, rocksdb::DB**, bool, bool)+0x59d) [0x56102eec11ad]
 22: (rocksdb::DB::Open(rocksdb::DBOptions const&, std::__cxx11::basic_string, std::allocator > const&, std::vector > const&
, std::vector >*, rocksdb::DB**)+0x15) [0x56102eec2545] 
 23: (RocksDBStore::do_open(std::ostream&, bool, bool, std::__cxx11::basic_strin
g, std::allocator > const&)+0x10c1) [0x56102e
e39c41]
 24: (BlueStore::_open_db(bool, bool, bool)+0x8c7) [0x56102ec9de17]
 25: (BlueStore::_open_db_and_around(bool, bool)+0x2f7) [0x56102ed0beb7]   
 26: (BlueStore::_mount()+0x204) [0x56102ed0ed74]  
 27: main()
 28: __libc_start_main()
 29: _start()
Aborted (core dumped)

Any clues?

Thanks again!
Tony

From: Steven Umbehocker 
Sent: October 31, 2022 07:07 PM
To: Tony Liu; ceph-users@ceph.io; d...@ceph.io
Subject: Re: Is it a bug that OSD crashed when it's full?

Hi Tony,

Once an OSD is wedged like that you have to manually remove some objects from 
it to get below 100% full before you can start it.  You can use the 
ceph-objectstore-tool to get a list of all the objects in the OSD where $OSDID 
is the ID of your 100% full OSD to scan.  You us

[ceph-users] Is it a bug that OSD crashed when it's full?

2022-10-31 Thread Tony Liu
Hi,

Based on the doc, Ceph prevents you from writing to a full OSD so that you don't 
lose data.
In my case, with v16.2.10, the OSD crashed when it filled up. Is this expected or 
some bug?
I'd expect a write failure instead of an OSD crash. It keeps crashing when I try to 
bring it up.
Is there any way to bring it back?

-7> 2022-10-31T22:52:57.426+ 7fe37fd94200  4 rocksdb: EVENT_LOG_v1 
{"time_micros": 1667256777427646, "job": 1, "event": "recovery_started", 
"log_files": [23300]}  
-6> 2022-10-31T22:52:57.426+ 7fe37fd94200  4 rocksdb: 
[db_impl/db_impl_open.cc:760] Recovering log #23300 mode 2  
 
-5> 2022-10-31T22:52:57.529+ 7fe37fd94200  3 rocksdb: 
[le/block_based/filter_policy.cc:584] Using legacy Bloom filter with high (20) 
bits/key. Dramatic filter space and/or accuracy improvement is available with 
format_version>=5.
-4> 2022-10-31T22:52:57.592+ 7fe37fd94200  1 bluefs _allocate unable to 
allocate 0x9 on bdev 1, allocator name block, allocator type hybrid, 
capacity 0x6fc840, block size 0x1000, free 0x57acbc000, fragmentation 
0.359784, allocated 0x0
-3> 2022-10-31T22:52:57.592+ 7fe37fd94200 -1 bluefs _allocate 
allocation failed, needed 0x8064a
-2> 2022-10-31T22:52:57.592+ 7fe37fd94200 -1 bluefs _flush_range 
allocated: 0x0 offset: 0x0 length: 0x8064a  
  
-1> 2022-10-31T22:52:57.604+ 7fe37fd94200 -1 
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.10/rpm/el8/BUILD/ceph-16.2.10/src/os/bluestore/BlueFS.cc:
 In function 'int BlueFS::_flush_range(BlueFS::FileWriter*, uint64_t, 
uint64_t)' thread 7fe37fd94200 time 2022-10-31T22:52:57.593873+ 
   
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.10/rpm/el8/BUILD/ceph-16.2.10/src/os/bluestore/BlueFS.cc:
 2768: ceph_abort_msg("bluefs enospc")

 ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable)
 1: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string, std::allocator > const&)+0xe5) [0x55858d7e2e7c]
 2: (BlueFS::_flush_range(BlueFS::FileWriter*, unsigned long, unsigned long)+0x1131) [0x55858dee8cc1]
 3: (BlueFS::_flush(BlueFS::FileWriter*, bool, bool*)+0x90) [0x55858dee8fa0]
 4: (BlueFS::_flush(BlueFS::FileWriter*, bool, std::unique_lock&)+0x32) [0x55858defa0b2]
 5: (BlueRocksWritableFile::Append(rocksdb::Slice const&)+0x11b) [0x55858df129eb]
 6: (rocksdb::LegacyWritableFileWrapper::Append(rocksdb::Slice const&, rocksdb::IOOptions const&, rocksdb::IODebugContext*)+0x1f) [0x55858e3ae55f]
 7: (rocksdb::WritableFileWriter::WriteBuffered(char const*, unsigned long)+0x58a) [0x55858e4c02aa]
 8: (rocksdb::WritableFileWriter::Append(rocksdb::Slice const&)+0x2d0) [0x55858e4c1700]
 9: (rocksdb::BlockBasedTableBuilder::WriteRawBlock(rocksdb::Slice const&, rocksdb::CompressionType, rocksdb::BlockHandle*, bool)+0xb6) [0x55858e5dce86]
 10: (rocksdb::BlockBasedTableBuilder::WriteBlock(rocksdb::Slice const&, rocksdb::BlockHandle*, bool)+0x26c) [0x55858e5dd7cc]
 11: (rocksdb::BlockBasedTableBuilder::WriteBlock(rocksdb::BlockBuilder*, rocksdb::BlockHandle*, bool)+0x3c) [0x55858e5ddecc]
 12: (rocksdb::BlockBasedTableBuilder::Flush()+0x6d) [0x55858e5ddf5d]
 13: (rocksdb::BlockBasedTableBuilder::Add(rocksdb::Slice const&, rocksdb::Slice const&)+0x2b8) [0x55858e5e13c8]
 14: (rocksdb::BuildTable(std::__cxx11::basic_string, std::allocator > const&, rocksdb::Env*, rocksdb::FileSystem*, rocksdb::ImmutableCFOptions const&, rocksdb::MutableCFOptions const&, rocksdb::FileOptions const&, rocksdb::TableCache*, rocksdb::InternalIteratorBase*, std::vector >, std::allocator > > >, rocksdb::FileMetaData*, rocksdb::InternalKeyComparator const&, std::vector >, std::allocator > > > const*, unsigned int, std::__cxx11::basic_string, std::allocator > const&, std::vector >, unsigned long, rocksdb::SnapshotChecker*, rocksdb::CompressionType, unsigned long, rocksdb::CompressionOptions const&, bool, rocksdb::InternalStats*, rocksdb::TableFileCreationReason, rocksdb::EventLogger*, int, rocksdb::Env::IOPriority, rocksdb::TableProperties*, int, unsigned long, unsigned long,

[ceph-users] Re: Ceph configuration for rgw

2022-09-26 Thread Tony Liu
You can always "config get" what was set by "config set", because that is just
writing and reading a key-value pair in the configuration DB.

To "config show" what was set by "config set" requires support for the mgr to
connect to the service daemon and fetch its running config. I see such support
for mgr, mon and osd, but not for rgw.

The case I am asking about is the latter: for rgw, after "config set", I can't
get the value with "config show". I'd like to know if this is expected.
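
To make the difference concrete, the sequence I'm describing is roughly this (the daemon name and option are just examples):

```shell
# Writing to and reading from the configuration DB works:
ceph config set client.rgw.myrgw rgw_enable_usage_log true
ceph config get client.rgw.myrgw rgw_enable_usage_log
# Asking the daemon for its running config is what fails for rgw:
ceph config show client.rgw.myrgw rgw_enable_usage_log
```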

Also, the config in the configuration DB doesn't seem to be applied to rgw,
even after restarting the service.

I also noticed that, when cephadm deploys rgw, it tries to add a firewall rule
for the open port. In my case, the port is not in the "public" zone, and I
don't see a way to set the zone or disable this action.


Thanks!
Tony

From: Eugen Block 
Sent: September 26, 2022 12:08 AM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: Ceph configuration for rgw

Just adding this:

ses7-host1:~ # ceph config set client.rgw.ebl-rgw rgw_frontends "beast port=8080"

This change is visible in the config get output:

client.rgw.ebl-rgw  basic  rgw_frontends  beast port=8080


Zitat von Eugen Block :

> Hi,
>
> the docs [1] show how to specify the rgw configuration via a yaml
> file (similar to OSDs).
> If you applied it with ceph orch you should see your changes in the
> 'ceph config dump' output, or like this:
>
> ---snip---
> ses7-host1:~ # ceph orch ls | grep rgw
> rgw.ebl-rgw  ?:80  2/2  33s ago  3M  ses7-host3;ses7-host4
>
> ses7-host1:~ # ceph config get client.rgw.ebl-rgw
> WHO                 MASK  LEVEL     OPTION           VALUE                                            RO
> global                    basic     container_image  registry.fqdn:5000/ses/7.1/ceph/ceph@sha256:...  *
> client.rgw.ebl-rgw        basic     rgw_frontends    beast port=80                                    *
> client.rgw.ebl-rgw        advanced  rgw_realm        ebl-rgw                                          *
> client.rgw.ebl-rgw        advanced  rgw_zone         ebl-zone                                         *
> ---snip---
>
> As you see the RGWs are clients so you need to consider that when
> you request the current configuration. But what I find strange is
> that apparently it only shows the config initially applied, it
> doesn't show the changes after running 'ceph orch apply -i rgw.yaml'
> although the changes are applied to the containers after restarting
> them. I don't know if this is intended but sounds like a bug to me
> (I haven't checked).
>
>> 1) When start rgw with cephadm ("orch apply -i "), I have
>> to start the daemon
>>then update configuration file and restart. I don't find a way
>> to achieve this by single step.
>
> I haven't played around too much yet, but you seem to be right,
> changing the config isn't applied immediately, but only after a
> service restart ('ceph orch restart rgw.ebl-rgw'). Maybe that's on
> purpose? So you can change your config now and apply it later when a
> service interruption is not critical.
>
>
> [1] https://docs.ceph.com/en/pacific/cephadm/services/rgw/
>
> Zitat von Tony Liu :
>
>> Hi,
>>
>> The cluster is Pacific 16.2.10 with containerized service and
>> managed by cephadm.
>>
>> "config show" shows running configuration. Who is supported?
>> mon, mgr and osd all work, but rgw doesn't. Is this expected?
>> I tried with client. and
>> without "client",
>> neither works.
>>
>> When issue "config show", who connects the daemon and retrieves
>> running config?
>> Is it mgr or mon?
>>
>> Config update by "config set" will be populated to the service.
>> Which services are
>> supported by this? I know mon, mgr and osd work, but rgw doesn't.
>> Is this expected?
>> I assume this is similar to "config show", this support needs the
>> capability of mgr/mon
>> to connect to service daemon?
>>
>> To get running config from rgw, I always do
>> "docker exec  ceph daemon  config show".
>> Is that the only way? I assume it's the same to get running config
>> from all services.
>> Just the matter of supported by mgr/mon or not?
>>
>> I've been configuring rgw by configuration file. Is that the
>> recommended way?
>> I tried with configuration db, like "config set", it doesn't seem working.
>> Is this expected?
>>
>> I see two cons with configuration file for rgw.
>> 1) When start rgw with cephadm ("orch apply -i "), I have
>> to start the daemon
>>then update

[ceph-users] Ceph configuration for rgw

2022-09-24 Thread Tony Liu
Hi,

The cluster is Pacific 16.2.10 with containerized service and managed by 
cephadm.

"config show" shows the running configuration. Which daemons support it?
mon, mgr and osd all work, but rgw doesn't. Is this expected?
I tried the daemon name with the "client." prefix and without it;
neither works.

When I issue "config show", who connects to the daemon and retrieves the
running config? Is it the mgr or the mon?

Config updates made by "config set" are propagated to the service. Which
services support this? I know mon, mgr and osd work, but rgw doesn't. Is this
expected? I assume this is similar to "config show": the support depends on
the mgr/mon being able to connect to the service daemon?

To get the running config from rgw, I always do
"docker exec  ceph daemon  config show".
Is that the only way? I assume it's the same for getting the running config
from all services: just a matter of whether it's supported by mgr/mon or not?

I've been configuring rgw via the configuration file. Is that the recommended
way? I tried the configuration DB with "config set", but it doesn't seem to
work. Is this expected?

I see two cons with the configuration file for rgw.
1) When starting rgw with cephadm ("orch apply -i "), I have to start
   the daemon first, then update the configuration file and restart. I don't
   see a way to achieve this in a single step.
2) On "orch daemon redeploy" or upgrade, the rgw configuration file is
   re-generated and I have to update it again.
Is this how it's supposed to work, or am I missing something?
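
What I'd prefer is a single pass through the config DB, something like this (the service name and option are just examples):

```shell
# Deploy rgw from the spec, then set options in the configuration DB
# and restart, instead of editing a generated config file.
ceph orch apply -i rgw.yaml
ceph config set client.rgw.myrgw rgw_enable_usage_log true
ceph orch restart rgw.myrgw
```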


Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] host disk used by osd container

2022-06-15 Thread Tony Liu
Hi,

"df -h" on the OSD host shows 187G being used, while "du -sh /" shows 36G.
bluefs_buffered_io is enabled here. What's taking that 150G of disk space, cache?
If so, where is that cache file? Is there any way to configure it to be smaller?

# free -h
  totalusedfree  shared  buff/cache   available
Mem:  187Gi28Gi   4.4Gi   4.1Gi   154Gi   152Gi
Swap: 8.0Gi82Mi   7.9Gi

# df -h
FilesystemSize  Used Avail Use% Mounted on
devtmpfs   94G 0   94G   0% /dev
tmpfs  94G 0   94G   0% /dev/shm
tmpfs  94G  4.2G   90G   5% /run
tmpfs  94G 0   94G   0% /sys/fs/cgroup
/dev/mapper/vg0-root  215G  187G   29G  87% /
/dev/sdk2 239M  150M   72M  68% /boot
/dev/sdk1 250M  6.9M  243M   3% /boot/efi
overlay   215G  187G   29G  87% 
/var/lib/docker/overlay2/bc4904d8da14dd9ab0fbc49ae60f20ba4a3cbf8f361c0ed13e818e0d65e22531/merged
overlay   215G  187G   29G  87% 
/var/lib/docker/overlay2/617494be5e05d5f91d1d08aad6b6ace8f335a346ca9ea868dc2bc7fd07906901/merged
overlay   215G  187G   29G  87% 
/var/lib/docker/overlay2/3b039d5daffaf212d3384afc30b5bf75353fd215b238101b9bfba4050638eab5/merged
overlay   215G  187G   29G  87% 
/var/lib/docker/overlay2/507c24e65c7cd075b5e1ab4901f8e198263c85265b3e4610606dc3dfd4dad0b5/merged
overlay   215G  187G   29G  87% 
/var/lib/docker/overlay2/6856e867322a73cb1d0e203a2c12f8516bd76fa3866a945b199e477396704f76/merged
overlay   215G  187G   29G  87% 
/var/lib/docker/overlay2/5bb197c41ba584981d1767e377bff84cd13750476a63f26206f58b274d854739/merged
overlay   215G  187G   29G  87% 
/var/lib/docker/overlay2/1c2b16f94ffda06fc277c6906e6df8bd150de16c80a1ba7f113f0774ad8a5de1/merged
overlay   215G  187G   29G  87% 
/var/lib/docker/overlay2/58de1f5b8e3638e94cbc55f02f690937295e8714dfea44f155271df70093a69f/merged
overlay   215G  187G   29G  87% 
/var/lib/docker/overlay2/be5d6ac02ab83436b18c475f43df48732c0b2b5c73732237064631deb2d5243f/merged
overlay   215G  187G   29G  87% 
/var/lib/docker/overlay2/e59810bd48f0667bd3f91dcc65ec1b51227314754dfbcc7ba8dee376bdcd4c0a/merged
overlay   215G  187G   29G  87% 
/var/lib/docker/overlay2/78326ce8e1cf36680eaa56e744b4ea97f1b358adac17eacaf67b88937dd5e876/merged
overlay   215G  187G   29G  87% 
/var/lib/docker/overlay2/6a53cf14e33b69c418794514fbd35f5257c553f5a9b0ead62e03b76163112de4/merged
overlay   215G  187G   29G  87% 
/var/lib/docker/overlay2/efec8e5382be117acdbfc81e9d9a9fbc62e289c2a9fcdfa4c53868de50faf420/merged
overlay   215G  187G   29G  87% 
/var/lib/docker/overlay2/eca247de3d54f43372961146b84b485d7c5715d1784afae83e44763717ecf552/merged
tmpfs  19G 0   19G   0% /run/user/0
overlay   215G  187G   29G  87% 
/var/lib/docker/overlay2/bfdf90bdc15c9059d9436caddb1d927788ae9eeff15df631ed150cae966528eb/merged
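
For what it's worth, one common cause of "df" reporting far more used space than "du" can find is files that were deleted while a process still holds them open; their space stays allocated until the process exits. A quick check, assuming lsof is available on the host:

```shell
# List open files whose on-disk link count is 0, i.e. deleted but
# still held open; their size still counts against df.
lsof +L1
```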
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph 15 and Podman compatability

2022-05-19 Thread Tony Liu
I hit a couple of issues with Podman when deploying Ceph 16.2.

https://www.spinics.net/lists/ceph-users/msg71367.html
https://www.spinics.net/lists/ceph-users/msg71346.html

After switching back to Docker, everything works fine.

Thanks!
Tony

From: Robert Sander 
Sent: May 19, 2022 05:43 AM
To: ceph-users@ceph.io
Subject: [ceph-users] Ceph 15 and Podman compatability

Hi,

the table on https://docs.ceph.com/en/quincy/cephadm/compatibility/
tells us that to run Ceph 15 one needs Podman 2.0 or 2.1 and not 3.0.

While planning to upgrade an installation of Ceph 14 on Ubuntu 18 I only
found Podman 3 for that distribution version (or any newer Ubuntu version).

Is it possible to run current Ceph 15 containers with Podman 3.0 today?

The only way I found is to use .deb packages to upgrade to Ceph 15, then
upgrade the distribution to Ubuntu 20, upgrade to Ceph 17 (.deb packages
available for Ubuntu 20) and then containerize it with Podman 3.

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 220009 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein -- Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] rbd mirror between clusters with private "public" network

2022-04-25 Thread Tony Liu
Hi,

I understand that, for rbd mirror to work, the rbd-mirror service requires
connectivity to all nodes in both clusters.

In my case, for security purposes, the "public" network is actually a private
network, which is not externally routable. All internal RBD clients are on
that private network. I also put HAProxy there for accessing the dashboard
and radosgw from external clients.

I wonder if there is any way to use rbd-mirror in this case, perhaps through
some sort of proxy?


Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: the easiest way to copy image to another cluster

2022-04-21 Thread Tony Liu
Thank you Anthony! I agree that rbd-mirror is more reliable and manageable,
and it's not that complicated to use. I will try both and see which works
better for me.

Tony

From: Anthony D'Atri 
Sent: April 21, 2022 09:02 PM
To: Tony Liu
Cc: ceph-users@ceph.io; d...@ceph.io
Subject: Re: [ceph-users] the easiest way to copy image to another cluster

As someone noted, rbd export / import works. I've also used rbd-mirror for
capacity management; it works well for moving attached as well as unattached
images. When using rbd-mirror to move 1-2 images at a time, adjustments to the
default parameters speed progress substantially. It's easy to see when src and
dst are synced, then flip primary / secondary, disable mirroring, and rm the
src. I've used this technique (via an in-house wrapper service) to move
hundreds of images. It can even handle snaps, with the right execution.



> On Apr 21, 2022, at 8:40 PM, Tony Liu  wrote:
>
> Hi,
>
> I want to copy an image, which is not being used, to another cluster.
> rbd-mirror would do it, but rbd-mirror is designed to handle image
> which is being used/updated, to ensure the mirrored image is always
> consistent with the source. I wonder if there is any easier way to copy
> an image without worrying about the update/sync, like copy a snapshot
> or a backup image.
>
>
> Thanks!
> Tony
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: the easiest way to copy image to another cluster

2022-04-21 Thread Tony Liu
Thank you Mart! The pipe is indeed easier.
I found this blog and will give it a try.
https://machinenix.com/ceph/how-to-export-a-ceph-rbd-image-from-one-cluster-to-another-without-using-a-bridge-server

Tony

From: Mart van Santen 
Sent: April 21, 2022 08:52 PM
To: Tony Liu
Cc: ceph-users@ceph.io; d...@ceph.io
Subject: Re: [ceph-users] the easiest way to copy image to another cluster

Hi Tony,

Have a look at rbd export and rbd import, they dump the image to a file or 
stdout. You can pipe the rbd export directly into an rbd import assuming you 
have a host which has access to both ceph clusters.
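
A minimal sketch of that pipe (the conf paths, pool and image names are examples; both clusters' conf files and keyrings must be readable on this host):

```shell
# Stream the image from the source cluster straight into the
# destination cluster; no intermediate file is written.
rbd --conf /etc/ceph/src.conf export --export-format 2 pool/image - \
  | rbd --conf /etc/ceph/dst.conf import --export-format 2 - pool/image
```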

Hope this helps!

Mart

From mobile

> On Apr 22, 2022, at 11:42, Tony Liu  wrote:
>
> Hi,
>
> I want to copy an image, which is not being used, to another cluster.
> rbd-mirror would do it, but rbd-mirror is designed to handle image
> which is being used/updated, to ensure the mirrored image is always
> consistent with the source. I wonder if there is any easier way to copy
> an image without worrying about the update/sync, like copy a snapshot
> or a backup image.
>
>
> Thanks!
> Tony
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] the easiest way to copy image to another cluster

2022-04-21 Thread Tony Liu
Hi,

I want to copy an image, which is not being used, to another cluster.
rbd-mirror would do it, but rbd-mirror is designed to handle image
which is being used/updated, to ensure the mirrored image is always
consistent with the source. I wonder if there is any easier way to copy
an image without worrying about the update/sync, like copy a snapshot
or a backup image.


Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: logging with container

2022-03-24 Thread Tony Liu
Thank you Adam! After "orch daemon redeploy", all works as expected.

Tony

From: Adam King 
Sent: March 24, 2022 11:50 AM
To: Tony Liu
Cc: ceph-users@ceph.io; d...@ceph.io
Subject: Re: [ceph-users] Re: logging with container

Hmm, I'm assuming from "Setting "log_to_stderr" doesn't help" that you've
already tried all the steps in
https://docs.ceph.com/en/latest/cephadm/operations/#disabling-logging-to-journald.
Those are meant to be the steps for stopping cluster logs from going to the
container logs.

From my personal testing, just setting the global config options made it work
for all the daemons without needing to redeploy or set any of the values at
runtime. I verified locally that, after setting log_to_file to true and
following the steps in the posted link, new logs were getting put in the
/var/log/ceph//mon.host1 file and the journal had no new entries from the
moment I changed the settings.

Perhaps because you've modified the values directly at runtime for the
daemons, it isn't picking up the set config options, since runtime changes
override config options? It could be worth trying to redeploy the daemons
after having all 6 of the relevant config options set properly. I'll also note
that I have been using podman; I'm not sure if there is some major logging
difference between podman and docker.
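
For reference, the six options referred to above are the ones from the linked docs; setting them globally would look roughly like this:

```shell
# Enable file logging and disable stderr/journald logging cluster-wide,
# per the cephadm "disabling logging to journald" steps.
ceph config set global log_to_file true
ceph config set global mon_cluster_log_to_file true
ceph config set global log_to_stderr false
ceph config set global mon_cluster_log_to_stderr false
ceph config set global log_to_journald false
ceph config set global mon_cluster_log_to_journald false
```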

Thanks,

 - Adam King

On Thu, Mar 24, 2022 at 1:00 PM Tony Liu <tonyliu0...@hotmail.com> wrote:
Any comments on this?

Thanks!
Tony
________
From: Tony Liu <tonyliu0...@hotmail.com>
Sent: March 21, 2022 10:01 PM
To: Adam King
Cc: ceph-users@ceph.io; d...@ceph.io
Subject: [ceph-users] Re: logging with container

Hi Adam,

When I do "ceph tell mon.ceph-1 config set log_to_file true",
I see the log file is created. That confirms that those options on the
command line can only be overridden by a runtime config change.
Could you check mon and mgr logging on your setup?

Can we remove those options from the command line and let logging be
controlled by the cluster configuration or the configuration file?

Another issue is that logs keep going to
/var/lib/docker/containers//-json.log,
which keeps growing and is not under logrotate management. How can I stop
logging to container stdout/stderr? Setting "log_to_stderr" doesn't help.


Thanks!
Tony

From: Tony Liu <tonyliu0...@hotmail.com>
Sent: March 21, 2022 09:41 PM
To: Adam King
Cc: ceph-users@ceph.io; d...@ceph.io
Subject: [ceph-users] Re: logging with container

Hi Adam,

# ceph config get mgr log_to_file
true
# ceph config get mgr log_file
/var/log/ceph/$cluster-$name.log
# ceph config get osd log_to_file
true
# ceph config get osd log_file
/var/log/ceph/$cluster-$name.log
# ls /var/log/ceph/fa771070-a975-11ec-86c7-e4434be9cb2e/
ceph-osd.10.log  ceph-osd.13.log  ceph-osd.16.log  ceph-osd.19.log  
ceph-osd.1.log  ceph-osd.22.log  ceph-osd.4.log  ceph-osd.7.log  ceph-volume.log
# ceph version
ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)

"log_to_file" and "log_file" are set the same for mgr and osd, so why is
there only an osd log and no mgr log?


Thanks!
Tony
________
From: Adam King <adk...@redhat.com>
Sent: March 21, 2022 08:26 AM
To: Tony Liu
Cc: ceph-users@ceph.io; d...@ceph.io
Subject: Re: [ceph-users] logging with container

Hi Tony,

Afaik those container flags just set the defaults and the config options 
override them. Setting the necessary flags 
(https://docs.ceph.com/en/latest/cephadm/operations/#logging-to-files) seemed 
to work for me.

[ceph: root@vm-00 /]# ceph config get osd.0 log_to_file
false
[ceph: root@vm-00 /]# ceph config show osd.0 log_to_file
false
[ceph: root@vm-00 /]# ceph config set global log_to_file true
[ceph: root@vm-00 /]# ceph config set global mon_cluster_log_to_file true
[ceph: root@vm-00 /]# ceph config get osd.0 log_to_file
true
[ceph: root@vm-00 /]# ceph config show osd.0 log_to_file
true
[ceph: root@vm-00 /]# ceph version
ceph version 16.2.7-601-g179a7bca (179a7bca8a84771b0dde09e26f7a2146a985df90) 
pacific (stable)
[ceph: root@vm-00 /]# exit
exit
[root@vm-00 ~]# ls /var/log/ceph/413f7ec8-a91e-11ec-9b02-52540092b5a3/
ceph.audit.log  ceph.cephadm.log  ceph.log  ceph-mgr.vm-00.ukcctb.log  
ceph-mon.vm-00.log  ceph-osd.0.log  ceph-osd.10.log  ceph-osd.2.log  
ceph-osd.4.log  ceph-osd.6.log  ceph-osd.8.log  ceph-volume.log



On Mon, Mar 21, 2022 at 1:06 AM Tony Liu <tonyliu0...@hotmail.com> wrote:
Hi,

After reading through doc, it's still not very clear to me how logging works 
wi

[ceph-users] Re: logging with container

2022-03-24 Thread Tony Liu
Any comments on this?

Thanks!
Tony

From: Tony Liu 
Sent: March 21, 2022 10:01 PM
To: Adam King
Cc: ceph-users@ceph.io; d...@ceph.io
Subject: [ceph-users] Re: logging with container

Hi Adam,

When I do "ceph tell mon.ceph-1 config set log_to_file true",
I see the log file is created. That confirms that those options on the
command line can only be overridden by a runtime config change.
Could you check mon and mgr logging on your setup?

Can we remove those options from the command line and let logging be
controlled by the cluster configuration or the configuration file?

Another issue is that logs keep going to
/var/lib/docker/containers//-json.log,
which keeps growing and is not under logrotate management. How can I stop
logging to container stdout/stderr? Setting "log_to_stderr" doesn't help.


Thanks!
Tony
________
From: Tony Liu 
Sent: March 21, 2022 09:41 PM
To: Adam King
Cc: ceph-users@ceph.io; d...@ceph.io
Subject: [ceph-users] Re: logging with container

Hi Adam,

# ceph config get mgr log_to_file
true
# ceph config get mgr log_file
/var/log/ceph/$cluster-$name.log
# ceph config get osd log_to_file
true
# ceph config get osd log_file
/var/log/ceph/$cluster-$name.log
# ls /var/log/ceph/fa771070-a975-11ec-86c7-e4434be9cb2e/
ceph-osd.10.log  ceph-osd.13.log  ceph-osd.16.log  ceph-osd.19.log  
ceph-osd.1.log  ceph-osd.22.log  ceph-osd.4.log  ceph-osd.7.log  ceph-volume.log
# ceph version
ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)

"log_to_file" and "log_file" are set the same for mgr and osd, so why is
there only an osd log and no mgr log?


Thanks!
Tony

From: Adam King 
Sent: March 21, 2022 08:26 AM
To: Tony Liu
Cc: ceph-users@ceph.io; d...@ceph.io
Subject: Re: [ceph-users] logging with container

Hi Tony,

Afaik those container flags just set the defaults and the config options 
override them. Setting the necessary flags 
(https://docs.ceph.com/en/latest/cephadm/operations/#logging-to-files) seemed 
to work for me.

[ceph: root@vm-00 /]# ceph config get osd.0 log_to_file
false
[ceph: root@vm-00 /]# ceph config show osd.0 log_to_file
false
[ceph: root@vm-00 /]# ceph config set global log_to_file true
[ceph: root@vm-00 /]# ceph config set global mon_cluster_log_to_file true
[ceph: root@vm-00 /]# ceph config get osd.0 log_to_file
true
[ceph: root@vm-00 /]# ceph config show osd.0 log_to_file
true
[ceph: root@vm-00 /]# ceph version
ceph version 16.2.7-601-g179a7bca (179a7bca8a84771b0dde09e26f7a2146a985df90) 
pacific (stable)
[ceph: root@vm-00 /]# exit
exit
[root@vm-00 ~]# ls /var/log/ceph/413f7ec8-a91e-11ec-9b02-52540092b5a3/
ceph.audit.log  ceph.cephadm.log  ceph.log  ceph-mgr.vm-00.ukcctb.log  
ceph-mon.vm-00.log  ceph-osd.0.log  ceph-osd.10.log  ceph-osd.2.log  
ceph-osd.4.log  ceph-osd.6.log  ceph-osd.8.log  ceph-volume.log



On Mon, Mar 21, 2022 at 1:06 AM Tony Liu <tonyliu0...@hotmail.com> wrote:
Hi,

After reading through the docs, it's still not very clear to me how logging
works with containers. This is with the Pacific v16.2 container.

In OSD container, I see this.
```
/usr/bin/ceph-osd -n osd.16 -f --setuser ceph --setgroup ceph 
--default-log-to-file=false --default-log-to-stderr=true 
--default-log-stderr-prefix=debug
```
When checking the ceph configuration:
```
# ceph config get osd.16 log_file
/var/log/ceph/$cluster-$name.log
# ceph config get osd.16 log_to_file
true
# ceph config show osd.16 log_to_file
false
```
Q1: what's the intention of those log settings on the command line? They have
high priority and override the configuration from the file and the mon. Is
there any option to avoid that when deploying the container?
Q2: since log_to_file is set to false on the command line, why are there
still log entries in log_file?

The same for mgr and mon.

What I want is to have everything in the log file and to minimize the stdout
and stderr from the container. Because the log file is managed by logrotate,
it is unlikely to blow up disk space, but stdout and stderr from the container
are stored in a single file that is not managed by logrotate and may grow into
a huge file. Also, it's easier to check a log file with vi than with
"podman logs", and the log files are also collected and stored by ELK for
central management.

Any comments on how I can achieve what I want?
A runtime override may not be the best option, because it's not persistent.


Thanks!
Tony

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___

[ceph-users] Re: logging with container

2022-03-21 Thread Tony Liu
Hi Adam,

When I do "ceph tell mon.ceph-1 config set log_to_file true",
I see the log file is created. That confirms that those options on the
command line can only be overridden by a runtime config change.
Could you check mon and mgr logging on your setup?

Can we remove those options from the command line and let logging be
controlled by the cluster configuration or the configuration file?

Another issue is that logs keep going to
/var/lib/docker/containers//-json.log,
which keeps growing and is not under logrotate management. How can I stop
logging to container stdout/stderr? Setting "log_to_stderr" doesn't help.
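
One workaround I'm considering on the Docker side, assuming the default json-file log driver (note this only affects containers created after the change):

```shell
# Cap the size and number of Docker's per-container json log files.
cat > /etc/docker/daemon.json <<'EOF'
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "5"
  }
}
EOF
systemctl restart docker
```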


Thanks!
Tony
________
From: Tony Liu 
Sent: March 21, 2022 09:41 PM
To: Adam King
Cc: ceph-users@ceph.io; d...@ceph.io
Subject: [ceph-users] Re: logging with container

Hi Adam,

# ceph config get mgr log_to_file
true
# ceph config get mgr log_file
/var/log/ceph/$cluster-$name.log
# ceph config get osd log_to_file
true
# ceph config get osd log_file
/var/log/ceph/$cluster-$name.log
# ls /var/log/ceph/fa771070-a975-11ec-86c7-e4434be9cb2e/
ceph-osd.10.log  ceph-osd.13.log  ceph-osd.16.log  ceph-osd.19.log  
ceph-osd.1.log  ceph-osd.22.log  ceph-osd.4.log  ceph-osd.7.log  ceph-volume.log
# ceph version
ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)

"log_to_file" and "log_file" are set the same for mgr and osd, so why is
there only an osd log and no mgr log?


Thanks!
Tony

From: Adam King 
Sent: March 21, 2022 08:26 AM
To: Tony Liu
Cc: ceph-users@ceph.io; d...@ceph.io
Subject: Re: [ceph-users] logging with container

Hi Tony,

Afaik those container flags just set the defaults and the config options 
override them. Setting the necessary flags 
(https://docs.ceph.com/en/latest/cephadm/operations/#logging-to-files) seemed 
to work for me.

[ceph: root@vm-00 /]# ceph config get osd.0 log_to_file
false
[ceph: root@vm-00 /]# ceph config show osd.0 log_to_file
false
[ceph: root@vm-00 /]# ceph config set global log_to_file true
[ceph: root@vm-00 /]# ceph config set global mon_cluster_log_to_file true
[ceph: root@vm-00 /]# ceph config get osd.0 log_to_file
true
[ceph: root@vm-00 /]# ceph config show osd.0 log_to_file
true
[ceph: root@vm-00 /]# ceph version
ceph version 16.2.7-601-g179a7bca (179a7bca8a84771b0dde09e26f7a2146a985df90) 
pacific (stable)
[ceph: root@vm-00 /]# exit
exit
[root@vm-00 ~]# ls /var/log/ceph/413f7ec8-a91e-11ec-9b02-52540092b5a3/
ceph.audit.log  ceph.cephadm.log  ceph.log  ceph-mgr.vm-00.ukcctb.log  
ceph-mon.vm-00.log  ceph-osd.0.log  ceph-osd.10.log  ceph-osd.2.log  
ceph-osd.4.log  ceph-osd.6.log  ceph-osd.8.log  ceph-volume.log



On Mon, Mar 21, 2022 at 1:06 AM Tony Liu <tonyliu0...@hotmail.com> wrote:
Hi,

After reading through the docs, it's still not very clear to me how logging
works with containers. This is with the Pacific v16.2 container.

In OSD container, I see this.
```
/usr/bin/ceph-osd -n osd.16 -f --setuser ceph --setgroup ceph 
--default-log-to-file=false --default-log-to-stderr=true 
--default-log-stderr-prefix=debug
```
When checking the ceph configuration:
```
# ceph config get osd.16 log_file
/var/log/ceph/$cluster-$name.log
# ceph config get osd.16 log_to_file
true
# ceph config show osd.16 log_to_file
false
```
Q1: what's the intention of those log settings on the command line? They have
high priority and override the configuration from the file and the mon. Is
there any option to avoid that when deploying the container?
Q2: since log_to_file is set to false on the command line, why are there
still log entries in log_file?

The same for mgr and mon.

What I want is to have everything in the log file and to minimize the stdout
and stderr from the container. Because the log file is managed by logrotate,
it is unlikely to blow up disk space, but stdout and stderr from the container
are stored in a single file that is not managed by logrotate and may grow into
a huge file. Also, it's easier to check a log file with vi than with
"podman logs", and the log files are also collected and stored by ELK for
central management.

Any comments on how I can achieve what I want?
A runtime override may not be the best option, because it's not persistent.


Thanks!
Tony

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: logging with container

2022-03-21 Thread Tony Liu
Hi Adam,

# ceph config get mgr log_to_file
true
# ceph config get mgr log_file
/var/log/ceph/$cluster-$name.log
# ceph config get osd log_to_file
true
# ceph config get osd log_file
/var/log/ceph/$cluster-$name.log
# ls /var/log/ceph/fa771070-a975-11ec-86c7-e4434be9cb2e/
ceph-osd.10.log  ceph-osd.13.log  ceph-osd.16.log  ceph-osd.19.log  
ceph-osd.1.log  ceph-osd.22.log  ceph-osd.4.log  ceph-osd.7.log  ceph-volume.log
# ceph version
ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)

"log_to_file" and "log_file" are set the same for mgr and osd, so why is
there only an osd log and no mgr log?


Thanks!
Tony

From: Adam King 
Sent: March 21, 2022 08:26 AM
To: Tony Liu
Cc: ceph-users@ceph.io; d...@ceph.io
Subject: Re: [ceph-users] logging with container

Hi Tony,

Afaik those container flags just set the defaults and the config options 
override them. Setting the necessary flags 
(https://docs.ceph.com/en/latest/cephadm/operations/#logging-to-files) seemed 
to work for me.

[ceph: root@vm-00 /]# ceph config get osd.0 log_to_file
false
[ceph: root@vm-00 /]# ceph config show osd.0 log_to_file
false
[ceph: root@vm-00 /]# ceph config set global log_to_file true
[ceph: root@vm-00 /]# ceph config set global mon_cluster_log_to_file true
[ceph: root@vm-00 /]# ceph config get osd.0 log_to_file
true
[ceph: root@vm-00 /]# ceph config show osd.0 log_to_file
true
[ceph: root@vm-00 /]# ceph version
ceph version 16.2.7-601-g179a7bca (179a7bca8a84771b0dde09e26f7a2146a985df90) 
pacific (stable)
[ceph: root@vm-00 /]# exit
exit
[root@vm-00 ~]# ls /var/log/ceph/413f7ec8-a91e-11ec-9b02-52540092b5a3/
ceph.audit.log  ceph.cephadm.log  ceph.log  ceph-mgr.vm-00.ukcctb.log  
ceph-mon.vm-00.log  ceph-osd.0.log  ceph-osd.10.log  ceph-osd.2.log  
ceph-osd.4.log  ceph-osd.6.log  ceph-osd.8.log  ceph-volume.log



On Mon, Mar 21, 2022 at 1:06 AM Tony Liu <tonyliu0...@hotmail.com> wrote:
Hi,

After reading through the docs, it's still not very clear to me how logging
works with containers. This is with the Pacific v16.2 container.

In OSD container, I see this.
```
/usr/bin/ceph-osd -n osd.16 -f --setuser ceph --setgroup ceph 
--default-log-to-file=false --default-log-to-stderr=true 
--default-log-stderr-prefix=debug
```
When checking the ceph configuration:
```
# ceph config get osd.16 log_file
/var/log/ceph/$cluster-$name.log
# ceph config get osd.16 log_to_file
true
# ceph config show osd.16 log_to_file
false
```
Q1: what's the intention of those log settings on the command line? They have
high priority and override the configuration from the file and the mon. Is
there any option to avoid that when deploying the container?
Q2: since log_to_file is set to false on the command line, why are there
still log entries in log_file?

The same for mgr and mon.

What I want is to have everything in the log file and minimize the stdout and
stderr from the container. Because the log file is managed by logrotate, it is
unlikely to blow up disk space. But the stdout and stderr from the container are
stored in a single file, not managed by logrotate, which may grow into a huge file.
Also, it's easier to check a log file with vi than with "podman logs". And the log
file is also collected and stored by ELK for central management.

Any comments on how I can achieve what I want?
A runtime override may not be the best option, because it's not persistent.


Thanks!
Tony

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



[ceph-users] Re: bind monitoring service to specific network and port

2022-03-21 Thread Tony Liu
It's probably related to podman again. After switching back to Docker,
this works fine.

Thanks!
Tony

From: Tony Liu 
Sent: March 20, 2022 06:31 PM
To: ceph-users@ceph.io; d...@ceph.io
Subject: [ceph-users] bind monitoring service to specific network and port

Hi,

https://docs.ceph.com/en/pacific/cephadm/services/monitoring/#networks-and-ports
When I try that with the Pacific v16.2 image, the port works but the network doesn't.
No matter which network is specified in the yaml file, orch apply always binds the
service to *. Is this a known issue, or am I missing something?
Could anyone point me to the code for this?


Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: orch apply failed to use insecure private registry

2022-03-21 Thread Tony Liu
It's podman issue.
https://github.com/containers/podman/issues/11933
Switch back to Docker.

Thanks!
Tony

From: Eugen Block 
Sent: March 21, 2022 06:11 AM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: orch apply failed to use insecure private registry

Hi,

> Setting mgr/cephadm/registry_insecure to false doesn't help.

if you want to use an insecure registry you would need to set this
option to true, not false.

> I am using podman and /etc/containers/registries.conf is set with
> that insecure private registry.

Can you paste the whole content? It's been two years or so since I
tested a setup with an insecure registry; I believe the
registries.conf also requires a line with "insecure = true". I'm not
sure if this will be enough, though. Did you successfully log in to the
registry from all nodes?

ceph cephadm registry-login my_url my_username my_password

Zitat von Tony Liu :

> Hi,
>
> I am using the Pacific v16.2 container image. I put images on an insecure
> private registry.
> I am using podman and /etc/containers/registries.conf is set with
> that insecure private registry.
> "cephadm bootstrap" works fine to pull the image and setup the first node.
> When "ceph orch apply -i service.yaml" to deploy services on all
> nodes, "ceph log last cephadm"
> shows the failure to ping private registry with SSL.
> Setting mgr/cephadm/registry_insecure to false doesn't help.
> I have to manually pull all images on all nodes, then "orch apply"
> continues and all services are deployed.
> Is this known issue or some settings I am missing?
> Could anyone point me to the cephadm code to pull container image?
>
>
> Thanks!
> Tony
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] logging with container

2022-03-20 Thread Tony Liu
Hi,

After reading through the docs, it's still not very clear to me how logging works
with containers.
This is with Pacific v16.2 container.

In the OSD container, I see this.
```
/usr/bin/ceph-osd -n osd.16 -f --setuser ceph --setgroup ceph 
--default-log-to-file=false --default-log-to-stderr=true 
--default-log-stderr-prefix=debug
```
When checking the Ceph configuration.
```
# ceph config get osd.16 log_file
/var/log/ceph/$cluster-$name.log
# ceph config get osd.16 log_to_file
true
# ceph config show osd.16 log_to_file
false
```
Q1: what's the intention of those log settings on the command line? They take high
priority and override the configuration from file and mon. Is there any option to
avoid that when deploying the container?
Q2: since log_to_file is set to false on the command line, why are there still
log messages in log_file?

The same for mgr and mon.

What I want is to have everything in the log file and minimize the stdout and
stderr from the container. Because the log file is managed by logrotate, it is
unlikely to blow up disk space. But the stdout and stderr from the container are
stored in a single file, not managed by logrotate, which may grow into a huge file.
Also, it's easier to check a log file with vi than with "podman logs". And the log
file is also collected and stored by ELK for central management.

Any comments on how I can achieve what I want?
A runtime override may not be the best option, because it's not persistent.


Thanks!
Tony

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] bind monitoring service to specific network and port

2022-03-20 Thread Tony Liu
Hi,

https://docs.ceph.com/en/pacific/cephadm/services/monitoring/#networks-and-ports
When I try that with the Pacific v16.2 image, the port works but the network doesn't.
No matter which network is specified in the yaml file, orch apply always binds the
service to *. Is this a known issue, or am I missing something?
Could anyone point me to the code for this?


Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] orch apply failed to use insecure private registry

2022-03-20 Thread Tony Liu
Hi,

I am using the Pacific v16.2 container image. I put images on an insecure private
registry.
I am using podman, and /etc/containers/registries.conf is set with that insecure
private registry.
"cephadm bootstrap" works fine to pull the image and set up the first node.
When running "ceph orch apply -i service.yaml" to deploy services on all nodes,
"ceph log last cephadm" shows a failure to ping the private registry with SSL.
Setting mgr/cephadm/registry_insecure to false doesn't help.
I have to manually pull all images on all nodes; then "orch apply" continues and
all services are deployed.
Is this a known issue, or are there settings I am missing?
Could anyone point me to the cephadm code to pull container image?


Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] [rgw][dashboard] dashboard can't access rgw behind proxy

2022-01-19 Thread Tony Liu
Hi,

I have 3 rgw services behind HAProxy. rgw-api-host and rgw-api-port are set properly
to the VIP and port. "curl http://:" works fine, but the dashboard complains
that it can't find an rgw service on that vip:port. If I set rgw-api-host directly
to the node, it also works fine. I ran tcpdump on the active mgr node, and I don't
see any traffic going out to that VIP at all. Am I missing anything here? Does the
dashboard need to resolve rgw-api-host to some name?


Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [Ceph-community] Why MON,MDS,MGR are on Public network?

2021-11-29 Thread Tony Liu
Is there any measurement of how much bandwidth is taken by private traffic vs.
public/client traffic when they are on the same network?
I currently have two 2x10G bondings for public and private; the intention
is to provide 2x10G bandwidth for clients. I do understand the overhead caused
by more networks, but I think it's more critical to guarantee client bandwidth,
especially when there is more private traffic during maintenance, rebalancing, etc.

Thanks!
Tony

From: Anthony D'Atri 
Sent: November 29, 2021 02:14 PM
To: ceph-us...@ceph.com
Subject: [ceph-users] Re: [Ceph-community] Why MON,MDS,MGR are on Public 
network?



>
>> I don't trust the public network and afraid of if mons goes down due to this 
>> problem? So to be more secure and faster I need to understand the reason; 3- 
>> Why Mon,Mds,Mgr >should be
>>  on public network?
>

Remember that the clients need to reach the mons and any MDS the cluster has.  
It is not unusual for a separate replication/private/backend network to not 
have a default route or otherwise be unreachable from non-OSD nodes.

> The idea to separate OSD<->OSD traffic probably comes from the fact
> that replication means data gets multiplied over the network, so if a
> client writes 1G data to a pool with replication=3, then two more
> copies of that 1G needs to be sent, and if you do that on the "public"
> network, you might starve it with replication (or repair/backfill)
> traffic.
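To put numbers on the quoted example (a sketch; the 1 GiB write and replication=3 are just the example's figures, not a measurement of any real cluster):

```python
# Replication amplifies writes on the backend network: with replication=3,
# the primary OSD forwards two extra copies of every client write.
GiB = 1 << 30

client_write = 1 * GiB    # data written by the client (public network)
replication = 3           # pool size

# Copies the primary must send to peer OSDs over the cluster network.
replication_traffic = client_write * (replication - 1)
# Total bytes written across the cluster.
total_written = client_write * replication

print(replication_traffic // GiB)  # 2 -> two more 1 GiB copies on the wire
print(total_written // GiB)        # 3
```

So on a single shared network, 1 GiB of client writes competes with 2 GiB of replication traffic, which is the starvation concern described above.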

Indeed, one of the rationales is to prevent client/mon traffic and OSD 
replication traffic from DoSing each other.

> Many run with only one network, using as fast a network as you can
> afford, but if two separate networks at moderate speed is cheaper than
> one super fast, it might be worth considering, otherwise just scale
> the one single network to your needs.

Notably sometimes we see nodes with only two network ports.  One could run 
separate public/client and private/replication networks, without redundancy — 
or use bonding / EQR for redundancy but no dedicated replication network.

The two-network strategy dates from a time when 1 Gb/s networking was common 
and 10 Gb/s was cutting edge.  With today’s faster networks and Ceph’s multiple 
improvements in recovery/backfill, the equation and tradeoffs are different 
from where they were ten years ago.  Ceph is pretty good these days at 
detecting when an entire node is down, and with scrub randomization, reporters 
settings and a wise mon_osd_down_out_subtree_limit setting, thundering herds of 
backfill/recovery are much less of a problem than they used to be.

Switches, patch panels, crossconnects take up RUs and cost OpEx.  Sometimes the 
RUs saved by not having two networks means you can fit another node or two into 
each rack.

Having a replication network can result in certain flapping situations that can 
be tricky to troubleshoot, that’s what finally led me to embrace the single 
network architecture.  ymmv.

Also when you have five network connections to a given node, it’s super easy 
during maintenance to not get them plugged back in correctly, no matter how 
laboriously one labels the cables.  (dual public, dual private, BMC/netmgt).  
Admittedly this probably isn’t a gating factor, but it still happens ;)

— aad





___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [EXTERNAL] Re: Why you might want packages not containers for Ceph deployments

2021-11-18 Thread Tony Liu
Instead of complaining, taking some time to learn more about containers would help.

Tony

From: Marc 
Sent: November 18, 2021 10:50 AM
To: Pickett, Neale T; Hans van den Bogert; ceph-users@ceph.io
Subject: [ceph-users] Re: [EXTERNAL] Re: Why you might want packages not 
containers for Ceph deployments

> We also use containers for ceph and love it. If for some reason we
> couldn't run ceph this way any longer, we would probably migrate
> everything to a different solution. We are absolutely committed to
> containerization.

I wonder if you are really using containers. Are you not just using ceph-adm?
If you were really using containers you would have selected your OC already, and
would be pissed about how the current containers are being developed and about
having to use a 2nd system.






___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph cluster Sync

2021-10-12 Thread Tony Liu
For the PR-DR case, I am using RGW multi-site support to replicate backup images.

Tony

From: Manuel Holtgrewe 
Sent: October 12, 2021 11:40 AM
To: dhils...@performair.com
Cc: mico...@gmail.com; ceph-users
Subject: [ceph-users] Re: Ceph cluster Sync

To chime in here, there is

https://github.com/45Drives/cephgeorep

That allows cephfs replication pre pacific.

There is a mail thread somewhere on the list where a Ceph developer warns
about semantic issues with recursive mtime even on Pacific. However,
according to 45Drives they have never had an issue, so YMMV.

HTH

 wrote on Tue, 12 Oct 2021, 18:55:

> Michel;
>
> I am neither a Ceph evangelist, nor a Ceph expert, but here is my current
> understanding:
> Ceph clusters do not have in-built cross cluster synchronization.  That
> said, there are several things which might meet your needs.
>
> 1) If you're just planning your Ceph deployment, then the latest release
> (Pacific) introduced the concept of a stretch cluster, essentially a
> cluster which is stretched across datacenters (i.e. a relatively
> low-bandwidth, high-latency link)[1].
>
> 2) RADOSGW allows for uni-directional as well as bi-directional
> synchronization of the data that it handles.[2]
>
> 3) RBD provides mirroring functionality for the data it handles.[3]
>
> Thank you,
>
> Dominic L. Hilsbos, MBA
> Vice President - Information Technology
> Perform Air International Inc.
> dhils...@performair.com
> www.PerformAir.com
>
> [1] https://docs.ceph.com/en/latest/rados/operations/stretch-mode/
> [2] https://docs.ceph.com/en/latest/radosgw/sync-modules/
> [3] https://docs.ceph.com/en/latest/rbd/rbd-mirroring/
>
>
> -Original Message-
> From: Michel Niyoyita [mailto:mico...@gmail.com]
> Sent: Tuesday, October 12, 2021 8:35 AM
> To: ceph-users
> Subject: [ceph-users] Ceph cluster Sync
>
> Dear team
>
> I want to build two different cluster: one for primary site and the second
> for DR site. I would like to ask if these two cluster can
> communicate(synchronized) each other and data written to the PR site be
> synchronized to the DR site ,  if once we got trouble for the PR site the
> DR automatically takeover.
>
> Please help me for the solution or advise me how to proceed
>
> Best Regards
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] etcd support

2021-09-20 Thread Tony Liu
Hi,

I wonder if anyone could share some experiences in etcd support by Ceph.
My users build Kubernetes clusters in VMs on OpenStack with Ceph.
With an HDD-backed volume (DB/WAL on SSD), the etcd performance test sometimes
fails because of latency. With an all-SSD volume, it works fine.
I wonder if there is anything I can improve with the HDD volume, or whether it
has to be an SSD volume to support etcd?


Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Cannot create a container, mandatory "Storage Policy" dropdown field is empty

2021-09-13 Thread Tony Liu
Update /usr/lib/python3.6/site-packages/swiftclient/client.py and restart the
horizon container.
This fixes the error message on the dashboard when it tries to retrieve the
policy list.

-parsed = urlparse(urljoin(url, '/info'))
+parsed = urlparse(urljoin(url, '/swift/info'))
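The reason this one-line patch works can be seen with `urljoin` alone: an absolute path like '/info' replaces the entire path of the base URL, so the '/swift' prefix that RADOSGW serves its Swift API under is dropped (the endpoint URL below is hypothetical):

```python
from urllib.parse import urljoin

# Hypothetical RADOSGW Swift endpoint as handed to python-swiftclient.
base = "http://rgw.example.com/swift/v1/AUTH_abc123"

# The original code: '/info' wipes out the '/swift' prefix -> wrong URL.
print(urljoin(base, "/info"))        # http://rgw.example.com/info
# The patched code keeps the prefix RADOSGW actually serves.
print(urljoin(base, "/swift/info"))  # http://rgw.example.com/swift/info
```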


Tony

From: Michel Niyoyita 
Sent: September 13, 2021 01:08 AM
To: ceph-users
Subject: [ceph-users] Cannot create a container, mandatory "Storage Policy" 
dropdown field is empty

  Hello team ,

I am replacing Swift with the Ceph RADOS Gateway. I can successfully create
containers through the OpenStack and Ceph CLIs, but when trying to create one
through the Horizon dashboard I get these errors: "Unable to fetch the policy
details.", "Unable to get the Swift service info" and "Unable to get the Swift
container listing".

Has anyone faced the same issue and can help?
I deployed OpenStack Wallaby using kolla-ansible running on Ubuntu 20.04
and Ceph Pacific using ansible running on Ubuntu 20.04.

Michel
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: debug RBD timeout issue

2021-09-08 Thread Tony Liu
Here it is.

[global]
fsid = 35d050c0-77c0-11eb-9242-2cea7ff9d07c
mon_host = [v2:10.250.50.80:3300/0,v1:10.250.50.80:6789/0] 
[v2:10.250.50.81:3300/0,v1:10.250.50.81:6789/0] 
[v2:10.250.50.82:3300/0,v1:10.250.50.82:6789/0]


Thanks!
Tony

From: Konstantin Shalygin 
Sent: September 8, 2021 08:29 AM
To: Tony Liu
Cc: ceph-users@ceph.io; d...@ceph.io
Subject: Re: [ceph-users] debug RBD timeout issue

What is the ceph.conf for this rbd client?


k

Sent from my iPhone

> On 7 Sep 2021, at 19:54, Tony Liu  wrote:
>
>
> I have OpenStack Ussuri and Ceph Octopus. Sometimes, I see timeout when create
> or delete volumes. I can see RBD timeout from cinder-volume. Has anyone seen 
> such
> issue? I'd like to see what happens on Ceph. Which service should I look 
> into? Is it stuck
> with mon or any OSD? Any option to enable debugging to get more details?

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: debug RBD timeout issue

2021-09-08 Thread Tony Liu
That's what I am trying to figure out: "what exactly could cause a timeout".
The user creates 10 VMs (boot-on-volume plus an attached volume) with Terraform,
then destroys them. Repeating this works fine most times; a timeout happens
occasionally at different places, during volume creation or volume deletion.
Since Terraform manages resources in parallel, 10 by default, I am not sure if it
matters how cinder-volume handles those requests. I doubt I can reproduce it with
rbd directly.
I will enable debug logging in cinder-volume to get more info. In the meantime,
I wonder how I can get more info from Ceph to understand such timeouts better.


Thanks!
Tony

From: Eugen Block 
Sent: September 8, 2021 01:05 AM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: debug RBD timeout issue

Hi,

from an older cloud version I remember having to increase these settings:

[DEFAULT]
block_device_allocate_retries = 300
block_device_allocate_retries_interval = 10
block_device_creation_timeout = 300


The question is what exactly could cause a timeout. You write that you
only see these timeouts from time to time, so you should try to find
out what the difference is between successful and failing volumes. Is
it the size or anything else? Which glance stores are enabled? Can you
reproduce it, for example with 'rbd create ...' as the cinder user? Then
you could increase 'debug_rbd' and see if that reveals anything.

Zitat von Tony Liu :

> Hi,
>
> I have OpenStack Ussuri and Ceph Octopus. Sometimes, I see timeout
> when create
> or delete volumes. I can see RBD timeout from cinder-volume. Has
> anyone seen such
> issue? I'd like to see what happens on Ceph. Which service should I
> look into? Is it stuck
> with mon or any OSD? Any option to enable debugging to get more details?
>
> oslo_messaging.rpc.server [req-7802dea8-15f6-4177-b07c-e5241615b777
> d0dddad1fc7a4adf8ef5b185567e1842 b9adeeb6dbd54710a0b033ee49045b54 -
> default default] Exception during message handling: rbd.Timeout:
> [errno 110] error removing image
> oslo_messaging.rpc.server Traceback (most recent call last):
> oslo_messaging.rpc.server   File
> "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/server.py",
> line 165, in _process_incoming
> oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
> oslo_messaging.rpc.server   File
> "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py",
> line 276, in dispatch
> oslo_messaging.rpc.server return self._do_dispatch(endpoint,
> method, ctxt, args)
> oslo_messaging.rpc.server   File
> "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py",
> line 196, in _do_dispatch
> oslo_messaging.rpc.server result = func(ctxt, **new_args)
> oslo_messaging.rpc.server   File
> "",
> line 2, in delete_volume
> oslo_messaging.rpc.server   File
> "/usr/lib/python3.6/site-packages/cinder/coordination.py", line 151,
> in _synchronized
> oslo_messaging.rpc.server return f(*a, **k)
> oslo_messaging.rpc.server   File
> "",
> line 2, in delete_volume
> oslo_messaging.rpc.server   File
> "/usr/lib/python3.6/site-packages/cinder/objects/cleanable.py", line
> 212, in wrapper
> oslo_messaging.rpc.server result = f(*args, **kwargs)
> oslo_messaging.rpc.server   File
> "/usr/lib/python3.6/site-packages/cinder/volume/manager.py", line
> 917, in delete_volume
> oslo_messaging.rpc.server new_status)
> oslo_messaging.rpc.server   File
> "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220,
> in __exit__
> oslo_messaging.rpc.server self.force_reraise()
> oslo_messaging.rpc.server   File
> "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196,
> in force_reraise
> oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
> oslo_messaging.rpc.server   File
> "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise
> oslo_messaging.rpc.server raise value
> oslo_messaging.rpc.server   File
> "/usr/lib/python3.6/site-packages/cinder/volume/manager.py", line
> 899, in delete_volume
> oslo_messaging.rpc.server self.driver.delete_volume(volume)
> oslo_messaging.rpc.server   File
> "/usr/lib/python3.6/site-packages/cinder/volume/drivers/rbd.py",
> line 1160, in delete_volume
> oslo_messaging.rpc.server _try_remove_volume(client, volume_name)
> oslo_messaging.rpc.server   File
> "/usr/lib/python3.6/site-packages/cinder/utils.py", line 696, in
> _wrapper
> oslo_messaging.rpc.server return r.call(f, *args, **kwargs)
> oslo_messaging.rpc.server   File
> "/usr/lib/python3.6/site-packages/retrying.py", line 223, in call
> oslo_messaging.r

[ceph-users] debug RBD timeout issue

2021-09-07 Thread Tony Liu
Hi,

I have OpenStack Ussuri and Ceph Octopus. Sometimes I see timeouts when creating
or deleting volumes. I can see the RBD timeout from cinder-volume. Has anyone seen
such an issue? I'd like to see what happens on Ceph. Which service should I look
into? Is it stuck with a mon or some OSD? Any option to enable debugging to get
more details?

oslo_messaging.rpc.server [req-7802dea8-15f6-4177-b07c-e5241615b777 
d0dddad1fc7a4adf8ef5b185567e1842 b9adeeb6dbd54710a0b033ee49045b54 - default 
default] Exception during message handling: rbd.Timeout: [errno 110] error 
removing image
oslo_messaging.rpc.server Traceback (most recent call last):
oslo_messaging.rpc.server   File 
"/usr/lib/python3.6/site-packages/oslo_messaging/rpc/server.py", line 165, in 
_process_incoming
oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
oslo_messaging.rpc.server   File 
"/usr/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", line 276, 
in dispatch
oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, 
args)
oslo_messaging.rpc.server   File 
"/usr/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", line 196, 
in _do_dispatch
oslo_messaging.rpc.server result = func(ctxt, **new_args)
oslo_messaging.rpc.server   File 
"", line 2, in 
delete_volume
oslo_messaging.rpc.server   File 
"/usr/lib/python3.6/site-packages/cinder/coordination.py", line 151, in 
_synchronized
oslo_messaging.rpc.server return f(*a, **k)
oslo_messaging.rpc.server   File 
"", line 2, in 
delete_volume
oslo_messaging.rpc.server   File 
"/usr/lib/python3.6/site-packages/cinder/objects/cleanable.py", line 212, in 
wrapper
oslo_messaging.rpc.server result = f(*args, **kwargs)
oslo_messaging.rpc.server   File 
"/usr/lib/python3.6/site-packages/cinder/volume/manager.py", line 917, in 
delete_volume
oslo_messaging.rpc.server new_status)
oslo_messaging.rpc.server   File 
"/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__
oslo_messaging.rpc.server self.force_reraise()
oslo_messaging.rpc.server   File 
"/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in 
force_reraise
oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/six.py", 
line 703, in reraise
oslo_messaging.rpc.server raise value
oslo_messaging.rpc.server   File 
"/usr/lib/python3.6/site-packages/cinder/volume/manager.py", line 899, in 
delete_volume
oslo_messaging.rpc.server self.driver.delete_volume(volume)
oslo_messaging.rpc.server   File 
"/usr/lib/python3.6/site-packages/cinder/volume/drivers/rbd.py", line 1160, in 
delete_volume
oslo_messaging.rpc.server _try_remove_volume(client, volume_name)
oslo_messaging.rpc.server   File 
"/usr/lib/python3.6/site-packages/cinder/utils.py", line 696, in _wrapper
oslo_messaging.rpc.server return r.call(f, *args, **kwargs)
oslo_messaging.rpc.server   File 
"/usr/lib/python3.6/site-packages/retrying.py", line 223, in call
oslo_messaging.rpc.server return attempt.get(self._wrap_exception)
oslo_messaging.rpc.server   File 
"/usr/lib/python3.6/site-packages/retrying.py", line 261, in get
oslo_messaging.rpc.server six.reraise(self.value[0], self.value[1], 
self.value[2])
oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/six.py", 
line 703, in reraise
oslo_messaging.rpc.server raise value
oslo_messaging.rpc.server   File 
"/usr/lib/python3.6/site-packages/retrying.py", line 217, in call
oslo_messaging.rpc.server attempt = Attempt(fn(*args, **kwargs), 
attempt_number, False)
oslo_messaging.rpc.server   File 
"/usr/lib/python3.6/site-packages/cinder/volume/drivers/rbd.py", line 1155, in 
_try_remove_volume
oslo_messaging.rpc.server self.RBDProxy().remove(client.ioctx, volume_name)
oslo_messaging.rpc.server   File 
"/usr/lib/python3.6/site-packages/eventlet/tpool.py", line 190, in doit
oslo_messaging.rpc.server result = proxy_call(self._autowrap, f, *args, 
**kwargs)
oslo_messaging.rpc.server   File 
"/usr/lib/python3.6/site-packages/eventlet/tpool.py", line 148, in proxy_call
oslo_messaging.rpc.server rv = execute(f, *args, **kwargs)
oslo_messaging.rpc.server   File 
"/usr/lib/python3.6/site-packages/eventlet/tpool.py", line 129, in execute
oslo_messaging.rpc.server six.reraise(c, e, tb)
oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/six.py", 
line 703, in reraise
oslo_messaging.rpc.server raise value
oslo_messaging.rpc.server   File 
"/usr/lib/python3.6/site-packages/eventlet/tpool.py", line 83, in tworker
oslo_messaging.rpc.server rv = meth(*args, **kwargs)
oslo_messaging.rpc.server   File "rbd.pyx", line 1283, in rbd.RBD.remove
oslo_messaging.rpc.server rbd.Timeout: [errno 110] error removing image


Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd object mapping

2021-08-09 Thread Tony Liu
Thank you Konstantin!
Tony

From: Konstantin Shalygin 
Sent: August 9, 2021 01:20 AM
To: Tony Liu
Cc: ceph-users; d...@ceph.io
Subject: Re: [ceph-users] rbd object mapping


On 8 Aug 2021, at 20:10, Tony Liu wrote:

That's what I thought. I am confused by this.

# ceph osd map vm fcb09c9c-4cd9-44d8-a20b-8961c6eedf8e_disk
osdmap e18381 pool 'vm' (4) object 'fcb09c9c-4cd9-44d8-a20b-8961c6eedf8e_disk' 
-> pg 4.c7a78d40 (4.0) -> up ([4,17,6], p4) acting ([4,17,6], p4)

It calls the RBD image an "object" and shows the whole image mapping to a single PG,
while the image is actually split into many objects, each of which maps to its own PG.
How am I supposed to understand the output of this command?

You can execute `ceph osd map vm nonexist` and you will see the mapping for a
'nonexist' object - its future mapping.
To achieve the mappings for each object of your image, you need to find all objects
by the rbd header prefix and iterate over this list.



k
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd object mapping

2021-08-08 Thread Tony Liu
>> There are two types of "object": the RBD-image-object and the 8MiB-block-object.
>> When creating an RBD image, an RBD-image-object is created and 12800
>> 8MiB-block-objects are allocated. That whole RBD-image-object is mapped to a
>> single PG, which is mapped to 3 OSDs (replica 3). That means all user data on
>> that RBD image is stored on those 3 OSDs. Is my understanding correct?
>
> An RBD image is not an object; it is a bunch of objects forming a block device
> abstraction.
> Nope, each object of an image gets its own pseudo-random placement. For example,
> if you have 1 osds and a 10GiB image with 4MiB objects, your image may be placed
> into 2560 different PGs on 100-1000-2560 OSDs...

That's what I thought. I am confused by this.

# ceph osd map vm fcb09c9c-4cd9-44d8-a20b-8961c6eedf8e_disk
osdmap e18381 pool 'vm' (4) object 'fcb09c9c-4cd9-44d8-a20b-8961c6eedf8e_disk' 
-> pg 4.c7a78d40 (4.0) -> up ([4,17,6], p4) acting ([4,17,6], p4)

It calls the RBD image an "object" and shows the whole image mapping to a single PG,
while the image is actually split into many objects, each of which maps to its own PG.
How am I supposed to understand the output of this command?


Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd object mapping

2021-08-07 Thread Tony Liu
There are two types of "object": the RBD-image-object and the 8MiB-block-object.
When creating an RBD image, an RBD-image-object is created and 12800
8MiB-block-objects are allocated. That whole RBD-image-object is mapped to a single
PG, which is mapped to 3 OSDs (replica 3). That means all user data on that RBD
image is stored on those 3 OSDs. Is my understanding correct?

I doubt it because, for example, on a Ceph cluster with a bunch of 2TB drives, a
user wouldn't be able to create an RBD image bigger than 2TB. I don't believe
that's true. So, what am I missing here?

Thanks!
Tony

From: Konstantin Shalygin 
Sent: August 7, 2021 11:35 AM
To: Tony Liu
Cc: ceph-users; d...@ceph.io
Subject: Re: [ceph-users] rbd object mapping

The object map shows where your object with any object name will be placed in the
defined pool with your CRUSH map, and which OSDs will serve this PG.
You can type anything as the object name and get the future placement, or the
placement of an existing object - this is how the algorithm works.

12800 means that your 100GiB image consists of 12800 objects of 8 MiB in pool vm.
All these objects are prefixed with the rbd header (block_name_prefix seems to be
the modern name for this).
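The numbers in the `rbd info` output line up arithmetically; a quick check (pure arithmetic, no cluster needed):

```python
# "order 23" means each backing object is 2^23 bytes = 8 MiB;
# a 100 GiB image therefore needs 100 GiB / 8 MiB = 12800 objects.
order = 23
object_size = 1 << order       # 2^23 bytes = 8 MiB
image_size = 100 * (1 << 30)   # 100 GiB

print(object_size // (1 << 20), "MiB per object")  # 8
print(image_size // object_size, "objects")        # 12800
```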


Cheers,
k

On 7 Aug 2021, at 21:27, Tony Liu wrote:

This shows one RBD image is treated as one object, and it's mapped to one PG.
"object" here means a RBD image.

# ceph osd map vm fcb09c9c-4cd9-44d8-a20b-8961c6eedf8e_disk
osdmap e18381 pool 'vm' (4) object 'fcb09c9c-4cd9-44d8-a20b-8961c6eedf8e_disk' 
-> pg 4.c7a78d40 (4.0) -> up ([4,17,6], p4) acting ([4,17,6], p4)

When show the info of this image, what's that "12800 objects" mean?
And what's that "order 23 (8 MiB objects)" mean?
What's "objects" here?

# rbd info vm/fcb09c9c-4cd9-44d8-a20b-8961c6eedf8e_disk
rbd image 'fcb09c9c-4cd9-44d8-a20b-8961c6eedf8e_disk':
   size 100 GiB in 12800 objects
   order 23 (8 MiB objects)
   snapshot_count: 0
   id: affa8fb94beb7e
   block_name_prefix: rbd_data.affa8fb94beb7e
   format: 2


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] rbd object mapping

2021-08-07 Thread Tony Liu
Hi,

This shows one RBD image treated as one object, and it's mapped to one PG.
"object" here means an RBD image.

# ceph osd map vm fcb09c9c-4cd9-44d8-a20b-8961c6eedf8e_disk
osdmap e18381 pool 'vm' (4) object 'fcb09c9c-4cd9-44d8-a20b-8961c6eedf8e_disk' 
-> pg 4.c7a78d40 (4.0) -> up ([4,17,6], p4) acting ([4,17,6], p4)

When showing the info of this image, what does that "12800 objects" mean?
And what does "order 23 (8 MiB objects)" mean?
What are "objects" here?

# rbd info vm/fcb09c9c-4cd9-44d8-a20b-8961c6eedf8e_disk 
rbd image 'fcb09c9c-4cd9-44d8-a20b-8961c6eedf8e_disk':
size 100 GiB in 12800 objects
order 23 (8 MiB objects)
snapshot_count: 0
id: affa8fb94beb7e
block_name_prefix: rbd_data.affa8fb94beb7e
format: 2
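Worth noting: `ceph osd map` will happily map *any* name you type, including the image name itself, which is why it can look like the whole image sits in one PG. The actual data objects are named from block_name_prefix plus a 16-digit hex index; a sketch of the names the image's data actually lives under (prefix taken from the `rbd info` output above):

```python
prefix = "rbd_data.affa8fb94beb7e"  # block_name_prefix from `rbd info`
# format-2 data objects are named <prefix>.<object number as 16 hex digits>
names = [f"{prefix}.{i:016x}" for i in range(12800)]
print(names[0])   # -> rbd_data.affa8fb94beb7e.0000000000000000
print(names[-1])  # -> rbd_data.affa8fb94beb7e.00000000000031ff
```

Any of these names can then be fed to `ceph osd map vm <name>` to see the PG and OSD set that particular 8 MiB chunk maps to.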


Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [cinder-backup][ceph] replicate volume between sites

2021-07-31 Thread Tony Liu
Found a way to import a volume [1]. Will validate it.
Still looking into object store multi-site vs. RBD mirror for replicating volumes
between sites. Any comments are welcome.

[1] 
https://docs.openstack.org/python-cinderclient/latest/cli/details.html#cinder-manage

Thanks!
Tony

From: Tony Liu 
Sent: July 30, 2021 09:16 PM
To: openstack-discuss; ceph-users
Subject: [ceph-users] [cinder-backup][ceph] replicate volume between sites

Hi,

I have two sites with OpenStack Victoria deployed by Kolla and Ceph Octopus
deployed by cephadm. As far as I know, either Swift (implemented by RADOSGW)
or RBD is supported to be the backend of cinder-backup. My intention is to use
one of those option to replicate Cinder volume from one site to another, based
on RADOSGW multi-site support or RBD mirroring. I wonder if anyone has done
this and could share some opinions, like pros, cons, which way is better for
which case?

One specific thing I am not clear on is how to import an RBD volume into Cinder
when it gets to another site. There used to be a way, but it's deprecated.

Any comments are appreciated.

Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] [cinder-backup][ceph] replicate volume between sites

2021-07-30 Thread Tony Liu
Hi,

I have two sites with OpenStack Victoria deployed by Kolla and Ceph Octopus
deployed by cephadm. As far as I know, either Swift (implemented by RADOSGW)
or RBD is supported to be the backend of cinder-backup. My intention is to use
one of those option to replicate Cinder volume from one site to another, based
on RADOSGW multi-site support or RBD mirroring. I wonder if anyone has done
this and could share some opinions, like pros, cons, which way is better for
which case?

One specific thing I am not clear on is how to import an RBD volume into Cinder
when it gets to another site. There used to be a way, but it's deprecated.

Any comments are appreciated.

Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Do I need to update ceph.conf and restart each OSD after adding more MONs?

2021-03-28 Thread Tony Liu
Thank you Stefan and Josh!
Tony

From: Josh Baergen 
Sent: March 28, 2021 08:28 PM
To: Tony Liu
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: Do I need to update ceph.conf and restart each 
OSD after adding more MONs?

As was mentioned in this thread, all of the mon clients (OSDs included) learn 
about other mons through monmaps, which are distributed when mon membership and 
election changes. Thus, your OSDs should already know about the new mons.

mon_host indicates the list of mons that mon clients should try to contact at 
boot. Thus, it's important to have correct in the config but doesn't need to be 
updated after the process starts.

At least that's how I understand it; the config docs aren't terribly clear on 
this behaviour.

Josh


On Sat., Mar. 27, 2021, 2:07 p.m. Tony Liu  wrote:
Just realized that all config files (/var/lib/ceph///config)
on all nodes are already updated properly. It must be handled as part of adding
MONs. But "ceph config show" shows only a single host.

mon_host   [v2:10.250.50.80:3300/0,v1:10.250.50.80:6789/0]  file

That means I still need to restart all services to apply the update, right?
Is this supposed to be part of adding MONs as well, or an additional manual step?


Thanks!
Tony
________
From: Tony Liu 
Sent: March 27, 2021 12:53 PM
To: Stefan Kooman; ceph-users@ceph.io
Subject: [ceph-users] Re: Do I need to update ceph.conf and restart each OSD 
after adding more MONs?

# ceph config set osd.0 mon_host 
[v2:10.250.50.80:3300/0,v1:10.250.50.80:6789/0,v2:10.250.50.81:3300/0,v1:10.250.50.81:6789/0,v2:10.250.50.82:3300/0,v1:10.250.50.82:6789/0]
Error EINVAL: mon_host is special and cannot be stored by the mon

It seems that the only option is to update ceph.conf and restart service.


Tony
________
From: Tony Liu 
Sent: March 27, 2021 12:20 PM
To: Stefan Kooman; ceph-users@ceph.io
Subject: [ceph-users] Re: Do I need to update ceph.conf and restart each OSD 
after adding more MONs?

I expanded MON from 1 to 3 by updating orch service "ceph orch apply".
"mon_host" in all services (MON, MGR, OSDs) is not updated. It's still single
host from source "file".
What's the guidance here to update "mon_host" for all services? I am talking
about Ceph services, not client side.
Should I update ceph.conf for all services and restart all of them?
Or I can update it on-the-fly by "ceph config set"?
In the latter case, where the updated configuration is stored? Is it going to
be overridden by ceph.conf when restart service?


Thanks!
Tony

____
From: Stefan Kooman 
Sent: March 26, 2021 12:22 PM
To: Tony Liu; ceph-users@ceph.io
Subject: Re: [ceph-users] Do I need to update ceph.conf and restart each OSD 
after adding more MONs?

On 3/26/21 6:06 PM, Tony Liu wrote:
> Hi,
>
> Do I need to update ceph.conf and restart each OSD after adding more MONs?

This should not be necessary, as the OSDs should learn about these
changes through monmaps. Updating the ceph.conf after the mons have been
updated is advised.

> This is with 15.2.8 deployed by cephadm.
>
> When adding MON, "mon_host" should be updated accordingly.
> Given [1], is that update "the monitor cluster’s centralized configuration
> database" or "runtime overrides set by an administrator"?

No need to put that in the centralized config database. I *think* they
mean ceph.conf file on the clients and hosts. At least, that's what you
would normally do (if not using DNS).

Gr. Stefan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: memory consumption by osd

2021-03-27 Thread Tony Liu
I don't see any problems yet. All OSDs are working fine.
Just that 1.8GB free memory concerns me.
I know 256GB memory for 10 OSDs (16TB HDD) is a lot. I am planning to
reduce it, or increase osd_memory_target (if that's what you meant) to
boost performance. But before doing that, I'd like to understand what's
taking so much buff/cache and whether there is any option to control it.


Thanks!
Tony

From: Anthony D'Atri 
Sent: March 27, 2021 07:27 PM
To: ceph-users
Subject: [ceph-users] Re: memory consumption by osd


Depending on your kernel version, MemFree can be misleading.  Attend to the 
value of MemAvailable instead.
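To make that concrete with the numbers posted in this thread (a rough approximation only: the kernel's MemAvailable heuristic also accounts for watermarks and counts only part of the page cache, so this is illustrative, not exact):

```python
# Values in kB, taken from the /proc/meminfo output in this thread.
mem = {
    "MemFree":        2_212_484,
    "Buffers":      219_061_308,   # the "missing" 217Gi lives here
    "Cached":         2_066_532,
    "MemAvailable": 226_842_848,
}
# MemFree + Buffers + Cached is a crude stand-in for "reclaimable";
# it lands within a couple of percent of what the kernel reports,
# which is why the box is fine despite the tiny "free" figure.
estimate = mem["MemFree"] + mem["Buffers"] + mem["Cached"]
print(round(estimate / mem["MemAvailable"], 2))  # -> 0.98
```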

Your OSDs all look to be well below the target, I wouldn’t think you have any 
problems.  In fact 256GB for just 10 OSDs is an embarassment of riches.  What 
type of drives are you using, and what’s the cluster used for?  If anything I 
might advise *raising* the target.

You might check tcmalloc usage

https://ceph-devel.vger.kernel.narkive.com/tYp0KkIT/ceph-daemon-memory-utilization-heap-release-drops-use-by-50

but I doubt this is an issue for you.

> What's taking that much buffer?
> # free -h
>  totalusedfree  shared  buff/cache   available
> Mem:  251Gi31Gi   1.8Gi   1.6Gi   217Gi   
> 215Gi
>
> # cat /proc/meminfo
> MemTotal:   263454780 kB
> MemFree: 2212484 kB
> MemAvailable:   226842848 kB
> Buffers:219061308 kB
> Cached:  2066532 kB
> SwapCached:  928 kB
> Active: 142272648 kB
> Inactive:   109641772 kB
> ..
>
>
> Thanks!
> Tony
> 
> From: Tony Liu 
> Sent: March 27, 2021 01:25 PM
> To: ceph-users
> Subject: [ceph-users] memory consumption by osd
>
> Hi,
>
> Here is a snippet from top on a node with 10 OSDs.
> ===
> MiB Mem : 257280.1 total,   2070.1 free,  31881.7 used, 223328.3 buff/cache
> MiB Swap: 128000.0 total, 126754.7 free,   1245.3 used. 221608.0 avail Mem
>
>PID USER  PR  NIVIRTRESSHR S  %CPU  %MEM TIME+ COMMAND
>  30492 167   20   0 4483384   2.9g  16696 S   6.0   1.2 707:05.25 ceph-osd
>  35396 167   20   0 952   2.8g  16468 S   5.0   1.1 815:58.52 ceph-osd
>  33488 167   20   0 4161872   2.8g  16580 S   4.7   1.1 496:07.94 ceph-osd
>  36371 167   20   0 4387792   3.0g  16748 S   4.3   1.2 762:37.64 ceph-osd
>  39185 167   20   0 5108244   3.1g  16576 S   4.0   1.2 998:06.73 ceph-osd
>  38729 167   20   0 4748292   2.8g  16580 S   3.3   1.1 895:03.67 ceph-osd
>  34439 167   20   0 4492312   2.8g  16796 S   2.0   1.1 921:55.50 ceph-osd
>  31473 167   20   0 4314500   2.9g  16684 S   1.3   1.2 680:48.09 ceph-osd
>  32495 167   20   0 4294196   2.8g  16552 S   1.0   1.1 545:14.53 ceph-osd
>  37230 167   20   0 4586020   2.7g  16620 S   1.0   1.1 844:12.23 ceph-osd
> ===
> Does it look OK with 2GB free?
> I can't tell how that 220GB is used for buffer/cache.
> Is that used by OSDs? Is it controlled by configuration or auto scaling based
> on physical memory? Any clarifications would be helpful.
>
>
> Thanks!
> Tony
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: memory consumption by osd

2021-03-27 Thread Tony Liu
Restarting OSD frees buff/cache memory.
What kind of data is there?
Is there any configuration to control this memory allocation?

Thanks!
Tony

From: Tony Liu 
Sent: March 27, 2021 06:10 PM
To: ceph-users
Subject: [ceph-users] Re: memory consumption by osd

To clarify: to avoid the PG log taking too much memory, I already reduced
osd_max_pg_log_entries from the default 10000 to 1000.
I checked the PG log sizes. They are all under 1100.
ceph pg dump -f json | jq '.pg_map.pg_stats[]' | grep ondisk_log_size

I also checked each OSD. The total is only a few hundred MB.
ceph daemon osd.<id> dump_mempools

And osd_memory_target stays default 4GB.
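For anyone repeating this check, the jq one-liner can also be done in a few lines of Python. The sample below is a trimmed, hypothetical stand-in for `ceph pg dump -f json` output, assuming the pg_map.pg_stats[].ondisk_log_size layout used here:

```python
import json

# Hypothetical, trimmed stand-in for `ceph pg dump -f json` output.
sample = '''{"pg_map": {"pg_stats": [
  {"pgid": "4.0", "ondisk_log_size": 1032},
  {"pgid": "4.1", "ondisk_log_size":  987},
  {"pgid": "4.2", "ondisk_log_size": 1055}
]}}'''
logs = [pg["ondisk_log_size"]
        for pg in json.loads(sample)["pg_map"]["pg_stats"]]
print(max(logs), sum(logs) // len(logs))  # -> 1055 1024
```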

What's taking that much buffer?
# free -h
  totalusedfree  shared  buff/cache   available
Mem:  251Gi31Gi   1.8Gi   1.6Gi   217Gi   215Gi

# cat /proc/meminfo
MemTotal:   263454780 kB
MemFree: 2212484 kB
MemAvailable:   226842848 kB
Buffers:219061308 kB
Cached:  2066532 kB
SwapCached:  928 kB
Active: 142272648 kB
Inactive:   109641772 kB
..


Thanks!
Tony

From: Tony Liu 
Sent: March 27, 2021 01:25 PM
To: ceph-users
Subject: [ceph-users] memory consumption by osd

Hi,

Here is a snippet from top on a node with 10 OSDs.
===
MiB Mem : 257280.1 total,   2070.1 free,  31881.7 used, 223328.3 buff/cache
MiB Swap: 128000.0 total, 126754.7 free,   1245.3 used. 221608.0 avail Mem

PID USER  PR  NIVIRTRESSHR S  %CPU  %MEM TIME+ COMMAND
  30492 167   20   0 4483384   2.9g  16696 S   6.0   1.2 707:05.25 ceph-osd
  35396 167   20   0 952   2.8g  16468 S   5.0   1.1 815:58.52 ceph-osd
  33488 167   20   0 4161872   2.8g  16580 S   4.7   1.1 496:07.94 ceph-osd
  36371 167   20   0 4387792   3.0g  16748 S   4.3   1.2 762:37.64 ceph-osd
  39185 167   20   0 5108244   3.1g  16576 S   4.0   1.2 998:06.73 ceph-osd
  38729 167   20   0 4748292   2.8g  16580 S   3.3   1.1 895:03.67 ceph-osd
  34439 167   20   0 4492312   2.8g  16796 S   2.0   1.1 921:55.50 ceph-osd
  31473 167   20   0 4314500   2.9g  16684 S   1.3   1.2 680:48.09 ceph-osd
  32495 167   20   0 4294196   2.8g  16552 S   1.0   1.1 545:14.53 ceph-osd
  37230 167   20   0 4586020   2.7g  16620 S   1.0   1.1 844:12.23 ceph-osd
===
Does it look OK with 2GB free?
I can't tell how that 220GB is used for buffer/cache.
Is that used by OSDs? Is it controlled by configuration or auto scaling based
on physical memory? Any clarifications would be helpful.


Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: memory consumption by osd

2021-03-27 Thread Tony Liu
To clarify: to avoid the PG log taking too much memory, I already reduced
osd_max_pg_log_entries from the default 10000 to 1000.
I checked the PG log sizes. They are all under 1100.
ceph pg dump -f json | jq '.pg_map.pg_stats[]' | grep ondisk_log_size

I also checked each OSD. The total is only a few hundred MB.
ceph daemon osd.<id> dump_mempools

And osd_memory_target stays default 4GB.

What's taking that much buffer?
# free -h
  totalusedfree  shared  buff/cache   available
Mem:  251Gi31Gi   1.8Gi   1.6Gi   217Gi   215Gi

# cat /proc/meminfo 
MemTotal:   263454780 kB
MemFree: 2212484 kB
MemAvailable:   226842848 kB
Buffers:219061308 kB
Cached:  2066532 kB
SwapCached:  928 kB
Active: 142272648 kB
Inactive:   109641772 kB
..


Thanks!
Tony

From: Tony Liu 
Sent: March 27, 2021 01:25 PM
To: ceph-users
Subject: [ceph-users] memory consumption by osd

Hi,

Here is a snippet from top on a node with 10 OSDs.
===
MiB Mem : 257280.1 total,   2070.1 free,  31881.7 used, 223328.3 buff/cache
MiB Swap: 128000.0 total, 126754.7 free,   1245.3 used. 221608.0 avail Mem

PID USER  PR  NIVIRTRESSHR S  %CPU  %MEM TIME+ COMMAND
  30492 167   20   0 4483384   2.9g  16696 S   6.0   1.2 707:05.25 ceph-osd
  35396 167   20   0 952   2.8g  16468 S   5.0   1.1 815:58.52 ceph-osd
  33488 167   20   0 4161872   2.8g  16580 S   4.7   1.1 496:07.94 ceph-osd
  36371 167   20   0 4387792   3.0g  16748 S   4.3   1.2 762:37.64 ceph-osd
  39185 167   20   0 5108244   3.1g  16576 S   4.0   1.2 998:06.73 ceph-osd
  38729 167   20   0 4748292   2.8g  16580 S   3.3   1.1 895:03.67 ceph-osd
  34439 167   20   0 4492312   2.8g  16796 S   2.0   1.1 921:55.50 ceph-osd
  31473 167   20   0 4314500   2.9g  16684 S   1.3   1.2 680:48.09 ceph-osd
  32495 167   20   0 4294196   2.8g  16552 S   1.0   1.1 545:14.53 ceph-osd
  37230 167   20   0 4586020   2.7g  16620 S   1.0   1.1 844:12.23 ceph-osd
===
Does it look OK with 2GB free?
I can't tell how that 220GB is used for buffer/cache.
Is that used by OSDs? Is it controlled by configuration or auto scaling based
on physical memory? Any clarifications would be helpful.


Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] memory consumption by osd

2021-03-27 Thread Tony Liu
Hi,

Here is a snippet from top on a node with 10 OSDs.
===
MiB Mem : 257280.1 total,   2070.1 free,  31881.7 used, 223328.3 buff/cache
MiB Swap: 128000.0 total, 126754.7 free,   1245.3 used. 221608.0 avail Mem 

PID USER  PR  NIVIRTRESSHR S  %CPU  %MEM TIME+ COMMAND 
  30492 167   20   0 4483384   2.9g  16696 S   6.0   1.2 707:05.25 ceph-osd
  35396 167   20   0 952   2.8g  16468 S   5.0   1.1 815:58.52 ceph-osd
  33488 167   20   0 4161872   2.8g  16580 S   4.7   1.1 496:07.94 ceph-osd
  36371 167   20   0 4387792   3.0g  16748 S   4.3   1.2 762:37.64 ceph-osd
  39185 167   20   0 5108244   3.1g  16576 S   4.0   1.2 998:06.73 ceph-osd
  38729 167   20   0 4748292   2.8g  16580 S   3.3   1.1 895:03.67 ceph-osd
  34439 167   20   0 4492312   2.8g  16796 S   2.0   1.1 921:55.50 ceph-osd
  31473 167   20   0 4314500   2.9g  16684 S   1.3   1.2 680:48.09 ceph-osd
  32495 167   20   0 4294196   2.8g  16552 S   1.0   1.1 545:14.53 ceph-osd
  37230 167   20   0 4586020   2.7g  16620 S   1.0   1.1 844:12.23 ceph-osd
===
Does it look OK with 2GB free?
I can't tell how that 220GB is used for buffer/cache.
Is that used by OSDs? Is it controlled by configuration or auto scaling based
on physical memory? Any clarifications would be helpful.


Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Do I need to update ceph.conf and restart each OSD after adding more MONs?

2021-03-27 Thread Tony Liu
Just realized that all config files (/var/lib/ceph///config)
on all nodes are already updated properly. It must be handled as part of adding
MONs. But "ceph config show" shows only a single host.

mon_host   [v2:10.250.50.80:3300/0,v1:10.250.50.80:6789/0]  
file

That means I still need to restart all services to apply the update, right?
Is this supposed to be part of adding MONs as well, or an additional manual step?


Thanks!
Tony
____
From: Tony Liu 
Sent: March 27, 2021 12:53 PM
To: Stefan Kooman; ceph-users@ceph.io
Subject: [ceph-users] Re: Do I need to update ceph.conf and restart each OSD 
after adding more MONs?

# ceph config set osd.0 mon_host 
[v2:10.250.50.80:3300/0,v1:10.250.50.80:6789/0,v2:10.250.50.81:3300/0,v1:10.250.50.81:6789/0,v2:10.250.50.82:3300/0,v1:10.250.50.82:6789/0]
Error EINVAL: mon_host is special and cannot be stored by the mon

It seems that the only option is to update ceph.conf and restart service.


Tony
____
From: Tony Liu 
Sent: March 27, 2021 12:20 PM
To: Stefan Kooman; ceph-users@ceph.io
Subject: [ceph-users] Re: Do I need to update ceph.conf and restart each OSD 
after adding more MONs?

I expanded MON from 1 to 3 by updating orch service "ceph orch apply".
"mon_host" in all services (MON, MGR, OSDs) is not updated. It's still single
host from source "file".
What's the guidance here to update "mon_host" for all services? I am talking
about Ceph services, not client side.
Should I update ceph.conf for all services and restart all of them?
Or I can update it on-the-fly by "ceph config set"?
In the latter case, where the updated configuration is stored? Is it going to
be overridden by ceph.conf when restart service?


Thanks!
Tony


From: Stefan Kooman 
Sent: March 26, 2021 12:22 PM
To: Tony Liu; ceph-users@ceph.io
Subject: Re: [ceph-users] Do I need to update ceph.conf and restart each OSD 
after adding more MONs?

On 3/26/21 6:06 PM, Tony Liu wrote:
> Hi,
>
> Do I need to update ceph.conf and restart each OSD after adding more MONs?

This should not be necessary, as the OSDs should learn about these
changes through monmaps. Updating the ceph.conf after the mons have been
updated is advised.

> This is with 15.2.8 deployed by cephadm.
>
> When adding MON, "mon_host" should be updated accordingly.
> Given [1], is that update "the monitor cluster’s centralized configuration
> database" or "runtime overrides set by an administrator"?

No need to put that in the centralized config database. I *think* they
mean ceph.conf file on the clients and hosts. At least, that's what you
would normally do (if not using DNS).

Gr. Stefan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Do I need to update ceph.conf and restart each OSD after adding more MONs?

2021-03-27 Thread Tony Liu
# ceph config set osd.0 mon_host 
[v2:10.250.50.80:3300/0,v1:10.250.50.80:6789/0,v2:10.250.50.81:3300/0,v1:10.250.50.81:6789/0,v2:10.250.50.82:3300/0,v1:10.250.50.82:6789/0]
Error EINVAL: mon_host is special and cannot be stored by the mon

It seems that the only option is to update ceph.conf and restart service.


Tony

From: Tony Liu 
Sent: March 27, 2021 12:20 PM
To: Stefan Kooman; ceph-users@ceph.io
Subject: [ceph-users] Re: Do I need to update ceph.conf and restart each OSD 
after adding more MONs?

I expanded MON from 1 to 3 by updating orch service "ceph orch apply".
"mon_host" in all services (MON, MGR, OSDs) is not updated. It's still single
host from source "file".
What's the guidance here to update "mon_host" for all services? I am talking
about Ceph services, not client side.
Should I update ceph.conf for all services and restart all of them?
Or I can update it on-the-fly by "ceph config set"?
In the latter case, where the updated configuration is stored? Is it going to
be overridden by ceph.conf when restart service?


Thanks!
Tony


From: Stefan Kooman 
Sent: March 26, 2021 12:22 PM
To: Tony Liu; ceph-users@ceph.io
Subject: Re: [ceph-users] Do I need to update ceph.conf and restart each OSD 
after adding more MONs?

On 3/26/21 6:06 PM, Tony Liu wrote:
> Hi,
>
> Do I need to update ceph.conf and restart each OSD after adding more MONs?

This should not be necessary, as the OSDs should learn about these
changes through monmaps. Updating the ceph.conf after the mons have been
updated is advised.

> This is with 15.2.8 deployed by cephadm.
>
> When adding MON, "mon_host" should be updated accordingly.
> Given [1], is that update "the monitor cluster’s centralized configuration
> database" or "runtime overrides set by an administrator"?

No need to put that in the centralized config database. I *think* they
mean ceph.conf file on the clients and hosts. At least, that's what you
would normally do (if not using DNS).

Gr. Stefan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Do I need to update ceph.conf and restart each OSD after adding more MONs?

2021-03-27 Thread Tony Liu
I expanded MON from 1 to 3 by updating orch service "ceph orch apply".
"mon_host" in all services (MON, MGR, OSDs) is not updated. It's still single
host from source "file".
What's the guidance here to update "mon_host" for all services? I am talking
about Ceph services, not client side.
Should I update ceph.conf for all services and restart all of them?
Or I can update it on-the-fly by "ceph config set"?
In the latter case, where the updated configuration is stored? Is it going to
be overridden by ceph.conf when restart service?


Thanks!
Tony


From: Stefan Kooman 
Sent: March 26, 2021 12:22 PM
To: Tony Liu; ceph-users@ceph.io
Subject: Re: [ceph-users] Do I need to update ceph.conf and restart each OSD 
after adding more MONs?

On 3/26/21 6:06 PM, Tony Liu wrote:
> Hi,
>
> Do I need to update ceph.conf and restart each OSD after adding more MONs?

This should not be necessary, as the OSDs should learn about these
changes through monmaps. Updating the ceph.conf after the mons have been
updated is advised.

> This is with 15.2.8 deployed by cephadm.
>
> When adding MON, "mon_host" should be updated accordingly.
> Given [1], is that update "the monitor cluster’s centralized configuration
> database" or "runtime overrides set by an administrator"?

No need to put that in the centralized config database. I *think* they
mean ceph.conf file on the clients and hosts. At least, that's what you
would normally do (if not using DNS).

Gr. Stefan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Do I need to update ceph.conf and restart each OSD after adding more MONs?

2021-03-26 Thread Tony Liu
Hi,

Do I need to update ceph.conf and restart each OSD after adding more MONs?
This is with 15.2.8 deployed by cephadm.

When adding MON, "mon_host" should be updated accordingly.
Given [1], is that update "the monitor cluster’s centralized configuration
database" or "runtime overrides set by an administrator"?

[1] 
https://docs.ceph.com/en/latest/rados/configuration/ceph-conf/#config-sources

Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph octopus mysterious OSD crash

2021-03-19 Thread Tony Liu
Are you sure the OSD is with DB/WAL on SSD?

Tony

From: Philip Brown 
Sent: March 19, 2021 02:49 PM
To: Eugen Block
Cc: ceph-users
Subject: [ceph-users] Re: [BULK]  Re: Re: ceph octopus mysterious OSD crash

Wow.
My expectations have been adjusted. Thank you for detailing your experience, so 
I had motivation to try again.

Explicit steps I took:


1. went into "cephadm shell"  and did a vgremove on the HDD
2. ceph-volume zap /dev/(hdd)
3. lvremove (the matching old lv). This meant that the VG on the SSD had 25% 
space available.

  At this point, "ceph-volume inventory" shows the HDD as "available=True", but 
the shared SSD as false.

4. on my actual admin node, "ceph orch apply osd -i osd.deployspec.yml"

and after a few minutes...

it DID actually pick up the disk and make the OSD.

(I had previously "ceph osd rm"'d the id, so it used the prior ID)



SO... there's still the concern about why the thing mysteriously crashed in the 
first place :-/
(on TWO OSDs!)

But at least I know how to rebuild a single disk.



- Original Message -
From: "Eugen Block" 
To: "Stefan Kooman" 
Cc: "ceph-users" , "Philip Brown" 
Sent: Friday, March 19, 2021 2:19:55 PM
Subject: [BULK]  Re: [ceph-users] Re: ceph octopus mysterious OSD crash

I am quite sure that this case is covered by cephadm already. A few
months ago I tested it after a major rework of ceph-volume. I don’t
have any links right now. But I had a lab environment with multiple
OSDs per node with rocksDB on SSD and after wiping both HDD and DB LV
cephadm automatically redeployed the OSD according to my drive group
file.


Zitat von Stefan Kooman :

> On 3/19/21 7:47 PM, Philip Brown wrote:
>
> I see.
>
>>
>> I dont think it works when 7/8 devices are already configured, and
>> the SSD is already mostly sliced.
>
> OK. If it is a test cluster you might just blow it all away. By
> doing this you are simulating a "SSD" failure taking down all HDDs
> with it. It sure isn't pretty. I would say the situation you ended
> up with is not a corner case by any means. I am afraid I would
> really need to set up a test cluster with cephadm to help you
> further at this point, besides the suggestion above.
>
> Gr. Stefan
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph orch daemon add , separate db

2021-03-19 Thread Tony Liu
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/EC45YMDJZD3T6TQINGM222H2H4RZABJ4/



From: Philip Brown 
Sent: March 19, 2021 08:59 AM
To: ceph-users
Subject: [ceph-users] ceph orch daemon add , separate db

I was having difficulty doing this myself, and I came across this semi-recent 
thread:

https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/T4R76XJN2NE442GQJ5P2KRJN6HXPMYKL/

" I've tried adding OSDs with ceph orch daemon add ... but it's pretty limited. 
...you can't [have] a separate db device. "



Has this been fixed yet?
Is it GOING to be fixed?



--
Philip Brown| Sr. Linux System Administrator | Medata, Inc.
5 Peters Canyon Rd Suite 250
Irvine CA 92606
Office 714.918.1310| Fax 714.918.1325
pbr...@medata.com| www.medata.com
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Networking Idea/Question

2021-03-16 Thread Tony Liu
"but you may see significant performance improvement with a
second "cluster" network in a large cluster."

"does not usually have a significant impact on overall performance."

The above two statements seem to conflict with each other, which is confusing.

What's the purpose of the "cluster" network: simply increasing total
bandwidth, or some kind of isolation?

For example, 
1 network on 1 bond with 2 x 40Gb ports
vs.
2 networks on 2 bonds, each with 2 x 20Gb ports

They have the same total bandwidth of 80Gb/s, so they should support
the same performance, right?


Thanks!
Tony
> -Original Message-
> From: Andrew Walker-Brown 
> Sent: Tuesday, March 16, 2021 9:18 AM
> To: Tony Liu ; Stefan Kooman ;
> Dave Hall ; ceph-users 
> Subject: RE: [ceph-users] Re: Networking Idea/Question
> 
> https://docs.ceph.com/en/latest/rados/configuration/network-config-ref/
> 
> 
> 
> Sent from Mail <https://go.microsoft.com/fwlink/?LinkId=550986>  for
> Windows 10
> 
> 
> 
> From: Tony Liu <mailto:tonyliu0...@hotmail.com>
> Sent: 16 March 2021 16:16
> To: Stefan Kooman <mailto:ste...@bit.nl> ; Dave Hall
> <mailto:kdh...@binghamton.edu> ; ceph-users <mailto:ceph-users@ceph.io>
> Subject: [ceph-users] Re: Networking Idea/Question
> 
> 
> 
> > -Original Message-
> > From: Stefan Kooman 
> > Sent: Tuesday, March 16, 2021 4:10 AM
> > To: Dave Hall ; ceph-users 
> > Subject: [ceph-users] Re: Networking Idea/Question
> >
> > On 3/15/21 5:34 PM, Dave Hall wrote:
> > > Hello,
> > >
> > > If anybody out there has tried this or thought about it, I'd like to
> > > know...
> > >
> > > I've been thinking about ways to squeeze as much performance as
> > > possible from the NICs  on a Ceph OSD node.  The nodes in our
> cluster
> > > (6 x OSD, 3 x MGR/MON/MDS/RGW) currently have 2 x 10GB ports.
> > > Currently, one port is assigned to the front-side network, and one
> to
> > > the back-side network.  However, there are times when the traffic on
> > > one side or the other is more intense and might benefit from a bit
> > more bandwidth.
> >
> > What is (are) the reason(s) to choose a separate cluster and public
> > network?
> 
> That used to be the recommendation to separate client traffic and
> cluster traffic. I heard it's not true any more as the latest.
> It would be good if someone can point to the right link of such
> recommendation.
> 
> 
> Thanks!
> Tony
> 
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> 
> 
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Networking Idea/Question

2021-03-16 Thread Tony Liu
> -Original Message-
> From: Stefan Kooman 
> Sent: Tuesday, March 16, 2021 4:10 AM
> To: Dave Hall ; ceph-users 
> Subject: [ceph-users] Re: Networking Idea/Question
> 
> On 3/15/21 5:34 PM, Dave Hall wrote:
> > Hello,
> >
> > If anybody out there has tried this or thought about it, I'd like to
> > know...
> >
> > I've been thinking about ways to squeeze as much performance as
> > possible from the NICs  on a Ceph OSD node.  The nodes in our cluster
> > (6 x OSD, 3 x MGR/MON/MDS/RGW) currently have 2 x 10GB ports.
> > Currently, one port is assigned to the front-side network, and one to
> > the back-side network.  However, there are times when the traffic on
> > one side or the other is more intense and might benefit from a bit
> more bandwidth.
> 
> What is (are) the reason(s) to choose a separate cluster and public
> network?

That used to be the recommendation: separate client traffic from
cluster traffic. I've heard that is no longer the recommendation in
the latest releases. It would be good if someone could point to a
link for that recommendation.


Thanks!
Tony

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph orch and mixed SSD/rotating disks

2021-02-18 Thread Tony Liu
It may help if you could share how you added those OSDs.
This guide works for me.
https://docs.ceph.com/en/latest/cephadm/drivegroups/
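Along the lines of that guide, a minimal sketch of a drive-group spec that puts data on the rotational drives and carves the SSDs into DB LVs; the service id and host pattern here are made-up placeholders, so adjust to your hosts:

```yaml
# Hypothetical example: data on HDDs, DB on SSDs.
service_type: osd
service_id: hdd-with-ssd-db
placement:
  host_pattern: 'ceph-osd-*'   # placeholder host pattern
spec:
  data_devices:
    rotational: 1              # HDDs hold the data
  db_devices:
    rotational: 0              # SSDs get split into DB LVs
```

Apply it with "ceph orch apply osd -i spec.yaml" (add --dry-run first to preview the layout).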

Tony

From: Philip Brown 
Sent: February 17, 2021 09:30 PM
To: ceph-users
Subject: [ceph-users] ceph orch and mixed SSD/rotating disks

I'm coming back to trying mixed SSD+spinning disks after maybe a year.

My vague recollection was that if you told Ceph to "go auto-configure all the 
disks", it would automatically carve up the SSDs into the appropriate number 
of LVM segments and use them as WAL devices for each HDD-based OSD on 
the system.

Was I wrong?
Because when I tried to bring up a brand new cluster (Octopus, cephadm 
bootstrapped), with multiple nodes and multiple disks per node...
it seemed to bring up the SSDs as just another set of OSDs.

It clearly recognized them as SSDs. The output of "ceph orch device ls" showed 
them as ssd vs hdd for the others.
It just...didn't use them as I expected.

?

Maybe I was thinking of ceph-ansible.
Is there not a nice way to do this with the new cephadm-based "ceph orch"?
I would rather not have to write JSON files or whatever by hand, when a 
computer should be perfectly capable of auto-generating this stuff itself.




--
Philip Brown| Sr. Linux System Administrator | Medata, Inc.
5 Peters Canyon Rd Suite 250
Irvine CA 92606
Office 714.918.1310| Fax 714.918.1325
pbr...@medata.com| www.medata.com
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] can't remove osd service by "ceph orch rm <service>"

2021-02-15 Thread Tony Liu
Hi,

This is with v15.2 and v15.2.8.
Once an OSD service is applied, it can't be removed.
It always shows up in "ceph orch ls".
"ceph orch rm <service>" only marks it "unmanaged",
but doesn't actually remove it.
Is this expected?


Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Is replacing OSD whose data is on HDD and DB is on SSD supported?

2021-02-15 Thread Tony Liu
To update and close this thread: what I am looking for is not supported yet.
"ceph-volume lvm batch" requires clean devices. It doesn't work to reuse an
existing DB LV or to create a new DB LV. I followed
https://tracker.ceph.com/issues/46691 and used "ceph-volume lvm prepare" to
make this work.
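For anyone hitting the same thing, a rough sketch of that manual path with "ceph-volume lvm prepare"; the OSD id, device path, DB LV name, and OSD fsid below are hypothetical placeholders, not values from my cluster:

```shell
# Hypothetical sketch: rebuild a replaced OSD while reusing its old DB LV.
# Prepare the OSD, reusing the destroyed OSD's id and the existing DB LV:
ceph-volume lvm prepare \
    --osd-id 1 \
    --data /dev/sdd \
    --block.db ceph-block-dbs-vg/osd-db-lv

# Then activate it (OSD id and OSD fsid as reported by "ceph-volume lvm list"):
ceph-volume lvm activate 1 11111111-2222-3333-4444-555555555555
```

Run this where ceph-volume has access to the devices (on the host, or inside the OSD container when using cephadm).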

Thanks!
Tony
________
From: Tony Liu 
Sent: February 14, 2021 02:01 PM
To: ceph-users@ceph.io; dev
Subject: [ceph-users] Is replacing OSD whose data is on HDD and DB is on SSD 
supported?

Hi,

I've been trying with v15.2 and v15.2.8, no luck.
Wondering if this is actually supported or ever worked for anyone?

Here is what I've done.
1) Create a cluster with 1 controller (mon and mgr) and 3 OSD nodes,
   each of which has 1 SSD for DB and 8 HDDs for data.
2) OSD service spec.
service_type: osd
service_id: osd-spec
placement:
 hosts:
 - ceph-osd-1
 - ceph-osd-2
 - ceph-osd-3
spec:
  block_db_size: 92341796864
  data_devices:
model: ST16000NM010G
  db_devices:
model: KPM5XRUG960G
3) Add OSD hosts and apply OSD service spec. 8 OSDs (data on HDD and
   DB on SSD) are created on each host properly.
4) Run "orch osd rm 1 --replace --force". OSD is marked "destroyed" and
   reweight is set to 0 in "osd tree". "pg dump" shows no PG on that OSD.
   "orch ps" shows no daemon running for that OSD.
5) Run "orch device zap <host> <device>". VG and LV for HDD are removed.
   LV for DB stays. "orch device ls" shows HDD device is available.
6) Cephadm finds OSD claims and applies OSD spec on the host.
   Here is the message.
   
   cephadm [INF] Found osd claims -> {'ceph-osd-1': ['1']}
   cephadm [INF] Found osd claims for drivegroup osd-spec -> {'ceph-osd-1': 
['1']}
   cephadm [INF] Applying osd-spec on host ceph-osd-1...
   cephadm [INF] Applying osd-spec on host ceph-osd-2...
   cephadm [INF] Applying osd-spec on host ceph-osd-3...
   cephadm [INF] ceph-osd-1: lvm batch --no-auto /dev/sdc /dev/sdd
 /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj
 --db-devices /dev/sdb --block-db-size 92341796864
 --osd-ids 1 --yes --no-systemd
   code: 0
   out: ['']
   err: ['/bin/docker:stderr --> passed data devices: 8 physical, 0 LVM',
   '/bin/docker:stderr --> relative data size: 1.0',
   '/bin/docker:stderr --> passed block_db devices: 1 physical, 0 LVM',
   '/bin/docker:stderr --> 1 fast devices were passed, but none are available']
   

Q1. Is DB LV on SSD supposed to be deleted or not, when replacing an OSD
whose data is on HDD and DB is on SSD?
Q2. If yes from Q1, is a new DB LV supposed to be created on SSD as long as
there is sufficient free space, when building the new OSD?
Q3. If no from Q1, since it's replacing, is the old DB LV going to be reused
for the new OSD?

Again, is this actually supposed to work? Am I missing anything, or am I
trying an unsupported feature?


Thanks!
Tony

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: reinstalling node with orchestrator/cephadm

2021-02-15 Thread Tony Liu
Never mind, the OSD daemon shows up in "orch ps" after a while.

Thanks!
Tony
____
From: Tony Liu 
Sent: February 14, 2021 09:47 PM
To: Kenneth Waegeman; ceph-users
Subject: [ceph-users] Re: reinstalling node with orchestrator/cephadm

I followed https://tracker.ceph.com/issues/46691 to bring up the OSD.
"ceph osd tree" shows it's up. "ceph pg dump" shows PGs are remapped.
How can I make cephadm aware of it (so it shows up in "ceph orch ps")?
"ceph status" complains about "1 stray daemon(s) not managed by cephadm".

Thanks!
Tony

From: Kenneth Waegeman 
Sent: February 12, 2021 05:14 AM
To: ceph-users
Subject: [ceph-users] Re: reinstalling node with orchestrator/cephadm


On 08/02/2021 16:52, Kenneth Waegeman wrote:
> Hi Eugen, all,
>
> Thanks for sharing your results! Since we have multiple clusters and
> clusters with +500 OSDs, this solution is not feasible for us.
>
> In the meantime I created an issue for this :
>
> https://tracker.ceph.com/issues/49159
Hi all,

For those who would have same/similar issues/questions, the ticket has
been updated.

it actually breaks down in two parts:
- ceph-volume documentation
(https://docs.ceph.com/en/latest/ceph-volume/lvm/activate/#activate
<https://docs.ceph.com/en/latest/ceph-volume/lvm/activate/#activate>)
notes that activate means:
'This activation process enables a systemd unit that persists the OSD ID
and its UUID (also called fsid in Ceph CLI tools), so that at boot time
it can understand what OSD is enabled and needs to be mounted.'
-> This is not true / does not work with cephadm: ceph-volume can't (yet)
create the OSD directories/files, like unit.run, for OSDs that should run
under cephadm

- there is not yet a (documented) way for existing OSD disks to be
discovered by cephadm/ceph orch when reinstalling a node, like there used
to be with running ceph-volume activate --all. The workaround I see for
now is running:

ceph-volume activate --all
for id in `ls -1 /var/lib/ceph/osd`; do cephadm adopt --style legacy --name ${id/ceph-/osd.}; done

This removes the ceph-volume units again and creates the cephadm ones :)
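As a side note, the "${id/ceph-/osd.}" in that loop is a bash parameter expansion that turns a legacy OSD data directory name into the daemon name cephadm expects; a small self-contained illustration (example id only):

```shell
# Demonstrates the bash parameter expansion used in the adoption loop:
# ${id/ceph-/osd.} replaces "ceph-" with "osd." in the directory name,
# so /var/lib/ceph/osd/ceph-12 maps to the daemon name osd.12.
id="ceph-12"
name="${id/ceph-/osd.}"
echo "$name"    # prints: osd.12
```

Note this expansion is bash-specific; it won't work under a plain POSIX sh.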

As pointed out by Sebastian Wagner: 'Please verify that the container
image used is consistent across the cluster after running the adoption
process.'

And thanks @Sebastian for making 'cephadm ceph-volume activate' a
feature request!


Kenneth

>
> We would need this especially to migrate/reinstall all our clusters to
> Rhel8 (without destroying/recreating all osd disks), so I really hope
> there is another solution :)
>
> Thanks again!
>
> Kenneth
>
> On 05/02/2021 16:11, Eugen Block wrote:
>> Hi Kenneth,
>>
>> I managed to succeed with this just now. It's a lab environment and
>> the OSDs are not encrypted but I was able to get the OSDs up again.
>> The ceph-volume commands also worked (just activation didn't) so I
>> had the required information about those OSDs.
>>
>> What I did was
>>
>> - collect the OSD data (fsid, keyring)
>> - create directories for osd daemons under
>> /var/lib/ceph//osd.
>> - note that the directory with the ceph uuid already existed since
>> the crash container had been created after bringing the node back
>> into the cluster
>> - creating the content for that OSD by copying the required files
>> from a different host and changed the contents of
>> - fsid
>> - keyring
>> - whoami
>> - unit.run
>> - unit.poststop
>>
>> - created the symlinks to the OSD devices:
>> - ln -s /dev/ceph-/osd-block- block
>> - ln -s /dev/ceph-/osd-block- block.db
>>
>> - changed ownership to ceph
>> - chown -R ceph.ceph /var/lib/ceph//osd./
>>
>> - started the systemd unit
>> - systemctl start ceph-@osd..service
>>
>> I repeated this for all OSDs on that host, now all OSDs are online
>> and the cluster is happy. I'm not sure what else is necessary in case
>> of encrypted OSDs, but maybe this procedure helps you.
>> I don't know if there's a smoother or even automated way, I don't
>> think there currently is. Maybe someone is working on it though.
>>
>> Regards,
>> Eugen
>>
>>
>> Zitat von Kenneth Waegeman :
>>
>>> Hi all,
>>>
>>> I'm running a 15.2.8 cluster using ceph orch with all daemons
>>> adopted to cephadm.
>>>
>>> I tried reinstall an OSD node. Is there a way to make ceph
>>> orch/cephadm activate the devices on this node again, ideally
>>> automatically?
>>>
>>> I tried running `cephadm ceph-volume -- lvm activate --all` but this
> >>

[ceph-users] Re: share haproxy config for radosgw [EXT]

2021-02-14 Thread Tony Liu
You can use BGP-ECMP to multiple HAProxy instances to support
active-active mode, instead of using keepalived for active-backup mode,
if the amount of traffic requires multiple HAProxy instances.

Tony
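Since this thread is about sharing haproxy configs for radosgw, here is a minimal sketch of what such a frontend/backend can look like, whichever way the instances are load-balanced. The addresses, ports, timeout and server names are hypothetical placeholders, not a tuned production config:

```
frontend rgw_front
    bind *:80
    mode http
    option httplog
    timeout client 30s                  # drop idle clients (hypothetical value)
    default_backend rgw_back

backend rgw_back
    mode http
    balance leastconn
    option httpchk HEAD /               # basic liveness check against radosgw
    server rgw1 10.0.0.11:7480 check    # 7480 is the default radosgw port
    server rgw2 10.0.0.12:7480 check
```

The health check is what masks an unresponsive radosgw process: haproxy stops routing to a backend whose check fails, which plain BGP-ECMP to the rgw daemons doesn't do by itself.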

From: Graham Allan 
Sent: February 14, 2021 01:31 PM
To: Matthew Vernon
Cc: ceph-users
Subject: [ceph-users] Re: share haproxy config for radosgw [EXT]

On Tue, Feb 9, 2021 at 11:00 AM Matthew Vernon  wrote:

> On 07/02/2021 22:19, Marc wrote:
> >
> > I was wondering if someone could post a config for haproxy. Is there
> something specific to configure? Like binding clients to a specific backend
> server, client timeouts, security specific to rgw etc.
>
> Ours is templated out by ceph-ansible; to try and condense out just the
> interesting bits:
>
> (snipped the config...)
>
> The aim is to use all available CPU on the RGWs at peak load, but to
> also try and prevent one user overwhelming the service for everyone else
> - hence the dropping of idle connections and soft (and then hard) limits
> on per-IP connections.
>

Can I ask a followup question to this: how many haproxy instances do you
then run - one on each of your gateways, with keepalived to manage which is
active?

I ask because, since before I was involved with our ceph object store, it
has been load-balanced between multiple rgw servers directly using
bgp-ecmp. It doesn't sound like this is common practice in the ceph
community, and I'm wondering what the pros and cons are.

The bgp-ecmp load balancing has the flaw that it's not truly fault
tolerant, at least without additional checks to shut down the local quagga
instance if rgw isn't responding - it's only fault tolerant in the case of
an entire server going down, which meets our original goals of rolling
maintenance/updates, but not a radosgw process going unresponsive. In
addition I think we have always seen some background level of clients being
sent "connection reset by peer" errors, which I have never tracked down
within radosgw; I wonder if these might be masked by an haproxy frontend?

The converse is that all client gateway traffic must generally pass through
a single haproxy instance, while bgp-ecmp distributes the connections
across all nodes. Perhaps haproxy is lightweight and efficient enough that
this makes little difference to performance?

Graham
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Is replacing OSD whose data is on HDD and DB is on SSD supported?

2021-02-14 Thread Tony Liu
Hi,

I've been trying with v15.2 and v15.2.8, no luck.
Wondering if this is actually supported or ever worked for anyone?

Here is what I've done.
1) Create a cluster with 1 controller (mon and mgr) and 3 OSD nodes,
   each of which has 1 SSD for DB and 8 HDDs for data.
2) OSD service spec.
service_type: osd
service_id: osd-spec
placement:
 hosts:
 - ceph-osd-1
 - ceph-osd-2
 - ceph-osd-3
spec:
  block_db_size: 92341796864
  data_devices:
model: ST16000NM010G
  db_devices:
model: KPM5XRUG960G
3) Add OSD hosts and apply OSD service spec. 8 OSDs (data on HDD and
   DB on SSD) are created on each host properly.
4) Run "orch osd rm 1 --replace --force". OSD is marked "destroyed" and
   reweight is set to 0 in "osd tree". "pg dump" shows no PG on that OSD.
   "orch ps" shows no daemon running for that OSD.
5) Run "orch device zap <host> <device>". VG and LV for HDD are removed.
   LV for DB stays. "orch device ls" shows HDD device is available.
6) Cephadm finds OSD claims and applies OSD spec on the host.
   Here is the message.
   
   cephadm [INF] Found osd claims -> {'ceph-osd-1': ['1']}
   cephadm [INF] Found osd claims for drivegroup osd-spec -> {'ceph-osd-1': 
['1']}
   cephadm [INF] Applying osd-spec on host ceph-osd-1...
   cephadm [INF] Applying osd-spec on host ceph-osd-2...
   cephadm [INF] Applying osd-spec on host ceph-osd-3...
   cephadm [INF] ceph-osd-1: lvm batch --no-auto /dev/sdc /dev/sdd
 /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj
 --db-devices /dev/sdb --block-db-size 92341796864
 --osd-ids 1 --yes --no-systemd
   code: 0
   out: ['']
   err: ['/bin/docker:stderr --> passed data devices: 8 physical, 0 LVM',
   '/bin/docker:stderr --> relative data size: 1.0',
   '/bin/docker:stderr --> passed block_db devices: 1 physical, 0 LVM',
   '/bin/docker:stderr --> 1 fast devices were passed, but none are available']
   

Q1. Is DB LV on SSD supposed to be deleted or not, when replacing an OSD
whose data is on HDD and DB is on SSD?
Q2. If yes from Q1, is a new DB LV supposed to be created on SSD as long as
there is sufficient free space, when building the new OSD?
Q3. If no from Q1, since it's replacing, is the old DB LV going to be reused
for the new OSD?

Again, is this actually supposed to work? Am I missing anything, or am I
trying an unsupported feature?


Thanks!
Tony

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: db_devices doesn't show up in exported osd service spec

2021-02-11 Thread Tony Liu
/dev/sdb is the SSD holding DB LVs for multiple HDDs.
What I expect is that, as long as there is sufficient space on the db_devices
specified in the service spec, an LV should be created.
Now, circling back to the original question: how does OSD replacement work?
I've been trying for a few weeks and hitting different issues, no luck.

Thanks!
Tony

From: Jens Hyllegaard (Soft Design A/S) 
Sent: February 10, 2021 11:54 PM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: db_devices doesn't show up in exported osd service 
spec

According to your "pvs" output you still have a VG on your sdb device. As long
as that is there, it will not be available to Ceph. I have had to do an
lvremove, like this:
lvremove ceph-78c78efb-af86-427c-8be1-886fa1d54f8a 
osd-db-72784b7a-b5c0-46e6-8566-74758c297adc

Do a lvs command to see the right parameters.

Regards

Jens

-Original Message-----
From: Tony Liu 
Sent: 10 February 2021 22:59
To: David Orman 
Cc: Jens Hyllegaard (Soft Design A/S) ; 
ceph-users@ceph.io
Subject: Re: [ceph-users] Re: db_devices doesn't show up in exported osd 
service spec

Hi David,

===
# pvs
  PV VG  Fmt  Attr 
PSizePFree
  /dev/sda3  vg0 lvm2 a-- 
1.09t  0
  /dev/sdb   ceph-block-dbs-f8d28f1f-2dd3-47d0-9110-959e88405112 lvm2 a--  
<447.13g 127.75g
  /dev/sdc   ceph-block-8f85121e-98bf-4466-aaf3-d888bcc938f6 lvm2 a-- 
2.18t  0
  /dev/sde   ceph-block-0b47f685-a60b-42fb-b679-931ef763b3c8 lvm2 a-- 
2.18t  0
  /dev/sdf   ceph-block-c526140d-c75f-4b0d-8c63-fbb2a8abfaa2 lvm2 a-- 
2.18t  0
  /dev/sdg   ceph-block-52b422f7-900a-45ff-a809-69fadabe12fa lvm2 a-- 
2.18t  0
  /dev/sdh   ceph-block-da269f0d-ae11-4178-bf1e-6441b8800336 lvm2 a-- 
2.18t  0
===
After "orch osd rm", which doesn't clean up the DB LV on the OSD node, I
manually cleaned it up by running "ceph-volume lvm zap --osd-id 12", which
does the cleanup.
Is "orch device ls" supposed to show the SSD device as available if there is
free space?
That could be another issue.

Thanks!
Tony

From: David Orman 
Sent: February 10, 2021 01:19 PM
To: Tony Liu
Cc: Jens Hyllegaard (Soft Design A/S); ceph-users@ceph.io
Subject: Re: [ceph-users] Re: db_devices doesn't show up in exported osd 
service spec

It's displaying sdb (what I assume you want to be used as a DB device) as 
unavailable. What's "pvs" output look like on that "ceph-osd-1" host? Perhaps 
it is full. I see the other email you sent regarding replacement; I suspect the 
pre-existing LV from your previous OSD is not re-used. You may need to delete 
it then the service specification should re-create it along with the OSD. If I 
remember correctly, I stopped the automatic application of the service spec 
(ceph orch rm osd.servicespec) when I had to replace a failed OSD, removed the 
OSD, nuked the LV on the db device in question, put in the new drive, then 
re-enabled the service-spec (ceph orch apply osd -i) and the OSD + DB/WAL were 
created appropriately. I don't remember the exact sequence, and it may depend 
on the ceph version. I'm also unsure if the "orch osd rm <id> --replace
[--force]" will allow preservation of the db/wal mapping; it might be worth
looking at in the future.

On Wed, Feb 10, 2021 at 2:22 PM Tony Liu  wrote:
Hi David,

Request info is below.

# ceph orch device ls ceph-osd-1
HOSTPATH  TYPE   SIZE  DEVICE_ID   MODEL
VENDOR   ROTATIONAL  AVAIL  REJECT REASONS
ceph-osd-1  /dev/sdd  hdd   2235G  SEAGATE_DL2400MM0159_WBM2VL2G   
DL2400MM0159 SEAGATE  1   True
ceph-osd-1  /dev/sda  hdd   1117G  SEAGATE_ST1200MM0099_WFK4NNDY   
ST1200MM0099 SEAGATE  1   False  LVM detected, Insufficient space 
(<5GB) on vgs, locked
ceph-osd-1  /dev/sdb  ssd447G  ATA_MZ7KH480HAHQ0D3_S5CNNA0N305738  
MZ7KH480HAHQ0D3  ATA  0   False  LVM detected, locked
ceph-osd-1  /dev/sdc  hdd   2235G  SEAGATE_DL2400MM0159_WBM2WNSE   
DL2400MM0159 SEAGATE  1   False  LVM detected, Insufficient space 
(<5GB) on vgs, locked
ceph-osd-1  /dev/sde  hdd   2235G  SEAGATE_DL2400MM0159_WBM2WP2S   
DL2400MM0159 SEAGATE  1   False  LVM detected, Insufficient space 
(<5GB) on vgs, locked
ceph-osd-1  /dev/sdf  hdd   2235G  SEAGATE_DL2400MM0159_WBM2VK99   
DL2400MM0159 SEAGATE  1   False  LVM detected, Insufficient space 
(<5GB) on vgs, locked
ceph-osd-1  /dev/sdg  hdd   2235G  SEAGATE_DL2400MM0159_WBM2VJBT   
DL2400MM0159 SEAGATE  1   False  LVM detected, Insufficient space 
(<5GB) on vgs, locked
ceph-osd-1  /dev/sdh  hdd   2235G  SEAGATE_DL2400MM0159_WBM2VM

[ceph-users] Re: db_devices doesn't show up in exported osd service spec

2021-02-10 Thread Tony Liu
Hi David,

===
# pvs
  PV VG  Fmt  Attr 
PSizePFree
  /dev/sda3  vg0 lvm2 a-- 
1.09t  0
  /dev/sdb   ceph-block-dbs-f8d28f1f-2dd3-47d0-9110-959e88405112 lvm2 a--  
<447.13g 127.75g
  /dev/sdc   ceph-block-8f85121e-98bf-4466-aaf3-d888bcc938f6 lvm2 a-- 
2.18t  0
  /dev/sde   ceph-block-0b47f685-a60b-42fb-b679-931ef763b3c8 lvm2 a-- 
2.18t  0
  /dev/sdf   ceph-block-c526140d-c75f-4b0d-8c63-fbb2a8abfaa2 lvm2 a-- 
2.18t  0
  /dev/sdg   ceph-block-52b422f7-900a-45ff-a809-69fadabe12fa lvm2 a-- 
2.18t  0
  /dev/sdh   ceph-block-da269f0d-ae11-4178-bf1e-6441b8800336 lvm2 a-- 
2.18t  0
===
After "orch osd rm", which doesn't clean up the DB LV on the OSD node, I
manually cleaned it up by running "ceph-volume lvm zap --osd-id 12", which
does the cleanup.
Is "orch device ls" supposed to show the SSD device as available if there is
free space?
That could be another issue.

Thanks!
Tony

From: David Orman 
Sent: February 10, 2021 01:19 PM
To: Tony Liu
Cc: Jens Hyllegaard (Soft Design A/S); ceph-users@ceph.io
Subject: Re: [ceph-users] Re: db_devices doesn't show up in exported osd 
service spec

It's displaying sdb (what I assume you want to be used as a DB device) as 
unavailable. What's "pvs" output look like on that "ceph-osd-1" host? Perhaps 
it is full. I see the other email you sent regarding replacement; I suspect the 
pre-existing LV from your previous OSD is not re-used. You may need to delete 
it then the service specification should re-create it along with the OSD. If I 
remember correctly, I stopped the automatic application of the service spec 
(ceph orch rm osd.servicespec) when I had to replace a failed OSD, removed the 
OSD, nuked the LV on the db device in question, put in the new drive, then 
re-enabled the service-spec (ceph orch apply osd -i) and the OSD + DB/WAL were 
created appropriately. I don't remember the exact sequence, and it may depend 
on the ceph version. I'm also unsure if the "orch osd rm <id> --replace
[--force]" will allow preservation of the db/wal mapping; it might be worth
looking at in the future.

On Wed, Feb 10, 2021 at 2:22 PM Tony Liu  wrote:
Hi David,

Request info is below.

# ceph orch device ls ceph-osd-1
HOSTPATH  TYPE   SIZE  DEVICE_ID   MODEL
VENDOR   ROTATIONAL  AVAIL  REJECT REASONS
ceph-osd-1  /dev/sdd  hdd   2235G  SEAGATE_DL2400MM0159_WBM2VL2G   
DL2400MM0159 SEAGATE  1   True
ceph-osd-1  /dev/sda  hdd   1117G  SEAGATE_ST1200MM0099_WFK4NNDY   
ST1200MM0099 SEAGATE  1   False  LVM detected, Insufficient space 
(<5GB) on vgs, locked
ceph-osd-1  /dev/sdb  ssd447G  ATA_MZ7KH480HAHQ0D3_S5CNNA0N305738  
MZ7KH480HAHQ0D3  ATA  0   False  LVM detected, locked
ceph-osd-1  /dev/sdc  hdd   2235G  SEAGATE_DL2400MM0159_WBM2WNSE   
DL2400MM0159 SEAGATE  1   False  LVM detected, Insufficient space 
(<5GB) on vgs, locked
ceph-osd-1  /dev/sde  hdd   2235G  SEAGATE_DL2400MM0159_WBM2WP2S   
DL2400MM0159 SEAGATE  1   False  LVM detected, Insufficient space 
(<5GB) on vgs, locked
ceph-osd-1  /dev/sdf  hdd   2235G  SEAGATE_DL2400MM0159_WBM2VK99   
DL2400MM0159 SEAGATE  1   False  LVM detected, Insufficient space 
(<5GB) on vgs, locked
ceph-osd-1  /dev/sdg  hdd   2235G  SEAGATE_DL2400MM0159_WBM2VJBT   
DL2400MM0159 SEAGATE  1   False  LVM detected, Insufficient space 
(<5GB) on vgs, locked
ceph-osd-1  /dev/sdh  hdd   2235G  SEAGATE_DL2400MM0159_WBM2VMFK   
DL2400MM0159 SEAGATE  1   False  LVM detected, Insufficient space 
(<5GB) on vgs, locked
# cat osd-spec.yaml
service_type: osd
service_id: osd-spec
placement:
 hosts:
 - ceph-osd-1
spec:
  objectstore: bluestore
  #block_db_size: 32212254720
  block_db_size: 64424509440
  data_devices:
#rotational: 1
paths:
- /dev/sdd
  db_devices:
#rotational: 0
size: ":1T"
#unmanaged: true

# ceph orch apply osd -i osd-spec.yaml --dry-run
+---------+----------+------------+----------+----+-----+
|SERVICE  |NAME      |HOST        |DATA      |DB  |WAL  |
+---------+----------+------------+----------+----+-----+
|osd      |osd-spec  |ceph-osd-1  |/dev/sdd  |-   |-    |
+---------+----------+------------+----------+----+-----+

Thanks!
Tony
________
From: David Orman
Sent: February 10, 2021 11:02 AM
To: Tony Liu
Cc: Jens Hyllegaard (Soft Design A/S); ceph-users@ceph.io
Subject: Re: [ceph-users] Re: db_devices doesn't show up in exported osd 

[ceph-users] Re: Device is not available after zap

2021-02-10 Thread Tony Liu
To update: the OSD had data on HDD and DB on SSD.
After "ceph orch osd rm 12 --replace --force", waiting until rebalancing
was done and the daemon was stopped, I ran
"ceph orch device zap ceph-osd-2 /dev/sdd" to zap the device.
It cleared the PV, VG and LV for the data device, but not the DB device.
The DB device issue is being discussed in another thread.
Eventually I restarted the active mgr, and then the device showed up as
available. Not sure what was stuck in the mgr.
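In case it helps anyone else: when zap leaves LVM state behind, a manual cleanup sketch looks like the following. The device name is a hypothetical placeholder and this is destructive, so double-check the device first:

```shell
# Hypothetical manual cleanup of leftover LVM state on a device.
DEV=/dev/sdX                                   # placeholder; verify carefully
VG=$(pvs --noheadings -o vg_name "$DEV" | tr -d ' ')
vgremove -y "$VG"                              # removes the VG and its LVs
pvremove -y "$DEV"                             # drops the PV label
wipefs -a "$DEV"                               # clears remaining signatures
```

After this the mgr may still need a refresh (or failover) before "orch device ls" shows the device as available.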

Thanks!
Tony

From: Marc 
Sent: February 10, 2021 12:21 PM
To: Philip Brown; Matt Wilder
Cc: ceph-users
Subject: [ceph-users] Re: Device is not available after zap

I had something similar a while ago; sorry, I can't remember how I solved it,
but it is not an LVM bug. I also posted it here. Too bad this is still not
fixed.

> -Original Message-
> Cc: ceph-users 
> Subject: [ceph-users] Re: Device is not available after zap
>
> ive always run it against the block dev
>
>
> - Original Message -
> From: "Matt Wilder" 
> To: "Philip Brown" 
> Cc: "ceph-users" 
> Sent: Wednesday, February 10, 2021 12:06:55 PM
> Subject: Re: [ceph-users] Re: Device is not available after zap
>
> Are you running zap on the lvm volume, or the underlying block device?
>
> If you are running it against the lvm volume, it sounds like you need to
> run it against the block device so it wipes the lvm volumes as well.
> (Disclaimer: I don't run Ceph in this configuration)
>
> On Wed, Feb 10, 2021 at 10:24 AM Philip Brown  wrote:
>
> > Sorry, not much to say other than a "me too".
> > i spent a week testing ceph configurations.. it should have only been
> 2
> > days. but a huge amount of my time was wasted because I needed to do a
> full
> > reboot on the hardware.
> >
> > on a related note: sometimes "zap" didnt fully clean things up. I had
> to
> > manually go in and clean up vgs. or pvs. or sometimes wipefs -a
> >
> > so, in theory, this could be a linux LVM bug.  but if I recall, i was
> > doing this with ceph octopus, and centos 7.9
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: db_devices doesn't show up in exported osd service spec

2021-02-10 Thread Tony Liu
Hi David,

Request info is below.

# ceph orch device ls ceph-osd-1
HOSTPATH  TYPE   SIZE  DEVICE_ID   MODEL
VENDOR   ROTATIONAL  AVAIL  REJECT REASONS
ceph-osd-1  /dev/sdd  hdd   2235G  SEAGATE_DL2400MM0159_WBM2VL2G   
DL2400MM0159 SEAGATE  1   True
ceph-osd-1  /dev/sda  hdd   1117G  SEAGATE_ST1200MM0099_WFK4NNDY   
ST1200MM0099 SEAGATE  1   False  LVM detected, Insufficient space 
(<5GB) on vgs, locked
ceph-osd-1  /dev/sdb  ssd447G  ATA_MZ7KH480HAHQ0D3_S5CNNA0N305738  
MZ7KH480HAHQ0D3  ATA  0   False  LVM detected, locked
ceph-osd-1  /dev/sdc  hdd   2235G  SEAGATE_DL2400MM0159_WBM2WNSE   
DL2400MM0159 SEAGATE  1   False  LVM detected, Insufficient space 
(<5GB) on vgs, locked
ceph-osd-1  /dev/sde  hdd   2235G  SEAGATE_DL2400MM0159_WBM2WP2S   
DL2400MM0159 SEAGATE  1   False  LVM detected, Insufficient space 
(<5GB) on vgs, locked
ceph-osd-1  /dev/sdf  hdd   2235G  SEAGATE_DL2400MM0159_WBM2VK99   
DL2400MM0159 SEAGATE  1   False  LVM detected, Insufficient space 
(<5GB) on vgs, locked
ceph-osd-1  /dev/sdg  hdd   2235G  SEAGATE_DL2400MM0159_WBM2VJBT   
DL2400MM0159 SEAGATE  1   False  LVM detected, Insufficient space 
(<5GB) on vgs, locked
ceph-osd-1  /dev/sdh  hdd   2235G  SEAGATE_DL2400MM0159_WBM2VMFK   
DL2400MM0159 SEAGATE  1   False  LVM detected, Insufficient space 
(<5GB) on vgs, locked
# cat osd-spec.yaml 
service_type: osd
service_id: osd-spec
placement:
 hosts:
 - ceph-osd-1
spec:
  objectstore: bluestore
  #block_db_size: 32212254720
  block_db_size: 64424509440
  data_devices:
#rotational: 1
paths:
- /dev/sdd
  db_devices:
#rotational: 0
size: ":1T"
#unmanaged: true

# ceph orch apply osd -i osd-spec.yaml --dry-run
+---------+----------+------------+----------+----+-----+
|SERVICE  |NAME      |HOST        |DATA      |DB  |WAL  |
+---------+----------+------------+----------+----+-----+
|osd      |osd-spec  |ceph-osd-1  |/dev/sdd  |-   |-    |
+---------+----------+------------+----------+----+-----+
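As an aside, the block_db_size values in these specs are raw byte counts; a quick shell-arithmetic sanity check of the sizes used in this thread:

```shell
# block_db_size values are bytes; 1 GiB = 2**30 = 1073741824 bytes.
echo $(( 64424509440 / 1073741824 ))   # prints: 60  (the value in the spec)
echo $(( 32212254720 / 1073741824 ))   # prints: 30  (the commented-out value)
echo $(( 92341796864 / 1073741824 ))   # prints: 86  (value from the earlier thread)
```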

Thanks!
Tony

From: David Orman 
Sent: February 10, 2021 11:02 AM
To: Tony Liu
Cc: Jens Hyllegaard (Soft Design A/S); ceph-users@ceph.io
Subject: Re: [ceph-users] Re: db_devices doesn't show up in exported osd 
service spec

What's "ceph orch device ls" look like, and please show us your specification 
that you've used.

Jens was correct; his example is how we worked around this problem, pending a
patch/new release.

On Wed, Feb 10, 2021 at 12:05 AM Tony Liu  wrote:
With db_devices.size, db_devices shows up in "orch ls --export",
but no DB device/LV is created for the OSD. Any clues?

Thanks!
Tony

From: Jens Hyllegaard (Soft Design A/S)
Sent: February 9, 2021 01:16 AM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: db_devices doesn't show up in exported osd service 
spec

Hi Tony.

I assume they used a size constraint instead of rotational. So if all your
SSDs are 1TB or less, and all HDDs are larger than that, you could use:

spec:
  objectstore: bluestore
  data_devices:
rotational: true
  filter_logic: AND
  db_devices:
size: ':1TB'

It was usable in my test environment, and seems to work.

Regards

Jens


-Original Message-
From: Tony Liu
Sent: 9 February 2021 02:09
To: David Orman
Cc: ceph-users@ceph.io
Subject: [ceph-users] Re: db_devices doesn't show up in exported osd service 
spec

Hi David,

Could you show me an example of OSD service spec YAML to workaround it by 
specifying size?

Thanks!
Tony

From: David Orman
Sent: February 8, 2021 04:06 PM
To: Tony Liu
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: db_devices doesn't show up in exported osd 
service spec

Adding ceph-users:

We ran into this same issue, and we used a size specification to work around
it for now.

Bug and patch:

https://tracker.ceph.com/issues/49014
https://github.com/ceph/ceph/pull/39083

Backport to Octopus:

https://github.com/ceph/ceph/pull/39171

On Sat, Feb 6, 2021 at 7:05 PM Tony Liu  wrote:
Add dev to comment.

With 15.2.8, when apply OSD service spec, db_devices is gone.
Here is the service spec file.
==
service_type: osd
service_id: osd-spec
placement:
  hosts:
  - ceph-osd-1
spec:
  objectstore: bluestore
  data_devices:
rota

[ceph-users] Re: db_devices doesn't show up in exported osd service spec

2021-02-09 Thread Tony Liu
With db_devices.size, db_devices shows up in "orch ls --export",
but no DB device/LV is created for the OSD. Any clues?

Thanks!
Tony

From: Jens Hyllegaard (Soft Design A/S) 
Sent: February 9, 2021 01:16 AM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: db_devices doesn't show up in exported osd service 
spec

Hi Tony.

I assume they used a size constraint instead of rotational. So if all your
SSDs are 1TB or less, and all HDDs are larger than that, you could use:

spec:
  objectstore: bluestore
  data_devices:
rotational: true
  filter_logic: AND
  db_devices:
size: ':1TB'

It was usable in my test environment, and seems to work.

Regards

Jens
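
The size filter above also accepts other range forms; a sketch of a fuller spec for reference (the service id, host pattern and thresholds here are illustrative, not from the thread):

```yaml
service_type: osd
service_id: osd-size-filter-example   # illustrative name
placement:
  host_pattern: '*'
spec:
  objectstore: bluestore
  filter_logic: AND
  data_devices:
    size: '2TB:'      # lower bound: devices of 2 TB and up
  db_devices:
    size: ':1TB'      # upper bound: devices up to 1 TB
```

A bounded range like size: '500GB:2TB' should also work, and "ceph orch apply osd -i spec.yaml --dry-run" shows which devices each filter matches before anything is created.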


-Original Message-----
From: Tony Liu 
Sent: 9. februar 2021 02:09
To: David Orman 
Cc: ceph-users@ceph.io
Subject: [ceph-users] Re: db_devices doesn't show up in exported osd service 
spec

Hi David,

Could you show me an example of OSD service spec YAML to workaround it by 
specifying size?

Thanks!
Tony

From: David Orman 
Sent: February 8, 2021 04:06 PM
To: Tony Liu
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: db_devices doesn't show up in exported osd 
service spec

Adding ceph-users:

We ran into this same issue, and we used a size specification to workaround for 
now.

Bug and patch:

https://tracker.ceph.com/issues/49014
https://github.com/ceph/ceph/pull/39083

Backport to Octopus:

https://github.com/ceph/ceph/pull/39171

On Sat, Feb 6, 2021 at 7:05 PM Tony Liu  wrote:
Add dev to comment.

With 15.2.8, when applying the OSD service spec, db_devices is gone.
Here is the service spec file.
==
service_type: osd
service_id: osd-spec
placement:
  hosts:
  - ceph-osd-1
spec:
  objectstore: bluestore
  data_devices:
rotational: 1
  db_devices:
rotational: 0
==

Here is the logging from mon. The message with "Tony" is added by me in mgr to 
confirm. The audit from mon shows db_devices is gone.
Is there anything in mon to filter that out based on host info?
How can I trace it?
==
audit 2021-02-07T00:45:38.106171+ mgr.ceph-control-1.nxjnzz (mgr.24142551) 4020 : audit [DBG] from='client.24184218 -' entity='client.admin' cmd=[{"prefix": "orch apply osd", "target": ["mon-mgr", ""]}]: dispatch
cephadm 2021-02-07T00:45:38.108546+ mgr.ceph-control-1.nxjnzz (mgr.24142551) 4021 : cephadm [INF] Marking host: ceph-osd-1 for OSDSpec preview refresh.
cephadm 2021-02-07T00:45:38.108798+ mgr.ceph-control-1.nxjnzz (mgr.24142551) 4022 : cephadm [INF] Saving service osd.osd-spec spec with placement ceph-osd-1
cephadm 2021-02-07T00:45:38.108893+ mgr.ceph-control-1.nxjnzz (mgr.24142551) 4023 : cephadm [INF] Tony: spec: placement=PlacementSpec(hosts=[HostPlacementSpec(hostname='ceph-osd-1', network='', name='')]), service_id='osd-spec', service_type='osd', data_devices=DeviceSelection(rotational=1, all=False), db_devices=DeviceSelection(rotational=0, all=False), osd_id_claims={}, unmanaged=False, filter_logic='AND', preview_only=False)>
audit 2021-02-07T00:45:38.109782+ mon.ceph-control-3 (mon.2) 25 : audit [INF] from='mgr.24142551 10.6.50.30:0/2838166251' entity='mgr.ceph-control-1.nxjnzz' cmd=[{"prefix":"config-key set","key":"mgr/cephadm/spec.osd.osd-spec","val":"{\"created\": \"2021-02-07T00:45:38.108810\", \"spec\": {\"placement\": {\"hosts\": [\"ceph-osd-1\"]}, \"service_id\": \"osd-spec\", \"service_name\": \"osd.osd-spec\", \"service_type\": \"osd\", \"spec\": {\"data_devices\": {\"rotational\": 1}, \"filter_logic\": \"AND\", \"objectstore\": \"bluestore\"}}}"}]: dispatch
audit 2021-02-07T00:45:38.110133+ mon.ceph-control-1 (mon.0) 107 : audit [INF] from='mgr.24142551 ' entity='mgr.ceph-control-1.nxjnzz' cmd=[{"prefix":"config-key set","key":"mgr/cephadm/spec.osd.osd-spec","val":"{\"created\": \"2021-02-07T00:45:38.108810\", \"spec\": {\"placement\": {\"hosts\": [\"ceph-osd-1\"]}, \"service_id\": \"osd-spec\", \"service_name\": \"osd.osd-spec\", \"service_type\": \"osd\", \"spec\": {\"data_devices\": {\"rotational\": 1}, \"filter_logic\": \"AND\", \"objectstore\": \"bluestore\"}}}"}]: dispatch
audit 2021-02-07T00:45:38.152756+ mon.ceph-control-1 (mon.0) 108 : audit [INF] from='mgr.24142551 ' entity='mgr.ceph-control-1.nxjnzz' cmd='[{"prefix":"config-key set","key":"mgr/cephadm/spec.osd.osd-spec","val":"{\"created\": \"2021-02-07T00:45:38.108810\", \"spec\": {\"placement\": {\"hosts\": [\"ceph-osd-1\"]}, \"service_id\": \"osd-spec\", \"service_name\": \"osd.osd-spec\", \"service_type\": \"osd\", \"spec\": {\"data_devices\": {\"rotational\": 1}, \"filter_logic\": \"AND\", \"objectstore\": \"bluestore\"}}}"}]': finished
==

[ceph-users] How is DB handled when remove/replace and add OSD?

2021-02-09 Thread Tony Liu
Hi,

I'd like to know how the DB device is expected to be handled by "orch osd rm".
What I see is that the DB device on SSD is untouched when the OSD on HDD is
removed or replaced. "orch device zap" removes the PV, VG and LV of the data
device, but doesn't touch the DB LV on the SSD.

To remove an OSD permanently, do I need to manually clean up the DB LV on SSD?

To replace an OSD, is the old DB LV going to be reused for the new OSD,
or will a new DB LV be created?

I am asking this because, to replace an OSD, when the OSD was removed,
I manually removed the DB LV on SSD. Now, when I try to add the new OSD,
--dry-run doesn't show the DB device.
```
# cat osd-spec.yaml
service_type: osd
service_id: osd-spec
placement:
 hosts:
 - ceph-osd-1
spec:
  #objectstore: bluestore
  #block_db_size: 32212254720
  #block_db_size: 64424509440
  data_devices:
rotational: 1
  db_devices:
#rotational: 0
size: ":500GB"
#unmanaged: true

# ceph orch apply osd -i osd-spec.yaml --dry-run
+-+--++--++-+
|SERVICE  |NAME  |HOST|DATA  |DB  |WAL  |
+-+--++--++-+
|osd  |osd-spec  |ceph-osd-1  |/dev/sdd  |-   |-|
+-+--++--++-+
```
Any clues?

Thanks!
Tony
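
One possible manual cleanup for the orphaned DB LV, sketched on the assumption that the OSD ID is known (12 here is illustrative) and that the LV-to-OSD mapping has been double-checked first:

```shell
# List LVs tagged with this OSD ID to confirm which block.db LV belonged to it
ceph-volume lvm list 12

# Zap everything that belonged to osd.12, including the DB LV on the SSD;
# --destroy also removes the LV/VG/PV metadata so the space can be reused
ceph-volume lvm zap --osd-id 12 --destroy
```

This is a sketch, not a verified procedure; verify the output of the list command before destroying anything.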

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: db_devices doesn't show up in exported osd service spec

2021-02-08 Thread Tony Liu
Hi David,

Could you show me an example of OSD service spec YAML
to workaround it by specifying size?

Thanks!
Tony

From: David Orman 
Sent: February 8, 2021 04:06 PM
To: Tony Liu
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: db_devices doesn't show up in exported osd 
service spec

Adding ceph-users:

We ran into this same issue, and we used a size specification to workaround for 
now.

Bug and patch:

https://tracker.ceph.com/issues/49014
https://github.com/ceph/ceph/pull/39083

Backport to Octopus:

https://github.com/ceph/ceph/pull/39171

On Sat, Feb 6, 2021 at 7:05 PM Tony Liu  wrote:
Add dev to comment.

With 15.2.8, when applying the OSD service spec, db_devices is gone.
Here is the service spec file.
==
service_type: osd
service_id: osd-spec
placement:
  hosts:
  - ceph-osd-1
spec:
  objectstore: bluestore
  data_devices:
rotational: 1
  db_devices:
rotational: 0
==

Here is the logging from mon. The message with "Tony" is added by me
in mgr to confirm. The audit from mon shows db_devices is gone.
Is there anything in mon to filter that out based on host info?
How can I trace it?
==
audit 2021-02-07T00:45:38.106171+ mgr.ceph-control-1.nxjnzz (mgr.24142551) 
4020 : audit [DBG] from='client.24184218 -' entity='client.admin' 
cmd=[{"prefix": "orch apply osd", "target": ["mon-mgr", ""]}]: dispatch
cephadm 2021-02-07T00:45:38.108546+ mgr.ceph-control-1.nxjnzz 
(mgr.24142551) 4021 : cephadm [INF] Marking host: ceph-osd-1 for OSDSpec 
preview refresh.
cephadm 2021-02-07T00:45:38.108798+ mgr.ceph-control-1.nxjnzz 
(mgr.24142551) 4022 : cephadm [INF] Saving service osd.osd-spec spec with 
placement ceph-osd-1
cephadm 2021-02-07T00:45:38.108893+ mgr.ceph-control-1.nxjnzz 
(mgr.24142551) 4023 : cephadm [INF] Tony: spec: placement=PlacementSpec(hosts=[HostPlacementSpec(hostname='ceph-osd-1',
 network='', name='')]), service_id='osd-spec', service_type='osd', 
data_devices=DeviceSelection(rotational=1, all=False), 
db_devices=DeviceSelection(rotational=0, all=False), osd_id_claims={}, 
unmanaged=False, filter_logic='AND', preview_only=False)>
audit 2021-02-07T00:45:38.109782+ mon.ceph-control-3 (mon.2) 25 : audit 
[INF] from='mgr.24142551 
10.6.50.30:0/2838166251' 
entity='mgr.ceph-control-1.nxjnzz' cmd=[{"prefix":"config-key 
set","key":"mgr/cephadm/spec.osd.osd-spec","val":"{\"created\": 
\"2021-02-07T00:45:38.108810\", \"spec\": {\"placement\": {\"hosts\": 
[\"ceph-osd-1\"]}, \"service_id\": \"osd-spec\", \"service_name\": 
\"osd.osd-spec\", \"service_type\": \"osd\", \"spec\": {\"data_devices\": 
{\"rotational\": 1}, \"filter_logic\": \"AND\", \"objectstore\": 
\"bluestore\"}}}"}]: dispatch
audit 2021-02-07T00:45:38.110133+ mon.ceph-control-1 (mon.0) 107 : audit 
[INF] from='mgr.24142551 ' entity='mgr.ceph-control-1.nxjnzz' 
cmd=[{"prefix":"config-key 
set","key":"mgr/cephadm/spec.osd.osd-spec","val":"{\"created\": 
\"2021-02-07T00:45:38.108810\", \"spec\": {\"placement\": {\"hosts\": 
[\"ceph-osd-1\"]}, \"service_id\": \"osd-spec\", \"service_name\": 
\"osd.osd-spec\", \"service_type\": \"osd\", \"spec\": {\"data_devices\": 
{\"rotational\": 1}, \"filter_logic\": \"AND\", \"objectstore\": 
\"bluestore\"}}}"}]: dispatch
audit 2021-02-07T00:45:38.152756+ mon.ceph-control-1 (mon.0) 108 : audit 
[INF] from='mgr.24142551 ' entity='mgr.ceph-control-1.nxjnzz' 
cmd='[{"prefix":"config-key 
set","key":"mgr/cephadm/spec.osd.osd-spec","val":"{\"created\": 
\"2021-02-07T00:45:38.108810\", \"spec\": {\"placement\": {\"hosts\": 
[\"ceph-osd-1\"]}, \"service_id\": \"osd-spec\", \"service_name\": 
\"osd.osd-spec\", \"service_type\": \"osd\", \"spec\": {\"data_devices\": 
{\"rotational\": 1}, \"filter_logic\": \"AND\", \"objectstore\": 
\"bluestore\"}}}"}]': finished
==

Thanks!
Tony
> -Original Message-
> From: Jens Hyllegaard (Soft Design A/S) 
>

[ceph-users] Re: Device is not available after zap

2021-02-07 Thread Tony Liu
I built a new cluster from scratch, and everything works fine.

Could anyone help find out what is stuck here?
Another issue: devices don't show up after adding a host, which could
have the same cause.
Any details about the workflow would be helpful too, like how the mon
gets devices when a host is added: is it pushed by something (mgr?)
or pulled by the mon?


Thanks!
Tony
> -Original Message-
> From: Tony Liu 
> Sent: Sunday, February 7, 2021 5:32 PM
> To: ceph-users 
> Subject: [ceph-users] Re: Device is not available after zap
> 
> I checked pvscan, vgscan, lvscan and "ceph-volume lvm list" on the OSD
> node, that zapped device doesn't show anywhere.
> Anything missing?
> 
> Thanks!
> Tony
> ____
> From: Tony Liu 
> Sent: February 7, 2021 05:27 PM
> To: ceph-users
> Subject: [ceph-users] Device is not available after zap
> 
> Hi,
> 
> With v15.2.8, after zapping a device on an OSD node, it's still not available.
> The reason is "locked, LVM detected". If I reboot the whole OSD node,
> then the device will be available. There must be something not being
> cleaned up. Any clues?
> 
> Thanks!
> Tony
> ___
> ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an
> email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an
> email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Device is not available after zap

2021-02-07 Thread Tony Liu
I checked pvscan, vgscan, lvscan and "ceph-volume lvm list" on the OSD node;
the zapped device doesn't show up anywhere.
Am I missing anything?

Thanks!
Tony
____
From: Tony Liu 
Sent: February 7, 2021 05:27 PM
To: ceph-users
Subject: [ceph-users] Device is not available after zap

Hi,

With v15.2.8, after zapping a device on an OSD node, it's still not available.
The reason is "locked, LVM detected". If I reboot the whole OSD node,
then the device will be available. There must be something not being
cleaned up. Any clues?

Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Device is not available after zap

2021-02-07 Thread Tony Liu
Hi,

With v15.2.8, after zapping a device on an OSD node, it's still not available.
The reason is "locked, LVM detected". If I reboot the whole OSD node,
then the device will be available. There must be something not being
cleaned up. Any clues?

Thanks!
Tony
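
One thing that can leave a zapped disk "locked, LVM detected" until a reboot is a stale device-mapper mapping that outlives the removed LV. A hedged sketch of clearing it by hand (the mapping name is a placeholder to be taken from the dmsetup output, and /dev/sdd is illustrative):

```shell
# Show leftover ceph device-mapper mappings on this host
dmsetup ls | grep ceph

# Remove the stale mapping for the zapped disk (copy the exact name
# from the dmsetup ls output), then clear remaining LVM signatures
dmsetup remove <stale-ceph-mapping>
wipefs -a /dev/sdd
```

This is an assumption about the cause, not a confirmed diagnosis; if dmsetup shows no leftover ceph mappings, something else is holding the device.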
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: db_devices doesn't show up in exported osd service spec

2021-02-06 Thread Tony Liu
Add dev to comment.

With 15.2.8, when applying the OSD service spec, db_devices is gone.
Here is the service spec file.
==
service_type: osd
service_id: osd-spec
placement:
  hosts:
  - ceph-osd-1
spec:
  objectstore: bluestore
  data_devices:
rotational: 1
  db_devices:
rotational: 0
==

Here is the logging from mon. The message with "Tony" is added by me
in mgr to confirm. The audit from mon shows db_devices is gone.
Is there anything in mon to filter that out based on host info?
How can I trace it?
==
audit 2021-02-07T00:45:38.106171+ mgr.ceph-control-1.nxjnzz (mgr.24142551) 
4020 : audit [DBG] from='client.24184218 -' entity='client.admin' 
cmd=[{"prefix": "orch apply osd", "target": ["mon-mgr", ""]}]: dispatch
cephadm 2021-02-07T00:45:38.108546+ mgr.ceph-control-1.nxjnzz 
(mgr.24142551) 4021 : cephadm [INF] Marking host: ceph-osd-1 for OSDSpec 
preview refresh.
cephadm 2021-02-07T00:45:38.108798+ mgr.ceph-control-1.nxjnzz 
(mgr.24142551) 4022 : cephadm [INF] Saving service osd.osd-spec spec with 
placement ceph-osd-1
cephadm 2021-02-07T00:45:38.108893+ mgr.ceph-control-1.nxjnzz 
(mgr.24142551) 4023 : cephadm [INF] Tony: spec: placement=PlacementSpec(hosts=[HostPlacementSpec(hostname='ceph-osd-1',
 network='', name='')]), service_id='osd-spec', service_type='osd', 
data_devices=DeviceSelection(rotational=1, all=False), 
db_devices=DeviceSelection(rotational=0, all=False), osd_id_claims={}, 
unmanaged=False, filter_logic='AND', preview_only=False)>
audit 2021-02-07T00:45:38.109782+ mon.ceph-control-3 (mon.2) 25 : audit 
[INF] from='mgr.24142551 10.6.50.30:0/2838166251' 
entity='mgr.ceph-control-1.nxjnzz' cmd=[{"prefix":"config-key 
set","key":"mgr/cephadm/spec.osd.osd-spec","val":"{\"created\": 
\"2021-02-07T00:45:38.108810\", \"spec\": {\"placement\": {\"hosts\": 
[\"ceph-osd-1\"]}, \"service_id\": \"osd-spec\", \"service_name\": 
\"osd.osd-spec\", \"service_type\": \"osd\", \"spec\": {\"data_devices\": 
{\"rotational\": 1}, \"filter_logic\": \"AND\", \"objectstore\": 
\"bluestore\"}}}"}]: dispatch
audit 2021-02-07T00:45:38.110133+ mon.ceph-control-1 (mon.0) 107 : audit 
[INF] from='mgr.24142551 ' entity='mgr.ceph-control-1.nxjnzz' 
cmd=[{"prefix":"config-key 
set","key":"mgr/cephadm/spec.osd.osd-spec","val":"{\"created\": 
\"2021-02-07T00:45:38.108810\", \"spec\": {\"placement\": {\"hosts\": 
[\"ceph-osd-1\"]}, \"service_id\": \"osd-spec\", \"service_name\": 
\"osd.osd-spec\", \"service_type\": \"osd\", \"spec\": {\"data_devices\": 
{\"rotational\": 1}, \"filter_logic\": \"AND\", \"objectstore\": 
\"bluestore\"}}}"}]: dispatch
audit 2021-02-07T00:45:38.152756+ mon.ceph-control-1 (mon.0) 108 : audit 
[INF] from='mgr.24142551 ' entity='mgr.ceph-control-1.nxjnzz' 
cmd='[{"prefix":"config-key 
set","key":"mgr/cephadm/spec.osd.osd-spec","val":"{\"created\": 
\"2021-02-07T00:45:38.108810\", \"spec\": {\"placement\": {\"hosts\": 
[\"ceph-osd-1\"]}, \"service_id\": \"osd-spec\", \"service_name\": 
\"osd.osd-spec\", \"service_type\": \"osd\", \"spec\": {\"data_devices\": 
{\"rotational\": 1}, \"filter_logic\": \"AND\", \"objectstore\": 
\"bluestore\"}}}"}]': finished
==

Thanks!
Tony
> -Original Message-
> From: Jens Hyllegaard (Soft Design A/S) 
> Sent: Thursday, February 4, 2021 6:31 AM
> To: ceph-users@ceph.io
> Subject: [ceph-users] Re: db_devices doesn't show up in exported osd
> service spec
> 
> Hi.
> 
> I have the same situation. Running 15.2.8 I created a specification that
> looked just like it. With rotational in the data and non-rotational in
> the db.
> 
> First use applied fine. Afterwards it only uses the hdd, and not the ssd.
> Also, is there a way to remove an unused osd service?
> I managed to create osd.all-available-devices when I tried to stop the
> autocreation of OSDs, using: ceph orch apply osd --all-available-devices
> --unmanaged=true
> 
> I created the original OSD using the web interface.
> 
> Regards
> 
> Jens
>

[ceph-users] Re: replace OSD failed

2021-02-04 Thread Tony Liu
Here is the issue.
https://tracker.ceph.com/issues/47758


Thanks!
Tony
> -Original Message-
> From: Tony Liu 
> Sent: Thursday, February 4, 2021 8:46 PM
> To: ceph-users@ceph.io
> Subject: [ceph-users] Re: replace OSD failed
> 
> Here is the log from ceph-volume.
> ```
> [2021-02-05 04:03:17,000][ceph_volume.process][INFO  ] Running command:
> /usr/sbin/vgcreate --force --yes ceph-a3886f74-3de9-4e6e-a983-
> 8330eda0bd64 /dev/sdd
> [2021-02-05 04:03:17,134][ceph_volume.process][INFO  ] stdout Physical
> volume "/dev/sdd" successfully created.
> [2021-02-05 04:03:17,166][ceph_volume.process][INFO  ] stdout Volume
> group "ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64" successfully created
> [2021-02-05 04:03:17,189][ceph_volume.process][INFO  ] Running command:
> /usr/sbin/vgs --noheadings --readonly --units=b --nosuffix --
> separator=";" -S vg_name=ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64 -o
> vg_name,pv_count,lv_count,vg_attr,vg_extent_count,vg_free_count,vg_exten
> t_size
> [2021-02-05 04:03:17,229][ceph_volume.process][INFO  ] stdout ceph-
> a3886f74-3de9-4e6e-a983-8330eda0bd64";"1";"0";"wz--n-
> ";"572317";"572317";"4194304
> [2021-02-05 04:03:17,229][ceph_volume.api.lvm][DEBUG ] size was passed:
> 2.18 TB -> 572318
> [2021-02-05 04:03:17,235][ceph_volume.process][INFO  ] Running command:
> /usr/sbin/lvcreate --yes -l 572318 -n osd-block-b05c3c90-b7d5-4f13-8a58-
> f72761c1971b ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64
> [2021-02-05 04:03:17,244][ceph_volume.process][INFO  ] stderr Volume
> group "ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64" has insufficient free
> space (572317 extents): 572318 required.
> ```
> size was passed: 2.18 TB -> 572318
> How is this calculated?
> 
> 
> Thanks!
> Tony
> > -Original Message-
> > From: Tony Liu 
> > Sent: Thursday, February 4, 2021 8:34 PM
> > To: ceph-users@ceph.io
> > Subject: [ceph-users] replace OSD failed
> >
> > Hi,
> >
> > With 15.2.8, run "ceph orch rm osd 12 --replace --force", PGs on
> > osd.12 are remapped, osd.12 is removed from "ceph osd tree", the
> > daemon is removed from "ceph orch ps", the device is "available"
> > in "ceph orch device ls". Everything seems good at this point.
> >
> > Then dry-run service spec.
> > ```
> > # cat osd-spec.yaml
> > service_type: osd
> > service_id: osd-spec
> > placement:
> >   hosts:
> >   - ceph-osd-1
> > data_devices:
> >   rotational: 1
> > db_devices:
> >   rotational: 0
> >
> > # ceph orch apply osd -i osd-spec.yaml --dry-run
> > +-+--++--+--+-+
> > |SERVICE  |NAME  |HOST|DATA  |DB|WAL  |
> > +-+--++--+--+-+
> > |osd  |osd-spec  |ceph-osd-3  |/dev/sdd  |/dev/sdb  |-|
> > +-+--++--+--+-+
> > ```
> > It looks as expected.
> >
> > Then "ceph orch apply osd -i osd-spec.yaml".
> > Here is the log of cephadm.
> > ```
> > /bin/docker:stderr --> relative data size: 1.0 /bin/docker:stderr -->
> > passed block_db devices: 1 physical, 0 LVM /bin/docker:stderr Running
> > command: /usr/bin/ceph-authtool --gen-print-key /bin/docker:stderr
> > Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-
> > osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd tree -f
> > json /bin/docker:stderr Running command: /usr/bin/ceph --cluster ceph
> > --name client.bootstrap-osd --keyring
> > /var/lib/ceph/bootstrap-osd/ceph.keyring
> > -i - osd new b05c3c90-b7d5-4f13-8a58-f72761c1971b 12
> > /bin/docker:stderr Running command: /usr/sbin/vgcreate --force --yes
> > ceph-a3886f74-3de9-
> > 4e6e-a983-8330eda0bd64 /dev/sdd /bin/docker:stderr  stdout: Physical
> > volume "/dev/sdd" successfully created.
> > /bin/docker:stderr  stdout: Volume group
> > "ceph-a3886f74-3de9-4e6e-a983- 8330eda0bd64" successfully created
> /bin/docker:stderr Running command:
> > /usr/sbin/lvcreate --yes -l 572318 -n
> > osd-block-b05c3c90-b7d5-4f13-8a58-
> > f72761c1971b ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64
> > /bin/docker:stderr  stderr: Volume group
> > "ceph-a3886f74-3de9-4e6e-a983- 8330eda0bd64" has insufficient free
> > space (572317 extents): 572318 required.
> > /bin/docker:stderr --> Was unable to complete a new OSD, will rollback
> > changes ``` Q1, why VG na

[ceph-users] Re: replace OSD failed

2021-02-04 Thread Tony Liu
Here is the log from ceph-volume.
```
[2021-02-05 04:03:17,000][ceph_volume.process][INFO  ] Running command: 
/usr/sbin/vgcreate --force --yes ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64 
/dev/sdd
[2021-02-05 04:03:17,134][ceph_volume.process][INFO  ] stdout Physical volume 
"/dev/sdd" successfully created.
[2021-02-05 04:03:17,166][ceph_volume.process][INFO  ] stdout Volume group 
"ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64" successfully created
[2021-02-05 04:03:17,189][ceph_volume.process][INFO  ] Running command: 
/usr/sbin/vgs --noheadings --readonly --units=b --nosuffix --separator=";" -S 
vg_name=ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64 -o 
vg_name,pv_count,lv_count,vg_attr,vg_extent_count,vg_free_count,vg_extent_size
[2021-02-05 04:03:17,229][ceph_volume.process][INFO  ] stdout 
ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64";"1";"0";"wz--n-";"572317";"572317";"4194304
[2021-02-05 04:03:17,229][ceph_volume.api.lvm][DEBUG ] size was passed: 2.18 TB 
-> 572318
[2021-02-05 04:03:17,235][ceph_volume.process][INFO  ] Running command: 
/usr/sbin/lvcreate --yes -l 572318 -n 
osd-block-b05c3c90-b7d5-4f13-8a58-f72761c1971b 
ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64
[2021-02-05 04:03:17,244][ceph_volume.process][INFO  ] stderr Volume group 
"ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64" has insufficient free space (572317 
extents): 572318 required.
```
size was passed: 2.18 TB -> 572318
How is this calculated?


Thanks!
Tony
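
The off-by-one looks like a rounding artifact: ceph-volume sizes the new LV from the raw device size, which is slightly larger than the VG's free space because LVM reserves room for its own metadata, and rounding up to whole extents then asks for one extent more than the VG has. A rough reproduction of the arithmetic (the 1 MiB metadata reserve is an assumption for illustration, not taken from the logs):

```python
import math

EXTENT_SIZE = 4_194_304   # 4 MiB, from the vgs output above
free_extents = 572_317    # what the VG actually has free

# Hypothetical raw device size: the VG's usable space plus LVM's
# metadata reserve (1 MiB assumed here purely for illustration)
device_bytes = free_extents * EXTENT_SIZE + 1_048_576

# Rounding the device size up to whole extents overshoots by one,
# matching the "572318 required" vs "572317 extents" error
requested = math.ceil(device_bytes / EXTENT_SIZE)
print(requested)          # 572318
```

Sizing the LV from the VG's reported free extents instead of the device size would avoid the overshoot, which is what the fix in the linked PR effectively does.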
> -Original Message-
> From: Tony Liu 
> Sent: Thursday, February 4, 2021 8:34 PM
> To: ceph-users@ceph.io
> Subject: [ceph-users] replace OSD failed
> 
> Hi,
> 
> With 15.2.8, run "ceph orch rm osd 12 --replace --force", PGs on osd.12
> are remapped, osd.12 is removed from "ceph osd tree", the daemon is
> removed from "ceph orch ps", the device is "available"
> in "ceph orch device ls". Everything seems good at this point.
> 
> Then dry-run service spec.
> ```
> # cat osd-spec.yaml
> service_type: osd
> service_id: osd-spec
> placement:
>   hosts:
>   - ceph-osd-1
> data_devices:
>   rotational: 1
> db_devices:
>   rotational: 0
> 
> # ceph orch apply osd -i osd-spec.yaml --dry-run
> +-+--++--+--+-+
> |SERVICE  |NAME  |HOST|DATA  |DB|WAL  |
> +-+--++--+--+-+
> |osd  |osd-spec  |ceph-osd-3  |/dev/sdd  |/dev/sdb  |-|
> +-+--++--+--+-+
> ```
> It looks as expected.
> 
> Then "ceph orch apply osd -i osd-spec.yaml".
> Here is the log of cephadm.
> ```
> /bin/docker:stderr --> relative data size: 1.0 /bin/docker:stderr -->
> passed block_db devices: 1 physical, 0 LVM /bin/docker:stderr Running
> command: /usr/bin/ceph-authtool --gen-print-key /bin/docker:stderr
> Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-
> osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd tree -f json
> /bin/docker:stderr Running command: /usr/bin/ceph --cluster ceph --name
> client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
> -i - osd new b05c3c90-b7d5-4f13-8a58-f72761c1971b 12 /bin/docker:stderr
> Running command: /usr/sbin/vgcreate --force --yes ceph-a3886f74-3de9-
> 4e6e-a983-8330eda0bd64 /dev/sdd /bin/docker:stderr  stdout: Physical
> volume "/dev/sdd" successfully created.
> /bin/docker:stderr  stdout: Volume group "ceph-a3886f74-3de9-4e6e-a983-
> 8330eda0bd64" successfully created /bin/docker:stderr Running command:
> /usr/sbin/lvcreate --yes -l 572318 -n osd-block-b05c3c90-b7d5-4f13-8a58-
> f72761c1971b ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64
> /bin/docker:stderr  stderr: Volume group "ceph-a3886f74-3de9-4e6e-a983-
> 8330eda0bd64" has insufficient free space (572317 extents): 572318
> required.
> /bin/docker:stderr --> Was unable to complete a new OSD, will rollback
> changes ```
> Q1, why is the VG name (ceph-) different from others (ceph-block-)?
> Q2, where is that 572318 from? Since all HDDs are the same model, VG
> "Total PE" of all HDDs is 572317.
> Has anyone seen similar issues? Anything I am missing?
> 
> 
> Thanks!
> Tony
> ___
> ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an
> email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] replace OSD failed

2021-02-04 Thread Tony Liu
Hi,

With 15.2.8, run "ceph orch rm osd 12 --replace --force",
PGs on osd.12 are remapped, osd.12 is removed from "ceph osd tree",
the daemon is removed from "ceph orch ps", the device is "available"
in "ceph orch device ls". Everything seems good at this point.

Then dry-run service spec.
```
# cat osd-spec.yaml
service_type: osd
service_id: osd-spec
placement:
  hosts:
  - ceph-osd-1
data_devices:
  rotational: 1
db_devices:
  rotational: 0

# ceph orch apply osd -i osd-spec.yaml --dry-run
+-+--++--+--+-+
|SERVICE  |NAME  |HOST|DATA  |DB|WAL  |
+-+--++--+--+-+
|osd  |osd-spec  |ceph-osd-3  |/dev/sdd  |/dev/sdb  |-|
+-+--++--+--+-+
```
It looks as expected.

Then "ceph orch apply osd -i osd-spec.yaml".
Here is the log of cephadm.
```
/bin/docker:stderr --> relative data size: 1.0
/bin/docker:stderr --> passed block_db devices: 1 physical, 0 LVM
/bin/docker:stderr Running command: /usr/bin/ceph-authtool --gen-print-key
/bin/docker:stderr Running command: /usr/bin/ceph --cluster ceph --name 
client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd 
tree -f json
/bin/docker:stderr Running command: /usr/bin/ceph --cluster ceph --name 
client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - 
osd new b05c3c90-b7d5-4f13-8a58-f72761c1971b 12
/bin/docker:stderr Running command: /usr/sbin/vgcreate --force --yes 
ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64 /dev/sdd
/bin/docker:stderr  stdout: Physical volume "/dev/sdd" successfully created.
/bin/docker:stderr  stdout: Volume group 
"ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64" successfully created
/bin/docker:stderr Running command: /usr/sbin/lvcreate --yes -l 572318 -n 
osd-block-b05c3c90-b7d5-4f13-8a58-f72761c1971b 
ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64
/bin/docker:stderr  stderr: Volume group 
"ceph-a3886f74-3de9-4e6e-a983-8330eda0bd64" has insufficient free space (572317 
extents): 572318 required.
/bin/docker:stderr --> Was unable to complete a new OSD, will rollback changes
```
Q1: why is the VG name (ceph-) different from the others (ceph-block-)?
Q2: where does that 572318 come from? Since all HDDs are the same model,
the VG "Total PE" of all HDDs is 572317.
Has anyone seen similar issues? Am I missing anything?


Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

