[ceph-users] Re: cephadm automatic sizing of WAL/DB on SSD

2023-03-29 Thread Calhoun, Patrick
I think that the backported fix for this issue made it into ceph v16.2.11.

https://ceph.io/en/news/blog/2023/v16-2-11-pacific-released/


"ceph-volume: Pacific backports (pr#47413, Guillaume Abrioux, Zack Cerza, 
Arthur Outhenin-Chalandre)"

https://github.com/ceph/ceph/pull/47413/commits/4252cc44211f0ccebf388374744eaa26b32854d3

-Patrick

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephadm automatic sizing of WAL/DB on SSD

2022-12-09 Thread Adrien Georget

Hi,

We were also affected by this bug when we deployed a new Pacific cluster.
Is there any news about the release of this fix for Ceph Pacific? It looks 
done for the Quincy version but not yet for Pacific.


https://github.com/ceph/ceph/pull/47292

Regards,
Adrien

On 05/10/2022 at 13:21, Anh Phan Tuan wrote:

> It seems the 17.2.4 release has fixed this.
>
> ceph-volume: fix fast device alloc size on mulitple device (pr#47293,
> Arthur Outhenin-Chalandre)
>
> Bug #56031: batch compute a lower size than what it should be for blockdb
> with multiple fast device - ceph-volume - Ceph
> <https://tracker.ceph.com/issues/56031>
>
> Regards,
> Anh Phan



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephadm automatic sizing of WAL/DB on SSD

2022-10-05 Thread Anh Phan Tuan
It seems the 17.2.4 release has fixed this.

ceph-volume: fix fast device alloc size on mulitple device (pr#47293,
Arthur Outhenin-Chalandre)


Bug #56031: batch compute a lower size than what it should be for blockdb
with multiple fast device - ceph-volume - Ceph
<https://tracker.ceph.com/issues/56031>

Regards,
Anh Phan

On Fri, Sep 16, 2022 at 2:34 AM Christophe BAILLON  wrote:

> Hi
>
> The problem is still present in version 17.2.3,
> thanks for the trick to work around...
>
> Regards
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephadm automatic sizing of WAL/DB on SSD

2022-09-15 Thread Christophe BAILLON
Hi

The problem is still present in version 17.2.3; 
thanks for the trick to work around it...

Regards


-- 
Christophe BAILLON
Mobile :: +336 16 400 522
Work :: https://eyona.com
Twitter :: https://twitter.com/ctof
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephadm automatic sizing of WAL/DB on SSD

2022-08-11 Thread Anh Phan Tuan
Hi Patrick,

I also ran into this bug when deploying a new cluster around the time of the
16.2.7 release.

The bug is in how ceph-volume computes the per-OSD DB size from the DB disks.

Instead of: slot db size = size of db disk / number of slots per disk,
Ceph calculated: slot db size = size of one db disk / total number of slots
needed (i.e. the number of OSDs being prepared at that time).

In your case, with 2 DB disks, that makes the DB size only 50% of the correct
value.
In my case, with 4 DB disks per host, it made the DB size only 25% of the
correct value.

This bug happens even when you deploy with the batch command.
Back then, I worked around it by still using the batch command, but deploying
only the OSDs belonging to one DB disk at a time; in that case ceph-volume
calculated the correct value.
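
A minimal sketch of that arithmetic, assuming the layout from the original
post (two 1.44 TB DB SSDs per host and db_slots: 12, so 24 OSDs prepared in
one batch); this is illustrative, not ceph-volume's actual code:

# Sketch of the slot-size calculation. "Intended" divides one DB disk by the
# slots on that disk; the buggy variant divided one DB disk by the total
# number of OSDs prepared in the batch.

def slot_size_gb(db_disk_size_tb: float, slots: int) -> float:
    """Per-OSD DB slot size in GB for one DB disk split into `slots` slots."""
    return db_disk_size_tb * 1000 / slots

db_disk_tb = 1.44        # size of one DB SSD
db_disks_per_host = 2    # DB SSDs in the host
slots_per_disk = 12      # db_slots in the service spec
osds_in_batch = db_disks_per_host * slots_per_disk  # 24 OSDs prepared at once

intended = slot_size_gb(db_disk_tb, slots_per_disk)  # ~120 GB per OSD
buggy = slot_size_gb(db_disk_tb, osds_in_batch)      # ~60 GB per OSD, as observed

print(f"intended per-OSD DB size: {intended:.0f} GB")
print(f"buggy per-OSD DB size:    {buggy:.0f} GB ({buggy / intended:.0%} of intended)")

With four DB disks per host the same formula gives 25% of the intended size,
and preparing the OSDs of one DB disk at a time makes the two divisors equal,
which is why that workaround produces the correct value.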

Cheers,
Anh Phan

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephadm automatic sizing of WAL/DB on SSD

2022-07-29 Thread Calhoun, Patrick
Thanks, Arthur,

I think you are right about that bug looking very similar to what I've 
observed. I'll try to remember to update the list once the fix is merged and 
released and I get a chance to test it.

I'm hoping somebody can comment on what Ceph's current best practices are for 
sizing WAL/DB volumes, considering rocksdb levels and compaction.
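
As a rough illustration of where the usual rules of thumb come from: roughly
speaking, before the spillover improvements in
https://github.com/ceph/ceph/pull/29687 a RocksDB level only benefited from
the fast device if the whole level fit there, so useful DB sizes clustered
around cumulative level capacities. A minimal sketch, assuming an
illustrative 256 MB base level and a 10x level multiplier (the real values
depend on bluestore_rocksdb_options and the Ceph version):

# Rough sketch (illustrative values, not authoritative): cumulative RocksDB
# level capacities, which historically drove WAL/DB sizing rules of thumb.

def cumulative_level_sizes_gb(base_gb=0.25, multiplier=10, levels=4):
    """Return cumulative capacity (GB) after each level L1..Ln."""
    sizes = [base_gb * multiplier**i for i in range(levels)]  # L1, L2, ...
    cumulative, total = [], 0.0
    for s in sizes:
        total += s
        cumulative.append(total)
    return cumulative

for i, cap in enumerate(cumulative_level_sizes_gb(), start=1):
    print(f"L1..L{i} fit in roughly {cap:g} GB")
# Prints roughly 0.25, 2.75, 27.75, 277.75 GB, hence the familiar
# ~3 GB / ~30 GB / ~300 GB guidance, plus headroom for compaction.

Sizing a little above a level boundary historically bought compaction
headroom rather than more usable DB space, though the linked PR made
BlueStore much less all-or-nothing about spilling to the slow device.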

-Patrick


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephadm automatic sizing of WAL/DB on SSD

2022-07-29 Thread Arthur Outhenin-Chalandre
Hi Patrick,

On 7/28/22 16:22, Calhoun, Patrick wrote:
> In a new OSD node with 24 hdd (16 TB each) and 2 ssd (1.44 TB each), I'd like 
> to have "ceph orch" allocate WAL and DB on the ssd devices.
> 
> I use the following service spec:
> spec:
>   data_devices:
> rotational: 1
> size: '14T:'
>   db_devices:
> rotational: 0
> size: '1T:'
>   db_slots: 12
> 
> This results in each OSD having a 60GB volume for WAL/DB, which equates to 
> 50% total usage in the VG on each ssd, and 50% free.
> I honestly don't know what size to expect, but exactly 50% of capacity makes 
> me suspect this is due to a bug:
> https://tracker.ceph.com/issues/54541
> (In fact, I had run into this bug when specifying block_db_size rather than 
> db_slots)
> 
> Questions:
>   Am I being bit by that bug?
>   Is there a better approach, in general, to my situation?
>   Are DB sizes still governed by the rocksdb tiering? (I thought that this 
> was mostly resolved by https://github.com/ceph/ceph/pull/29687 )
>   If I provision a DB/WAL logical volume size to 61GB, is that effectively a 
> 30GB database, and 30GB of extra room for compaction?

I don't use cephadm, but it may be related to this regression:
https://tracker.ceph.com/issues/56031. At least, the symptoms look very
similar...

Cheers,

-- 
Arthur Outhenin-Chalandre
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io