[ceph-users] Re: Performance of volume size, not a block size

2024-04-17 Thread Mitsumasa KONDO
Hi Johansson-san,

Thank you very much for your detailed explanation. I read some documents
from the Ceph community, so I now have a general understanding. Thank you
for all the useful advice. The characteristics of distributed storage seem
quite complex, so I will investigate further when I have time.

Regards,
--
Mitsumasa KONDO

On Tue, Apr 16, 2024 at 15:35, Janne Johansson wrote:

> On Mon, Apr 15, 2024 at 13:09, Mitsumasa KONDO <kondo.mitsum...@gmail.com> wrote:
> > Hi Menguy-san,
> >
> > Thank you for your reply. Users who issue large IO against tiny volumes
> > are a nuisance for cloud providers.
> >
> > I checked my Ceph cluster, which has 40 SSDs. Each OSD on a 1TB SSD holds
> > about 50 placement groups, so each PG covers approximately 20GB of space.
> > I had assumed that a small 8GB volume would not be distributed well, but
> > it turns out it will be.
>
> RBD images get split into 2MB or 4MB pieces when stored in Ceph, so an 8G
> RBD image becomes 2048 or 4096 separate objects that end up "randomly" on
> the PGs the pool is based on. This means that if you read or write the
> whole RBD image from start to end, you spread the load across all OSDs.
>
> I think it works something like this: you ask librbd for an 8G image named
> "myimage", and underneath it creates myimage.0, myimage.1, 2, 3, 4 and so
> on. The PG placement depends on the object name, which of course differs
> for each piece, so the pieces end up on different PGs, thereby spreading
> the load. If Ceph did not do this, you could never make an RBD image
> larger than the smallest free space on any of the pool's OSDs, and the RBD
> client would also be talking to a single OSD for everything, which would
> not be a good way to use a cluster's resources evenly.
>
> --
> May the most significant bit of your life be positive.
>


[ceph-users] Re: Performance of volume size, not a block size

2024-04-15 Thread Janne Johansson
On Mon, Apr 15, 2024 at 13:09, Mitsumasa KONDO wrote:
> Hi Menguy-san,
>
> Thank you for your reply. Users who issue large IO against tiny volumes
> are a nuisance for cloud providers.
>
> I checked my Ceph cluster, which has 40 SSDs. Each OSD on a 1TB SSD holds
> about 50 placement groups, so each PG covers approximately 20GB of space.
> I had assumed that a small 8GB volume would not be distributed well, but
> it turns out it will be.

RBD images get split into 2MB or 4MB pieces when stored in Ceph, so an 8G
RBD image becomes 2048 or 4096 separate objects that end up "randomly" on
the PGs the pool is based on. This means that if you read or write the
whole RBD image from start to end, you spread the load across all OSDs.

I think it works something like this: you ask librbd for an 8G image named
"myimage", and underneath it creates myimage.0, myimage.1, 2, 3, 4 and so
on. The PG placement depends on the object name, which of course differs
for each piece, so the pieces end up on different PGs, thereby spreading
the load. If Ceph did not do this, you could never make an RBD image
larger than the smallest free space on any of the pool's OSDs, and the RBD
client would also be talking to a single OSD for everything, which would
not be a good way to use a cluster's resources evenly.
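
A minimal Python sketch of that idea (the object naming and the hash here
are simplified stand-ins; Ceph actually names RBD data objects differently
and maps names to PGs with its own hash plus CRUSH, but the load-spreading
effect is the same):

import hashlib

IMAGE_SIZE = 8 * 1024**3    # 8 GiB RBD image
OBJECT_SIZE = 4 * 1024**2   # 4 MiB pieces (Ceph's usual default)
PG_NUM = 128                # example pool pg_num

def fake_pg(name: str, pg_num: int) -> int:
    # Stand-in for Ceph's object-name hash; only meant to show that
    # different names land on different PGs.
    return int(hashlib.md5(name.encode()).hexdigest(), 16) % pg_num

pg_usage = {}
for i in range(IMAGE_SIZE // OBJECT_SIZE):   # 2048 pieces
    pg = fake_pg(f"myimage.{i}", PG_NUM)     # simplified naming
    pg_usage[pg] = pg_usage.get(pg, 0) + 1

print(f"{IMAGE_SIZE // OBJECT_SIZE} objects spread over "
      f"{len(pg_usage)} of {PG_NUM} PGs")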

-- 
May the most significant bit of your life be positive.


[ceph-users] Re: Performance of volume size, not a block size

2024-04-15 Thread Mitsumasa KONDO
Hi Anthony-san,

Thank you for your advice. I checked the settings of my Ceph cluster.
Autoscaler mode is on, so I had assumed the PG counts were already optimal.
But the autoscaler doesn't directly target the number of PGs per OSD; it
just adjusts pg_num for each storage pool. Is that right?

Regards,
--
Mitsumasa KONDO


On Mon, Apr 15, 2024 at 22:58, Anthony D'Atri wrote:

> If you're using SATA/SAS SSDs, I would aim for 150-200 PGs per OSD, as
> shown by `ceph osd df`.
> If NVMe, 200-300 unless you're starved for RAM.
>
>
> > On Apr 15, 2024, at 07:07, Mitsumasa KONDO 
> wrote:
> >
> > Hi Menguy-san,
> >
> > Thank you for your reply. Users who issue large IO against tiny volumes
> > are a nuisance for cloud providers.
> >
> > I checked my Ceph cluster, which has 40 SSDs. Each OSD on a 1TB SSD holds
> > about 50 placement groups, so each PG covers approximately 20GB of space.
> > I had assumed that a small 8GB volume would not be distributed well, but
> > it turns out it will be.
> >
> > Regards,
> > --
> > Mitsumasa KONDO
> >
> > On Mon, Apr 15, 2024 at 15:29, Etienne Menguy wrote:
> >
> >> Hi,
> >>
> >> Volume size doesn't affect performance; cloud providers apply limits to
> >> ensure they can deliver the expected performance to all their customers.
> >>
> >> Étienne
> >> --
> >> *From:* Mitsumasa KONDO 
> >> *Sent:* Monday, 15 April 2024 06:06
> >> *To:* ceph-users@ceph.io 
> >> *Subject:* [ceph-users] Performance of volume size, not a block size
> >>
> >> Hi,
> >>
> >> For AWS EBS gp3, AWS says that small volumes cannot achieve the best
> >> performance. I think this is a characteristic of distributed storage in
> >> general, including Ceph. Is that also true for Ceph block storage? I have
> >> read many docs from the Ceph community, but I have never seen this
> >> mentioned for Ceph.
> >>
> >> https://docs.aws.amazon.com/ebs/latest/userguide/general-purpose.html
> >>
> >> Regards,
> >> --
> >> Mitsumasa KONDO


[ceph-users] Re: Performance of volume size, not a block size

2024-04-15 Thread Anthony D'Atri
If you're using SATA/SAS SSDs, I would aim for 150-200 PGs per OSD, as
shown by `ceph osd df`.
If NVMe, 200-300 unless you're starved for RAM.
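
As a rough sanity check (a back-of-the-envelope sketch, not what
`ceph osd df` computes internally), the PG count per OSD is roughly each
pool's pg_num times its replica count, summed over pools and divided by the
number of OSDs:

# Rough PGs-per-OSD estimate; assumes replicated pools and an even CRUSH
# distribution, which real clusters only approximate. Pool numbers below
# are hypothetical.
pools = [
    {"pg_num": 1024, "size": 3},   # e.g. an RBD data pool
    {"pg_num": 256,  "size": 3},   # e.g. a smaller metadata pool
]
num_osds = 40

pg_replicas = sum(p["pg_num"] * p["size"] for p in pools)
print(f"~{pg_replicas / num_osds:.0f} PGs per OSD")  # compare with `ceph osd df`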


> On Apr 15, 2024, at 07:07, Mitsumasa KONDO  wrote:
> 
> Hi Menguy-san,
> 
> Thank you for your reply. Users who issue large IO against tiny volumes
> are a nuisance for cloud providers.
>
> I checked my Ceph cluster, which has 40 SSDs. Each OSD on a 1TB SSD holds
> about 50 placement groups, so each PG covers approximately 20GB of space.
> I had assumed that a small 8GB volume would not be distributed well, but
> it turns out it will be.
> 
> Regards,
> --
> Mitsumasa KONDO
> 
> On Mon, Apr 15, 2024 at 15:29, Etienne Menguy wrote:
> 
>> Hi,
>> 
>> Volume size doesn't affect performance; cloud providers apply limits to
>> ensure they can deliver the expected performance to all their customers.
>> 
>> Étienne
>> --
>> *From:* Mitsumasa KONDO 
>> *Sent:* Monday, 15 April 2024 06:06
>> *To:* ceph-users@ceph.io 
>> *Subject:* [ceph-users] Performance of volume size, not a block size
>> 
>> Hi,
>> 
>> For AWS EBS gp3, AWS says that small volumes cannot achieve the best
>> performance. I think this is a characteristic of distributed storage in
>> general, including Ceph. Is that also true for Ceph block storage? I have
>> read many docs from the Ceph community, but I have never seen this
>> mentioned for Ceph.
>>
>> https://docs.aws.amazon.com/ebs/latest/userguide/general-purpose.html
>>
>> Regards,
>> --
>> Mitsumasa KONDO


[ceph-users] Re: Performance of volume size, not a block size

2024-04-15 Thread Mitsumasa KONDO
Hi Menguy-san,

Thank you for your reply. Users who issue large IO against tiny volumes
are a nuisance for cloud providers.

I checked my Ceph cluster, which has 40 SSDs. Each OSD on a 1TB SSD holds
about 50 placement groups, so each PG covers approximately 20GB of space.
I had assumed that a small 8GB volume would not be distributed well, but
it turns out it will be.
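
A quick sketch of that arithmetic (it ignores replication overhead and
assumes CRUSH spreads data evenly, so the real numbers will differ a bit):

osd_capacity_gb = 1000   # 1TB SSD behind each OSD
pgs_per_osd = 50         # from `ceph osd df`
print(osd_capacity_gb / pgs_per_osd, "GB of raw space per PG, roughly")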

Regards,
--
Mitsumasa KONDO

On Mon, Apr 15, 2024 at 15:29, Etienne Menguy wrote:

> Hi,
>
> Volume size doesn't affect performance; cloud providers apply limits to
> ensure they can deliver the expected performance to all their customers.
>
> Étienne
> --
> *From:* Mitsumasa KONDO 
> *Sent:* Monday, 15 April 2024 06:06
> *To:* ceph-users@ceph.io 
> *Subject:* [ceph-users] Performance of volume size, not a block size
>
> Hi,
>
> For AWS EBS gp3, AWS says that small volumes cannot achieve the best
> performance. I think this is a characteristic of distributed storage in
> general, including Ceph. Is that also true for Ceph block storage? I have
> read many docs from the Ceph community, but I have never seen this
> mentioned for Ceph.
>
> https://docs.aws.amazon.com/ebs/latest/userguide/general-purpose.html
>
> Regards,
> --
> Mitsumasa KONDO


[ceph-users] Re: Performance of volume size, not a block size

2024-04-14 Thread Etienne Menguy
Hi,

Volume size doesn't affect performance; cloud providers apply limits to
ensure they can deliver the expected performance to all their customers.

Étienne

From: Mitsumasa KONDO 
Sent: Monday, 15 April 2024 06:06
To: ceph-users@ceph.io 
Subject: [ceph-users] Performance of volume size, not a block size

Hi,

For AWS EBS gp3, AWS says that small volumes cannot achieve the best
performance. I think this is a characteristic of distributed storage in
general, including Ceph. Is that also true for Ceph block storage? I have
read many docs from the Ceph community, but I have never seen this
mentioned for Ceph.

https://docs.aws.amazon.com/ebs/latest/userguide/general-purpose.html

Regards,
--
Mitsumasa KONDO