[ceph-users] Re: CompleteMultipartUpload takes a long time to finish

2024-02-06 Thread Ondřej Kukla
I will try to test this case.

From a first check I can see that the client has multiple (12) multipart 
uploads for the same key, all started at exactly the same time.

However, parts are being uploaded to only one upload ID; the others are empty, 
so the total number of parts equals the number of parts of the running upload.

As for the part size, the client is uploading 500MiB parts, so a 750GiB file 
is ~1500 parts.
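
For reference, the part-count arithmetic can be sketched as below. This is only an illustration: the helper name `part_count` is made up here, and the 5MiB case is included just to compare against the tooling default mentioned in the quoted reply.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def part_count(object_size: int, part_size: int) -> int:
    """Return how many parts are needed to upload object_size bytes
    in chunks of part_size bytes (the last part may be smaller)."""
    if object_size <= 0 or part_size <= 0:
        raise ValueError("sizes must be positive")
    # Integer ceiling division, avoids float rounding on large sizes.
    return (object_size + part_size - 1) // part_size


# 750GiB in 500MiB parts: the "~1500 parts" from the message above.
print(part_count(750 * GIB, 500 * MIB))  # 1536

# The same object in 5MiB parts, for comparison.
print(part_count(750 * GIB, 5 * MIB))    # 153600
```

Note that the S3 API allows at most 10,000 parts per upload, so a 750GiB object could not actually be uploaded in 5MiB parts; the second number only shows how quickly small parts multiply.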

Ondrej

> On 6. 2. 2024, at 7:53, Robin H. Johnson  wrote:
> 
> On Mon, Feb 05, 2024 at 07:51:34PM +0100, Ondřej Kukla wrote:
>> Hello,
>> 
>> For some time now I’ve been struggling with the time it takes to 
>> CompleteMultipartUpload on one of my rgw clusters.
>> 
>> I have a customer with ~8M objects in one bucket uploading quite large 
>> files, from 100GB to about 800GB.
>> 
>> 
>> I’ve noticed that when they are uploading ~200GB files, the requests start 
>> timing out on an LB we have in front of the rgw.
>> 
>> When I started going through the logs, I noticed that the 
>> CompleteMultipartUpload request took about 700s to finish, which seemed 
>> ok-ish, but the number still seems quite large.
>> 
>> However, when they started uploading 750GB files, the time to complete the 
>> multipart upload ended up around 2500s -> more than 40 minutes, which seems 
>> like way too much.
>> 
>> 
>> Do you have a similar experience? Is there anything we can do to improve 
>> this? How much time does CompleteMultipartUpload take on your clusters?
>> 
>> The cluster is running on version 17.2.6.
> How many incomplete MPUs and MPU parts exist in that bucket?
> 
> Is it 8M objects based on ListObjects, or the number of objects reported
> by radosgw-admin bucket stats?
> 
> If there are LOTS of incomplete objects, that can cause extreme cases in
> the listing used by both ListObjects & CompleteMultipartUpload.
> 
> This is esp. likely if the huge files used many tiny parts (some
> tooling defaults to 5MB parts).
> 
> An easy way to test this is to ask them to do a single 200GB upload to a
> brand new bucket (with no objects).
> 
> If *that* case is fast, then it's something about the index entries in
> the existing bucket; likely a high proportion of incomplete parts.
> 
> CompleteMultipartUpload for 50GB => single-digit seconds.
> 
> -- 
> Robin Hugh Johnson
> Gentoo Linux: Dev, Infra Lead, Foundation President & Treasurer
> E-Mail   : robb...@gentoo.org 
> GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
> GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
> ___
> ceph-users mailing list -- ceph-users@ceph.io 
> To unsubscribe send an email to ceph-users-le...@ceph.io 
> 


[ceph-users] Re: CompleteMultipartUpload takes a long time to finish

2024-02-05 Thread Robin H. Johnson
On Mon, Feb 05, 2024 at 07:51:34PM +0100, Ondřej Kukla wrote:
> Hello,
> 
> For some time now I’ve been struggling with the time it takes to 
> CompleteMultipartUpload on one of my rgw clusters.
> 
> I have a customer with ~8M objects in one bucket uploading quite large 
> files, from 100GB to about 800GB.
> 
> 
> I’ve noticed that when they are uploading ~200GB files, the requests start 
> timing out on an LB we have in front of the rgw.
> 
> When I started going through the logs, I noticed that the 
> CompleteMultipartUpload request took about 700s to finish, which seemed 
> ok-ish, but the number still seems quite large.
> 
> However, when they started uploading 750GB files, the time to complete the 
> multipart upload ended up around 2500s -> more than 40 minutes, which seems 
> like way too much.
> 
> 
> Do you have a similar experience? Is there anything we can do to improve 
> this? How much time does CompleteMultipartUpload take on your clusters?
> 
> The cluster is running on version 17.2.6.
How many incomplete MPUs and MPU parts exist in that bucket?
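
One way to answer this from the S3 side is sketched below. It assumes a boto3-style S3 client is passed in (only the `list_multipart_uploads` and `list_parts` paginators are used); the function name `count_incomplete_mpus` and the returned keys are made up for illustration, and this walks every in-progress upload, so it may be slow on a bucket with many of them.

```python
from typing import Any, Dict


def count_incomplete_mpus(s3: Any, bucket: str) -> Dict[str, int]:
    """Count in-progress multipart uploads and their parts in one bucket.

    `s3` is expected to behave like a boto3 S3 client, e.g.
    boto3.client("s3", endpoint_url=...) pointed at the rgw endpoint.
    """
    uploads = 0
    parts = 0
    empty_uploads = 0

    upload_pages = s3.get_paginator("list_multipart_uploads").paginate(Bucket=bucket)
    for page in upload_pages:
        # "Uploads" is absent from the response when there is nothing to list.
        for upload in page.get("Uploads", []):
            uploads += 1
            part_pages = s3.get_paginator("list_parts").paginate(
                Bucket=bucket, Key=upload["Key"], UploadId=upload["UploadId"]
            )
            n = sum(len(p.get("Parts", [])) for p in part_pages)
            parts += n
            if n == 0:
                empty_uploads += 1

    return {"uploads": uploads, "parts": parts, "empty_uploads": empty_uploads}
```

A large `uploads` or `parts` count relative to the number of completed objects would point towards the index-listing explanation.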

Is it 8M objects based on ListObjects, or the number of objects reported
by radosgw-admin bucket stats?
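
To compare the two numbers, the JSON printed by `radosgw-admin bucket stats --bucket=<name>` can be summarised per category. The sketch below assumes the output has a top-level `usage` map whose entries (typically `rgw.main`, `rgw.multimeta`, ...) each carry a `num_objects` field; check this against the actual output on your version. The function name is made up for illustration.

```python
import json
from typing import Dict


def objects_per_category(bucket_stats_json: str) -> Dict[str, int]:
    """Return {usage category: num_objects} from `bucket stats` JSON text.

    Assumes the layout {"usage": {"<category>": {"num_objects": N, ...}}}.
    Categories without a num_objects field are reported as 0.
    """
    stats = json.loads(bucket_stats_json)
    usage = stats.get("usage", {})
    return {
        category: int(values.get("num_objects", 0))
        for category, values in usage.items()
    }


# Made-up sample input, only to show the expected shape.
sample = '{"bucket": "example", "usage": {"rgw.main": {"num_objects": 100}, "rgw.multimeta": {"num_objects": 12}}}'
print(objects_per_category(sample))
```

If the total here is much larger than what ListObjects returns, the difference is likely index entries that are not visible as regular objects, such as in-progress multipart uploads.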

If there are LOTS of incomplete objects, that can cause extreme cases in
the listing used by both ListObjects & CompleteMultipartUpload.

This is esp. likely if the huge files used many tiny parts (some
tooling defaults to 5MB parts).

An easy way to test this is to ask them to do a single 200GB upload to a
brand new bucket (with no objects).

If *that* case is fast, then it's something about the index entries in
the existing bucket; likely a high proportion of incomplete parts.
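
For the comparison between the new and the existing bucket, the completion call can be timed on its own. The sketch below assumes a boto3-style S3 client and an upload whose parts are already in place; `timed_complete` is a made-up helper name, and `parts` is the usual list of `{"ETag": ..., "PartNumber": ...}` entries.

```python
import time
from typing import Any, Dict, List, Tuple


def timed_complete(
    s3: Any, bucket: str, key: str, upload_id: str, parts: List[Dict[str, Any]]
) -> Tuple[float, Any]:
    """Call CompleteMultipartUpload and return (elapsed seconds, response).

    `s3` is expected to behave like a boto3 S3 client. The client's read
    timeout has to be longer than the expected completion time, otherwise
    the call fails (or is retried) before the server answers.
    """
    start = time.monotonic()
    response = s3.complete_multipart_upload(
        Bucket=bucket,
        Key=key,
        UploadId=upload_id,
        # Parts must be listed in ascending PartNumber order.
        MultipartUpload={"Parts": sorted(parts, key=lambda p: p["PartNumber"])},
    )
    return time.monotonic() - start, response
```

Running this once against the empty bucket and once against the existing one, with the same object size and part size, isolates the effect of the bucket index from everything else.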

CompleteMultipartUpload for 50GB => single-digit seconds.

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation President & Treasurer
E-Mail   : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136




[ceph-users] Re: CompleteMultipartUpload takes a long time to finish

2024-02-05 Thread Ondřej Kukla
Hello Anthony,

The replicated index pool has about 20TiB of free space, and we are using Intel 
P5510 NVMe enterprise SSDs, so I guess the HW shouldn’t be the issue.

Yes, I’m able to change the timeout on our LB, but I’m not sure I want to 
set it to 40+ minutes…

Ondrej

> On 5. 2. 2024, at 20:09, Anthony D'Atri  wrote:
> 
> Do you have sufficient capacity in the non-ec pool?  Is it on fast media?
> 
> You should be able to increase the timeout on your LB.
> 
>> On Feb 5, 2024, at 13:51, Ondřej Kukla  wrote:
>> 
>> Hello,
>> 
>> For some time now I’ve been struggling with the time it takes to 
>> CompleteMultipartUpload on one of my rgw clusters.
>> 
>> I have a customer with ~8M objects in one bucket uploading quite large 
>> files, from 100GB to about 800GB.
>> 
>> 
>> I’ve noticed that when they are uploading ~200GB files, the requests start 
>> timing out on an LB we have in front of the rgw.
>> 
>> When I started going through the logs, I noticed that the 
>> CompleteMultipartUpload request took about 700s to finish, which seemed 
>> ok-ish, but the number still seems quite large.
>> 
>> However, when they started uploading 750GB files, the time to complete the 
>> multipart upload ended up around 2500s -> more than 40 minutes, which seems 
>> like way too much.
>> 
>> 
>> Do you have a similar experience? Is there anything we can do to improve 
>> this? How much time does CompleteMultipartUpload take on your clusters?
>> 
>> The cluster is running on version 17.2.6.
>> 
>> Regards,
>> 
>> Ondrej