[ceph-users] Re: Creating nfs RGW export puts nfs-ganesha server in crash loop

2023-01-12 Thread Matt Benjamin
Hi Ben,

The issue seems to be that you don't have a ceph keyring available to the
nfs-ganesha server.  The upstream doc talks about this.  The nfs-ganesha
runtime environment needs to be essentially identical to one (a pod, I
guess) that would run radosgw.
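
In concrete terms, that usually means giving ganesha a cephx key and pointing its ceph.conf at it. A rough sketch of the idea (the client name, caps, and paths here are illustrative assumptions; adapt them to how rook mounts files into the pod):

```shell
# create a key with roughly the caps an RGW instance needs
ceph auth get-or-create client.nfs-rgw \
    mon 'allow r' osd 'allow rwx' \
    -o /etc/ceph/ceph.client.nfs-rgw.keyring

# make the keyring visible inside the ganesha pod and reference it in
# the ceph.conf that pod uses, e.g.:
#   [client.nfs-rgw]
#       keyring = /etc/ceph/ceph.client.nfs-rgw.keyring
```

The "no keyring found ... disabling cephx" lines in the log below are exactly what a reachable keyring is meant to avoid.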

Matt

On Thu, Jan 12, 2023 at 7:27 AM Ruidong Gao  wrote:

> Hi,
>
> This is running Quincy 17.2.5 deployed by rook on k8s. Creating an RGW nfs
> export crashes the Ganesha server pod, while CephFS exports work just fine.
> Here are the steps to reproduce:
> 1, create export:
> bash-4.4$ ceph nfs export create rgw --cluster-id nfs4rgw --pseudo-path
> /bucketexport --bucket testbk
> {
> "bind": "/bucketexport",
> "path": "testbk",
> "cluster": "nfs4rgw",
> "mode": "RW",
> "squash": "none"
> }
>
> 2, check pods status afterwards:
> rook-ceph-nfs-nfs1-a-679fdb795-82tcx      2/2  Running  0  4h3m
> rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42  1/2  Error    2  4h6m
>
> 3, check failing pod’s logs:
>
> 11/01/2023 08:11:53 : epoch 63be6f49 :
> rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 : nfs-ganesha-1[main]
> nfs_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 90
> 11/01/2023 08:11:54 : epoch 63be6f49 :
> rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 : nfs-ganesha-1[main]
> nfs_start_grace :STATE :EVENT :grace reload client info completed from
> backend
> 11/01/2023 08:11:54 : epoch 63be6f49 :
> rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 : nfs-ganesha-1[main]
> nfs_try_lift_grace :STATE :EVENT :check grace:reclaim complete(0) clid
> count(0)
> 11/01/2023 08:11:57 : epoch 63be6f49 :
> rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 : nfs-ganesha-1[main]
> nfs_lift_grace_locked :STATE :EVENT :NFS Server Now NOT IN GRACE
> 11/01/2023 08:11:57 : epoch 63be6f49 :
> rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 : nfs-ganesha-1[main]
> export_defaults_commit :CONFIG :INFO :Export Defaults now
> (options=03303002/0008   , ,,   ,
>  , ,,, expire=   0)
> 2023-01-11T08:11:57.853+ 7f59dac7c200 -1 auth: unable to find a
> keyring on /var/lib/ceph/radosgw/ceph-admin/keyring: (2) No such file or
> directory
> 2023-01-11T08:11:57.853+ 7f59dac7c200 -1 AuthRegistry(0x56476817a480)
> no keyring found at /var/lib/ceph/radosgw/ceph-admin/keyring, disabling
> cephx
> 2023-01-11T08:11:57.855+ 7f59dac7c200 -1 auth: unable to find a
> keyring on /var/lib/ceph/radosgw/ceph-admin/keyring: (2) No such file or
> directory
> 2023-01-11T08:11:57.855+ 7f59dac7c200 -1 AuthRegistry(0x7ffe4d092c90)
> no keyring found at /var/lib/ceph/radosgw/ceph-admin/keyring, disabling
> cephx
> 2023-01-11T08:11:57.856+ 7f5987537700 -1 monclient(hunting):
> handle_auth_bad_method server allowed_methods [2] but i only support [1]
> 2023-01-11T08:11:57.856+ 7f5986535700 -1 monclient(hunting):
> handle_auth_bad_method server allowed_methods [2] but i only support [1]
> 2023-01-11T08:12:00.861+ 7f5986d36700 -1 monclient(hunting):
> handle_auth_bad_method server allowed_methods [2] but i only support [1]
> 2023-01-11T08:12:00.861+ 7f59dac7c200 -1 monclient: authenticate NOTE:
> no keyring found; disabled cephx authentication
> failed to fetch mon config (--no-mon-config to skip)
>
> 4, delete the export:
> ceph nfs export delete nfs4rgw /bucketexport
>
> Ganesha servers go back to normal:
> rook-ceph-nfs-nfs1-a-679fdb795-82tcx      2/2  Running  0   4h30m
> rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42  2/2  Running  10  4h33m
>
> Any ideas to make it work?
>
> Thanks
> Ben
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RGW archive zone lifecycle

2023-02-08 Thread Matt Benjamin
Hi Ondřej,

Yes, we added an extension to allow writing lifecycle policy which will
only take effect in archive zone(s).  It's currently present on ceph/main,
and will be in Reef.
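
For reference, the extension works by letting a lifecycle rule apply only in the archive zone. A sketch of a policy along the lines Ondřej asks about (the `ArchiveZone` filter element and the noncurrent-version limits should be verified against the Reef docs; the numbers are placeholders):

```xml
<LifecycleConfiguration>
  <Rule>
    <ID>trim-archived-versions</ID>
    <Filter><ArchiveZone /></Filter>
    <Status>Enabled</Status>
    <NoncurrentVersionExpiration>
      <NoncurrentDays>365</NoncurrentDays>
      <NewerNoncurrentVersions>5</NewerNoncurrentVersions>
    </NoncurrentVersionExpiration>
  </Rule>
</LifecycleConfiguration>
```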

Matt

On Wed, Feb 8, 2023 at 2:10 AM Ondřej Kukla  wrote:

> Hi,
>
> I have two Ceph clusters in a multi-zone setup. The first one (master
> zone) would be accessible to users for their interaction using RGW.
> The second one is set to sync from the master zone with the tier type of
> the zone set as an archive (to version all files).
>
> My question here is. Is there an option to set a lifecycle for the version
> files saved on the archive zone? For example, keep only 5 versions per file
> or delete version files older than one year?
>
> Thanks a lot.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>



[ceph-users] Re: RadosGW - Performance Expectations

2023-02-10 Thread Matt Benjamin
Hi Shawn,

To get another S3 upload baseline, I'd recommend doing some upload testing
with s5cmd [1].

1. https://github.com/peak/s5cmd
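
A typical baseline run might look like this (endpoint, credentials, and tuning flags are placeholders; check `s5cmd --help` for the exact flag names on your version):

```shell
export AWS_ACCESS_KEY_ID=... AWS_SECRET_ACCESS_KEY=...

# single large upload straight at the RGW endpoint, no LB/SSL in between
s5cmd --endpoint-url http://rgw-host:8080 cp bigfile s3://testbucket/bigfile

# s5cmd parallelizes multipart uploads; try raising the worker count
s5cmd --endpoint-url http://rgw-host:8080 --numworkers 64 \
    cp bigfile s3://testbucket/bigfile
```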

Matt


On Fri, Feb 10, 2023 at 9:38 AM Shawn Weeks 
wrote:

> Good morning everyone, been running a small Ceph cluster with Proxmox for
> a while now and I’ve finally run across an issue I can’t find any
> information on. I have a 3 node cluster with 9 Samsung PM983 960GB NVME
> drives running on a dedicated 10gb network. RBD and CephFS performance have
> been great; most of the time I see over 500 MB/s writes, and a rados benchmark
> shows 951 MB/s write and 1140 MB/s read bandwidth.
>
> The problem I’m seeing is after setting up RadosGW I can only upload to
> “S3” at around 25 MB/s with the official AWS CLI. Using s3cmd is slightly
> better at around 45MB/s. I’m going directly to the RadosGW instance with no
> load balancers in between and no ssl enabled. Just trying to figure out if
> this is normal. I’m not expecting it to be as fast as writing directly to a
> RBD but I was kinda hoping for more than this.
>
> So what should I expect in performance from the RadosGW?
>
> Here are some rados bench results and my ceph report
>
> https://gist.github.com/shawnweeks/f6ef028284b5cdb10d80b8dc0654eec5
>
> https://gist.github.com/shawnweeks/7cfe94c08adbc24f2a3d8077688df438
>
> Thanks
> Shawn
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>




[ceph-users] Re: s3 compatible interface

2023-03-20 Thread Matt Benjamin
Hi Chris,

This looks useful.  Note for this thread:  this *looks like* it's using the
zipper dbstore backend?  Yes, that's coming in Reef.  We think of dbstore
as mostly the zipper reference driver, but it can be useful as a standalone
setup, potentially.

But there's now a prototype of a posix file filter that can be stacked on
dbstore (or rados, I guess)--not yet merged, and iiuc post-Reef.  That's
the project Daniel was describing.  The posix/gpfs filter is aiming for
being thin and fast and horizontally scalable.

The s3gw project that Clyso and folks were writing about is distinct from
both of these.  I *think* it's truthful to say that s3gw is its own
thing--a hybrid backing store with objects in files, but also metadata
atomicity from an embedded db--plus interesting orchestration.

Matt

On Mon, Mar 20, 2023 at 3:45 PM Chris MacNaughton <
chris.macnaugh...@canonical.com> wrote:

> On 3/20/23 12:02, Frank Schilder wrote:
>
> Hi Marc,
>
> I'm also interested in an S3 service that uses a file system as a back-end. I 
> looked at the documentation of https://github.com/aquarist-labs/s3gw and have 
> to say that it doesn't make much sense to me. I don't see this kind of 
> gateway anywhere there. What I see is a build of a rados gateway that can be 
> pointed at a ceph cluster. That's not a gateway to an FS.
>
> Did I misunderstand your actual request or can you point me to the part of 
> the documentation where it says how to spin up an S3 interface using a file 
> system for user data?
>
> The only thing I found is 
> https://s3gw-docs.readthedocs.io/en/latest/helm-charts/#local-storage, but it 
> sounds to me that this is not where the user data will be going.
>
> Thanks for any hints and best regards,
>
>
> for testing you can try: https://github.com/aquarist-labs/s3gw
>
> Yes indeed, that looks like it can be used with a simple fs backend.
>
> Hey,
>
> (Re-sending this email from a mailing-list subscribed email)
>
> I was playing around with RadosGW's file backend (coming in Reef, zipper)
> a few months back and ended up making this docker container that just works
> to setup things:
> https://github.com/ChrisMacNaughton/ceph-rgw-docker; published (still,
> maybe for a while?) at https://hub.docker.com/r/iceyec/ceph-rgw-zipper
>
> Chris
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>




[ceph-users] Re: Bucket sync policy

2023-04-24 Thread Matt Benjamin
I'm unclear whether all of this currently works on upstream Quincy
(apologies if all such backports have been done). You might retest against
Reef or the ceph/main branch.
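
(For reference, the interim workaround discussed downthread — forcing the sync once after creating/enabling the bucket-level policy — is typically driven with radosgw-admin; the bucket and zone names below are from Yixin's test setup, and the flags should be checked against `radosgw-admin help` on your release:)

```shell
radosgw-admin bucket sync run --bucket=test-bucket --source-zone=z0
```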

Matt

On Mon, Apr 24, 2023 at 2:52 PM Yixin Jin  wrote:

>  Actually, "bucket sync run" somehow made it worse since now the
> destination zone shows "bucket is caught up with source" from "bucket sync
> status" even though it clearly missed an object.
>
> On Monday, April 24, 2023 at 02:37:46 p.m. EDT, Yixin Jin <
> yji...@yahoo.ca> wrote:
>
>   An update:
> After creating and enabling the bucket sync policy, I ran "bucket sync
> markers" and saw that each shard had the status of "init". Running "bucket
> sync run" in the end marked the status as "incremental-sync", which suggests
> it went through the full-sync stage. However, the lone object in the source
> zone wasn't synced over to the destination zone.
> I actually used gdb to walk through radosgw-admin to run "bucket sync
> run". It seems not to do anything for full-sync and it printed a log saying
> "finished iterating over all available prefixes:...", which actually broke
> off the do-while loop after the call to
> prefix_handler.revalidate_marker(&list_marker). This call returned false
> because it couldn't find rules from the sync pipe. I haven't drilled deeper
> to see why it didn't get rules, whatever it means. Nevertheless, the
> workaround with "bucket sync run" doesn't seem to work, at least not with
> Quincy.
>
> Regards,Yixin
>
> On Monday, April 24, 2023 at 12:37:24 p.m. EDT, Soumya Koduri <
> skod...@redhat.com> wrote:
>
>  On 4/24/23 21:52, Yixin Jin wrote:
> > Hello ceph gurus,
> >
> > We are trying bucket-specific sync policy feature with Quincy release
> and we encounter something strange. Our test setup is very simple. I use
> mstart.sh to spin up 3 clusters, configure them with a single realm, a
> single zonegroup and 3 zones – z0, z1, z2, with z0 being the master. I
> created a zonegroup-level sync policy with “allowed”, a symmetrical flow
> among all 3 zones and a pipe allowing all zones to all zones. I created a
> single bucket “test-bucket” at z0 and uploaded a single object to it. By
> now, there should be no sync since the policy is “allowed” only and I can
> see the single file only exist in z0 and “bucket sync status” shows the
> sync is actually disabled. Finally, I created a bucket-specific sync policy
> being “enabled” and a pipe between z0 and z1 only. I expected that sync
> should be kicked off between z0 and z1 and I did see from “sync info” that
> there are sources/dests being z0/z1. “bucket sync status” also shows the
> source zone and source bucket. At z0, it shows everything is caught up but
> at z1 it shows one shard is behind, which is expected since that only
> object exists in z0 but not in z1.
> >
> >
> >
> > Now, here comes the strange part. Although z1 shows there is one shard
> behind, it doesn’t seem to make any progress on syncing it. It doesn’t seem
> to do any full sync at all since “bucket sync status” shows “full sync:
> 0/11 shards”. There hasn’t been any full sync since otherwise, z1 should
> have that only object. It is stuck in this condition forever until I make
> another upload on the same object. I suspect the update of the object
> triggers a new data log, which triggers the sync. Why wasn’t there a full
> sync and how can one force a full sync?
>
> yes this is known_issue yet to be addressed with bucket level sync
> policy ( - https://tracker.ceph.com/issues/57489 ). The interim
> workaround to sync existing objects  is to either
>
> * create new objects (or)
>
> * execute "bucket sync run"
>
> after creating/enabling the bucket policy.
>
> Please note that this issue is specific to only bucket policy but
> doesn't exist for sync-policy set at zonegroup level.
>
>
> Thanks,
>
> Soumya
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>




[ceph-users] Re: [multisite] Resetting an empty bucket

2023-05-01 Thread Matt Benjamin
Hi Yixin,

This sounds interesting.  I kind of suspect that this feature requires some
more conceptual design support.  Like, at a high level, how a bucket's
"zone residency" might be defined and specified, and what policies might
govern changing it, not to mention, how you direct things (commands? ops?).

This might be something to bring to an RGW upstream call (the "refactoring"
meeting), Wednesdays at 11:30 EST.

Matt

On Mon, May 1, 2023 at 5:27 PM Yixin Jin  wrote:

> Hi folks,
>
> Armed with bucket-specific sync policy feature, I found that we could move
> objects of a bucket between zones. It is migration via sync followed by
> object removal at the source. This allows us to better utilize available
> capacities in different clusters/zones. However, to achieve this, we need a
> way to reset an empty bucket so that it can serve as a destination for a
> migration after it serves as a source before. ceph/rgw currently doesn't
> seem to be able to do that, so I created a feature request for it:
> https://tracker.ceph.com/issues/59593
>
> My own prototype shows that this feature is fairly simple to implement and
> works well for bucket migration.
>
> Cheers,
> Yixin
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
>



[ceph-users] Re: Bucket-level redirect_zone

2023-05-16 Thread Matt Benjamin
Hi Yixin,

I support experimentation for sure, but if we want to consider a feature
for inclusion, we need design proposal(s) and review, of course.  If you're
interested in feedback on your current ideas, you could consider coming to
the "refactoring" meeting on a Wednesday.  I think these ideas would be
interesting to discuss, maybe it would reduce time to develop.

Matt

On Tue, May 16, 2023 at 5:20 PM Yixin Jin  wrote:

> Hi folks,
>
> I created a feature request ticket to call for bucket-level redirect_zone (
> https://tracker.ceph.com/issues/61199), which is basically an extension
> from zone-level redirect_zone. I found it helpful in realizing CopyObject
> with (x-copy-source) in multisite environments where bucket content doesn't
> exist in all zones. This feature is similar to what Matt Benjamin suggested
> about the concept of "bucket residence".
>
> In my own prototyping effort, the redirecting feature at bucket level is
> fairly straightforward. Making use of it with CopyObject is trickier
> because I haven't found a good way to get the source object policy when
> RGWCopyObj::verify_permission() is called. Anyway, I will continue to
> explore this idea and hope to get more support of it.
>
> Thanks,
> Yixin
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
>



[ceph-users] Re: rgw: strong consistency for (bucket) policy settings?

2023-09-11 Thread Matt Benjamin
Yes, it's also strongly consistent.  It's also last writer wins, though, so
two clients somehow permitted to contend for updating policy could
overwrite each other's changes, just as with objects.

Matt

On Mon, Sep 11, 2023 at 2:21 PM Matthias Ferdinand 
wrote:

> Hi,
>
> while I don't currently use rgw, I still am curious about consistency
> guarantees.
>
> Usually, S3 has strong read-after-write consistency guarantees (for
> requests that do not overlap). According to
> https://docs.ceph.com/en/latest/dev/radosgw/bucket_index/
> in Ceph this is also true for per-object ACLs.
>
> Is there also a strong consistency guarantee for (bucket) policies? The
> documentation at
> https://docs.ceph.com/en/latest/radosgw/bucketpolicy/
> apparently does not say anything about this.
>
> How would multiple rgw instances synchronize a policy change? Is this
> effective immediate with strong consistency or is there some propagation
> delay (hopefully on with some upper bound)?
>
>
> Best regards
> Matthias
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>



[ceph-users] Re: S3 user with more than 1000 buckets

2023-10-03 Thread Matt Benjamin
Hi Thomas,

If I'm not mistaken, the RGW will paginate ListBuckets essentially like
ListObjectsv1 if the S3 client provides the appropriate "marker" parameter
values.  COS does this too, I noticed.  I'm not sure which S3 clients can
be relied on to do this, though.
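
In the meantime, a client can drive the marker loop itself. A minimal sketch of the pattern (the `list_buckets(marker=..., max_keys=...)` shape is a stand-in for whatever paginating ListBuckets call your SDK exposes; it is exercised here against a fake backend):

```python
def list_all_buckets(client, page_size=1000):
    """Collect every bucket name by following ListBuckets markers.

    `client.list_buckets(marker=..., max_keys=...)` is assumed to return
    {"buckets": [names...], "is_truncated": bool} -- a stand-in for a
    paginating ListBuckets call.
    """
    names, marker = [], ""
    while True:
        page = client.list_buckets(marker=marker, max_keys=page_size)
        names.extend(page["buckets"])
        if not page["is_truncated"]:
            return names
        marker = page["buckets"][-1]  # resume after the last name seen


class FakeBackend:
    """2500 buckets, served at most 1000 per call, like the RGW default."""
    def __init__(self, n=2500):
        self.all = [f"bucket-{i:04d}" for i in range(n)]

    def list_buckets(self, marker="", max_keys=1000):
        start = self.all.index(marker) + 1 if marker else 0
        return {"buckets": self.all[start:start + max_keys],
                "is_truncated": start + max_keys < len(self.all)}


print(len(list_all_buckets(FakeBackend())))  # prints 2500
```

The same loop works whether the page size is the 1000 default or a raised rgw_list_buckets_max_chunk.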

Matt

On Tue, Oct 3, 2023 at 9:06 AM Thomas Bennett  wrote:

> Hi Jonas,
>
> Thanks :) that solved my issue.
>
> It would seem to me that this is heading towards something that the s3
> clients should paginate, but I couldn't find any documentation on how to
> paginate bucket listings. All the information points to paginating object
> listing - which makes sense.
>
> Just for completion of this thread:
>
> The rgw parameters are found at: Quincy radosgw config ref
> <https://docs.ceph.com/en/quincy/radosgw/config-ref/>
>
> I ran the following command to update the parameter for all running rgw
> daemons:
> ceph config set client.rgw rgw_list_buckets_max_chunk 1
>
> And then confirmed the running daemons were configured:
> ceph daemon /var/run/ceph/ceph-client.rgw.xxx.xxx.asok config show | grep
> rgw_list_buckets_max_chunk
> "rgw_list_buckets_max_chunk": "1",
>
> Kind regards,
> Tom
>
> On Tue, 3 Oct 2023 at 13:30, Jonas Nemeiksis  wrote:
>
> > Hi,
> >
> > You should increase these default settings:
> >
> > rgw_list_buckets_max_chunk // for buckets
> > rgw_max_listing_results // for objects
> >
> > On Tue, Oct 3, 2023 at 12:59 PM Thomas Bennett  wrote:
> >
> >> Hi,
> >>
> >> I'm running a Ceph 17.2.5 Rados Gateway and I have a user with more than
> >> 1000 buckets.
> >>
> >> When the client tries to list all their buckets using s3cmd, rclone and
> >> python boto3, they all three only ever return the first 1000 bucket
> names.
> >> I can confirm the buckets are all there (and more than 1000) by checking
> >> with the radosgw-admin command.
> >>
> >> Have I missed a pagination limit for listing user buckets in the rados
> >> gateway?
> >>
> >> Thanks,
> >> Tom
> >> ___
> >> ceph-users mailing list -- ceph-users@ceph.io
> >> To unsubscribe send an email to ceph-users-le...@ceph.io
> >>
> >
> >
> > --
> > Jonas
> >
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>




[ceph-users] Re: About delete old bucket lifecycle

2023-12-06 Thread Matt Benjamin
Hi,

I think running "lc reshard fix" will fix this.
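
The usual invocation is along these lines (flag spelling per `radosgw-admin help` on your release):

```shell
radosgw-admin lc list            # confirm the stale entries are present
radosgw-admin lc reshard fix     # rebuild the lifecycle list entries
```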

Matt


On Wed, Dec 6, 2023 at 5:48 AM VÔ VI  wrote:

> Hi community,
>
> I have multiple buckets that were deleted, but their lifecycle policies still
> exist. How can I delete them with radosgw-admin? The lifecycle can't be
> deleted through the bucket, because the user for this bucket no longer exists.
>
> root@ceph:~# radosgw-admin lc list
> [
> {
> "bucket": ":r30203:f3fec4b6-a248-4f3f-be75-b8055e61233a.33081.8",
> "started": "Wed, 06 Dec 2023 10:43:55 GMT",
> "status": "COMPLETE"
> },
> {
> "bucket": ":r30304:f3fec4b6-a248-4f3f-be75-b8055e61233a.33081.13",
> "started": "Wed, 06 Dec 2023 10:43:54 GMT",
> "status": "COMPLETE"
> },
> {
> "bucket":
> ":ec3204cam04:f3fec4b6-a248-4f3f-be75-b8055e61233a.31736.1",
> "started": "Wed, 06 Dec 2023 10:44:30 GMT",
> "status": "COMPLETE"
> },
> {
> "bucket": ":r30105:f3fec4b6-a248-4f3f-be75-b8055e61233a.33081.5",
> "started": "Wed, 06 Dec 2023 10:44:40 GMT",
> "status": "COMPLETE"
> },
> {
> "bucket": ":r30303:f3fec4b6-a248-4f3f-be75-b8055e61233a.33081.14",
> "started": "Wed, 06 Dec 2023 10:44:40 GMT",
> "status": "COMPLETE"
> },
> {
> "bucket":
> ":ec3201cam02:f3fec4b6-a248-4f3f-be75-b8055e61233a.56439.2",
> "started": "Wed, 06 Dec 2023 10:43:56 GMT",
> "status": "COMPLETE"
> },
> Thanks to the community.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>



[ceph-users] Re: RGW crashes when rgw_enable_ops_log is enabled

2024-01-25 Thread Matt Benjamin
Hi Marc,

The ops log code is designed to discard data if the socket is
flow-controlled, iirc.  Maybe we just need to handle the signal.

Of course, you should have something consuming data on the socket, but it's
still a problem if radosgw exits unexpectedly.
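
A minimal consumer that keeps draining the socket could look like this (a sketch; the socket path comes from Marc's config below, and parsing of the entries is deliberately left out):

```python
import socket

def drain_socket(sock, handle, bufsize=65536):
    """Read from a connected socket until EOF, handing chunks to `handle`.

    The key property, per the thread, is that the reader never stops:
    if nothing consumes the ops-log socket, radosgw's writes back up.
    Parsing/forwarding of the log entries is left to the caller.
    """
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:  # peer closed the connection
            return
        handle(chunk)

# Against RGW (run in a dedicated thread or process):
#   s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
#   s.connect("/tmp/ops/rgw-ops.socket")
#   drain_socket(s, lambda b: sys.stdout.buffer.write(b))
```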

Matt

On Thu, Jan 25, 2024 at 10:08 AM Marc Singer  wrote:

> Hi Ceph Users
>
> I am encountering a problem with the RGW Admin Ops Socket.
>
> I am setting up the socket as follows:
>
> rgw_enable_ops_log = true
> rgw_ops_log_socket_path = /tmp/ops/rgw-ops.socket
> rgw_ops_log_data_backlog = 16Mi
>
> Seems like the socket fills up over time and it doesn't seem to get
> flushed; at some point the process runs out of file space.
>
> Do I need to configure something or send something for the socket to flush?
>
> See the log here:
>
> 0> 2024-01-25T13:10:13.908+ 7f247b00eb00 -1 *** Caught signal (File
> size limit exceeded) **
>   in thread 7f247b00eb00 thread_name:ops_log_file
>
>   ceph version 18.2.0 (5dd24139a1eada541a3bc16b6941c5dde975e26d) reef
> (stable)
>   NOTE: a copy of the executable, or `objdump -rdS ` is
> needed to interpret this.
>
> --- logging levels ---
> 0/ 5 none
> 0/ 1 lockdep
> 0/ 1 context
> 1/ 1 crush
> 1/ 5 mds
> 1/ 5 mds_balancer
> 1/ 5 mds_locker
> 1/ 5 mds_log
> 1/ 5 mds_log_expire
> 1/ 5 mds_migrator
> 0/ 1 buffer
> 0/ 1 timer
> 0/ 1 filer
> 0/ 1 striper
> 0/ 1 objecter
> 0/ 5 rados
> 0/ 5 rbd
> 0/ 5 rbd_mirror
> 0/ 5 rbd_replay
> 0/ 5 rbd_pwl
> 0/ 5 journaler
> 0/ 5 objectcacher
> 0/ 5 immutable_obj_cache
> 0/ 5 client
> 1/ 5 osd
> 0/ 5 optracker
> 0/ 5 objclass
> 1/ 3 filestore
> 1/ 3 journal
> 0/ 0 ms
> 1/ 5 mon
> 0/10 monc
> 1/ 5 paxos
> 0/ 5 tp
> 1/ 5 auth
> 1/ 5 crypto
> 1/ 1 finisher
> 1/ 1 reserver
> 1/ 5 heartbeatmap
> 1/ 5 perfcounter
> 1/ 5 rgw
> 1/ 5 rgw_sync
> 1/ 5 rgw_datacache
> 1/ 5 rgw_access
> 1/ 5 rgw_dbstore
> 1/ 5 rgw_flight
> 1/ 5 javaclient
> 1/ 5 asok
> 1/ 1 throttle
> 0/ 0 refs
> 1/ 5 compressor
> 1/ 5 bluestore
> 1/ 5 bluefs
> 1/ 3 bdev
> 1/ 5 kstore
> 4/ 5 rocksdb
> 4/ 5 leveldb
> 1/ 5 fuse
> 2/ 5 mgr
> 1/ 5 mgrc
> 1/ 5 dpdk
> 1/ 5 eventtrace
> 1/ 5 prioritycache
> 0/ 5 test
> 0/ 5 cephfs_mirror
> 0/ 5 cephsqlite
> 0/ 5 seastore
> 0/ 5 seastore_onode
> 0/ 5 seastore_odata
> 0/ 5 seastore_omap
> 0/ 5 seastore_tm
> 0/ 5 seastore_t
> 0/ 5 seastore_cleaner
> 0/ 5 seastore_epm
> 0/ 5 seastore_lba
> 0/ 5 seastore_fixedkv_tree
> 0/ 5 seastore_cache
> 0/ 5 seastore_journal
> 0/ 5 seastore_device
> 0/ 5 seastore_backref
> 0/ 5 alienstore
> 1/ 5 mclock
> 0/ 5 cyanstore
> 1/ 5 ceph_exporter
> 1/ 5 memstore
>-2/-2 (syslog threshold)
>99/99 (stderr threshold)
> --- pthread ID / name mapping for recent threads ---
>7f2472a89b00 / safe_timer
>7f2472cadb00 / radosgw
>...
>log_file
>
> /var/lib/ceph/crash/2024-01-25T13:10:13.909546Z_01ee6e6a-e946-4006-9d32-e17ef2f9df74/log
> --- end dump of recent events ---
> reraise_fatal: default handler for signal 25 didn't terminate the process?
>
> Thank you for your help.
>
> Marc
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>




[ceph-users] Re: RGW crashes when rgw_enable_ops_log is enabled

2024-01-26 Thread Matt Benjamin
Hi Marc,

1. if you can, yes, create a tracker issue on tracker.ceph.com?
2. you might be able to get more throughput with (some number) of
additional threads;  the first thing I would try is prioritization (nice)

regards,

Matt


On Fri, Jan 26, 2024 at 6:08 AM Marc Singer  wrote:

> Hi Matt
>
> Thanks for your answer.
>
> Should I open a bug report then?
>
> How would I be able to read more from it? Have multiple threads access
> it and read from it simultaneously?
>
> Marc
>
> On 1/25/24 20:25, Matt Benjamin wrote:
> > Hi Marc,
> >
> > No, the only thing you need to do with the Unix socket is to keep
> > reading from it.  So it probably is getting backlogged.  And while you
> > could arrange things to make that less likely, you likely can't make
> > it impossible, so there's a bug here.
> >
> > Matt
> >
> > On Thu, Jan 25, 2024 at 10:52 AM Marc Singer 
> wrote:
> >
> > Hi
> >
> > I am using a unix socket client to connect with it and read the data
> > from it.
> > Do I need to do anything like signal the socket that this data has
> > been
> > read? Or am I not reading fast enough and data is backing up?
> >
> > What I am also noticing that at some point (probably after something
> > with the ops socket happens), the log level seems to increase for
> > some
> > reason? I did not find anything in the logs yet why this would be
> > the case.
> >
> > *Normal:*
> >
> > 2024-01-25T15:47:58.444+ 7fe98a5c0b00  1 == starting new
> > request
> > req=0x7fe98712c720 =
> > 2024-01-25T15:47:58.548+ 7fe98b700b00  1 == req done
> > req=0x7fe98712c720 op status=0 http_status=200
> > latency=0.104001537s ==
> > 2024-01-25T15:47:58.548+ 7fe98b700b00  1 beast: 0x7fe98712c720:
> > redacted - redacted [25/Jan/2024:15:47:58.444 +] "PUT
> > /redacted/redacted/chunks/27/27242/27242514_10_4194304 HTTP/1.1" 200
> > 4194304 - "redacted" - latency=0.104001537s
> >
> > *Close before crashing:
> > *
> >
> >-509> 2024-01-25T14:54:31.588+ 7f5186648b00  1 == starting
> > new request req=0x7f517ffca720 =
> >-508> 2024-01-25T14:54:31.588+ 7f5186648b00  2 req
> > 2568229052387020224 0.0s initializing for trans_id =
> > tx023a42eb7515dcdc0-0065b27627-823feaa-central
> >-507> 2024-01-25T14:54:31.588+ 7f5186648b00  2 req
> > 2568229052387020224 0.0s getting op 1
> >-506> 2024-01-25T14:54:31.588+ 7f5186648b00  2 req
> > 2568229052387020224 0.0s s3:put_obj verifying requester
> >-505> 2024-01-25T14:54:31.588+ 7f5186648b00  2 req
> > 2568229052387020224 0.0s s3:put_obj normalizing buckets
> > and tenants
> >-504> 2024-01-25T14:54:31.588+ 7f5186648b00  2 req
> > 2568229052387020224 0.0s s3:put_obj init permissions
> >-503> 2024-01-25T14:54:31.588+ 7f5186648b00  2 req
> > 2568229052387020224 0.0s s3:put_obj recalculating target
> >-502> 2024-01-25T14:54:31.588+ 7f5186648b00  2 req
> > 2568229052387020224 0.0s s3:put_obj reading permissions
> >-501> 2024-01-25T14:54:31.588+ 7f5186648b00  2 req
> > 2568229052387020224 0.0s s3:put_obj init op
> >-500> 2024-01-25T14:54:31.588+ 7f5186648b00  2 req
> > 2568229052387020224 0.0s s3:put_obj verifying op mask
> >-499> 2024-01-25T14:54:31.588+ 7f5186648b00  2 req
> > 2568229052387020224 0.0s s3:put_obj verifying op permissions
> >-498> 2024-01-25T14:54:31.588+ 7f5186648b00  5 req
> > 2568229052387020224 0.0s s3:put_obj Searching permissions for
> > identity=rgw::auth::SysReqApplier ->
> > rgw::auth::LocalApplier(acct_user=redacted, acct_name=redacted,
> > subuser=, perm_mask=15, is_admin=0) mask=50
> >-497> 2024-01-25T14:54:31.588+ 7f5186648b00  5 req
> > 2568229052387020224 0.0s s3:put_obj Searching permissions for
> > uid=redacted
> >-496> 2024-01-25T14:54:31.588+ 7f5186648b00  5 req
> > 2568229052387020224 0.0s s3:put_obj Found permission: 15
> >-495> 2024-01-25T14:54:31.588+ 7f5186648b00  5 req
> > 2568229052387020224 0.0s s3:put_obj Searching permissions for
> > group=1 mask=50
> >-494> 2024-01-25T14

[ceph-users] Re: Can big data use Ceph?

2020-12-22 Thread Matt Benjamin
Ceph RGW is frequently used as a backing store for Hadoop and Spark
(S3A connector).
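
A minimal S3A configuration pointing Hadoop at an RGW endpoint looks roughly like this (the property names are standard `hadoop-aws`/S3A settings; the endpoint and credentials are placeholders):

```xml
<!-- core-site.xml -->
<configuration>
  <property><name>fs.s3a.endpoint</name><value>http://rgw-host:8080</value></property>
  <property><name>fs.s3a.path.style.access</name><value>true</value></property>
  <property><name>fs.s3a.access.key</name><value>ACCESS_KEY</value></property>
  <property><name>fs.s3a.secret.key</name><value>SECRET_KEY</value></property>
</configuration>
```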

Matt

On Tue, Dec 22, 2020 at 5:29 AM fantastic2085  wrote:
>
> Can big data use Ceph? For example, can Hive, HBase, or Spark use Ceph?
> Is https://github.com/ceph/cephfs-hadoop no longer maintained?
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>




[ceph-users] Re: Metadata for LibRADOS

2021-03-02 Thread Matt Benjamin
Hi Cary,

As you've said, these are well-developed features of RGW; I think that
would be the way to go in the Ceph ecosystem.

Matt

On Tue, Mar 2, 2021 at 3:41 PM Cary FitzHugh  wrote:
>
> Hello -
>
> We're trying to use native libRADOS and the only challenge we're running
> into is searching metadata.
>
> Using the rgw metadata sync seems to require all data to be pushed through
> the rgw, which is not something we're interested in setting up at the
> moment.
>
> Are there hooks or features of libRADOS which could be leveraged to enable
> syncing of metadata to an external system (elastic-search / postgres / etc)?
>
> Is there a way to listen to a stream of updates to a pool in real-time,
> with some guarantees I wouldn't miss things?
>
> Are there any features like this in libRADOS?
>
> Thank you
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>




[ceph-users] Re: Metadata for LibRADOS

2021-03-02 Thread Matt Benjamin
Right.  The elastic search integration--or something custom you could
base on s3 bucket notifications--would both be working with events
generated in RGW.
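[Editor's sketch of the bucket-notification route mentioned above: RGW exposes an SNS-compatible topic API, and a topic with a push endpoint can feed object events into an external indexer (elasticsearch, postgres, etc.). Everything here is illustrative only — boto3, the endpoint URLs, and the topic/rule names are assumptions, not something from this thread:]

```python
def notification_config(topic_arn,
                        events=("s3:ObjectCreated:*", "s3:ObjectRemoved:*")):
    """Build the S3 notification payload subscribing a bucket's events to one topic."""
    return {
        "TopicConfigurations": [
            {"Id": "metadata-sync",      # rule name is made up for this sketch
             "TopicArn": topic_arn,
             "Events": list(events)}
        ]
    }

def wire_up(endpoint, bucket, push_endpoint, **creds):
    """Create a push topic on RGW and attach it to one bucket (hypothetical names)."""
    import boto3  # imported lazily; boto3 itself is an assumption here
    sns = boto3.client("sns", endpoint_url=endpoint,
                       region_name="default", **creds)
    topic = sns.create_topic(Name="metadata-events",   # made-up topic name
                             Attributes={"push-endpoint": push_endpoint})
    s3 = boto3.client("s3", endpoint_url=endpoint, **creds)
    s3.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=notification_config(topic["TopicArn"]))
```

In real use the push endpoint would be an HTTP/AMQP/Kafka receiver of your own that writes the event metadata into whatever search backend you run; only data flowing through RGW generates events, as noted above.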

Matt

On Tue, Mar 2, 2021 at 3:55 PM Cary FitzHugh  wrote:
>
> Understood.
>
> With the RGW architecture comes more load balancing concerns, more moving 
> parts, more tedious (to me) ACLs, less features (append and some other things 
> not supported in S3).  Was hoping for a solution which didn't require us to 
> be hamstrung and only read / write to a pool via the gateway.
>
> If the RGW Metadata search was able to "source" its data from the OSDs and 
> sync that way, then I'd be up for setting up a skeleton implementation,  but 
> it sounds like RGW Metadata is only going to record things which are flowing 
> through the gateway.  (Is that correct?)
>
>
>
>
> On Tue, Mar 2, 2021 at 3:46 PM Matt Benjamin  wrote:
>>
>> Hi Cary,
>>
>> As you've said, these are well-developed features of RGW; I think that
>> would be the way to go in the Ceph ecosystem.
>>
>> Matt
>>
>> On Tue, Mar 2, 2021 at 3:41 PM Cary FitzHugh  wrote:
>> >
>> > Hello -
>> >
>> > We're trying to use native libRADOS and the only challenge we're running
>> > into is searching metadata.
>> >
>> > Using the rgw metadata sync seems to require all data to be pushed through
>> > the rgw, which is not something we're interested in setting up at the
>> > moment.
>> >
>> > Are there hooks or features of libRADOS which could be leveraged to enable
>> > syncing of metadata to an external system (elastic-search / postgres / 
>> > etc)?
>> >
>> > Is there a way to listen to a stream of updates to a pool in real-time,
>> > with some guarantees I wouldn't miss things?
>> >
>> > Are there any features like this in libRADOS?
>> >
>> > Thank you
>> > ___
>> > ceph-users mailing list -- ceph-users@ceph.io
>> > To unsubscribe send an email to ceph-users-le...@ceph.io
>> >
>>
>>
>>




[ceph-users] Re: x-amz-request-id logging with beast + rgw (ceph 15.2.10/containerized)?

2021-04-01 Thread Matt Benjamin
Hi Folks,

A Red Hat SA (Mustafa Aydin) suggested, some while back, a concise
formula for relaying ops-log to syslog, basically a script executing

socat unix-connect:/var/run/ceph/opslog,reuseaddr UNIX-CLIENT:/dev/log &

I haven't experimented with it.
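[Editor's note: a rough stdlib-Python equivalent of that socat one-liner, for anyone who prefers a supervised process. The socket path comes from the formula above; this is an untested sketch, adjust for your deployment:]

```python
import socket
import syslog

OPSLOG_SOCK = "/var/run/ceph/opslog"  # path used in the socat formula above

def iter_lines(chunks):
    """Reassemble newline-delimited ops-log records from a stream of byte chunks."""
    buf = b""
    for chunk in chunks:
        buf += chunk
        while b"\n" in buf:
            line, buf = buf.split(b"\n", 1)
            if line:
                yield line.decode("utf-8", "replace")

def relay(path=OPSLOG_SOCK):
    """Connect to the ops-log unix socket and forward each record to syslog."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        def chunks():
            while True:
                data = sock.recv(4096)
                if not data:
                    return
                yield data
        for record in iter_lines(chunks()):
            syslog.syslog(syslog.LOG_INFO, record)
```

You would run `relay()` under systemd or similar so it reconnects if RGW restarts; the records themselves are the same JSON the socat formula forwards.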

Matt

On Thu, Apr 1, 2021 at 12:22 PM Yuval Lifshitz  wrote:
>
> Hi David,
> Don't have any good idea for "octopus" (other than ops log), but you can do
> that (and more) in "pacific" using lua scripting on the RGW:
> https://docs.ceph.com/en/pacific/radosgw/lua-scripting/
>
> Yuval
>
> On Thu, Apr 1, 2021 at 7:11 PM David Orman  wrote:
>
> > Hi,
> >
> > Is there any way to log the x-amz-request-id along with the request in
> > the rgw logs? We're using beast and don't see an option in the
> > configuration documentation to add headers to the request lines. We
> > use centralized logging and would like to be able to search all layers
> > of the request path (edge, lbs, ceph, etc) with a x-amz-request-id.
> >
> > Right now, all we see is this:
> >
> > debug 2021-04-01T15:55:31.105+ 7f54e599b700  1 beast:
> > 0x7f5604c806b0: x.x.x.x - - [2021-04-01T15:55:31.105455+] "PUT
> > /path/object HTTP/1.1" 200 556 - "aws-sdk-go/1.36.15 (go1.15.3; linux;
> > amd64)" -
> >
> > We've also tried this:
> >
> > ceph config set global rgw_enable_ops_log true
> > ceph config set global rgw_ops_log_socket_path /tmp/testlog
> >
> > After doing this, inside the rgw container, we can socat -
> > UNIX-CONNECT:/tmp/testlog and see the log entries being recorded that
> > we want, but there has to be a better way to do this, where the logs
> > are emitted like the request logs above by beast, so that we can
> > handle it using journald. If there's an alternative that would
> > accomplish the same thing, we're very open to suggestions.
> >
> > Thank you,
> > David
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> >
> >
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>




[ceph-users] Re: RGW objects has same marker and bucket id in different buckets.

2021-04-21 Thread Matt Benjamin
Hi Morphin,

Yes, this is by design.  When an RGW object has tail chunks and is
copied so as to duplicate an entire tail chunk, RGW causes the
coincident chunk(s) to be shared.  Tail chunks are refcounted to avoid
leaks.
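[Editor's note: as a toy model of that design — head objects carry a manifest pointing at tail chunks, a copy shares the chunks instead of duplicating them, and a chunk is only garbage-collected when its refcount drops to zero. This is illustrative only, not RGW's actual data structures:]

```python
class ChunkStore:
    """Toy model of refcounted tail chunks (not the real RGW implementation)."""
    def __init__(self):
        self.refs = {}   # chunk id -> reference count

    def put(self, chunk_id):
        self.refs[chunk_id] = self.refs.get(chunk_id, 0) + 1

    def copy(self, manifest):
        # a server-side copy shares every tail chunk instead of rewriting data
        for chunk_id in manifest:
            self.refs[chunk_id] += 1
        return list(manifest)

    def delete(self, manifest):
        # dropping an object decrements each chunk; zero-ref chunks are collected
        for chunk_id in manifest:
            self.refs[chunk_id] -= 1
            if self.refs[chunk_id] == 0:
                del self.refs[chunk_id]

store = ChunkStore()
src = ["tail.1", "tail.2"]
for c in src:
    store.put(c)
dst = store.copy(src)   # old.bucket -> new.bucket copy: chunks now shared
store.delete(src)       # deleting the source object...
assert store.refs == {"tail.1": 1, "tail.2": 1}   # ...leaves the copy intact
store.delete(dst)
assert store.refs == {}  # last reference gone: chunks garbage-collected
```

This is why the copied object's stat output can show the old bucket's marker: the shared tail chunks still live where they were first written.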

Matt

On Wed, Apr 21, 2021 at 4:21 PM by morphin  wrote:
>
> Hello.
>
> I have a rgw s3 user and the user have 2 bucket.
> I tried to copy objects from old.bucket to new.bucket with rclone. (in
> the rgw client server)
> After I checked the object with "radosgw-admin --bucket=new.bucket
> object stat $i" and I saw old.bucket id and marker id also old bucket
> name in the object stats.
>
> Is rgw doing this for deduplication or is it a bug?
> If it's not a bug then If I delete the old bucket what will happen to
> these objects???
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>




[ceph-users] Re: RGW objects has same marker and bucket id in different buckets.

2021-04-22 Thread Matt Benjamin
Hi Morphin,

On Thu, Apr 22, 2021 at 3:40 AM by morphin  wrote:
>
> Thanks for the answer.
>
> I have 2 question:
> 1- If I use different user and a bucket what will happen?  Is this
> design only for same user or user independent?

It's user independent.

> 2- If I delete the Source bucket with radosgw-admin or via S3 delete,
> what will happen to these objects?
>

The refcount on each shared object will be decreased by 1.  If there
are no more references, the objects will be garbage collected.

>
>
> Also I have more questions if you have time :)

I don't have the expertise in multisite replication to debug this.  I
don't think the issue is related to RGW's copy-sharing.

Matt

>
> Some objects are pending state due to zone-sync error. I've removed
> the master zone and set secondary to master.
> I still see the pending objects in the bucket. (below)
>
> radosgw-admin --id radosgw.srv1 object stat --bucket=descript
> --object=2020/01/17/1b819bd9-5036-4ca4-98f7-b0308e1e3017
> {
> "name": "2020/01/17/1b819bd9-5036-4ca4-98f7-b0308e1e3017",
> "size": 0,
> "tag": "",
> "attrs": {
> "user.rgw.manifest": "",
> "user.rgw.olh.idtag": "ivlde1avu2l3lli6i349h62c0d79ao4u",
> "user.rgw.olh.info": "\u0001\u0001�",
> "user.rgw.olh.pending.607d4d5be0hh3lpzjd7vzt2j":
> "\u0001\u0001\u0008",
> "user.rgw.olh.pending.607d4d5c9uhlh9sf93j8lf7l":
> "\u0001\u0001\u0008",
> "user.rgw.olh.pending.607d4d5cpip1i8z8rytcnkqf":
> "\u0001\u0001\u0008",
> "user.rgw.olh.ver": "3"
> }
> }
>
>
> I overwrite these objects with rclone from old zone bucket to new
> created bucket on the same user at master zone.
>
> After a while I noticed that I'm getting a warning for these objects
> in rgw client log and the overwritten objects switching back to the
> corrupted objects.
>
> 2021-04-22 10:27:55.445 7f2d85fd4700  0 WARNING: couldn't find acl
> header for object, generating default
> 2021-04-22 10:27:55.445 7f2d85fd4700  1 == req done
> req=0x55a441452710 op status=0 http_status=200 latency=0.022s
> ==
> 2021-04-22 10:27:55.445 7f2d85fd4700  1 beast: 0x55a441452710:
> 10.151.101.15 - - [2021-04-22 10:27:55.0.44549s] "GET
> /descript/2020/01/17/1b819bd9-5036-4ca4-98f7-b0308e1e3017 HTTP/1.1"
> 200 0 - "aws-sdk-java/1.11.638 Linux/3.10.0-1160.11.1.el7.x86_64
> Java_HotSpot(TM)_64-Bit_Server_VM/25.281-b09 java/1.8.0_281
> groovy/2.5.6 vendor/Oracle_Corporation" -
>
> Am I doing something wrong?
> Also "sync error trim" does not work. How can I clean these errors and
> these pending objects?
>
> ceph version 14.2.16
>
>
> Have a great day.
> Regards.
>
>
> Matt Benjamin , 22 Nis 2021 Per, 06:08 tarihinde
> şunu yazdı:
> >
> > Hi Morphin,
> >
> > Yes, this is by design.  When an RGW object has tail chunks and is
> > copied so as to duplicate an entire tail chunk, RGW causes the
> > coincident chunk(s) to be shared.  Tail chunks are refcounted to avoid
> > leaks.
> >
> > Matt
> >
> > On Wed, Apr 21, 2021 at 4:21 PM by morphin  wrote:
> > >
> > > Hello.
> > >
> > > I have a rgw s3 user and the user have 2 bucket.
> > > I tried to copy objects from old.bucket to new.bucket with rclone. (in
> > > the rgw client server)
> > > After I checked the object with "radosgw-admin --bucket=new.bucket
> > > object stat $i" and I saw old.bucket id and marker id also old bucket
> > > name in the object stats.
> > >
> > > Is rgw doing this for deduplication or is it a bug?
> > > If it's not a bug then If I delete the old bucket what will happen to
> > > these objects???
> > > ___
> > > ceph-users mailing list -- ceph-users@ceph.io
> > > To unsubscribe send an email to ceph-users-le...@ceph.io
> > >
> >
> >
> >
>




[ceph-users] Re: global multipart lc policy in radosgw

2021-05-02 Thread Matt Benjamin
Hi Boris,

The only configuration available is per bucket.  Folks requested a
global setting before, but we've so far tried to stick with the AWS
model.
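[Editor's sketch: since the knob is per bucket only, a cluster-wide effect can be approximated by looping the same abort-incomplete-multipart rule over every bucket a credential can see. boto3 and all names/endpoints are assumptions; note also that putting a lifecycle configuration replaces any existing rules on that bucket, so merge first in real use:]

```python
def abort_mpu_rule(days=28, rule_id="abort-stale-multipart"):
    """Lifecycle rule aborting incomplete multipart uploads after `days` days."""
    return {
        "ID": rule_id,
        "Status": "Enabled",
        "Filter": {"Prefix": ""},   # apply to the whole bucket
        "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": days},
    }

def apply_everywhere(endpoint, days=28, **creds):
    """Apply the rule to every bucket this credential owns (hypothetical helper)."""
    import boto3  # imported lazily; boto3 itself is an assumption here
    s3 = boto3.client("s3", endpoint_url=endpoint, **creds)
    for bucket in s3.list_buckets()["Buckets"]:
        s3.put_bucket_lifecycle_configuration(
            Bucket=bucket["Name"],
            LifecycleConfiguration={"Rules": [abort_mpu_rule(days)]})
```

`list_buckets` only sees one user's buckets, so covering the whole cluster would mean iterating users via the admin API as well — there is still no single global switch, matching the AWS model described above.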

Matt

On Sun, May 2, 2021 at 6:38 AM Boris Behrens  wrote:
>
> Hi,
> I have a lot of multipart uploads that look like they never finished. Some
> of them date back to 2019.
>
> Is there a way to clean them up when they didn't finish in 28 days?
>
> I know I can implement a LC policy per bucket, but how do I implement it
> cluster wide?
>
> Cheers
>  Boris
>
> --
> Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
> groüen Saal.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io





[ceph-users] Re: x-amz-request-id logging with beast + rgw (ceph 15.2.10/containerized)?

2021-05-07 Thread Matt Benjamin
Hi David,

I think the solution is most likely the ops log.  It is called for
every op, and has the transaction id.

Matt

On Fri, May 7, 2021 at 4:58 PM David Orman  wrote:
>
> Hi Yuval,
>
> We've managed to get an upgrade done with the 16.2.3 release in a
> testing cluster, and we've been able to implement some of the logging
> I need via this mechanism, but the logs are emitted only when
> debug_rgw is set to 20. I don't need to log any of that level of data
> (we used centralized logging and the sheer volume of this output is
> staggering); I'm just trying to get the full request log, to include
> the transactionID, so I can match it up with the logging we do on our
> load balancer solution. Is there another mechanism to emit these logs
> at normal log levels? RGWDebugLog() doesn't appear to be what I'm
> actually looking for. My intent is to emit JSON logs using this
> mechanism, in the end, with all of the required fields for requests.
> The current "beast: " log lines don't contain the information we need,
> such as txid, which is what we're attempting to solve for - but can't
> afford to have full debug logging enabled in production clusters.
>
> Thanks!
> David
>
> On Thu, Apr 1, 2021 at 11:21 AM Yuval Lifshitz  wrote:
> >
> > Hi David,
> > Don't have any good idea for "octopus" (other than ops log), but you can do 
> > that (and more) in "pacific" using lua scripting on the RGW:
> > https://docs.ceph.com/en/pacific/radosgw/lua-scripting/
> >
> > Yuval
> >
> > On Thu, Apr 1, 2021 at 7:11 PM David Orman  wrote:
> >>
> >> Hi,
> >>
> >> Is there any way to log the x-amz-request-id along with the request in
> >> the rgw logs? We're using beast and don't see an option in the
> >> configuration documentation to add headers to the request lines. We
> >> use centralized logging and would like to be able to search all layers
> >> of the request path (edge, lbs, ceph, etc) with a x-amz-request-id.
> >>
> >> Right now, all we see is this:
> >>
> >> debug 2021-04-01T15:55:31.105+ 7f54e599b700  1 beast:
> >> 0x7f5604c806b0: x.x.x.x - - [2021-04-01T15:55:31.105455+] "PUT
> >> /path/object HTTP/1.1" 200 556 - "aws-sdk-go/1.36.15 (go1.15.3; linux;
> >> amd64)" -
> >>
> >> We've also tried this:
> >>
> >> ceph config set global rgw_enable_ops_log true
> >> ceph config set global rgw_ops_log_socket_path /tmp/testlog
> >>
> >> After doing this, inside the rgw container, we can socat -
> >> UNIX-CONNECT:/tmp/testlog and see the log entries being recorded that
> >> we want, but there has to be a better way to do this, where the logs
> >> are emitted like the request logs above by beast, so that we can
> >> handle it using journald. If there's an alternative that would
> >> accomplish the same thing, we're very open to suggestions.
> >>
> >> Thank you,
> >> David
> >> ___
> >> ceph-users mailing list -- ceph-users@ceph.io
> >> To unsubscribe send an email to ceph-users-le...@ceph.io
> >>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>




[ceph-users] Re: x-amz-request-id logging with beast + rgw (ceph 15.2.10/containerized)?

2021-05-07 Thread Matt Benjamin
That hasn't been glued up, but likely will be.  Something similar for
k8s has been discussed.  You could certainly use lua to do something
custom.  I suspect lua to customize the canned ops-log will become an
option, too.

Matt

On Fri, May 7, 2021 at 5:51 PM David Orman  wrote:
>
> Has anyone figured out an elegant way to emit this from inside cephadm
> managed/containerized ceph, so it can be handled via the host's
> journald and processed/shipped? We had gone down that path before, but
> decided to hold off on the suggestion that the LUA-based scripting
> might be a better option.
>
> David
>
> On Fri, May 7, 2021 at 4:21 PM Matt Benjamin  wrote:
> >
> > Hi David,
> >
> > I think the solution is most likely the ops log.  It is called for
> > every op, and has the transaction id.
> >
> > Matt
> >
> > On Fri, May 7, 2021 at 4:58 PM David Orman  wrote:
> > >
> > > Hi Yuval,
> > >
> > > We've managed to get an upgrade done with the 16.2.3 release in a
> > > testing cluster, and we've been able to implement some of the logging
> > > I need via this mechanism, but the logs are emitted only when
> > > debug_rgw is set to 20. I don't need to log any of that level of data
> > > (we used centralized logging and the sheer volume of this output is
> > > staggering); I'm just trying to get the full request log, to include
> > > the transactionID, so I can match it up with the logging we do on our
> > > load balancer solution. Is there another mechanism to emit these logs
> > > at normal log levels? RGWDebugLog() doesn't appear to be what I'm
> > > actually looking for. My intent is to emit JSON logs using this
> > > mechanism, in the end, with all of the required fields for requests.
> > > The current "beast: " log lines don't contain the information we need,
> > > such as txid, which is what we're attempting to solve for - but can't
> > > afford to have full debug logging enabled in production clusters.
> > >
> > > Thanks!
> > > David
> > >
> > > On Thu, Apr 1, 2021 at 11:21 AM Yuval Lifshitz  
> > > wrote:
> > > >
> > > > Hi David,
> > > > Don't have any good idea for "octopus" (other than ops log), but you 
> > > > can do that (and more) in "pacific" using lua scripting on the RGW:
> > > > https://docs.ceph.com/en/pacific/radosgw/lua-scripting/
> > > >
> > > > Yuval
> > > >
> > > > On Thu, Apr 1, 2021 at 7:11 PM David Orman  wrote:
> > > >>
> > > >> Hi,
> > > >>
> > > >> Is there any way to log the x-amz-request-id along with the request in
> > > >> the rgw logs? We're using beast and don't see an option in the
> > > >> configuration documentation to add headers to the request lines. We
> > > >> use centralized logging and would like to be able to search all layers
> > > >> of the request path (edge, lbs, ceph, etc) with a x-amz-request-id.
> > > >>
> > > >> Right now, all we see is this:
> > > >>
> > > >> debug 2021-04-01T15:55:31.105+ 7f54e599b700  1 beast:
> > > >> 0x7f5604c806b0: x.x.x.x - - [2021-04-01T15:55:31.105455+] "PUT
> > > >> /path/object HTTP/1.1" 200 556 - "aws-sdk-go/1.36.15 (go1.15.3; linux;
> > > >> amd64)" -
> > > >>
> > > >> We've also tried this:
> > > >>
> > > >> ceph config set global rgw_enable_ops_log true
> > > >> ceph config set global rgw_ops_log_socket_path /tmp/testlog
> > > >>
> > > >> After doing this, inside the rgw container, we can socat -
> > > >> UNIX-CONNECT:/tmp/testlog and see the log entries being recorded that
> > > >> we want, but there has to be a better way to do this, where the logs
> > > >> are emitted like the request logs above by beast, so that we can
> > > >> handle it using journald. If there's an alternative that would
> > > >> accomplish the same thing, we're very open to suggestions.
> > > >>
> > > >> Thank you,
> > > >> David
> > > >> ___
> > > >> ceph-users mailing list -- ceph-users@ceph.io
> > > >> To unsubscribe send an email to ceph-users-le...@ceph.io
> > > >>
> > > ___
> > > ceph-users mailing list -- ceph-users@ceph.io
> > > To unsubscribe send an email to ceph-users-le...@ceph.io
> > >
> > >
> >
> >
> >
>




[ceph-users] Re: How to organize data in S3

2021-05-24 Thread Matt Benjamin
The one side effect you might see if you actually create a million or
more buckets for a single user is a large OMAP warning, as the
mapping of buckets to owners is not sharded.

Matt

On Mon, May 24, 2021 at 1:51 AM Michal Strnad  wrote:
>
> Thank you. So we can create millions of buckets associated to only one
> S3 account without any limitation or side effect? Does anyone use it
> this way?
>
> Many thanks in advance.
> Michal
>
>
> On 5/23/21 9:42 PM, Janne Johansson wrote:
> > Many buckets.
> >
> > Den sön 23 maj 2021 kl 20:53 skrev Michal Strnad :
> >>
> >> Hi all,
> >>
> >> We need to store millions of files using S3 protocol in Ceph (version
> >> Nautilus), but have projects where isn't appropriate or possible to
> >> create a lot of S3 accounts. Is it better to have multiple S3 buckets or
> >> one bucket with sub folders?
> >>
> >> For example AWS service from Amazon allows you to create up to 100
> >> buckets in each of your AWS cloud accounts. You can request more
> >> buckets, up to a maximum quota of 1,000, by submitting a service limit
> >> increase. There is no limit on the number of objects you can store in a
> >> bucket, but in Ceph we run into a problem with listing and
> >> resharding with a millions of files in one bucket.
> >>
> >> Thank you
> >>
> >> Michal
> >>
> >> ___
> >> ceph-users mailing list -- ceph-users@ceph.io
> >> To unsubscribe send an email to ceph-users-le...@ceph.io
> >
> >
> >
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io





[ceph-users] Re: cephfs vs rbd vs rgw

2021-05-25 Thread Matt Benjamin
Hi Jorge,

I think it depends on your workload.

On Tue, May 25, 2021 at 7:43 PM Jorge Garcia  wrote:
>
> This may be too broad of a topic, or opening a can of worms, but we are
> running a CEPH environment and I was wondering if there's any guidance
> about this question:
>
> Given that some group would like to store 50-100 TBs of data on CEPH and
> use it from a linux environment, are there any advantages or
> disadvantages in terms of performance/ease of use/learning curve to
> using cephfs vs using a block device thru rbd vs using object storage
> thru rgw? Here are my general thoughts:
>
> cephfs - Until recently, you were not allowed to have multiple
> filesystems. Not sure about performance.
>

I/O performance can be /very/ good.  Metadata performance can
vary.  If you need shared POSIX access ("native" or NFS or SMB), you
need cephfs.

> rbd - Can only be mounted on one system at a time, but I guess that
> filesystem could then be served using NFS.

Yes, but it's single attach.

>
> rgw - A different usage model from regular linux file/directory
> structure. Are there advantages to forcing people to use this interface?

There are advantages.  S3 has become a preferred interface for some
applications, especially analytics (e.g., Hadoop, Spark, PrestoSql)).

>
> I'm tempted to set up 3 separate areas and try them and compare the
> results, but I'm wondering if somebody has done some similar experiment
> in the past.

Not sure, good question.

Matt

>
> Thanks for any help you can provide!
>
> Jorge
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>




[ceph-users] Re: RGW/S3 losing multipart upload objects

2022-03-17 Thread Matt Benjamin
Thanks, Soumya.

It's also possible that what's reproducing is the known (space) leak
during re-upload of multipart parts, described here:
https://tracker.ceph.com/issues/44660.
A fix for this is being worked on, it's taking a while.

Matt

On Thu, Mar 17, 2022 at 10:31 AM Soumya Koduri  wrote:
>
> On 3/17/22 17:16, Ulrich Klein wrote:
> > Hi,
> >
> > My second attempt to get help with a problem I'm trying to solve for about 
> > 6 months now.
> >
> > I have a Ceph 16.2.6 test cluster, used almost exclusively for providing 
> > RGW/S3 service, similar to a production cluster.
> >
> > The problem I have is this:
> > A client uploads (via S3) a bunch of large files into a bucket via 
> > multiparts
> > The upload(s) get interrupted and retried
> > In the end from a client's perspective all the files are visible and 
> > everything looks fine.
> > But on the cluster there are many more objects in the buckets
> > Even after cleaning out the incomplete multipart uploads there are too many 
> > objects
> > Even after deleting all the visible objects from the bucket there are still 
> > objects in the bucket
> > I have so far found no way to get rid of those left-over objects.
> > It's screwing up space accounting and I'm afraid I'll eventually have a 
> > cluster full of those lost objects.
> > The only way to clean up seems to be to copy te contents of a bucket to a 
> > new bucket and delete the screwed-up bucket. But on a production system 
> > that's not always a real option.
> >
> > I've found a variety of older threads that describe a similar problem. None 
> > of them decribing a solution :(
> >
> >
> >
> > I can pretty easily reproduce the problem with this sequence:
> >
> > On a client system create a directory with ~30 200MB files. (On a faster 
> > system I'd probably need bigger or more files)
> > tstfiles/tst01 - tst29
> >
> > run
> > $ rclone mkdir tester:/test-bucket # creates a bucket on the test system 
> > with user tester
> > Run
> > $ rclone sync -v tstfiles tester:/test-bucket/tstfiles
> > a couple of times (6-8), interrupting each one via CNTRL-C
> > Eventually let one finish.
> >
> > Now I can use s3cmd to see all the files:
> > $ s3cmd ls -lr s3://test-bucket/tstfiles
> > 2022-03-16 17:11   200M  ecb28853bd18eeae185b0b12bd47333c-40  STANDARD 
> > s3://test-bucket/tstfiles/tst01
> > ...
> > 2022-03-16 17:13   200M  ecb28853bd18eeae185b0b12bd47333c-40  STANDARD 
> > s3://test-bucket/tstfiles/tst29
> >
> > ... and to list incomplete uploads:
> > $ s3cmd multipart s3://test-bucket
> > s3://test-bucket/
> > Initiated PathId
> > 2022-03-16T17:11:19.074Z  s3://test-bucket/tstfiles/tst05 
> > 2~1nElF0c3uq5FnZ9cKlsnGlXKATvjr0g
> > ...
> > 2022-03-16T17:12:41.583Z  s3://test-bucket/tstfiles/tst28 
> > 2~exVQUILhVSmFqWxCuAflRa4Tfq4nUQa
> >
> > I can abort the uploads with
> > $  s3cmd abortmp s3://test-bucket/tstfiles/tst05 
> > 2~1nElF0c3uq5FnZ9cKlsnGlXKATvjr0g
> > ...
>
>
>
> On the latest master, I see that these objects are deleted immediately
> post abortmp. I believe this issue may have been fixed as part of [1],
> backported to v16.2.7 [2]. Maybe you could try upgrading your cluster
> and recheck.
>
>
> Thanks,
>
> Soumya
>
>
> [1] https://tracker.ceph.com/issues/53222
>
> [2] https://tracker.ceph.com/issues/53291
>
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>




[ceph-users] Re: globally disable radosgw lifecycle processing

2022-04-19 Thread Matt Benjamin
Hi Christopher,

Yes, you will need to restart the rgw instance(s).

Matt

On Tue, Apr 19, 2022 at 3:13 PM Christopher Durham  wrote:
>
>
> Hello,
> I am using radosgw with lifecycle processing on multiple buckets. I may have 
> need to globally disable lifecycle processing and do some investigation.
> Can I do that by setting rgw_lc_max_worker to 0 on my radosgw server?
> I'd rather not push rules to for every bucket with Status: Disabled, or 
> delete them all.
>
> I am using pacific 16.2.7 on Rocky Linux
>
> Thanks
> -Chris
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>




[ceph-users] Re: RGW Bucket Retrieval Notifications

2022-08-02 Thread Matt Benjamin
The RGW ops-log would be one way of capturing that sort of information.

Matt

On Tue, Aug 2, 2022 at 2:36 PM Kevin Seales 
wrote:

>
> I do not see a notification event for when objects are retrieved from a S3
> bucket.  Are there any other options available to grab this information
> easily?  We are looking to use this information for internal audit
> reporting/dashboards.  I see it can be parsed out of the log files, but I'm
> hoping there may be an easier way.
>
> Cheers
> Kevin
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>



[ceph-users] Re: Downside of many rgw bucket shards?

2022-08-29 Thread Matt Benjamin
We choose prime number shard counts, yes.
Indexless buckets do increase insert-delete performance, but by definition,
though, an indexless bucket cannot be listed.
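[Editor's sketch of both points — picking a prime shard count from an expected object count (assuming the usual ~100k-objects-per-shard target), and why listing cost grows with shard count: object names are hashed across shards, so a listing has to merge every shard back into one ordering. Purely illustrative, not RGW code:]

```python
import heapq

def next_prime(n):
    """Smallest prime >= n, for choosing a shard count from an object estimate."""
    def is_prime(k):
        if k < 2:
            return False
        f = 2
        while f * f <= k:
            if k % f == 0:
                return False
            f += 1
        return True
    while not is_prime(n):
        n += 1
    return n

# e.g. ~10M expected objects at roughly 100k objects per shard
shards = next_prime(10_000_000 // 100_000)

def list_bucket(shard_indexes):
    """Merge per-shard (already sorted) index listings into one ordered stream,
    the way a bucket listing must touch every shard."""
    return heapq.merge(*shard_indexes)
```

With more shards, every listing batch fans out to more index objects, which is why write scaling and listing cost pull in opposite directions.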

Matt

On Mon, Aug 29, 2022 at 1:46 PM Anthony D'Atri 
wrote:

> Do I recall that the number of shards is ideally odd, or even prime?
> Performance might be increased by indexless buckets if the application can
> handle
>
> > On Aug 29, 2022, at 10:06 AM, J. Eric Ivancich 
> wrote:
> >
> > Generally it’s a good thing. There’s less contention for bucket index
> updates when, for example, lots of writes are happening together. Dynamic
> resharding will take things up to 1999 shards on its own with the default
> config.
> >
> > Given that we use hashing of objet names to determine which shard they
> go to, the most complicated operation is bucket listing, which has to
> retrieve entries from each shard, order them, and return them to the
> client. And it has to do this in batches of about 1000 at a time.
> >
> > It looks like you’re expecting on the order of 10,000,000 objects in
> these buckets, so I imagine you’re not going to be listing them with any
> regularity.
> >
> > Eric
> > (he/him)
> >
> >> On Aug 29, 2022, at 12:06 PM, Boris Behrens  wrote:
> >>
> >> Hi there,
> >>
> >> I have some buckets that would require >100 shards and I would like to
> ask
> >> if there are any downsides to have these many shards on a bucket?
> >>
> >> Cheers
> >> Boris
> >> ___
> >> ceph-users mailing list -- ceph-users@ceph.io
> >> To unsubscribe send an email to ceph-users-le...@ceph.io
> >>
> >
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
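The ordered listing Eric describes above (retrieve entries from each shard, order them, return them in batches) is essentially a k-way merge; a hypothetical sketch, not RGW's code:

```python
import heapq
from itertools import islice

# Pretend each index shard holds a sorted slice of the bucket's keys.
shards = [sorted(f"obj-{i:06d}" for i in range(s, 30, 3)) for s in range(3)]

def list_bucket(shards, batch=1000):
    """Merge per-shard sorted listings into one ordered stream, in batches."""
    merged = heapq.merge(*shards)   # k-way merge across all shards
    while True:
        chunk = list(islice(merged, batch))
        if not chunk:
            break
        yield chunk

batches = list(list_bucket(shards, batch=10))
```

Unordered listing (the RGW extension mentioned elsewhere in this list) can skip the merge and read shards independently, which is why it is cheaper.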


-- 

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Versioning of objects in the archive zone

2022-10-04 Thread Matt Benjamin
Hi,

Please review https://github.com/ceph/ceph/pull/46928

thanks,

Matt
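Assuming the PR above adds lifecycle handling for the archive zone, the standard S3 way to bound noncurrent versions is a rule like the following (ordinary S3 lifecycle syntax; whether the archive zone honors it depends on that work):

```json
{
  "Rules": [
    {
      "ID": "cap-archived-versions",
      "Status": "Enabled",
      "Prefix": "",
      "NoncurrentVersionExpiration": { "NoncurrentDays": 30 }
    }
  ]
}
```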

On Tue, Oct 4, 2022 at 10:37 AM Beren beren  wrote:

> Hi,
> Is it possible to manage the number of versions of objects in the archive
> zone ? (https://docs.ceph.com/en/latest/radosgw/archive-sync-module/)
>
> If I can't manage the number of versions, then sooner or later the versions
> will kill the entire cluster:(
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
>

-- 

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: why rgw generates large quantities orphan objects?

2022-10-14 Thread Matt Benjamin
> POOL_NAME                       USED   OBJECTS  CLONES     COPIES  MISSING_ON_PRIMARY  UNFOUND  DEGRADED     RD_OPS       RD     WR_OPS       WR  USED COMPR  UNDER COMPR
> …                                                   0        0         0      28749   28 MiB          0      0 B         0 B          0 B
> cephfs-metadata              932 MiB     14772       0      44316  0  0  0    1569690  3.8 GiB    1258651  3.4 GiB  0 B  0 B
> cephfs-replicated-pool       738 GiB    300962       0     902886  0  0  0     794612  470 GiB     770689  245 GiB  0 B  0 B
> deeproute-replica-hdd-pool  1016 GiB    104276       0     312828  0  0  0   18176216  298 GiB  441783780  6.7 TiB  0 B  0 B
> deeproute-replica-ssd-pool    30 GiB      3691       0      11073  0  0  0    2466079  2.1 GiB    8416232  221 GiB  0 B  0 B
> device_health_metrics         50 MiB       108       0        324  0  0  0       1836  1.8 MiB       1944   18 MiB  0 B  0 B
> os-test.rgw.buckets.data     5.6 TiB  39844453       0  239066718  0  0  0  552896177  3.0 TiB  999441015   60 TiB  0 B  0 B
> os-test.rgw.buckets.index    1.8 GiB        33       0         99  0  0  0  153600295  154 GiB  110916573   62 GiB  0 B  0 B
> os-test.rgw.buckets.non-ec   2.1 MiB        45       0        135  0  0  0     574240  349 MiB     153725  139 MiB  0 B  0 B
> os-test.rgw.control              0 B         8       0         24  0  0  0          0      0 B          0      0 B  0 B  0 B
> os-test.rgw.log              3.7 MiB       346       0       1038  0  0  0   83877803   80 GiB    6306730  7.6 GiB  0 B  0 B
> os-test.rgw.meta             220 KiB        23       0         69  0  0  0     640854  506 MiB     108229   53 MiB  0 B  0 B
>
> total_objects40268737
> total_used   7.8 TiB
> total_avail  1.1 PiB
> total_space  1.1 PiB
> ```
> ceph verison:
> ```
> [root@node01 /]# ceph versions
> {
> "mon": {
> "ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17)
> pacific (stable)": 3
> },
> "mgr": {
> "ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17)
> pacific (stable)": 2
> },
> "osd": {
> "ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17)
> pacific (stable)": 108
>     },
>     "mds": {
> "ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17)
> pacific (stable)": 2
> },
> "rgw": {
> "ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17)
> pacific (stable)": 9
> },
> "overall": {
> "ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17)
> pacific (stable)": 124
> }
> }
> ```
>
> Thanks,
> Best regards
> Liang Zheng
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rgw with unix socket

2022-10-17 Thread Matt Benjamin
Hi Rok,

I think it was planned, years ago, to drop the fcgi front-end; I'm not sure
whether Pacific still has it.

Matt


On Mon, Oct 17, 2022 at 11:31 AM Rok Jaklič  wrote:

> Hi,
>
> I try to configure ceph with rgw and unix socket (based on
> https://docs.ceph.com/en/pacific/man/8/radosgw/?highlight=radosgw). I have
> in ceph.conf something like this:
> [client.radosgw.ctplmon3]
> host = ctplmon3
> rgw socket path = /var/run/ceph/ceph.radosgw.gateway.fastcgi.sock
> log file = /var/log/ceph/client.radosgw.ctplmon3.log
> rgw print continue = false
>
> When I start radosgw with:
> radosgw -c /etc/ceph/ceph.conf --setuser ceph --setgroup ceph -n
> client.radosgw.ctplmon3
>
> I get in logs:
> 2022-10-17T17:11:22.925+0200 7f75c72545c0  0 ceph version 16.2.10
> (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable), process
> radosgw, pid 4077748
> 2022-10-17T17:11:22.925+0200 7f75c72545c0  0 framework: beast
> 2022-10-17T17:11:22.925+0200 7f75c72545c0  0 framework conf key: port, val:
> 7480
> 2022-10-17T17:11:22.925+0200 7f75c72545c0  1 radosgw_Main not setting numa
> affinity
> 2022-10-17T17:11:24.863+0200 7f75c72545c0  0 framework: beast
> 2022-10-17T17:11:24.863+0200 7f75c72545c0  0 framework conf key:
> ssl_certificate, val: config://rgw/cert/$realm/$zone.crt
> 2022-10-17T17:11:24.863+0200 7f75c72545c0  0 framework conf key:
> ssl_private_key, val: config://rgw/cert/$realm/$zone.key
> 2022-10-17T17:11:24.863+0200 7f75c72545c0  0 starting handler: beast
> 2022-10-17T17:11:24.867+0200 7f75c72545c0  0 set uid:gid to 167:167
> (ceph:ceph)
> 2022-10-17T17:11:24.904+0200 7f75c72545c0  1 mgrc service_daemon_register
> rgw.11621227 metadata {arch=x86_64,ceph_release=pacific,ceph_version=ceph
> version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific
> (stable),ceph_version_short=16.2.10,cpu=Intel(R) Xeon(R) Silver 4114 CPU @
> 2.20GHz,distro=almalinux,distro_description=AlmaLinux 8.6 (Sky
> Tiger),distro_version=8.6,frontend_config#0=beast
> port=7480,frontend_type#0=beast,hostname=ctplmon3.arnes.si
> ,id=radosgw.ctplmon3,kernel_description=#1
> SMP Tue Aug 2 13:42:59 EDT
>
> 2022,kernel_version=4.18.0-372.19.1.el8_6.x86_64,mem_swap_kb=8388604,mem_total_kb=65325948,num_handles=1,os=Linux,pid=4077748,realm_id=,realm_name=,zone_id=c2c70444-7a41-4acd-a0d0-9f87d324ec72,zone_name=default,zonegroup_id=b1e0d55c-f7cb-4e73-b1cb-6cffa1fd6578,zonegroup_name=default}
> 2022-10-17T17:20:37.712+0200 7f75ae9a2700 -1 received  signal: Interrupt,
> si_code : 128, si_value (int): 0, si_value (ptr): 0, si_errno: 0, si_pid :
> 0, si_uid : 0, si_addr0, si_status0
>
> ... where it seems it started rgw on port and ip.
>
> Looking at:
> https://github.com/ceph/ceph/blob/quincy/src/rgw/rgw_asio_frontend.cc
>
> I do not see any reference to handling rgw on unix sockets. Is this even
> implemented?
>
> Kind regards,
> Rok
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>

-- 

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: s3gw v0.7.0 released

2022-10-20 Thread Matt Benjamin
The ability to run as a stand-alone service without a RADOS service comes
from the Zipper API work, which is part of upstream Ceph RGW, obviously.
It should relatively soon be possible to load new Zipper store drivers
(backends) at runtime, so there won't be a need to maintain a fork of Ceph
RGW.

regards,

Matt

On Thu, Oct 20, 2022 at 1:34 PM Joao Eduardo Luis  wrote:

> # s3gw v0.7.0
>
> The s3gw team is announcing the release of s3gw v0.7.0. This release
> contains fixes to known bugs and new features. This includes an early
> version of an object explorer via the web-based UI. See the CHANGELOG
> below for more information.
>
> This project is still under early-stage development and is not
> recommended for production systems and upgrades are not guaranteed to
> succeed from one version to another. Additionally, although we strive
> for API parity with RADOSGW, features may still be missing.
>
> Do not hesitate to provide constructive feedback.
>
> ## CHANGELOG
>
> Exciting changes include:
>
> - Bucket management features for non-admin users (create/update/delete
> buckets) on the UI.
> - Different improvements on the UI.
> - Several bug fixes.
> - Improved charts.
>
> Full changelog can be found at
> https://github.com/aquarist-labs/s3gw/releases/tag/v0.7.0
>
> ## OBTAINING s3gw
>
> Container images can be found on GitHub’s container registry:
>
>  ghcr.io/aquarist-labs/s3gw:v0.7.0
>  ghcr.io/aquarist-labs/s3gw-ui:v0.7.0
>
> Additionally, a helm chart [1] is available at ArtifactHUB:
>
>  https://artifacthub.io/packages/helm/s3gw/s3gw
>
> For additional information, see the documentation:
>
>  https://s3gw-docs.readthedocs.io/en/latest/
>
> ## WHAT IS s3gw
>
> s3gw is an S3-compatible service that focuses on deployment within a
> Kubernetes environment backed by any PVC, including Longhorn [2]. Since
> its inception, the primary focus has been on Cloud Native deployments.
> However, s3gw can be deployed in a myriad of scenarios (including a
> standalone container), provided it has some form of storage attached.
>
> s3gw is based on Ceph’s RADOSGW but runs as a stand-alone service
> without the RADOS cluster and relies on a storage backend still under
> heavy development by the storage team at SUSE. Additionally, the s3gw
> team is developing a web-based UI for management and an object explorer.
>
> More information can be found at https://aquarist-labs.io/s3gw/ or
> https://github.com/aquarist-labs/s3gw/ .
>
>-Joao and the s3gw team
>
> [1] https://github.com/aquarist-labs/s3gw-charts
> [2] https://longhorn.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: s3gw v0.7.0 released

2022-10-20 Thread Matt Benjamin
And to clarify, too: this Aquarium work is the first attempt by folks to build
a file-backed storage setup, and it's great to see innovation around this.

Matt

On Thu, Oct 20, 2022 at 1:50 PM Joao Eduardo Luis  wrote:

> On 2022-10-20 17:46, Matt Benjamin wrote:
> > The ability to run as a stand-alone service without a RADOS service
> > comes
> > from the Zipper API work, which is part of upstream Ceph RGW,
> > obviously.
> > It should relatively soon be possible to load new Zipper store drivers
> > (backends) at runtime, so there won't be a need to maintain a fork of
> > Ceph
> > RGW.
>
> Indeed it does. None of this would be possible without Zipper and the
> SAL abstraction work. :)
>
>-Joao
>
> >
> > regards,
> >
> > Matt
> >
> > On Thu, Oct 20, 2022 at 1:34 PM Joao Eduardo Luis 
> > wrote:
> >
> >> # s3gw v0.7.0
> >>
> >> The s3gw team is announcing the release of s3gw v0.7.0. This release
> >> contains fixes to known bugs and new features. This includes an early
> >> version of an object explorer via the web-based UI. See the CHANGELOG
> >> below for more information.
> >>
> >> This project is still under early-stage development and is not
> >> recommended for production systems and upgrades are not guaranteed to
> >> succeed from one version to another. Additionally, although we strive
> >> for API parity with RADOSGW, features may still be missing.
> >>
> >> Do not hesitate to provide constructive feedback.
> >>
> >> ## CHANGELOG
> >>
> >> Exciting changes include:
> >>
> >> - Bucket management features for non-admin users (create/update/delete
> >> buckets) on the UI.
> >> - Different improvements on the UI.
> >> - Several bug fixes.
> >> - Improved charts.
> >>
> >> Full changelog can be found at
> >> https://github.com/aquarist-labs/s3gw/releases/tag/v0.7.0
> >>
> >> ## OBTAINING s3gw
> >>
> >> Container images can be found on GitHub’s container registry:
> >>
> >>  ghcr.io/aquarist-labs/s3gw:v0.7.0
> >>  ghcr.io/aquarist-labs/s3gw-ui:v0.7.0
> >>
> >> Additionally, a helm chart [1] is available at ArtifactHUB:
> >>
> >>  https://artifacthub.io/packages/helm/s3gw/s3gw
> >>
> >> For additional information, see the documentation:
> >>
> >>  https://s3gw-docs.readthedocs.io/en/latest/
> >>
> >> ## WHAT IS s3gw
> >>
> >> s3gw is an S3-compatible service that focuses on deployment within a
> >> Kubernetes environment backed by any PVC, including Longhorn [2].
> >> Since
> >> its inception, the primary focus has been on Cloud Native deployments.
> >> However, s3gw can be deployed in a myriad of scenarios (including a
> >> standalone container), provided it has some form of storage attached.
> >>
> >> s3gw is based on Ceph’s RADOSGW but runs as a stand-alone service
> >> without the RADOS cluster and relies on a storage backend still under
> >> heavy development by the storage team at SUSE. Additionally, the s3gw
> >> team is developing a web-based UI for management and an object
> >> explorer.
> >>
> >> More information can be found at https://aquarist-labs.io/s3gw/ or
> >> https://github.com/aquarist-labs/s3gw/ .
> >>
> >>-Joao and the s3gw team
> >>
> >> [1] https://github.com/aquarist-labs/s3gw-charts
> >> [2] https://longhorn.io
> >> ___
> >> ceph-users mailing list -- ceph-users@ceph.io
> >> To unsubscribe send an email to ceph-users-le...@ceph.io
> >>
> >
> >
> > --
> >
> > Matt Benjamin
> > Red Hat, Inc.
> > 315 West Huron Street, Suite 140A
> > Ann Arbor, Michigan 48103
> >
> > http://www.redhat.com/en/technologies/storage
> >
> > tel.  734-821-5101
> > fax.  734-769-8938
> > cel.  734-216-5309
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
>

-- 

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: S3 key prefixes and performance impact on Ceph?

2020-05-22 Thread Matt Benjamin
Hi,

The current behavior is effectively that of a flat namespace.  As the
number of objects in a bucket becomes large, RGW partitions the index,
and a hash of the key name is used to place it.  Reads on the
partitions are done in parallel (unless unordered listing is
requested, an RGW extension).

Matt
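To illustrate the flat-namespace point: because the full key name is hashed for index placement, keys sharing a "hot" prefix still scatter across shards, so prefix design doesn't create hot spots the way the AWS guidance suggests. A toy sketch (hypothetical Python; RGW's actual hash differs):

```python
import hashlib
from collections import Counter

def shard_for_key(key: str, num_shards: int = 101) -> int:
    # Toy stand-in for an index-shard hash; only the distribution matters here.
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % num_shards

# 10,000 keys that all share one date/city prefix still land on
# (nearly) every shard, because the *whole* key is hashed.
hot_keys = [f"20200522/london/data-{i}.csv" for i in range(10_000)]
spread = Counter(shard_for_key(k) for k in hot_keys)

print(len(spread), max(spread.values()))
```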

On Fri, May 22, 2020 at 8:39 AM  wrote:
>
> I've just set up a Ceph cluster and I'm accessing it via object gateway with 
> S3 API.
>
> One thing I don't see documented anywhere is - how does Ceph performance 
> scale with S3 key prefixes?
>
> In AWS S3, performance scales linearly with key prefix (see: 
> https://docs.aws.amazon.com/AmazonS3/latest/dev/optimizing-performance.html). 
> I see the keys as a nested hash table or nodes of a prefix tree, where each 
> prefix is stored in closer proximity at a hardware level - you want to spread 
> reads evenly over prefixes to avoid parallel I/O being concentrated on the 
> same hot spots.
>
> So for example if my access pattern regularly involves scanning data through 
> multiple dates for a single city, this key structure will be more effective: 
> `mmdd/city/data.csv`. Whereas if my access pattern involves scanning 
> through different cities on a single date, `city/mmdd/data.csv` would be 
> more effective.
>
> How about Ceph? Does naming convention of the key prefixes have an effect on 
> Ceph's object gateway performance or does it treat the full object "paths" as 
> a completely flat namespace?
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RGW Garbage Collector

2020-05-24 Thread Matt Benjamin
Hi Manuel,

rgw_gc_obj_min_wait -- yes, this is how you control how long rgw waits
before removing the stripes of deleted objects

the following govern gc performance and the proportion of available iops gc uses:
rgw_gc_processor_max_time -- controls how long gc runs once scheduled;
 a large value might be 3600
rgw_gc_processor_period -- sets the gc cycle;  smaller is more frequent

If you want to make gc more aggressive when it is running, set the
following (which can be increased further, and which double the defaults):

rgw_gc_max_concurrent_io = 20
rgw_gc_max_trim_chunk = 32

If you want to increase gc fraction of total rgw i/o, increase these
(mostly, concurrent_io).
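Putting those suggestions together for the 10-minute goal in the original question, an illustrative ceph.conf fragment (values are examples to tune, not recommendations; check the defaults on your release):

```ini
[client.rgw]
rgw_gc_obj_min_wait = 600          ; keep deleted stripes only 10 minutes
rgw_gc_processor_period = 600      ; schedule gc cycles every 10 minutes
rgw_gc_processor_max_time = 3600   ; let a scheduled pass run up to an hour
rgw_gc_max_concurrent_io = 20      ; roughly double the gc parallelism
rgw_gc_max_trim_chunk = 32
```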

regards,

Matt

On Sun, May 24, 2020 at 4:02 PM EDH - Manuel Rios
 wrote:
>
> Hi,
>
> Im looking for any experience optimizing garbage collector with the next 
> configs:
>
> global  advanced rgw_gc_obj_min_wait
> global  advanced rgw_gc_processor_max_time
> global  advanced rgw_gc_processor_period
>
> By default gc expire objects within 2 hours, we're looking to define expire 
> in 10 minutes as our S3 cluster got heavy uploads and deletes.
>
> Are those params usable? For us it doesn't make sense to keep deleted
> objects in the gc for 2 hours.
>
> Regards
> Manuel
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RGW multi-object delete failing with 403 denied

2020-07-11 Thread Matt Benjamin
> 2020-07-11T17:55:54.038+0100 7f45adad7700  5 Searching permissions for
> group=2 mask=50
> 2020-07-11T17:55:54.038+0100 7f45adad7700  5 Permissions for group not found
> 2020-07-11T17:55:54.038+0100 7f45adad7700  5 req 15 0.00402s
> s3:multi_object_delete -- Getting permissions done for
> identity=rgw::auth::SysReqApplier ->
> rgw::auth::LocalApplier(acct_user=j, acct_name=J, subuser=,
> perm_mask=15, is_admin=0), owner=c, perm=0
> 2020-07-11T17:55:54.038+0100 7f45adad7700 10 req 15 0.00402s
> s3:multi_object_delete  identity=rgw::auth::SysReqApplier ->
> rgw::auth::LocalApplier(acct_user=j, acct_name=J, subuser=,
> perm_mask=15, is_admin=0) requested perm (type)=2, policy perm=0,
> user_perm_mask=2, acl perm=0
> 2020-07-11T17:55:54.038+0100 7f45adad7700  1 op->ERRORHANDLER:
> err_no=-13 new_err_no=-13
> 2020-07-11T17:55:54.038+0100 7f45adad7700  2 req 15 0.00402s
> s3:multi_object_delete op status=0
> 2020-07-11T17:55:54.038+0100 7f45adad7700  2 req 15 0.00402s
> s3:multi_object_delete http status=403
> 2020-07-11T17:55:54.038+0100 7f45adad7700  1 == req done
> req=0x7f45adaced50 op status=0 http_status=403 latency=0.00402s ==
> 2020-07-11T17:55:54.038+0100 7f45adad7700 20 process_request() returned -13
> 2020-07-11T17:55:54.038+0100 7f45adad7700  1 civetweb: 0x5628b9424000:
> 192.168.80.135 - - [11/Jul/2020:17:55:54 +0100] "POST /mybucket/?delete
> HTTP/1.1" 403 464 - aws-sdk-java/1.11.820 Linux/5.7.7-200.fc32.x86_64
> OpenJDK_64-Bit_Server_VM/14.0.1+7 java/14.0.1 vendor/Red_Hat,_Inc.
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io



-- 

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RGW versioned objects lost after Octopus 15.2.3 -> 15.2.4 upgrade

2020-08-05 Thread Matt Benjamin
Hi Chris,

There is new lifecycle processing logic backported to Octopus, it
looks like, in 15.2.3.  I'm looking at the non-current calculation to
see if it could incorrectly rely on a stale value (from an earlier
entry).

thanks,

Matt
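For reference, the intended NoncurrentDays semantics (a version expires N days after it is overwritten and becomes noncurrent, never "everything but the latest at once") can be modeled like this (toy Python, not RGW's implementation):

```python
from datetime import datetime, timedelta

def noncurrent_expired(version_times, now, noncurrent_days):
    """Return the version timestamps eligible for expiration.

    A version becomes *noncurrent* the moment the next version overwrites
    it, and expires once it has been noncurrent for `noncurrent_days`.
    """
    times = sorted(version_times)
    expired = []
    for v, successor in zip(times, times[1:]):  # newest version stays current
        noncurrent_since = successor            # when v was overwritten
        if now - noncurrent_since > timedelta(days=noncurrent_days):
            expired.append(v)
    return expired

now = datetime(2020, 8, 5)
daily = [datetime(2020, 7, 1) + timedelta(days=i) for i in range(30)]
# With NoncurrentDays=18, only versions overwritten more than 18 days
# ago should expire -- a gently rolling window, as Chris expected.
print(noncurrent_expired(daily, now, 18))
```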

On Wed, Aug 5, 2020 at 8:52 AM Chris Palmer  wrote:
>
> This is starting to look like a regression error in Octopus 15.2.4.
>
> After cleaning things up by deleting all old versions, and deleting and
> recreating the bucket lifecycle policy (see below), I then let it run.
> Each day a new version got created, dating back to 17 July (correct).
> Until this morning when we went over the 18-day non-current expiration:
> this morning all but the latest version of each object disappeared -
> which is what happened the midnight following after our 15.2.3->15.2.4
> upgrade.
>
> So instead of a gently rolling 18-days of versions (on 15.2.3), we now
> build up to 18 days after which all non-current versions get deleted (on
> 15.2.4).
>
> Anyone come across versioning problems on 15.2.4?
>
> Thanks, Chris
>
> On 17/07/2020 09:11, Chris Palmer wrote:
> > This got worse this morning. An RGW daemon crashed at midnight with a
> > segfault, and the backtrace hints that it was processing the
> > expiration rule:
> >
> > "backtrace": [
> > "(()+0x12730) [0x7f97b8c4e730]",
> > "(()+0x15878a) [0x7f97b862378a]",
> > "(std::__cxx11::basic_string<char, std::char_traits<char>,
> > std::allocator<char> >::compare(std::__cxx11::basic_string<char,
> > std::char_traits<char>, std::allocator<char> > const&) const+0x23)
> > [0x7f97c25d3e43]",
> > "(LCOpAction_DMExpiration::check(lc_op_ctx&,
> > std::chrono::time_point > std::chrono::duration >
> > >*)+0x87) [0x7f97c283d127]",
> > "(LCOpRule::process(rgw_bucket_dir_entry&, DoutPrefixProvider
> > const*)+0x1b8) [0x7f97c281cbc8]",
> > "(()+0x5b836d) [0x7f97c281d36d]",
> > "(WorkQ::entry()+0x247) [0x7f97c28302d7]",
> > "(()+0x7fa3) [0x7f97b8c43fa3]",
> > "(clone()+0x3f) [0x7f97b85c44cf]"
> >
> > One object version got removed when it should not have.
> >
> > In an attempt to clean things up I have manually deleted all
> > non-current versions, and removed and recreated the (same) lifecycle
> > policy. I will also create a new test bucket with a similar policy and
> > test that in parallel. We will see what happens tomorrow
> >
> > Thanks, Chris
> >
> >
> > On 16/07/2020 08:22, Chris Palmer wrote:
> >> I have an RGW bucket (backups) that is versioned. A nightly job
> >> creates a new version of a few objects. There is a lifecycle policy
> >> (see below) that keeps 18 days of versions. This has been working
> >> perfectly and has not been changed. Until I upgraded Octopus...
> >>
> >> The nightly job creates separate log files, including a listing of
> >> the object versions. From these I can see that:
> >>
> >> 13/7  02:14   versions from 13/7 01:13 back to 24/6 01:17 (correct)
> >>
> >> 14/7  02:14   versions from 14/7 01:13 back to 25/6 01:14 (correct)
> >>
> >> 14/7  10:00   upgrade Octopus 15.2.3 -> 15.2.4
> >>
> >> 15/7  02:14   versions from 15/7 01:13 back to 25/6 01:14 (would have
> >> expected 25/6 to have expired)
> >>
> >> 16/7  02:14   versions from 16/7 01:13 back to 15/7 01:13 (now all
> >> pre-upgrade versions have wrongly disappeared)
> >>
> >> It's not a big deal for me as they are only backups, providing it
> >> continues to work correctly from now on. However it may affect some
> >> other people  much more.
> >>
> >> Any ideas on the root cause? And if it is likely to be stable again now?
> >>
> >> Thanks, Chris
> >>
> >> {
> >> "Rules": [
> >> {
> >> "Expiration": {
> >> "ExpiredObjectDeleteMarker": true
> >> },
> >> "ID": "Expiration & incomplete uploads",
> >> "Prefix": "",
> >>     "Status": "Enabled",
> >> "NoncurrentVersionExpiration": {
> >> "NoncurrentDays": 18
> >> },
> >> "AbortIncompleteMultipartUpload": {
> >> "DaysAfterInitiation": 1
> >> }
> >> }
> >> ]
> >> }
> >>
> >> ___
> >> ceph-users mailing list -- ceph-users@ceph.io
> >> To unsubscribe send an email to ceph-users-le...@ceph.io
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io



-- 

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RGW versioned objects lost after Octopus 15.2.3 -> 15.2.4 upgrade

2020-08-05 Thread Matt Benjamin
Hi Chris,

I've confirmed that the issue you're experiencing is addressed in
lifecycle commits that were required but missed during the backport to
Octopus.  I'll work with the backport team to address this quickly.

Thanks for providing the detailed reproducer information, it was very
helpful in identifying the issue.

Matt

On Wed, Aug 5, 2020 at 9:23 AM Matt Benjamin  wrote:
>
> Hi Chris,
>
> There is new lifecycle processing logic backported to Octopus, it
> looks like, in 15.2.3.  I'm looking at the non-current calculation to
> see if it could incorrectly rely on a stale value (from an earlier
> entry).
>
> thanks,
>
> Matt
>
> On Wed, Aug 5, 2020 at 8:52 AM Chris Palmer  
> wrote:
> >
> > This is starting to look like a regression error in Octopus 15.2.4.
> >
> > After cleaning things up by deleting all old versions, and deleting and
> > recreating the bucket lifecycle policy (see below), I then let it run.
> > Each day a new version got created, dating back to 17 July (correct).
> > Until this morning when we went over the 18-day non-current expiration:
> > this morning all but the latest version of each object disappeared -
> > which is what happened the midnight following after our 15.2.3->15.2.4
> > upgrade.
> >
> > So instead of a gently rolling 18-days of versions (on 15.2.3), we now
> > build up to 18 days after which all non-current versions get deleted (on
> > 15.2.4).
> >
> > Anyone come across versioning problems on 15.2.4?
> >
> > Thanks, Chris
> >
> > On 17/07/2020 09:11, Chris Palmer wrote:
> > > This got worse this morning. An RGW daemon crashed at midnight with a
> > > segfault, and the backtrace hints that it was processing the
> > > expiration rule:
> > >
> > > "backtrace": [
> > > "(()+0x12730) [0x7f97b8c4e730]",
> > > "(()+0x15878a) [0x7f97b862378a]",
> > > "(std::__cxx11::basic_string<char, std::char_traits<char>,
> > > std::allocator<char> >::compare(std::__cxx11::basic_string<char,
> > > std::char_traits<char>, std::allocator<char> > const&) const+0x23)
> > > [0x7f97c25d3e43]",
> > > "(LCOpAction_DMExpiration::check(lc_op_ctx&,
> > > std::chrono::time_point > > std::chrono::duration >
> > > >*)+0x87) [0x7f97c283d127]",
> > > "(LCOpRule::process(rgw_bucket_dir_entry&, DoutPrefixProvider
> > > const*)+0x1b8) [0x7f97c281cbc8]",
> > > "(()+0x5b836d) [0x7f97c281d36d]",
> > > "(WorkQ::entry()+0x247) [0x7f97c28302d7]",
> > > "(()+0x7fa3) [0x7f97b8c43fa3]",
> > > "(clone()+0x3f) [0x7f97b85c44cf]"
> > >
> > > One object version got removed when it should not have.
> > >
> > > In an attempt to clean things up I have manually deleted all
> > > non-current versions, and removed and recreated the (same) lifecycle
> > > policy. I will also create a new test bucket with a similar policy and
> > > test that in parallel. We will see what happens tomorrow
> > >
> > > Thanks, Chris
> > >
> > >
> > > On 16/07/2020 08:22, Chris Palmer wrote:
> > >> I have an RGW bucket (backups) that is versioned. A nightly job
> > >> creates a new version of a few objects. There is a lifecycle policy
> > >> (see below) that keeps 18 days of versions. This has been working
> > >> perfectly and has not been changed. Until I upgraded Octopus...
> > >>
> > >> The nightly job creates separate log files, including a listing of
> > >> the object versions. From these I can see that:
> > >>
> > >> 13/7  02:14   versions from 13/7 01:13 back to 24/6 01:17 (correct)
> > >>
> > >> 14/7  02:14   versions from 14/7 01:13 back to 25/6 01:14 (correct)
> > >>
> > >> 14/7  10:00   upgrade Octopus 15.2.3 -> 15.2.4
> > >>
> > >> 15/7  02:14   versions from 15/7 01:13 back to 25/6 01:14 (would have
> > >> expected 25/6 to have expired)
> > >>
> > >> 16/7  02:14   versions from 16/7 01:13 back to 15/7 01:13 (now all
> > >> pre-upgrade versions have wrongly disappeared)
> > >>
> > >> It's not a big deal for me as they are only backups, providing it
> > >> continues to work correctly from now on. However it may affect some
> > >> other people  much more.
> > >>
> > >> Any ideas on the root cause? And if it is likely to be stable again now?

[ceph-users] Re: RGW versioned objects lost after Octopus 15.2.3 -> 15.2.4 upgrade

2020-08-05 Thread Matt Benjamin
The lifecycle changes in question do not change the semantics or any API of
lifecycle.  The behavior change was a regression.

regards,

Matt

On Wed, Aug 5, 2020 at 12:12 PM Daniel Poelzleithner  wrote:
>
> On 2020-08-05 15:23, Matt Benjamin wrote:
>
> > There is new lifecycle processing logic backported to Octopus, it
> > looks like, in 15.2.3.  I'm looking at the non-current calculation to
> > see if it could incorrectly rely on a stale value (from an eralier
> > entry).
>
> So, you don't care about semver?
>
> Replacing lifecycle processing logic in a patch version is, sorry to
> say this, a no-go. At least don't give the impression of using semantic
> versioning then.
>
> kind regards
>  poelzi
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RGW unable to delete a bucket

2020-08-06 Thread Matt Benjamin
Hi Folks,

I don't know of a downstream issue that looks like this, and we've
upstreamed every fix for bucket listing and cleanup we have.  We are
pursuing a space leak believed to arise in "radosgw-admin bucket rm
--purge-objects" but not a non-terminating listing.

The only upstream release not planned to get a backport of orphans
list tools is Luminous.  I thought backport to Octopus was already
done by the backport team?

regards,

Matt

On Thu, Aug 6, 2020 at 2:40 PM EDH - Manuel Rios
 wrote:
>
> You're not the only one affected by this issue.
>
> As far as I know several huge companies hit this bug too, but private
> patches or tools have not been publicly released.
>
> This is caused by the resharding process during upload in previous
> versions.
>
> Workaround for us:
>
> - Delete objects of the bucket at rados level.
> - Delete the index file of the bucket.
>
> Pray to god to not happen again.
>
> Still pending backporting to Nautilus of the new experimental tool to find 
> orphans in RGW
>
> Maybe @Matt Benjamin can give us an ETA for getting that tool backported...
>
> Regards
>
>
>
> -Original Message-
> From: Andrei Mikhailovsky 
> Sent: Thursday, 6 August 2020 13:55
> To: ceph-users 
> Subject: [ceph-users] Re: RGW unable to delete a bucket
>
> BUMP...
>
>
> - Original Message -
> > From: "Andrei Mikhailovsky" 
> > To: "ceph-users" 
> > Sent: Tuesday, 4 August, 2020 17:16:28
> > Subject: [ceph-users] RGW unable to delete a bucket
>
> > Hi
> >
> > I am trying to delete a bucket using the following command:
> >
> > # radosgw-admin bucket rm --bucket= --purge-objects
> >
> > However, in console I get the following messages. About 100+ of those
> > messages per second.
> >
> > 2020-08-04T17:11:06.411+0100 7fe64cacf080 1
> > RGWRados::Bucket::List::list_objects_ordered INFO ordered bucket
> > listing requires read #1
> >
> >
> > The command has been running for about 35 days and it still
> > hasn't finished. The size of the bucket is under 1TB for sure. Probably 
> > around 500GB.
> >
> > I have recently removed about a dozen old buckets without any
> > issues. It's this particular bucket that is being very stubborn.
> >
> > Anything I can do to remove it, including its objects and any orphans
> > it might have?
> >
> >
> > Thanks
> >
> > Andrei
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an
> > email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to 
> ceph-users-le...@ceph.io
>




[ceph-users] Re: Multipart uploads with partsizes larger than 16MiB failing on Nautilus

2020-09-08 Thread Matt Benjamin
thanks, Shubjero

Would you consider creating a ceph tracker issue for this?

regards,

Matt

On Tue, Sep 8, 2020 at 4:13 PM shubjero  wrote:
>
> I had been looking into this issue all day and during testing found
> that a specific configuration option we had been setting for years was
> the culprit. Not setting this value and letting it fall back to the
> default seems to have fixed our issue with multipart uploads.
>
> If you are curious, the configuration option is rgw_obj_stripe_size
> which was being set to 67108864 bytes (64MiB). The default is 4194304
> bytes (4MiB). This is a documented option
> (https://docs.ceph.com/docs/nautilus/radosgw/config-ref/) and from my
> testing it seems like using anything but the default (only tried
> larger values) breaks multipart uploads.
>
> On Tue, Sep 8, 2020 at 12:12 PM shubjero  wrote:
> >
> > Hey all,
> >
> > I'm creating a new post for this issue as we've narrowed the problem
> > down to a partsize limitation on multipart upload. We have discovered
> > that in our production Nautilus (14.2.11) cluster and our lab Nautilus
> > (14.2.10) cluster that multipart uploads with a configured part size
> > of greater than 16777216 bytes (16MiB) will return a status 500 /
> > internal server error from radosgw.
> >
> > So far I have increased the following rgw settings/values that looked
> > suspect, without any success/improvement with partsizes.
> > Such as:
> > "rgw_get_obj_window_size": "16777216",
> > "rgw_put_obj_min_window_size": "16777216",
> >
> > I am trying to determine if this is because of a conservative default
> > setting somewhere that I don't know about or if this is perhaps a bug?
> >
> > I would appreciate it if someone on Nautilus with rgw could also test
> > / provide feedback. It's very easy to reproduce and configuring your
> > partsize with aws2cli requires you to put the following in your aws
> > 'config'
> > s3 =
> >   multipart_chunksize = 32MB
> >
> > rgw server logs during a failed multipart upload (32MB chunk/partsize):
> > 2020-09-08 15:59:36.054 7f2d32fa6700  1 == starting new request
> > req=0x55953dc36930 =
> > 2020-09-08 15:59:36.082 7f2d32fa6700 -1 res_query() failed
> > 2020-09-08 15:59:36.138 7f2d32fa6700  1 == req done
> > req=0x55953dc36930 op status=0 http_status=200 latency=0.0839988s
> > ==
> > 2020-09-08 16:00:07.285 7f2d3dfbc700  1 == starting new request
> > req=0x55953dc36930 =
> > 2020-09-08 16:00:07.285 7f2d3dfbc700 -1 res_query() failed
> > 2020-09-08 16:00:07.353 7f2d00741700  1 == starting new request
> > req=0x55954dd5e930 =
> > 2020-09-08 16:00:07.357 7f2d00741700 -1 res_query() failed
> > 2020-09-08 16:00:07.413 7f2cc56cb700  1 == starting new request
> > req=0x55953dc02930 =
> > 2020-09-08 16:00:07.417 7f2cc56cb700 -1 res_query() failed
> > 2020-09-08 16:00:07.473 7f2cb26a5700  1 == starting new request
> > req=0x5595426f6930 =
> > 2020-09-08 16:00:07.473 7f2cb26a5700 -1 res_query() failed
> > 2020-09-08 16:00:09.465 7f2d3dfbc700  0 WARNING: set_req_state_err
> > err_no=35 resorting to 500
> > 2020-09-08 16:00:09.465 7f2d3dfbc700  1 == req done
> > req=0x55953dc36930 op status=-35 http_status=500 latency=2.17997s
> > ==
> > 2020-09-08 16:00:09.549 7f2d00741700  0 WARNING: set_req_state_err
> > err_no=35 resorting to 500
> > 2020-09-08 16:00:09.549 7f2d00741700  1 == req done
> > req=0x55954dd5e930 op status=-35 http_status=500 latency=2.19597s
> > ==
> > 2020-09-08 16:00:09.605 7f2cc56cb700  0 WARNING: set_req_state_err
> > err_no=35 resorting to 500
> > 2020-09-08 16:00:09.609 7f2cc56cb700  1 == req done
> > req=0x55953dc02930 op status=-35 http_status=500 latency=2.19597s
> > ==
> > 2020-09-08 16:00:09.641 7f2cb26a5700  0 WARNING: set_req_state_err
> > err_no=35 resorting to 500
> > 2020-09-08 16:00:09.641 7f2cb26a5700  1 == req done
> > req=0x5595426f6930 op status=-35 http_status=500 latency=2.16797s
> > ==
> >
> > awscli client side output during a failed multipart upload:
> > root@jump:~# aws --no-verify-ssl --endpoint-url
> > http://lab-object.cancercollaboratory.org:7480 s3 cp 4GBfile
> > s3://troubleshooting
> > upload failed: ./4GBfile to s3://troubleshooting/4GBfile An error
> > occurred (UnknownError) when calling the UploadPart operation (reached
> > max retries: 2): Unknown
> >
> > Thanks,
> >
> > Jared Baker
> > Cloud Architect for the Cancer Genome Collaboratory
> > Ontario Institute for Cancer Research
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
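
For anyone hitting the same thing, a minimal sketch of the two configuration fragments involved, using the values quoted in the thread (section and file names may differ per deployment):

```ini
; ceph.conf on the radosgw hosts
[client.rgw]
; the non-default override that broke parts larger than 16 MiB:
;rgw_obj_stripe_size = 67108864    ; 64 MiB
; dropping back to the documented default fixed uploads in this case:
rgw_obj_stripe_size = 4194304      ; 4 MiB

; ~/.aws/config on the client, to force a 32 MiB part size and reproduce:
[default]
s3 =
  multipart_chunksize = 32MB
```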




[ceph-users] Re: rgw index shard much larger than others

2020-10-01 Thread Matt Benjamin
Hi Dan,

Possibly you're reproducing https://tracker.ceph.com/issues/46456.

That explains how the underlying issue worked; I don't remember how a
bucket exhibiting this is repaired.

Eric?

Matt


On Thu, Oct 1, 2020 at 8:41 AM Dan van der Ster  wrote:
>
> Dear friends,
>
> Running 14.2.11, we have one particularly large bucket with a very
> strange distribution of objects among the shards. The bucket has 512
> shards, and most shards have ~75k entries, but shard 0 has 1.75M
> entries:
>
> # rados -p default.rgw.buckets.index listomapkeys
> .dir.61c59385-085d-4caa-9070-63a3868dccb6.272652427.1.0 | wc -l
> 1752085
>
> # rados -p default.rgw.buckets.index listomapkeys
> .dir.61c59385-085d-4caa-9070-63a3868dccb6.272652427.1.1 | wc -l
> 78388
>
> # rados -p default.rgw.buckets.index listomapkeys
> .dir.61c59385-085d-4caa-9070-63a3868dccb6.272652427.1.2 | wc -l
> 78764
>
> We had resharded this bucket (manually) from 32 up to 512 shards just
> before upgrading from 12.2.12 to 14.2.11 a couple weeks ago.
>
> Any idea why shard .0 is getting such an imbalance of entries?
> Should we manually reshard this bucket again?
>
> Thanks!
>
> Dan
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
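
A small loop makes it easy to survey all 512 shards at once. A sketch, assuming the pool and marker from the commands above (the function name is my own):

```shell
#!/bin/sh
# Print the omap key count for every shard of a bucket index.
# Args: index pool, index object prefix (".dir.<marker>"), shard count.
count_index_shards() {
    pool=$1; marker=$2; nshards=$3
    i=0
    while [ "$i" -lt "$nshards" ]; do
        n=$(rados -p "$pool" listomapkeys "${marker}.${i}" | wc -l)
        printf 'shard %s: %s\n' "$i" "$n"
        i=$((i + 1))
    done
}

# usage, for the 512-shard bucket in this thread:
# count_index_shards default.rgw.buckets.index \
#     .dir.61c59385-085d-4caa-9070-63a3868dccb6.272652427.1 512
```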




[ceph-users] Re: Multisite replication speed

2020-10-09 Thread Matt Benjamin
Hi Nicolas,

This is expected behavior currently, but a sync fairness mechanism
that will permit load-sharing across gateways during replication is
being worked on.

regards,

Matt

On Fri, Oct 9, 2020 at 6:30 AM Nicolas Moal  wrote:
>
> Hello Paul,
>
> Thank you very much for pointing us at BBR! We will definitely run some 
> tests before and after applying the change to see if it increases our 
> transfer speed a bit.
>
> One additional question if you don't mind. As of today, our zonegroup 
> configuration consists of two zones: a master zone with a HAProxy VIP, and a 
> slave zone with another HAProxy VIP. Those VIPs act as endpoints for users 
> accessing the cluster but are also used for the replication traffic 
> between the master and the slave zone, so it's one frontend IP with the 
> RadosGWs configured as backends in the HAProxy configuration for both.
>
> Let's say we configure multiple VIPs as endpoints in the zonegroup 
> configuration: will Ceph load-balance the replication traffic between those 
> endpoints, or take advantage of the multiple endpoints to multithread the 
> replication traffic between them, thus increasing the overall replication 
> speed?
>
> Our initial tests show that no matter how many endpoints you specify in the 
> configuration, it will only use one source IP and destination IP at a time 
> and funnel the replication traffic through this one, and only this one.
>
> Is this the expected behavior, or are we missing something?
>
> Thanks again !
>
> Cheers,
>
> Nicolas
>
> 
> From: Paul Mezzanini 
> Sent: Thursday, October 8, 2020 6:51:36 PM
> To: Nicolas Moal ; ceph-users 
> Subject: Re: Multisite replication speed
>
> With a long distance link I would definitely look into switching to BBR for 
> your congestion control as your first step.
>
> Well, your _first_ step is to do an iperf and establish a baseline
>
> A quick search and this link seems to explain it not-too-bad
> https://www.cyberciti.biz/cloud-computing/increase-your-linux-server-internet-speed-with-tcp-bbr-congestion-control/
>
> We have used it before with great success for long distance, high throughput 
> transfers.
>
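
For reference, the BBR switch boils down to two sysctls; a sketch of the persistent form (the file name is a conventional choice, not from the thread):

```ini
# /etc/sysctl.d/90-bbr.conf -- apply with `sysctl --system`
# the fq qdisc is the recommended pairing for BBR
net.core.default_qdisc = fq
net.ipv4.tcp_congestion_control = bbr
```

Verify with `sysctl net.ipv4.tcp_congestion_control`, and, per the advice above, take an iperf baseline before and after the switch.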
> -paul
> --
> Paul Mezzanini
> Sr Systems Administrator / Engineer, Research Computing
> Information & Technology Services
> Finance & Administration
> Rochester Institute of Technology
> o:(585) 475-3245 | pfm...@rit.edu
>
> 
>
> 
> From: Nicolas Moal 
> Sent: Thursday, October 8, 2020 10:36 AM
> To: ceph-users
> Subject: [ceph-users] Multisite replication speed
>
> Hello everybody,
>
> We have two Ceph object clusters replicating over a very long-distance WAN 
> link. Our version of Ceph is 14.2.10.
> Currently, replication speed seems to be capped around 70 MiB/s even though 
> there's a 10Gb WAN link between the two clusters.
> The clusters themselves don't seem to suffer from any performance issue.
>
> The replication traffic leverages HAProxy VIPs, which means there's a single 
> endpoint (the HAProxy VIP) in the multisite replication configuration.
>
> So, my questions are:
> - Is it possible to improve replication speed by adding more endpoints in the 
> multisite replication configuration? The issue we are facing is that the 
> secondary cluster is way behind the master cluster because of the relatively 
> slow speed.
> - Is there anything else I can do to optimize replication speed ?
>
> Thanks for your comments !
>
> Nicolas
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>




[ceph-users] Re: RGW blocking on large objects

2019-10-17 Thread Matt Benjamin
My impression is that running a second gateway (assuming 1 at present)
on the same host would be preferable to running one with a very high
thread count, and that 1024 is a good maximum value for the thread count.

Matt

On Thu, Oct 17, 2019 at 4:01 PM Robert LeBlanc  wrote:
>
> On Thu, Oct 17, 2019 at 11:46 AM Casey Bodley  wrote:
> >
> >
> > On 10/17/19 12:59 PM, Robert LeBlanc wrote:
> > > On Thu, Oct 17, 2019 at 9:22 AM Casey Bodley  wrote:
> > >
> > >> With respect to this issue, civetweb and beast should behave the same.
> > >> Both frontends have a large thread pool, and their calls to
> > >> process_request() run synchronously (including blocking on rados
> > >> requests) on a frontend thread. So once there are more concurrent client
> > >> connections than there are frontend threads, new connections will block
> > >> until there's a thread available to service them.
> > > Okay, this really helps me understand what's going on here. Are there
> > > plans to remove the synchronous calls and make them async or improve
> > > this flow a bit?
> >
> > Absolutely yes, this work has been in progress for a long time now, and
> > octopus does get a lot of concurrency here. Eventually, all of
> > process_request() will be async-enabled, and we'll be able to run beast
> > with a much smaller thread pool.
>
> This is great news. Anything we can do to help in this effort as it is
> very important for us?
>
> > > Currently I'm seeing 1024 max concurrent ops and a 512 thread pool. Does
> > > this mean that with equally distributed requests one op could be
> > > processing on the backend RADOS with another queued behind it waiting?
> > > Is this done in round-robin fashion, so that with 99% small IO a very
> > > long RADOS request can get many IOs blocked behind it because requests
> > > are round-robin dispatched to the thread pool? (I assume the latter is
> > > what I'm seeing.)
> > >
> > > rgw_max_concurrent_requests1024
> > > rgw_thread_pool_size   512
> > >
> > > If I match the two, do you think it would help prevent small IO from
> > > being blocked by larger IO?
> > rgw_max_concurrent_requests was added in support of the beast/async
> > work, precisely because (post-Nautilus) the number of beast threads will
> > no longer limit the number of concurrent requests. This variable is what
> > throttles incoming requests to prevent radosgw's resource consumption
> > from ballooning under heavy workload. And unlike the existing model
> > where a request remains in the queue until a thread is ready to service
> > it, any requests that exceed rgw_max_concurrent_requests will be
> > rejected with '503 SlowDown' in s3 or '498 Rate Limited' in swift.
> >
> > With respect to prioritization, there isn't any by default but we do
> > have a prototype request scheduler that uses dmclock to prioritize
> > requests based on some hard-coded request classes. It's not especially
> > useful in its current form, but we do have plans to further elaborate
> > the classes and eventually pass the information down to osds for
> > integrated QOS.
> >
> > As of nautilus, though, the thread pool size is the only effective knob
> > you have.
>
> Do you see any problems with running 2k-4k threads if we have the RAM to do 
> so?
>
> 
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
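
To restate the two knobs discussed, in ceph.conf form (the values are the ones quoted in the thread, not recommendations):

```ini
[client.rgw]
# frontend threads; through Nautilus this is the effective concurrency limit
rgw_thread_pool_size = 512
# only enforced by the async beast frontend (post-Nautilus); excess requests
# are rejected with 503 SlowDown (s3) / 498 Rate Limited (swift)
rgw_max_concurrent_requests = 1024
```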




[ceph-users] Re: RGW blocking on large objects

2019-10-17 Thread Matt Benjamin
Thanks very much, Robert.

Matt

On Thu, Oct 17, 2019 at 5:24 PM Robert LeBlanc  wrote:
>
> On Thu, Oct 17, 2019 at 2:03 PM Casey Bodley  wrote:
> > > This is great news. Anything we can do to help in this effort as it is
> > > very important for us?
> >
> > We would love help here. Most of the groundwork is done, so the
> > remaining work is mostly mechanical.
> >
> > To summarize the strategy, the beast frontend spawns a coroutine for
> > each client connection, and that coroutine is represented by a
> > boost::asio::yield_context. We wrap this in an 'optional_yield' struct
> > that gets passed to process_request(). The civetweb frontend always
> > passes an empty object (ie null_yield) so that everything runs
> > synchronously. When making calls into librados, we have a
> > rgw_rados_operate() function that supports this optional_yield argument.
> > If it gets a null_yield, it calls the blocking version of
> > librados::IoCtx::operate(). Otherwise it calls a special
> > librados::async_operate() function which suspends the coroutine until
> > completion instead of blocking the thread.
> >
> > So most of the remaining work is in plumbing this optional_yield
> > variable through all of the code paths under process_request() that call
> > into librados. The rgw_rados_operate() helpers will log a "WARNING:
> > blocking librados call" whenever they block inside of a beast frontend
> > thread, so we can go through the rgw log to identify all of the places
> > that still need a yield context. By iterating on this process, we can
> > eventually remove all of the blocking calls, then set up regression
> > testing to verify that no rgw logs contain that warning.
> >
> > Here's an example pr from Ali that adds the optional_yield to requests
> > for bucket instance info: https://github.com/ceph/ceph/pull/27898. It
> > extends the get_bucket_info() call to take optional_yield, and passes
> > one in where available, using null_yield to mark the synchronous cases
> > where one isn't available.
>
> I'll work to get familiar with the code base and see if I can submit
> some PRs to help out. Things are a bit crazy, but this is very
> important to us too.
>
> 
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
> ___
> Dev mailing list -- d...@ceph.io
> To unsubscribe send an email to dev-le...@ceph.io





[ceph-users] Re: [object gateway] setting storage class does not move object to correct backing pool?

2019-12-10 Thread Matt Benjamin
> root@node1:~# rados -p tier1-ssd ls
> ce2fc9ee-edc8-4dc7-a3fe-b1458c67168b.5805.1_darthvader.png
> ce2fc9ee-edc8-4dc7-a3fe-b1458c67168b.5805.1_2019-10-15-090436_1254x522_scrubbed.png
> ce2fc9ee-edc8-4dc7-a3fe-b1458c67168b.5805.1_kanariepiet.jpg
>
> root@node1:~# rados -p tier2-hdd ls
> ce2fc9ee-edc8-4dc7-a3fe-b1458c67168b.5805.1__shadow_.FEruUOZaVJXJcOG-e2tO1xcInNzoEvN_0
>
> $ s3cmd info s3://bucket/kanariepiet.jpg
> [snip]
> Last mod:  Tue, 10 Dec 2019 08:09:58 GMT
> Storage:   STANDARD
> [snip]
>
> $ s3cmd info s3://bucket/darthvader.png
> [snip]
> Last mod:  Wed, 04 Dec 2019 10:35:14 GMT
> Storage:   SPINNING_RUST
> [snip]
>
> $ s3cmd info s3://bucket/2019-10-15-090436_1254x522_scrubbed.png
> [snip]
> Last mod:  Tue, 10 Dec 2019 10:33:24 GMT
> Storage:   STANDARD
> [snip]
> ==
>
> Any thoughts on what might occur here?
>
> Best regards,
> Gerdriaan Mulder
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>




[ceph-users] Re: Bucket rename with

2020-02-14 Thread Matt Benjamin
The world lived for a long time without it, but it's certainly useful.
If Abhishek and/or the backport team would like this for Nautilus, I
will help retarget our downstream backport (it's bigger than 22
commits with the dependencies, I believe, n.b.).

Matt

On Fri, Feb 14, 2020 at 5:02 PM EDH - Manuel Rios
 wrote:
>
> Honestly, not having a function to rename a bucket in radosgw-admin is like 
> not having a function to copy or move. It is something basic, since otherwise 
> the workaround is to create a new bucket and move all the files, with the 
> consequent loss of time and cost of computation. In addition to the 
> interruption.
>
> I'm sure I'm not the only administrator of rgw that needs to rename some 
> buckets, because by default the system lets users, for example, use capital 
> letters, and that's not compliant.
>
> KR,
> Manuel
>
> -Original Message-
> From: J. Eric Ivancich 
> Sent: Friday, 14 February 2020 20:47
> To: EDH - Manuel Rios 
> CC: ceph-users@ceph.io
> Asunto: Re: [ceph-users] Bucket rename with
>
> On 2/4/20 12:29 PM, EDH - Manuel Rios wrote:
> > Hi
> >
> > Some customers asked us about a simple problem: they want to rename a bucket.
> >
> > Checking the Nautilus documentation, it looks like it's not possible for now, 
> > but I checked the master documentation and a CLI command should accomplish this.
> >
> > $ radosgw-admin bucket link --bucket=foo --bucket-new-name=bar --uid=johnny
> >
> > Will it be backported to Nautilus? Or is it still just for developer/master 
> > users?
> >
> > https://docs.ceph.com/docs/master/man/8/radosgw-admin/
> Given both that it's a feature and the sheer size of the PR -- 22 commits and 
> 32 files altered -- my guess is that it will not be backported to Nautilus. 
> However I'll invite the principals to weigh in.
>
> Best,
>
> Eric
>
> --
> J. Eric Ivancich
> he/him/his
> Red Hat Storage
> Ann Arbor, Michigan, USA
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>




[ceph-users] Re: radosgw garbage collection seems stuck and mannual gc process didn't work

2020-04-11 Thread Matt Benjamin
An issue presenting exactly like this was fixed in spring of last year, for
certain on nautilus and higher.

Matt

On Sat, Apr 11, 2020, 12:04 PM <346415...@qq.com> wrote:

> Ceph Version : Mimic 13.2.4
>
> The cluster has been running steadily for more than a year; recently I
> found cluster usage growing faster than usual, and we figured out the problem
> is garbage collection.
> 'radosgw-admin gc list' has millions of objects to gc. The earliest tag
> time is 2019-09, but 99% of them are from 2020-03 to now.
>
> `ceph df`
> GLOBAL:
> SIZEAVAIL   RAW USED %RAW USED
> 1.7 PiB 1.1 PiB  602 TiB 35.22
> POOLS:
> NAME   ID USED%USED MAX AVAIL
>OBJECTS
> .rgw.root  10 1.2 KiB 0   421 TiB
>4
> default.rgw.control11 0 B 0   421 TiB
>8
> default.rgw.data.root  12 0 B 0   421 TiB
>0
> default.rgw.gc 13 0 B 0   421 TiB
>0
> default.rgw.log14 4.8 GiB 0   421 TiB
> 6414
> default.rgw.intent-log 15 0 B 0   421 TiB
>0
> default.rgw.meta   16 110 KiB 0   421 TiB
>  463
> default.rgw.usage  17 0 B 0   421 TiB
>0
> default.rgw.users.keys 18 0 B 0   421 TiB
>0
> default.rgw.users.email19 0 B 0   421 TiB
>0
> default.rgw.users.swift20 0 B 0   421 TiB
>0
> default.rgw.users.uid  21 0 B 0   421 TiB
>0
> default.rgw.buckets.extra  22 0 B 0   421 TiB
>0
> default.rgw.buckets.index  23 0 B 0   421 TiB
>   118720
> default.rgw.buckets.data   24 263 TiB 38.41   421 TiB
>138902771
> default.rgw.buckets.non-ec 25 0 B 0   421 TiB
>16678
>
> however, counting each bucket's usage with 'radosgw-admin bucket stats',
> it should use only 160TiB, so about 80TiB are in the GC list
>
> former gc config setting before we find gc problem:
> rgw_gc_max_objs = 32
> rgw_gc_obj_min_wait = 3600
> rgw_gc_processor_period = 3600
> rgw_gc_processor_max_time = 3600
>
> yesterday we adjust our setting and restart rgw:
> rgw_gc_max_objs = 1024
> rgw_gc_obj_min_wait = 300
> rgw_gc_processor_period = 600
> rgw_gc_processor_max_time = 600
> rgw_gc_max_concurrent_io = 40
> rgw_gc_max_trim_chunk = 1024
>
> today  we use  ' rados  -p default.rgw.log listomapkeys gc.$i --cluster
> ceph -N gc | wc -l '   (i from 0 to 1023)
> well, only gc.0 to gc.511 have data
>
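That per-shard count can be wrapped in a small loop that only prints shards with pending entries. A sketch (the function name is my own; the pool and namespace are the ones from the command above):

```shell
#!/bin/sh
# Print "gc.<shard>: <count>" for each RGW gc shard with pending entries.
gc_pending() {
    nshards=$1
    i=0
    while [ "$i" -lt "$nshards" ]; do
        n=$(rados -p default.rgw.log -N gc listomapkeys "gc.$i" | wc -l)
        [ "$n" -gt 0 ] && printf 'gc.%s: %s\n' "$i" "$n"
        i=$((i + 1))
    done
    return 0
}

# usage: gc_pending 1024
```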
> here are some results, sorted
>  -time 14:43 result:
>……
>36 gc_202004111443/gc.502.tag
>38 gc_202004111443/gc.501.tag
>40 gc_202004111443/gc.136.tag
>46 gc_202004111443/gc.511.tag
>   212 gc_202004111443/gc.9.tag
>   218 gc_202004111443/gc.24.tag
> 21976 gc_202004111443/gc.13.tag
> 42956 gc_202004111443/gc.26.tag
> 71772 gc_202004111443/gc.25.tag
> 85766 gc_202004111443/gc.6.tag
>104504 gc_202004111443/gc.7.tag
>105444 gc_202004111443/gc.10.tag
>106114 gc_202004111443/gc.3.tag
>126860 gc_202004111443/gc.31.tag
>127352 gc_202004111443/gc.23.tag
>147942 gc_202004111443/gc.27.tag
>148046 gc_202004111443/gc.15.tag
>167116 gc_202004111443/gc.28.tag
>167932 gc_202004111443/gc.21.tag
>187986 gc_202004111443/gc.5.tag
>188312 gc_202004111443/gc.22.tag
>209084 gc_202004111443/gc.30.tag
>209152 gc_202004111443/gc.18.tag
>209702 gc_202004111443/gc.19.tag
>231100 gc_202004111443/gc.8.tag
>249622 gc_202004111443/gc.14.tag
>251092 gc_202004111443/gc.2.tag
>251366 gc_202004111443/gc.12.tag
>251802 gc_202004111443/gc.0.tag
>252158 gc_202004111443/gc.11.tag
>272114 gc_202004111443/gc.1.tag
>291518 gc_202004111443/gc.20.tag
>293646 gc_202004111443/gc.16.tag
>312998 gc_202004111443/gc.17.tag
>352984 gc_202004111443/gc.29.tag
>488232 gc_202004111443/gc.4.tag
>   5935806 total
>
>
>  -time 16:53 result:
> ……
>28 gc_202004111653/gc.324.tag
>28 gc_202004111653/gc.414.tag
>30 gc_202004111653/gc.350.tag
>30 gc_202004111653/gc.456.tag
>   204 gc_202004111653/gc.9.tag
>   208 gc_202004111653/gc.24.tag
> 21986 gc_202004111653/gc.13.tag
> 42964 gc_202004111653/gc.26.tag
> 71780 gc_202004111653/gc.25.tag
> 85778 gc_202004111653/gc.6.tag
>104512 gc_202004111653/gc.7.tag
>105452 gc_202004111653/gc.10.tag
>106122 gc_202004111653/gc.3.tag
>126866 gc_202004111653/gc.31.tag
>127372 gc_202004111653/gc.23.tag
>147944 gc_202004111653/gc.27.tag
>14

[ceph-users] Re: radosgw garbage collection seems stuck and mannual gc process didn't work

2020-04-14 Thread Matt Benjamin
Hi Peter,

You won't need to do anything--the gc process will clear the stall and
begin clearing its backlog immediately after the upgrade.

Matt

On Sat, Apr 11, 2020 at 10:42 PM Peter Parker <346415...@qq.com> wrote:
>
> thanks a lot
> I'm not sure if the PR is  https://github.com/ceph/ceph/pull/26601  ?
> and that has been backported to mimic   https://github.com/ceph/ceph/pull/27796
>
> it seems the cluster needs to be upgraded  to  13.2.6 or higher
> after the upgrade, what else should I do? Manually execute the gc process to 
> clean up those objects, or just let it run automatically?
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>




[ceph-users] Re: Bucket sync across available DCs

2020-04-28 Thread Matt Benjamin
Hi Szabo,

Per-bucket sync with improved AWS compatibility was added in Octopus.

regards,

Matt

On Mon, Apr 27, 2020 at 11:18 PM Szabo, Istvan (Agoda)
 wrote:
>
> Hi,
>
> Is there a way to have Ceph synchronize a specific bucket across the available 
> datacenters?
> I've only found the multisite setup, but that one syncs the complete cluster, 
> which amounts to a failover solution.
> For me, just 1 bucket.
>
> Thank you
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RGW STS Support in Nautilus ?

2020-05-12 Thread Matt Benjamin
yay!  thanks Wyllys, Pritha

Matt

On Tue, May 12, 2020 at 11:38 AM Wyllys Ingersoll
 wrote:
>
>
> Thanks for the hint, I fixed my keycloak configuration for that application 
> client so the token only includes a single audience value and now it works 
> fine.
>
> thanks!!
>
>
> On Tue, May 12, 2020 at 11:11 AM Wyllys Ingersoll 
>  wrote:
>>
>> The "aud" field in the introspection result is a list, not a single string.
>>
>> On Tue, May 12, 2020 at 11:02 AM Pritha Srivastava  
>> wrote:
>>>
>>> app_id must match with the 'aud' field in the token introspection result 
>>> (In the example the value of 'aud' is 'customer-portal')
>>>
>>> Thanks,
>>> Pritha
>>>
>>> On Tue, May 12, 2020 at 8:16 PM Wyllys Ingersoll 
>>>  wrote:
>>>>
>>>>
>>>> Running Nautilus 14.2.9 and trying to follow the STS example given here: 
>>>> https://docs.ceph.com/docs/master/radosgw/STS/ to setup a policy for 
>>>> AssumeRoleWithWebIdentity using KeyCloak (8.0.1) as the OIDC provider. I 
>>>> am able to see in the rgw debug logs that the token being passed from the 
>>>> client is passing the introspection check, but it always ends up failing 
>>>> the final authorization to access the requested bucket resource and is 
>>>> rejected with a 403 status "AccessDenied".
>>>>
>>>> I configured my policy as described in the 2nd example on the STS page 
>>>> above. I suspect the problem is with the "StringEquals" condition 
>>>> statement in the AssumeRolePolicy document (I could be wrong though).
>>>>
>>>> The example shows using the keycloak URI followed by ":app_id" matching 
>>>> with the name of the keycloak client application ("customer-portal" in the 
>>>> example).  My keycloak setup does not have any such field in the 
>>>> introspection result and I can't seem to figure out how to make this all 
>>>> work.
>>>>
>>>> I cranked up the logging to 20/20 and still did not see any hints as to 
>>>> what part of the policy is causing the access to be denied.
>>>>
>>>> Any suggestions?
>>>>
>>>> -Wyllys Ingersoll
>>>>
>>>> ___
>>>> Dev mailing list -- d...@ceph.io
>>>> To unsubscribe send an email to dev-le...@ceph.io
>
> ___
> Dev mailing list -- d...@ceph.io
> To unsubscribe send an email to dev-le...@ceph.io
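
One quick check for this kind of 'aud' mismatch is decoding the token payload locally instead of round-tripping through introspection. A plain-shell sketch (the helper name is my own):

```shell
#!/bin/sh
# Decode the payload (second dot-separated segment) of a JWT, converting
# the base64url alphabet back to base64 and restoring stripped padding,
# so the "aud" claim can be inspected directly.
jwt_payload() {
    p=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
    while [ $(( ${#p} % 4 )) -ne 0 ]; do p="${p}="; done
    printf '%s' "$p" | base64 -d
}

# usage: jwt_payload "$TOKEN"
# the decoded JSON's "aud" value (a string or a list) is what the role's
# StringEquals condition on ":app_id" has to match
```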


