Re: [ceph-users] radosgw sync falling behind regularly

2019-03-11 Thread Trey Palmer
HI Casey,

We're still trying to figure this sync problem out. If you could possibly
tell us anything further, we would be deeply grateful!

Our errors are coming from 'data sync'.   In `sync status` we pretty
constantly show one shard behind, but a different one each time we run it.

Here's a paste -- these commands were run in rapid succession.

root@sv3-ceph-rgw1:~# radosgw-admin sync status
  realm b3e2afe7-2254-494a-9a34-ce50358779fd (savagebucket)
  zonegroup de6af748-1a2f-44a1-9d44-30799cf1313e (us)
   zone 331d3f1e-1b72-4c56-bb5a-d1d0fcf6d0b8 (sv3-prod)
  metadata sync syncing
full sync: 0/64 shards
incremental sync: 64/64 shards
metadata is caught up with master
  data sync source: 107d29a0-b732-4bf1-a26e-1f64f820e839 (dc11-prod)
syncing
full sync: 0/128 shards
incremental sync: 128/128 shards
data is caught up with source
source: 1e27bf9c-3a2f-4845-85b6-33a24bbe1c04 (sv5-corp)
syncing
full sync: 0/128 shards
incremental sync: 128/128 shards
data is caught up with source
root@sv3-ceph-rgw1:~# radosgw-admin sync status
  realm b3e2afe7-2254-494a-9a34-ce50358779fd (savagebucket)
  zonegroup de6af748-1a2f-44a1-9d44-30799cf1313e (us)
   zone 331d3f1e-1b72-4c56-bb5a-d1d0fcf6d0b8 (sv3-prod)
  metadata sync syncing
full sync: 0/64 shards
incremental sync: 64/64 shards
metadata is caught up with master
  data sync source: 107d29a0-b732-4bf1-a26e-1f64f820e839 (dc11-prod)
syncing
full sync: 0/128 shards
incremental sync: 128/128 shards
data is behind on 1 shards
behind shards: [30]
oldest incremental change not applied: 2019-01-19 22:53:23.0.16109s
source: 1e27bf9c-3a2f-4845-85b6-33a24bbe1c04 (sv5-corp)
syncing
full sync: 0/128 shards
incremental sync: 128/128 shards
data is caught up with source
root@sv3-ceph-rgw1:~#
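
In case it helps, here are a couple of read-only checks we can run for more
detail on the lagging source (a sketch, assuming Luminous 12.2.x; 'sync
error list' availability can vary by point release):

root@sv3-ceph-rgw1:~# radosgw-admin data sync status --source-zone=dc11-prod
root@sv3-ceph-rgw1:~# radosgw-admin sync error list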


Below I'm pasting a small section of log.  Thanks so much for looking!

Trey Palmer


root@sv3-ceph-rgw1:/var/log/ceph# tail -f ceph-rgw-sv3-ceph-rgw1.log | grep -i error
2019-03-08 11:43:07.208572 7fa080cc7700  0 data sync: ERROR: failed to read
remote data log info: ret=-2
2019-03-08 11:43:07.211348 7fa080cc7700  0 meta sync: ERROR:
RGWBackoffControlCR called coroutine returned -2
2019-03-08 11:43:07.267117 7fa080cc7700  0 data sync: ERROR: failed to read
remote data log info: ret=-2
2019-03-08 11:43:07.269631 7fa080cc7700  0 meta sync: ERROR:
RGWBackoffControlCR called coroutine returned -2
2019-03-08 11:43:07.895192 7fa080cc7700  0 data sync: ERROR: init sync on
dmv/dmv:1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18467.134 failed, retcode=-2
2019-03-08 11:43:08.046685 7fa080cc7700  0 data sync: ERROR: init sync on
dmv/dmv:1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18467.134 failed, retcode=-2
2019-03-08 11:43:08.171277 7fa0870eb700  0 ERROR: failed to get bucket
instance info for
.bucket.meta.phowe_superset:phowe_superset:1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18330.233
2019-03-08 11:43:08.171748 7fa0850e7700  0 ERROR: failed to get bucket
instance info for
.bucket.meta.gdfp_dev:gdfp_dev:1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18330.158
2019-03-08 11:43:08.175867 7fa08a0f1700  0 meta sync: ERROR: can't remove
key:
bucket.instance:phowe_superset/phowe_superset:1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18330.233
ret=-2
2019-03-08 11:43:08.176755 7fa0820e1700  0 data sync: ERROR: init sync on
whoiswho/whoiswho:1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18467.293 failed,
retcode=-2
2019-03-08 11:43:08.176872 7fa0820e1700  0 data sync: ERROR: init sync on
dmv/dmv:1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18467.134 failed, retcode=-2
2019-03-08 11:43:08.176885 7fa093103700  0 ERROR: failed to get bucket
instance info for
.bucket.meta.phowe_superset:phowe_superset:1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18330.233
2019-03-08 11:43:08.176925 7fa0820e1700  0 data sync: ERROR: failed to
retrieve bucket info for
bucket=phowe_superset/phowe_superset:1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18330.233
2019-03-08 11:43:08.177916 7fa0910ff700  0 meta sync: ERROR: can't remove
key:
bucket.instance:gdfp_dev/gdfp_dev:1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18330.158
ret=-2
2019-03-08 11:43:08.178815 7fa08b0f3700  0 ERROR: failed to get bucket
instance info for
.bucket.meta.gdfp_dev:gdfp_dev:1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18330.158
2019-03-08 11:43:08.178847 7fa0820e1700  0 data sync: ERROR: failed to
retrieve bucket info for
bucket=gdfp_dev/gdfp_dev:1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18330.158
2019-03-08 11:43:

Re: [ceph-users] radosgw sync falling behind regularly

2019-03-06 Thread Trey Palmer
It appears we eventually got 'data sync init' working.

At least, it has worked on 5 of the 6 sync directions in our 3-node cluster.
The sixth still returns an error every time we run it, although 'sync status'
does say "preparing for full sync".

Thanks,

Trey

On Wed, Mar 6, 2019 at 1:22 PM Trey Palmer  wrote:

> Casey,
>
> This was the result of trying 'data sync init':
>
> root@c2-rgw1:~# radosgw-admin data sync init
> ERROR: source zone not specified
> root@c2-rgw1:~# radosgw-admin data sync init --source-zone=<zone uuid>
> WARNING: cannot find source zone id for name=<zone uuid>
> ERROR: sync.init_sync_status() returned ret=-2
> root@c2-rgw1:~# radosgw-admin data sync init --source-zone=c1-zone
> ERROR: sync.init() returned ret=-5
> 2019-03-06 10:14:59.815735 7fecb214fe40  0 data sync: ERROR: failed to
> fetch datalog info
> root@c2-rgw1:~#
>
> Do you have any further advice or info?
>
> Thanks again,
>
> Trey
>
>
> On Wed, Mar 6, 2019 at 11:47 AM Casey Bodley  wrote:
>
>> Hi Trey,
>>
>> I think it's more likely that these stale metadata entries are from
>> deleted buckets, rather than accidental bucket reshards. When a bucket
>> is deleted in a multisite configuration, we don't delete its bucket
>> instance because other zones may still need to sync the object deletes -
>> and they can't make progress on sync if the bucket metadata disappears.
>> These leftover bucket instances look the same to the 'reshard
>> stale-instances' commands, but I'd be cautious about using that to
>> remove them in multisite, as it may cause more sync errors and
>> potentially leak storage if they still contain objects.
>>
>> Regarding 'datalog trim', that alone isn't safe because it could trim
>> entries that hadn't been applied on other zones yet, causing them to
>> miss some updates. What you can do is run 'data sync init' on each zone,
>> and restart gateways. This will restart with a data full sync (which
>> will scan all buckets for changes), and skip past any datalog entries
>> from before the full sync. I was concerned that the bug in error
>> handling (ie "ERROR: init sync on...") would also affect full sync, but
>> that doesn't appear to be the case - so I do think that's worth trying.
>>
>> On 3/5/19 6:24 PM, Trey Palmer wrote:
>> > Casey,
>> >
>> > Thanks very much for the reply!
>> >
>> > We definitely have lots of errors on sync-disabled buckets and the
>> > workaround for that is obvious (most of them are empty anyway).
>> >
>> > Our second form of error is stale buckets.  We had dynamic resharding
>> > enabled but have now disabled it (having discovered it was on by
>> > default, and not supported in multisite).
>> >
>> > We removed several hundred stale buckets via 'radosgw-admin reshard
>> > stale-instances rm', but they are still giving us sync errors.
>> >
>> > I have found that these buckets do have entries in 'radosgw-admin
>> > datalog list', and my guess is this could be fixed by doing a
>> > 'radosgw-admin datalog trim' for each entry on the master zone.
>> >
>> > Does that sound right?  :-)
>> >
>> > Thanks again for the detailed explanation,
>> >
>> > Trey Palmer
>> >
>> > On Tue, Mar 5, 2019 at 5:55 PM Casey Bodley <cbod...@redhat.com> wrote:
>> >
>> > Hi Christian,
>> >
>> > I think you've correctly intuited that the issues are related to
>> > the use
>> > of 'bucket sync disable'. There was a bug fix for that feature in
>> > http://tracker.ceph.com/issues/26895, and I recently found that a
>> > block
>> > of code was missing from its luminous backport. That missing code is
>> > what handled those "ERROR: init sync on  failed,
>> > retcode=-2" errors.
>> >
>> > I included a fix for that in a later backport
>> > (https://github.com/ceph/ceph/pull/26549), which I'm still working to
>> > get through qa. I'm afraid I can't really recommend a workaround
>> > for the
>> > issue in the meantime.
>> >
>> > Looking forward though, we do plan to support something like s3's
>> > cross
>> > region replication so you can enable replication on a specific
>> bucket
>> > without having to enable

Re: [ceph-users] radosgw sync falling behind regularly

2019-03-06 Thread Trey Palmer
Casey,

This was the result of trying 'data sync init':

root@c2-rgw1:~# radosgw-admin data sync init
ERROR: source zone not specified
root@c2-rgw1:~# radosgw-admin data sync init --source-zone=<zone uuid>
WARNING: cannot find source zone id for name=<zone uuid>
ERROR: sync.init_sync_status() returned ret=-2
root@c2-rgw1:~# radosgw-admin data sync init --source-zone=c1-zone
ERROR: sync.init() returned ret=-5
2019-03-06 10:14:59.815735 7fecb214fe40  0 data sync: ERROR: failed to
fetch datalog info
root@c2-rgw1:~#
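
For reference, the --source-zone value has to match a zone name the gateway
actually knows about; a quick, non-destructive cross-check (just a sketch):

root@c2-rgw1:~# radosgw-admin zone list
root@c2-rgw1:~# radosgw-admin zonegroup get | grep '"name"'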

Do you have any further advice or info?

Thanks again,

Trey


On Wed, Mar 6, 2019 at 11:47 AM Casey Bodley  wrote:

> Hi Trey,
>
> I think it's more likely that these stale metadata entries are from
> deleted buckets, rather than accidental bucket reshards. When a bucket
> is deleted in a multisite configuration, we don't delete its bucket
> instance because other zones may still need to sync the object deletes -
> and they can't make progress on sync if the bucket metadata disappears.
> These leftover bucket instances look the same to the 'reshard
> stale-instances' commands, but I'd be cautious about using that to
> remove them in multisite, as it may cause more sync errors and
> potentially leak storage if they still contain objects.
>
> Regarding 'datalog trim', that alone isn't safe because it could trim
> entries that hadn't been applied on other zones yet, causing them to
> miss some updates. What you can do is run 'data sync init' on each zone,
> and restart gateways. This will restart with a data full sync (which
> will scan all buckets for changes), and skip past any datalog entries
> from before the full sync. I was concerned that the bug in error
> handling (ie "ERROR: init sync on...") would also affect full sync, but
> that doesn't appear to be the case - so I do think that's worth trying.
>
> On 3/5/19 6:24 PM, Trey Palmer wrote:
> > Casey,
> >
> > Thanks very much for the reply!
> >
> > We definitely have lots of errors on sync-disabled buckets and the
> > workaround for that is obvious (most of them are empty anyway).
> >
> > Our second form of error is stale buckets.  We had dynamic resharding
> > enabled but have now disabled it (having discovered it was on by
> > default, and not supported in multisite).
> >
> > We removed several hundred stale buckets via 'radosgw-admin reshard
> > stale-instances rm', but they are still giving us sync errors.
> >
> > I have found that these buckets do have entries in 'radosgw-admin
> > datalog list', and my guess is this could be fixed by doing a
> > 'radosgw-admin datalog trim' for each entry on the master zone.
> >
> > Does that sound right?  :-)
> >
> > Thanks again for the detailed explanation,
> >
> > Trey Palmer
> >
> > On Tue, Mar 5, 2019 at 5:55 PM Casey Bodley <cbod...@redhat.com> wrote:
> >
> > Hi Christian,
> >
> > I think you've correctly intuited that the issues are related to
> > the use
> > of 'bucket sync disable'. There was a bug fix for that feature in
> > http://tracker.ceph.com/issues/26895, and I recently found that a
> > block
> > of code was missing from its luminous backport. That missing code is
> > what handled those "ERROR: init sync on  failed,
> > retcode=-2" errors.
> >
> > I included a fix for that in a later backport
> > (https://github.com/ceph/ceph/pull/26549), which I'm still working to
> > get through qa. I'm afraid I can't really recommend a workaround
> > for the
> > issue in the meantime.
> >
> > Looking forward though, we do plan to support something like s3's
> > cross
> > region replication so you can enable replication on a specific bucket
> > without having to enable it globally.
> >
> > Casey
> >
> >
> > On 3/5/19 2:32 PM, Christian Rice wrote:
> > >
> > > Much appreciated.  We’ll continue to poke around and certainly will
> > > disable the dynamic resharding.
> > >
> > > We started with 12.2.8 in production.  We definitely did not
> > have it
> > > enabled in ceph.conf
> > >
> > > *From: *Matthew H <matthew.he...@hotmail.com>
> > > *Date: *Tuesday, March 5, 2019 at 11:22 AM
> > > *To: *Christian Rice <cr...@pandora.com>, ceph-users
> > > <ceph-users@lists.ceph.com>

Re: [ceph-users] radosgw sync falling behind regularly

2019-03-06 Thread Trey Palmer
Casey,

You are spot on that almost all of these are deleted buckets.   At some
point in the last few months we deleted and replaced buckets with
underscores in their names,  and those are responsible for most of these
errors.

Thanks very much for the reply and explanation.  We’ll give ‘data sync
init’ a try.
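
For the record, this is the rough per-zone sequence we're planning, per
Casey's note (a sketch only -- the source-zone names are ours, and the exact
systemd unit name for the gateway will differ per host):

radosgw-admin data sync init --source-zone=dc11-prod
radosgw-admin data sync init --source-zone=sv5-corp
systemctl restart ceph-radosgw@rgw.sv3-ceph-rgw1    # then restart the local gateway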

— Trey


On Wed, Mar 6, 2019 at 11:47 AM Casey Bodley  wrote:

> Hi Trey,
>
> I think it's more likely that these stale metadata entries are from
> deleted buckets, rather than accidental bucket reshards. When a bucket
> is deleted in a multisite configuration, we don't delete its bucket
> instance because other zones may still need to sync the object deletes -
> and they can't make progress on sync if the bucket metadata disappears.
> These leftover bucket instances look the same to the 'reshard
> stale-instances' commands, but I'd be cautious about using that to
> remove them in multisite, as it may cause more sync errors and
> potentially leak storage if they still contain objects.
>
> Regarding 'datalog trim', that alone isn't safe because it could trim
> entries that hadn't been applied on other zones yet, causing them to
> miss some updates. What you can do is run 'data sync init' on each zone,
> and restart gateways. This will restart with a data full sync (which
> will scan all buckets for changes), and skip past any datalog entries
> from before the full sync. I was concerned that the bug in error
> handling (ie "ERROR: init sync on...") would also affect full sync, but
> that doesn't appear to be the case - so I do think that's worth trying.
>
> On 3/5/19 6:24 PM, Trey Palmer wrote:
> > Casey,
> >
> > Thanks very much for the reply!
> >
> > We definitely have lots of errors on sync-disabled buckets and the
> > workaround for that is obvious (most of them are empty anyway).
> >
> > Our second form of error is stale buckets.  We had dynamic resharding
> > enabled but have now disabled it (having discovered it was on by
> > default, and not supported in multisite).
> >
> > We removed several hundred stale buckets via 'radosgw-admin reshard
> > stale-instances rm', but they are still giving us sync errors.
> >
> > I have found that these buckets do have entries in 'radosgw-admin
> > datalog list', and my guess is this could be fixed by doing a
> > 'radosgw-admin datalog trim' for each entry on the master zone.
> >
> > Does that sound right?  :-)
> >
> > Thanks again for the detailed explanation,
> >
> > Trey Palmer
> >
> > On Tue, Mar 5, 2019 at 5:55 PM Casey Bodley <cbod...@redhat.com> wrote:
> >
> > Hi Christian,
> >
> > I think you've correctly intuited that the issues are related to
> > the use
> > of 'bucket sync disable'. There was a bug fix for that feature in
> > http://tracker.ceph.com/issues/26895, and I recently found that a
> > block
> > of code was missing from its luminous backport. That missing code is
> > what handled those "ERROR: init sync on  failed,
> > retcode=-2" errors.
> >
> > I included a fix for that in a later backport
> > (https://github.com/ceph/ceph/pull/26549), which I'm still working to
> > get through qa. I'm afraid I can't really recommend a workaround
> > for the
> > issue in the meantime.
> >
> > Looking forward though, we do plan to support something like s3's
> > cross
> > region replication so you can enable replication on a specific bucket
> > without having to enable it globally.
> >
> > Casey
> >
> >
> > On 3/5/19 2:32 PM, Christian Rice wrote:
> > >
> > > Much appreciated.  We’ll continue to poke around and certainly will
> > > disable the dynamic resharding.
> > >
> > > We started with 12.2.8 in production.  We definitely did not
> > have it
> > > enabled in ceph.conf
> > >
> > > *From: *Matthew H <matthew.he...@hotmail.com>
> > > *Date: *Tuesday, March 5, 2019 at 11:22 AM
> > > *To: *Christian Rice <cr...@pandora.com>, ceph-users
> > > <ceph-users@lists.ceph.com>
> > > *Cc: *Trey Palmer <nerdmagic...@gmail.com>
> > > *Subject: *Re: radosgw sync falling behind regularly
> > >
> > > Hi Christian,
> > 

Re: [ceph-users] radosgw sync falling behind regularly

2019-03-05 Thread Trey Palmer
Casey,

Thanks very much for the reply!

We definitely have lots of errors on sync-disabled buckets and the
workaround for that is obvious (most of them are empty anyway).

Our second form of error is stale buckets.  We had dynamic resharding
enabled but have now disabled it (having discovered it was on by default,
and not supported in multisite).

We removed several hundred stale buckets via 'radosgw-admin reshard
stale-instances rm', but they are still giving us sync errors.

I have found that these buckets do have entries in 'radosgw-admin datalog
list', and my guess is this could be fixed by doing a 'radosgw-admin
datalog trim' for each entry on the master zone.

Does that sound right?  :-)

Thanks again for the detailed explanation,

Trey Palmer

On Tue, Mar 5, 2019 at 5:55 PM Casey Bodley  wrote:

> Hi Christian,
>
> I think you've correctly intuited that the issues are related to the use
> of 'bucket sync disable'. There was a bug fix for that feature in
> http://tracker.ceph.com/issues/26895, and I recently found that a block
> of code was missing from its luminous backport. That missing code is
> what handled those "ERROR: init sync on  failed,
> retcode=-2" errors.
>
> I included a fix for that in a later backport
> (https://github.com/ceph/ceph/pull/26549), which I'm still working to
> get through qa. I'm afraid I can't really recommend a workaround for the
> issue in the meantime.
>
> Looking forward though, we do plan to support something like s3's cross
> region replication so you can enable replication on a specific bucket
> without having to enable it globally.
>
> Casey
>
>
> On 3/5/19 2:32 PM, Christian Rice wrote:
> >
> > Much appreciated.  We’ll continue to poke around and certainly will
> > disable the dynamic resharding.
> >
> > We started with 12.2.8 in production.  We definitely did not have it
> > enabled in ceph.conf
> >
> > *From: *Matthew H 
> > *Date: *Tuesday, March 5, 2019 at 11:22 AM
> > *To: *Christian Rice , ceph-users
> > 
> > *Cc: *Trey Palmer 
> > *Subject: *Re: radosgw sync falling behind regularly
> >
> > Hi Christian,
> >
> > To be on the safe side and future proof yourself will want to go ahead
> > and set the following in your ceph.conf file, and then issue a restart
> > to your RGW instances.
> >
> > rgw_dynamic_resharding = false
> >
> > There are a number of issues with dynamic resharding, multisite rgw
> > problems being just one of them. However I thought it was disabled
> > automatically when multisite rgw is used (but I will have to double
> > check the code on that). What version of Ceph did you initially
> > install the cluster with? Prior to v12.2.2 this feature was enabled by
> > default for all rgw use cases.
> >
> > Thanks,
> >
> > 
> >
> > *From:*Christian Rice 
> > *Sent:* Tuesday, March 5, 2019 2:07 PM
> > *To:* Matthew H; ceph-users
> > *Subject:* Re: radosgw sync falling behind regularly
> >
> > Matthew, first of all, let me say we very much appreciate your help!
> >
> > So I don’t think we turned dynamic resharding on, nor did we manually
> > reshard buckets. Seems like it defaults to on for luminous but the
> > mimic docs say it’s not supported in multisite.  So do we need to
> > disable it manually via tell and ceph.conf?
> >
> > Also, after running the command you suggested, all the stale instances
> > are gone… these from my examples were in the output:
> >
> > "bucket_instance":
> > "sysad_task/sysad-task:1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18467.303",
> >
> > "bucket_instance":
> > "sysad_task/sysad-task:1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18330.299",
> >
> > "bucket_instance":
> > "sysad_task/sysad-task:1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18330.301",
> >
> > Though we still get lots of log messages like so in rgw:
> >
> > 2019-03-05 11:01:09.526120 7f64120ae700  0 ERROR: failed to get bucket
> > instance info for
> >
> .bucket.meta.sysad_task:sysad-task:1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18330.299
> >
> > 2019-03-05 11:01:09.528664 7f63e5016700  1 civetweb: 0x55976f1c2000:
> > 172.17.136.17 - - [05/Mar/2019:10:54:06 -0800] "GET
> >
> /admin/metadata/bucket.instance/sysad_task/sysad-task:1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18330.299?key=sysad_task%2Fsysad-task%3A1e27bf9c-3a2f-4845-85b6-33a24bbe1c04.18330.299&rgwx-zonegroup=d

Re: [ceph-users] radosgw sync falling behind regularly

2019-03-05 Thread Trey Palmer
Hi Matthew,

I work with Christian.  Thanks so much for looking at this.

We have a huge stale-instances list from that command.

Our periods are all the same; I redirected them to a file on each node and
checksummed them.  Here's the period:

{
"id": "3d0d40ef-90de-40ea-8c44-caa20ea8dc53",
"epoch": 16,
"predecessor_uuid": "926c74c7-c1a7-46b1-9f25-eb5c392a7fbb",
"sync_status": [],
"period_map": {
"id": "3d0d40ef-90de-40ea-8c44-caa20ea8dc53",
"zonegroups": [
{
"id": "de6af748-1a2f-44a1-9d44-30799cf1313e",
"name": "us",
"api_name": "us",
"is_master": "true",
"endpoints": [
"http://sv5-ceph-rgw1.savagebeast.com:8080";
],
"hostnames": [],
"hostnames_s3website": [],
"master_zone": "1e27bf9c-3a2f-4845-85b6-33a24bbe1c04",
"zones": [
{
"id": "107d29a0-b732-4bf1-a26e-1f64f820e839",
"name": "dc11-prod",
"endpoints": [
"http://dc11-ceph-rgw1:8080";
],
"log_meta": "false",
"log_data": "true",
"bucket_index_max_shards": 0,
"read_only": "false",
"tier_type": "",
"sync_from_all": "true",
"sync_from": []
},
{
"id": "1e27bf9c-3a2f-4845-85b6-33a24bbe1c04",
"name": "sv5-corp",
"endpoints": [
"http://sv5-ceph-rgw1.savagebeast.com:8080";
],
"log_meta": "false",
"log_data": "true",
"bucket_index_max_shards": 0,
"read_only": "false",
"tier_type": "",
"sync_from_all": "true",
"sync_from": []
},
{
"id": "331d3f1e-1b72-4c56-bb5a-d1d0fcf6d0b8",
"name": "sv3-prod",
"endpoints": [
"http://sv3-ceph-rgw1:8080";
],
"log_meta": "false",
"log_data": "true",
"bucket_index_max_shards": 0,
"read_only": "false",
"tier_type": "",
"sync_from_all": "true",
"sync_from": []
}
],
"placement_targets": [
{
"name": "default-placement",
"tags": []
}
],
"default_placement": "default-placement",
"realm_id": "b3e2afe7-2254-494a-9a34-ce50358779fd"
}
],
"short_zone_ids": [
{
"key": "107d29a0-b732-4bf1-a26e-1f64f820e839",
"val": 1720993486
},
{
"key": "1e27bf9c-3a2f-4845-85b6-33a24bbe1c04",
"val": 2301637458
},
{
"key": "331d3f1e-1b72-4c56-bb5a-d1d0fcf6d0b8",
"val": 1449486239
}
]
},
"master_zonegroup": "de6af748-1a2f-44a1-9d44-30799cf1313e",
"master_zone": "1e27bf9c-3a2f-4845-85b6-33a24bbe1c04",
"period_config": {
"bucket_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1,
"max_size_kb": 0,
"max_objects": -1
},
"user_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1,
"max_size_kb": 0,
"max_objects": -1
}
},
"realm_id": "b3e2afe7-2254-494a-9a34-ce50358779fd",
"realm_name": "savagebucket",
"realm_epoch": 2
}
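
For anyone repeating the comparison, the checksum step was roughly this on
each rgw node (the path is arbitrary):

radosgw-admin period get > /tmp/period.json
md5sum /tmp/period.json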




On Tue, Mar 5, 2019 at 7:31 AM Matthew H  wrote:

> Hi Christian,
>
> You haven't resharded any of your buckets have you?  You can run the
> command below in v12.2.11 to list stale bucket instances.
>
> radosgw-admin reshard stale-instances list
>
> Can you also send the output from the following command on each rgw?
>
> radosgw-admin period get
>
>
>
> --
> *From:* Christian Rice 
> *Sent:* Tuesday, March 5, 2019 1:46 AM
> *To:* Matthew H; ceph-users
> *Subject:* Re: radosgw sync falling behind regularly
>
>
> sure thing.
>
>
>
> sv5-ceph-rgw1
>
> zonegroup get
>
> {
>
> "id": "de6af748-1a2f-44a1-9d44-30799cf1313e",
>
> "name": "us",
>
> "api_name": "us",
>
> "is_master": "true",
>
> "endpoints": [
>
> "http://sv5-ceph-r

Re: [ceph-users] Question about RadosGW subusers

2017-04-13 Thread Trey Palmer
Anton,

It turns out that Adam Emerson is trying to get bucket policies and roles
merged in time for Luminous:

https://github.com/ceph/ceph/pull/14307

Given this, I think we will only be using subusers temporarily as a method
to track which human or service did what in which bucket.  This seems to us
much easier than trying to deal with ACLs without any concept of groups,
roles, or policies, in buckets that can often have millions of objects.

Here is the general idea:


1.  Each bucket has a user ("master user"), but we don't use or issue that
set of keys at all.

radosgw-admin user create --uid=mybucket --display-name="My Bucket"

You can of course have multiple buckets per user but so far for us it has
been simple to have one user per bucket, with the username the same as the
bucket name.   If a human needs access to more than one bucket, we will
create multiple subusers for them.   That's not convenient, but it's
temporary.

So what we're doing is effectively making the user into the group, with the
subusers being the users, and each user only capable of being in one group.
  Very suboptimal, but better than the total chaos that would result from
giving everyone the same set of keys for a given bucket.


2.  For each human user or service/machine user of that bucket, we create
subusers.You can do this via:

## full-control ops user
radosgw-admin subuser create --uid=mybucket --subuser=mybucket:alice
--access=full --gen-access-key --gen-secret --key-type=s3

## write-only server user
radosgw-admin subuser create --uid=mybucket --subuser=mybucket:daemon
--access=write --gen-access-key --gen-secret --key-type=s3

If you then do a "radosgw-admin metadata get user:mybucket", the JSON
output contains the subusers and their keys.
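
Abridged, the relevant part of that output looks roughly like this
(illustrative values only -- the exact field layout can vary by version):

    "subusers": [
        { "id": "mybucket:alice", "permissions": "full-control" },
        { "id": "mybucket:daemon", "permissions": "write" }
    ],
    "keys": [
        { "user": "mybucket:alice", "access_key": "...", "secret_key": "..." },
        { "user": "mybucket:daemon", "access_key": "...", "secret_key": "..." }
    ],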


3.  Raise the RGW log level in ceph.conf to make an "access key id" line
available for each request, which you can then map to a subuser if/when you
need to track who did what after the fact.  In ceph.conf:

debug_rgw = 10/10
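
This can go under the gateway's own section so only rgw gets the extra
verbosity; the section name depends on how your rgw instance is named, e.g.:

[client.rgw.<instance-name>]
debug_rgw = 10/10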

This will cause the logs to be VERY verbose, an order of magnitude and some
change more verbose than default.   We plan to discard most of the logs
while feeding them into ElasticSearch.

We might not need this much log verbosity once we have policies and are
using unique users rather than subusers.

Nevertheless, I hope we can eventually reduce the log level of the "access
key id" line, as we have a pretty mainstream use case and I'm certain that
tracking S3 request users will be required for many organizations for
accounting and forensic purposes just as it is for us.

-- Trey

On Thu, Apr 13, 2017 at 1:29 PM,  wrote:

> Hey Trey.
>
> Sounds great, we were discussing the same kind of requirements and
> couldn't agree on/find something "useful"... so THANK YOU for sharing!!!
>
> It would be great if you could provide some more details or an example how
> you configure the "bucket user" and sub-users and all that stuff.
> Even more interesting for me, how do the "different ppl or services"
> access that buckets/objects afterwards?! I mean via which tools (s3cmd,
> boto, cyberduck, mix of some, ...) and are there any ACLs set/in use as
> well?!
>
> (sorry if this all sounds somehow dumb but I'm a just a novice ;) )
>
> best
>  Anton
>
>
> Gesendet: Dienstag, 11. April 2017 um 00:17 Uhr
> Von: "Trey Palmer" 
> An: ceph-us...@ceph.com
> Betreff: [ceph-users] Question about RadosGW subusers
>
> Probably a question for @yehuda :
>
>
> We have fairly strict user accountability requirements.  The best way we
> have found to meet them with S3 object storage on Ceph is by using RadosGW
> subusers.
>
> If we set up one user per bucket, then set up subusers to provide separate
> individual S3 keys and access rights for different people or services using
> that bucket, then we can track who did what via access key in the RadosGW
> logs (at debug_rgw = 10/10).
>
> Of course, this is not a documented use case for subusers.  I'm wondering
> if Yehuda or anyone else could estimate our risk of future incompatibility
> if we implement user/key management around subusers in this manner?
>
> Thanks,
>
> Trey___ ceph-users mailing
> list ceph-users@lists.ceph.com http://lists.ceph.com/
> listinfo.cgi/ceph-users-ceph.com
>
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Question about RadosGW subusers

2017-04-10 Thread Trey Palmer
Probably a question for @yehuda :

We have fairly strict user accountability requirements.  The best way we
have found to meet them with S3 object storage on Ceph is by using RadosGW
subusers.

If we set up one user per bucket, then set up subusers to provide separate
individual S3 keys and access rights for different people or services using
that bucket, then we can track who did what via access key in the RadosGW
logs (at debug_rgw = 10/10).

Of course, this is not a documented use case for subusers.  I'm wondering
if Yehuda or anyone else could estimate our risk of future incompatibility
if we implement user/key management around subusers in this manner?

Thanks,

Trey
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] how-to undo a "multisite" config

2017-04-07 Thread Trey Palmer
Hi Anton,

I'm not sure exactly what you're trying to do.  If you want to delete
everything and start over, then just remove the zones, zonegroups and
realms on both sides, and remove their pools.

If you have a master zone you want to keep, but you want to remove the
non-master zone that is mirroring it multisite, then do this:

1.  Stop the offending radosgw service(s) on the non-master cluster.
2.  Delete the zone on the non-master cluster, "radosgw-admin zone delete
--rgw-zone=myzone-b".
3.  "radosgw-admin zonegroup get" > tempfile
4.  Edit tempfile and remove the non-empty zone from the JSON.
5.  "radosgw-admin zonegroup set" < tempfile
6.  Update the period:   radosgw-admin period update --commit

Then I believe you can just delete the RGW pools on the non-master side and
start over if you need to.
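
Roughly, steps 2-6 plus the pool cleanup come down to something like this
(a sketch only -- the pool name below is just an example and will differ on
your cluster):

radosgw-admin zone delete --rgw-zone=myzone-b
radosgw-admin zonegroup get > /tmp/zg.json
# edit /tmp/zg.json and remove the myzone-b entry from "zones"
radosgw-admin zonegroup set < /tmp/zg.json
radosgw-admin period update --commit
ceph osd pool delete myzone-b.rgw.buckets.data myzone-b.rgw.buckets.data --yes-i-really-really-mean-it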

I found the multisite documentation accurate in Jewel if you're just doing
a single zonegroup/realm.   I concede that it's complex and confusing,
though, and the man page and other documentation lag far behind.   I have
found it useful to consult the options parsing at the top of the
radosgw-admin source code a few times.

If you are stuck at something specific, shoot me an email.

-- Trey

On Mon, Apr 3, 2017 at 3:44 PM,  wrote:

> Hi Cephers.
>
> Quick question couldn't find a "how-to" or "docu"... not even sure if
> someone else ever had to do it...
>
> What would be the steps to make a (failed) multisite config change,
> exactly following
> - http://docs.ceph.com/docs/master/radosgw/multisite/
> undone again?
>
> And as I'm on that topic now, any other WORKING documentation how to setup
> "multisite"?
>
> Thanks & regards
>  Anton
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Latency between datacenters

2017-02-08 Thread Trey Palmer
The multisite configuration available starting in Jewel sounds more
appropriate for your situation.

But then you need two separate clusters, each large enough to contain all
of your objects.

   -- Trey

On Tue, Feb 7, 2017 at 12:17 PM, Daniel Picolli Biazus 
wrote:

> Hi Guys,
>
> I have been planning to deploy a Ceph Cluster with the following hardware:
>
> *OSDs:*
>
> 4 Servers Xeon D 1520 / 32 GB RAM / 5 x 6TB SAS 2 (6 OSD daemon per server)
>
> Monitor/Rados Gateways
>
> 5 Servers Xeon D 1520 32 GB RAM / 2 x 1TB SAS 2 (5 MON daemon/ 4 rados
> daemon)
>
> Usage: Object Storage only
>
> However I need to deploy 2 OSD and 3 MON Servers in Miami datacenter
> and another 2 OSD and 2 MON Servers in Montreal Datacenter. The latency
> between these datacenters is 50 milliseconds.
>Considering this scenario, should I use Federated Gateways or should I
> use a single Cluster ?
>
> Thanks in advance
> Daniel
>
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph Monitoring

2017-01-17 Thread Trey Palmer
Just going into production now with a large-ish multisite radosgw setup on
10.2.   We are starting off by alerting on anything that isn't HEALTH_OK,
just to see how things go.   If we get HEALTH_WARN but no mons or OSDs are
down then it will be a low-level alert.   We will massage scripts to pick
up on different conditions.
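
A minimal version of the kind of check we're wiring up (a sketch -- it
assumes a local ceph admin keyring and mail(1); the alert routing and the
ops@example.com address are placeholders):

#!/bin/sh
status=$(ceph health 2>&1)
case "$status" in
  HEALTH_OK*)   : ;;                                                       # healthy, stay quiet
  HEALTH_WARN*) echo "$status" | mail -s "ceph WARN"  ops@example.com ;;   # low-level alert
  *)            echo "$status" | mail -s "ceph ERROR" ops@example.com ;;   # HEALTH_ERR or no answer
esac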

We're using graphite via collectd for visualization.

-- Trey


On Fri, Jan 13, 2017 at 3:15 PM, Chris Jones  wrote:

> General question/survey:
>
> Those that have larger clusters, how are you doing alerting/monitoring?
> Meaning, do you trigger off of 'HEALTH_WARN', etc? Not really talking about
> collectd related but more on initial alerts of an issue or potential issue?
> What threshold do you use basically? Just trying to get a pulse of what
> others are doing.
>
> Thanks in advance.
>
> --
> Best Regards,
> Chris Jones
> ​Bloomberg​
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Memory leak in radosgw

2016-10-24 Thread Trey Palmer
Updating to libcurl 7.44 fixed the memory leak issue.   Thanks for the tip,
Ben.

FWIW this was a massive memory leak; it rendered the system unusable in my
testing.   RGW multisite simply will not work with the current CentOS/RHEL 7
libcurl.

Seems like there are a lot of different problems caused by libcurl
bugs/incompatibilities.

   -- Trey

On Fri, Oct 21, 2016 at 11:04 AM, Trey Palmer  wrote:

> Hi Ben,
>
> I previously hit this bug:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1327142
>
> So I updated from libcurl 7.29.0-25 to the new update package libcurl
> 7.29.0-32 on RHEL 7, which fixed the deadlock problem.
>
> I had not seen the issue you linked.   It doesn't seem directly related,
> since my problem is a memory leak and not CPU.   Clearly, though, older
> libcurl versions remain problematic for multiple reasons, so I'll give a
> newer one a try.
>
> Thanks for the input!
>
>-- Trey
>
>
>
> On Fri, Oct 21, 2016 at 3:21 AM, Ben Morrice  wrote:
>
>> What version of libcurl are you using?
>>
>> I was hitting this bug with RHEL7/libcurl 7.29 which could also be your
>> catalyst.
>>
>> http://tracker.ceph.com/issues/15915
>>
>> Kind regards,
>>
>> Ben Morrice
>>
>> __
>> Ben Morrice | e: ben.morr...@epfl.ch | t: +41-21-693-9670
>> EPFL ENT CBS BBP
>> Biotech Campus
>> Chemin des Mines 9
>> 1202 Geneva
>> Switzerland
>>
>> On 20/10/16 21:41, Trey Palmer wrote:
>>
>> I've been trying to test radosgw multisite and have a pretty bad memory
>> leak.  It appears to be associated only with multisite sync.
>>
>> Multisite works well for small numbers of objects.  However, it all
>> fell over when I wrote 8M 64KB objects into two buckets overnight for
>> testing (via cosbench).
>>
>> The leak appears to happen on the multisite transfer source -- that is, the
>> node where the objects were written originally.   The radosgw process
>> eventually dies, I'm sure via the OOM killer, and systemd restarts it.
>> Then repeat, though multisite sync pretty much stops at that point.
>>
>> I have tried 10.2.2, 10.2.3 and a combination of the two.   I'm running on
>> CentOS 7.2, using civetweb with SSL.   I saw that the memory profiler only
>> works on mon, osd and mds processes.
>>
>> Anyone else seen anything like this?
>>
>>-- Trey
>>
>>
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Memory leak in radosgw

2016-10-21 Thread Trey Palmer
Hi Ben,

I previously hit this bug:

https://bugzilla.redhat.com/show_bug.cgi?id=1327142

So I updated from libcurl 7.29.0-25 to the new update package libcurl
7.29.0-32 on RHEL 7, which fixed the deadlock problem.

I had not seen the issue you linked.   It doesn't seem directly related,
since my problem is a memory leak and not CPU.   Clearly, though, older
libcurl versions remain problematic for multiple reasons, so I'll give a
newer one a try.

Thanks for the input!

   -- Trey



On Fri, Oct 21, 2016 at 3:21 AM, Ben Morrice  wrote:

> What version of libcurl are you using?
>
> I was hitting this bug with RHEL7/libcurl 7.29 which could also be your
> catalyst.
>
> http://tracker.ceph.com/issues/15915
>
> Kind regards,
>
> Ben Morrice
>
> __
> Ben Morrice | e: ben.morr...@epfl.ch | t: +41-21-693-9670
> EPFL ENT CBS BBP
> Biotech Campus
> Chemin des Mines 9
> 1202 Geneva
> Switzerland
>
> On 20/10/16 21:41, Trey Palmer wrote:
>
> I've been trying to test radosgw multisite and have a pretty bad memory
> leak.  It appears to be associated only with multisite sync.
>
> Multisite works well for small numbers of objects.  However, it all
> fell over when I wrote 8M 64KB objects into two buckets overnight for
> testing (via cosbench).
>
> The leak appears to happen on the multisite transfer source -- that is, the
> node where the objects were written originally.   The radosgw process
> eventually dies, I'm sure via the OOM killer, and systemd restarts it.
> Then repeat, though multisite sync pretty much stops at that point.
>
> I have tried 10.2.2, 10.2.3 and a combination of the two.   I'm running on
> CentOS 7.2, using civetweb with SSL.   I saw that the memory profiler only
> works on mon, osd and mds processes.
>
> Anyone else seen anything like this?
>
>-- Trey
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Memory leak in radosgw

2016-10-20 Thread Trey Palmer
I've been trying to test radosgw multisite and have a pretty bad memory
leak.  It appears to be associated only with multisite sync.

Multisite works well for small numbers of objects.  However, it all
fell over when I wrote 8M 64KB objects into two buckets overnight for
testing (via cosbench).

The leak appears to happen on the multisite transfer source -- that is, the
node where the objects were written originally.   The radosgw process
eventually dies, I'm sure via the OOM killer, and systemd restarts it.
Then repeat, though multisite sync pretty much stops at that point.

I have tried 10.2.2, 10.2.3 and a combination of the two.   I'm running on
CentOS 7.2, using civetweb with SSL.   I saw that the memory profiler only
works on mon, osd and mds processes.

Anyone else seen anything like this?

   -- Trey
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com