Re: [ceph-users] radosgw sync falling behind regularly

2019-03-11 Thread Trey Palmer
. Thanks so much for looking! Trey Palmer root@sv3-ceph-rgw1:/var/log/ceph# tail -f ceph-rgw-sv3-ceph-rgw1.log | grep -i error 2019-03-08 11:43:07.208572 7fa080cc7700 0 data sync: ERROR: failed to read remote data log info: ret=-2 2019-03-08 11:43:07.211348 7fa080cc7700 0 meta sync: ERROR

Re: [ceph-users] radosgw sync falling behind regularly

2019-03-06 Thread Trey Palmer
It appears we eventually got 'data sync init' working. At least, it's worked on 5 of the 6 sync directions in our 3-node cluster. The sixth has not run without an error returned, although 'sync status' does say "preparing for full sync". Thanks, Trey On Wed, Mar 6, 2019 at 1:22 PM T

Re: [ceph-users] radosgw sync falling behind regularly

2019-03-06 Thread Trey Palmer
ffect full sync, but > that doesn't appear to be the case - so I do think that's worth trying. > > On 3/5/19 6:24 PM, Trey Palmer wrote: > > Casey, > > > > Thanks very much for the reply! > > > > We definitely have lots of errors on sync-disabled buckets an

Re: [ceph-users] radosgw sync falling behind regularly

2019-03-06 Thread Trey Palmer
or > handling (ie "ERROR: init sync on...") would also affect full sync, but > that doesn't appear to be the case - so I do think that's worth trying. > > On 3/5/19 6:24 PM, Trey Palmer wrote: > > Casey, > > > > Thanks very much for the reply! > > > > W

Re: [ceph-users] radosgw sync falling behind regularly

2019-03-05 Thread Trey Palmer
be fixed by doing a 'radosgw-admin datalog trim' for each entry on the master zone. Does that sound right? :-) Thanks again for the detailed explanation, Trey Palmer On Tue, Mar 5, 2019 at 5:55 PM Casey Bodley wrote: > Hi Christian, > > I think you've correctly intuited that the issues ar

Re: [ceph-users] radosgw sync falling behind regularly

2019-03-05 Thread Trey Palmer
Hi Matthew, I work with Christian. Thanks so much for looking at this. We have a huge stale-instances list from that command. Our periods are all the same; I redirected them to a file on each node and checksummed them. Here's the period: { "id": "3d0d40ef-90de-40ea-8c44-caa20ea8dc53",

Re: [ceph-users] Question about RadosGW subusers

2017-04-13 Thread Trey Palmer
as > well?! > > (sorry if this all sounds somehow dumb but I'm just a novice ;) ) > > best > Anton > > > Sent: Tuesday, 11 April 2017 at 00:17 > From: "Trey Palmer" <t...@mailchimp.com> > To: ceph-us...@ceph.com > Subject: [ceph-users] Q

[ceph-users] Question about RadosGW subusers

2017-04-10 Thread Trey Palmer
Probably a question for @yehuda : We have fairly strict user accountability requirements. The best way we have found to meet them with S3 object storage on Ceph is by using RadosGW subusers. If we set up one user per bucket, then set up subusers to provide separate individual S3 keys and access

Re: [ceph-users] how-to undo a "multisite" config

2017-04-07 Thread Trey Palmer
Hi Anton, I'm not sure exactly what you're trying to do. If you want to delete everything and start over, then just remove the zones, zonegroups and realms on both sides, and remove their pools. If you have a master zone you want to keep, but you want to remove the non-master zone that is

Re: [ceph-users] Latency between datacenters

2017-02-08 Thread Trey Palmer
The multisite configuration available starting in Jewel sounds more appropriate for your situation. But then you need two separate clusters, each large enough to contain all of your objects. -- Trey On Tue, Feb 7, 2017 at 12:17 PM, Daniel Picolli Biazus wrote: > Hi Guys,

Re: [ceph-users] Ceph Monitoring

2017-01-17 Thread Trey Palmer
Just going into production now with a large-ish multisite radosgw setup on 10.2. We are starting off by alerting on anything that isn't HEALTH_OK, just to see how things go. If we get HEALTH_WARN but no mons or OSDs are down then it will be a low-level alert. We will massage scripts to pick
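
The alerting rule described in this message could be sketched roughly as follows. This is only an illustration, not the poster's actual script: the function name `classify_alert`, its parameters, and the level names "none", "low" and "high" are all hypothetical, and treating every remaining case as "high" is an assumption, since the message does not say what happens outside the HEALTH_WARN case.

```python
def classify_alert(health: str, mons_down: int = 0, osds_down: int = 0) -> str:
    """Map a Ceph health status to an alert level (hypothetical level names)."""
    # HEALTH_OK is the only status that raises no alert at all.
    if health == "HEALTH_OK":
        return "none"
    # HEALTH_WARN with every mon and OSD still up is only a low-level alert.
    if health == "HEALTH_WARN" and mons_down == 0 and osds_down == 0:
        return "low"
    # Assumption: anything else (HEALTH_ERR, or a warning with daemons down)
    # is escalated. The original message does not spell this case out.
    return "high"


if __name__ == "__main__":
    print(classify_alert("HEALTH_WARN"))               # low
    print(classify_alert("HEALTH_WARN", osds_down=2))  # high
```

The status string and the down counts would have to come from whatever the monitoring script already collects; how they are gathered is left out here on purpose.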

Re: [ceph-users] Memory leak in radosgw

2016-10-24 Thread Trey Palmer
by libcurl bugs/incompatibilities. -- Trey On Fri, Oct 21, 2016 at 11:04 AM, Trey Palmer <t...@mailchimp.com> wrote: > Hi Ben, > > I previously hit this bug: > > https://bugzilla.redhat.com/show_bug.cgi?id=1327142 > > So I updated from libcurl 7.29.0-25 to the

Re: [ceph-users] Memory leak in radosgw

2016-10-21 Thread Trey Palmer
rice | e: ben.morr...@epfl.ch | t: +41-21-693-9670 > EPFL ENT CBS BBP > Biotech Campus > Chemin des Mines 9 > 1202 Geneva > Switzerland > > On 20/10/16 21:41, Trey Palmer wrote: > > I've been trying to test radosgw multisite and have a pretty bad memory > leak. It

[ceph-users] Memory leak in radosgw

2016-10-20 Thread Trey Palmer
I've been trying to test radosgw multisite and have a pretty bad memory leak. It appears to be associated only with multisite sync. Multisite works well for a small number of objects. However, it all fell over when I wrote in 8M 64K objects to two buckets overnight for testing (via