Re: [ceph-users] tuning ceph mds cache settings

2019-01-29 Thread Jonathan Woytek
Thanks.. I'll give this a shot and we'll see what happens! jonathan On Tue, Jan 29, 2019 at 8:47 AM Yan, Zheng wrote: > On Tue, Jan 29, 2019 at 9:05 PM Jonathan Woytek > wrote: > > > > On Tue, Jan 29, 2019 at 7:12 AM Yan, Zheng wrote: > >> > >>

Re: [ceph-users] tuning ceph mds cache settings

2019-01-29 Thread Jonathan Woytek
h changing contents, of course) and will be pretty easy to distribute amongst the mds with pinning, but then there are a handful of other directories that exist, some of which are created or destroyed at different times. jonathan -- Jonathan Woytek http://www.dryrose.com KB3HOZ PGP: 462C 5F

Re: [ceph-users] tuning ceph mds cache settings

2019-01-09 Thread Jonathan Woytek
On Wed, Jan 9, 2019 at 4:34 PM Patrick Donnelly wrote: > Hello Jonathan, > > On Wed, Jan 9, 2019 at 5:37 AM Jonathan Woytek wrote: > > While working on examining performance under load at scale, I see a > marked performance improvement whenever I would restart certain mds >

[ceph-users] tuning ceph mds cache settings

2019-01-09 Thread Jonathan Woytek
re a few cache options, too, that I want to understand. I know this is long, but hopefully it makes sense and someone can give me a few pointers. If you need additional information to comment, please feel free to ask! -- Jonathan Woytek http://www.dryrose.com KB3HOZ PGP: 462C 5F50 144D 6B09 3B6

Re: [ceph-users] (no subject)

2019-01-09 Thread Jonathan Woytek
, > Mosi > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > -- Jonathan Woytek http://www.dryrose.com KB3HOZ PGP: 462C 5F50 144D 6B09 3B65 FCE8 C1DC DEC4 E8B6 AABC

Re: [ceph-users] MDS stuck in 'rejoin' after network fragmentation caused OSD flapping

2018-08-19 Thread Jonathan Woytek
On Sun, Aug 19, 2018 at 9:29 AM David Turner wrote: > I second that you do not have nearly enough RAM in these servers and I > don't think you have at least 72 CPU cores either which means you again don't > have the minimum recommendation for the amount of OSDs you have, let alone > everything else. I

Re: [ceph-users] MDS stuck in 'rejoin' after network fragmentation caused OSD flapping

2018-08-16 Thread Jonathan Woytek
ndle, and never released any of it until it was killed. jonathan -- Jonathan Woytek http://www.dryrose.com KB3HOZ PGP: 462C 5F50 144D 6B09 3B65 FCE8 C1DC DEC4 E8B6 AABC

Re: [ceph-users] MDS stuck in 'rejoin' after network fragmentation caused OSD flapping

2018-08-16 Thread Jonathan Woytek
, and restarted all four mds daemons. The cluster is back to healthy again. I've got more stuff to write up on our end for recovery procedures now, and that's a good thing! Thanks again! jonathan On Wed, Aug 15, 2018 at 11:12 PM, Jonathan Woytek wrote: > > On Wed, Aug 15, 2018

Re: [ceph-users] MDS stuck in 'rejoin' after network fragmentation caused OSD flapping

2018-08-15 Thread Jonathan Woytek
On Wed, Aug 15, 2018 at 11:02 PM Yan, Zheng wrote: > On Thu, Aug 16, 2018 at 10:55 AM Jonathan Woytek > wrote: > > > > ceph version 13.2.0 (79a10589f1f80dfe21e8f9794365ed98143071c4) mimic > (stable) > > > > > > Try deleting mds0_openfiles.0 (mds1_openf

Re: [ceph-users] MDS stuck in 'rejoin' after network fragmentation caused OSD flapping

2018-08-15 Thread Jonathan Woytek
ceph version 13.2.0 (79a10589f1f80dfe21e8f9794365ed98143071c4) mimic (stable) On Wed, Aug 15, 2018 at 10:51 PM, Yan, Zheng wrote: > On Thu, Aug 16, 2018 at 10:50 AM Jonathan Woytek wrote: >> >> Actually, I missed it--I do see the wipe start, wipe done in the log. >> Howev

Re: [ceph-users] MDS stuck in 'rejoin' after network fragmentation caused OSD flapping

2018-08-15 Thread Jonathan Woytek
Actually, I missed it--I do see the wipe start, wipe done in the log. However, it is still doing verify_diri_backtrace, as described previously. jonathan On Wed, Aug 15, 2018 at 10:42 PM, Jonathan Woytek wrote: > On Wed, Aug 15, 2018 at 9:40 PM, Yan, Zheng wrote: >> How many client re

Re: [ceph-users] MDS stuck in 'rejoin' after network fragmentation caused OSD flapping

2018-08-15 Thread Jonathan Woytek
not seeing any reference to 'wipe', so I'm not sure if it is being honored. Am I putting that in the right place? jonathan -- Jonathan Woytek http://www.dryrose.com KB3HOZ PGP: 462C 5F50 144D 6B09 3B65 FCE8 C1DC DEC4 E8B6 AABC

[ceph-users] MDS stuck in 'rejoin' after network fragmentation caused OSD flapping

2018-08-15 Thread Jonathan Woytek
s when we detect it missing. Is there anything I can do to make this more efficient or help to get the process completed so MDS goes online? jonathan -- Jonathan Woytek http://www.dryrose.com KB3HOZ PGP: 462C 5F50 144D 6B09 3B65 FCE8 C1DC DEC4 E8B6 AABC