On Sep 18, 2013, at 11:50 PM, Gregory Farnum <g...@inktank.com> wrote:
> On Wed, Sep 18, 2013 at 6:33 AM, Dan Van Der Ster
> <daniel.vanders...@cern.ch> wrote:
>> Hi,
>> We just finished debugging a problem with RBD-backed Glance image creation
>> failures, and thought our workaround would be useful for others. Basically,
>> we found that during an image upload, librbd on the glance api server was
>> consuming many many processes, eventually hitting the 1024 nproc limit of
>> non-root users in RHEL. The failure occurred when uploading to pools with
>> 2048 PGs, but didn't fail when uploading to pools with 512 PGs (we're
>> guessing that librbd is opening one thread per accessed-PG, and not closing
>> those threads until the whole process completes.)
>>
>> If you hit this same problem (and you run RHEL like us), you'll need to
>> modify at least /etc/security/limits.d/90-nproc.conf (adding your non-root
>> user that should be allowed > 1024 procs), and then also possibly run ulimit
>> -u in the init script of your client process. Ubuntu should have some
>> similar limits.
>
> Did your pools with 2048 PGs have a significantly larger number of
> OSDs in them? Or are both pools on a pool with a lot of OSDs relative
> to the PG counts?

1056 OSDs at the moment.

Uploading a 14GB image we observed up to ~1500 threads.

We set the glance client to allow 4096 processes for now.

> The PG count shouldn't matter for this directly, but RBD (and other
> clients) will create a couple messenger threads for each OSD it talks
> to, and while they'll eventually shut down on idle it doesn't
> proactively close them. I'd expect this to be a problem around 500
> OSDs.

A couple, is that the upper limit? Should we be safe with ulimit -u 2*nOSDs +1 ??

Cheers, Dan

> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
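For anyone who wants to sanity-check their own client, here is a rough Python sketch of the arithmetic discussed above. It assumes the "couple messenger threads per OSD" figure means two, and it uses the 2*nOSDs + 1 guess from this thread, which nobody has confirmed as an upper bound. The helper names (estimated_threads, nproc_is_sufficient) are made up for illustration; only the resource module calls are standard Python on Linux.

```python
import resource


def estimated_threads(n_osds, threads_per_osd=2, extra=1):
    """Rough estimate only: threads_per_osd messenger threads per OSD, plus
    `extra` for the client itself. The factor of 2 is an assumption taken
    from the 'a couple' remark in the thread, not a documented bound."""
    return threads_per_osd * n_osds + extra


def nproc_is_sufficient(needed, soft_limit):
    """True if the soft nproc limit is unlimited or at least `needed`."""
    return soft_limit == resource.RLIM_INFINITY or soft_limit >= needed


if __name__ == "__main__":
    n_osds = 1056  # cluster size mentioned in this thread
    needed = estimated_threads(n_osds)

    # RLIMIT_NPROC is the limit that `ulimit -u` reports for this process.
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)

    print("estimated threads needed:", needed)
    print("current soft/hard nproc limit:", soft, hard)
    if not nproc_is_sufficient(needed, soft):
        print("soft nproc limit looks too low for this estimate")
```

With 1056 OSDs the estimate sits above the ~1500 threads actually observed and below the 4096 limit set on the glance client, so it is consistent with the numbers reported here, but treat it as a starting point rather than a guarantee.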