I'm curious how the num_threads option to civetweb relates to the 'rgw
thread pool size'?  Should I make them equal?

i.e.:

rgw frontends = civetweb enable_keep_alive=yes port=80 num_threads=125
error_log_file=/var/log/ceph/civetweb.error.log
access_log_file=/var/log/ceph/civetweb.access.log
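For context, here is a sketch of how both settings might sit together in
ceph.conf. The instance name and the value 512 are purely illustrative,
not recommendations; I'm not certain of the exact interaction between the
two knobs, which is why I'm asking:

```ini
[client.rgw.gw1]
; num_threads caps concurrent connections handled by embedded civetweb
rgw frontends = civetweb enable_keep_alive=yes port=80 num_threads=512
; rgw_thread_pool_size sizes RGW's internal worker pool; one guess is
; to keep it at least as large as num_threads so requests don't queue
rgw thread pool size = 512
```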


-Ben

On Thu, Feb 9, 2017 at 12:30 PM, Wido den Hollander <w...@42on.com> wrote:

>
> > On 9 February 2017 at 19:34, Mark Nelson <mnel...@redhat.com> wrote:
> >
> >
> > I'm not really an RGW expert, but I'd suggest increasing the
> > "rgw_thread_pool_size" option to something much higher than the default
> > 100 threads if you haven't already.  RGW requires at least 1 thread per
> > client connection, so with many concurrent connections some of them
> > might end up timing out.  You can scale the number of threads and even
> > the number of RGW instances on a single server, but at some point you'll
> > run out of threads at the OS level.  Probably before that actually
> > happens though, you'll want to think about multiple RGW gateway nodes
> > behind a load balancer.  Afaik that's how the big sites do it.
> >
>
> In addition, have you tried using more RADOS handles?
>
> rgw_num_rados_handles = 8
>
> Combine that with more RGW threads, as Mark mentioned.
>
> Wido
>
> > I believe some folks are considering trying to migrate rgw to a
> > threadpool/event processing model but it sounds like it would be quite a
> > bit of work.
> >
> > Mark
> >
> > On 02/09/2017 12:25 PM, Benjeman Meekhof wrote:
> > > Hi all,
> > >
> > > We're doing some stress testing with clients hitting our rados gw
> > > nodes with simultaneous connections.  When the number of client
> > > connections exceeds about 5400 we start seeing 403 forbidden errors
> > > and log messages like the following:
> > >
> > > 2017-02-09 08:53:16.915536 7f8c667bc700 0 NOTICE: request time skew
> > > too big now=2017-02-09 08:53:16.000000 req_time=2017-02-09
> > > 08:37:18.000000
> > >
> > > This is version 10.2.5 using embedded civetweb.  There's just one
> > > instance per node, and they all start generating 403 errors and the
> > > above log messages when enough clients start hitting them.  The
> > > hardware is not being taxed at all, negligible load and network
> > > throughput.  OSDs don't show any appreciable increase in CPU load or
> > > io wait on journal/data devices.  Unless I'm missing something it
> > > looks like the RGW is just not scaling to fill out the hardware it is
> > > on.
> > >
> > > Does anyone have advice on scaling RGW to fully utilize a host?
> > >
> > > thanks,
> > > Ben
> > > _______________________________________________
> > > ceph-users mailing list
> > > ceph-users@lists.ceph.com
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > >
>
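Mark's point in the quoted thread about eventually running out of threads
at the OS level can be checked directly. A small shell sketch; the
`radosgw` process name is an assumption based on the usual daemon binary,
so adjust it if your packaging differs:

```shell
# System-wide ceiling on kernel threads
cat /proc/sys/kernel/threads-max

# Per-user limit on processes/threads for the current shell's user
ulimit -u

# Thread count (nlwp) of the oldest running radosgw process, if any;
# prints nothing when no radosgw is running
pgrep -x radosgw | head -1 | xargs -r -I{} ps -o nlwp= -p {}
```

Comparing the radosgw thread count against these limits shows how much
headroom is left before scaling out to more gateway nodes becomes necessary.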
