Thanks for the detailed response. I have made some changes in my message queue handling and it does perform better. I am still not sure that increasing the listen backlog is necessary, but I think it would be good to have the option to do it without patching the C++ container implementation. Have I missed something that already makes this possible? Could I provide my own container that overrides some method to change only the listen setup?
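To make the question concrete: as far as I can tell the backlog only appears in the pn_proactor_listen call that the container makes internally, so the alternative to patching seems to be driving the listener through the C proactor API myself. A rough sketch of what I mean (not code I am running; the address and backlog are just example values, and the event loop is heavily trimmed):

    // Sketch: listen via the C proactor API so the backlog is an explicit
    // argument instead of the container's hardcoded 16. Error handling and
    // AMQP connection handling are omitted.
    #include <proton/event.h>
    #include <proton/listener.h>
    #include <proton/proactor.h>

    int main() {
        pn_proactor_t *proactor = pn_proactor();
        pn_listener_t *listener = pn_listener();

        // Same call the container makes internally, but with a larger backlog.
        pn_proactor_listen(proactor, listener, "0.0.0.0:5672", /*backlog*/ 1024);

        bool running = true;
        while (running) {
            pn_event_batch_t *batch = pn_proactor_wait(proactor);
            pn_event_t *e;
            while ((e = pn_event_batch_next(batch)) != NULL) {
                switch (pn_event_type(e)) {
                case PN_LISTENER_ACCEPT:
                    // Accept the incoming connection; a real server would set up
                    // its AMQP handling for the connection here.
                    pn_listener_accept2(pn_event_listener(e), NULL, NULL);
                    break;
                case PN_PROACTOR_INACTIVE:   // nothing left to do
                    running = false;
                    break;
                default:
                    break;
                }
            }
            pn_proactor_done(proactor, batch);
        }
        pn_proactor_free(proactor);
        return 0;
    }

If the container already has a supported hook for this, I would much rather use that.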
On Tue, Jun 7, 2022 at 9:59 AM Robbie Gemmell <robbie.gemm...@gmail.com> wrote:
> There's a fair bit of detail there which wasn't present originally.
>
> So it sounds like you are indeed only using the single thread, by scheduling a regular (how often is regular?) periodic task on it; since that wasn't mentioned, it was unclear that was the case (usually it means it isn't the case). That periodic task then effectively schedules return tasks for (individual message?) responses in the batch of queued requests by passing tasks using the connection work queue.
>
> That actually seems unnecessary if you are already running everything on the connection/container thread anyway; the connection work queue is typically used to pass work to a connection's [container] thread from a different thread, rather than to pass work to itself. It's not clear you really gain anything in this case with the work queue if it's the same lone thread doing all the work: mainly it is just doing the same work at a different time (i.e. later) than it might otherwise have done it, plus there is then additional scheduling overhead and work needed to service the additional tasks. Also, if there is a batched backlog of incoming requests to respond to periodically, that could increase the amount of time the thread needs to spend working through that sequence and then processing the response tasks, perhaps making it less available to accept new things while it does so for that now-grouped work, at which point the acceptor backlog might come more into play. I would have a play with either changing/removing the use of the work queue for responses if you are really using a single thread, or changing the handling so another thread actually does the work and only the send is passed back (maybe even a batch of sends in one task). Or perhaps use multiple container threads so connections aren't all on the same one, but again drop the work queue usage and just process the response inline.
>
> All that said, if your "processing X is quick" is true (how quick is quick?), then I still wouldn't really expect to see delays of anything like the length you are talking about, unless something else odd is going on, such as the prior talk of reconnection. Although if that was previously occurring and you have significantly raised the backlog, it should no longer be in play, which should be very simple to identify from the timings of what is happening. I'd suggest instrumenting your code to get a more precise handle on where the time is going: you should be able to tell exactly how long it takes for every connection to open from the client's perspective, for a message to arrive at the server, for it to be acked, for the request to be processed, for the request to arrive at the consumer, etc. Related to that, also note you can run the client and/or server ends with protocol tracing enabled (PN_TRACE_FRM=1) to visualise what traffic is or isn't happening. If you have clients seeing 1-minute delays, that might be fairly visible just by watching as it runs, e.g. run a bunch of clients without tracing as normal, and also manually run some with tracing on and observe.
>
> Perhaps you can narrow down an issue in your code's handling, or perhaps you can establish a case where you think there is actually an issue in Proton, providing a minimal reproducer that shows it.
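Just to check that I understand the hand-off you describe (another thread does the actual handling and only the send is passed back to the connection's thread): I take it you mean something along the lines of the multithreaded client examples that ship with proton, roughly like this? (A sketch with made-up names, not my actual handler; flow control and error handling are omitted.)

    #include <proton/delivery.hpp>
    #include <proton/message.hpp>
    #include <proton/messaging_handler.hpp>
    #include <proton/receiver.hpp>
    #include <proton/sender.hpp>
    #include <proton/tracker.hpp>
    #include <proton/work_queue.hpp>

    #include <thread>

    class server_handler : public proton::messaging_handler {
        proton::sender reply_sender_;   // assumed: a sender opened on this same connection

        void on_message(proton::delivery& d, proton::message& request) override {
            // Work queue of the connection that delivered the request.
            proton::work_queue* wq = &d.receiver().connection().work_queue();

            // Hand the processing to another thread (a real server would use a
            // worker pool rather than a detached thread per message)...
            std::thread([this, wq, request]() {
                proton::message reply;
                reply.to(request.reply_to());
                reply.correlation_id(request.correlation_id());
                reply.body("result");   // placeholder for the real processing

                // ...and pass only the send back to the connection's thread.
                wq->add([this, reply]() { reply_sender_.send(reply); });
            }).detach();
        }
    };

If that is the intended pattern, then I agree the single-thread variant gains nothing from the extra round trip, as you say.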
> On Mon, 6 Jun 2022 at 20:07, Fredrik Hallenberg <megahal...@gmail.com> wrote:
> > Maybe my wording was not correct; responses to clients are handled fine once a connection is established. The issue is only about the time before the connection is made and the initial client message shows up in the server handler. When this happens I will push the message to a queue and return immediately. The queue is handled by a fiber running at regular intervals; this is done using the qpid scheduler. Each message gets its own fiber, which uses the qpid work queue to send a reply when processing is done. This processing should happen quickly. I am pretty sure this system is safe, I have done a lot of testing on it. If you think it will cause delays in qpid I will try to improve it using threads etc. I have tried running the message queue consumer on a separate thread but, as I mentioned, I did not see any obvious improvement, so I opted for a single-thread solution.
> >
> > On Mon, Jun 6, 2022 at 11:29 AM Robbie Gemmell <robbie.gemm...@gmail.com> wrote:
> > > Personally, from the original mail I think it is as likely the issue lies in just how the messages are being handled and the responses generated. If adding threads is not helping any, that would only reinforce that view for me.
> > >
> > > Note is made that a single thread is being used, and that messages are only queued by the thread and "handled elsewhere" "quickly", but "responses" take a long time. What type of queuing is being done? Can the queue block (which would stop the container doing _anything_)? How are the messages actually then being handled and the responses generated, exactly? Unless that process is using the appropriate mechanisms for passing work back to the connection (/container if single) thread, it would both be unsafe and may very well result in delays, because no IO would actually happen until something else entirely caused that connection to process again later (e.g. heartbeat checking).
> > >
> > > On Fri, 3 Jun 2022 at 18:04, Cliff Jansen <cliffjan...@gmail.com> wrote:
> > > > Adding threads should allow connection setup (socket creation, accept, and initial malloc of data structures) to run in parallel with connection processing (socket read/write, TLS overhead, AMQP encode/decode, your application on_message callback).
> > > >
> > > > The epoll proactor scales better with additional threads than the libuv implementation. If you are seeing no benefit with extra threads, trying the libuv proactor is a worthwhile idea.
> > > >
> > > > On Fri, Jun 3, 2022 at 2:38 AM Fredrik Hallenberg <megahal...@gmail.com> wrote:
> > > > > Yes, the fd limit is already raised a lot. Increasing the backlog has improved performance and more file descriptors are in use, but I still feel connection times are too long. Is there anything else to tune in the proactor? Should I try the libuv proactor instead of epoll? I have tried with multiple threads in the past but did not notice any difference; perhaps it is worth trying again with the current backlog setting?
> > > > >
> > > > > On Thu, Jun 2, 2022 at 5:11 PM Cliff Jansen <cliffjan...@gmail.com> wrote:
> > > > > > Please try raising your fd limit too. Perhaps doubling it or more.
> > > > > >
> > > > > > I would also try running your proton::container with more threads, say 4 and then 16, and see if that makes a difference. It shouldn't if your processing within Proton is as minimal as you describe. However, if there is lengthy lock contention as you pass work out of and then back into Proton, that may introduce delays.
> > > > > >
> > > > > > On Thu, Jun 2, 2022 at 7:43 AM Fredrik Hallenberg <megahal...@gmail.com> wrote:
> > > > > > > I have done some experiments raising the backlog value, and it is possibly a bit better; I have to test it more. Even if it works I would of course like to avoid having to rely on a patched qpid. Also, maybe some internal queues or similar should be modified to handle this?
> > > > > > >
> > > > > > > I have not seen transport errors in the clients, but this may be because reconnection is enabled. I am unsure what the reconnection feature actually does; I have never seen an on_connection_open where connection.reconnection() returns true. Perhaps it is only useful when a connection is established and then lost?
> > > > > > >
> > > > > > > On Thu, Jun 2, 2022 at 1:44 PM Ted Ross <tr...@redhat.com> wrote:
> > > > > > > > On Thu, Jun 2, 2022 at 9:06 AM Fredrik Hallenberg <megahal...@gmail.com> wrote:
> > > > > > > > > Hi, my application tends to get a lot of short-lived incoming connections. Messages are very short sync messages that can usually be responded to with very little processing on the server side. It works fine, but I feel that the performance is a bit lacking when many connections happen at the same time, and I would like advice on how to improve it. I am using qpid proton C++ 0.37 with the epoll proactor.
> > > > > > > > >
> > > > > > > > > My current design uses a single thread for the listener, but it will immediately push incoming messages in on_message to a queue that is handled elsewhere. I can see that clients have to wait for a long time (up to a minute) until they get a response, but I don't believe there is an issue on my end, as I will quickly deal with any client messages as soon as they show up. Rather, the issue seems to be that messages are not pushed into the queue quickly enough.
> > > > > > > > >
> > > > > > > > > I have noticed that pn_proactor_listen is hardcoded to use a backlog of 16 in the default container implementation. This seems low, but I am not sure if it is correct to change it. Any advice appreciated. My goal is that a client should never need to wait more than a few seconds for a response even under reasonably high load, maybe a few hundred connections per second.
> > > > > > > >
> > > > > > > > I would try increasing the backlog. 16 seems low to me as well. Do you know if any of your clients are re-trying the connection setup because they overran the server's backlog?
> > > > > > > >
> > > > > > > > -Ted
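PS: if I do retry multiple container threads as suggested earlier in the thread, my understanding is that it only requires giving the container a thread count, roughly like this (a sketch with placeholder names; with more than one thread, callbacks for different connections can run concurrently, so my shared queue would need its own locking):

    #include <proton/container.hpp>
    #include <proton/delivery.hpp>
    #include <proton/message.hpp>
    #include <proton/messaging_handler.hpp>

    class multi_thread_handler : public proton::messaging_handler {
        void on_container_start(proton::container& c) override {
            c.listen("0.0.0.0:5672");   // still subject to the container's built-in backlog
        }
        void on_message(proton::delivery& d, proton::message& m) override {
            // With several container threads, on_message for different
            // connections may run concurrently; shared state needs locking.
        }
    };

    int main() {
        multi_thread_handler handler;
        proton::container container(handler);
        container.run(4);   // e.g. four proactor threads instead of the default single thread
        return 0;
    }

Please correct me if there is more to it than that.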