Hi Dan,

In svc_vc_recv(), we handle an incomplete receive by rearming the FD and
returning (if xd->sx_fbtbc is not zero). Shouldn't we be doing the same in
the EAGAIN case? epoll is ONESHOT, so new data won't generate new events
until epoll_ctl() is called, right?
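
To illustrate what I mean, here is a minimal standalone sketch of the
generic EPOLLONESHOT behavior (plain epoll on a pipe, not ntirpc code; the
names are made up for the example):

#include <stdio.h>
#include <sys/epoll.h>
#include <unistd.h>

static void arm(int epfd, int fd, int op)
{
        /* (Re)arm fd for one read event; EPOLLONESHOT disables the fd
         * again as soon as a single event has been delivered. */
        struct epoll_event ev = { .events = EPOLLIN | EPOLLONESHOT };
        ev.data.fd = fd;
        if (epoll_ctl(epfd, op, fd, &ev) == -1)
                perror("epoll_ctl");
}

int main(void)
{
        int p[2];
        struct epoll_event ev;

        pipe(p);
        int epfd = epoll_create1(0);

        arm(epfd, p[0], EPOLL_CTL_ADD);         /* initial arm */
        write(p[1], "x", 1);
        printf("first wait:  %d event(s)\n", epoll_wait(epfd, &ev, 1, 100));

        write(p[1], "y", 1);                    /* more data arrives... */
        /* ...but the fd is now disabled, so this times out and reports
         * 0 events even though data is pending: */
        printf("second wait: %d event(s)\n", epoll_wait(epfd, &ev, 1, 100));

        arm(epfd, p[0], EPOLL_CTL_MOD);         /* explicit rearm */
        printf("after rearm: %d event(s)\n", epoll_wait(epfd, &ev, 1, 100));
        return 0;
}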

I tried adding the rearming code in the EAGAIN cases and was able to run
the test without hitting the receive hang.

diff --git a/src/svc_vc.c b/src/svc_vc.c
index f5377df..496444a 100644
--- a/src/svc_vc.c
+++ b/src/svc_vc.c
@@ -680,6 +680,12 @@ svc_vc_recv(SVCXPRT *xprt)
                        code = errno;

                        if (code == EAGAIN || code == EWOULDBLOCK) {
+                               if (unlikely(svc_rqst_rearm_events(xprt))) {
+                                       __warnx(TIRPC_DEBUG_FLAG_ERROR,
+                                               "%s: %p fd %d svc_rqst_rearm_events failed (will set dead)",
+                                               __func__, xprt, xprt->xp_fd);
+                                       SVC_DESTROY(xprt);
+                               }
                                __warnx(TIRPC_DEBUG_FLAG_WARN,
                                        "%s: %p fd %d recv errno %d (try again)",
                                        "svc_vc_wait", xprt, xprt->xp_fd, code);
@@ -731,8 +737,14 @@ svc_vc_recv(SVCXPRT *xprt)
                code = errno;

                if (code == EAGAIN || code == EWOULDBLOCK) {
+                       if (unlikely(svc_rqst_rearm_events(xprt))) {
+                               __warnx(TIRPC_DEBUG_FLAG_ERROR,
+                                       "%s: %p fd %d svc_rqst_rearm_events failed (will set dead)",
+                                       __func__, xprt, xprt->xp_fd);
+                                       SVC_DESTROY(xprt);
+                       }
                        __warnx(TIRPC_DEBUG_FLAG_SVC_VC,
-                               "%s: %p fd %d recv errno %d (try again)",
+                               "%s: %p fd %d recv errno %d (try again) 2",
                                __func__, xprt, xprt->xp_fd, code);
                        return SVC_STAT(xprt);
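
In other words, the shape I'm aiming for is roughly the following (a
hedged sketch of the generic "drain until EAGAIN, then rearm the ONESHOT
fd" pattern; handle_readable and the plain epoll_ctl() call are made-up
stand-ins for what svc_rqst_rearm_events() does internally):

#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Drain a nonblocking fd, then rearm it.  Returns -1 when the caller
 * should tear the connection down (the SVC_DESTROY case above). */
static int handle_readable(int epfd, int fd)
{
        char buf[4096];
        ssize_t n;

        for (;;) {
                n = read(fd, buf, sizeof(buf));
                if (n > 0)
                        continue;               /* consume pending bytes */
                if (n == 0)
                        return -1;              /* peer closed */
                if (errno == EAGAIN || errno == EWOULDBLOCK)
                        break;                  /* drained; must rearm */
                return -1;                      /* hard error */
        }

        /* Without this EPOLL_CTL_MOD, a ONESHOT fd never fires again,
         * which matches the hang I was seeing. */
        struct epoll_event ev = { .events = EPOLLIN | EPOLLONESHOT };
        ev.data.fd = fd;
        return epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev);
}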



On Fri, Jan 26, 2018 at 6:24 AM, Matt Benjamin <mbenj...@redhat.com> wrote:

> Yes, I wasn't claiming there is anything missing.  Before 2.6, there
> was a rearm method being called.
>
> Matt
>
> On Fri, Jan 26, 2018 at 9:20 AM, Daniel Gryniewicz <d...@redhat.com> wrote:
> > I don't think you re-arm a FD in epoll.  You arm it once, and it fires
> > until you disarm it, as far as I know.  You just call epoll_wait() to
> > get new events.
> >
> > The thread model is a bit odd.  When the epoll fires, all the events
> > are found, and a thread is submitted for each one except one.  That one
> > is handled in the local thread (since it's expected that most epoll
> > triggers will have one event on them, thus using the current hot
> > thread).  In addition, a new thread is submitted to go back and wait
> > for events, so there's no delay handling new events.  So EAGAIN is
> > handled by just indicating this thread is done, and returning it to the
> > thread pool.  When the socket is ready again, it will trigger a new
> > event on the thread waiting on the epoll.
> >
> > Bill, please correct me if I'm wrong.
> >
> > Daniel
> >
> >
> > On 01/25/2018 09:13 PM, Matt Benjamin wrote:
> >>
> >> Hmm.  We used to handle that ;)
> >>
> >> Matt
> >>
> >> On Thu, Jan 25, 2018 at 9:11 PM, Pradeep <pradeeptho...@gmail.com> wrote:
> >>>
> >>> If recv() returns EAGAIN, then svc_vc_recv() returns without rearming
> >>> the epoll_fd.  How does it get back to svc_vc_recv() again?
> >>>
> >>> On Wed, Jan 24, 2018 at 9:26 PM, Pradeep <pradeeptho...@gmail.com> wrote:
> >>>>
> >>>>
> >>>> Hello,
> >>>>
> >>>> I seem to be hitting a corner case where ganesha (2.6-rc2) does not
> >>>> respond to a RENEW request from a 4.0 client.  I enabled the debug
> >>>> logs and noticed that the NFS layer has not seen the RENEW request
> >>>> (I can see it in tcpdump).
> >>>>
> >>>> I collected netstat output periodically and found that there is a
> >>>> time window of ~60 sec where the receive buffer size remains the
> >>>> same.  This means the RPC layer somehow missed a 'recv' call.  Now
> >>>> if I enable debug on TIRPC, I can't reproduce the issue.  Any
> >>>> pointers to potential races where I could enable selective prints
> >>>> would be helpful.
> >>>>
> >>>> svc_rqst_epoll_event() resets SVC_XPRT_FLAG_ADDED.  Is it possible
> >>>> for another thread to call svc_rqst_rearm_events() concurrently?  In
> >>>> that case svc_rqst_epoll_event() could reset the flag set by
> >>>> svc_rqst_rearm_events() and complete the current receive before the
> >>>> other thread calls epoll_ctl(), right?
> >>>>
> >>>> Thanks,
> >>>> Pradeep
> >>>
>
>
>
> --
>
> Matt Benjamin
> Red Hat, Inc.
> 315 West Huron Street, Suite 140A
> Ann Arbor, Michigan 48103
>
> http://www.redhat.com/en/technologies/storage
>
> tel.  734-821-5101
> fax.  734-769-8938
> cel.  734-216-5309
>
