I think I've finally got a handle on this flake, and a possible solution [1]. One thing that's still bothering me, though, is that the "CANCELLED: Multiplexer hanging up" errors seem to be unavoidable.
They occur when the GrpcDataService is closed [2] and it closes all of its multiplexers, which just send an error to their outbound observers [3]. It seems to me that there should be a more graceful way to shut everything down, but I'm not seeing it. Am I missing something? grpc-java suggests using GrpcCleanupRule to gracefully shut down in-process servers and clients [4]. Should we be utilizing that somehow?

Brian

[1] https://github.com/apache/beam/pull/7794
[2] https://github.com/apache/beam/blob/master/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/data/GrpcDataService.java#L117
[3] https://github.com/apache/beam/tree/master/sdks/java/fn-execution/src/main/java/org/apache/beam/sdk/fn/data/BeamFnDataGrpcMultiplexer.java#L112
[4] https://github.com/grpc/grpc-java/blob/master/examples/README.md#unit-test-examples

On Thu, Feb 7, 2019 at 11:49 AM Brian Hulette <bhule...@google.com> wrote:

> This was already reported in BEAM-6512 [1], which Scott gave me as a
> starter bug. I haven't been able to reproduce locally, so I'm trying to see
> if I can get it to fail on Jenkins again with some additional logging [2].
>
> Definitely interested in others' thoughts on this, I only vaguely
> understand what's going on. So far the only headway I've made is noticing
> that the "CANCELLED: Multiplexer hanging up" error seems to always occur
> exactly three times in failing tests. Successful runs may have one or two
> of these messages but never three.
>
> [1] https://issues.apache.org/jira/browse/BEAM-6512
> [2] https://github.com/apache/beam/pull/7767
>
> On Tue, Feb 5, 2019 at 9:50 AM Alex Amato <ajam...@google.com> wrote:
>
>> org.apache.beam.runners.fnexecution.data.GrpcDataServiceTest.testMessageReceivedBySingleClientWhenThereAreMultipleClients
>>
>> I keep seeing this test failing in my PRs
>>
>> https://builds.apache.org/job/beam_PreCommit_Java_Commit/4018/
>>
>> https://builds.apache.org/job/beam_PreCommit_Java_Commit/4018/testReport/junit/org.apache.beam.runners.fnexecution.data/GrpcDataServiceTest/testMessageReceivedBySingleClientWhenThereAreMultipleClients/
>>
>> I've seen this one come and go for a few weeks or so. I am unsure exactly
>> when it first occurred.
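To make the shutdown question at the top of this message concrete, here is a minimal, self-contained sketch of the two close styles. It is not Beam's actual code: Observer, RecordingObserver, SketchMultiplexer and both close methods are hypothetical names made up for illustration, with Observer loosely modeled on the onNext/onError/onCompleted shape of a gRPC stream observer. Whether a completion-style close is actually viable for the real multiplexer is exactly the open question, so treat the "graceful" path as an assumption.

```java
import java.util.ArrayList;
import java.util.List;

class MultiplexerShutdownSketch {

  /** Hypothetical stand-in for a gRPC-style stream observer; not the real interface. */
  interface Observer<T> {
    void onNext(T value);
    void onError(Throwable t);
    void onCompleted();
  }

  /** Records every callback so the two shutdown styles can be compared. */
  static final class RecordingObserver implements Observer<String> {
    final List<String> events = new ArrayList<>();

    @Override public void onNext(String value) { events.add("next:" + value); }
    @Override public void onError(Throwable t) { events.add("error:" + t.getMessage()); }
    @Override public void onCompleted() { events.add("completed"); }
  }

  /** Hypothetical multiplexer that owns a single outbound observer. */
  static final class SketchMultiplexer {
    private final Observer<String> outbound;

    SketchMultiplexer(Observer<String> outbound) { this.outbound = outbound; }

    /** Abrupt style: the outbound observer is sent an error on close. */
    void closeWithError() {
      outbound.onError(new RuntimeException("CANCELLED: Multiplexer hanging up"));
    }

    /** Graceful style (assumption): signal normal completion instead of an error. */
    void closeGracefully() {
      outbound.onCompleted();
    }
  }

  /** Closes a fresh multiplexer in the chosen style and returns what the observer saw. */
  static List<String> run(boolean graceful) {
    RecordingObserver observer = new RecordingObserver();
    SketchMultiplexer mux = new SketchMultiplexer(observer);
    if (graceful) {
      mux.closeGracefully();
    } else {
      mux.closeWithError();
    }
    return observer.events;
  }

  public static void main(String[] args) {
    System.out.println("abrupt:   " + run(false));
    System.out.println("graceful: " + run(true));
  }
}
```

In the abrupt case the only thing the observer ever sees is the error, which would explain why the message shows up in passing runs too: it is emitted by the close itself, not by the failure.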