Ah... I have not debugged this yet. But wouldn't [1] mean setting System.out to null on the first call to @setup, since there was no previous call to DataflowWorkerLoggingInitializer.initialize?
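To make the concern concrete, here is a minimal sketch of what a null System.out does to a later println, and how saving and restoring the original stream keeps that state from leaking. This is not Beam code: the class SystemOutRestoreSketch and its methods are made up for illustration, and it does not claim to reproduce what the Dataflow test actually does.

```java
import java.io.PrintStream;

public class SystemOutRestoreSketch {

  /** Runs the body with System.out replaced, then always restores the original stream. */
  public static void withSystemOut(PrintStream replacement, Runnable body) {
    PrintStream original = System.out;
    System.setOut(replacement);
    try {
      body.run();
    } finally {
      // Restoring here is the step a test would need so later code sees a usable stream.
      System.setOut(original);
    }
  }

  /** Returns true if println throws a NullPointerException while System.out is null. */
  public static boolean printlnFailsWhenOutIsNull() {
    boolean[] threw = {false};
    withSystemOut(null, () -> {
      try {
        System.out.println("Shutting SDK harness down.");
      } catch (NullPointerException npe) {
        threw[0] = true;
      }
    });
    return threw[0];
  }

  public static void main(String[] args) {
    PrintStream before = System.out;
    boolean threw = printlnFailsWhenOutIsNull();
    System.out.println("NPE while System.out was null: " + threw);
    System.out.println("System.out restored: " + (System.out == before));
  }
}
```

If a test swaps System.out without the restore step in the finally block, any later code in the same JVM that prints to System.out would hit the same NullPointerException.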
[1] https://github.com/apache/beam/blame/master/runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/logging/DataflowWorkerLoggingInitializerTest.java#L81

On Fri, Apr 5, 2019 at 10:12 PM Lukasz Cwik wrote:

> We replace System.out/err to capture user logs and forward the logs for
> the Dataflow worker[1]. It could be that this test[2] is not resetting it
> afterwards, which leaves it at null, and then some future code causes it to
> fail.
>
> 1:
> https://github.com/apache/beam/blob/e69d69d72dc5b9c3d6069c0b71825c3c2b0b4e61/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/logging/DataflowWorkerLoggingInitializer.java#L132
> 2:
> https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/logging/DataflowWorkerLoggingInitializerTest.java
>
> On Fri, Apr 5, 2019 at 1:42 AM Michael Luckey wrote:
>
>> FWIW, the TimerReceiverTest tests are also failing consistently on my macOS
>> machine. Running on my Ubuntu VM, they pass.
>>
>> Now the stack trace indicates a NullPointerException thrown out of the
>> finally block [1].
>>
>> As this is really bad and of course would hide the cause, I added a
>> try/catch around it:
>>
>> diff --git a/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnHarness.java b/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnHarness.java
>> index 708b669112..8c21928da1 100644
>> --- a/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnHarness.java
>> +++ b/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnHarness.java
>> @@ -169,7 +169,12 @@ public class FnHarness {
>>        LOG.info("Entering instruction processing loop");
>>        control.processInstructionRequests(options.as(GcsOptions.class).getExecutorService());
>>      } finally {
>> -      System.out.println("Shutting SDK harness down.");
>> +      try {
>> +        System.out.println("Shutting SDK harness down.");
>> +      } catch (NullPointerException npe) {
>> +        LOG.warn("NPE sys.out=" + System.out, npe);
>> +      }
>>      }
>>    }
>>  }
>>
>> Now my test outputs
>>
>> Apr 05, 2019 9:29:59 AM org.apache.beam.fn.harness.FnHarness main
>> WARNING: NPE sys.out=null
>> java.lang.NullPointerException
>>     at org.apache.beam.fn.harness.FnHarness.main(FnHarness.java:173)
>>     at org.apache.beam.runners.dataflow.worker.fn.control.TimerReceiverTest.lambda$setUp$0(TimerReceiverTest.java:123)
>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>     at java.lang.Thread.run(Thread.java:748)
>>
>> and passes (sic!).
>>
>> Something weird is going on here...
>>
>> Now replacing that 'System.out' with 'LOG.info' also seems to work.
>> At least I could not reproduce the failure after trying several times.
>> I am lost here, as there is probably a good reason to use System.out here.
>>
>> Btw, after the first failure with NullPointerExceptions, successive runs
>> seem to fail for different reasons: I get a timeout in test setup. I am
>> unsure, but this might indicate some gRPC port/server startup issue because
>> the previous run did not do proper cleanup.
>>
>> best,
>>
>> michel
>>
>> [1]
>> https://github.com/apache/beam/blob/master/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnHarness.java#L172
>>
>> On Thu, Apr 4, 2019 at 10:42 PM Lukasz Cwik wrote:
>>
>>> I looked at the failures you were experiencing and the error message
>>> doesn't provide enough information to figure out why it is failing.
>>>
>>> On Wed, Apr 3, 2019 at 9:23 PM Csaba Kassai wrote:
>>>
>>>> Oh, I just missed it then :)
>>>> Thank you Lukasz for connecting us.
>>>>
>>>> Yeah, the two TimerReceiverTest tests fail reliably for me.
>>>>
>>>> On Tue, 2 Apr 2019 at 23:53, Lukasz Cwik wrote:
>>>>
>>>>> +Ahmed
>>>>>
>>>>> I have added you as a contributor.
>>>>>
>>>>> It seems as though Ahmed had just picked up BEAM-3489 yesterday. Reach
>>>>> out to Ahmed if you would like to help them out with the task.
>>>>>
>>>>> Was TimerReceiverTest failing reliably when performing a parallel
>>>>> build or is it flaky?
>>>>>
>>>>> I have asked Chamikara to take a look at PR 8180.
>>>>>
>>>>> On Tue, Apr 2, 2019 at 8:33 AM Csaba Kassai wrote:
>>>>>
>>>>>> Hi All!
>>>>>>
>>>>>> I am Csabi, and I would be happy to contribute to Beam.
>>>>>> Could you grant me the contributor role and assign issue BEAM-3489
>>>>>> <https://issues.apache.org/jira/browse/BEAM-3489> to me? My user
>>>>>> name is "csabakassai".
>>>>>>
>>>>>> After I checked out the code and tried to run a gradle check, I found
>>>>>> these issues:
>>>>>>
>>>>>> 1. *JUnit tests fail:* the TimerReceiverTest fails in the
>>>>>>    ":beam-runners-google-cloud-dataflow-java-fn-api-worker:test" and the
>>>>>>    ":beam-runners-google-cloud-dataflow-java-legacy-worker:test" tasks.
>>>>>>    When I execute the tests independently everything is fine, so I
>>>>>>    disabled the parallel build and this solves the problem. I have not
>>>>>>    investigated further; do you have any more insights on this issue?
>>>>>>    I have attached the test reports.
>>>>>> 2. *Python test fails:* there is a Python test which fails if the
>>>>>>    current offset of your timezone differs from the offset in 1970. In
>>>>>>    my case, Singapore is now GMT+8 and it was GMT+7:30 in 1970. I
>>>>>>    created a ticket for this issue where I describe the problem in
>>>>>>    detail: https://jira.apache.org/jira/browse/BEAM-6947. Could you
>>>>>>    assign the ticket to me? I also created a PR with a possible fix:
>>>>>>    https://github.com/apache/beam/pull/8180. Could you suggest a
>>>>>>    reviewer?
>>>>>>
>>>>>> Thank you,
>>>>>> Csabi
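
The timezone condition described in item 2 of the original message can be checked with a small sketch. It is written in Java with java.time rather than Python, it is not the failing Beam test, and the class EpochOffsetSketch and its method names are hypothetical. It compares a zone's UTC offset at the Unix epoch with its offset at a later instant.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class EpochOffsetSketch {

  /** Returns the UTC offset that the given zone used at the given instant. */
  public static ZoneOffset offsetAt(String zoneId, Instant instant) {
    return ZoneId.of(zoneId).getRules().getOffset(instant);
  }

  /** True if the zone's offset at the Unix epoch differs from its offset at the given instant. */
  public static boolean offsetChangedSinceEpoch(String zoneId, Instant when) {
    return !offsetAt(zoneId, Instant.EPOCH).equals(offsetAt(zoneId, when));
  }

  public static void main(String[] args) {
    // An instant around the date of the original report.
    Instant april2019 = Instant.parse("2019-04-02T00:00:00Z");
    for (String zone : new String[] {"Asia/Singapore", "UTC"}) {
      System.out.println(zone
          + ": epoch offset " + offsetAt(zone, Instant.EPOCH)
          + ", April 2019 offset " + offsetAt(zone, april2019)
          + ", changed: " + offsetChangedSinceEpoch(zone, april2019));
    }
  }
}
```

Code that converts an epoch-based timestamp using the machine's current local offset would be wrong for any zone where this check reports a change, which is consistent with the Singapore case described above.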
