FWIW, TimerReceiverTest is also failing consistently on my macOS machine.
On my Ubuntu VM, the tests pass.

Now the stack trace indicates a NullPointerException thrown out of the
finally block [1].

As this is really bad and would of course hide the actual cause, I added a
try/catch around the System.out call:

diff --git a/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnHarness.java b/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnHarness.java
index 708b669112..8c21928da1 100644
--- a/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnHarness.java
+++ b/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnHarness.java
@@ -169,7 +169,12 @@ public class FnHarness {
       LOG.info("Entering instruction processing loop");
       control.processInstructionRequests(options.as(GcsOptions.class).getExecutorService());
     } finally {
-      System.out.println("Shutting SDK harness down.");
+      try {
+        System.out.println("Shutting SDK harness down.");
+      } catch (NullPointerException npe) {
+        LOG.warn("NPE sys.out=" + System.out, npe);
+      }
     }
   }
 }

Now my test outputs

Apr 05, 2019 9:29:59 AM org.apache.beam.fn.harness.FnHarness main
WARNING: NPE  sys.out=null
java.lang.NullPointerException
        at org.apache.beam.fn.harness.FnHarness.main(FnHarness.java:173)
        at org.apache.beam.runners.dataflow.worker.fn.control.TimerReceiverTest.lambda$setUp$0(TimerReceiverTest.java:123)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)



and the tests pass (sic!)

Something weird is going on here....

Replacing that 'System.out.println' with 'LOG.info' also seems to work: at
least I could not reproduce the failure over several attempts. I am lost
here, as there is probably a good reason to use System.out at this point.
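
For illustration, here is a minimal sketch of one way System.out can end up
null inside a JVM: some other code calls System.setOut(null). This is only an
assumption about the mechanism; I have not confirmed that anything in the test
suite does this. The class and method names below are made up for the example.

```java
import java.io.PrintStream;

public class NullSystemOutDemo {

  /**
   * Sets System.out to null (the assumed trigger), tries to print, and
   * reports whether a NullPointerException was thrown. System.out is
   * always restored before returning.
   */
  static boolean printlnThrowsWhenOutIsNull() {
    PrintStream original = System.out;
    try {
      // Assumption: something else in the same JVM has done this.
      System.setOut(null);
      System.out.println("Shutting SDK harness down.");
      return false;
    } catch (NullPointerException npe) {
      return true;
    } finally {
      System.setOut(original);
    }
  }

  public static void main(String[] args) {
    boolean threw = printlnThrowsWhenOutIsNull();
    System.out.println("NPE when System.out is null: " + threw);
  }
}
```

If this is what happens, a logger call would not be affected by the null
stream in the same way, which would match what I see with LOG.info.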

Btw, after the first failure with the NullPointerException, successive runs
seem to fail for different reasons: I get a timeout in the test setup. I am
unsure, but this might indicate a gRPC port/server startup issue because the
previous run did not clean up properly.
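
To check the port theory, something like the following sketch could be run
between test runs. It assumes the test server listens on a known TCP port on
the loopback interface, which I have not verified; PortCheck and isPortFree
are hypothetical names, not part of Beam.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;

public class PortCheck {

  /**
   * Returns true if the given TCP port on the loopback interface can be
   * bound right now, false if the bind fails (for example because a
   * previous process or thread still holds it).
   */
  static boolean isPortFree(int port) {
    try (ServerSocket socket =
        new ServerSocket(port, 1, InetAddress.getLoopbackAddress())) {
      return true;
    } catch (IOException e) {
      return false;
    }
  }

  public static void main(String[] args) throws IOException {
    // Hold an ephemeral port open and show that it is reported as busy.
    try (ServerSocket held =
        new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
      int port = held.getLocalPort();
      System.out.println("port " + port + " free while held: " + isPortFree(port));
    }
  }
}
```

If the port is reported as busy after a failed run, that would support the
idea that the earlier run did not shut its server down.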

best,

michel

[1]
https://github.com/apache/beam/blob/master/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnHarness.java#L172

On Thu, Apr 4, 2019 at 10:42 PM Lukasz Cwik <[email protected]> wrote:

> I looked at the failures you were experiencing and the error message
> doesn't provide enough information to figure out why it is failing.
>
> On Wed, Apr 3, 2019 at 9:23 PM Csaba Kassai <[email protected]> wrote:
>
>> Oh, I just missed it then :)
>> Thank you Lukasz for connecting us.
>>
>> Yeah, the two TimerReceiverTest tests fail reliably for me.
>>
>> On Tue, 2 Apr 2019 at 23:53, Lukasz Cwik <[email protected]> wrote:
>>
>>> +Ahmed
>>>
>>> I have added you as a contributor.
>>>
>>> It seems as though Ahmed had just picked up BEAM-3489 yesterday. Reach
>>> out to Ahmed if you would like to help them out with the task.
>>>
>>> Was TimerReceiverTest failing reliably when performing a parallel build
>>> or is it flaky?
>>>
>>> I have asked Chamikara to take a look for PR 8180.
>>>
>>>
>>> On Tue, Apr 2, 2019 at 8:33 AM Csaba Kassai <[email protected]> wrote:
>>>
>>>> Hi All!
>>>>
>>>> I am Csabi, I would be happy to contribute to Beam.
>>>> Could you grant me contributor role and assign issue BEAM-3489
>>>> <https://issues.apache.org/jira/browse/BEAM-3489>  to me? My user name
>>>> is "csabakassai".
>>>>
>>>> After I checked out the code and tried to do a gradle check I found
>>>> these issues:
>>>>
>>>>    1. *JUnit tests fail:* the TimerReceiverTest fails in the
>>>>    ":beam-runners-google-cloud-dataflow-java-fn-api-worker:test" and the
>>>>    ":beam-runners-google-cloud-dataflow-java-legacy-worker:test" tasks.
>>>>    When I execute the tests independently everything is fine, so I
>>>>    disabled the parallel build and this solves the problem. I have not
>>>>    investigated further; do you have any more insights on this issue? I
>>>>    have attached the test reports.
>>>>    2. *Python test fails:* there is a Python test which fails if the
>>>>    current offset of your timezone differs from the offset in 1970. In my
>>>>    case, Singapore is now GMT+8 and it was GMT+7:30 in 1970. I created a
>>>>    ticket for this issue where I describe the problem in detail:
>>>>    https://jira.apache.org/jira/browse/BEAM-6947. Could you assign the
>>>>    ticket to me? Also I created a PR with a possible fix:
>>>>    https://github.com/apache/beam/pull/8180. Could you suggest a
>>>>    reviewer?
>>>>
>>>>
>>>> Thank you,
>>>> Csabi
