[GitHub] flink issue #5718: [FLINK-8073][kafka-tests] Disable timeout in tests

2018-03-21 Thread pnowojski
Github user pnowojski commented on the issue:

https://github.com/apache/flink/pull/5718
  
Thanks!


---


[GitHub] flink issue #5718: [FLINK-8073][kafka-tests] Disable timeout in tests

2018-03-21 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/5718
  
merging.


---


[GitHub] flink issue #5718: [FLINK-8073][kafka-tests] Disable timeout in tests

2018-03-21 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/5718
  
Having a custom timeout like @zentol suggests could be helpful in the long 
run.

For now, though, I personally agree that the timeouts are counter 
productive (I actually never understood what they are supposed to solve). I 
prefer a deadlocking test to stay around, so I can jstack to the process, 
attach a debugger, pull a heap dump, whatever. The only environment where we 
cannot do that is the CI server, which has a timeout and thread dump already...


---


[GitHub] flink issue #5718: [FLINK-8073][kafka-tests] Disable timeout in tests

2018-03-20 Thread pnowojski
Github user pnowojski commented on the issue:

https://github.com/apache/flink/pull/5718
  
In that case can we merge it as it is and if you want @zentol, we can start 
a discussion about future direction of this?


---


[GitHub] flink issue #5718: [FLINK-8073][kafka-tests] Disable timeout in tests

2018-03-19 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/5718
  
I don't want to make a fuss over a single test, so for the purposes of this 
test it's fine to just remote the timeout.

In general however i think it would be pretty neat if we would add such a 
`Timeout` to the `TestLogger` class. I see the problem of debugging tests that 
have timeouts, but removing timeouts altogether is the wrong conclusion imo. 
Instead, we could implement the `Timeout` such that it doesn't fail the test if 
a certain profile/property is set.

With this we keep the benefits of test timeouts (CI service independence, 
less special behavior locally vs CI, fixed upper bound for test times which is 
particularly useful for new tests, "guarantee" that the test terminates) while 
still allowing debugging. In fact we may end up improving the debugging 
situation by consolidating how timeouts are implemented instead of each test 
rolling their own solution that you can't disable.

The travis watchdog _cannot_ be removed as it covers the entire maven 
process that from time to time locks up outside of tests. That said, the fact 
that we have to rely on the travis watchdog _to ensure that tests terminate_ is 
a bad sign. Not to mention that it already forced us to introduce workarounds 
to make tests "compatible", like the kafka tests and others that print stuff 
for the sole purposes of not triggering it.



---


[GitHub] flink issue #5718: [FLINK-8073][kafka-tests] Disable timeout in tests

2018-03-19 Thread tillrohrmann
Github user tillrohrmann commented on the issue:

https://github.com/apache/flink/pull/5718
  
I agree with @pnowojski. The watchdog and Travis itself should time out the 
test for us if things take too long.


---


[GitHub] flink issue #5718: [FLINK-8073][kafka-tests] Disable timeout in tests

2018-03-19 Thread pnowojski
Github user pnowojski commented on the issue:

https://github.com/apache/flink/pull/5718
  
I think that timeouts in IDE are only making things worse:
1. you can not debug, since longer debugging session will fail 
2. in case of deadlocks, locally you can always manually dump/view stack 
traces but you also can do full thread/memory dump and/or attach a debugger.

On travis you can not debug it in any way, thus automated actions make 
sense (and they preserve some travis resources for other builds). 

Furthermore, such rule would have to be implemented in all of our tests, 
not only here. Also IMO if you want to replace/remove `travis_mvn_watchdog` 
with such rule, this is out of scope of this ticket and it should be discussed 
separately (I would be against it), so that it doesn't hinder this ticket.

When I originally created those tests (with `@Test(timeout = )`) I did 
it by copying without thinking from a test, that was printing something to logs 
once every checkpoint, thus it was interrupting travis watchdog and one could 
argue that it required this special case timeout. This commit makes it more 
consistent with rest of the tests.


---


[GitHub] flink issue #5718: [FLINK-8073][kafka-tests] Disable timeout in tests

2018-03-19 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/5718
  
This change appears to be focus mostly around Travis which i dislike a bit. 
I'd rather add a custom `Timeout` that prints the thread-dump so we still 
get nice behavior when running things locally/in the IDE.

```
@Rule
public final Timeout timeout = new MyTimeOut(10, TimeUnit.MILLISECONDS);

public static class MyTimeOut extends Timeout {

public MyTimeOut(long timeout, TimeUnit timeUnit) {
super(timeout, timeUnit);
}

@Override
public Statement apply(Statement base, Description description) 
{
Statement fail = super.apply(base, description);
return new Statement() {
@Override
public void evaluate() throws Throwable {

System.out.println(crunchifyGenerateThreadDump());
fail.evaluate();
}
};
}
}

public static String crunchifyGenerateThreadDump() {
final StringBuilder dump = new StringBuilder();
final ThreadMXBean threadMXBean = 
ManagementFactory.getThreadMXBean();
final ThreadInfo[] threadInfos = 
threadMXBean.getThreadInfo(threadMXBean.getAllThreadIds(), 100);
for (ThreadInfo threadInfo : threadInfos) {
dump.append('"');
dump.append(threadInfo.getThreadName());
dump.append("\" ");
final Thread.State state = threadInfo.getThreadState();
dump.append("\n   java.lang.Thread.State: ");
dump.append(state);
final StackTraceElement[] stackTraceElements = 
threadInfo.getStackTrace();
for (final StackTraceElement stackTraceElement : 
stackTraceElements) {
dump.append("\nat ");
dump.append(stackTraceElement);
}
dump.append("\n\n");
}
return dump.toString();
}
```


---