[jira] [Commented] (FLINK-14834) Kerberized YARN on Docker test fails on Travis

2019-12-13 Thread Aljoscha Krettek (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16995788#comment-16995788
 ] 

Aljoscha Krettek commented on FLINK-14834:
--

That comment was not correct; the issue was not fixed.

> Kerberized YARN on Docker test fails on Travis
> --
>
> Key: FLINK-14834
> URL: https://issues.apache.org/jira/browse/FLINK-14834
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN, Runtime / Coordination, Tests
>Affects Versions: 1.10.0
>Reporter: Gary Yao
>Assignee: Aljoscha Krettek
>Priority: Blocker
>  Labels: test-stability
> Fix For: 1.10.0
>
> Attachments: run-with-3-slots.txt, run-with-4-slots.txt
>
>
> https://api.travis-ci.org/v3/job/612782888/log.txt



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-14834) Kerberized YARN on Docker test fails on Travis

2019-12-11 Thread Aljoscha Krettek (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16993574#comment-16993574
 ] 

Aljoscha Krettek commented on FLINK-14834:
--

The issue seems to be fixed on master. I'm currently bisecting to identify the 
"culprit".
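
For readers unfamiliar with bisecting, here is a minimal sketch of the idea: a binary search over an ordered commit range for the first commit on which the test passes. The function name, the commit labels and the `is_fixed` predicate are all hypothetical; in practice the predicate would build Flink and run the end-to-end test, and the search assumes the history switches from failing to passing exactly once.

```python
from typing import Callable, Sequence


def first_fixed_commit(commits: Sequence[str],
                       is_fixed: Callable[[str], bool]) -> str:
    """Return the first commit for which is_fixed() is True.

    Assumptions (hypothetical setup, not taken from the thread):
    - commits are ordered oldest to newest,
    - commits[0] is known to fail and commits[-1] is known to pass,
    - once a commit passes, all later commits pass as well.
    """
    if len(commits) < 2:
        raise ValueError("need at least one failing and one passing commit")
    lo, hi = 0, len(commits) - 1  # lo: known failing, hi: known passing
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_fixed(commits[mid]):
            hi = mid  # the fix is at mid or earlier
        else:
            lo = mid  # the fix is after mid
    return commits[hi]


# Illustrative data only: six placeholder commits, fixed from index 4 onwards.
commits = ["c0", "c1", "c2", "c3", "c4", "c5"]
fixed_from = 4
culprit = first_fixed_commit(commits, lambda c: commits.index(c) >= fixed_from)
```

The predicate is called only a logarithmic number of times, which matters when each check is a full test run.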



[jira] [Commented] (FLINK-14834) Kerberized YARN on Docker test fails on Travis

2019-12-11 Thread Chesnay Schepler (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16993369#comment-16993369
 ] 

Chesnay Schepler commented on FLINK-14834:
--

Since [~trohrmann] is working on FLINK-15013 I'll assign him to this issue; 
just so it's marked as not requiring an assignee.

> Kerberized YARN on Docker test fails on Travis
> --
>
> Key: FLINK-14834
> URL: https://issues.apache.org/jira/browse/FLINK-14834
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN, Runtime / Coordination, Tests
>Affects Versions: 1.10.0
>Reporter: Gary Yao
>Priority: Blocker
>  Labels: test-stability
> Fix For: 1.10.0
>
> Attachments: run-with-3-slots.txt, run-with-4-slots.txt
>
>
> https://api.travis-ci.org/v3/job/612782888/log.txt





[jira] [Commented] (FLINK-14834) Kerberized YARN on Docker test fails on Travis

2019-12-09 Thread Aljoscha Krettek (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16991707#comment-16991707
 ] 

Aljoscha Krettek commented on FLINK-14834:
--

This issue can be closed when FLINK-15013 is resolved.



[jira] [Commented] (FLINK-14834) Kerberized YARN on Docker test fails on Travis

2019-12-05 Thread Aljoscha Krettek (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16988650#comment-16988650
 ] 

Aljoscha Krettek commented on FLINK-14834:
--

FLINK-15013 is the reason why {{Running Kerberized YARN on Docker test (custom 
fs plugin)}} is failing, but I think {{Running Kerberized YARN on Docker test 
(default input)}} might work now.



[jira] [Commented] (FLINK-14834) Kerberized YARN on Docker test fails on Travis

2019-12-04 Thread Yu Li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16988525#comment-16988525
 ] 

Yu Li commented on FLINK-14834:
---

Correct me if I'm wrong, but from the comments in FLINK-15007 it seems we could 
re-enable the "Kerberized YARN on Docker" tests now (FLINK-15013 is still open 
and necessary, but it seems it won't cause the test to fail)? Thanks.



[jira] [Commented] (FLINK-14834) Kerberized YARN on Docker test fails on Travis

2019-12-02 Thread Aljoscha Krettek (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986065#comment-16986065
 ] 

Aljoscha Krettek commented on FLINK-14834:
--

See the linked issues for an analysis of which commit caused the issue.



[jira] [Commented] (FLINK-14834) Kerberized YARN on Docker test fails on Travis

2019-12-01 Thread Yang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16985793#comment-16985793
 ] 

Yang Wang commented on FLINK-14834:
---

I ran the job on a real YARN cluster. It always needs 3 slots, and only 3 
TaskManagers are started.

It is curious that we sometimes need 4 slots. Maybe someone familiar with the 
scheduler could help.

[~zhuzh] What do you think?



[jira] [Commented] (FLINK-14834) Kerberized YARN on Docker test fails on Travis

2019-11-29 Thread Gary Yao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984925#comment-16984925
 ] 

Gary Yao commented on FLINK-14834:
--

I also temporarily disabled the "Kerberized YARN on Docker test (default 
input)".

master: 85905f80e9711967711c2992612dccdd2cc211ac



[jira] [Commented] (FLINK-14834) Kerberized YARN on Docker test fails on Travis

2019-11-26 Thread Aljoscha Krettek (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16982696#comment-16982696
 ] 

Aljoscha Krettek commented on FLINK-14834:
--

I think I found the culprit for the failure of the {{dummfs}} tests. It's this 
change: 
https://github.com/apache/flink/commit/749965348170e4608ff2a23c9617f67b8c341df5.
This changes the job to have two sources instead of one, which, under normal 
circumstances, requires too many slots to run, so the job fails.

For the failure of the {{normal}} tests, I see in the logs that there are only 
two instances of
{code}
2019-11-20 21:32:48,107 INFO  org.apache.flink.yarn.YarnResourceManager - Requesting new TaskExecutor container with resources . Number pending requests 1.
{code}
while the job requires three {{TaskExecutors}} to run. In a successful run of 
this test you will see three {{Requesting new TaskExecutor}} lines. I don't 
know why the {{JobMaster}} does not request more {{TaskExecutors}}, but this 
could be a bug.
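
The check described above (counting {{Requesting new TaskExecutor}} lines and comparing against the three the job needs) could be scripted roughly as follows. This is only a sketch: the helper names are hypothetical and not part of Flink, and the sample input simply reuses the log line quoted above twice, with a placeholder line in between.

```python
import re

# Hypothetical helper, not part of Flink: matches the container request line
# that YarnResourceManager logs, as quoted in the comment above.
REQUEST_PATTERN = re.compile(r"Requesting new TaskExecutor container")


def count_task_executor_requests(log_text: str) -> int:
    """Count the log lines that contain a TaskExecutor container request."""
    return sum(1 for line in log_text.splitlines()
               if REQUEST_PATTERN.search(line))


def has_enough_requests(log_text: str, required: int = 3) -> bool:
    """True if at least `required` container requests were logged.

    The default of 3 comes from the comment above (the job needs three
    TaskExecutors); it is not a Flink constant.
    """
    return count_task_executor_requests(log_text) >= required


# Sample input: the quoted log line twice, plus an unrelated placeholder line.
request_line = (
    "2019-11-20 21:32:48,107 INFO  org.apache.flink.yarn.YarnResourceManager "
    "- Requesting new TaskExecutor container with resources . "
    "Number pending requests 1."
)
sample_log = "\n".join([request_line, "placeholder for other log output",
                        request_line])

request_count = count_task_executor_requests(sample_log)
enough = has_enough_requests(sample_log)
```

With two request lines in the input, `request_count` is 2 and `enough` is False, which corresponds to the failing runs described above.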

> Kerberized YARN on Docker test fails on Travis
> --
>
> Key: FLINK-14834
> URL: https://issues.apache.org/jira/browse/FLINK-14834
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN, Tests
>Affects Versions: 1.10.0
>Reporter: Gary Yao
>Assignee: Aljoscha Krettek
>Priority: Blocker
>  Labels: test-stability
> Fix For: 1.10.0
>
>
> https://api.travis-ci.org/v3/job/612782888/log.txt





[jira] [Commented] (FLINK-14834) Kerberized YARN on Docker test fails on Travis

2019-11-24 Thread Yu Li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16981267#comment-16981267
 ] 

Yu Li commented on FLINK-14834:
---

Another instance: https://api.travis-ci.org/v3/job/616253236/log.txt
With similar cause:
{noformat}
Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
    at org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:146)
    at org.apache.flink.client.deployment.ClusterClientJobClientAdapter.lambda$getJobExecutionResult$0(ClusterClientJobClientAdapter.java:81)
    ... 19 more
Caused by: org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Could not allocate all requires slots within timeout of 12 ms. Slots required: 7, slots allocated: 3, previous allocation IDs: [], execution status: completed: Attempt #0 (Source: Collection Source (1/1)) @ 
{noformat}
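
To compare failures like this across Travis logs, the slot counts can be pulled out of the exception message. The following is a minimal sketch; the function name and regular expression are assumptions based only on the message text quoted above, not on any Flink API.

```python
import re
from typing import Optional, Tuple

# Assumed message shape, taken from the exception text quoted above:
# "... Slots required: 7, slots allocated: 3, ..."
# \s+ tolerates the line wrap between "Slots" and "required" in mailed logs.
SLOT_PATTERN = re.compile(
    r"Slots\s+required:\s*(\d+),\s*slots\s+allocated:\s*(\d+)"
)


def parse_slot_shortfall(message: str) -> Optional[Tuple[int, int, int]]:
    """Return (required, allocated, missing), or None if nothing matches."""
    match = SLOT_PATTERN.search(message)
    if match is None:
        return None
    required = int(match.group(1))
    allocated = int(match.group(2))
    return required, allocated, required - allocated


# Message text copied from the trace above (truncated there as well).
message = (
    "Could not allocate all requires slots within timeout of 12 ms. Slots "
    "required: 7, slots allocated: 3, previous allocation IDs: []"
)
shortfall = parse_slot_shortfall(message)
```

For the message above, `shortfall` is `(7, 3, 4)`: four of the seven required slots were never allocated.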



[jira] [Commented] (FLINK-14834) Kerberized YARN on Docker test fails on Travis

2019-11-22 Thread Aljoscha Krettek (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16979952#comment-16979952
 ] 

Aljoscha Krettek commented on FLINK-14834:
--

The reason is also similar:
{code}
org.apache.flink.client.program.ProgramInvocationException: Job failed (JobID: 517ed66541602083e10e5594216e9bfe)
    at org.apache.flink.client.ClientUtils.submitJobAndWaitForResult(ClientUtils.java:144)
    at org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:64)
    at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1531)
    at org.apache.flink.streaming.examples.wordcount.WordCount.main(WordCount.java:96)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:333)
    at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:217)
    at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:183)
    at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:747)
    at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:282)
    at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:219)
    at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1012)
    at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1085)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1840)
    at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
    at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1085)
Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
    at org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:146)
    at org.apache.flink.client.ClientUtils.submitJobAndWaitForResult(ClientUtils.java:142)
    ... 20 more
Caused by: org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Could not allocate all requires slots within timeout of 12 ms. Slots required: 7, slots allocated: 3, previous allocation IDs: [], execution status: completed: Attempt #0 (Source: Collection Source (1/1)) @ org.apache.flink.runtime.jobmaster.slotpool.SingleLogicalSlot@5bbe73e6 - [SCHEDULED], completed: Attempt #0 (Flat Map (1/3)) @ 
{code}



[jira] [Commented] (FLINK-14834) Kerberized YARN on Docker test fails on Travis

2019-11-22 Thread Yu Li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16979951#comment-16979951
 ] 

Yu Li commented on FLINK-14834:
---

Another instance for 'Running Kerberized YARN on Docker test (default input)': 
https://api.travis-ci.org/v3/job/615032422/log.txt



[jira] [Commented] (FLINK-14834) Kerberized YARN on Docker test fails on Travis

2019-11-21 Thread Gary Yao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16979280#comment-16979280
 ] 

Gary Yao commented on FLINK-14834:
--

'Running Kerberized YARN on Docker test (default input)' also started failing
https://api.travis-ci.org/v3/job/614505046/log.txt
