[jira] [Commented] (FLINK-20461) YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
[ https://issues.apache.org/jira/browse/FLINK-20461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17402622#comment-17402622 ]

Gabor Somogyi commented on FLINK-20461:
---

Thanks for checking it too, added my findings to the PR. I think we're on track :)

> YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
> --
>
> Key: FLINK-20461
> URL: https://issues.apache.org/jira/browse/FLINK-20461
> Project: Flink
> Issue Type: Bug
> Components: Deployment / YARN
> Affects Versions: 1.11.3, 1.12.0, 1.13.0, 1.14.0
> Reporter: Huang Xingbo
> Assignee: Till Rohrmann
> Priority: Critical
> Labels: pull-request-available, test-stability
> Fix For: 1.14.0
>
>
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=10450=logs=fc5181b0-e452-5c8f-68de-1097947f6483=62110053-334f-5295-a0ab-80dd7e2babbf]
> {code:java}
> [ERROR] testPerJobModeWithDefaultFileReplication(org.apache.flink.yarn.YARNFileReplicationITCase) Time elapsed: 32.501 s <<< ERROR!
> java.io.FileNotFoundException: File does not exist: hdfs://localhost:46072/user/agent04_azpcontainer/.flink/application_1606950278664_0001/flink-dist_2.11-1.12-SNAPSHOT.jar
>     at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1441)
>     at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1434)
>     at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>     at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1434)
>     at org.apache.flink.yarn.YARNFileReplicationITCase.extraVerification(YARNFileReplicationITCase.java:148)
>     at org.apache.flink.yarn.YARNFileReplicationITCase.deployPerJob(YARNFileReplicationITCase.java:113)
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Commented] (FLINK-20461) YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
[ https://issues.apache.org/jira/browse/FLINK-20461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17402587#comment-17402587 ]

Till Rohrmann commented on FLINK-20461:
---

I think the problem could be that we are looking for the Flink dist jar after the job has terminated. This also means that we are looking for this file while Yarn will clean up the directory of the submitted Yarn application. Hence, I think we are looking at a classic race condition. I'll try to verify this suspicion.

> YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
> --
>
> Key: FLINK-20461
> URL: https://issues.apache.org/jira/browse/FLINK-20461
> Project: Flink
> Issue Type: Bug
> Components: Deployment / YARN
> Affects Versions: 1.11.3, 1.12.0, 1.13.0, 1.14.0
> Reporter: Huang Xingbo
> Assignee: Till Rohrmann
> Priority: Critical
> Labels: test-stability
> Fix For: 1.14.0
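The suspected race can be sketched in isolation. What follows is only an illustration under stated assumptions, not the test's code and not the eventual fix: a local temporary directory stands in for the application's staging directory on HDFS, and the class and method names (`StagingDirRaceSketch`, `cleanUp`, and so on) are hypothetical. The two interleavings are run one after the other so the outcome is deterministic; in the real test the order would depend on timing.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative sketch only: every name here is hypothetical and nothing is taken from Flink.
final class StagingDirRaceSketch {

    // Stand-in for the per-application staging directory that holds the dist jar.
    static Path createStagingDirWithDistJar() {
        try {
            Path dir = Files.createTempDirectory("application_sketch");
            Files.createFile(dir.resolve("flink-dist.jar"));
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Stand-in for the cleanup that happens once the application has terminated.
    static void cleanUp(Path stagingDir) {
        try {
            Files.deleteIfExists(stagingDir.resolve("flink-dist.jar"));
            Files.deleteIfExists(stagingDir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Interleaving A: cleanup wins the race, so the later check no longer finds the jar.
    static boolean distJarVisibleWhenCheckedAfterCleanup() {
        Path dir = createStagingDirWithDistJar();
        cleanUp(dir);
        return Files.exists(dir.resolve("flink-dist.jar"));
    }

    // Interleaving B: the check runs while the application is still alive, before cleanup.
    static boolean distJarVisibleWhenCheckedBeforeCleanup() {
        Path dir = createStagingDirWithDistJar();
        boolean visible = Files.exists(dir.resolve("flink-dist.jar"));
        cleanUp(dir);
        return visible;
    }

    public static void main(String[] args) {
        System.out.println("checked after cleanup:  " + distJarVisibleWhenCheckedAfterCleanup());
        System.out.println("checked before cleanup: " + distJarVisibleWhenCheckedBeforeCleanup());
    }
}
```

If the suspicion holds, the failing runs would correspond to interleaving A. A verification that must see the file would then need to run while the application is still alive, or tolerate the directory already being gone.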
[jira] [Commented] (FLINK-20461) YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
[ https://issues.apache.org/jira/browse/FLINK-20461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17401530#comment-17401530 ]

Yangze Guo commented on FLINK-20461:
---

[~gaborgsomogyi] Thanks a lot for offering your help! Good luck reproducing this problem :)

> YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
> --
>
> Key: FLINK-20461
> URL: https://issues.apache.org/jira/browse/FLINK-20461
> Project: Flink
> Issue Type: Bug
> Components: Deployment / YARN
> Affects Versions: 1.11.3, 1.12.0, 1.13.0, 1.14.0
> Reporter: Huang Xingbo
> Priority: Critical
> Labels: test-stability
> Fix For: 1.14.0
[jira] [Commented] (FLINK-20461) YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
[ https://issues.apache.org/jira/browse/FLINK-20461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17401524#comment-17401524 ]

Gabor Somogyi commented on FLINK-20461:
---

[~karmagyz] I'm just finishing a relatively big task and intend to have a look afterwards. As far as I can see this is a rare bug, so hopefully I can reproduce it.

> YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
> --
>
> Key: FLINK-20461
> URL: https://issues.apache.org/jira/browse/FLINK-20461
> Project: Flink
> Issue Type: Bug
> Components: Deployment / YARN
> Affects Versions: 1.11.3, 1.12.0, 1.13.0, 1.14.0
> Reporter: Huang Xingbo
> Priority: Critical
> Labels: test-stability
> Fix For: 1.14.0
[jira] [Commented] (FLINK-20461) YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
[ https://issues.apache.org/jira/browse/FLINK-20461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17400230#comment-17400230 ]

Yangze Guo commented on FLINK-20461:
---

[~gaborgsomogyi] Would you like to take a look at this issue at your convenience?

> YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
> --
>
> Key: FLINK-20461
> URL: https://issues.apache.org/jira/browse/FLINK-20461
> Project: Flink
> Issue Type: Bug
> Components: Deployment / YARN
> Affects Versions: 1.11.3, 1.12.0, 1.13.0, 1.14.0
> Reporter: Huang Xingbo
> Priority: Critical
> Labels: test-stability
> Fix For: 1.14.0
[jira] [Commented] (FLINK-20461) YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
[ https://issues.apache.org/jira/browse/FLINK-20461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17391867#comment-17391867 ]

Xintong Song commented on FLINK-20461:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=21330=logs=8fd975ef-f478-511d-4997-6f15fe8a1fd3=494f6362-8ffa-5ff8-9158-c7f00e541279=31906

> YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
> --
>
> Key: FLINK-20461
> URL: https://issues.apache.org/jira/browse/FLINK-20461
> Project: Flink
> Issue Type: Bug
> Components: Deployment / YARN
> Affects Versions: 1.11.3, 1.12.0, 1.13.0, 1.14.0
> Reporter: Huang Xingbo
> Priority: Major
> Labels: test-stability
[jira] [Commented] (FLINK-20461) YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
[ https://issues.apache.org/jira/browse/FLINK-20461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17374462#comment-17374462 ]

Xintong Song commented on FLINK-20461:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=19856=logs=f450c1a5-64b1-5955-e215-49cb1ad5ec88=ea63c80c-957f-50d1-8f67-3671c14686b9=28127

> YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
> --
>
> Key: FLINK-20461
> URL: https://issues.apache.org/jira/browse/FLINK-20461
> Project: Flink
> Issue Type: Bug
> Components: Deployment / YARN
> Affects Versions: 1.11.3, 1.12.0, 1.13.0, 1.14.0
> Reporter: Huang Xingbo
> Priority: Major
> Labels: test-stability
> Fix For: 1.14.0
[jira] [Commented] (FLINK-20461) YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
[ https://issues.apache.org/jira/browse/FLINK-20461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17373148#comment-17373148 ]

Xintong Song commented on FLINK-20461:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=19797=logs=8fd975ef-f478-511d-4997-6f15fe8a1fd3=ac0fa443-5d45-5a6b-3597-0310ecc1d2ab=31002

> YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
> --
>
> Key: FLINK-20461
> URL: https://issues.apache.org/jira/browse/FLINK-20461
> Project: Flink
> Issue Type: Bug
> Components: Deployment / YARN
> Affects Versions: 1.11.3, 1.12.0, 1.13.0, 1.14.0
> Reporter: Huang Xingbo
> Priority: Major
> Labels: test-stability
> Fix For: 1.14.0
[jira] [Commented] (FLINK-20461) YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
[ https://issues.apache.org/jira/browse/FLINK-20461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17369190#comment-17369190 ]

Xintong Song commented on FLINK-20461:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=19500=logs=a5ef94ef-68c2-57fd-3794-dc108ed1c495=9c1ddabe-d186-5a2c-5fcc-f3cafb3ec699=28263

> YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
> --
>
> Key: FLINK-20461
> URL: https://issues.apache.org/jira/browse/FLINK-20461
> Project: Flink
> Issue Type: Bug
> Components: Deployment / YARN
> Affects Versions: 1.11.3, 1.12.0, 1.13.0
> Reporter: Huang Xingbo
> Priority: Minor
> Labels: auto-deprioritized-major, auto-unassigned, testability
[jira] [Commented] (FLINK-20461) YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
[ https://issues.apache.org/jira/browse/FLINK-20461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17362768#comment-17362768 ]

Dawid Wysakowicz commented on FLINK-20461:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=18946=logs=fc5181b0-e452-5c8f-68de-1097947f6483=62110053-334f-5295-a0ab-80dd7e2babbf=28714

> YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
> --
>
> Key: FLINK-20461
> URL: https://issues.apache.org/jira/browse/FLINK-20461
> Project: Flink
> Issue Type: Bug
> Components: Deployment / YARN
> Affects Versions: 1.11.3, 1.12.0, 1.13.0
> Reporter: Huang Xingbo
> Priority: Minor
> Labels: auto-deprioritized-major, auto-unassigned, testability
[jira] [Commented] (FLINK-20461) YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
[ https://issues.apache.org/jira/browse/FLINK-20461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17335858#comment-17335858 ]

Flink Jira Bot commented on FLINK-20461:
---

This issue was marked "stale-assigned" 7 days ago and has not received an update. I have automatically removed the current assignee from the issue so others in the community may pick it up. If you are still working on this ticket, please ask a committer to reassign you and provide an update about your current status.

> YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
> --
>
> Key: FLINK-20461
> URL: https://issues.apache.org/jira/browse/FLINK-20461
> Project: Flink
> Issue Type: Bug
> Components: Deployment / YARN
> Affects Versions: 1.11.3, 1.12.0, 1.13.0
> Reporter: Huang Xingbo
> Assignee: Zhenqiu Huang
> Priority: Major
> Labels: stale-assigned, testability
[jira] [Commented] (FLINK-20461) YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
[ https://issues.apache.org/jira/browse/FLINK-20461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17327087#comment-17327087 ]

Guowei Ma commented on FLINK-20461:
---

Another case:
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=16998=logs=fc5181b0-e452-5c8f-68de-1097947f6483=62110053-334f-5295-a0ab-80dd7e2babbf=28148

> YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
> --
>
> Key: FLINK-20461
> URL: https://issues.apache.org/jira/browse/FLINK-20461
> Project: Flink
> Issue Type: Bug
> Components: Deployment / YARN
> Affects Versions: 1.11.3, 1.12.0, 1.13.0
> Reporter: Huang Xingbo
> Assignee: Zhenqiu Huang
> Priority: Major
> Labels: stale-assigned, testability
[jira] [Commented] (FLINK-20461) YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
[ https://issues.apache.org/jira/browse/FLINK-20461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17322901#comment-17322901 ]

Flink Jira Bot commented on FLINK-20461:
---

This issue is assigned but has not received an update in 7 days so it has been labeled "stale-assigned". If you are still working on the issue, please give an update and remove the label. If you are no longer working on the issue, please unassign so someone else may work on it. In 7 days the issue will be automatically unassigned.

> YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
> --
>
> Key: FLINK-20461
> URL: https://issues.apache.org/jira/browse/FLINK-20461
> Project: Flink
> Issue Type: Bug
> Components: Deployment / YARN
> Affects Versions: 1.11.3, 1.12.0, 1.13.0
> Reporter: Huang Xingbo
> Assignee: Zhenqiu Huang
> Priority: Major
> Labels: stale-assigned, testability
[jira] [Commented] (FLINK-20461) YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
[ https://issues.apache.org/jira/browse/FLINK-20461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17307532#comment-17307532 ]

Guowei Ma commented on FLINK-20461:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15302=logs=fc5181b0-e452-5c8f-68de-1097947f6483=62110053-334f-5295-a0ab-80dd7e2babbf=28982

{code:java}
[ERROR] testPerJobModeWithDefaultFileReplication(org.apache.flink.yarn.YARNFileReplicationITCase) Time elapsed: 26.797 s <<< ERROR!
java.io.FileNotFoundException: File does not exist: hdfs://localhost:40564/user/agent07_azpcontainer/.flink/application_1616528383248_0001/flink-dist_2.11-1.13-SNAPSHOT.jar
    at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1441)
    at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1434)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1434)
    at org.apache.flink.yarn.YARNFileReplicationITCase.extraVerification(YARNFileReplicationITCase.java:165)
    at org.apache.flink.yarn.YARNFileReplicationITCase.deployPerJob(YARNFileReplicationITCase.java:128)
    at org.apache.flink.yarn.YARNFileReplicationITCase.lambda$testPerJobModeWithDefaultFileReplication$1(YARNFileReplicationITCase.java:78)
    at org.apache.flink.yarn.YarnTestBase.runTest(YarnTestBase.java:286)
    at org.apache.flink.yarn.YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication(YARNFileReplicationITCase.java:78)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
    at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
    at org.junit.rules.RunRules.evaluate(RunRules.java:20)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
{code}

> YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
> --
>
> Key: FLINK-20461
> URL: https://issues.apache.org/jira/browse/FLINK-20461
> Project: Flink
> Issue Type: Bug
> Components: Deployment / YARN
> Affects Versions: 1.11.3, 1.12.0
> Reporter: Huang Xingbo
> Assignee: Zhenqiu Huang
> Priority: Major
> Labels: testability
[jira] [Commented] (FLINK-20461) YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
[ https://issues.apache.org/jira/browse/FLINK-20461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17285035#comment-17285035 ]

Zhenqiu Huang commented on FLINK-20461:
---

[~trohrmann] [~dwysakowicz] I have not been able to reproduce it locally yet. As for changing the azure-pipeline, I haven't had the chance to work on it yet.

> YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
> --
>
> Key: FLINK-20461
> URL: https://issues.apache.org/jira/browse/FLINK-20461
> Project: Flink
> Issue Type: Bug
> Components: Deployment / YARN
> Affects Versions: 1.11.3, 1.12.0
> Reporter: Huang Xingbo
> Assignee: Zhenqiu Huang
> Priority: Major
> Labels: testability
[jira] [Commented] (FLINK-20461) YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
[ https://issues.apache.org/jira/browse/FLINK-20461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17284592#comment-17284592 ]
Dawid Wysakowicz commented on FLINK-20461:
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=13325=logs=245e1f2e-ba5b-5570-d689-25ae21e5302f=02d88c1a-f1b3-5a8c-4b4a-cf43c70f99e1
[jira] [Commented] (FLINK-20461) YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
[ https://issues.apache.org/jira/browse/FLINK-20461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17282954#comment-17282954 ]
Till Rohrmann commented on FLINK-20461:
[~hpeter] did you have luck reproducing the problem?
[jira] [Commented] (FLINK-20461) YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
[ https://issues.apache.org/jira/browse/FLINK-20461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17282888#comment-17282888 ]
Dawid Wysakowicz commented on FLINK-20461:
New instance: https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=13222=logs=f450c1a5-64b1-5955-e215-49cb1ad5ec88=ea63c80c-957f-50d1-8f67-3671c14686b9
[jira] [Commented] (FLINK-20461) YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
[ https://issues.apache.org/jira/browse/FLINK-20461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17255156#comment-17255156 ]
Zhenqiu Huang commented on FLINK-20461:
[~xintongsong] [~hxbks2ks] I tried running the test class 100 times in IntelliJ; none of the runs failed. I will try to change the azure-pipelines to run/repeat this single test.
[jira] [Commented] (FLINK-20461) YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
[ https://issues.apache.org/jira/browse/FLINK-20461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17254761#comment-17254761 ]
Xintong Song commented on FLINK-20461:
[~hpeter], is this related to the Azure environment? If not, one thing you can try is to loop the test locally in the IDE. IntelliJ has a feature to repeat a test until failure: [https://httpain.com/blog/debugging-flaky-tests-in-intellij-idea/#:~:text=Retry%20test%20until%20failure=Let's%20edit%20a%20Run%20Configuration,Launch%20the%20test%20again]. If it is related to the Azure environment, the only thing I can think of is to modify the Azure test scripts to run/repeat only this single test case. Unfortunately, I could not find any docs describing how to do that; you may need to look into `azure-pipelines.yml` and `tools/azure-pipelines/`.
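The "repeat until failure" idea suggested above can also be scripted outside the IDE. The following is only a minimal sketch: `RepeatUntilFailure` and `firstFailingRun` are hypothetical names, not part of Flink or its test utilities, and the flaky `Runnable` merely simulates a test that fails occasionally.

```java
/** Hypothetical helper (not part of Flink): reruns a check until it first fails. */
class RepeatUntilFailure {

    /**
     * Runs {@code test} up to {@code maxRuns} times.
     * Returns the 1-based index of the first run that threw, or -1 if every run passed.
     */
    static int firstFailingRun(Runnable test, int maxRuns) {
        for (int run = 1; run <= maxRuns; run++) {
            try {
                test.run();
            } catch (RuntimeException | AssertionError e) {
                // Report the failing iteration so the surrounding logs can be inspected.
                System.err.println("Run " + run + " failed: " + e);
                return run;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Stand-in for the real test body: fails on the 7th invocation.
        int[] calls = {0};
        Runnable flaky = () -> {
            if (++calls[0] == 7) {
                throw new IllegalStateException("File does not exist (simulated)");
            }
        };
        System.out.println("First failing run: " + firstFailingRun(flaky, 100));
    }
}
```

Stopping at the first failure keeps the logs of the failing iteration at the end of the output, which is usually what one wants when chasing a rare failure.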
[jira] [Commented] (FLINK-20461) YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
[ https://issues.apache.org/jira/browse/FLINK-20461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17254757#comment-17254757 ]
Huang Xingbo commented on FLINK-20461:
[~hpeter] This test case does not fail often; you can trigger it by pushing multiple times to your private Azure Pipeline. As for debugging in Azure Pipelines, the only approach I know of is printing logs.
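Since printing logs is the debugging route mentioned above, one option is to log what the parent directory actually contains when the expected file is missing. The sketch below is an illustration under assumptions: it uses `java.nio.file` on the local file system as a stand-in for HDFS, and `MissingFileDiagnostics` and `describe` are hypothetical names. In the real test, the analogous calls would presumably go through Hadoop's `FileSystem` (for example `listStatus` on the parent path), which is not shown here.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical diagnostic helper; local-filesystem stand-in for an HDFS check. */
class MissingFileDiagnostics {

    /** Returns a one-line summary of whether {@code expected} exists and, if not, what its parent holds. */
    static String describe(Path expected) throws IOException {
        if (Files.exists(expected)) {
            return "found " + expected.getFileName();
        }
        Path parent = expected.getParent();
        if (parent == null || !Files.isDirectory(parent)) {
            return "missing " + expected.getFileName() + "; parent directory does not exist";
        }
        // List sibling entries so the log shows what was uploaded instead.
        try (Stream<Path> entries = Files.list(parent)) {
            List<String> names = entries
                    .map(p -> p.getFileName().toString())
                    .sorted()
                    .collect(Collectors.toList());
            return "missing " + expected.getFileName() + "; parent contains " + names;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("staging");
        Files.createFile(dir.resolve("other.jar"));
        System.out.println(describe(dir.resolve("flink-dist.jar")));
    }
}
```

A summary like this distinguishes "the staging directory was never created" from "the directory exists but the jar has a different name", which the bare `FileNotFoundException` does not.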
[jira] [Commented] (FLINK-20461) YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
[ https://issues.apache.org/jira/browse/FLINK-20461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17254750#comment-17254750 ]
Zhenqiu Huang commented on FLINK-20461:
[~hxbks2ks] [~xintongsong] It looks like the issue doesn't happen in every build. Do you have any suggestions for debugging in the Azure environment?
[jira] [Commented] (FLINK-20461) YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
[ https://issues.apache.org/jira/browse/FLINK-20461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17254743#comment-17254743 ]
Huang Xingbo commented on FLINK-20461:
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=11304=logs=5cae8624-c7eb-5c51-92d3-4d2dacedd221=420bd9ec-164e-562e-8947-0dacde3cec91
[jira] [Commented] (FLINK-20461) YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
[ https://issues.apache.org/jira/browse/FLINK-20461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17254397#comment-17254397 ]
Zhenqiu Huang commented on FLINK-20461:
[~rmetzger] The log in Azure is exactly the same as what is reported in the ticket. Is there a way for me to access more environment info in the Azure container?
[jira] [Commented] (FLINK-20461) YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
[ https://issues.apache.org/jira/browse/FLINK-20461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17248893#comment-17248893 ]
Robert Metzger commented on FLINK-20461:
This is probably a rare test failure. Have you checked the logs uploaded to Azure to understand why it failed on CI?
[jira] [Commented] (FLINK-20461) YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
[ https://issues.apache.org/jira/browse/FLINK-20461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17244951#comment-17244951 ]
Zhenqiu Huang commented on FLINK-20461:
[~hxbks2ks] I checked on master. Both test cases of YARNFileReplicationITCase passed. How can I reproduce the error?