[jira] [Commented] (FLINK-15308) Job failed when enable pipelined-shuffle.compression and numberOfTaskSlots > 1
[ https://issues.apache.org/jira/browse/FLINK-15308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17021053#comment-17021053 ]

Gary Yao commented on FLINK-15308:
----------------------------------

If affectsVersion is 1.10.0 and fixVersion is 1.10.0, I think we can remove the release note since there is no behavioral change compared to 1.9.

> Job failed when enable pipelined-shuffle.compression and numberOfTaskSlots > 1
> ------------------------------------------------------------------------------
>
>                 Key: FLINK-15308
>                 URL: https://issues.apache.org/jira/browse/FLINK-15308
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Network
>    Affects Versions: 1.10.0
>         Environment: $ git log
> commit 4b54da2c67692b1c9d43e1184c00899b0151b3ae
> Author: bowen.li
> Date: Tue Dec 17 17:37:03 2019 -0800
>            Reporter: Feng Jiajie
>            Assignee: Yingjie Cao
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 1.10.0
>
>         Attachments: image-2019-12-19-10-55-30-644.png
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> Job worked well with default flink-conf.yaml with pipelined-shuffle.compression:
> {code:java}
> taskmanager.numberOfTaskSlots: 1
> taskmanager.network.pipelined-shuffle.compression.enabled: true
> {code}
> But when I set taskmanager.numberOfTaskSlots to 4 or 6:
> {code:java}
> taskmanager.numberOfTaskSlots: 6
> taskmanager.network.pipelined-shuffle.compression.enabled: true
> {code}
> job failed:
> {code:java}
> $ bin/flink run -m yarn-cluster -p 16 -yjm 1024m -ytm 12288m ~/flink-example-1.0-SNAPSHOT.jar
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/data/build/flink/flink-dist/target/flink-1.10-SNAPSHOT-bin/flink-1.10-SNAPSHOT/lib/slf4j-log4j12-1.7.15.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/data/sa_cluster/cloudera/parcels/CDH-5.14.4-1.cdh5.14.4.p0.3/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 2019-12-18 15:04:40,514 WARN org.apache.flink.yarn.cli.FlinkYarnSessionCli - The configuration directory ('/data/build/flink/flink-dist/target/flink-1.10-SNAPSHOT-bin/flink-1.10-SNAPSHOT/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
> 2019-12-18 15:04:40,514 WARN org.apache.flink.yarn.cli.FlinkYarnSessionCli - The configuration directory ('/data/build/flink/flink-dist/target/flink-1.10-SNAPSHOT-bin/flink-1.10-SNAPSHOT/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
> 2019-12-18 15:04:40,907 INFO org.apache.flink.yarn.YarnClusterDescriptor - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2019-12-18 15:04:41,084 INFO org.apache.flink.yarn.YarnClusterDescriptor - Cluster specification: ClusterSpecification{masterMemoryMB=1024, taskManagerMemoryMB=12288, numberTaskManagers=1, slotsPerTaskManager=6}
> 2019-12-18 15:04:42,344 INFO org.apache.flink.yarn.YarnClusterDescriptor - Submitting application master application_1576573857638_0026
> 2019-12-18 15:04:42,370 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1576573857638_0026
> 2019-12-18 15:04:42,371 INFO org.apache.flink.yarn.YarnClusterDescriptor - Waiting for the cluster to be allocated
> 2019-12-18 15:04:42,372 INFO org.apache.flink.yarn.YarnClusterDescriptor - Deploying cluster, current state ACCEPTED
> 2019-12-18 15:04:45,388 INFO org.apache.flink.yarn.YarnClusterDescriptor - YARN application has been deployed successfully.
> 2019-12-18 15:04:45,390 INFO org.apache.flink.yarn.YarnClusterDescriptor - Found Web Interface debugboxcreate431x3.sa:36162 of application 'application_1576573857638_0026'.
> Job has been submitted with JobID 9140c70769f4271cc22ea8becaa26272
>
> The program finished with the following exception:
> org.apache.flink.client.program.ProgramInvocationException: The main method caused an error: org.apache.flink.client.program.ProgramInvocationException: Job failed (JobID: 9140c70769f4271cc22ea8becaa26272)
>       at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:335)
>       at org.apache.flink.client.program.PackagedProgram.invokeI
[jira] [Commented] (FLINK-15308) Job failed when enable pipelined-shuffle.compression and numberOfTaskSlots > 1
[ https://issues.apache.org/jira/browse/FLINK-15308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000982#comment-17000982 ]

Yingjie Cao commented on FLINK-15308:
-------------------------------------

Fix via 8525c378b91c16245d2e0456d423ed39f5c9b330 on master.
Fix via b87fc76ace24c69423037e68220091cb2965ac3e on release-1.10.
[jira] [Commented] (FLINK-15308) Job failed when enable pipelined-shuffle.compression and numberOfTaskSlots > 1
[ https://issues.apache.org/jira/browse/FLINK-15308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16999892#comment-16999892 ]

Feng Jiajie commented on FLINK-15308:
-------------------------------------

Really looking forward to it.
[jira] [Commented] (FLINK-15308) Job failed when enable pipelined-shuffle.compression and numberOfTaskSlots > 1
[ https://issues.apache.org/jira/browse/FLINK-15308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16999868#comment-16999868 ]

Yingjie Cao commented on FLINK-15308:
-------------------------------------

The problem is caused by a race between multiple Netty threads. The simplest fix may be to make the BufferCompressor/BufferDecompressor utilities thread-safe; however, that can complicate the network stack. After an offline discussion, we decided to disable data compression for pipelined mode in release-1.10. We may add the feature back if there is a better solution in the future.
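The kind of race described in the comment above can be sketched in isolation. This is only an illustration under stated assumptions: `ScratchCodec`, `PerThreadCodec`, and `CodecDemo` are hypothetical names, the run-length encoding is a toy, and none of this is Flink's BufferCompressor or the fix that was committed (the thread says compression was disabled for pipelined mode instead). The sketch assumes a codec that reuses an internal scratch buffer, which is unsafe to share between threads, and shows one possible mitigation: giving each thread its own instance via `ThreadLocal`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical stateful codec, for illustration only; NOT Flink's BufferCompressor. */
final class ScratchCodec {
    // Reused across calls. Two threads sharing one instance could overwrite each other's output here.
    private final byte[] scratch = new byte[1024];

    /** Toy run-length encoding: emits (runLength, value) pairs. */
    byte[] encode(byte[] input) {
        if (input.length > scratch.length / 2) {
            throw new IllegalArgumentException("input too large for scratch buffer");
        }
        int out = 0;
        int i = 0;
        while (i < input.length) {
            byte value = input[i];
            int run = 1;
            while (i + run < input.length && input[i + run] == value && run < 127) {
                run++;
            }
            scratch[out++] = (byte) run;
            scratch[out++] = value;
            i += run;
        }
        // Copy out so the caller never holds a reference to the reused buffer.
        return Arrays.copyOf(scratch, out);
    }
}

/** One codec per thread, so the scratch buffer is never shared. */
final class PerThreadCodec {
    private static final ThreadLocal<ScratchCodec> CODEC =
            ThreadLocal.withInitial(ScratchCodec::new);

    static byte[] encode(byte[] input) {
        return CODEC.get().encode(input);
    }
}

final class CodecDemo {
    /** Runs `threads` workers, each encoding its own payload `rounds` times; true if all outputs were correct. */
    static boolean allMatch(int threads, int rounds) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final byte fill = (byte) (t + 1);
                Callable<Boolean> task = () -> {
                    byte[] payload = new byte[100];
                    Arrays.fill(payload, fill);
                    byte[] expected = {100, fill};
                    for (int r = 0; r < rounds; r++) {
                        if (!Arrays.equals(PerThreadCodec.encode(payload), expected)) {
                            return false;
                        }
                    }
                    return true;
                };
                results.add(pool.submit(task));
            }
            for (Future<Boolean> result : results) {
                if (!result.get()) {
                    return false;
                }
            }
            return true;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(allMatch(4, 1000));
    }
}
```

Replacing `PerThreadCodec.encode` with calls on a single shared `ScratchCodec` would reintroduce the shared scratch buffer; a `synchronized` method would be the other obvious option, at the cost of serializing the threads.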
[jira] [Commented] (FLINK-15308) Job failed when enable pipelined-shuffle.compression and numberOfTaskSlots > 1
[ https://issues.apache.org/jira/browse/FLINK-15308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16999708#comment-16999708 ]

Yingjie Cao commented on FLINK-15308:
-------------------------------------

[~fengjiajie] I also reproduced it.
[jira] [Commented] (FLINK-15308) Job failed when enable pipelined-shuffle.compression and numberOfTaskSlots > 1
[ https://issues.apache.org/jira/browse/FLINK-15308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16999706#comment-16999706 ]

Feng Jiajie commented on FLINK-15308:
-------------------------------------

Hi [~kevin.cyj], I can reproduce the problem every time.

YARN cluster: 3 nodes (8 cores, 32 GB)
{code:java}
$ cat flink-conf.yaml | grep -v '^#' | grep -v '^$'
jobmanager.rpc.address: localhost
jobmanager.rpc.port: 6123
jobmanager.heap.size: 1024m
taskmanager.memory.total-process.size: 1024m
taskmanager.numberOfTaskSlots: 6
parallelism.default: 1
taskmanager.network.pipelined-shuffle.compression.enabled: true
jobmanager.execution.failover-strategy: region
{code}
[jira] [Commented] (FLINK-15308) Job failed when enable pipelined-shuffle.compression and numberOfTaskSlots > 1
[ https://issues.apache.org/jira/browse/FLINK-15308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16999694#comment-16999694 ]

Yingjie Cao commented on FLINK-15308:
-------------------------------------

[~fengjiajie] I cannot reproduce the problem in my test environment. Are there any other settings?

!image-2019-12-19-10-55-30-644.png!
[jira] [Commented] (FLINK-15308) Job failed when enable pipelined-shuffle.compression and numberOfTaskSlots > 1
[ https://issues.apache.org/jira/browse/FLINK-15308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16998934#comment-16998934 ]
Yingjie Cao commented on FLINK-15308:
[~fengjiajie] Thanks for reporting the issue and sharing the code. I'll try to reproduce the problem.
[jira] [Commented] (FLINK-15308) Job failed when enable pipelined-shuffle.compression and numberOfTaskSlots > 1
[ https://issues.apache.org/jira/browse/FLINK-15308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16998931#comment-16998931 ]
Feng Jiajie commented on FLINK-15308:
[https://github.com/fengjiajie/my-flink-test/tree/master/src/main]
run cmd: bin/flink run -m yarn-cluster -p 16 -yjm 1024m -ytm 8192m ~/laputa-flink-example-1.0-SNAPSHOT.jar
[jira] [Commented] (FLINK-15308) Job failed when enable pipelined-shuffle.compression and numberOfTaskSlots > 1
[ https://issues.apache.org/jira/browse/FLINK-15308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16998892#comment-16998892 ]
Yingjie Cao commented on FLINK-15308:
[~fengjiajie] Could you share your code? I would like to see if I can reproduce the problem locally.