[jira] [Comment Edited] (FLINK-15308) Job failed when enable pipelined-shuffle.compression and numberOfTaskSlots > 1
[ https://issues.apache.org/jira/browse/FLINK-15308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16998931#comment-16998931 ]

Feng Jiajie edited comment on FLINK-15308 at 12/18/19 8:35 AM:
---

Here is my test code:
[https://github.com/fengjiajie/my-flink-test|https://github.com/fengjiajie/my-flink-test/tree/master/src/main]

run cmd:
{code:java}
bin/flink run -m yarn-cluster -p 16 -yjm 1024m -ytm 8192m ~/laputa-flink-example-1.0-SNAPSHOT.jar
{code}
and
{code:java}
nc -l 31212
{code}
on the host debugboxcreate431x1

`cn/kbyte/StreamingJob.java:88`
{code:java}
new SocketClientSink<>("debugboxcreate431x1", 31212, new SimpleStringSchema()))
{code}
[~kevin.cyj]

was (Author: fengjiajie):
[https://github.com/fengjiajie/my-flink-test|https://github.com/fengjiajie/my-flink-test/tree/master/src/main]

run cmd:

bin/flink run -m yarn-cluster -p 16 -yjm 1024m -ytm 8192m ~/laputa-flink-example-1.0-SNAPSHOT.jar

[~kevin.cyj]

> Job failed when enable pipelined-shuffle.compression and numberOfTaskSlots > 1
> --
>
> Key: FLINK-15308
> URL: https://issues.apache.org/jira/browse/FLINK-15308
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Network
> Affects Versions: 1.10.0
> Environment: $ git log
> commit 4b54da2c67692b1c9d43e1184c00899b0151b3ae
> Author: bowen.li
> Date: Tue Dec 17 17:37:03 2019 -0800
> Reporter: Feng Jiajie
> Assignee: Yingjie Cao
> Priority: Blocker
>
> Job worked well with default flink-conf.yaml with pipelined-shuffle.compression:
> {code:java}
> taskmanager.numberOfTaskSlots: 1
> taskmanager.network.pipelined-shuffle.compression.enabled: true
> {code}
> But when I set taskmanager.numberOfTaskSlots to 4 or 6:
> {code:java}
> taskmanager.numberOfTaskSlots: 6
> taskmanager.network.pipelined-shuffle.compression.enabled: true
> {code}
> job failed:
> {code:java}
> $ bin/flink run -m yarn-cluster -p 16 -yjm 1024m -ytm 12288m ~/flink-example-1.0-SNAPSHOT.jar
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/data/build/flink/flink-dist/target/flink-1.10-SNAPSHOT-bin/flink-1.10-SNAPSHOT/lib/slf4j-log4j12-1.7.15.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/data/sa_cluster/cloudera/parcels/CDH-5.14.4-1.cdh5.14.4.p0.3/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 2019-12-18 15:04:40,514 WARN org.apache.flink.yarn.cli.FlinkYarnSessionCli - The configuration directory ('/data/build/flink/flink-dist/target/flink-1.10-SNAPSHOT-bin/flink-1.10-SNAPSHOT/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
> 2019-12-18 15:04:40,514 WARN org.apache.flink.yarn.cli.FlinkYarnSessionCli - The configuration directory ('/data/build/flink/flink-dist/target/flink-1.10-SNAPSHOT-bin/flink-1.10-SNAPSHOT/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
> 2019-12-18 15:04:40,907 INFO org.apache.flink.yarn.YarnClusterDescriptor - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2019-12-18 15:04:41,084 INFO org.apache.flink.yarn.YarnClusterDescriptor - Cluster specification: ClusterSpecification{masterMemoryMB=1024, taskManagerMemoryMB=12288, numberTaskManagers=1, slotsPerTaskManager=6}
> 2019-12-18 15:04:42,344 INFO org.apache.flink.yarn.YarnClusterDescriptor - Submitting application master application_1576573857638_0026
> 2019-12-18 15:04:42,370 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1576573857638_0026
> 2019-12-18 15:04:42,371 INFO org.apache.flink.yarn.YarnClusterDescriptor - Waiting for the cluster to be allocated
> 2019-12-18 15:04:42,372 INFO org.apache.flink.yarn.YarnClusterDescriptor - Deploying cluster, current state ACCEPTED
> 2019-12-18 15:04:45,388 INFO org.apache.flink.yarn.YarnClusterDescriptor - YARN application has been deployed successfully.
> 2019-12-18 15:04:45,390 INFO org.apache.flink.yarn.YarnClusterDescriptor - Found Web Interface debugboxcreate431x3.sa:36162 of application 'application_1576573857638_0026'.
> Job has been submitted with JobID 9140c70769f4271cc22ea8becaa26272
>
[jira] [Comment Edited] (FLINK-15308) Job failed when enable pipelined-shuffle.compression and numberOfTaskSlots > 1
[ https://issues.apache.org/jira/browse/FLINK-15308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16998931#comment-16998931 ]

Feng Jiajie edited comment on FLINK-15308 at 12/18/19 8:30 AM:
---

[https://github.com/fengjiajie/my-flink-test|https://github.com/fengjiajie/my-flink-test/tree/master/src/main]

run cmd:

bin/flink run -m yarn-cluster -p 16 -yjm 1024m -ytm 8192m ~/laputa-flink-example-1.0-SNAPSHOT.jar

[~kevin.cyj]

was (Author: fengjiajie):
[https://github.com/fengjiajie/my-flink-test/tree/master/src/main]

run cmd:

bin/flink run -m yarn-cluster -p 16 -yjm 1024m -ytm 8192m ~/laputa-flink-example-1.0-SNAPSHOT.jar

> Job failed when enable pipelined-shuffle.compression and numberOfTaskSlots > 1
> --
>
> Key: FLINK-15308
> URL: https://issues.apache.org/jira/browse/FLINK-15308
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Network
> Affects Versions: 1.10.0
> Environment: $ git log
> commit 4b54da2c67692b1c9d43e1184c00899b0151b3ae
> Author: bowen.li
> Date: Tue Dec 17 17:37:03 2019 -0800
> Reporter: Feng Jiajie
> Assignee: Yingjie Cao
> Priority: Blocker
>
> Job worked well with default flink-conf.yaml with pipelined-shuffle.compression:
> {code:java}
> taskmanager.numberOfTaskSlots: 1
> taskmanager.network.pipelined-shuffle.compression.enabled: true
> {code}
> But when I set taskmanager.numberOfTaskSlots to 4 or 6:
> {code:java}
> taskmanager.numberOfTaskSlots: 6
> taskmanager.network.pipelined-shuffle.compression.enabled: true
> {code}
> job failed:
> {code:java}
> $ bin/flink run -m yarn-cluster -p 16 -yjm 1024m -ytm 12288m ~/flink-example-1.0-SNAPSHOT.jar
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/data/build/flink/flink-dist/target/flink-1.10-SNAPSHOT-bin/flink-1.10-SNAPSHOT/lib/slf4j-log4j12-1.7.15.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/data/sa_cluster/cloudera/parcels/CDH-5.14.4-1.cdh5.14.4.p0.3/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 2019-12-18 15:04:40,514 WARN org.apache.flink.yarn.cli.FlinkYarnSessionCli - The configuration directory ('/data/build/flink/flink-dist/target/flink-1.10-SNAPSHOT-bin/flink-1.10-SNAPSHOT/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
> 2019-12-18 15:04:40,514 WARN org.apache.flink.yarn.cli.FlinkYarnSessionCli - The configuration directory ('/data/build/flink/flink-dist/target/flink-1.10-SNAPSHOT-bin/flink-1.10-SNAPSHOT/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
> 2019-12-18 15:04:40,907 INFO org.apache.flink.yarn.YarnClusterDescriptor - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2019-12-18 15:04:41,084 INFO org.apache.flink.yarn.YarnClusterDescriptor - Cluster specification: ClusterSpecification{masterMemoryMB=1024, taskManagerMemoryMB=12288, numberTaskManagers=1, slotsPerTaskManager=6}
> 2019-12-18 15:04:42,344 INFO org.apache.flink.yarn.YarnClusterDescriptor - Submitting application master application_1576573857638_0026
> 2019-12-18 15:04:42,370 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1576573857638_0026
> 2019-12-18 15:04:42,371 INFO org.apache.flink.yarn.YarnClusterDescriptor - Waiting for the cluster to be allocated
> 2019-12-18 15:04:42,372 INFO org.apache.flink.yarn.YarnClusterDescriptor - Deploying cluster, current state ACCEPTED
> 2019-12-18 15:04:45,388 INFO org.apache.flink.yarn.YarnClusterDescriptor - YARN application has been deployed successfully.
> 2019-12-18 15:04:45,390 INFO org.apache.flink.yarn.YarnClusterDescriptor - Found Web Interface debugboxcreate431x3.sa:36162 of application 'application_1576573857638_0026'.
> Job has been submitted with JobID 9140c70769f4271cc22ea8becaa26272
>
> The program finished with the following exception:
> org.apache.flink.client.program.ProgramInvocationException: The main method caused an error: org.apache.flink.client.program.ProgramInvocationException: Job failed (JobID: 9140c70769f4271cc22ea8becaa26272)
> at
>