[jira] [Comment Edited] (FLINK-24433) "No space left on device" in Azure e2e tests
[ https://issues.apache.org/jira/browse/FLINK-24433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17537748#comment-17537748 ]

Martijn Visser edited comment on FLINK-24433 at 5/17/22 4:14 AM:
-----------------------------------------------------------------

Remove additional pre-installed packages to clean up more disk space before starting the E2E tests:

Merged db6baf47130872ccdcd56949510704bbdf69c387 in master
Merged f97738c28bc008fa0393d8f8d430e5c78890f9a4 in release-1.15
Merged 64dac77e95f0ac9fde6f2a789dd5fb5203c16fdf in release-1.14

I'm keeping it open until the nightly builds have been completed. Hopefully we can also have a more permanent fix with FLINK-27649.

was (Author: martijnvisser):
    Merged db6baf47130872ccdcd56949510704bbdf69c387 in master: Remove additional pre-installed packages to clean up more disk space before starting the E2E tests.

I'm keeping it open until the nightly builds have been completed. Hopefully we can also have a more permanent fix with FLINK-27649.

> "No space left on device" in Azure e2e tests
> --------------------------------------------
>
>                 Key: FLINK-24433
>                 URL: https://issues.apache.org/jira/browse/FLINK-24433
>             Project: Flink
>          Issue Type: Bug
>          Components: Build System / Azure Pipelines
>    Affects Versions: 1.15.0
>            Reporter: Dawid Wysakowicz
>            Assignee: Martijn Visser
>            Priority: Blocker
>              Labels: auto-deprioritized-critical, pull-request-available, test-stability
>             Fix For: 1.16.0, 1.14.5, 1.15.1
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=24668=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=070ff179-953e-5bda-71fa-d6599415701c=19772
> {code}
> Sep 30 17:08:42 Job has been submitted with JobID 5594c18e128a328ede39cfa59cb3cb07
> Sep 30 17:08:56 2021-09-30 17:08:56,809 main ERROR Recovering from StringBuilderEncoder.encode('2021-09-30 17:08:56,807 WARN org.apache.flink.streaming.api.operators.collect.CollectResultFetcher [] - An exception occurred when fetching query results
> Sep 30 17:08:56 java.util.concurrent.ExecutionException: org.apache.flink.runtime.rest.util.RestClientException: [Internal server error.,
> Sep 30 17:08:56 org.apache.flink.runtime.messages.FlinkJobNotFoundException: Could not find Flink job (5594c18e128a328ede39cfa59cb3cb07)
> Sep 30 17:08:56 at org.apache.flink.runtime.dispatcher.Dispatcher.getJobMasterGateway(Dispatcher.java:923)
> Sep 30 17:08:56 at org.apache.flink.runtime.dispatcher.Dispatcher.performOperationOnJobMasterGateway(Dispatcher.java:937)
> Sep 30 17:08:56 at org.apache.flink.runtime.dispatcher.Dispatcher.deliverCoordinationRequestToCoordina2021-09-30T17:08:57.1584224Z ##[error]No space left on device
> {code}

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
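The comment above frees disk space by removing pre-installed packages before the E2E tests start. As a rough illustration only (this is not the script used by the Flink Azure pipeline), a pre-flight check along the following lines could report the remaining space and fail fast when too little is left. The function names and the 10 GiB default threshold are assumptions made for this sketch; `shutil.disk_usage` is part of the Python standard library.

```python
import shutil

GIB = 1024 ** 3


def free_gib(path="."):
    """Return the free space, in GiB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / GIB


def require_free_space(path=".", min_free_gib=10.0):
    """Raise RuntimeError if less than `min_free_gib` GiB is free.

    The 10 GiB default is an arbitrary placeholder, not a value taken
    from the Flink CI configuration.
    """
    available = free_gib(path)
    if available < min_free_gib:
        raise RuntimeError(
            f"only {available:.1f} GiB free on {path!r}, "
            f"need at least {min_free_gib:.1f} GiB"
        )
    return available
```

Calling `require_free_space()` once before the tests and once after the clean-up step would show how much space the clean-up actually recovered.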
[jira] [Comment Edited] (FLINK-24433) "No space left on device" in Azure e2e tests
[ https://issues.apache.org/jira/browse/FLINK-24433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17536562#comment-17536562 ]

Martijn Visser edited comment on FLINK-24433 at 5/13/22 10:55 AM:
------------------------------------------------------------------

In the end, the tarball clean-up failed. While investigating further, I noticed that multiple e2e tests had debug logging (though labeled INFO) activated. I've disabled that to make sure that the logs don't overflow. This seems to have fixed the issue for now.

Fixed in
master: 4c138a440f8de315470a663fd751b7293ff3ceb8
release-1.15: c57be81ce5d121523db26c86a48cff9222f688d2
release-1.14: 713f0e03a500852564c06241a2e64d141b31e4fe

was (Author: martijnvisser):
    Fixed in
master: 4c138a440f8de315470a663fd751b7293ff3ceb8
release-1.15: c57be81ce5d121523db26c86a48cff9222f688d2
release-1.14: 713f0e03a500852564c06241a2e64d141b31e4fe

> "No space left on device" in Azure e2e tests
> --------------------------------------------
>
>                 Key: FLINK-24433
>                 URL: https://issues.apache.org/jira/browse/FLINK-24433
>             Project: Flink
>          Issue Type: Bug
>          Components: Build System / Azure Pipelines
>    Affects Versions: 1.15.0
>            Reporter: Dawid Wysakowicz
>            Assignee: Martijn Visser
>            Priority: Blocker
>              Labels: auto-deprioritized-critical, pull-request-available, test-stability
>             Fix For: 1.16.0, 1.14.5, 1.15.1
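The comment above traces the disk usage to e2e tests that ran with debug logging enabled. One way to spot such tests is to rank the log files they leave behind by size. The following is a minimal sketch under that assumption; the helper name `largest_files`, the `.log` suffix, and the directory layout are hypothetical and not taken from the Flink test scripts.

```python
import os


def largest_files(root, suffix=".log", top=5):
    """Return up to `top` (size_in_bytes, path) pairs, largest first.

    Walks `root` recursively and only considers files whose name ends
    with `suffix`. Files that disappear while scanning are skipped.
    """
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffix):
                continue
            full_path = os.path.join(dirpath, name)
            try:
                sizes.append((os.path.getsize(full_path), full_path))
            except OSError:
                continue
    sizes.sort(reverse=True)
    return sizes[:top]
```

Running it against the directory where a test run writes its logs lists the heaviest log files first, which makes an unexpectedly verbose test easy to pick out.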
[jira] [Comment Edited] (FLINK-24433) "No space left on device" in Azure e2e tests
[ https://issues.apache.org/jira/browse/FLINK-24433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17535942#comment-17535942 ]

Martijn Visser edited comment on FLINK-24433 at 5/12/22 8:49 AM:
-----------------------------------------------------------------

-I'm suspecting FLINK-27578 is the cause of this.-

The Elasticsearch error implies that it's suffering from a lack of disk space, so it must be a test prior to this

was (Author: martijnvisser):
    -I'm suspecting FLINK-27578 is the cause of this. -

The Elasticsearch error implies that it's suffering from a lack of disk space, so it must be a test prior to this

> "No space left on device" in Azure e2e tests
> --------------------------------------------
>
>                 Key: FLINK-24433
>                 URL: https://issues.apache.org/jira/browse/FLINK-24433
>             Project: Flink
>          Issue Type: Bug
>          Components: Build System / Azure Pipelines
>    Affects Versions: 1.15.0
>            Reporter: Dawid Wysakowicz
>            Priority: Blocker
>              Labels: auto-deprioritized-critical, test-stability
[jira] [Comment Edited] (FLINK-24433) "No space left on device" in Azure e2e tests
[ https://issues.apache.org/jira/browse/FLINK-24433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17535942#comment-17535942 ]

Martijn Visser edited comment on FLINK-24433 at 5/12/22 8:49 AM:
-----------------------------------------------------------------

-I'm suspecting FLINK-27578 is the cause of this. -

The Elasticsearch error implies that it's suffering from a lack of disk space, so it must be a test prior to this

was (Author: martijnvisser):
    I'm suspecting FLINK-27578 is the cause of this.

> "No space left on device" in Azure e2e tests
> --------------------------------------------
>
>                 Key: FLINK-24433
>                 URL: https://issues.apache.org/jira/browse/FLINK-24433
>             Project: Flink
>          Issue Type: Bug
>          Components: Build System / Azure Pipelines
>    Affects Versions: 1.15.0
>            Reporter: Dawid Wysakowicz
>            Priority: Blocker
>              Labels: auto-deprioritized-critical, test-stability
[jira] [Comment Edited] (FLINK-24433) "No space left on device" in Azure e2e tests
[ https://issues.apache.org/jira/browse/FLINK-24433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17488074#comment-17488074 ]

Yun Gao edited comment on FLINK-24433 at 2/7/22, 12:55 PM:
-----------------------------------------------------------

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=30806=logs=e1276d0f-df12-55ec-86b5-c0ad597d83c9

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=30806=logs=e1276d0f-df12-55ec-86b5-c0ad597d83c9

was (Author: gaoyunhaii):
    https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=30806=logs=e1276d0f-df12-55ec-86b5-c0ad597d83c9

> "No space left on device" in Azure e2e tests
> --------------------------------------------
>
>                 Key: FLINK-24433
>                 URL: https://issues.apache.org/jira/browse/FLINK-24433
>             Project: Flink
>          Issue Type: Bug
>          Components: Build System / Azure Pipelines
>    Affects Versions: 1.15.0
>            Reporter: Dawid Wysakowicz
>            Priority: Critical
>              Labels: test-stability
[jira] [Comment Edited] (FLINK-24433) "No space left on device" in Azure e2e tests
[ https://issues.apache.org/jira/browse/FLINK-24433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17486858#comment-17486858 ]

Matthias Pohl edited comment on FLINK-24433 at 2/4/22, 7:48 AM:
----------------------------------------------------------------

I'm now consistently experiencing the out-of-space issue on my own AzureCI builds, for example:
[https://dev.azure.com/mapohl/flink/_build/results?buildId=680=logs=dafbab6d-4616-5d7b-ee37-3c54e4828fd7=e204f081-e6cd-5c04-4f4c-919639b63be9=1001]

It's always stopping at {{KinesisFirehoseSinkITCase}}, which makes me wonder whether it's related to FLINK-25924.

I verified this with the build failure [~trohrmann] reported. The stacktrace indicates that it's also failing in {{KinesisFirehoseSinkITCase}}.

was (Author: mapohl):
    I'm now consistently experiencing the out-of-space issue on my own AzureCI builds, for example:
[https://dev.azure.com/mapohl/flink/_build/results?buildId=680=logs=dafbab6d-4616-5d7b-ee37-3c54e4828fd7=e204f081-e6cd-5c04-4f4c-919639b63be9=1001]

It's always stopping at {{KinesisFirehoseSinkITCase}}, which makes me wonder whether it's related to FLINK-25924.

> "No space left on device" in Azure e2e tests
> --------------------------------------------
>
>                 Key: FLINK-24433
>                 URL: https://issues.apache.org/jira/browse/FLINK-24433
>             Project: Flink
>          Issue Type: Bug
>          Components: Build System / Azure Pipelines
>    Affects Versions: 1.15.0
>            Reporter: Dawid Wysakowicz
>            Priority: Critical
>              Labels: test-stability