[jira] [Comment Edited] (FLINK-24433) "No space left on device" in Azure e2e tests

2022-05-16 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17537748#comment-17537748
 ] 

Martijn Visser edited comment on FLINK-24433 at 5/17/22 4:14 AM:
-

Removed additional pre-installed packages to free up more disk space before 
starting the E2E tests:
Merged db6baf47130872ccdcd56949510704bbdf69c387 in master
Merged f97738c28bc008fa0393d8f8d430e5c78890f9a4 in release-1.15
Merged 64dac77e95f0ac9fde6f2a789dd5fb5203c16fdf in release-1.14

I'm keeping this issue open until the nightly builds have completed. Hopefully 
we can also get a more permanent fix with FLINK-27649.
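
For anyone who wants to check whether a clean-up like this frees enough room, 
here is a minimal sketch of a pre-flight disk space check. It is not part of 
the merged commits; the function names and the 10 GiB threshold are 
assumptions chosen for illustration only.

```python
import shutil


def free_gib(path: str = "/") -> float:
    """Return the free space, in GiB, on the filesystem that contains `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / 2**30


def has_enough_space(path: str = "/", min_free_gib: float = 10.0) -> bool:
    """Check free space against a threshold (10 GiB is a hypothetical value)."""
    return free_gib(path) >= min_free_gib


if __name__ == "__main__":
    # Report the free space and flag the run before any e2e test starts.
    print(f"Free space on /: {free_gib('/'):.1f} GiB")
    if not has_enough_space("/"):
        raise SystemExit("Not enough free disk space to start the e2e tests")
```

Running a check like this before and after the package removal step would 
show how much space the step actually reclaims.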


was (Author: martijnvisser):
Merged db6baf47130872ccdcd56949510704bbdf69c387 in master: 
Remove additional pre-installed packages to clean up more disk space before 
starting the E2E tests. 

I'm keeping it open until the nightly builds have been completed. Hopefully we 
can also have a more permanent fix with FLINK-27649

> "No space left on device" in Azure e2e tests
>
> Key: FLINK-24433
> URL: https://issues.apache.org/jira/browse/FLINK-24433
> Project: Flink
> Issue Type: Bug
> Components: Build System / Azure Pipelines
> Affects Versions: 1.15.0
> Reporter: Dawid Wysakowicz
> Assignee: Martijn Visser
> Priority: Blocker
> Labels: auto-deprioritized-critical, pull-request-available, test-stability
> Fix For: 1.16.0, 1.14.5, 1.15.1
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=24668&view=logs&j=c88eea3b-64a0-564d-0031-9fdcd7b8abee&t=070ff179-953e-5bda-71fa-d6599415701c&l=19772
> {code}
> Sep 30 17:08:42 Job has been submitted with JobID 
> 5594c18e128a328ede39cfa59cb3cb07
> Sep 30 17:08:56 2021-09-30 17:08:56,809 main ERROR Recovering from 
> StringBuilderEncoder.encode('2021-09-30 17:08:56,807 WARN  
> org.apache.flink.streaming.api.operators.collect.CollectResultFetcher [] - An 
> exception occurred when fetching query results
> Sep 30 17:08:56 java.util.concurrent.ExecutionException: 
> org.apache.flink.runtime.rest.util.RestClientException: [Internal server 
> error.,  Sep 30 17:08:56 org.apache.flink.runtime.messages.FlinkJobNotFoundException: 
> Could not find Flink job (5594c18e128a328ede39cfa59cb3cb07)
> Sep 30 17:08:56   at 
> org.apache.flink.runtime.dispatcher.Dispatcher.getJobMasterGateway(Dispatcher.java:923)
> Sep 30 17:08:56   at 
> org.apache.flink.runtime.dispatcher.Dispatcher.performOperationOnJobMasterGateway(Dispatcher.java:937)
> Sep 30 17:08:56   at 
> org.apache.flink.runtime.dispatcher.Dispatcher.deliverCoordinationRequestToCoordina2021-09-30T17:08:57.1584224Z
>  ##[error]No space left on device
> {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Comment Edited] (FLINK-24433) "No space left on device" in Azure e2e tests

2022-05-13 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17536562#comment-17536562
 ] 

Martijn Visser edited comment on FLINK-24433 at 5/13/22 10:55 AM:
--

In the end, the tarball clean-up failed. While investigating further, I noticed 
that multiple e2e tests had debug logging (though labeled INFO) activated. I've 
disabled that so that the logs no longer flood the disk. This seems to have 
fixed the issue for now. 
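
To spot which logs are flooding the disk, something like the following can 
help. It is only a sketch: the helper name and the directory passed to it are 
hypothetical and not taken from the Flink build scripts.

```python
import os
from typing import List, Tuple


def largest_files(root: str, top_n: int = 5) -> List[Tuple[int, str]]:
    """Return up to `top_n` (size_in_bytes, path) pairs under `root`, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                sizes.append((os.path.getsize(path), path))
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue
    sizes.sort(reverse=True)
    return sizes[:top_n]


if __name__ == "__main__":
    # "." stands in for whatever directory the e2e tests write their logs to.
    for size, path in largest_files("."):
        print(f"{size / 2**20:10.1f} MiB  {path}")
```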

Fixed in
master: 4c138a440f8de315470a663fd751b7293ff3ceb8
release-1.15: c57be81ce5d121523db26c86a48cff9222f688d2
release-1.14: 713f0e03a500852564c06241a2e64d141b31e4fe


was (Author: martijnvisser):
Fixed in
master: 4c138a440f8de315470a663fd751b7293ff3ceb8
release-1.15: c57be81ce5d121523db26c86a48cff9222f688d2
release-1.14: 713f0e03a500852564c06241a2e64d141b31e4fe






[jira] [Comment Edited] (FLINK-24433) "No space left on device" in Azure e2e tests

2022-05-12 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17535942#comment-17535942
 ] 

Martijn Visser edited comment on FLINK-24433 at 5/12/22 8:49 AM:
-

-I'm suspecting FLINK-27578 is the cause of this.-

The Elasticsearch error implies that it is suffering from a lack of disk space, 
so the cause must be a test that runs before it.


was (Author: martijnvisser):
-I'm suspecting FLINK-27578 is the cause of this. -

The Elasticsearch error implies that it is suffering from a lack of disk space, 
so the cause must be a test that runs before it.

> "No space left on device" in Azure e2e tests
>
> Key: FLINK-24433
> URL: https://issues.apache.org/jira/browse/FLINK-24433
> Project: Flink
> Issue Type: Bug
> Components: Build System / Azure Pipelines
> Affects Versions: 1.15.0
> Reporter: Dawid Wysakowicz
> Priority: Blocker
> Labels: auto-deprioritized-critical, test-stability
>





[jira] [Comment Edited] (FLINK-24433) "No space left on device" in Azure e2e tests

2022-05-12 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17535942#comment-17535942
 ] 

Martijn Visser edited comment on FLINK-24433 at 5/12/22 8:49 AM:
-

-I'm suspecting FLINK-27578 is the cause of this. -

The Elasticsearch error implies that it is suffering from a lack of disk space, 
so the cause must be a test that runs before it.


was (Author: martijnvisser):
I'm suspecting FLINK-27578 is the cause of this. 






[jira] [Comment Edited] (FLINK-24433) "No space left on device" in Azure e2e tests

2022-02-07 Thread Yun Gao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17488074#comment-17488074
 ] 

Yun Gao edited comment on FLINK-24433 at 2/7/22, 12:55 PM:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=30806&view=logs&j=e1276d0f-df12-55ec-86b5-c0ad597d83c9
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=30806&view=logs&j=e1276d0f-df12-55ec-86b5-c0ad597d83c9


was (Author: gaoyunhaii):
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=30806&view=logs&j=e1276d0f-df12-55ec-86b5-c0ad597d83c9

> "No space left on device" in Azure e2e tests
>
> Key: FLINK-24433
> URL: https://issues.apache.org/jira/browse/FLINK-24433
> Project: Flink
> Issue Type: Bug
> Components: Build System / Azure Pipelines
> Affects Versions: 1.15.0
> Reporter: Dawid Wysakowicz
> Priority: Critical
> Labels: test-stability
>



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Comment Edited] (FLINK-24433) "No space left on device" in Azure e2e tests

2022-02-03 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17486858#comment-17486858
 ] 

Matthias Pohl edited comment on FLINK-24433 at 2/4/22, 7:48 AM:


I've been experiencing the out-of-space issue consistently now on my own 
AzureCI builds, for example: 
[https://dev.azure.com/mapohl/flink/_build/results?buildId=680&view=logs&j=dafbab6d-4616-5d7b-ee37-3c54e4828fd7&t=e204f081-e6cd-5c04-4f4c-919639b63be9&l=1001]

It always stops at {{KinesisFirehoseSinkITCase}}, which makes me wonder 
whether it's related to FLINK-25924.

I checked against the build failure that [~trohrmann] reported. The stack 
trace indicates that it also fails in {{KinesisFirehoseSinkITCase}}.


was (Author: mapohl):
I've been experiencing the out-of-space issue consistently now on my own 
AzureCI builds, for example: 
[https://dev.azure.com/mapohl/flink/_build/results?buildId=680&view=logs&j=dafbab6d-4616-5d7b-ee37-3c54e4828fd7&t=e204f081-e6cd-5c04-4f4c-919639b63be9&l=1001]

It always stops at {{KinesisFirehoseSinkITCase}}, which makes me wonder 
whether it's related to FLINK-25924.



