The problem was caused by the large number of stale Docker images
generated by the Python portable tests and the HDFS IT.
Dumping the docker disk usage gives me:
TYPE            TOTAL   ACTIVE   SIZE      RECLAIMABLE
*Images         1039    356      424GB     384.2GB (90%)*
Containers      987     2        2.042GB   2.041GB (99%)
Local Volumes   126     0        392.8MB   392.8MB (100%)

REPOSITORY                                                 TAG     IMAGE ID      CREATED       SIZE     SHARED SIZE  UNIQUE SIZE  CONTAINERS
jenkins-docker-apache.bintray.io/beam/python3              latest  ff1b949f4442  22 hours ago  1.639GB  922.3MB      716.9MB      0
jenkins-docker-apache.bintray.io/beam/python               latest  1dda7b9d9748  22 hours ago  1.624GB  913.7MB      710.3MB      0
<none>                                                     <none>  05458187a0e3  22 hours ago  732.9MB  625.1MB      107.8MB      4
<none>                                                     <none>  896f35dd685f  23 hours ago  1.639GB  922.3MB      716.9MB      0
<none>                                                     <none>  db4d24ca9f2b  23 hours ago  1.624GB  913.7MB      710.3MB      0
<none>                                                     <none>  547df4d71c31  23 hours ago  732.9MB  625.1MB      107.8MB      4
<none>                                                     <none>  dd7d9582c3e0  23 hours ago  1.639GB  922.3MB      716.9MB      0
<none>                                                     <none>  664aae255239  23 hours ago  1.624GB  913.7MB      710.3MB      0
<none>                                                     <none>  b528fedf9228  23 hours ago  732.9MB  625.1MB      107.8MB      4
<none>                                                     <none>  8e996f22435e  25 hours ago  1.624GB  913.7MB      710.3MB      0
hdfs_it-jenkins-beam_postcommit_python_verify_pr-818_test  latest  24b73b3fec06  25 hours ago  1.305GB  965.7MB      339.5MB      0
<none>                                                     <none>  096325fb48de  25 hours ago  732.9MB  625.1MB      107.8MB      2
jenkins-docker-apache.bintray.io/beam/java                 latest  c36d8ff2945d  25 hours ago  685.6MB  625.1MB      60.52MB      0
<none>                                                     <none>  11c86ebe025f  26 hours ago  1.639GB  922.3MB      716.9MB      0
<none>                                                     <none>  2ecd69c89ec1  26 hours ago  1.624GB  913.7MB      710.3MB      0
hdfs_it-jenkins-beam_postcommit_python_verify-8590_test    latest  3d1d589d44fe  2 days ago    1.305GB  965.7MB      339.5MB      0
hdfs_it-jenkins-beam_postcommit_python_verify_pr-801_test  latest  d1cc503ebe8e  2 days ago    1.305GB  965.7MB      339.2MB      0
hdfs_it-jenkins-beam_postcommit_python_verify-8577_test    latest  8582c6ca6e15  3 days ago    1.305GB  965.7MB      339.2MB      0
hdfs_it-jenkins-beam_postcommit_python_verify-8576_test    latest  4591e0948170  3 days ago    1.305GB  965.7MB      339.2MB      0
hdfs_it-jenkins-beam_postcommit_python_verify-8575_test    latest  ab181c49d56e  4 days ago    1.305GB  965.7MB      339.2MB      0
hdfs_it-jenkins-beam_postcommit_python_verify-8573_test    latest  2104ba0a6db7  4 days ago    1.305GB  965.7MB      339.2MB      0
...
<1000+ images>

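
For anyone who wants to tally a listing like the one above, here is a minimal Python sketch that converts the human-readable sizes to bytes and sums UNIQUE SIZE per repository. The helper names (`parse_size`, `unique_size_by_repo`) are made up for illustration, the sample rows are a handful copied from the listing, and the decimal (base-1000) unit table is an assumption about how Docker prints sizes.

```python
import re
from collections import defaultdict

# Assumption: Docker's human-readable sizes use decimal (base-1000) units.
_UNITS = {"B": 1, "kB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}


def parse_size(text):
    """Convert a size string such as '716.9MB' to a number of bytes."""
    match = re.fullmatch(r"([\d.]+)\s*([kMGT]?B)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    value, unit = match.groups()
    return float(value) * _UNITS[unit]


def unique_size_by_repo(rows):
    """Sum UNIQUE SIZE per repository for (repository, unique_size) pairs."""
    totals = defaultdict(float)
    for repo, unique in rows:
        totals[repo] += parse_size(unique)
    return dict(totals)


# A few (REPOSITORY, UNIQUE SIZE) pairs copied from the listing above.
sample = [
    ("<none>", "716.9MB"),
    ("<none>", "710.3MB"),
    ("<none>", "107.8MB"),
    ("jenkins-docker-apache.bintray.io/beam/java", "60.52MB"),
]

totals = unique_size_by_repo(sample)
# Print repositories from largest to smallest unique footprint.
for repo, size in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{repo}: {size / 1e6:.1f} MB")
```

Grouping by unique size rather than total size avoids counting shared base layers once per image.
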
I removed the unused images, and beam15 is back now.
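
One way to do this kind of cleanup, as a minimal sketch and not necessarily the exact steps taken on beam15: build a `docker image prune` command that removes unused images older than some age. The function names, the 24h threshold, and the dry-run default are all hypothetical choices for illustration; the flags used (`--force`, `--all`, `--filter until=...`) are standard `docker image prune` options.

```python
import subprocess


def prune_command(older_than="24h", include_tagged=True):
    """Build a `docker image prune` command for images older than `older_than`."""
    cmd = ["docker", "image", "prune", "--force", "--filter", f"until={older_than}"]
    if include_tagged:
        # --all also removes tagged images that no container uses,
        # not only dangling (<none>) ones.
        cmd.append("--all")
    return cmd


def prune_images(older_than="24h", include_tagged=True, dry_run=True):
    """Print the prune command, and run it only when dry_run is False."""
    cmd = prune_command(older_than, include_tagged)
    if dry_run:
        print("would run:", " ".join(cmd))
        return cmd
    # check=True raises CalledProcessError if docker exits with an error.
    subprocess.run(cmd, check=True)
    return cmd


# Dry run: only prints the command, nothing is deleted.
prune_images()
```

Calling `prune_images(dry_run=False)` would actually delete images, so the age filter is worth double-checking on a shared worker first.
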
Opened https://issues.apache.org/jira/browse/BEAM-7650.
Ankur, I assigned the issue to you. Feel free to reassign it if needed.
Thank you.
Yifan
On Thu, Jun 27, 2019 at 11:29 AM Yifan Zou wrote:
> Something was eating the disk. Disconnected the worker so jobs could be
> allocated to other nodes. Will look deeper.
> Filesystem      Size  Used  Avail  Use%  Mounted on
> /dev/sda1       485G  485G  96K    100%  /
>
>
> On Thu, Jun 27, 2019 at 10:54 AM Yifan Zou wrote:
>
>> I'm on it.
>>
>> On Thu, Jun 27, 2019 at 10:17 AM Udi Meiri wrote:
>>
>>> Opened a bug here: https://issues.apache.org/jira/browse/BEAM-7648
>>>
>>> Can someone investigate what's going on?
>>>
>>