Hi there,

Today I realized that we have accumulated a lot of unhousekept Flink distribution jar files, and I would like to know what to do about this, i.e. how to housekeep them properly.
In the HDFS home directory of the submitting user, I find a subdirectory called `.flink` with hundreds of subfolders like `application_1573731655031_0420`, each having the following structure:

    -rw-r--r--   3 dev dev    861 2020-01-27 21:17 /user/dev/.flink/application_1580155950981_0010/4797ff6e-853b-460c-81b3-34078814c5c9-taskmanager-conf.yaml
    -rw-r--r--   3 dev dev    691 2020-01-27 21:16 /user/dev/.flink/application_1580155950981_0010/application_1580155950981_0010-flink-conf.yaml2755466919863419496.tmp
    -rw-r--r--   3 dev dev    861 2020-01-27 21:17 /user/dev/.flink/application_1580155950981_0010/fdb5ef57-c140-4f6d-9791-c226eb1438ce-taskmanager-conf.yaml
    -rw-r--r--   3 dev dev 92.2 M 2020-01-27 21:16 /user/dev/.flink/application_1580155950981_0010/flink-dist_2.11-1.9.1.jar
    drwxr-xr-x   - dev dev      0 2020-01-27 21:16 /user/dev/.flink/application_1580155950981_0010/lib
    -rw-r--r--   3 dev dev  2.6 K 2020-01-27 21:16 /user/dev/.flink/application_1580155950981_0010/log4j.properties
    -rw-r--r--   3 dev dev  2.3 K 2020-01-27 21:16 /user/dev/.flink/application_1580155950981_0010/logback.xml
    drwxr-xr-x   - dev dev      0 2020-01-27 21:16 /user/dev/.flink/application_1580155950981_0010/plugins

With tons of those folders (one for each Flink session we launched and killed in our CI/CD pipeline), they sum up to some terabytes of used space in our HDFS.

I suppose I am killing our Flink sessions the wrong way. We start and stop sessions and jobs separately, like so:

Start:

    ${OS_ROOT}/flink/bin/yarn-session.sh -jm 4g -tm 32g --name "${FLINK_SESSION_NAME}" -d -Denv.java.opts="-XX:+HeapDumpOnOutOfMemoryError"
    ${OS_ROOT}/flink/bin/flink run -m ${FLINK_HOST} [..savepoint/checkpoint options...] -d -n "${JOB_JAR}" $*

Stop:

    ${OS_ROOT}/flink/bin/flink stop -p ${SAVEPOINT_BASEDIR}/${FLINK_JOB_NAME} -m ${FLINK_HOST} ${ID}
    yarn application -kill "${ID}"

"yarn application -kill" was the best I could find, as the Flink documentation only says that the Unix session process should be closed ("Stop the YARN session by stopping the unix process (using CTRL+C) or by entering 'stop' into the client.").

Now my questions: Is there a more elegant way to kill a YARN session (remotely, from some host in the cluster, not necessarily the one that started the detached session) which also does the housekeeping? (See the P.S. below for the route I am thinking of.) Or should I do the housekeeping myself manually? (Pretty easy to script; see the sketch in the P.P.S. below.) Do I need to expect any further side effects when killing the session with "yarn application -kill"?

Best regards
Theo
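P.S. Regarding the "more elegant way": from the docs quote above, my understanding is that one could re-attach the session client to the running application and enter 'stop' there, which should let Flink run its own shutdown (and, I would hope, delete the .flink/application_* staging directory as part of it). A minimal sketch of what I mean, assuming "yarn-session.sh -id" attaches to an existing session as documented and ${ID} holds the YARN application id:

    # Re-attach the session client to the running YARN application and
    # send the 'stop' command on stdin instead of killing the application.
    echo "stop" | ${OS_ROOT}/flink/bin/yarn-session.sh -id "${ID}"

I have not verified that this cleans up the staging directory the same way CTRL+C on the original client does.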
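P.P.S. Regarding the manual housekeeping: a rough sketch of the script I have in mind, assuming our HDFS home is /user/dev, the directory names under .flink are exactly the YARN application ids, and "yarn application -status" still reports finished applications:

    #!/usr/bin/env bash
    # Delete .flink staging directories whose YARN application is no
    # longer alive. The directory name doubles as the application id.
    for dir in $(hdfs dfs -ls -C /user/dev/.flink); do
      app_id=$(basename "$dir")
      state=$(yarn application -status "$app_id" 2>/dev/null \
                | awk '/^[[:space:]]*State :/ {print $3}')
      case "$state" in
        NEW|NEW_SAVING|SUBMITTED|ACCEPTED|RUNNING)
          ;;  # session still alive, keep its staging files
        *)
          echo "Removing $dir (state: ${state:-unknown})"
          hdfs dfs -rm -r -skipTrash "$dir"
          ;;
      esac
    done

A quick "hdfs dfs -du -h /user/dev/.flink" before and after shows how much the per-session flink-dist jars account for. Of course I would prefer not to need such a script at all.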