[jira] [Commented] (FLINK-26030) Set FLINK_LIB_DIR to 'lib' under working dir in YARN containers
[ https://issues.apache.org/jira/browse/FLINK-26030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17489978#comment-17489978 ]

Yang Wang commented on FLINK-26030:
---

Setting the {{FLINK_LIB_DIR}} to workDir/usrlib makes sense to me.

> Set FLINK_LIB_DIR to 'lib' under working dir in YARN containers
> ---
>
> Key: FLINK-26030
> URL: https://issues.apache.org/jira/browse/FLINK-26030
> Project: Flink
> Issue Type: Bug
> Components: Deployment / YARN
> Reporter: Biao Geng
> Priority: Minor
>
> Currently, we use
> {{org.apache.flink.runtime.entrypoint.ClusterEntrypointUtils#tryFindUserLibDirectory}}
> to locate {{usrlib}} on both the Flink client side and the cluster side.
> This method relies on the value of the environment variable {{FLINK_LIB_DIR}}
> to find {{usrlib}}.
> This makes sense on the client side, since {{bin/config.sh}} sets
> {{FLINK_LIB_DIR}} by default (i.e. to {{FLINK_HOME/lib}}) if it is not already set.
> But in a YARN cluster's containers, where we want to reuse this method to
> find {{usrlib}}, YARN usually starts the process using commands like
> {quote}/bin/bash -c /usr/lib/jvm/java-1.8.0/bin/java -Xmx1073741824
> -Xms1073741824
> -XX:MaxMetaspaceSize=268435456 org.apache.flink.yarn.entrypoint.YarnJobClusterEntrypoint
> -D jobmanager.memory.off-heap.size=134217728b -D
> jobmanager.memory.jvm-overhead.min=201326592b -D
> jobmanager.memory.jvm-metaspace.size=268435456b -D
> jobmanager.memory.heap.size=1073741824b -D
> jobmanager.memory.jvm-overhead.max=201326592b ...
> {quote}
> so {{FLINK_LIB_DIR}} is not guaranteed to be set in that case. The current code
> then uses the current working dir to locate {{usrlib}}, which is correct in
> most cases. However, if the machine hosting the YARN container already has
> {{FLINK_LIB_DIR}} set to a different folder, the code will try to find
> {{usrlib}} in an undesired place.
> One possible solution would be to override {{FLINK_LIB_DIR}} in the YARN
> container env so that it points to the {{lib}} dir under YARN's working dir.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
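To make the failure mode in the description concrete, here is a minimal sketch of the lookup. It is not the actual {{ClusterEntrypointUtils}} code: the class {{UsrLibLookupSketch}}, the helper {{resolveUsrLib}}, and the paths are made up for illustration, and it assumes that {{usrlib}} is resolved as a sibling of the directory named by {{FLINK_LIB_DIR}}, falling back to the working dir when the variable is unset.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

final class UsrLibLookupSketch {

    // Hypothetical stand-in for the lookup described in the ticket; not Flink's code.
    // Assumption: usrlib sits next to the directory named by FLINK_LIB_DIR.
    static Path resolveUsrLib(Map<String, String> env, Path workingDir) {
        String libDir = env.get("FLINK_LIB_DIR");
        if (libDir == null) {
            // Variable unset: fall back to the container's working directory.
            return workingDir.resolve("usrlib");
        }
        Path parent = Paths.get(libDir).getParent();
        // A bare relative value such as "lib" has no parent; use the working dir then.
        return (parent == null ? workingDir : parent).resolve("usrlib");
    }

    public static void main(String[] args) {
        Path workDir = Paths.get("/yarn/container_01"); // made-up container working dir

        // Case 1: FLINK_LIB_DIR is not set, so usrlib is found under the working dir.
        Map<String, String> cleanEnv = new HashMap<>();
        System.out.println("unset:     " + resolveUsrLib(cleanEnv, workDir));

        // Case 2: the host machine already exports FLINK_LIB_DIR, so the lookup
        // leaves the container's working dir and points somewhere else.
        Map<String, String> hostEnv = new HashMap<>();
        hostEnv.put("FLINK_LIB_DIR", "/opt/flink/lib"); // made-up host setting
        System.out.println("inherited: " + resolveUsrLib(hostEnv, workDir));
    }
}
```

In the second case the resolved directory is derived from the host's setting rather than from the container's working dir, which is the "undesired place" the ticket refers to.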
[jira] [Commented] (FLINK-26030) Set FLINK_LIB_DIR to 'lib' under working dir in YARN containers
[ https://issues.apache.org/jira/browse/FLINK-26030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17489957#comment-17489957 ]

Biao Geng commented on FLINK-26030:
---

Yes, it should be a bug. I believe that, to be consistent with the behavior on the Flink client side, on the YARN cluster side we should set {{FLINK_LIB_DIR}} in the YARN container env to the {{lib}} dir under YARN's working dir.
If that is the case, I can start investigating how to implement it properly.
Let me know if the above assumption makes sense. Thanks!

> Set FLINK_LIB_DIR to 'lib' under working dir in YARN containers
> ---
>
> Key: FLINK-26030
> URL: https://issues.apache.org/jira/browse/FLINK-26030
> Project: Flink
> Issue Type: Bug
> Components: Deployment / YARN
> Reporter: Biao Geng
> Priority: Minor
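The override proposed in this comment could look roughly like the sketch below. It deliberately uses a plain map instead of the YARN or Flink APIs, so the class {{ContainerEnvOverrideSketch}}, the helper {{buildContainerEnv}}, and the paths are hypothetical and only illustrate the idea: whatever {{FLINK_LIB_DIR}} the host provides is replaced with {{lib}} under the container's working dir before the process is started.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

final class ContainerEnvOverrideSketch {

    // Hypothetical helper: builds the environment for a container launch.
    // Any inherited FLINK_LIB_DIR is overwritten; all other variables are kept.
    static Map<String, String> buildContainerEnv(Map<String, String> inheritedEnv, Path workingDir) {
        Map<String, String> env = new HashMap<>(inheritedEnv);
        env.put("FLINK_LIB_DIR", workingDir.resolve("lib").toString());
        return env;
    }

    public static void main(String[] args) {
        Path workDir = Paths.get("/yarn/container_01"); // made-up container working dir

        Map<String, String> hostEnv = new HashMap<>();
        hostEnv.put("FLINK_LIB_DIR", "/opt/flink/lib"); // made-up value pre-set on the host
        hostEnv.put("JAVA_HOME", "/usr/lib/jvm/java-1.8.0");

        Map<String, String> containerEnv = buildContainerEnv(hostEnv, workDir);
        System.out.println("FLINK_LIB_DIR = " + containerEnv.get("FLINK_LIB_DIR"));
        System.out.println("JAVA_HOME     = " + containerEnv.get("JAVA_HOME"));
    }
}
```

Copying the inherited map first and then overwriting the single key keeps the host's other settings intact while making the lookup independent of a pre-existing {{FLINK_LIB_DIR}}.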
[jira] [Commented] (FLINK-26030) Set FLINK_LIB_DIR to 'lib' under working dir in YARN containers
[ https://issues.apache.org/jira/browse/FLINK-26030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17489303#comment-17489303 ]

Yang Wang commented on FLINK-26030:
---

{{ClusterEntrypointUtils#tryFindUserLibDirectory}} will return the wrong {{usrlib}} directory if the environment variable {{FLINK_LIB_DIR}} is pre-configured in the YARN cluster. So I think this ticket is a minor bug, not an improvement. Right?

> Set FLINK_LIB_DIR to 'lib' under working dir in YARN containers
> ---
>
> Key: FLINK-26030
> URL: https://issues.apache.org/jira/browse/FLINK-26030
> Project: Flink
> Issue Type: Improvement
> Components: Deployment / YARN
> Reporter: Biao Geng
> Priority: Major