[jira] [Commented] (FLINK-26030) Set FLINK_LIB_DIR to 'lib' under working dir in YARN containers

2022-02-09 Thread Yang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-26030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17489978#comment-17489978
 ] 

Yang Wang commented on FLINK-26030:
---

Setting the {{FLINK_LIB_DIR}} to workDir/usrlib makes sense to me.

> Set FLINK_LIB_DIR to 'lib' under working dir in YARN containers
> ---
>
> Key: FLINK-26030
> URL: https://issues.apache.org/jira/browse/FLINK-26030
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Reporter: Biao Geng
>Priority: Minor
>
> Currently, we use 
> {{org.apache.flink.runtime.entrypoint.ClusterEntrypointUtils#tryFindUserLibDirectory}}
>  to locate {{usrlib}} on both the Flink client side and the cluster side. 
> This method relies on the value of the environment variable {{FLINK_LIB_DIR}} to 
> find {{usrlib}}.
> That makes sense on the client side, since {{bin/config.sh}} sets 
> {{FLINK_LIB_DIR}} by default (i.e. to {{FLINK_HOME/lib}}) if it is not 
> already set. But in a YARN cluster's containers, when we want to reuse this 
> method to find {{usrlib}}, YARN usually starts the process using commands 
> like
> {quote}/bin/bash -c /usr/lib/jvm/java-1.8.0/bin/java -Xmx1073741824 
> -Xms1073741824 
> -XX:MaxMetaspaceSize=268435456 org.apache.flink.yarn.entrypoint.YarnJobClusterEntrypoint
>  -D jobmanager.memory.off-heap.size=134217728b -D 
> jobmanager.memory.jvm-overhead.min=201326592b -D 
> jobmanager.memory.jvm-metaspace.size=268435456b -D 
> jobmanager.memory.heap.size=1073741824b -D 
> jobmanager.memory.jvm-overhead.max=201326592b ...
> {quote}
> {{FLINK_LIB_DIR}} is not guaranteed to be set in that case. The current code 
> then uses the current working dir to locate {{usrlib}}, which is correct in 
> most cases. But bad things can happen if the machine on which the YARN 
> container resides has already set {{FLINK_LIB_DIR}} to a different folder. In 
> that case, the code will try to find {{usrlib}} in an undesired place.
> One possible solution would be to override {{FLINK_LIB_DIR}} in the YARN 
> container env with the {{lib}} dir under YARN's working dir.
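
The lookup described in the issue can be illustrated with a small, simplified sketch. It is not the Flink source: the class and the helper {{resolveUsrLib}} are hypothetical, and the environment is passed in as a map so that both cases (variable unset, variable inherited from the host) can be tried without touching the real process environment. The assumption is that {{usrlib}} is looked up as a sibling of whatever {{FLINK_LIB_DIR}} points to, and relative to the working dir otherwise.

```java
import java.io.File;
import java.util.Map;
import java.util.Optional;

public class UsrLibLookupSketch {

    // Hypothetical helper, not the Flink implementation: resolves "usrlib" as a
    // sibling of FLINK_LIB_DIR, or as a relative path when the variable is unset.
    static File resolveUsrLib(Map<String, String> env) {
        String libDir = env.get("FLINK_LIB_DIR");
        File home = (libDir == null) ? null : new File(libDir).getParentFile();
        // With a null parent this is the relative path "usrlib", which the JVM
        // resolves against the current working directory.
        return new File(home, "usrlib");
    }

    // Returns the directory only if it actually exists on disk.
    static Optional<File> tryFindUsrLib(Map<String, String> env) {
        File usrLib = resolveUsrLib(env);
        return usrLib.isDirectory() ? Optional.of(usrLib) : Optional.empty();
    }

    public static void main(String[] args) {
        // Variable unset: the lookup stays inside the container's working dir.
        System.out.println(resolveUsrLib(Map.of()));
        // Variable inherited from the host (illustrative path): the lookup
        // leaves the working dir and follows the host setting instead.
        System.out.println(resolveUsrLib(Map.of("FLINK_LIB_DIR", "/opt/flink/lib")));
    }
}
```

The second call is the problematic case from the description: a value of {{FLINK_LIB_DIR}} that was set on the machine for some other purpose silently redirects the lookup.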



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-26030) Set FLINK_LIB_DIR to 'lib' under working dir in YARN containers

2022-02-09 Thread Biao Geng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-26030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17489957#comment-17489957
 ] 

Biao Geng commented on FLINK-26030:
---

Yes, it should be a bug.

I believe that, to be consistent with the behavior on the Flink client side, 
on the YARN cluster side we should set {{FLINK_LIB_DIR}} in the YARN container 
env to the {{lib}} dir under YARN's working dir. If that is the case, I can 
start investigating how to implement it properly. Let me know if the above 
assumption makes sense. Thanks!
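
A minimal sketch of the proposed override, assuming the container environment is assembled as a plain map before launch. The class, the helper {{withContainerLibDir}}, and the paths are hypothetical and only illustrate the idea; how the working dir is actually obtained and passed to YARN is left open here.

```java
import java.util.HashMap;
import java.util.Map;

public class ContainerEnvSketch {

    // Hypothetical helper: copies the inherited environment and forces
    // FLINK_LIB_DIR to "lib" under the container's working directory,
    // replacing any value that was already set on the host.
    static Map<String, String> withContainerLibDir(
            Map<String, String> inherited, String workingDir) {
        Map<String, String> env = new HashMap<>(inherited);
        env.put("FLINK_LIB_DIR", workingDir + "/lib");
        return env;
    }

    public static void main(String[] args) {
        // Illustrative values only: a host-level setting and a made-up working dir.
        Map<String, String> hostEnv = Map.of("FLINK_LIB_DIR", "/opt/other-flink/lib");
        Map<String, String> env = withContainerLibDir(hostEnv, "/yarn/container_01");
        System.out.println(env.get("FLINK_LIB_DIR"));
    }
}
```

Overriding unconditionally, rather than only when the variable is missing, is the point of the proposal: a pre-existing value on the host is exactly the case that leads to the wrong {{usrlib}}.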

> Set FLINK_LIB_DIR to 'lib' under working dir in YARN containers
> ---
>
> Key: FLINK-26030
> URL: https://issues.apache.org/jira/browse/FLINK-26030
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Reporter: Biao Geng
>Priority: Minor
>
> Currently, we use 
> {{org.apache.flink.runtime.entrypoint.ClusterEntrypointUtils#tryFindUserLibDirectory}}
>  to locate {{usrlib}} on both the Flink client side and the cluster side. 
> This method relies on the value of the environment variable {{FLINK_LIB_DIR}} to 
> find {{usrlib}}.
> That makes sense on the client side, since {{bin/config.sh}} sets 
> {{FLINK_LIB_DIR}} by default (i.e. to {{FLINK_HOME/lib}}) if it is not 
> already set. But in a YARN cluster's containers, when we want to reuse this 
> method to find {{usrlib}}, YARN usually starts the process using commands like 
> bq. /bin/bash -c /usr/lib/jvm/java-1.8.0/bin/java -Xmx1073741824 
> -Xms1073741824 
> -XX:MaxMetaspaceSize=268435456 org.apache.flink.yarn.entrypoint.YarnJobClusterEntrypoint
>  -D jobmanager.memory.off-heap.size=134217728b -D 
> jobmanager.memory.jvm-overhead.min=201326592b -D 
> jobmanager.memory.jvm-metaspace.size=268435456b -D 
> jobmanager.memory.heap.size=1073741824b -D 
> jobmanager.memory.jvm-overhead.max=201326592b ...
> {{FLINK_LIB_DIR}} is not guaranteed to be set in that case. The current code 
> then uses the current working dir to locate {{usrlib}}, which is correct in 
> most cases. But bad things can happen if the machine on which the YARN 
> container resides has already set {{FLINK_LIB_DIR}} to a different folder. In 
> that case, the code will try to find {{usrlib}} in an undesired place. 
> One possible solution would be to override {{FLINK_LIB_DIR}} in the YARN 
> container env with the {{lib}} dir under YARN's working dir.





[jira] [Commented] (FLINK-26030) Set FLINK_LIB_DIR to 'lib' under working dir in YARN containers

2022-02-08 Thread Yang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-26030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17489303#comment-17489303
 ] 

Yang Wang commented on FLINK-26030:
---

{{ClusterEntrypointUtils#tryFindUserLibDirectory}} will return the wrong 
{{usrlib}} directory if the environment variable {{FLINK_LIB_DIR}} is 
pre-configured in the YARN cluster. So I think this ticket is a minor bug, not 
an improvement. Right?

> Set FLINK_LIB_DIR to 'lib' under working dir in YARN containers
> ---
>
> Key: FLINK-26030
> URL: https://issues.apache.org/jira/browse/FLINK-26030
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / YARN
>Reporter: Biao Geng
>Priority: Major
>
> Currently, we use 
> {{org.apache.flink.runtime.entrypoint.ClusterEntrypointUtils#tryFindUserLibDirectory}}
>  to locate {{usrlib}} on both the Flink client side and the cluster side. 
> This method relies on the value of the environment variable {{FLINK_LIB_DIR}} to 
> find {{usrlib}}.
> That makes sense on the client side, since {{bin/config.sh}} sets 
> {{FLINK_LIB_DIR}} by default (i.e. to {{FLINK_HOME/lib}}) if it is not 
> already set. But in a YARN cluster's containers, when we want to reuse this 
> method to find {{usrlib}}, YARN usually starts the process using commands like 
> bq. /bin/bash -c /usr/lib/jvm/java-1.8.0/bin/java -Xmx1073741824 
> -Xms1073741824 
> -XX:MaxMetaspaceSize=268435456 org.apache.flink.yarn.entrypoint.YarnJobClusterEntrypoint
>  -D jobmanager.memory.off-heap.size=134217728b -D 
> jobmanager.memory.jvm-overhead.min=201326592b -D 
> jobmanager.memory.jvm-metaspace.size=268435456b -D 
> jobmanager.memory.heap.size=1073741824b -D 
> jobmanager.memory.jvm-overhead.max=201326592b ...
> {{FLINK_LIB_DIR}} is not guaranteed to be set in that case. The current code 
> then uses the current working dir to locate {{usrlib}}, which is correct in 
> most cases. But bad things can happen if the machine on which the YARN 
> container resides has already set {{FLINK_LIB_DIR}} to a different folder. In 
> that case, the code will try to find {{usrlib}} in an undesired place. 
> One possible solution would be to override {{FLINK_LIB_DIR}} in the YARN 
> container env with the {{lib}} dir under YARN's working dir.


