This is an automated email from the ASF dual-hosted git repository.

yikun pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark-docker.git
The following commit(s) were added to refs/heads/master by this push:
     new c07ae18  [SPARK-43368] Use `libnss_wrapper` to fake passwd entry
c07ae18 is described below

commit c07ae18355678370fd270bedb8b39ab2aceb5ac2
Author: Yikun Jiang <yikunk...@gmail.com>
AuthorDate: Fri Jun 2 10:27:01 2023 +0800

    [SPARK-43368] Use `libnss_wrapper` to fake passwd entry

### What changes were proposed in this pull request?

Use `libnss_wrapper` to fake a passwd entry instead of modifying `/etc/passwd`, to resolve the random-UID problem. The fake passwd entry is only set up for the driver/executor commands; for pass-through commands such as `bash`, no fake entry is set.

### Why are the changes needed?

Previously, we appended an entry for the current UID directly to `/etc/passwd`, mainly to handle the [OpenShift anonymous random `uid` case](https://github.com/docker-library/official-images/pull/13089#issuecomment-1534706523) (see also https://github.com/apache-spark-on-k8s/spark/pull/404). However, this approach introduces a potential security issue because it requires overly broad write permissions on `/etc/passwd`. Per the DOI reviewer's [suggestion](https://github.com/docker-library/official-images/pull/13089#issuecomment-1561793792), it is better to resolve this with [libnss_wrapper](https://cwrap.org/nss_wrapper.html), a library that provides a fake passwd entry via the `LD_PRELOAD`, `NSS_WRAPPER_PASSWD`, and `NSS_WRAPPER_GROUP` environment variables.
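As a rough sketch of the nss_wrapper idea (not the exact entrypoint code; the library path here is illustrative, and the real script probes several locations), the approach boils down to writing temporary passwd/group files and preloading the library:

```shell
#!/bin/bash
# Sketch: fake a passwd entry for an anonymous UID via nss_wrapper.
myuid="$(id -u)"
mygid="$(id -g)"

# Only fake an entry when the current UID has no passwd entry.
if ! getent passwd "$myuid" > /dev/null 2>&1; then
    NSS_WRAPPER_PASSWD="$(mktemp)"
    NSS_WRAPPER_GROUP="$(mktemp)"
    # nss_wrapper consults these files instead of /etc/passwd and /etc/group,
    # so no write access to /etc/passwd is needed.
    printf 'spark:x:%s:%s:anonymous uid:/opt/spark:/bin/false\n' \
        "$myuid" "$mygid" > "$NSS_WRAPPER_PASSWD"
    printf 'spark:x:%s:\n' "$mygid" > "$NSS_WRAPPER_GROUP"
    export LD_PRELOAD=/usr/lib/libnss_wrapper.so NSS_WRAPPER_PASSWD NSS_WRAPPER_GROUP
fi
```

Because the wrapping happens purely through environment variables, it applies only to the exec'd process and its children, and leaves the image's `/etc/passwd` untouched.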
For example, when the random UID is `1000`, the environment will be:

```
spark@6f41b8e5be9b:/opt/spark/work-dir$ id -u
1000
spark@6f41b8e5be9b:/opt/spark/work-dir$ id -g
1000
spark@6f41b8e5be9b:/opt/spark/work-dir$ whoami
spark
spark@6f41b8e5be9b:/opt/spark/work-dir$ echo $LD_PRELOAD
/usr/lib/libnss_wrapper.so
spark@6f41b8e5be9b:/opt/spark/work-dir$ echo $NSS_WRAPPER_PASSWD
/tmp/tmp.r5x4SMX35B
spark@6f41b8e5be9b:/opt/spark/work-dir$ cat /tmp/tmp.r5x4SMX35B
spark:x:1000:1000:${SPARK_USER_NAME:-anonymous uid}:/opt/spark:/bin/false
spark@6f41b8e5be9b:/opt/spark/work-dir$ echo $NSS_WRAPPER_GROUP
/tmp/tmp.XcnnYuD68r
spark@6f41b8e5be9b:/opt/spark/work-dir$ cat /tmp/tmp.XcnnYuD68r
spark:x:1000:
```

### Does this PR introduce _any_ user-facing change?

Yes, fake passwd/group entries are now provided via environment variables rather than by changing `/etc/passwd`.

### How was this patch tested?

#### 1. Without `attempt_setup_fake_passwd_entry`, the user is `I have no name!`

```
# docker run -it --rm --user 1000:1000 spark-test bash
groups: cannot find name for group ID 1000
I have no name!@998110cd5a26:/opt/spark/work-dir$
I have no name!@0fea1d27d67d:/opt/spark/work-dir$ id -u
1000
I have no name!@0fea1d27d67d:/opt/spark/work-dir$ id -g
1000
I have no name!@0fea1d27d67d:/opt/spark/work-dir$ whoami
whoami: cannot find name for user ID 1000
```

#### 2. Manually stub `attempt_setup_fake_passwd_entry`; the user is `spark`.

2.1 Apply a temporary change to the pass-through cmd:

```patch
diff --git a/entrypoint.sh.template b/entrypoint.sh.template
index 08fc925..77d5b04 100644
--- a/entrypoint.sh.template
+++ b/entrypoint.sh.template
@@ -118,6 +118,7 @@ case "$1" in
   *)
     # Non-spark-on-k8s command provided, proceeding in pass-through mode...
+    attempt_setup_fake_passwd_entry
     exec "$@"
     ;;
 esac
```

2.2 Build and run the image, specifying a random UID/GID 1000:

```bash
$ docker build . -t spark-test
$ docker run -it --rm --user 1000:1000 spark-test bash

# the user is set to spark rather than an unknown user
spark@6f41b8e5be9b:/opt/spark/work-dir$ id -u
1000
spark@6f41b8e5be9b:/opt/spark/work-dir$ id -g
1000
spark@6f41b8e5be9b:/opt/spark/work-dir$ whoami
spark
```

```
# NSS env is set right
spark@6f41b8e5be9b:/opt/spark/work-dir$ echo $LD_PRELOAD
/usr/lib/libnss_wrapper.so
spark@6f41b8e5be9b:/opt/spark/work-dir$ echo $NSS_WRAPPER_PASSWD
/tmp/tmp.r5x4SMX35B
spark@6f41b8e5be9b:/opt/spark/work-dir$ cat /tmp/tmp.r5x4SMX35B
spark:x:1000:1000:${SPARK_USER_NAME:-anonymous uid}:/opt/spark:/bin/false
spark@6f41b8e5be9b:/opt/spark/work-dir$ echo $NSS_WRAPPER_GROUP
/tmp/tmp.XcnnYuD68r
spark@6f41b8e5be9b:/opt/spark/work-dir$ cat /tmp/tmp.XcnnYuD68r
spark:x:1000:
```

#### 3. If an existing user is specified (such as `spark`, `root`), no fake setup is done

```bash
# docker run -it --rm --user 0 spark-test bash
root@e5bf55d4df22:/opt/spark/work-dir# echo $LD_PRELOAD

```

```bash
# docker run -it --rm spark-test bash
spark@def8d8ca4e7d:/opt/spark/work-dir$ echo $LD_PRELOAD

```

Closes #45 from Yikun/SPARK-43368.
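The entrypoint change below probes several common install locations for `libnss_wrapper.so` using bash brace expansion combined with globbing. A minimal standalone sketch of that probing idea (not the exact script; the candidate paths mirror the pattern in the diff):

```shell
#!/bin/bash
# Probe candidate locations for libnss_wrapper.so and pick the first
# non-empty file. Brace expansion {/usr,}/lib{/*,} yields, before globbing:
#   /usr/lib/*/libnss_wrapper.so  /usr/lib/libnss_wrapper.so
#   /lib/*/libnss_wrapper.so      /lib/libnss_wrapper.so
found=""
for wrapper in {/usr,}/lib{/*,}/libnss_wrapper.so; do
    if [ -s "$wrapper" ]; then
        found="$wrapper"
        break
    fi
done
echo "${found:-no libnss_wrapper.so found}"
```

This lets the same entrypoint work across distributions that install the library under multi-arch directories (e.g. `/usr/lib/x86_64-linux-gnu/`) or directly under `/usr/lib/`, without hard-coding one path.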
Authored-by: Yikun Jiang <yikunk...@gmail.com>
Signed-off-by: Yikun Jiang <yikunk...@gmail.com>
---
 3.4.0/scala2.12-java11-ubuntu/Dockerfile    |  3 +--
 3.4.0/scala2.12-java11-ubuntu/entrypoint.sh | 41 +++++++++++++++++------------
 Dockerfile.template                         |  3 +--
 entrypoint.sh.template                      | 41 +++++++++++++++++------------
 4 files changed, 50 insertions(+), 38 deletions(-)

diff --git a/3.4.0/scala2.12-java11-ubuntu/Dockerfile b/3.4.0/scala2.12-java11-ubuntu/Dockerfile
index a680106..aa754b7 100644
--- a/3.4.0/scala2.12-java11-ubuntu/Dockerfile
+++ b/3.4.0/scala2.12-java11-ubuntu/Dockerfile
@@ -24,7 +24,7 @@ RUN groupadd --system --gid=${spark_uid} spark && \
 RUN set -ex; \
     apt-get update; \
     ln -s /lib /lib64; \
-    apt install -y gnupg2 wget bash tini libc6 libpam-modules krb5-user libnss3 procps net-tools gosu; \
+    apt install -y gnupg2 wget bash tini libc6 libpam-modules krb5-user libnss3 procps net-tools gosu libnss-wrapper; \
     mkdir -p /opt/spark; \
     mkdir /opt/spark/python; \
     mkdir -p /opt/spark/examples; \
@@ -33,7 +33,6 @@ RUN set -ex; \
     touch /opt/spark/RELEASE; \
     chown -R spark:spark /opt/spark; \
     echo "auth required pam_wheel.so use_uid" >> /etc/pam.d/su; \
-    chgrp root /etc/passwd && chmod ug+rw /etc/passwd; \
     rm -rf /var/cache/apt/*; \
     rm -rf /var/lib/apt/lists/*

diff --git a/3.4.0/scala2.12-java11-ubuntu/entrypoint.sh b/3.4.0/scala2.12-java11-ubuntu/entrypoint.sh
index 6def3f9..08fc925 100755
--- a/3.4.0/scala2.12-java11-ubuntu/entrypoint.sh
+++ b/3.4.0/scala2.12-java11-ubuntu/entrypoint.sh
@@ -15,23 +15,28 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
-
-# Check whether there is a passwd entry for the container UID
-myuid=$(id -u)
-mygid=$(id -g)
-# turn off -e for getent because it will return error code in anonymous uid case
-set +e
-uidentry=$(getent passwd $myuid)
-set -e
-
-# If there is no passwd entry for the container UID, attempt to create one
-if [ -z "$uidentry" ] ; then
-    if [ -w /etc/passwd ] ; then
-        echo "$myuid:x:$myuid:$mygid:${SPARK_USER_NAME:-anonymous uid}:$SPARK_HOME:/bin/false" >> /etc/passwd
-    else
-        echo "Container ENTRYPOINT failed to add passwd entry for anonymous UID"
-    fi
-fi
+attempt_setup_fake_passwd_entry() {
+    # Check whether there is a passwd entry for the container UID
+    local myuid; myuid="$(id -u)"
+    # If there is no passwd entry for the container UID, attempt to fake one
+    # You can also refer to the https://github.com/docker-library/official-images/pull/13089#issuecomment-1534706523
+    # It's to resolve OpenShift random UID case.
+    # See also: https://github.com/docker-library/postgres/pull/448
+    if ! getent passwd "$myuid" &> /dev/null; then
+        local wrapper
+        for wrapper in {/usr,}/lib{/*,}/libnss_wrapper.so; do
+            if [ -s "$wrapper" ]; then
+                NSS_WRAPPER_PASSWD="$(mktemp)"
+                NSS_WRAPPER_GROUP="$(mktemp)"
+                export LD_PRELOAD="$wrapper" NSS_WRAPPER_PASSWD NSS_WRAPPER_GROUP
+                local mygid; mygid="$(id -g)"
+                printf 'spark:x:%s:%s:${SPARK_USER_NAME:-anonymous uid}:%s:/bin/false\n' "$myuid" "$mygid" "$SPARK_HOME" > "$NSS_WRAPPER_PASSWD"
+                printf 'spark:x:%s:\n' "$mygid" > "$NSS_WRAPPER_GROUP"
+                break
+            fi
+        done
+    fi
+}

 if [ -z "$JAVA_HOME" ]; then
   JAVA_HOME=$(java -XshowSettings:properties -version 2>&1 > /dev/null | grep 'java.home' | awk '{print $3}')
@@ -85,6 +90,7 @@ case "$1" in
       --deploy-mode client
       "$@"
     )
+    attempt_setup_fake_passwd_entry
     # Execute the container CMD under tini for better hygiene
     exec $(switch_spark_if_root) /usr/bin/tini -s -- "${CMD[@]}"
     ;;
@@ -105,6 +111,7 @@ case "$1" in
       --resourceProfileId $SPARK_RESOURCE_PROFILE_ID
       --podName $SPARK_EXECUTOR_POD_NAME
     )
+    attempt_setup_fake_passwd_entry
     # Execute the container CMD under tini for better hygiene
     exec $(switch_spark_if_root) /usr/bin/tini -s -- "${CMD[@]}"
     ;;

diff --git a/Dockerfile.template b/Dockerfile.template
index d1188bc..fc67534 100644
--- a/Dockerfile.template
+++ b/Dockerfile.template
@@ -24,7 +24,7 @@ RUN groupadd --system --gid=${spark_uid} spark && \
 RUN set -ex; \
     apt-get update; \
     ln -s /lib /lib64; \
-    apt install -y gnupg2 wget bash tini libc6 libpam-modules krb5-user libnss3 procps net-tools gosu; \
+    apt install -y gnupg2 wget bash tini libc6 libpam-modules krb5-user libnss3 procps net-tools gosu libnss-wrapper; \
     mkdir -p /opt/spark; \
     mkdir /opt/spark/python; \
     mkdir -p /opt/spark/examples; \
@@ -33,7 +33,6 @@ RUN set -ex; \
     touch /opt/spark/RELEASE; \
     chown -R spark:spark /opt/spark; \
     echo "auth required pam_wheel.so use_uid" >> /etc/pam.d/su; \
-    chgrp root /etc/passwd && chmod ug+rw /etc/passwd; \
     rm -rf /var/cache/apt/*; \
     rm -rf /var/lib/apt/lists/*

diff --git a/entrypoint.sh.template b/entrypoint.sh.template
index 6def3f9..08fc925 100644
--- a/entrypoint.sh.template
+++ b/entrypoint.sh.template
@@ -15,23 +15,28 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
-
-# Check whether there is a passwd entry for the container UID
-myuid=$(id -u)
-mygid=$(id -g)
-# turn off -e for getent because it will return error code in anonymous uid case
-set +e
-uidentry=$(getent passwd $myuid)
-set -e
-
-# If there is no passwd entry for the container UID, attempt to create one
-if [ -z "$uidentry" ] ; then
-    if [ -w /etc/passwd ] ; then
-        echo "$myuid:x:$myuid:$mygid:${SPARK_USER_NAME:-anonymous uid}:$SPARK_HOME:/bin/false" >> /etc/passwd
-    else
-        echo "Container ENTRYPOINT failed to add passwd entry for anonymous UID"
-    fi
-fi
+attempt_setup_fake_passwd_entry() {
+    # Check whether there is a passwd entry for the container UID
+    local myuid; myuid="$(id -u)"
+    # If there is no passwd entry for the container UID, attempt to fake one
+    # You can also refer to the https://github.com/docker-library/official-images/pull/13089#issuecomment-1534706523
+    # It's to resolve OpenShift random UID case.
+    # See also: https://github.com/docker-library/postgres/pull/448
+    if ! getent passwd "$myuid" &> /dev/null; then
+        local wrapper
+        for wrapper in {/usr,}/lib{/*,}/libnss_wrapper.so; do
+            if [ -s "$wrapper" ]; then
+                NSS_WRAPPER_PASSWD="$(mktemp)"
+                NSS_WRAPPER_GROUP="$(mktemp)"
+                export LD_PRELOAD="$wrapper" NSS_WRAPPER_PASSWD NSS_WRAPPER_GROUP
+                local mygid; mygid="$(id -g)"
+                printf 'spark:x:%s:%s:${SPARK_USER_NAME:-anonymous uid}:%s:/bin/false\n' "$myuid" "$mygid" "$SPARK_HOME" > "$NSS_WRAPPER_PASSWD"
+                printf 'spark:x:%s:\n' "$mygid" > "$NSS_WRAPPER_GROUP"
+                break
+            fi
+        done
+    fi
+}

 if [ -z "$JAVA_HOME" ]; then
   JAVA_HOME=$(java -XshowSettings:properties -version 2>&1 > /dev/null | grep 'java.home' | awk '{print $3}')
@@ -85,6 +90,7 @@ case "$1" in
       --deploy-mode client
       "$@"
     )
+    attempt_setup_fake_passwd_entry
     # Execute the container CMD under tini for better hygiene
     exec $(switch_spark_if_root) /usr/bin/tini -s -- "${CMD[@]}"
     ;;
@@ -105,6 +111,7 @@ case "$1" in
       --resourceProfileId $SPARK_RESOURCE_PROFILE_ID
       --podName $SPARK_EXECUTOR_POD_NAME
     )
+    attempt_setup_fake_passwd_entry
    # Execute the container CMD under tini for better hygiene
     exec $(switch_spark_if_root) /usr/bin/tini -s -- "${CMD[@]}"
     ;;

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org