rickchengx opened a new pull request #4178:
URL: https://github.com/apache/zeppelin/pull/4178
### What is this PR for?
Currently, users can build a smaller Docker image of the Zeppelin interpreter by
modifying the Dockerfile in `scripts/docker/zeppelin-interpreter`. But I think
a better approach is to let users run a shell script (something like
`docker-image-tool.sh` in Apache Spark) that builds the image with only the
interpreters they specify.
Example usage:
```
Usage: ./Dockerfile_gen.sh [options] [command]
This script outputs a Dockerfile for building the zeppelin image that
contains some specific interpreters.

Options:
  -i interpreter        (Optional) A comma-separated list of interpreter
                        directory names (under /path/to/zeppelin/interpreter/)
                        that need to be added to the docker image.
                        By default, it will add the spark interpreter.
  -c conda yaml file    (Optional) Specify the conda yaml file that manages
                        python and R packages. By default, it will not install
                        python and R packages through conda.
  -v zeppelin version   (Optional) Specify the version of zeppelin. By
                        default, the version is "0.9.0".

Examples:
  - Output the Dockerfile for building the zeppelin image that contains
    the spark and python interpreters.
    ./Dockerfile_gen.sh -i spark,python
  - Output the Dockerfile for building the zeppelin image that contains
    the spark and python interpreters and specify the
    conda yaml file "python3.yaml"
    ./Dockerfile_gen.sh -i spark,python -c python3.yaml
```
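For illustration, the option handling and per-interpreter output described in the usage text could be sketched roughly as follows. This is not the script from this PR: the function names `parse_options` and `gen_interpreter_copies` are hypothetical, the defaults are taken from the usage text, and the `COPY` line format is assumed from the example Dockerfile in this description.

```shell
#!/bin/sh
# Hypothetical sketch only; the real Dockerfile_gen.sh may be structured differently.

# Parse -i/-c/-v, falling back to the defaults stated in the usage text.
parse_options() {
  interpreters="spark"   # default: only the spark interpreter
  conda_file=""          # default: no conda-managed packages
  version="0.9.0"        # default zeppelin version
  OPTIND=1
  while getopts "i:c:v:" opt; do
    case "$opt" in
      i) interpreters="$OPTARG" ;;
      c) conda_file="$OPTARG" ;;
      v) version="$OPTARG" ;;
      *) return 1 ;;
    esac
  done
}

# Print one COPY instruction per interpreter in a comma-separated list.
# ${ZEPPELIN_HOME} is emitted literally so that Docker expands it at build time.
gen_interpreter_copies() {
  printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r name; do
    [ -n "$name" ] || continue   # skip empty entries such as "spark,,python"
    printf '# copy interpreter %s\n' "$name"
    printf 'COPY --from=zeppelin-distribution /opt/zeppelin/interpreter/%s ${ZEPPELIN_HOME}/interpreter/%s\n' "$name" "$name"
  done
}
```

With these definitions, `parse_options -i spark,python` followed by `gen_interpreter_copies "$interpreters"` would print one comment and one `COPY` line for each of the two interpreters.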
Example generated Dockerfile:
```
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License."
FROM apache/zeppelin:0.9.0 AS zeppelin-distribution
FROM ubuntu:20.04
LABEL maintainer="Apache Software Foundation <[email protected]>"
ARG version="0.9.0"
ENV VERSION="${version}" \
ZEPPELIN_HOME="/opt/zeppelin"
RUN set -ex && \
apt-get -y update && \
DEBIAN_FRONTEND=noninteractive apt-get install -y openjdk-8-jre-headless wget tini && \
# Cleanup
rm -rf /var/lib/apt/lists/* && \
apt-get autoclean && \
apt-get clean
COPY --from=zeppelin-distribution /opt/zeppelin/bin ${ZEPPELIN_HOME}/bin
COPY log4j.properties ${ZEPPELIN_HOME}/conf/
COPY log4j_yarn_cluster.properties ${ZEPPELIN_HOME}/conf/
# Copy interpreter-shaded JAR, needed for all interpreters
COPY --from=zeppelin-distribution /opt/zeppelin/interpreter/zeppelin-interpreter-shaded-${VERSION}.jar ${ZEPPELIN_HOME}/interpreter/zeppelin-interpreter-shaded-${VERSION}.jar
# copy interpreter spark
COPY --from=zeppelin-distribution /opt/zeppelin/interpreter/spark ${ZEPPELIN_HOME}/interpreter/spark
RUN mkdir -p "${ZEPPELIN_HOME}/logs" "${ZEPPELIN_HOME}/run" "${ZEPPELIN_HOME}/local-repo" && \
# Allow process to edit /etc/passwd, to create a user entry for zeppelin
chgrp root /etc/passwd && chmod ug+rw /etc/passwd && \
# Give access to some specific folders
chmod -R 775 "${ZEPPELIN_HOME}/logs" "${ZEPPELIN_HOME}/run" "${ZEPPELIN_HOME}/local-repo"
USER 1000
ENTRYPOINT [ "/usr/bin/tini", "--" ]
WORKDIR ${ZEPPELIN_HOME}
```
### What type of PR is it?
[Improvement]
### Todos
* [ ] - Task
### What is the Jira issue?
* <https://issues.apache.org/jira/browse/ZEPPELIN-5461>
### How should this be tested?
* CI pass and manually tested
### Screenshots (if appropriate)
### Questions:
* Do the license files need an update? No
* Are there breaking changes for older versions? No
* Does this need documentation? No