Copilot commented on code in PR #2576:
URL: https://github.com/apache/fluss/pull/2576#discussion_r2767558023


##########
website/docs/quickstart/lakehouse.md:
##########
@@ -72,37 +100,50 @@ services:
         datalake.paimon.warehouse: /tmp/paimon
     volumes:
       - shared-tmpfs:/tmp/paimon
+      - shared-tmpfs:/tmp/fluss
   zookeeper:
     restart: always
     image: zookeeper:3.9.2
-  #end
-  #begin Flink cluster
   jobmanager:
-    image: apache/fluss-quickstart-flink:1.20-$FLUSS_DOCKER_VERSION$
+    image: flink:1.20-scala_2.12-java17
     ports:
       - "8083:8081"
-    command: jobmanager
+    entrypoint: ["/bin/bash", "-c"]
+    command: >
+      "sed -i 's/exec $(drop_privs_cmd)//g' /docker-entrypoint.sh &&
+       cp /tmp/jars/*.jar /opt/flink/lib/ 2>/dev/null || true;
+       cp /tmp/opt/*.jar /opt/flink/opt/ 2>/dev/null || true;
+       /docker-entrypoint.sh jobmanager"

Review Comment:
   Removing the `drop_privs_cmd` call from `docker-entrypoint.sh` is a security 
risk: the Flink processes then run as root instead of the intended 
non-privileged user, bypassing the container's built-in privilege drop. 
Consider using the `user` directive in docker-compose, or another way of 
copying the jars that leaves the security-related script unmodified.



##########
website/docs/quickstart/lakehouse.md:
##########
@@ -32,12 +32,39 @@ mkdir fluss-quickstart-paimon
 cd fluss-quickstart-paimon
 ```
 
-2. Create a `docker-compose.yml` file with the following content:
+2. Create directories and download required jars:
+
+```shell
+mkdir -p lib opt
+
+# Flink connectors
+wget -O lib/flink-faker-0.5.3.jar https://github.com/knaufk/flink-faker/releases/download/v0.5.3/flink-faker-0.5.3.jar
+wget -O "lib/fluss-flink-1.20-$FLUSS_DOCKER_VERSION$.jar" "https://repo1.maven.org/maven2/org/apache/fluss/fluss-flink-1.20/$FLUSS_DOCKER_VERSION$/fluss-flink-1.20-$FLUSS_DOCKER_VERSION$.jar"
+wget -O lib/paimon-flink-1.20-1.3.1.jar "https://repo1.maven.org/maven2/org/apache/paimon/paimon-flink-1.20/1.2.0/paimon-flink-1.20-1.3.1.jar"

Review Comment:
   The download URL for paimon-flink connector contains mismatched version 
numbers. The path specifies version 1.2.0 but the filename specifies 1.3.1, 
which will result in a 404 error. The URL should consistently use version 1.3.1 
in both the path and filename.
   ```suggestion
   wget -O lib/paimon-flink-1.20-1.3.1.jar "https://repo1.maven.org/maven2/org/apache/paimon/paimon-flink-1.20/1.3.1/paimon-flink-1.20-1.3.1.jar"
   ```
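
A minimal sketch of one way to keep the path and the filename in step: build the URL from a single version value. The helper name `maven_jar_url` and the variable `PAIMON_VERSION` are hypothetical and are not part of the PR; the URL layout is the one already used in the suggestion above.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): build a Maven Central jar URL from a
# single version value, so the directory segment and the file name cannot drift.
maven_jar_url() {
  group_path=$1   # e.g. org/apache/paimon
  artifact=$2     # e.g. paimon-flink-1.20
  version=$3      # e.g. 1.3.1
  printf '%s\n' "https://repo1.maven.org/maven2/${group_path}/${artifact}/${version}/${artifact}-${version}.jar"
}

# Assumed variable name; the version is stated once and reused everywhere.
PAIMON_VERSION=1.3.1
maven_jar_url org/apache/paimon paimon-flink-1.20 "$PAIMON_VERSION"
```

The printed URL can then be passed to `wget -O "lib/paimon-flink-1.20-${PAIMON_VERSION}.jar"`, so a version bump touches only one line.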



##########
website/docs/quickstart/lakehouse.md:
##########
@@ -116,10 +157,6 @@ The Docker Compose environment consists of the following containers:
 - **Fluss Cluster:** a Fluss `CoordinatorServer`, a Fluss `TabletServer` and a `ZooKeeper` server.
 - **Flink Cluster**: a Flink `JobManager` and a Flink `TaskManager` container to execute queries.
 
-**Note:** The `apache/fluss-quickstart-flink` image is based on [flink:1.20.3-java17](https://hub.docker.com/layers/library/flink/1.20-java17/images/sha256:296c7c23fa40a9a3547771b08fc65e25f06bc4cfd3549eee243c99890778cafc) and
-includes the [fluss-flink](engine-flink/getting-started.md), [paimon-flink](https://paimon.apache.org/docs/1.3/flink/quick-start/) and
-[flink-connector-faker](https://flink-packages.org/packages/flink-faker) to simplify this guide.
-
 3. To start all containers, run:

Review Comment:
   Step numbering is incorrect. The previous step is labeled "3" for creating 
the docker-compose.yml file, but this step is also labeled "3". This should be 
step "4" to maintain correct sequential numbering.
   ```suggestion
   4. To start all containers, run:
   ```



##########
website/docs/quickstart/lakehouse.md:
##########
@@ -32,12 +32,39 @@ mkdir fluss-quickstart-paimon
 cd fluss-quickstart-paimon
 ```
 
-2. Create a `docker-compose.yml` file with the following content:
+2. Create directories and download required jars:
+
+```shell
+mkdir -p lib opt
+
+# Flink connectors
+wget -O lib/flink-faker-0.5.3.jar https://github.com/knaufk/flink-faker/releases/download/v0.5.3/flink-faker-0.5.3.jar
+wget -O "lib/fluss-flink-1.20-$FLUSS_DOCKER_VERSION$.jar" "https://repo1.maven.org/maven2/org/apache/fluss/fluss-flink-1.20/$FLUSS_DOCKER_VERSION$/fluss-flink-1.20-$FLUSS_DOCKER_VERSION$.jar"
+wget -O lib/paimon-flink-1.20-1.3.1.jar "https://repo1.maven.org/maven2/org/apache/paimon/paimon-flink-1.20/1.2.0/paimon-flink-1.20-1.3.1.jar"
+
+# Fluss lake plugin
+wget -O "lib/fluss-lake-paimon-$FLUSS_DOCKER_VERSION$.jar" "https://repo1.maven.org/maven2/org/apache/fluss/fluss-lake-paimon/$FLUSS_DOCKER_VERSION$/fluss-lake-paimon-$FLUSS_DOCKER_VERSION$.jar"
+
+# Paimon bundle jar
+wget -O "lib/paimon-bundle-1.3.1.jar" "https://repo.maven.apache.org/maven2/org/apache/paimon/paimon-bundle/1.3.1/paimon-bundle-1.3.1.jar"
 
+# Hadoop bundle jar
+wget -O lib/flink-shaded-hadoop-2-uber-2.8.3-10.0.jar https://repo.maven.apache.org/maven2/org/apache/flink/flink-shaded-hadoop-2-uber/2.8.3-10.0/flink-shaded-hadoop-2-uber-2.8.3-10.0.jar
+
+# Tiering service
+wget -O "opt/fluss-flink-tiering-$FLUSS_DOCKER_VERSION$.jar" "https://repo1.maven.org/maven2/org/apache/fluss/fluss-flink-tiering/$FLUSS_DOCKER_VERSION$/fluss-flink-tiering-$FLUSS_DOCKER_VERSION$.jar"

Review Comment:
   The quickstart instructions download multiple executable JARs via `wget` and 
then load them into Flink without any checksum or signature verification, which 
creates a supply-chain risk if the download source or network is compromised. 
An attacker who can tamper with these HTTPS downloads (e.g., via upstream repo 
compromise, DNS/CA abuse, or network MITM) could deliver a malicious JAR that 
Flink will execute with your cluster’s permissions. To harden this flow, 
document or enforce integrity verification (e.g., pinned checksums or 
signatures) for each downloaded JAR before using it in the cluster.
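
As an illustration of the pinned-checksum idea, here is a minimal sketch. It assumes GNU coreutils `sha256sum` is available (on macOS, `shasum -a 256` would be the nearest equivalent); the helper name `verify_sha256` and the placeholder `EXPECTED_SHA256` are hypothetical, and the expected digest would have to come from a trusted source, which this sketch does not supply.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): succeed only if FILE's SHA-256 digest
# equals the pinned EXPECTED value. Assumes GNU coreutils `sha256sum`.
verify_sha256() {
  file=$1
  expected=$2
  # sha256sum prints "<digest>  <file>"; keep only the digest field.
  actual=$(sha256sum "$file" 2>/dev/null | cut -d ' ' -f 1)
  # An empty digest means the file could not be read; treat that as a failure.
  [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Intended use after each download (not run here; EXPECTED_SHA256 is a
# placeholder for a digest obtained from a trusted source):
#   verify_sha256 lib/paimon-bundle-1.3.1.jar "$EXPECTED_SHA256" || {
#     echo "checksum mismatch, refusing to use jar" >&2
#     exit 1
#   }
```

Failing closed on a mismatch keeps a tampered or truncated jar out of `lib/` before Flink ever loads it.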



##########
website/docs/quickstart/lakehouse.md:
##########
@@ -72,37 +100,50 @@ services:
         datalake.paimon.warehouse: /tmp/paimon
     volumes:
       - shared-tmpfs:/tmp/paimon
+      - shared-tmpfs:/tmp/fluss
   zookeeper:
     restart: always
     image: zookeeper:3.9.2
-  #end
-  #begin Flink cluster
   jobmanager:
-    image: apache/fluss-quickstart-flink:1.20-$FLUSS_DOCKER_VERSION$
+    image: flink:1.20-scala_2.12-java17
     ports:
       - "8083:8081"
-    command: jobmanager
+    entrypoint: ["/bin/bash", "-c"]
+    command: >
+      "sed -i 's/exec $(drop_privs_cmd)//g' /docker-entrypoint.sh &&
+       cp /tmp/jars/*.jar /opt/flink/lib/ 2>/dev/null || true;
+       cp /tmp/opt/*.jar /opt/flink/opt/ 2>/dev/null || true;
+       /docker-entrypoint.sh jobmanager"
     environment:
       - |
         FLINK_PROPERTIES=
         jobmanager.rpc.address: jobmanager
     volumes:
       - shared-tmpfs:/tmp/paimon
+      - shared-tmpfs:/tmp/fluss
+      - ./lib:/tmp/jars
+      - ./opt:/tmp/opt
   taskmanager:
-    image: apache/fluss-quickstart-flink:1.20-$FLUSS_DOCKER_VERSION$
+    image: flink:1.20-scala_2.12-java17
     depends_on:
       - jobmanager
-    command: taskmanager
+    entrypoint: ["/bin/bash", "-c"]
+    command: >
+      "sed -i 's/exec $(drop_privs_cmd)//g' /docker-entrypoint.sh &&
+       cp /tmp/jars/*.jar /opt/flink/lib/ 2>/dev/null || true;
+       cp /tmp/opt/*.jar /opt/flink/opt/ 2>/dev/null || true;
+       /docker-entrypoint.sh taskmanager"

Review Comment:
   Removing the `drop_privs_cmd` call from `docker-entrypoint.sh` is a security 
risk: the Flink processes then run as root instead of the intended 
non-privileged user, bypassing the container's built-in privilege drop. 
Consider using the `user` directive in docker-compose, or another way of 
copying the jars that leaves the security-related script unmodified.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

