PHILO-HE commented on code in PR #10793:
URL: https://github.com/apache/incubator-gluten/pull/10793#discussion_r2382766399
##########
docs/developers/NewToGluten.md:
##########
@@ -470,24 +341,23 @@ child allocators: 0
at
org.apache.spark.memory.SparkMemoryUtil$UnsafeItr.hasNext(SparkMemoryUtil.scala:246)
```
-## CPP code memory leak
+### CPP code memory leak
-Sometimes you cannot get the coredump symbols, if you debug memory leak, you
can write googletest to use valgrind to detect
+Sometimes you cannot get the coredump symbols, when debugging a memory leak.
You can write a GoogleTest to use valgrind for detection.
```bash
apt install valgrind
valgrind --leak-check=yes ./exec_backend_test
```
-
-# Run TPC-H and TPC-DS
+## Run TPC-H and TPC-DS
We supply `<gluten_home>/tools/gluten-it` to execute these queries
Refer to
[velox_backend.yml](https://github.com/apache/incubator-gluten/blob/main/.github/workflows/velox_backend.yml)
Review Comment:
Yes, the file name has changed. I've updated it.
##########
docs/developers/NewToGluten.md:
##########
@@ -4,158 +4,122 @@ title: New To Gluten
nav_order: 2
parent: Developer Overview
---
-Help users to debug and test with Gluten.
-# Environment
+# Guide for New Developers
-Gluten supports Ubuntu20.04, Ubuntu22.04, CentOS8, CentOS7 and MacOS.
+## Environment
-## JDK
+Gluten supports Ubuntu 20.04/22.04, CentOS 7/8, and MacOS.
-Currently, Gluten supports JDK 8 for Spark 3.2/3.3/3.4/3.5. For Spark 3.3 and
higher versions, Gluten
-supports JDK 11 and 17. Please note since Spark 4.0, JDK 8 will not be
supported. So we recommend Velox
-backend users to use higher JDK version now to ease the migration for
deploying Gluten with Spark-4.0
-in the future. And we may probably upgrade Arrow from 15.0.0 to some higher
version, which also requires
-JDK 11 is the minimum version.
+### JDK
-### JDK 8
+Currently, Gluten supports JDK 8 for Spark 3.2, 3.3, 3.4, and 3.5. For Spark
3.3 and later versions, Gluten
+also supports JDK 11 and 17.
-#### Environment Setting
+Note: Starting with Spark 4.0, the minimum required JDK version is 17.
-For root user, the environment variables file is `/etc/profile`, it will take
effect for all the users.
+We recommend using a higher JDK version now to ease migration when deploying
Gluten for Spark 4.0
+in the future. In addition, we may upgrade Arrow from 15.0.0 to a newer
release, which will require
+JDK 11 as the minimum version.
-For other user, you can set in `~/.bashrc`.
+By default, Gluten compiles packages using JDK 8. Enable maven profile by
`-Pjava-17` or `-Pjava-11` to use the corresponding JDK version, and ensure
that the JDK version is available in your environment.
-#### Guide for Ubuntu
-
-The default JDK version in ubuntu is java11, we need to set to java8.
-
-```bash
-apt install openjdk-8-jdk
-update-alternatives --config java
-java -version
-```
-
-`--config java` to config java executable path, `javac` and other commands can
also use this command to config.
-For some other uses, we suggest to set `JAVA_HOME`.
-
-```bash
-export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/
-JRE_HOME=$JAVA_HOME/jre
-export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
-# pay attention to $PATH double quote
-export PATH="$PATH:$JAVA_HOME/bin"
-```
-
-> Must set PATH with double quote in ubuntu.
-
-### JDK 11/17
-
-By default, Gluten compiles package using JDK8. Enable maven profile by
`-Pjava-17` to use JDK17 or `-Pjava-11` to use JDK 11, and please make sure
your JAVA_HOME is set correctly.
-
-Apache Spark and Arrow requires setting java args
`-Dio.netty.tryReflectionSetAccessible=true`, see
[SPARK-29924](https://issues.apache.org/jira/browse/SPARK-29924) and
[ARROW-6206](https://issues.apache.org/jira/browse/ARROW-6206).
-So please add following configs in `spark-defaults.conf`:
+If JDK 11 or a higher version is used, Spark and Arrow require setting the
java args `-Dio.netty.tryReflectionSetAccessible=true`, see
[SPARK-29924](https://issues.apache.org/jira/browse/SPARK-29924) and
[ARROW-6206](https://issues.apache.org/jira/browse/ARROW-6206).
+So add the following configs in `spark-defaults.conf`:
```
spark.driver.extraJavaOptions=-Dio.netty.tryReflectionSetAccessible=true
spark.executor.extraJavaOptions=-Dio.netty.tryReflectionSetAccessible=true
```
-## Maven 3.6.3 or above
+### Maven 3.6.3 or above
-[Maven Download Page](https://maven.apache.org/docs/history.html)
-And then set the environment setting.
+### GCC 11 or above
-## GCC 11 or above
+## Development
-# Compile Gluten using debug mode
+To debug Java/Scala code, follow the steps in
[build-gluten-with-velox-backend](../get-started/Velox.md#build-gluten-with-velox-backend).
-If you want to just debug java/scala code, there is no need to compile cpp
code with debug mode.
-You can just refer to
[build-gluten-with-velox-backend](../get-started/Velox.md#build-gluten-with-velox-backend).
-
-If you need to debug cpp code, please compile the backend code and gluten cpp
code with debug mode.
+To debug C++ code, compile the backend code and gluten C++ code in debug mode.
```bash
## compile Velox backend with benchmark and tests to debug
gluten_home/dev/builddeps-veloxbe.sh --build_tests=ON --build_benchmarks=ON
--build_type=Debug
```
-If you need to debug the tests in <gluten>/gluten-ut, You need to compile java
code with `-P spark-ut`.
+Note: To debug the tests in <gluten>/gluten-ut, you must compile java code
with `-Pspark-ut`.
-# Java/scala code development with Intellij
+### Java/scala code development
-## Linux IntelliJ local debug
+#### Linux IntelliJ local debug
Install the Linux IntelliJ version, and debug code locally.
- Ask your linux maintainer to install the desktop, and then restart the
server.
-- If you use Moba-XTerm to connect linux server, you don't need to install x11
server, If not (e.g. putty), please follow this guide:
+- If you use Moba-XTerm to connect linux server, you don't need to install x11
server, If not (e.g. putty), follow this guide:
[X11 Forwarding: Setup Instructions for Linux and
Mac](https://www.businessnewsdaily.com/11035-how-to-use-x11-forwarding.html)
- Download [IntelliJ Linux community
version](https://www.jetbrains.com/idea/download/?fromIDE=#section=linux) to
Linux server
- Start Idea, `bash <idea_dir>/idea.sh`
-## Set up Gluten project
+#### Set up Gluten project
- Make sure you have compiled Gluten.
- Load the Gluten by File->Open, select <gluten_home/pom.xml>.
-- Activate your profiles such as <backends-velox>, and Reload Maven Project,
you will find all your need modules have been activated.
-- Create breakpoint and debug as you wish, maybe you can try `CTRL+N` to find
`TestOperator` to start your test.
+- Activate your profiles such as `<backends-velox>`, then **Reload Maven
Project** to activate all the needed modules.
+- Create breakpoints and debug as you wish. You can use `CTRL+N` to locate a
test class to start your test.
-## Java/Scala code style
+#### Java/Scala code style
IntelliJ supports importing settings for Java/Scala code style. You can import
[intellij-codestyle.xml](../../dev/intellij-codestyle.xml) to your IDE.
See [IntelliJ
guide](https://www.jetbrains.com/help/idea/configuring-code-style.html#import-code-style).
-To generate a fix for Java/Scala code style, you can run one or more of the
below commands according to the code modules involved in your PR.
+To format Java/Scala code using the Spotless plugin, run the following command:
-For Velox backend:
-```
-mvn spotless:apply -Pbackends-velox -Pceleborn -Puniffle -Pspark-3.2
-Pspark-ut -DskipTests
-mvn spotless:apply -Pbackends-velox -Pceleborn -Puniffle -Pspark-3.3
-Pspark-ut -DskipTests
```
-For Clickhouse backend:
-```
-mvn spotless:apply -Pbackends-clickhouse -Pspark-3.2 -Pspark-ut -DskipTests
-mvn spotless:apply -Pbackends-clickhouse -Pspark-3.3 -Pspark-ut -DskipTests
+./dev/format-scala-code.sh
```
-# CPP code development with Visual Studio Code
+### C++ code development
+
+This guide is for remote debugging by connecting to the remote Linux server
using `SSH`.
-This guide is for remote debug. We will connect the remote linux server by
`SSH`.
Download and install [Visual Studio
Code](https://code.visualstudio.com/Download).
Key components found on the left side bar are:
- Explorer (Project structure)
- Search
- Run and Debug
- Extensions (Install the C/C++ Extension Pack, Remote Development, and
GitLens. C++ Test Mate is also suggested.)
-- Remote Explorer (Connect linux server by ssh command, click `+`, then input
`ssh [email protected]`)
+- Remote Explorer (Connect linux server by ssh command, click **+**, then
input `ssh [email protected]`)
Review Comment:
I've updated it to `ssh USERNAME@REMOTE_SERVER_IP_ADDRESS`. Thanks.
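For illustration, here is a minimal sketch of how the placeholder form could be filled in. The account name `dev`, the address `192.0.2.10` (a documentation-only address), and the host alias `gluten-dev` are all assumptions, not values from the PR. The second part prints an `~/.ssh/config` entry that should let the Remote Explorer list the server by alias.

```shell
#!/usr/bin/env bash
# Placeholder values: both are assumptions for illustration only.
USERNAME="dev"                         # hypothetical account name
REMOTE_SERVER_IP_ADDRESS="192.0.2.10"  # documentation-only address (RFC 5737)

# Command to enter in the Remote Explorer prompt after clicking "+".
ssh_command="ssh ${USERNAME}@${REMOTE_SERVER_IP_ADDRESS}"
echo "${ssh_command}"

# Matching ~/.ssh/config entry, printed here rather than written to disk.
# "gluten-dev" is a hypothetical host alias.
ssh_config_entry="$(printf 'Host gluten-dev\n    HostName %s\n    User %s\n' \
  "${REMOTE_SERVER_IP_ADDRESS}" "${USERNAME}")"
echo "${ssh_config_entry}"
```

Keeping the doc text in placeholder form avoids publishing a real internal address; readers substitute their own values as above.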
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]