steveburnett commented on code in PR #10793:
URL: https://github.com/apache/incubator-gluten/pull/10793#discussion_r2376203326
##########
docs/developers/NewToGluten.md:
##########
@@ -6,52 +6,21 @@ parent: Developer Overview
 ---
 Help users to debug and test with Gluten.
-# Environment
+# Guide for New Developers
-Gluten supports Ubuntu20.04, Ubuntu22.04, CentOS8, CentOS7 and MacOS.
+## Environment
-## JDK
+Gluten supports Ubuntu 20.04/22.04, CentOS 7/8 and MacOS.
-Currently, Gluten supports JDK 8 for Spark 3.2/3.3/3.4/3.5. For Spark 3.3 and higher versions, Gluten
-supports JDK 11 and 17. Please note since Spark 4.0, JDK 8 will not be supported. So we recommend Velox
-backend users to use higher JDK version now to ease the migration for deploying Gluten with Spark-4.0
-in the future. And we may probably upgrade Arrow from 15.0.0 to some higher version, which also requires
-JDK 11 is the minimum version.
+### JDK
-### JDK 8
+Currently, Gluten supports JDK 8 for Spark 3.2/3.3/3.4/3.5. For Spark 3.3 and later versions, Gluten

Review Comment:
```suggestion
Currently, Gluten supports JDK 8 for Spark 3.2, 3.3, 3.4, and 3.5. For Spark 3.3 and later versions, Gluten
```

##########
docs/developers/NewToGluten.md:
##########
@@ -6,52 +6,21 @@ parent: Developer Overview
 ---
 Help users to debug and test with Gluten.
-# Environment
+# Guide for New Developers
-Gluten supports Ubuntu20.04, Ubuntu22.04, CentOS8, CentOS7 and MacOS.
+## Environment
-## JDK
+Gluten supports Ubuntu 20.04/22.04, CentOS 7/8 and MacOS.
-Currently, Gluten supports JDK 8 for Spark 3.2/3.3/3.4/3.5. For Spark 3.3 and higher versions, Gluten
-supports JDK 11 and 17. Please note since Spark 4.0, JDK 8 will not be supported. So we recommend Velox
-backend users to use higher JDK version now to ease the migration for deploying Gluten with Spark-4.0
-in the future. And we may probably upgrade Arrow from 15.0.0 to some higher version, which also requires
-JDK 11 is the minimum version.
+### JDK
-### JDK 8
+Currently, Gluten supports JDK 8 for Spark 3.2/3.3/3.4/3.5. For Spark 3.3 and later versions, Gluten
+also supports JDK 11 and 17.
Please note that starting with Spark 4.0, JDK 8 will no longer be supported.

Review Comment:
```suggestion
also supports JDK 11 and 17. Note: Starting with Spark 4.0, JDK 8 will no longer be supported.
```

##########
docs/developers/NewToGluten.md:
##########
@@ -6,52 +6,21 @@ parent: Developer Overview
 ---
 Help users to debug and test with Gluten.
-# Environment
+# Guide for New Developers
-Gluten supports Ubuntu20.04, Ubuntu22.04, CentOS8, CentOS7 and MacOS.
+## Environment
-## JDK
+Gluten supports Ubuntu 20.04/22.04, CentOS 7/8 and MacOS.

Review Comment:
```suggestion
Gluten supports Ubuntu 20.04/22.04, CentOS 7/8, and MacOS.
```

##########
docs/developers/NewToGluten.md:
##########
@@ -6,52 +6,21 @@ parent: Developer Overview
 ---
 Help users to debug and test with Gluten.
-# Environment
+# Guide for New Developers
-Gluten supports Ubuntu20.04, Ubuntu22.04, CentOS8, CentOS7 and MacOS.
+## Environment
-## JDK
+Gluten supports Ubuntu 20.04/22.04, CentOS 7/8 and MacOS.
-Currently, Gluten supports JDK 8 for Spark 3.2/3.3/3.4/3.5. For Spark 3.3 and higher versions, Gluten
-supports JDK 11 and 17. Please note since Spark 4.0, JDK 8 will not be supported. So we recommend Velox
-backend users to use higher JDK version now to ease the migration for deploying Gluten with Spark-4.0
-in the future. And we may probably upgrade Arrow from 15.0.0 to some higher version, which also requires
-JDK 11 is the minimum version.
+### JDK
-### JDK 8
+Currently, Gluten supports JDK 8 for Spark 3.2/3.3/3.4/3.5. For Spark 3.3 and later versions, Gluten
+also supports JDK 11 and 17. Please note that starting with Spark 4.0, JDK 8 will no longer be supported.
+So we recommend using a higher JDK version now to ease migration when deploying Gluten with Spark-4.0

Review Comment:
```suggestion
We recommend using a higher JDK version now to ease migration when deploying Gluten with Spark-4.0
```

##########
docs/developers/NewToGluten.md:
##########
@@ -6,52 +6,21 @@ parent: Developer Overview
 ---
 Help users to debug and test with Gluten.
-# Environment
+# Guide for New Developers
-Gluten supports Ubuntu20.04, Ubuntu22.04, CentOS8, CentOS7 and MacOS.
+## Environment
-## JDK
+Gluten supports Ubuntu 20.04/22.04, CentOS 7/8 and MacOS.
-Currently, Gluten supports JDK 8 for Spark 3.2/3.3/3.4/3.5. For Spark 3.3 and higher versions, Gluten
-supports JDK 11 and 17. Please note since Spark 4.0, JDK 8 will not be supported. So we recommend Velox
-backend users to use higher JDK version now to ease the migration for deploying Gluten with Spark-4.0
-in the future. And we may probably upgrade Arrow from 15.0.0 to some higher version, which also requires
-JDK 11 is the minimum version.
+### JDK
-### JDK 8
+Currently, Gluten supports JDK 8 for Spark 3.2/3.3/3.4/3.5. For Spark 3.3 and later versions, Gluten
+also supports JDK 11 and 17. Please note that starting with Spark 4.0, JDK 8 will no longer be supported.
+So we recommend using a higher JDK version now to ease migration when deploying Gluten with Spark-4.0
+in the future. In addition, we may upgrade Arrow from 15.0.0 to a newer release, which also requires
+JDK 11 as the minimum version.
-#### Environment Setting
-
-For root user, the environment variables file is `/etc/profile`, it will take effect for all the users.
-
-For other user, you can set in `~/.bashrc`.
-
-#### Guide for Ubuntu
-
-The default JDK version in ubuntu is java11, we need to set to java8.
-
-```bash
-apt install openjdk-8-jdk
-update-alternatives --config java
-java -version
-```
-
-`--config java` to config java executable path, `javac` and other commands can also use this command to config.
-For some other uses, we suggest to set `JAVA_HOME`.
-
-```bash
-export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/
-JRE_HOME=$JAVA_HOME/jre
-export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
-# pay attention to $PATH double quote
-export PATH="$PATH:$JAVA_HOME/bin"
-```
-
-> Must set PATH with double quote in ubuntu.
-
-### JDK 11/17
-
-By default, Gluten compiles package using JDK8. Enable maven profile by `-Pjava-17` to use JDK17 or `-Pjava-11` to use JDK 11, and please make sure your JAVA_HOME is set correctly.
+By default, Gluten compiles package using JDK 8. Enable maven profile by `-Pjava-17` to use JDK 17 or `-Pjava-11` to use JDK 11, and please make sure your JAVA_HOME is set correctly.

Review Comment:
```suggestion
By default, Gluten compiles packages using JDK 8. Enable maven profile by `-Pjava-17` to use JDK 17 or `-Pjava-11` to use JDK 11, and please make sure your JAVA_HOME is set correctly.
```

##########
docs/developers/NewToGluten.md:
##########
@@ -61,30 +30,30 @@ spark.driver.extraJavaOptions=-Dio.netty.tryReflectionSetAccessible=true
 spark.executor.extraJavaOptions=-Dio.netty.tryReflectionSetAccessible=true
 ```
-## Maven 3.6.3 or above
+### Maven 3.6.3 or above
 [Maven Download Page](https://maven.apache.org/docs/history.html)
-And then set the environment setting.
-## GCC 11 or above
+### GCC 11 or above
-# Compile Gluten using debug mode
+## Compile Gluten using debug mode
-If you want to just debug java/scala code, there is no need to compile cpp code with debug mode.
+If you only need to debug Java/Scala code, there is no need to compile the C++ code in debug mode.

Review Comment:
```suggestion
To debug Java/Scala code, follow the steps in [build-gluten-with-velox-backend](../get-started/Velox.md#build-gluten-with-velox-backend).
```

##########
docs/developers/NewToGluten.md:
##########
@@ -61,30 +30,30 @@ spark.driver.extraJavaOptions=-Dio.netty.tryReflectionSetAccessible=true
 spark.executor.extraJavaOptions=-Dio.netty.tryReflectionSetAccessible=true
 ```
-## Maven 3.6.3 or above
+### Maven 3.6.3 or above
 [Maven Download Page](https://maven.apache.org/docs/history.html)
-And then set the environment setting.
-## GCC 11 or above
+### GCC 11 or above
-# Compile Gluten using debug mode
+## Compile Gluten using debug mode
-If you want to just debug java/scala code, there is no need to compile cpp code with debug mode.
+If you only need to debug Java/Scala code, there is no need to compile the C++ code in debug mode.
 You can just refer to [build-gluten-with-velox-backend](../get-started/Velox.md#build-gluten-with-velox-backend).
-If you need to debug cpp code, please compile the backend code and gluten cpp code with debug mode.
+For debugging C++ code, please compile the backend code and gluten C++ code in debug mode.

Review Comment:
```suggestion
To debug C++ code, please compile the backend code and gluten C++ code in debug mode.
```

##########
docs/developers/NewToGluten.md:
##########
@@ -6,52 +6,21 @@ parent: Developer Overview
 ---
 Help users to debug and test with Gluten.
-# Environment
+# Guide for New Developers
-Gluten supports Ubuntu20.04, Ubuntu22.04, CentOS8, CentOS7 and MacOS.
+## Environment
-## JDK
+Gluten supports Ubuntu 20.04/22.04, CentOS 7/8 and MacOS.
-Currently, Gluten supports JDK 8 for Spark 3.2/3.3/3.4/3.5. For Spark 3.3 and higher versions, Gluten
-supports JDK 11 and 17. Please note since Spark 4.0, JDK 8 will not be supported. So we recommend Velox
-backend users to use higher JDK version now to ease the migration for deploying Gluten with Spark-4.0
-in the future. And we may probably upgrade Arrow from 15.0.0 to some higher version, which also requires
-JDK 11 is the minimum version.
+### JDK
-### JDK 8
+Currently, Gluten supports JDK 8 for Spark 3.2/3.3/3.4/3.5.
For Spark 3.3 and later versions, Gluten
+also supports JDK 11 and 17. Please note that starting with Spark 4.0, JDK 8 will no longer be supported.
+So we recommend using a higher JDK version now to ease migration when deploying Gluten with Spark-4.0
+in the future. In addition, we may upgrade Arrow from 15.0.0 to a newer release, which also requires

Review Comment:
```suggestion
in the future. In addition, we may upgrade Arrow from 15.0.0 to a newer release, which will require
```

##########
docs/developers/NewToGluten.md:
##########
@@ -95,32 +64,25 @@ Install the Linux IntelliJ version, and debug code locally.
 - Download [IntelliJ Linux community version](https://www.jetbrains.com/idea/download/?fromIDE=#section=linux) to Linux server
 - Start Idea, `bash <idea_dir>/idea.sh`
-## Set up Gluten project
+#### Set up Gluten project
 - Make sure you have compiled Gluten.
 - Load the Gluten by File->Open, select <gluten_home/pom.xml>.
 - Activate your profiles such as <backends-velox>, and Reload Maven Project, you will find all your need modules have been activated.

Review Comment:
```suggestion
- Activate your profiles such as `<backends-velox>`, then **Reload Maven Project** to activate all the needed modules.
```
When I used View File in GitHub, `<backends-velox>` was missing. See screenshot. I marked it as code to make it visible.
<img width="899" height="36" alt="Screenshot 2025-09-24 at 11 45 31 AM" src="https://github.com/user-attachments/assets/1e9e06eb-9d4d-4c7e-ad1f-52b804885e4b" />

##########
docs/developers/NewToGluten.md:
##########
@@ -61,30 +30,30 @@ spark.driver.extraJavaOptions=-Dio.netty.tryReflectionSetAccessible=true
 spark.executor.extraJavaOptions=-Dio.netty.tryReflectionSetAccessible=true
 ```
-## Maven 3.6.3 or above
+### Maven 3.6.3 or above
 [Maven Download Page](https://maven.apache.org/docs/history.html)
-And then set the environment setting.
-## GCC 11 or above
+### GCC 11 or above
-# Compile Gluten using debug mode
+## Compile Gluten using debug mode
-If you want to just debug java/scala code, there is no need to compile cpp code with debug mode.
+If you only need to debug Java/Scala code, there is no need to compile the C++ code in debug mode.
 You can just refer to [build-gluten-with-velox-backend](../get-started/Velox.md#build-gluten-with-velox-backend).
-If you need to debug cpp code, please compile the backend code and gluten cpp code with debug mode.
+For debugging C++ code, please compile the backend code and gluten C++ code in debug mode.
 ```bash
 ## compile Velox backend with benchmark and tests to debug
 gluten_home/dev/builddeps-veloxbe.sh --build_tests=ON --build_benchmarks=ON --build_type=Debug
 ```
-If you need to debug the tests in <gluten>/gluten-ut, You need to compile java code with `-P spark-ut`.
+If you need to debug the tests in <gluten>/gluten-ut, You need to compile java code with `-Pspark-ut`.

Review Comment:
```suggestion
Note: To debug the tests in <gluten>/gluten-ut, you must compile java code with `-Pspark-ut`.
```

##########
docs/developers/NewToGluten.md:
##########
@@ -95,32 +64,25 @@ Install the Linux IntelliJ version, and debug code locally.
 - Download [IntelliJ Linux community version](https://www.jetbrains.com/idea/download/?fromIDE=#section=linux) to Linux server
 - Start Idea, `bash <idea_dir>/idea.sh`
-## Set up Gluten project
+#### Set up Gluten project
 - Make sure you have compiled Gluten.
 - Load the Gluten by File->Open, select <gluten_home/pom.xml>.
 - Activate your profiles such as <backends-velox>, and Reload Maven Project, you will find all your need modules have been activated.
 - Create breakpoint and debug as you wish, maybe you can try `CTRL+N` to find `TestOperator` to start your test.
-## Java/Scala code style
+#### Java/Scala code style
 IntelliJ supports importing settings for Java/Scala code style.
You can import [intellij-codestyle.xml](../../dev/intellij-codestyle.xml) to your IDE.
 See [IntelliJ guide](https://www.jetbrains.com/help/idea/configuring-code-style.html#import-code-style).
-To generate a fix for Java/Scala code style, you can run one or more of the below commands according to the code modules involved in your PR.
+To format Java/Scala code using the Spotless plugin, run the following command:
-For Velox backend:
-```
-mvn spotless:apply -Pbackends-velox -Pceleborn -Puniffle -Pspark-3.2 -Pspark-ut -DskipTests
-mvn spotless:apply -Pbackends-velox -Pceleborn -Puniffle -Pspark-3.3 -Pspark-ut -DskipTests
 ```
-For Clickhouse backend:
-```
-mvn spotless:apply -Pbackends-clickhouse -Pspark-3.2 -Pspark-ut -DskipTests
-mvn spotless:apply -Pbackends-clickhouse -Pspark-3.3 -Pspark-ut -DskipTests
+./dev/format-scala-code.sh
 ```
-# CPP code development with Visual Studio Code
+### CPP code development

Review Comment:
```suggestion
### C++ code development
```
I think this is better. If you disagree, please let me know!

##########
docs/developers/NewToGluten.md:
##########
@@ -61,30 +30,30 @@ spark.driver.extraJavaOptions=-Dio.netty.tryReflectionSetAccessible=true
 spark.executor.extraJavaOptions=-Dio.netty.tryReflectionSetAccessible=true
 ```
-## Maven 3.6.3 or above
+### Maven 3.6.3 or above
 [Maven Download Page](https://maven.apache.org/docs/history.html)
-And then set the environment setting.
-## GCC 11 or above
+### GCC 11 or above
-# Compile Gluten using debug mode
+## Compile Gluten using debug mode
-If you want to just debug java/scala code, there is no need to compile cpp code with debug mode.
+If you only need to debug Java/Scala code, there is no need to compile the C++ code in debug mode.
 You can just refer to [build-gluten-with-velox-backend](../get-started/Velox.md#build-gluten-with-velox-backend).

Review Comment:
```suggestion
```

##########
docs/developers/NewToGluten.md:
##########
@@ -136,23 +98,23 @@ Key components found on the left side bar are:
 Input your password in the above pop-up window, it will take a few minutes to install linux vscode server in remote machine folder `~/.vscode-server`

Review Comment:
```suggestion
Input your password in the above pop-up window. It will take a few minutes to install the Linux VSCode server in remote machine folder `~/.vscode-server`.
```

##########
docs/developers/NewToGluten.md:
##########
@@ -95,32 +64,25 @@ Install the Linux IntelliJ version, and debug code locally.
 - Download [IntelliJ Linux community version](https://www.jetbrains.com/idea/download/?fromIDE=#section=linux) to Linux server
 - Start Idea, `bash <idea_dir>/idea.sh`
-## Set up Gluten project
+#### Set up Gluten project
 - Make sure you have compiled Gluten.
 - Load the Gluten by File->Open, select <gluten_home/pom.xml>.
 - Activate your profiles such as <backends-velox>, and Reload Maven Project, you will find all your need modules have been activated.
 - Create breakpoint and debug as you wish, maybe you can try `CTRL+N` to find `TestOperator` to start your test.
-## Java/Scala code style
+#### Java/Scala code style
 IntelliJ supports importing settings for Java/Scala code style. You can import [intellij-codestyle.xml](../../dev/intellij-codestyle.xml) to your IDE.
 See [IntelliJ guide](https://www.jetbrains.com/help/idea/configuring-code-style.html#import-code-style).
-To generate a fix for Java/Scala code style, you can run one or more of the below commands according to the code modules involved in your PR.
+To format Java/Scala code using the Spotless plugin, run the following command:
-For Velox backend:
-```
-mvn spotless:apply -Pbackends-velox -Pceleborn -Puniffle -Pspark-3.2 -Pspark-ut -DskipTests
-mvn spotless:apply -Pbackends-velox -Pceleborn -Puniffle -Pspark-3.3 -Pspark-ut -DskipTests
 ```
-For Clickhouse backend:
-```
-mvn spotless:apply -Pbackends-clickhouse -Pspark-3.2 -Pspark-ut -DskipTests
-mvn spotless:apply -Pbackends-clickhouse -Pspark-3.3 -Pspark-ut -DskipTests
+./dev/format-scala-code.sh
 ```
-# CPP code development with Visual Studio Code
+### CPP code development
 This guide is for remote debug. We will connect the remote linux server by `SSH`.

Review Comment:
```suggestion
This guide is for remote debugging by connecting to the remote Linux server using `SSH`.
```

##########
docs/developers/NewToGluten.md:
##########
@@ -95,32 +64,25 @@ Install the Linux IntelliJ version, and debug code locally.
 - Download [IntelliJ Linux community version](https://www.jetbrains.com/idea/download/?fromIDE=#section=linux) to Linux server
 - Start Idea, `bash <idea_dir>/idea.sh`
-## Set up Gluten project
+#### Set up Gluten project
 - Make sure you have compiled Gluten.
 - Load the Gluten by File->Open, select <gluten_home/pom.xml>.
 - Activate your profiles such as <backends-velox>, and Reload Maven Project, you will find all your need modules have been activated.
 - Create breakpoint and debug as you wish, maybe you can try `CTRL+N` to find `TestOperator` to start your test.
-## Java/Scala code style
+#### Java/Scala code style
 IntelliJ supports importing settings for Java/Scala code style. You can import [intellij-codestyle.xml](../../dev/intellij-codestyle.xml) to your IDE.
 See [IntelliJ guide](https://www.jetbrains.com/help/idea/configuring-code-style.html#import-code-style).
-To generate a fix for Java/Scala code style, you can run one or more of the below commands according to the code modules involved in your PR.
+To format Java/Scala code using the Spotless plugin, run the following command:
-For Velox backend:
-```
-mvn spotless:apply -Pbackends-velox -Pceleborn -Puniffle -Pspark-3.2 -Pspark-ut -DskipTests
-mvn spotless:apply -Pbackends-velox -Pceleborn -Puniffle -Pspark-3.3 -Pspark-ut -DskipTests
 ```
-For Clickhouse backend:
-```
-mvn spotless:apply -Pbackends-clickhouse -Pspark-3.2 -Pspark-ut -DskipTests
-mvn spotless:apply -Pbackends-clickhouse -Pspark-3.3 -Pspark-ut -DskipTests
+./dev/format-scala-code.sh
 ```
-# CPP code development with Visual Studio Code
+### CPP code development
 This guide is for remote debug. We will connect the remote linux server by `SSH`.
 Download and install [Visual Studio Code](https://code.visualstudio.com/Download).

Review Comment:
```suggestion
Download and install [Visual Studio Code](https://code.visualstudio.com/Download).
```

##########
docs/developers/NewToGluten.md:
##########
@@ -136,23 +98,23 @@ Key components found on the left side bar are:
 Input your password in the above pop-up window, it will take a few minutes to install linux vscode server in remote machine folder `~/.vscode-server`
 If download failed, delete this folder and try again.
-## Usage
+Please note if vscode is upgraded, we need to download linux server again. Recommend switching update mode to off. Search `update` in Manage->Settings to turn off update mode.

Review Comment:
```suggestion
Note: If vscode is upgraded, you must download the linux server again. We recommend switching the update mode to `off`. Search `update` in Manage->Settings to turn off update mode.
```

##########
docs/developers/NewToGluten.md:
##########
@@ -136,23 +98,23 @@ Key components found on the left side bar are:
 Input your password in the above pop-up window, it will take a few minutes to install linux vscode server in remote machine folder `~/.vscode-server`
 If download failed, delete this folder and try again.
-## Usage
+Please note if vscode is upgraded, we need to download linux server again. Recommend switching update mode to off. Search `update` in Manage->Settings to turn off update mode.
-### Set up project
+#### Set up project
 - File->Open Folder // select the Gluten folder
 - After the project loads, you will be prompted to "Select CMakeLists.txt". Select the

Review Comment:
```suggestion
- After the project loads, you will be prompted to **Select CMakeLists.txt**. Select the
```

##########
docs/developers/NewToGluten.md:
##########
@@ -136,23 +98,23 @@ Key components found on the left side bar are:
 Input your password in the above pop-up window, it will take a few minutes to install linux vscode server in remote machine folder `~/.vscode-server`
 If download failed, delete this folder and try again.
-## Usage
+Please note if vscode is upgraded, we need to download linux server again. Recommend switching update mode to off. Search `update` in Manage->Settings to turn off update mode.
-### Set up project
+#### Set up project
 - File->Open Folder // select the Gluten folder
 - After the project loads, you will be prompted to "Select CMakeLists.txt". Select the
   `${workspaceFolder}/cpp/CMakeLists.txt` file.
 - Next, you will be prompted to "Select a Kit" for the Gluten project. Select GCC 11 or above.

Review Comment:
```suggestion
- Next, you will be prompted to **Select a Kit** for the Gluten project. Select **GCC 11** or above.
```

##########
docs/developers/NewToGluten.md:
##########
@@ -136,23 +98,23 @@ Key components found on the left side bar are:
 Input your password in the above pop-up window, it will take a few minutes to install linux vscode server in remote machine folder `~/.vscode-server`
 If download failed, delete this folder and try again.
-## Usage
+Please note if vscode is upgraded, we need to download linux server again. Recommend switching update mode to off. Search `update` in Manage->Settings to turn off update mode.
-### Set up project
+#### Set up project
 - File->Open Folder // select the Gluten folder
 - After the project loads, you will be prompted to "Select CMakeLists.txt". Select the
   `${workspaceFolder}/cpp/CMakeLists.txt` file.
 - Next, you will be prompted to "Select a Kit" for the Gluten project. Select GCC 11 or above.
-### Settings
+#### Settings
 VSCode supports 2 ways to set user setting.
 - Manage->Command Palette (Open `settings.json`, search by `Preferences: Open Settings (JSON)`)
 - Manage->Settings (Common setting)
-### Build using VSCode
+#### Build using VSCode
 VSCode will try to compile using debug mode in <gluten_home>/build. We need to compile Velox debug mode before
 compiling Gluten. If you have previously compiled Velox in release mode, use the command below to compile in debug mode.

Review Comment:
```suggestion
compiling Gluten.
Note: If you have previously compiled Velox in release mode, use the command below to compile in debug mode.
```
Making this a Note because it doesn't apply to everyone. Adding a line to left-justify the Note so it's more visible.

##########
docs/developers/NewToGluten.md:
##########
@@ -136,23 +98,23 @@ Key components found on the left side bar are:
 Input your password in the above pop-up window, it will take a few minutes to install linux vscode server in remote machine folder `~/.vscode-server`
 If download failed, delete this folder and try again.
-## Usage
+Please note if vscode is upgraded, we need to download linux server again. Recommend switching update mode to off. Search `update` in Manage->Settings to turn off update mode.
-### Set up project
+#### Set up project
 - File->Open Folder // select the Gluten folder

Review Comment:
```suggestion
- Select **File**->**Open Folder**, then select the Gluten folder.
```
A common way to indicate something the user sees on screen, like a menu command or a button name, is to bold the text the user should look for.
I am applying this common convention as I notice it, please help if I overlook elements that should be bolded.

##########
docs/developers/NewToGluten.md:
##########
@@ -192,7 +154,7 @@ select a debugger like "C++ (GDB/LLDB)". The launch.json will be created at: `<
 Click the `Add Configuration` button in launch.json, and select gdb "launch" (to start and debug a program) or

Review Comment:
```suggestion
Click the **Add Configuration** button in `launch.json`, and select gdb **launch** to start and debug a program, or
```
Note: GitHub will not let me edit line 151 above
```
Open the `Run and Debug` panel (Ctrl-Shift-D)
```
to change `Run and Debug` to bold **Run and Debug** to be consistent with the change I mentioned in my comment on line 143.

##########
docs/developers/NewToGluten.md:
##########
@@ -167,7 +129,7 @@ make debug EXTRA_CMAKE_FLAGS="-DVELOX_ENABLE_PARQUET=ON -DENABLE_HDFS=ON -DVELOX
 Then Gluten will link the Velox debug library.
 Just click `build` in bottom bar, you will get intellisense search and link.

Review Comment:
```suggestion
Click `build` in the bottom bar to get IntelliSense search and link.
```

##########
docs/developers/NewToGluten.md:
##########
@@ -266,32 +228,12 @@ Click the `Add Configuration` button in launch.json, and select gdb "launch" (to
 Then you can create breakpoint and debug in `Run and Debug` section.
-### Velox debug
+#### Debug Velox code
 For some Velox tests such as `ParquetReaderTest`, tests need to read the parquet file in `<velox_home>/velox/dwio/parquet/tests/examples`,
 you should let the screen on `ParquetReaderTest.cpp`, then click `Start Debugging`, otherwise `No such file or directory` exception will be raised.

Review Comment:
```suggestion
Let the screen on `ParquetReaderTest.cpp`, then click **Start Debugging**, otherwise `No such file or directory` exception will be raised.
```
I changed the beginning of this to start a new sentence in an imperative mode, but I do not know what "Let the screen on ParquetReaderTest.cpp" means so I did not rephrase those words. What action does the user need to do?

##########
docs/developers/NewToGluten.md:
##########
@@ -136,23 +98,23 @@ Key components found on the left side bar are:
 Input your password in the above pop-up window, it will take a few minutes to install linux vscode server in remote machine folder `~/.vscode-server`
 If download failed, delete this folder and try again.
-## Usage
+Please note if vscode is upgraded, we need to download linux server again. Recommend switching update mode to off. Search `update` in Manage->Settings to turn off update mode.
-### Set up project
+#### Set up project
 - File->Open Folder // select the Gluten folder
 - After the project loads, you will be prompted to "Select CMakeLists.txt". Select the
   `${workspaceFolder}/cpp/CMakeLists.txt` file.
 - Next, you will be prompted to "Select a Kit" for the Gluten project. Select GCC 11 or above.
-### Settings
+#### Settings
 VSCode supports 2 ways to set user setting.
 - Manage->Command Palette (Open `settings.json`, search by `Preferences: Open Settings (JSON)`)
 - Manage->Settings (Common setting)
-### Build using VSCode
+#### Build using VSCode
 VSCode will try to compile using debug mode in <gluten_home>/build. We need to compile Velox debug mode before

Review Comment:
```suggestion
VSCode will try to compile using debug mode in `<gluten_home>/build`. You must compile Velox debug mode before
```

##########
docs/developers/NewToGluten.md:
##########
@@ -266,32 +228,12 @@ Click the `Add Configuration` button in launch.json, and select gdb "launch" (to
 Then you can create breakpoint and debug in `Run and Debug` section.
-### Velox debug
+#### Debug Velox code
 For some Velox tests such as `ParquetReaderTest`, tests need to read the parquet file in `<velox_home>/velox/dwio/parquet/tests/examples`,

Review Comment:
```suggestion
For some Velox tests such as `ParquetReaderTest`, tests need to read the parquet file in `<velox_home>/velox/dwio/parquet/tests/examples`.
```

##########
docs/developers/NewToGluten.md:
##########
@@ -167,7 +129,7 @@ make debug EXTRA_CMAKE_FLAGS="-DVELOX_ENABLE_PARQUET=ON -DENABLE_HDFS=ON -DVELOX
 Then Gluten will link the Velox debug library.
 Just click `build` in bottom bar, you will get intellisense search and link.
-### Debug
+#### Debug Setting
 The default compile command does not enable test and benchmark, so we don't get any executable files.

Review Comment:
```suggestion
The default compile command does not enable tests and benchmarks, so we don't get any executable files of those kinds.
```
I added words for clarity. Please check that my words are correct!

##########
docs/developers/NewToGluten.md:
##########
@@ -400,7 +342,7 @@ or by the following commands:
 - `gcore <pid>`
 - `kill -s SIGSEGV <pid>`
-# Debug cpp with gdb
+## Debug cpp with gdb
 You can use gdb to debug tests and benchmarks.

Review Comment:
```suggestion
You can use gdb to debug tests, benchmarks, and JNI calls.
```

##########
docs/developers/NewToGluten.md:
##########
@@ -311,7 +253,7 @@ Search `default formatter` in `Settings`, select Clang-Format.
 If your formatOnSave still make no effect, you can use shortcut `SHIFT+ALT+F` to format one file manually.

Review Comment:
```suggestion
If formatOnSave still has no effect, select a single file and use `SHIFT+ALT+F` to format it manually.
```
GitHub does not let me edit the lines immediately above this (309-310, were 251-252). I suggest those lines should be more like this:
```
If multiple clang-format versions are installed, formatOnSave may not take effect.
To specify the default formatter, search for `default formatter` in **Settings**, then select **Clang-Format**.
```

##########
docs/developers/NewToGluten.md:
##########
@@ -311,7 +253,7 @@ Search `default formatter` in `Settings`, select Clang-Format.
 If your formatOnSave still make no effect, you can use shortcut `SHIFT+ALT+F` to format one file manually.
-### CMake format
+#### CMake format
 To format cmake files, like CMakeLists.txt & *.cmake, please install `cmake-format`.

Review Comment:
```suggestion
To format cmake files like CMakeLists.txt & *.cmake, please install `cmake-format`.
```

##########
docs/developers/NewToGluten.md:
##########
@@ -470,7 +412,7 @@ child allocators: 0
 at org.apache.spark.memory.SparkMemoryUtil$UnsafeItr.hasNext(SparkMemoryUtil.scala:246)
 ```
-## CPP code memory leak
+### CPP code memory leak
 Sometimes you cannot get the coredump symbols, if you debug memory leak, you can write googletest to use valgrind to detect

Review Comment:
```suggestion
Sometimes you cannot get the coredump symbols, when debugging a memory leak. You can write a googletest to use valgrind to detect
```

##########
docs/developers/NewToGluten.md:
##########
@@ -400,7 +342,7 @@ or by the following commands:
 - `gcore <pid>`
 - `kill -s SIGSEGV <pid>`
-# Debug cpp with gdb
+## Debug cpp with gdb
 You can use gdb to debug tests and benchmarks.
 And also you can debug jni call.

Review Comment:
```suggestion
```

##########
docs/developers/NewToGluten.md:
##########
@@ -498,20 +439,20 @@ spark-shell --name run_gluten \
 --conf spark.shuffle.manager=org.apache.spark.shuffle.sort.ColumnarShuffleManager
 ```
-# Check Gluten Approved Spark Plan
+## Check Gluten Approved Spark Plan
 To make sure we don't accidentally modify the Gluten and Spark Plan build logic.

Review Comment:
```suggestion
To avoid accidentally modifying the Gluten and Spark Plan build logic,
```

##########
docs/developers/NewToGluten.md:
##########
@@ -331,7 +273,7 @@ After the above installation, you can optionally do some configuration in Visual
 location, you might not need to change this setting.
 3. Now, you can format your CMake files by right-clicking in a file and selecting `Format Document`.
-### Add UT
+#### Add UT
 1. For Native Code Modifications: If you have modified native code, it is best to use gtest to test the native code.

Review Comment:
```suggestion
1. For Native Code Modifications: If you have modified native code, use gtest to test the native code.
```

##########
docs/developers/NewToGluten.md:
##########
@@ -331,7 +273,7 @@ After the above installation, you can optionally do some configuration in Visual
 location, you might not need to change this setting.
 3. Now, you can format your CMake files by right-clicking in a file and selecting `Format Document`.

Review Comment:
```suggestion
3. Format your CMake files by right-clicking in a file and selecting `Format Document`.
```

##########
docs/developers/NewToGluten.md:
##########
@@ -498,20 +439,20 @@ spark-shell --name run_gluten \
 --conf spark.shuffle.manager=org.apache.spark.shuffle.sort.ColumnarShuffleManager
 ```
-# Check Gluten Approved Spark Plan
+## Check Gluten Approved Spark Plan
 To make sure we don't accidentally modify the Gluten and Spark Plan build logic.
 We introduce new logic in `VeloxTPCHSuite` to check whether the plan has been changed or not,

Review Comment:
```suggestion
we introduce new logic in `VeloxTPCHSuite` to check whether the plan has been changed or not.
```

##########
docs/developers/NewToGluten.md:
##########
@@ -498,20 +439,20 @@ spark-shell --name run_gluten \
 --conf spark.shuffle.manager=org.apache.spark.shuffle.sort.ColumnarShuffleManager
 ```
-# Check Gluten Approved Spark Plan
+## Check Gluten Approved Spark Plan
 To make sure we don't accidentally modify the Gluten and Spark Plan build logic.
 We introduce new logic in `VeloxTPCHSuite` to check whether the plan has been changed or not,
 and this will be triggered when running the unit test.

Review Comment:
```suggestion
This `VeloxTPCHSuite` is triggered when running the unit test.
```

##########
docs/developers/NewToGluten.md:
##########
@@ -498,20 +439,20 @@ spark-shell --name run_gluten \
 --conf spark.shuffle.manager=org.apache.spark.shuffle.sort.ColumnarShuffleManager
 ```
-# Check Gluten Approved Spark Plan
+## Check Gluten Approved Spark Plan
 To make sure we don't accidentally modify the Gluten and Spark Plan build logic.
 We introduce new logic in `VeloxTPCHSuite` to check whether the plan has been changed or not,
 and this will be triggered when running the unit test.
-As a result, developers may encounter unit test fail in Github CI or locally, with the following error message:
+As a result, developers may encounter unit test failures in GitHub CI or locally, with the following error message:
 ```log
 - TPC-H q5 *** FAILED ***
 Mismatch for query 5
 Actual Plan path: /tmp/tpch-approved-plan/v2-bhj/spark322/5.txt
 Golden Plan path: /opt/gluten/backends-velox/target/scala-2.12/test-classes/tpch-approved-plan/v2-bhj/spark322/5.txt (VeloxTPCHSuite.scala:101)
 ```
-For developers to update the golden plan, you can find the actual plan in Github CI Artifacts or in local `/tmp/` directory.
+To update the golden plan, you can find the actual plan in GitHub CI Artifacts or in local `/tmp/` directory.

Review Comment:
```suggestion
To update the golden plan, find the actual plan in GitHub CI Artifacts or in the local `/tmp/` directory.
```

##########
docs/developers/NewToGluten.md:
##########
@@ -136,23 +98,23 @@ Key components found on the left side bar are:
 Input your password in the above pop-up window, it will take a few minutes to install linux vscode server in remote machine folder `~/.vscode-server`
 If download failed, delete this folder and try again.

Review Comment:
```suggestion
If the download fails, delete this folder and try again.
```

##########
docs/developers/NewToGluten.md:
##########
@@ -95,32 +64,25 @@ Install the Linux IntelliJ version, and debug code locally.
 - Download [IntelliJ Linux community version](https://www.jetbrains.com/idea/download/?fromIDE=#section=linux) to Linux server
 - Start Idea, `bash <idea_dir>/idea.sh`
-## Set up Gluten project
+#### Set up Gluten project
 - Make sure you have compiled Gluten.
 - Load the Gluten by File->Open, select <gluten_home/pom.xml>.
 - Activate your profiles such as <backends-velox>, and Reload Maven Project, you will find all your need modules have been activated.
 - Create breakpoint and debug as you wish, maybe you can try `CTRL+N` to find `TestOperator` to start your test.

Review Comment:
```suggestion
- Create breakpoints and debug as you wish. You can use `CTRL+N` to find `TestOperator` to start your test.
```

##########
docs/developers/NewToGluten.md:
##########
@@ -192,7 +154,7 @@ select a debugger like "C++ (GDB/LLDB)". The launch.json will be created at: `<
 Click the `Add Configuration` button in launch.json, and select gdb "launch" (to start and debug a program) or "attach" (to attach and debug a running program).

Review Comment:
```suggestion
**attach** to attach and debug a running program.
```

##########
docs/developers/NewToGluten.md:
##########
@@ -432,9 +374,9 @@ wait to attach....
 (gdb) c
 ```
-# Debug Memory leak
+## Debug Memory leak
-## Arrow memory allocator leak
+### Arrow memory allocator leak
 If you receive error message like

Review Comment:
```suggestion
If you receive an error message like the following:
```

##########
docs/developers/NewToGluten.md:
##########
@@ -479,15 +421,14 @@ apt install valgrind
 valgrind --leak-check=yes ./exec_backend_test
 ```
-
-# Run TPC-H and TPC-DS
+## Run TPC-H and TPC-DS
 We supply `<gluten_home>/tools/gluten-it` to execute these queries
 Refer to [velox_backend.yml](https://github.com/apache/incubator-gluten/blob/main/.github/workflows/velox_backend.yml)
-# Run Gluten+Velox on clean machine
+## Run Gluten Velox backend
-We can run Gluten + Velox on clean machine by one command (supported OS: Ubuntu20.04/22.04, CentOS 7/8, etc.).
+We can run Gluten Velox backend by one command.

Review Comment:
```suggestion
Run the Gluten Velox backend using the following command:
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
