Cai-Yao opened a new issue, #12103:
URL: https://github.com/apache/gluten/issues/12103

   ### Backend
   
   VL (Velox)
   
   ### Bug description
   
   When running Spark SQL with Gluten enabled on Spark 4.0.1, the JVM crashes with a fatal error (`SIGSEGV`) shortly after loading `libgluten.so` and `libvelox.so`.
   
   ### Expected behavior
   The query should run successfully (or fall back gracefully) without crashing the JVM.
   
   ### Actual behavior
   The driver JVM crashes with `SIGSEGV` in `libjvm.so` on the JNI method lookup path (`jni_GetMethodID`), and `hs_err_pid*.log` contains:
   
   - `NoClassDefFoundError: Lorg/apache/gluten/memory/listener/ReservationListener;`
   - native libraries loaded from Gluten bundle (`libgluten.so`, `libvelox.so`)
   - process exits due to fatal JVM error
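
One quick check for the `NoClassDefFoundError` listed above is whether the bundle jar passed via `--jars` actually packages the class. The sketch below is a minimal, hypothetical helper (the function names are illustrative and not part of Gluten). It only inspects the jar's entries, so it says nothing about which classloader the JNI lookup uses at runtime.

```python
import zipfile


def descriptor_to_entry(name: str) -> str:
    """Map a JNI descriptor or dotted class name to its jar entry path."""
    # A JNI descriptor looks like "Lorg/example/Foo;": strip the L and the ;.
    if name.startswith("L") and name.endswith(";"):
        name = name[1:-1]
    return name.replace(".", "/") + ".class"


def jar_contains_class(jar_path: str, name: str) -> bool:
    """Return True if the jar has a .class entry for the given class."""
    entry = descriptor_to_entry(name)
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()
```

Calling `jar_contains_class(path_to_bundle_jar, "Lorg/apache/gluten/memory/listener/ReservationListener;")` with the real bundle path would show whether the class is packaged at all. If it is, the failure is more likely about class visibility at lookup time than about a missing file, though that remains an assumption to verify.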
   
   This issue report was drafted with assistance from AI.
   
   ### Gluten version
   
   main branch
   
   ### Spark version
   
   spark-4.0.x
   
   ### Spark configurations
   
   Sanitized key configs used in reproduction:
   
   --master local[80]
   --class org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver
   --jars /<REDACTED_PATH>/gluten-velox-bundle-spark4.0_2.13-linux_amd64-1.6.0.jar
   
   --conf spark.plugins=org.apache.gluten.GlutenPlugin
   --conf spark.memory.offHeap.enabled=true
   --conf spark.memory.offHeap.size=20g
   --conf spark.driver.memory=24g
   --conf spark.sql.shuffle.partitions=80
   --conf spark.sql.crossJoin.enabled=true
   --conf spark.sql.legacy.timeParserPolicy=LEGACY
   --conf spark.sql.ansi.enabled=false
   --conf spark.gluten.sql.columnar.forceShuffledHashJoin=true
   --conf spark.sql.warehouse.dir=/<REDACTED_PATH>/warehouse
   
   spark.driver.extraJavaOptions:
     -Dderby.system.home=/<REDACTED_PATH>
     -XX:+UseG1GC
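
For anyone trying to reproduce, the options above can be assembled into a single command line. This is a sketch only: the `spark-submit` launcher name is an assumption (the report shows `org.apache.spark.deploy.SparkSubmit` as the Java command, not the script used), the helper name is hypothetical, and the redacted paths are left as placeholders.

```python
import shlex

# Placeholder path copied from the report; the redacted part must be filled in locally.
GLUTEN_JAR = "/<REDACTED_PATH>/gluten-velox-bundle-spark4.0_2.13-linux_amd64-1.6.0.jar"

# Configuration keys and values as listed in the report.
CONF = {
    "spark.plugins": "org.apache.gluten.GlutenPlugin",
    "spark.memory.offHeap.enabled": "true",
    "spark.memory.offHeap.size": "20g",
    "spark.driver.memory": "24g",
    "spark.sql.shuffle.partitions": "80",
    "spark.sql.crossJoin.enabled": "true",
    "spark.sql.legacy.timeParserPolicy": "LEGACY",
    "spark.sql.ansi.enabled": "false",
    "spark.gluten.sql.columnar.forceShuffledHashJoin": "true",
    "spark.sql.warehouse.dir": "/<REDACTED_PATH>/warehouse",
    "spark.driver.extraJavaOptions": "-Dderby.system.home=/<REDACTED_PATH> -XX:+UseG1GC",
}


def build_submit_args(jar: str, conf: dict) -> list:
    """Build the argument list for the launcher (assumed here to be spark-submit)."""
    args = [
        "spark-submit",
        "--master", "local[80]",
        "--class", "org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver",
        "--jars", jar,
    ]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


# Print a shell-quoted command line that can be pasted into a terminal.
print(shlex.join(build_submit_args(GLUTEN_JAR, CONF)))
```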
   
   ### System information
   
   JVM args (sanitized, key ones):
   JDK: Temurin 17.0.19+10
   --add-modules=jdk.incubator.vector
   multiple --add-opens=...=ALL-UNNAMED
   -Djdk.reflect.useDirectMethodHandle=false
   -Dio.netty.tryReflectionSetAccessible=true
   Sanitized environment details (from hs_err):
   OS: Alibaba Cloud Linux 3
   Kernel: 5.10.134-18.al8.x86_64
   Arch: x86_64
   CPU: Intel(R) Xeon(R) Platinum 8369B, 80 cores
   Memory: ~114 GB
   glibc: 2.32
   Java: OpenJDK Temurin 17.0.19+10 (linux-amd64)
   (If needed, I can run dev/info.sh and provide a redacted output.)
   
   ### Relevant logs
   
   ```text
   # A fatal error has been detected by the Java Runtime Environment:
   #  SIGSEGV (0xb)
   # JRE version: OpenJDK Runtime Environment Temurin-17.0.19+10
   # Java VM: OpenJDK 64-Bit Server VM ... linux-amd64
   # Problematic frame:
   # V  [libjvm.so+0x28cda0] ... oop_access_barrier(void*)+0x0
   Internal exceptions (20 events):
   Event: 14.409 Thread ... Exception <a 'java/lang/NoClassDefFoundError' ...:
   Lorg/apache/gluten/memory/listener/ReservationListener;>
   thrown [src/hotspot/share/classfile/systemDictionary.cpp, line 245]
   Event: 3.960 Loaded shared library .../linux/amd64/libgluten.so
   Event: 4.996 Loaded shared library .../linux/amd64/libvelox.so
   java_command: org.apache.spark.deploy.SparkSubmit ... --jars /<REDACTED_PATH>/gluten-velox-bundle-spark4.0_2.13-linux_amd64-1.6.0.jar ...
   ```

   ### Reproduction notes

   - The same machine can run the older Spark 3.3 + Gluten bundle without this crash.
   - The crash is observed with Spark 4.0.1 + Gluten 1.6.0 (Spark 4.0 / Scala 2.13 bundle).
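
When comparing several `hs_err_pid*.log` files, it can help to pull out only the fields quoted in the log excerpt: the signal, the problematic frame, missing classes, and loaded native libraries. The following is a rough sketch based solely on the line shapes shown in that excerpt. The function name is hypothetical, and the patterns may need adjusting for unabridged logs.

```python
import re


def summarize_hs_err(text: str) -> dict:
    """Extract a few key fields from the text of an hs_err log."""
    summary = {
        "signal": None,
        "problematic_frame": None,
        "missing_classes": [],
        "loaded_libraries": [],
    }
    lines = text.splitlines()
    for i, raw in enumerate(lines):
        line = raw.strip()
        # Header line such as "#  SIGSEGV (0xb)".
        m = re.match(r"#\s+(SIG[A-Z]+)\b", line)
        if m and summary["signal"] is None:
            summary["signal"] = m.group(1)
        # The frame itself is on the line after "# Problematic frame:".
        if line.startswith("# Problematic frame:") and i + 1 < len(lines):
            summary["problematic_frame"] = lines[i + 1].strip().lstrip("#").strip()
        m = re.search(r"Loaded shared library (\S+)", line)
        if m:
            summary["loaded_libraries"].append(m.group(1))
    # The class descriptor may be wrapped onto the next line, so scan the whole text.
    pattern = r"NoClassDefFoundError'[^\n]*?:\s*L([\w/$]+);"
    for m in re.finditer(pattern, text):
        name = m.group(1).replace("/", ".")
        if name not in summary["missing_classes"]:
            summary["missing_classes"].append(name)
    return summary
```

Reading the file with `open(path, errors="replace").read()` and passing the result to `summarize_hs_err` gives a small dictionary that is easier to diff between a crashing run and a working one.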


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

