malinjawi opened a new issue, #11862:
URL: https://github.com/apache/gluten/issues/11862

   ### Backend
   
   VL (Velox)
   
   ### Bug description
   
   
   Expected behavior:
   
   Spark 3.5 queries on Gluten `main` should be able to use the Velox backend 
even when the effective Spark session timezone is `GMT`. Gluten should either 
pass a native-compatible timezone string to Velox or normalize the Spark value 
before native validation.
   
   Actual behavior:
   
   Gluten forwards the effective Spark session timezone into native query 
config. When that value is `GMT`, native validation fails before execution 
starts:
   
   ```text
   Exception: VeloxUserError
   Error Source: USER
   Error Code: INVALID_ARGUMENT
   Reason: session 'session_timezone' set with invalid value 'GMT'
   Expression: tz::getTimeZoneID(*tz, false) != -1
   Function: validateConfig
   File: .../ep/build-velox/build/velox_ep/velox/core/QueryConfig.cpp
   Line: 44
   ```
   
   As a result, the query never reaches the actual Delta/Iceberg/Velox 
execution path. Native offload is rejected at validation time because of the 
session timezone value alone.
   
   This was reproduced on a clean detached worktree without the local timezone 
normalization patch, after a fresh native rebuild using Gluten's pinned Velox 
dependency. The failure is not Delta-only; it also reproduces in a non-Delta 
Velox suite, which indicates this is a general Spark-to-native integration bug 
in the Velox backend path rather than a Delta-specific semantic issue.
   
   Reproduction summary:
   
   1. Create a clean detached worktree at 
`f350a440dbfa3a3412f0aed48a89fdac0b16ad48`.
   2. Rebuild native dependencies from the clean checkout:
   
      ```bash
      ./dev/builddeps-veloxbe.sh --run_setup_script=OFF --build_arrow=OFF \
        --build_tests=ON --spark_version=3.5
      ```
   
   3. Run the backend tests from the same clean checkout:
   
      ```bash
      ./build/mvn \
        -s /path/to/dev/maven-public-settings.xml \
        -Dmaven.repo.local=/path/to/.run-scala-test-cache/m2 \
        -pl backends-velox \
        -am \
        -Pbackends-velox,delta,iceberg,spark-3.5 \
        -Dtest=OptimizedWritesSuite \
        -Dsurefire.failIfNoSpecifiedTests=false \
        test
      ```
   
   Observed failures include:
   
   - `org.apache.spark.sql.delta.DeltaColumnDefaultsInsertSuite`
   - `org.apache.gluten.execution.VeloxDeltaSuite`
   - `org.apache.gluten.execution.VeloxTPCHMiscSuite`
   
   Example clean repro reports:
   
   - `backends-velox/target/surefire-reports/TEST-org.apache.spark.sql.delta.DeltaColumnDefaultsInsertSuite.xml`
   - `backends-velox/target/surefire-reports/TEST-org.apache.gluten.execution.VeloxDeltaSuite.xml`
   - `backends-velox/target/surefire-reports/TEST-org.apache.gluten.execution.VeloxTPCHMiscSuite.xml`
   
   Why this matters:
   
   - Native validation fails, so the query never executes.
   - It prevents meaningful verification of Delta/Iceberg behavior because 
suites fail on config setup first.
   - It hides deeper runtime issues behind a config gate.
   - It affects the current clean `main` build path.
   
   Why this appears to be a Gluten-side issue:
   
   - Spark 3.5 can accept and surface `GMT`-family timezone IDs as an effective 
session timezone.
   - Gluten forwards that session timezone into native config.
   - On a clean Gluten build, native validation rejects that value before execution.
   - The practical compatibility boundary between Spark session config and 
native Velox config is in Gluten.
   
   Candidate fix:
   
   Normalize Spark `GMT` / `GMT+/-offset` session timezone values to 
native-safe `UTC` forms at the Gluten/native boundary before native validation.
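
The candidate fix could be sketched roughly as below. This is a minimal illustration, not Gluten code: the class `SessionTimeZoneNormalizer` and its `normalize` method are hypothetical names, and the assumption that Velox accepts `UTC` and `UTC+/-offset` strings has not been verified here.

```java
/**
 * Hypothetical helper (not part of Gluten): rewrites GMT-family session
 * timezone IDs to UTC-prefixed forms before they reach native config.
 */
final class SessionTimeZoneNormalizer {
  private SessionTimeZoneNormalizer() {}

  static String normalize(String sparkTimeZoneId) {
    if (sparkTimeZoneId == null) {
      return null;
    }
    String id = sparkTimeZoneId.trim();
    // Bare "GMT" (any case) becomes "UTC".
    if (id.equalsIgnoreCase("GMT")) {
      return "UTC";
    }
    // "GMT+08:00" or "GMT-5" becomes "UTC+08:00" or "UTC-5".
    // Assumption: the native side accepts this UTC-prefixed offset form.
    if (id.length() > 3 && id.regionMatches(true, 0, "GMT", 0, 3)) {
      char sign = id.charAt(3);
      if (sign == '+' || sign == '-') {
        return "UTC" + id.substring(3);
      }
    }
    // Region IDs such as "America/New_York" pass through unchanged.
    return id;
  }
}
```

Under this sketch the helper would be applied to the Spark session timezone value just before it is written into the native `session_timezone` setting, so region-based IDs are untouched and only the `GMT` family is rewritten.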
   
   ### Gluten version
   
   main branch
   
   ### Spark version
   
   Spark-3.5.x
   
   ### Spark configurations
   
   No explicit `spark.sql.session.timeZone` override was set in the reproducer.
   
   The effective session timezone seen by native validation was `GMT`.
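
Because no override was set, the `GMT` value most likely came from the JVM default: Spark falls back to the JVM default timezone ID when `spark.sql.session.timeZone` is unset. A minimal sketch for checking what a given JVM would report follows; `EffectiveTimeZoneCheck` and `effectiveId` are hypothetical names used only for illustration.

```java
import java.util.TimeZone;

/** Hypothetical diagnostic helper, not part of Spark or Gluten. */
final class EffectiveTimeZoneCheck {
  private EffectiveTimeZoneCheck() {}

  /**
   * Returns the explicit override if one is given, otherwise the JVM
   * default timezone ID, mirroring Spark's fallback for an unset
   * spark.sql.session.timeZone.
   */
  static String effectiveId(String explicitOverride) {
    if (explicitOverride == null || explicitOverride.isEmpty()) {
      return TimeZone.getDefault().getID();
    }
    return explicitOverride;
  }
}
```

Calling `EffectiveTimeZoneCheck.effectiveId(null)` on the machine that runs the tests shows the ID that would be forwarded to native config when no override is present.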
   
   ### System information
   
   ```
   ./dev/info.sh: line 47: lscpu: command not found
   
   Gluten Version: 1.7.0-SNAPSHOT
   Commit: f350a440dbfa3a3412f0aed48a89fdac0b16ad48
   CMake Version: 4.2.3
   System: Darwin-25.3.0
   Arch: arm64
   CPU Name: 
   C++ Compiler: /usr/bin/c++
   C++ Compiler Version: 17.0.0.17000013
   17.0.0.17000013
   C Compiler: /usr/bin/cc
   C Compiler Version: 17.0.0.17000013
   17.0.0.17000013
   CMake Prefix Path: /Applications/Xcode_16.4.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr;/opt/homebrew;/usr/local;/usr;/;/opt/homebrew;/usr/local;/usr/X11R6;/usr/pkg;/opt;/sw;/opt/local
   ```
   
   
   ### Relevant logs
   
   ```text
   Exception: VeloxUserError
   Error Source: USER
   Error Code: INVALID_ARGUMENT
   Reason: session 'session_timezone' set with invalid value 'GMT'
   Retriable: False
   Expression: tz::getTimeZoneID(*tz, false) != -1
   Function: validateConfig
   File: /Users/malinjawi/Documents/GitHub/GlutenVelox/incubator-gluten-verify-clean/ep/build-velox/build/velox_ep/velox/core/QueryConfig.cpp
   Line: 44
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

