hsiang-c commented on PR #2205:
URL: https://github.com/apache/datafusion-comet/pull/2205#issuecomment-3207996475

   One of the test failures is this
   
   ```
   TestStoragePartitionedJoins > testJoinsWithBucketingOnLongColumn() > catalogName = testhadoop, implementation = org.apache.iceberg.spark.SparkCatalog, config = {type=hadoop, cache-enabled=false}, planningMode = DISTRIBUTED FAILED
       org.opentest4j.AssertionFailedError: [SPJ should not change query output: row 1 col 1 contents should match] 
       expected: -593534002
        but was: -2147483648
           at app//org.apache.iceberg.spark.SparkTestHelperBase.assertEquals(SparkTestHelperBase.java:86)
           at app//org.apache.iceberg.spark.SparkTestHelperBase.assertEquals(SparkTestHelperBase.java:68)
           at app//org.apache.iceberg.spark.sql.TestStoragePartitionedJoins.assertPartitioningAwarePlan(TestStoragePartitionedJoins.java:661)
           at app//org.apache.iceberg.spark.sql.TestStoragePartitionedJoins.checkJoin(TestStoragePartitionedJoins.java:612)
           at app//org.apache.iceberg.spark.sql.TestStoragePartitionedJoins.testJoinsWithBucketingOnLongColumn(TestStoragePartitionedJoins.java:148)
   ```
   
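   The corresponding test code passes `1` and `3` as the expected shuffle counts, but per the assertion message above, the failing check compares query output with and without SPJ: row 1, col 1 was expected to be `-593534002` but was `-2147483648`. The latter value is -2^31 (`Integer.MIN_VALUE`), which seems like an overflow?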
   
   ```
       assertPartitioningAwarePlan(
           1, /* expected num of shuffles with SPJ */
           3, /* expected num of shuffles without SPJ */
           "SELECT t1.id, t1.salary, t1.%s "
               + "FROM %s t1 "
               + "INNER JOIN %s t2 "
               + "ON t1.id = t2.id AND t1.%s = t2.%s "
               + "ORDER BY t1.id, t1.%s",
           sourceColumnName,
           tableName,
           tableName(OTHER_TABLE_NAME),
           sourceColumnName,
           sourceColumnName,
           sourceColumnName);
     }
   ```
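
To make the overflow suspicion above concrete, here is a minimal Java sketch of a few generic ways `-2147483648` can show up. The class `MinValueSketch` and its methods are hypothetical names for illustration only; nothing here is taken from Comet or Iceberg, and it is not a diagnosis of this failure, just a reminder of which operations produce exactly `Integer.MIN_VALUE`.

```java
// Hypothetical helper for illustration; not part of Comet or Iceberg.
final class MinValueSketch {

    // -2^31 computed in 64-bit arithmetic, so this computation cannot overflow.
    static long minusTwoPow31() {
        return -(1L << 31);
    }

    // 32-bit two's-complement addition wraps: MAX_VALUE + 1 becomes MIN_VALUE.
    static int wrapIncrement(int x) {
        return x + 1;
    }

    // Narrowing long -> int keeps only the low 32 bits of the value.
    static int truncate(long x) {
        return (int) x;
    }

    // Narrowing double -> int saturates: values below the int range
    // become MIN_VALUE, and NaN becomes 0.
    static int saturate(double x) {
        return (int) x;
    }

    public static void main(String[] args) {
        System.out.println(minusTwoPow31() == Integer.MIN_VALUE);
        System.out.println(wrapIncrement(Integer.MAX_VALUE));
        System.out.println(truncate(1L << 31));
        System.out.println(saturate(-1e18));
    }
}
```

If the wrong value were always exactly `Integer.MIN_VALUE` regardless of the expected value, that might point toward a saturating or sentinel-producing conversion rather than ordinary wraparound, but that is an assumption that would need checking against the actual rows.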


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

