gaoyangxiaozhu commented on code in PR #5351:
URL: https://github.com/apache/incubator-gluten/pull/5351#discussion_r1616042266


##########
gluten-ut/spark34/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/GlutenParquetRowIndexSuite.scala:
##########
@@ -315,10 +311,10 @@ class GlutenParquetRowIndexSuite extends ParquetRowIndexSuite with GlutenSQLTest
               // When there is no filter, the rowIdx values should be in range
               // [0-`numRecordsPerFile`].
               val expectedRowIdxValues = List.range(0, numRecordsPerFile)
-              assert(
-                dfToAssert
-                  .filter(col(rowIndexColName).isin(expectedRowIdxValues: _*))
-                  .count() == conf.numRows)
+              val df = dfToAssert
+                .select($"id")

Review Comment:
   Velox's `row_index` generation logic produces wrong indexes when no data 
column is selected (scanned out).
   
   See https://github.com/facebookincubator/velox/issues/9943 for details.
   
   That is why this PR rewrites some tests to always select at least one 
data column, so that they pass.



##########
gluten-ut/spark34/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/GlutenParquetRowIndexSuite.scala:
##########
@@ -315,10 +311,10 @@ class GlutenParquetRowIndexSuite extends ParquetRowIndexSuite with GlutenSQLTest
               // When there is no filter, the rowIdx values should be in range
               // [0-`numRecordsPerFile`].
               val expectedRowIdxValues = List.range(0, numRecordsPerFile)
-              assert(
-                dfToAssert
-                  .filter(col(rowIndexColName).isin(expectedRowIdxValues: _*))
-                  .count() == conf.numRows)
+              val df = dfToAssert
+                .select($"id")

Review Comment:
   @yma11 FYI



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gluten.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

