kevinw66 opened a new issue, #8896: URL: https://github.com/apache/incubator-gluten/issues/8896
### Backend

VL (Velox)

### Bug description

While running [VeloxAggregateFunctionsSuite](https://github.com/apache/incubator-gluten/blob/ca2ab6ad7d9c461b7ca1eb7b032f460ce4d567ca/backends-velox/src/test/scala/org/apache/gluten/execution/VeloxAggregateFunctionsSuite.scala#L320), I got the following error:

```
- stddev_pop
- var_samp
- var_pop
- bit_and bit_or bit_xor *** FAILED ***
  Results do not match for query:
  Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
  ... ignore some output
```

You can find the full log [here](https://github.com/kevinw66/incubator-gluten/actions/runs/13649831756/job/38156232543?pr=3).

The Parquet data encoding of the test file is:

```python
>>> import pyarrow.parquet as pq
>>> meta = pq.read_metadata("/opt/gluten/backends-velox/src/test/resources/tpch-data-parquet/lineitem/part-00000-6c374e0a-7d76-401b-8458-a8e31f8ab704-c000.snappy.parquet")
>>> meta.row_group(0).column(3).path_in_schema
'l_linenumber'
>>> meta.row_group(0).column(3).encodings
('PLAIN_DICTIONARY', 'BIT_PACKED', 'RLE')
```

After I changed the encoding to `RLE_DICTIONARY`:

```python
>>> import pyarrow.parquet as pq
>>> meta = pq.read_metadata("/opt/gluten/backends-velox/src/test/resources/tpch-data-parquet/lineitem/part-00000-6c374e0a-7d76-401b-8458-a8e31f8ab704-c000.snappy.parquet")
>>> meta.row_group(0).column(3).path_in_schema
'l_linenumber'
>>> meta.row_group(0).column(3).encodings
('PLAIN', 'RLE', 'RLE_DICTIONARY')
```

the test passes:

```
- stddev_pop
- var_samp
- var_pop
- bit_and bit_or bit_xor
- corr covar_pop covar_samp
- first
- last
- ...
ignore some output
```

Reference: https://parquet.apache.org/docs/file-format/data-pages/encodings/#dictionary-encoding-plain_dictionary--2-and-rle_dictionary--8

### Spark version

None

### Spark configurations

_No response_

### System information

_No response_

### Relevant logs

```bash
```
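
To check whether other test resources use the same dictionary encoding, the per-column inspection above could be generalized roughly as follows. This is only a sketch: `dictionary_flavor` and `scan_file` are hypothetical helper names, not part of Gluten, Velox, or pyarrow, and the only library calls used are the pyarrow metadata accessors already shown in the sessions above, plus `num_row_groups` and `num_columns`. It reports which encoding each column chunk declares and makes no claim about the root cause of the failure.

```python
# Hypothetical helpers for triage; the names are illustrative only.
LEGACY_DICT = "PLAIN_DICTIONARY"   # Parquet encoding id 2 (deprecated in the spec)
CURRENT_DICT = "RLE_DICTIONARY"    # Parquet encoding id 8


def dictionary_flavor(encodings):
    """Classify a column chunk's encodings tuple.

    Returns 'legacy', 'current', 'mixed', or 'none' depending on which
    dictionary encoding names appear in the tuple.
    """
    has_legacy = LEGACY_DICT in encodings
    has_current = CURRENT_DICT in encodings
    if has_legacy and has_current:
        return "mixed"
    if has_legacy:
        return "legacy"
    if has_current:
        return "current"
    return "none"


def scan_file(path):
    """Return {(row_group_index, column_path): flavor} for one Parquet file."""
    # Imported lazily so dictionary_flavor() stays usable without pyarrow.
    import pyarrow.parquet as pq

    meta = pq.read_metadata(path)
    result = {}
    for rg in range(meta.num_row_groups):
        row_group = meta.row_group(rg)
        for col in range(row_group.num_columns):
            chunk = row_group.column(col)
            result[(rg, chunk.path_in_schema)] = dictionary_flavor(chunk.encodings)
    return result
```

Calling `scan_file` on the `lineitem` file above and filtering for `'legacy'` would list every column chunk that still declares `PLAIN_DICTIONARY`, which may help narrow down whether the failure is tied to that encoding or to something else in the file.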
