Copilot commented on code in PR #6527:
URL: https://github.com/apache/hive/pull/6527#discussion_r3397723689


##########
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedPrimitiveColumnReader.java:
##########
@@ -651,14 +609,42 @@ private void decodeDictionaryIds(
       }
       break;
     case DECIMAL:
-      DecimalLogicalTypeAnnotation decimalLogicalType = null;
-      if (type.getLogicalTypeAnnotation() instanceof DecimalLogicalTypeAnnotation) {
-        decimalLogicalType = (DecimalLogicalTypeAnnotation) type.getLogicalTypeAnnotation();
+      if (column instanceof Decimal64ColumnVector dec64) {
+        fillDecimal64PrecisionScale(dec64);
+        boolean fast = dictionary.isFastDecimal64();
+        short valueScale = getEncodedDecimalScale();
+        for (int i = rowId; i < rowId + num; ++i) {

Review Comment:
   When decoding dictionary-encoded DECIMAL_64 values, the output vector's
   isRepeating flag is never updated. VectorizedParquetRecordReader initializes
   column.isRepeating=true for every batch; if the flag stays true here,
   downstream vectorized operators can treat the entire batch as repeating
   (reading only element 0) and produce wrong results for non-constant columns.
   Consider explicitly setting isRepeating to false (or computing it) in the
   dictionary decode path.
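
A minimal sketch of the "computing it" option, written against stand-in classes rather than the real Hive types. `Dec64VectorSketch`, `RepeatingFlagSketch`, and `decode` are hypothetical names used only for illustration; the sketch assumes the vector exposes a `long[] vector` and a `boolean isRepeating` field, ignores nulls, and skips the precision/scale handling in the actual patch.

```java
// Hypothetical stand-in for a DECIMAL_64 column vector: only the two
// fields relevant to the review comment are modeled here.
final class Dec64VectorSketch {
    final long[] vector;
    // Assumption taken from the review comment: the flag starts out true
    // for every batch and must be cleared by the decode path.
    boolean isRepeating = true;

    Dec64VectorSketch(int size) {
        this.vector = new long[size];
    }
}

final class RepeatingFlagSketch {
    // Copies dictionary values for ids[rowId .. rowId + num) into out.vector
    // and keeps isRepeating true only while every decoded row equals row 0.
    static void decode(long[] dictionary, int[] ids, int rowId, int num,
                       Dec64VectorSketch out) {
        for (int i = rowId; i < rowId + num; ++i) {
            out.vector[i] = dictionary[ids[i]];
            // Once a row differs from element 0 the flag stays false.
            out.isRepeating = out.isRepeating && out.vector[0] == out.vector[i];
        }
    }

    public static void main(String[] args) {
        Dec64VectorSketch out = new Dec64VectorSketch(3);
        decode(new long[] {100L, 250L}, new int[] {0, 1, 0}, 0, 3, out);
        System.out.println("isRepeating=" + out.isRepeating);
    }
}
```

The simpler alternative is to assign `out.isRepeating = false` once before the loop; that gives up the repeating optimization for constant columns but cannot report a non-constant batch as repeating.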



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

