This is an automated email from the ASF dual-hosted git repository.

xiangfu0 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/pinot.git


The following commit(s) were added to refs/heads/master by this push:
     new 51bcec3b88f Generalize RAW + dictionary column fix to all aggregation/distinct sites (follow-up to #18500) (#18504)
51bcec3b88f is described below

commit 51bcec3b88fe151362ef79bdea1b26515e6ff3ca
Author: Xiang Fu <[email protected]>
AuthorDate: Fri May 15 00:28:10 2026 -0700

    Generalize RAW + dictionary column fix to all aggregation/distinct sites (follow-up to #18500) (#18504)
    
    * Fix UnsupportedOperationException for aggregations/distinct on RAW + dictionary columns
    
    A column declared with EncodingType.RAW + an explicit dictionaryIndex has a
    Dictionary file on disk but a RAW forward index that throws on
    ForwardIndexReader#readDictIds. Many aggregation, group-by, and distinct
    executors gated their dict-id read path on `blockValSet.getDictionary() != null`
    alone, so a single such column in a query would crash with
    UnsupportedOperationException on the AggregationOperator / group-by path.
    
    This was first fixed in PR #18500 for one site (NoDictionaryMultiColumnGroupKeyGenerator).
    A codebase scan found ~30 more call sites with the same buggy pattern: the
    DISTINCTCOUNT/HLL/Bitmap/ULL/CPCSketch family, SegmentPartitionedDistinctCount,
    Mode, AnyValue, FUNNELCOUNT, DistinctExecutorFactory's single- and multi-column
    paths, and DefaultGroupByExecutor (already partially guarded).
    
    Fix: introduce an explicit `boolean isDictionaryEncoded()` on `BlockValSet` and
    `ColumnContext` that returns true only when the forward index is dict-encoded.
    The default `BlockValSet.isDictionaryEncoded()` falls back to `getDictionary() != null`
    so non-projection value sets (transform, row, data-block) keep working unchanged.
    `ProjectionBlockValSet` overrides it to consult the forward index directly so
    RAW + dictionaryIndex columns correctly report false. `getDictionary()` keeps
    its straightforward "is there a dictionary file?" meaning; filter operators
    that hold dict IDs (via DataSource#getDictionary, which is unaffected) continue
    to work.
    
    Every aggregation/distinct/group-by call site now gates on the new flag rather
    than dictionary nullness, so a single helper expresses the rule once and all
    30+ sites are consistent.
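
The gating rule described above can be illustrated with a minimal, self-contained sketch. `ValueSet`, `RawWithDictionary`, and `distinctCount` are hypothetical stand-ins written for this illustration; they are not Pinot's `BlockValSet` or any of the aggregation functions touched by this commit, and the dictionary is simplified to a `String[]`.

```java
import java.util.HashSet;
import java.util.Set;

public class DictionaryGateSketch {

  /** Hypothetical, simplified value set; NOT Pinot's BlockValSet. */
  public interface ValueSet {
    String[] getDictionary();   // null when the column has no dictionary
    int[] getDictionaryIds();   // throws when the forward index is RAW
    String[] getValues();

    // Default assumes dictionary presence implies dict-id readability.
    default boolean isDictionaryEncoded() {
      return getDictionary() != null;
    }
  }

  /** Dictionary present, but RAW forward index: dict-id reads are unsupported. */
  public static final class RawWithDictionary implements ValueSet {
    private final String[] _dictionary;
    private final String[] _values;

    public RawWithDictionary(String[] dictionary, String[] values) {
      _dictionary = dictionary;
      _values = values;
    }

    @Override
    public String[] getDictionary() {
      return _dictionary;
    }

    @Override
    public int[] getDictionaryIds() {
      throw new UnsupportedOperationException();
    }

    @Override
    public String[] getValues() {
      return _values;
    }

    // Override: the two notions diverge for this kind of column.
    @Override
    public boolean isDictionaryEncoded() {
      return false;
    }
  }

  /** Counts distinct values, choosing the read path from the flag, not from dictionary nullness. */
  public static int distinctCount(ValueSet valueSet) {
    if (valueSet.isDictionaryEncoded()) {
      Set<Integer> dictIds = new HashSet<>();
      for (int dictId : valueSet.getDictionaryIds()) {
        dictIds.add(dictId);
      }
      return dictIds.size();
    }
    Set<String> values = new HashSet<>();
    for (String value : valueSet.getValues()) {
      values.add(value);
    }
    return values.size();
  }

  public static void main(String[] args) {
    ValueSet raw = new RawWithDictionary(new String[]{"a", "b"}, new String[]{"a", "b", "a"});
    // Gating on getDictionary() != null would call getDictionaryIds() and throw here.
    System.out.println(distinctCount(raw)); // prints 2
  }
}
```

In this sketch the value path is taken even though `getDictionary()` is non-null, which is the behavior the commit message describes for RAW + dictionaryIndex columns.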
    
    Regression tests in RawForwardIndexWithDictionaryTest reproduce the crash on
    multi-column GROUP BY, multi-column DISTINCT, DISTINCT with filter, DISTINCTCOUNT
    with filter, DISTINCTCOUNTHLL with filter, DISTINCTCOUNTBITMAP,
    SEGMENTPARTITIONEDDISTINCTCOUNT with filter, and MODE. All 16 new test runs
    fail on master and pass with this change.
    
    Co-Authored-By: Claude Opus 4.7 <[email protected]>
    
    * Address review comments on PR #18504
    
    Themes from raghavyadav01 + Copilot:
    
    1. forwardIndex == null was treated as dict-encoded (`forwardIndex == null
       || forwardIndex.isDictionaryEncoded()`). When the forward index is
       disabled (dict + inverted/range only), this returned true and callers
       would NPE on getDictionaryIdsSV(). Tightened to
       `forwardIndex != null && forwardIndex.isDictionaryEncoded()` in both
       ColumnContext.fromDataSource and ProjectionBlockValSet.isDictionaryEncoded().
    2. BlockValSet.getDictionary() Javadoc said "dictionary file", which is
       inaccurate for TransformBlockValSet (in-memory). Reworded to "dictionary
       (on disk or built on the fly)".
    3. Default BlockValSet.isDictionaryEncoded() mirrored the buggy pattern.
       Kept default returning getDictionary() != null for SPI compat but
       strengthened the Javadoc to call out the trap, and added explicit
       overrides on every in-tree impl (TransformBlockValSet, RowBasedBlockValSet,
       FilteredRowBasedBlockValSet, DataBlockValSet, FilteredDataBlockValSet).
    4. Added regression test testDistinctOnTransformOfRawDictColumnReturnsSameResults
       covering SELECT DISTINCT UPPER(rawDictDim), which exercises the TransformBlockValSet
       path raghavyadav01 flagged.
    5. Reworded stale test docstrings to describe pre-fix behavior as historical
       ("previously crashed inside ...") and reference isDictionaryEncoded() as
       the correct gate.
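
Item 1 above comes down to a one-operator change in a boolean condition. The sketch below contrasts the two forms under stated assumptions: `ForwardIndex`, `looseGate`, and `strictGate` are hypothetical names for illustration only, and the leading `dictionary != null` check in `looseGate` is assumed from the final code rather than quoted from the pre-review revision.

```java
public class ForwardIndexGateSketch {

  /** Hypothetical stand-in for a forward index reader; NOT Pinot's ForwardIndexReader. */
  public interface ForwardIndex {
    boolean isDictionaryEncoded();
  }

  /** Pre-review form: a missing (disabled) forward index is treated as dict-encoded. */
  public static boolean looseGate(Object dictionary, ForwardIndex forwardIndex) {
    return dictionary != null && (forwardIndex == null || forwardIndex.isDictionaryEncoded());
  }

  /** Tightened form: dict-id reads need a dictionary AND a dict-encoded forward index. */
  public static boolean strictGate(Object dictionary, ForwardIndex forwardIndex) {
    return dictionary != null && forwardIndex != null && forwardIndex.isDictionaryEncoded();
  }

  public static void main(String[] args) {
    Object dictionary = new Object();
    ForwardIndex rawIndex = () -> false;
    ForwardIndex dictIndex = () -> true;

    // Disabled forward index: the loose form wrongly reports the dict-id path as usable.
    System.out.println(looseGate(dictionary, null));       // prints true
    System.out.println(strictGate(dictionary, null));      // prints false
    // RAW forward index with a dictionary: both forms report false.
    System.out.println(strictGate(dictionary, rawIndex));  // prints false
    // Dict-encoded forward index: both forms report true.
    System.out.println(strictGate(dictionary, dictIndex)); // prints true
  }
}
```

The two forms differ only when the forward index is absent, which is the disabled-forward-index case the review flagged.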
    
    Co-Authored-By: Claude Opus 4.7 <[email protected]>
    
    ---------
    
    Co-authored-by: Claude Opus 4.7 <[email protected]>
---
 .../org/apache/pinot/core/common/BlockValSet.java  |  22 ++-
 .../apache/pinot/core/operator/ColumnContext.java  |  35 +++-
 .../core/operator/docvalsets/DataBlockValSet.java  |   6 +
 .../docvalsets/FilteredDataBlockValSet.java        |   6 +
 .../docvalsets/FilteredRowBasedBlockValSet.java    |   6 +
 .../operator/docvalsets/ProjectionBlockValSet.java |  19 +++
 .../operator/docvalsets/RowBasedBlockValSet.java   |   6 +
 .../operator/docvalsets/TransformBlockValSet.java  |   9 ++
 .../function/AnyValueAggregationFunction.java      |   2 +-
 .../BaseDistinctAggregateAggregationFunction.java  |  12 +-
 ...istinctCountSmartSketchAggregationFunction.java |   4 +-
 .../DistinctCountBitmapAggregationFunction.java    |  12 +-
 .../DistinctCountCPCSketchAggregationFunction.java |   6 +-
 .../DistinctCountHLLAggregationFunction.java       |  12 +-
 .../DistinctCountHLLPlusAggregationFunction.java   |  12 +-
 .../DistinctCountOffHeapAggregationFunction.java   |   2 +-
 .../DistinctCountSmartHLLAggregationFunction.java  |   2 +-
 ...stinctCountSmartHLLPlusAggregationFunction.java |   2 +-
 .../DistinctCountSmartULLAggregationFunction.java  |   2 +-
 .../DistinctCountULLAggregationFunction.java       |   6 +-
 .../function/ModeAggregationFunction.java          |   6 +-
 ...artitionedDistinctCountAggregationFunction.java |   6 +-
 .../function/funnel/AggregationStrategy.java       |   9 +-
 .../groupby/DefaultGroupByExecutor.java            |  14 +-
 .../NoDictionaryMultiColumnGroupKeyGenerator.java  |  29 +---
 .../query/distinct/DistinctExecutorFactory.java    |  11 +-
 .../custom/RawForwardIndexWithDictionaryTest.java  | 177 ++++++++++++++++++++-
 27 files changed, 338 insertions(+), 97 deletions(-)

diff --git a/pinot-core/src/main/java/org/apache/pinot/core/common/BlockValSet.java b/pinot-core/src/main/java/org/apache/pinot/core/common/BlockValSet.java
index dad7da5c3bf..728ac06cc71 100644
--- a/pinot-core/src/main/java/org/apache/pinot/core/common/BlockValSet.java
+++ b/pinot-core/src/main/java/org/apache/pinot/core/common/BlockValSet.java
@@ -48,11 +48,31 @@ public interface BlockValSet {
   boolean isSingleValue();
 
   /**
-   * Returns the dictionary for the column, or {@code null} if the column is 
not dictionary-encoded.
+   * Returns the dictionary for the column if one exists, or {@code null} 
otherwise. The dictionary may live on disk
+   * (segment-backed columns) or be built on the fly (transform functions). It 
may be present even when
+   * {@link #isDictionaryEncoded()} returns {@code false} — a column declared 
as {@code EncodingType.RAW} with an
+   * explicit {@code dictionaryIndex} carries a dictionary on disk but a RAW 
forward index, and a column with a
+   * disabled forward index has no way to read dict IDs at all. Callers that 
select between a dictionary-id read
+   * path ({@link #getDictionaryIdsSV()} / {@link #getDictionaryIdsMV()}) and 
a value read path MUST gate on
+   * {@link #isDictionaryEncoded()}, not {@code getDictionary() != null}.
    */
   @Nullable
   Dictionary getDictionary();
 
+  /**
+   * Returns {@code true} if the dict-id read path ({@link 
#getDictionaryIdsSV()} / {@link #getDictionaryIdsMV()})
+   * is callable on this value set.
+   *
+   * <p>The default implementation falls back to {@code getDictionary() != 
null}, which is correct for value sets
+   * where dictionary presence and dict-id readability are coupled. 
Implementers MUST override this whenever the
+   * two can diverge — most notably the segment projection layer, where a 
column can declare
+   * {@code EncodingType.RAW} alongside an explicit {@code dictionaryIndex} 
(dictionary present, but
+   * {@code readDictIds} throws), or where the forward index is disabled 
outright (no forward index to read).
+   */
+  default boolean isDictionaryEncoded() {
+    return getDictionary() != null;
+  }
+
   /**
    * SINGLE-VALUED COLUMN APIs
    */
diff --git a/pinot-core/src/main/java/org/apache/pinot/core/operator/ColumnContext.java b/pinot-core/src/main/java/org/apache/pinot/core/operator/ColumnContext.java
index 878ef0b3f95..a94e3a79f3d 100644
--- a/pinot-core/src/main/java/org/apache/pinot/core/operator/ColumnContext.java
+++ b/pinot-core/src/main/java/org/apache/pinot/core/operator/ColumnContext.java
@@ -24,6 +24,7 @@ import org.apache.pinot.core.operator.transform.function.TransformFunction;
 import org.apache.pinot.segment.spi.datasource.DataSource;
 import org.apache.pinot.segment.spi.datasource.DataSourceMetadata;
 import org.apache.pinot.segment.spi.index.reader.Dictionary;
+import org.apache.pinot.segment.spi.index.reader.ForwardIndexReader;
 import org.apache.pinot.spi.data.FieldSpec.DataType;
 
 
@@ -31,13 +32,15 @@ public class ColumnContext {
   private final DataType _dataType;
   private final boolean _isSingleValue;
   private final Dictionary _dictionary;
+  private final boolean _dictionaryEncoded;
   private final DataSource _dataSource;
 
   private ColumnContext(DataType dataType, boolean isSingleValue, @Nullable Dictionary dictionary,
-      @Nullable DataSource dataSource) {
+      boolean dictionaryEncoded, @Nullable DataSource dataSource) {
     _dataType = dataType;
     _isSingleValue = isSingleValue;
     _dictionary = dictionary;
+    _dictionaryEncoded = dictionaryEncoded;
     _dataSource = dataSource;
   }
 
@@ -49,11 +52,24 @@ public class ColumnContext {
     return _isSingleValue;
   }
 
+  /// Returns the column's dictionary file if one exists, regardless of 
whether the forward index can answer
+  /// dictionary-id reads. Callers that need to select between a dict-id read 
path and a value read path MUST gate
+  /// on {@link #isDictionaryEncoded()} rather than {@code getDictionary() != 
null} — a column declared as
+  /// {@code EncodingType.RAW} with an explicit {@code dictionaryIndex} 
returns a non-null dictionary here but its
+  /// forward index throws on {@link ForwardIndexReader#readDictIds}.
   @Nullable
   public Dictionary getDictionary() {
     return _dictionary;
   }
 
+  /// Returns {@code true} if the column's forward index is dictionary-encoded and the dict-id read path
+  /// ({@link org.apache.pinot.core.common.BlockValSet#getDictionaryIdsSV()}) is callable. A column with
+  /// {@code EncodingType.RAW} + an explicit {@code dictionaryIndex} returns {@code false} here even though
+  /// {@link #getDictionary()} is non-null.
+  public boolean isDictionaryEncoded() {
+    return _dictionaryEncoded;
+  }
+
   @Nullable
   public DataSource getDataSource() {
     return _dataSource;
@@ -61,13 +77,22 @@ public class ColumnContext {
 
   public static ColumnContext fromDataSource(DataSource dataSource) {
     DataSourceMetadata dataSourceMetadata = dataSource.getDataSourceMetadata();
-    return new ColumnContext(dataSourceMetadata.getDataType(), dataSourceMetadata.isSingleValue(),
-        dataSource.getDictionary(), dataSource);
+    Dictionary dictionary = dataSource.getDictionary();
+    ForwardIndexReader<?> forwardIndex = dataSource.getForwardIndex();
+    // Dict-id reads require both a dictionary AND a dict-encoded forward index. A column with EncodingType.RAW +
+    // dictionaryIndex has the dictionary but a RAW forward index; a column with a disabled forward index (dict +
+    // inverted/range only) has no forward index at all. Both must take the value/index-based path.
+    boolean dictEncoded = dictionary != null && forwardIndex != null && forwardIndex.isDictionaryEncoded();
+    return new ColumnContext(dataSourceMetadata.getDataType(), dataSourceMetadata.isSingleValue(), dictionary,
+        dictEncoded, dataSource);
   }
 
   public static ColumnContext fromTransformFunction(TransformFunction transformFunction) {
     TransformResultMetadata resultMetadata = transformFunction.getResultMetadata();
-    return new ColumnContext(resultMetadata.getDataType(), resultMetadata.isSingleValue(),
-        transformFunction.getDictionary(), null);
+    Dictionary dictionary = transformFunction.getDictionary();
+    // Transform functions that expose a dictionary always build it themselves, so the dict-id read path is callable
+    // whenever the dictionary is present.
+    return new ColumnContext(resultMetadata.getDataType(), resultMetadata.isSingleValue(), dictionary,
+        dictionary != null, null);
   }
 }
diff --git 
a/pinot-core/src/main/java/org/apache/pinot/core/operator/docvalsets/DataBlockValSet.java
 
b/pinot-core/src/main/java/org/apache/pinot/core/operator/docvalsets/DataBlockValSet.java
index a8bc631b674..37c928c3ac0 100644
--- 
a/pinot-core/src/main/java/org/apache/pinot/core/operator/docvalsets/DataBlockValSet.java
+++ 
b/pinot-core/src/main/java/org/apache/pinot/core/operator/docvalsets/DataBlockValSet.java
@@ -74,6 +74,12 @@ public class DataBlockValSet implements BlockValSet {
     return null;
   }
 
+  /// Data-block value sets never carry a dictionary; the dict-id read methods 
below always throw.
+  @Override
+  public boolean isDictionaryEncoded() {
+    return false;
+  }
+
   @Override
   public int[] getDictionaryIdsSV() {
     throw new UnsupportedOperationException();
diff --git 
a/pinot-core/src/main/java/org/apache/pinot/core/operator/docvalsets/FilteredDataBlockValSet.java
 
b/pinot-core/src/main/java/org/apache/pinot/core/operator/docvalsets/FilteredDataBlockValSet.java
index 2114bb6f1e6..959248f9081 100644
--- 
a/pinot-core/src/main/java/org/apache/pinot/core/operator/docvalsets/FilteredDataBlockValSet.java
+++ 
b/pinot-core/src/main/java/org/apache/pinot/core/operator/docvalsets/FilteredDataBlockValSet.java
@@ -97,6 +97,12 @@ public class FilteredDataBlockValSet implements BlockValSet {
     return null;
   }
 
+  /// Data-block value sets never carry a dictionary; the dict-id read methods 
below always throw.
+  @Override
+  public boolean isDictionaryEncoded() {
+    return false;
+  }
+
   @Override
   public int[] getDictionaryIdsSV() {
     throw new UnsupportedOperationException();
diff --git 
a/pinot-core/src/main/java/org/apache/pinot/core/operator/docvalsets/FilteredRowBasedBlockValSet.java
 
b/pinot-core/src/main/java/org/apache/pinot/core/operator/docvalsets/FilteredRowBasedBlockValSet.java
index ba107a8b9fe..e889ae509ef 100644
--- 
a/pinot-core/src/main/java/org/apache/pinot/core/operator/docvalsets/FilteredRowBasedBlockValSet.java
+++ 
b/pinot-core/src/main/java/org/apache/pinot/core/operator/docvalsets/FilteredRowBasedBlockValSet.java
@@ -94,6 +94,12 @@ public class FilteredRowBasedBlockValSet implements 
BlockValSet {
     return null;
   }
 
+  /// Row-based value sets never carry a dictionary; the dict-id read methods 
below always throw.
+  @Override
+  public boolean isDictionaryEncoded() {
+    return false;
+  }
+
   @Override
   public int[] getDictionaryIdsSV() {
     throw new UnsupportedOperationException();
diff --git 
a/pinot-core/src/main/java/org/apache/pinot/core/operator/docvalsets/ProjectionBlockValSet.java
 
b/pinot-core/src/main/java/org/apache/pinot/core/operator/docvalsets/ProjectionBlockValSet.java
index e0613a4b84f..5bd14e4565d 100644
--- 
a/pinot-core/src/main/java/org/apache/pinot/core/operator/docvalsets/ProjectionBlockValSet.java
+++ 
b/pinot-core/src/main/java/org/apache/pinot/core/operator/docvalsets/ProjectionBlockValSet.java
@@ -25,6 +25,7 @@ import org.apache.pinot.core.common.DataBlockCache;
 import org.apache.pinot.core.operator.ProjectionOperator;
 import org.apache.pinot.segment.spi.datasource.DataSource;
 import org.apache.pinot.segment.spi.index.reader.Dictionary;
+import org.apache.pinot.segment.spi.index.reader.ForwardIndexReader;
 import org.apache.pinot.segment.spi.index.reader.NullValueVectorReader;
 import org.apache.pinot.spi.data.FieldSpec.DataType;
 import org.apache.pinot.spi.trace.InvocationRecording;
@@ -98,6 +99,24 @@ public class ProjectionBlockValSet implements BlockValSet {
     return _dataSource.getDictionary();
   }
 
+  /// Returns {@code true} only when there is both a dictionary AND a dict-encoded forward index. Two cases return
+  /// {@code false} even though {@link #getDictionary()} is non-null:
+  /// <ul>
+  ///   <li>{@code EncodingType.RAW} + an explicit {@code dictionaryIndex}: the forward index throws on
+  ///   {@link ForwardIndexReader#readDictIds}.</li>
+  ///   <li>Disabled forward index (dict + inverted/range only): there is no forward index to read dict IDs from.</li>
+  /// </ul>
+  /// Callers selecting between dict-id and value paths must gate on this method, not {@code getDictionary() != null}.
+  @Override
+  public boolean isDictionaryEncoded() {
+    Dictionary dictionary = _dataSource.getDictionary();
+    if (dictionary == null) {
+      return false;
+    }
+    ForwardIndexReader<?> forwardIndex = _dataSource.getForwardIndex();
+    return forwardIndex != null && forwardIndex.isDictionaryEncoded();
+  }
+
   @Override
   public int[] getDictionaryIdsSV() {
     try (InvocationScope scope = 
Tracing.getTracer().createScope(ProjectionBlockValSet.class)) {
diff --git 
a/pinot-core/src/main/java/org/apache/pinot/core/operator/docvalsets/RowBasedBlockValSet.java
 
b/pinot-core/src/main/java/org/apache/pinot/core/operator/docvalsets/RowBasedBlockValSet.java
index 9d204d70062..20394b3d7c5 100644
--- 
a/pinot-core/src/main/java/org/apache/pinot/core/operator/docvalsets/RowBasedBlockValSet.java
+++ 
b/pinot-core/src/main/java/org/apache/pinot/core/operator/docvalsets/RowBasedBlockValSet.java
@@ -96,6 +96,12 @@ public class RowBasedBlockValSet implements BlockValSet {
     return null;
   }
 
+  /// Row-based value sets never carry a dictionary; the dict-id read methods 
below always throw.
+  @Override
+  public boolean isDictionaryEncoded() {
+    return false;
+  }
+
   @Override
   public int[] getDictionaryIdsSV() {
     throw new UnsupportedOperationException();
diff --git 
a/pinot-core/src/main/java/org/apache/pinot/core/operator/docvalsets/TransformBlockValSet.java
 
b/pinot-core/src/main/java/org/apache/pinot/core/operator/docvalsets/TransformBlockValSet.java
index b8d77415725..360b35ab197 100644
--- 
a/pinot-core/src/main/java/org/apache/pinot/core/operator/docvalsets/TransformBlockValSet.java
+++ 
b/pinot-core/src/main/java/org/apache/pinot/core/operator/docvalsets/TransformBlockValSet.java
@@ -79,6 +79,15 @@ public class TransformBlockValSet implements BlockValSet {
     return _transformFunction.getDictionary();
   }
 
+  /// A transform function that exposes a dictionary always builds it itself (e.g.,
+  /// {@link org.apache.pinot.core.operator.transform.function.IdentifierTransformFunction} only exposes the
+  /// underlying column's dictionary when its forward index is dict-encoded), so the dict-id read path is callable
+  /// whenever the dictionary is present.
+  @Override
+  public boolean isDictionaryEncoded() {
+    return _transformFunction.getDictionary() != null;
+  }
+
   @Override
   public int[] getDictionaryIdsSV() {
     try (InvocationScope scope = 
Tracing.getTracer().createScope(TransformBlockValSet.class)) {
diff --git 
a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/AnyValueAggregationFunction.java
 
b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/AnyValueAggregationFunction.java
index dac8774ee46..26723347874 100644
--- 
a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/AnyValueAggregationFunction.java
+++ 
b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/AnyValueAggregationFunction.java
@@ -183,7 +183,7 @@ public class AnyValueAggregationFunction extends 
NullableSingleInputAggregationF
    */
   private void aggregateHelper(int length, BlockValSet bvs, 
ValueProcessor<Object> processor) {
     // Use dictionary-based access for efficiency when available
-    if (bvs.getDictionary() != null) {
+    if (bvs.isDictionaryEncoded()) {
       final int[] dictIds = bvs.getDictionaryIdsSV();
       final Dictionary dict = bvs.getDictionary();
       forEachNotNull(length, bvs, (from, to) -> {
diff --git 
a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/BaseDistinctAggregateAggregationFunction.java
 
b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/BaseDistinctAggregateAggregationFunction.java
index 25a189c77d7..4becf978571 100644
--- 
a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/BaseDistinctAggregateAggregationFunction.java
+++ 
b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/BaseDistinctAggregateAggregationFunction.java
@@ -199,7 +199,7 @@ public abstract class 
BaseDistinctAggregateAggregationFunction<T extends Compara
    */
   protected void svAggregate(BlockValSet blockValSet, int length, 
AggregationResultHolder aggregationResultHolder) {
     // For dictionary-encoded expression, store dictionary ids into the bitmap
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? 
blockValSet.getDictionary() : null;
     if (dictionary != null) {
       int[] dictIds = blockValSet.getDictionaryIdsSV();
       RoaringBitmap dictIdBitmap = getDictIdBitmap(aggregationResultHolder, 
dictionary);
@@ -293,7 +293,7 @@ public abstract class 
BaseDistinctAggregateAggregationFunction<T extends Compara
    */
   protected void mvAggregate(BlockValSet blockValSet, int length, 
AggregationResultHolder aggregationResultHolder) {
     // For dictionary-encoded expression, store dictionary ids into the bitmap
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? 
blockValSet.getDictionary() : null;
     if (dictionary != null) {
       RoaringBitmap dictIdBitmap = getDictIdBitmap(aggregationResultHolder, 
dictionary);
       int[][] dictIds = blockValSet.getDictionaryIdsMV();
@@ -407,7 +407,7 @@ public abstract class 
BaseDistinctAggregateAggregationFunction<T extends Compara
   protected void svAggregateGroupBySV(BlockValSet blockValSet, int length, 
int[] groupKeyArray,
       GroupByResultHolder groupByResultHolder) {
     // For dictionary-encoded expression, store dictionary ids into the bitmap
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? 
blockValSet.getDictionary() : null;
     if (dictionary != null) {
       int[] dictIds = blockValSet.getDictionaryIdsSV();
 
@@ -501,7 +501,7 @@ public abstract class 
BaseDistinctAggregateAggregationFunction<T extends Compara
   protected void mvAggregateGroupBySV(BlockValSet blockValSet, int length, 
int[] groupKeyArray,
       GroupByResultHolder groupByResultHolder) {
     // For dictionary-encoded expression, store dictionary ids into the bitmap
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? 
blockValSet.getDictionary() : null;
     if (dictionary != null) {
       int[][] dictIds = blockValSet.getDictionaryIdsMV();
       forEachNotNull(length, blockValSet, (from, to) -> {
@@ -619,7 +619,7 @@ public abstract class 
BaseDistinctAggregateAggregationFunction<T extends Compara
   protected void svAggregateGroupByMV(BlockValSet blockValSet, int length, 
int[][] groupKeysArray,
       GroupByResultHolder groupByResultHolder) {
     // For dictionary-encoded expression, store dictionary ids into the bitmap
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? 
blockValSet.getDictionary() : null;
     if (dictionary != null) {
       int[] dictIds = blockValSet.getDictionaryIdsSV();
 
@@ -709,7 +709,7 @@ public abstract class 
BaseDistinctAggregateAggregationFunction<T extends Compara
   protected void mvAggregateGroupByMV(BlockValSet blockValSet, int length, 
int[][] groupKeysArray,
       GroupByResultHolder groupByResultHolder) {
     // For dictionary-encoded expression, store dictionary ids into the bitmap
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? 
blockValSet.getDictionary() : null;
     if (dictionary != null) {
       int[][] dictIds = blockValSet.getDictionaryIdsMV();
 
diff --git 
a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/BaseDistinctCountSmartSketchAggregationFunction.java
 
b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/BaseDistinctCountSmartSketchAggregationFunction.java
index c07fbe61283..52a7351fd35 100644
--- 
a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/BaseDistinctCountSmartSketchAggregationFunction.java
+++ 
b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/BaseDistinctCountSmartSketchAggregationFunction.java
@@ -210,7 +210,7 @@ abstract class 
BaseDistinctCountSmartSketchAggregationFunction
       Map<ExpressionContext, BlockValSet> blockValSetMap) {
     BlockValSet blockValSet = blockValSetMap.get(_expression);
 
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? 
blockValSet.getDictionary() : null;
     if (dictionary != null) {
       // Track which groups were modified to check cardinality only once per 
group per batch
       IntSet modifiedGroups = new IntOpenHashSet();
@@ -347,7 +347,7 @@ abstract class 
BaseDistinctCountSmartSketchAggregationFunction
       Map<ExpressionContext, BlockValSet> blockValSetMap) {
     BlockValSet blockValSet = blockValSetMap.get(_expression);
 
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? 
blockValSet.getDictionary() : null;
     if (dictionary != null) {
       // Track which groups were modified to check cardinality only once per 
group per batch
       IntSet modifiedGroups = new IntOpenHashSet();
diff --git 
a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountBitmapAggregationFunction.java
 
b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountBitmapAggregationFunction.java
index 8a6801454b6..74a7cdd6e2c 100644
--- 
a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountBitmapAggregationFunction.java
+++ 
b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountBitmapAggregationFunction.java
@@ -102,7 +102,7 @@ public class DistinctCountBitmapAggregationFunction extends 
BaseSingleInputAggre
   protected void aggregateSV(int length, AggregationResultHolder 
aggregationResultHolder, BlockValSet blockValSet,
       DataType storedType) {
     // For dictionary-encoded expression, store dictionary ids into the bitmap
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? 
blockValSet.getDictionary() : null;
     if (dictionary != null) {
       int[] dictIds = blockValSet.getDictionaryIdsSV();
       getDictIdBitmap(aggregationResultHolder, dictionary).addN(dictIds, 0, 
length);
@@ -149,7 +149,7 @@ public class DistinctCountBitmapAggregationFunction extends 
BaseSingleInputAggre
   protected void aggregateMV(int length, AggregationResultHolder 
aggregationResultHolder, BlockValSet blockValSet,
       DataType storedType) {
     // For dictionary-encoded expression, store dictionary ids into the bitmap
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? 
blockValSet.getDictionary() : null;
     if (dictionary != null) {
       RoaringBitmap dictIdBitmap = getDictIdBitmap(aggregationResultHolder, 
dictionary);
       int[][] dictIds = blockValSet.getDictionaryIdsMV();
@@ -238,7 +238,7 @@ public class DistinctCountBitmapAggregationFunction extends 
BaseSingleInputAggre
   protected void aggregateSVGroupBySV(int length, int[] groupKeyArray, 
GroupByResultHolder groupByResultHolder,
       BlockValSet blockValSet, DataType storedType) {
     // For dictionary-encoded expression, store dictionary ids into the bitmap
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? 
blockValSet.getDictionary() : null;
     if (dictionary != null) {
       int[] dictIds = blockValSet.getDictionaryIdsSV();
       for (int i = 0; i < length; i++) {
@@ -288,7 +288,7 @@ public class DistinctCountBitmapAggregationFunction extends 
BaseSingleInputAggre
   protected void aggregateMVGroupBySV(int length, int[] groupKeyArray, 
GroupByResultHolder groupByResultHolder,
       BlockValSet blockValSet, DataType storedType) {
     // For dictionary-encoded expression, store dictionary ids into the bitmap
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? 
blockValSet.getDictionary() : null;
     if (dictionary != null) {
       int[][] dictIds = blockValSet.getDictionaryIdsMV();
       for (int i = 0; i < length; i++) {
@@ -381,7 +381,7 @@ public class DistinctCountBitmapAggregationFunction extends 
BaseSingleInputAggre
   protected void aggregateSVGroupByMV(int length, int[][] groupKeysArray, 
GroupByResultHolder groupByResultHolder,
       BlockValSet blockValSet, DataType storedType) {
     // For dictionary-encoded expression, store dictionary ids into the bitmap
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? 
blockValSet.getDictionary() : null;
     if (dictionary != null) {
       int[] dictIds = blockValSet.getDictionaryIdsSV();
       for (int i = 0; i < length; i++) {
@@ -431,7 +431,7 @@ public class DistinctCountBitmapAggregationFunction extends 
BaseSingleInputAggre
   protected void aggregateMVGroupByMV(int length, int[][] groupKeysArray, 
GroupByResultHolder groupByResultHolder,
       BlockValSet blockValSet, DataType storedType) {
     // For dictionary-encoded expression, store dictionary ids into the bitmap
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? 
blockValSet.getDictionary() : null;
     if (dictionary != null) {
       int[][] dictIds = blockValSet.getDictionaryIdsMV();
       for (int i = 0; i < length; i++) {
diff --git 
a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountCPCSketchAggregationFunction.java
 
b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountCPCSketchAggregationFunction.java
index 49862ffd371..fd2b40395f8 100644
--- 
a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountCPCSketchAggregationFunction.java
+++ 
b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountCPCSketchAggregationFunction.java
@@ -157,7 +157,7 @@ public class DistinctCountCPCSketchAggregationFunction
     }
 
     // For dictionary-encoded expression, store dictionary ids into the bitmap
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? blockValSet.getDictionary() : null;
     if (dictionary != null) {
       int[] dictIds = blockValSet.getDictionaryIdsSV();
       getDictIdBitmap(aggregationResultHolder, dictionary).addN(dictIds, 0, length);
@@ -229,7 +229,7 @@ public class DistinctCountCPCSketchAggregationFunction
     }
 
     // For dictionary-encoded expression, store dictionary ids into the bitmap
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? blockValSet.getDictionary() : null;
     if (dictionary != null) {
       int[] dictIds = blockValSet.getDictionaryIdsSV();
       for (int i = 0; i < length; i++) {
@@ -302,7 +302,7 @@ public class DistinctCountCPCSketchAggregationFunction
     }
 
     // For dictionary-encoded expression, store dictionary ids into the bitmap
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? blockValSet.getDictionary() : null;
     if (dictionary != null) {
       int[] dictIds = blockValSet.getDictionaryIdsSV();
       for (int i = 0; i < length; i++) {
diff --git a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountHLLAggregationFunction.java b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountHLLAggregationFunction.java
index 393ce439cbd..2464a48379a 100644
--- a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountHLLAggregationFunction.java
+++ b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountHLLAggregationFunction.java
@@ -114,7 +114,7 @@ public class DistinctCountHLLAggregationFunction extends BaseSingleInputAggregat
   protected void aggregateSV(int length, AggregationResultHolder aggregationResultHolder, BlockValSet blockValSet,
       DataType storedType) {
     // For dictionary-encoded expression, store dictionary ids into the bitmap
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? blockValSet.getDictionary() : null;
     if (dictionary != null) {
       int[] dictIds = blockValSet.getDictionaryIdsSV();
       getDictIdBitmap(aggregationResultHolder, dictionary).addN(dictIds, 0, length);
@@ -162,7 +162,7 @@ public class DistinctCountHLLAggregationFunction extends BaseSingleInputAggregat
   protected void aggregateMV(int length, AggregationResultHolder aggregationResultHolder, BlockValSet blockValSet,
       DataType storedType) {
     // For dictionary-encoded expression, store dictionary ids into the bitmap
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? blockValSet.getDictionary() : null;
     if (dictionary != null) {
       RoaringBitmap dictIdBitmap = getDictIdBitmap(aggregationResultHolder, dictionary);
       int[][] dictIds = blockValSet.getDictionaryIdsMV();
@@ -256,7 +256,7 @@ public class DistinctCountHLLAggregationFunction extends BaseSingleInputAggregat
   protected void aggregateSVGroupBySV(int length, int[] groupKeyArray, GroupByResultHolder groupByResultHolder,
       BlockValSet blockValSet, DataType storedType) {
     // For dictionary-encoded expression, store dictionary ids into the bitmap
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? blockValSet.getDictionary() : null;
     if (dictionary != null) {
       int[] dictIds = blockValSet.getDictionaryIdsSV();
       for (int i = 0; i < length; i++) {
@@ -305,7 +305,7 @@ public class DistinctCountHLLAggregationFunction extends BaseSingleInputAggregat
   protected void aggregateMVGroupBySV(int length, int[] groupKeyArray, GroupByResultHolder groupByResultHolder,
       BlockValSet blockValSet, DataType storedType) {
     // For dictionary-encoded expression, store dictionary ids into the bitmap
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? blockValSet.getDictionary() : null;
     if (dictionary != null) {
       int[][] dictIds = blockValSet.getDictionaryIdsMV();
       for (int i = 0; i < length; i++) {
@@ -405,7 +405,7 @@ public class DistinctCountHLLAggregationFunction extends BaseSingleInputAggregat
   protected void aggregateSVGroupByMV(int length, int[][] groupKeysArray, GroupByResultHolder groupByResultHolder,
       BlockValSet blockValSet, DataType storedType) {
     // For dictionary-encoded expression, store dictionary ids into the bitmap
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? blockValSet.getDictionary() : null;
     if (dictionary != null) {
       int[] dictIds = blockValSet.getDictionaryIdsSV();
       for (int i = 0; i < length; i++) {
@@ -454,7 +454,7 @@ public class DistinctCountHLLAggregationFunction extends BaseSingleInputAggregat
   protected void aggregateMVGroupByMV(int length, int[][] groupKeysArray, GroupByResultHolder groupByResultHolder,
       BlockValSet blockValSet, DataType storedType) {
     // For dictionary-encoded expression, store dictionary ids into the bitmap
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? blockValSet.getDictionary() : null;
     if (dictionary != null) {
       int[][] dictIds = blockValSet.getDictionaryIdsMV();
       for (int i = 0; i < length; i++) {
diff --git a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountHLLPlusAggregationFunction.java b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountHLLPlusAggregationFunction.java
index 67cf83a579f..d839d4b1dec 100644
--- a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountHLLPlusAggregationFunction.java
+++ b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountHLLPlusAggregationFunction.java
@@ -124,7 +124,7 @@ public class DistinctCountHLLPlusAggregationFunction extends BaseSingleInputAggr
   protected void aggregateSV(int length, AggregationResultHolder aggregationResultHolder, BlockValSet blockValSet,
       DataType storedType) {
     // For dictionary-encoded expression, store dictionary ids into the bitmap
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? blockValSet.getDictionary() : null;
     if (dictionary != null) {
       int[] dictIds = blockValSet.getDictionaryIdsSV();
       getDictIdBitmap(aggregationResultHolder, dictionary).addN(dictIds, 0, length);
@@ -173,7 +173,7 @@ public class DistinctCountHLLPlusAggregationFunction extends BaseSingleInputAggr
   protected void aggregateMV(int length, AggregationResultHolder aggregationResultHolder, BlockValSet blockValSet,
       DataType storedType) {
     // For dictionary-encoded expression, store dictionary ids into the bitmap
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? blockValSet.getDictionary() : null;
     if (dictionary != null) {
       RoaringBitmap dictIdBitmap = getDictIdBitmap(aggregationResultHolder, dictionary);
       int[][] dictIds = blockValSet.getDictionaryIdsMV();
@@ -268,7 +268,7 @@ public class DistinctCountHLLPlusAggregationFunction extends BaseSingleInputAggr
   protected void aggregateSVGroupBySV(int length, int[] groupKeyArray, GroupByResultHolder groupByResultHolder,
       BlockValSet blockValSet, DataType storedType) {
     // For dictionary-encoded expression, store dictionary ids into the bitmap
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? blockValSet.getDictionary() : null;
     if (dictionary != null) {
       int[] dictIds = blockValSet.getDictionaryIdsSV();
       for (int i = 0; i < length; i++) {
@@ -318,7 +318,7 @@ public class DistinctCountHLLPlusAggregationFunction extends BaseSingleInputAggr
   protected void aggregateMVGroupBySV(int length, int[] groupKeyArray, GroupByResultHolder groupByResultHolder,
       BlockValSet blockValSet, DataType storedType) {
     // For dictionary-encoded expression, store dictionary ids into the bitmap
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? blockValSet.getDictionary() : null;
     if (dictionary != null) {
       int[][] dictIds = blockValSet.getDictionaryIdsMV();
       for (int i = 0; i < length; i++) {
@@ -419,7 +419,7 @@ public class DistinctCountHLLPlusAggregationFunction extends BaseSingleInputAggr
   protected void aggregateSVGroupByMV(int length, int[][] groupKeysArray, GroupByResultHolder groupByResultHolder,
       BlockValSet blockValSet, DataType storedType) {
     // For dictionary-encoded expression, store dictionary ids into the bitmap
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? blockValSet.getDictionary() : null;
     if (dictionary != null) {
       int[] dictIds = blockValSet.getDictionaryIdsSV();
       for (int i = 0; i < length; i++) {
@@ -469,7 +469,7 @@ public class DistinctCountHLLPlusAggregationFunction extends BaseSingleInputAggr
   protected void aggregateMVGroupByMV(int length, int[][] groupKeysArray, GroupByResultHolder groupByResultHolder,
       BlockValSet blockValSet, DataType storedType) {
     // For dictionary-encoded expression, store dictionary ids into the bitmap
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? blockValSet.getDictionary() : null;
     if (dictionary != null) {
       int[][] dictIds = blockValSet.getDictionaryIdsMV();
       for (int i = 0; i < length; i++) {
diff --git a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountOffHeapAggregationFunction.java b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountOffHeapAggregationFunction.java
index 19d208a4342..c8c5fe88cca 100644
--- a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountOffHeapAggregationFunction.java
+++ b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountOffHeapAggregationFunction.java
@@ -82,7 +82,7 @@ public class DistinctCountOffHeapAggregationFunction
   public void aggregate(int length, AggregationResultHolder aggregationResultHolder,
       Map<ExpressionContext, BlockValSet> blockValSetMap) {
     BlockValSet blockValSet = blockValSetMap.get(_expression);
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? blockValSet.getDictionary() : null;
     if (dictionary != null) {
       // For dictionary-encoded expression, store dictionary ids into the bitmap
       if (blockValSet.isSingleValue()) {
diff --git a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountSmartHLLAggregationFunction.java b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountSmartHLLAggregationFunction.java
index 8b18801f485..49b1df721b9 100644
--- a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountSmartHLLAggregationFunction.java
+++ b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountSmartHLLAggregationFunction.java
@@ -100,7 +100,7 @@ public class DistinctCountSmartHLLAggregationFunction extends BaseDistinctCountS
     BlockValSet blockValSet = blockValSetMap.get(_expression);
 
     // For dictionary-encoded expression, use adaptive conversion strategy
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? blockValSet.getDictionary() : null;
     if (dictionary != null) {
       Object result = aggregationResultHolder.getResult();
       // If already converted to HLL, aggregate directly
diff --git a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountSmartHLLPlusAggregationFunction.java b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountSmartHLLPlusAggregationFunction.java
index cca0d5b143b..52001ae4e5f 100644
--- a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountSmartHLLPlusAggregationFunction.java
+++ b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountSmartHLLPlusAggregationFunction.java
@@ -100,7 +100,7 @@ public class DistinctCountSmartHLLPlusAggregationFunction extends BaseDistinctCo
     BlockValSet blockValSet = blockValSetMap.get(_expression);
 
     // For dictionary-encoded expression, store dictionary ids into the bitmap
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? blockValSet.getDictionary() : null;
     if (dictionary != null) {
       RoaringBitmap dictIdBitmap = getDictIdBitmap(aggregationResultHolder, dictionary);
       if (blockValSet.isSingleValue()) {
diff --git a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountSmartULLAggregationFunction.java b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountSmartULLAggregationFunction.java
index b87c299dc6e..3ad214b948e 100644
--- a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountSmartULLAggregationFunction.java
+++ b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountSmartULLAggregationFunction.java
@@ -97,7 +97,7 @@ public class DistinctCountSmartULLAggregationFunction extends BaseDistinctCountS
     BlockValSet blockValSet = blockValSetMap.get(_expression);
 
     // For dictionary-encoded expression, store dictionary ids into the bitmap
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? blockValSet.getDictionary() : null;
     if (dictionary != null) {
       RoaringBitmap dictIdBitmap = getDictIdBitmap(aggregationResultHolder, dictionary);
       if (blockValSet.isSingleValue()) {
diff --git a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountULLAggregationFunction.java b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountULLAggregationFunction.java
index 753d3ef69a7..ba2d4aa1704 100644
--- a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountULLAggregationFunction.java
+++ b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountULLAggregationFunction.java
@@ -104,7 +104,7 @@ public class DistinctCountULLAggregationFunction extends BaseSingleInputAggregat
     }
 
     // For dictionary-encoded expression, store dictionary ids into the bitmap
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? blockValSet.getDictionary() : null;
     if (dictionary != null) {
       int[] dictIds = blockValSet.getDictionaryIdsSV();
       getDictIdBitmap(aggregationResultHolder, dictionary).addN(dictIds, 0, length);
@@ -177,7 +177,7 @@ public class DistinctCountULLAggregationFunction extends BaseSingleInputAggregat
     }
 
     // For dictionary-encoded expression, store dictionary ids into the bitmap
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? blockValSet.getDictionary() : null;
     if (dictionary != null) {
       int[] dictIds = blockValSet.getDictionaryIdsSV();
       for (int i = 0; i < length; i++) {
@@ -259,7 +259,7 @@ public class DistinctCountULLAggregationFunction extends BaseSingleInputAggregat
     }
 
     // For dictionary-encoded expression, store dictionary ids into the bitmap
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? blockValSet.getDictionary() : null;
     if (dictionary != null) {
       int[] dictIds = blockValSet.getDictionaryIdsSV();
       for (int i = 0; i < length; i++) {
diff --git a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/ModeAggregationFunction.java b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/ModeAggregationFunction.java
index 0ddb46bef5c..3b369507e51 100644
--- a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/ModeAggregationFunction.java
+++ b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/ModeAggregationFunction.java
@@ -264,7 +264,7 @@ public class ModeAggregationFunction
     BlockValSet blockValSet = blockValSetMap.get(_expression);
 
     // For dictionary-encoded expression, store dictionary ids into the dictId map
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? blockValSet.getDictionary() : null;
     if (dictionary != null) {
 
       Int2IntOpenHashMap dictIdValueMap = getDictIdCountMap(aggregationResultHolder, dictionary);
@@ -328,7 +328,7 @@ public class ModeAggregationFunction
     BlockValSet blockValSet = blockValSetMap.get(_expression);
 
     // For dictionary-encoded expression, store dictionary ids into the dictId map
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? blockValSet.getDictionary() : null;
     if (dictionary != null) {
       int[] dictIds = blockValSet.getDictionaryIdsSV();
       forEachNotNull(length, blockValSet, (from, to) -> {
@@ -386,7 +386,7 @@ public class ModeAggregationFunction
     BlockValSet blockValSet = blockValSetMap.get(_expression);
 
     // For dictionary-encoded expression, store dictionary ids into the dictId map
-    Dictionary dictionary = blockValSet.getDictionary();
+    Dictionary dictionary = blockValSet.isDictionaryEncoded() ? blockValSet.getDictionary() : null;
     if (dictionary != null) {
       int[] dictIds = blockValSet.getDictionaryIdsSV();
       forEachNotNull(length, blockValSet, (from, to) -> {
diff --git a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/SegmentPartitionedDistinctCountAggregationFunction.java b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/SegmentPartitionedDistinctCountAggregationFunction.java
index 06ca46c7be5..82c8bf8319e 100644
--- a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/SegmentPartitionedDistinctCountAggregationFunction.java
+++ b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/SegmentPartitionedDistinctCountAggregationFunction.java
@@ -74,7 +74,7 @@ public class SegmentPartitionedDistinctCountAggregationFunction extends BaseSing
     BlockValSet blockValSet = blockValSetMap.get(_expression);
 
     // For dictionary-encoded expression, store dictionary ids into a RoaringBitmap
-    if (blockValSet.getDictionary() != null) {
+    if (blockValSet.isDictionaryEncoded()) {
       int[] dictIds = blockValSet.getDictionaryIdsSV();
       RoaringBitmap bitmap = aggregationResultHolder.getResult();
       if (bitmap == null) {
@@ -165,7 +165,7 @@ public class SegmentPartitionedDistinctCountAggregationFunction extends BaseSing
     BlockValSet blockValSet = blockValSetMap.get(_expression);
 
     // For dictionary-encoded expression, store dictionary ids into a RoaringBitmap
-    if (blockValSet.getDictionary() != null) {
+    if (blockValSet.isDictionaryEncoded()) {
       int[] dictIds = blockValSet.getDictionaryIdsSV();
       for (int i = 0; i < length; i++) {
         setIntValueForGroup(groupByResultHolder, groupKeyArray[i], dictIds[i]);
@@ -224,7 +224,7 @@ public class SegmentPartitionedDistinctCountAggregationFunction extends BaseSing
     BlockValSet blockValSet = blockValSetMap.get(_expression);
 
     // For dictionary-encoded expression, store dictionary ids into a RoaringBitmap
-    if (blockValSet.getDictionary() != null) {
+    if (blockValSet.isDictionaryEncoded()) {
       int[] dictIds = blockValSet.getDictionaryIdsSV();
       for (int i = 0; i < length; i++) {
         int dictId = dictIds[i];
diff --git a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/funnel/AggregationStrategy.java b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/funnel/AggregationStrategy.java
index 298fd4a8052..99006c102ab 100644
--- a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/funnel/AggregationStrategy.java
+++ b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/funnel/AggregationStrategy.java
@@ -145,11 +145,14 @@ public abstract class AggregationStrategy<A> {
   abstract void add(Dictionary dictionary, A aggResult, int step, int correlationId);
 
   private Dictionary getDictionary(Map<ExpressionContext, BlockValSet> blockValSetMap) {
-    final Dictionary primaryCorrelationDictionary = blockValSetMap.get(_primaryCorrelationCol).getDictionary();
-    Preconditions.checkArgument(primaryCorrelationDictionary != null,
+    final BlockValSet primaryCorrelationValSet = blockValSetMap.get(_primaryCorrelationCol);
+    // FUNNELCOUNT requires dict-id reads from the forward index; a column with EncodingType.RAW + dictionaryIndex
+    // exposes a Dictionary but BlockValSet#getDictionaryIdsSV throws on the RAW forward index. Gate on the
+    // explicit forward-index encoding flag rather than dictionary nullness alone.
+    Preconditions.checkArgument(primaryCorrelationValSet.isDictionaryEncoded(),
         "CORRELATE_BY column in FUNNELCOUNT aggregation function not supported, please use a dictionary encoded "
             + "column.");
-    return primaryCorrelationDictionary;
+    return primaryCorrelationValSet.getDictionary();
   }
 
   private int[] getCorrelationIds(Map<ExpressionContext, BlockValSet> blockValSetMap) {
diff --git a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/groupby/DefaultGroupByExecutor.java b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/groupby/DefaultGroupByExecutor.java
index 15cecadd3f7..d5af6d9eae7 100644
--- a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/groupby/DefaultGroupByExecutor.java
+++ b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/groupby/DefaultGroupByExecutor.java
@@ -87,16 +87,10 @@ public class DefaultGroupByExecutor implements GroupByExecutor {
     for (ExpressionContext groupByExpression : groupByExpressions) {
       ColumnContext columnContext = projectOperator.getResultColumnContext(groupByExpression);
       hasMVGroupByExpression |= !columnContext.isSingleValue();
-      // DictionaryBasedGroupKeyGenerator does dict-id reads from the forward 
index — that requires the
-      // forward index to actually be dict-encoded. Columns with a shared dictionary on a RAW forward index
-      // (dict file exists but forward stores raw values) would otherwise be misrouted into the dict-id
-      // path; gate on forward-index encoding so they take the no-dict GROUP BY path instead.
-      // ColumnContext.getDataSource() is null for computed (non-identifier) transforms; in that case
-      // getDictionary() == null already covers them via the first condition.
-      hasNoDictionaryGroupByExpression |= columnContext.getDictionary() == null
-          || (columnContext.getDataSource() != null
-          && columnContext.getDataSource().getForwardIndex() != null
-          && !columnContext.getDataSource().getForwardIndex().isDictionaryEncoded());
+      // A column with EncodingType.RAW + explicit dictionaryIndex has a non-null dictionary but a RAW forward
+      // index that throws on readDictIds; route those through the no-dict GROUP BY generator via the explicit
+      // isDictionaryEncoded() flag rather than gating on dictionary nullness alone.
+      hasNoDictionaryGroupByExpression |= !columnContext.isDictionaryEncoded();
     }
     _hasMVGroupByExpression = hasMVGroupByExpression;
 
diff --git a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/groupby/NoDictionaryMultiColumnGroupKeyGenerator.java b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/groupby/NoDictionaryMultiColumnGroupKeyGenerator.java
index 8f588b49d9e..51e4c7fec66 100644
--- a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/groupby/NoDictionaryMultiColumnGroupKeyGenerator.java
+++ b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/groupby/NoDictionaryMultiColumnGroupKeyGenerator.java
@@ -32,9 +32,7 @@ import org.apache.pinot.core.operator.ColumnContext;
 import org.apache.pinot.core.operator.blocks.ValueBlock;
 import org.apache.pinot.core.query.aggregation.groupby.utils.ValueToIdMap;
 import org.apache.pinot.core.query.aggregation.groupby.utils.ValueToIdMapFactory;
-import org.apache.pinot.segment.spi.datasource.DataSource;
 import org.apache.pinot.segment.spi.index.reader.Dictionary;
-import org.apache.pinot.segment.spi.index.reader.ForwardIndexReader;
 import org.apache.pinot.spi.data.FieldSpec.DataType;
 import org.apache.pinot.spi.utils.ByteArray;
 import org.apache.pinot.spi.utils.FixedIntArray;
@@ -80,14 +78,11 @@ public class NoDictionaryMultiColumnGroupKeyGenerator implements GroupKeyGenerat
       ExpressionContext groupByExpression = groupByExpressions[i];
       ColumnContext columnContext = projectOperator.getResultColumnContext(groupByExpression);
       _storedTypes[i] = columnContext.getDataType().getStoredType();
-      // Only take the dict-id path when the column has a dictionary AND its forward index is dict-encoded.
-      // A column can have a dictionary alongside a RAW forward index (e.g. dict + inverted/range), in which case
-      // BlockValSet#getDictionaryIdsSV would route to ForwardIndexReader#readDictIds and throw on the raw forward
-      // index. Fall back to an on-the-fly dictionary on raw values instead.
-      Dictionary dictionary = _nullHandlingEnabled ? null : columnContext.getDictionary();
-      if (dictionary != null && !hasDictEncodedForwardIndex(columnContext)) {
-        dictionary = null;
-      }
+      // Take the dict-id path only when the forward index is dict-encoded. A column with EncodingType.RAW +
+      // dictionaryIndex exposes a Dictionary but BlockValSet#getDictionaryIdsSV throws on its RAW forward
+      // index — fall back to an on-the-fly dictionary on raw values for that 
case.
+      Dictionary dictionary = _nullHandlingEnabled || !columnContext.isDictionaryEncoded() ? null
+          : columnContext.getDictionary();
       if (dictionary != null) {
         _dictionaries[i] = dictionary;
       } else {
@@ -437,20 +432,6 @@ public class NoDictionaryMultiColumnGroupKeyGenerator implements GroupKeyGenerat
     return new GroupKeyIterator();
   }
 
-  /**
-   * Returns {@code true} if the column has a dict-encoded forward index, i.e. {@link BlockValSet#getDictionaryIdsSV}
-   * is callable. A column referenced by a transform (rather than directly) has no underlying {@link DataSource}; in
-   * that case the transform builds its own dictionary on the fly, so the dict-id path is always usable.
-   */
-  private static boolean hasDictEncodedForwardIndex(ColumnContext columnContext) {
-    DataSource dataSource = columnContext.getDataSource();
-    if (dataSource == null) {
-      return true;
-    }
-    ForwardIndexReader<?> forwardIndex = dataSource.getForwardIndex();
-    return forwardIndex == null || forwardIndex.isDictionaryEncoded();
-  }
-
   /**
    * Helper method to get or create group-id for a group key.
    *
diff --git a/pinot-core/src/main/java/org/apache/pinot/core/query/distinct/DistinctExecutorFactory.java b/pinot-core/src/main/java/org/apache/pinot/core/query/distinct/DistinctExecutorFactory.java
index 4b9bff8cab8..60f7be22de7 100644
--- a/pinot-core/src/main/java/org/apache/pinot/core/query/distinct/DistinctExecutorFactory.java
+++ b/pinot-core/src/main/java/org/apache/pinot/core/query/distinct/DistinctExecutorFactory.java
@@ -72,8 +72,10 @@ public class DistinctExecutorFactory {
       } else {
         orderByExpression = null;
       }
-      Dictionary dictionary = columnContext.getDictionary();
+      // Use the dict-id-based executor only when the forward index is dict-encoded (RAW + dictionaryIndex columns
+      // expose a Dictionary but their forward index throws on readDictIds — 
gate on isDictionaryEncoded()).
       // Note: Use raw value based when ordering is needed and dictionary is not sorted (consuming segments).
+      Dictionary dictionary = columnContext.isDictionaryEncoded() ? columnContext.getDictionary() : null;
       if (dictionary != null && (orderByExpression == null || dictionary.isSorted())) {
         // Dictionary based
         return new DictionaryBasedSingleColumnDistinctExecutor(expression, dictionary, dataType, limit,
@@ -115,9 +117,10 @@ public class DistinctExecutorFactory {
         columnNames[i] = expression.toString();
         columnDataTypes[i] = ColumnDataType.fromDataTypeSV(columnContext.getDataType());
         if (dictionaryBased) {
-          Dictionary dictionary = columnContext.getDictionary();
-          if (dictionary != null) {
-            dictionaries.add(dictionary);
+          // RAW + dictionaryIndex columns expose a Dictionary but the forward index throws on readDictIds; gate
+          // the dict-id-based multi-column executor on the explicit forward-index encoding flag.
+          if (columnContext.isDictionaryEncoded()) {
+            dictionaries.add(columnContext.getDictionary());
           } else {
             dictionaryBased = false;
           }
diff --git a/pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/custom/RawForwardIndexWithDictionaryTest.java b/pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/custom/RawForwardIndexWithDictionaryTest.java
index 2fa08e27adf..93c4703dfcb 100644
--- a/pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/custom/RawForwardIndexWithDictionaryTest.java
+++ b/pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/custom/RawForwardIndexWithDictionaryTest.java
@@ -517,13 +517,12 @@ public class RawForwardIndexWithDictionaryTest extends CustomDataQueryClusterInt
     assertEquals(rawRows, dictRows, "DISTINCT rows must match between dictionary-only and raw+dictionary columns");
   }
 
-  /**
-   * Multi-column GROUP BY that mixes a dict-encoded column with a RAW+dictionary column. This forces the executor
-   * onto the {@code NoDictionaryMultiColumnGroupKeyGenerator} path. The per-column branch there must check the
-   * forward-index encoding in addition to {@code ColumnContext#getDictionary} 
— otherwise it keeps the dictionary
-   * for any column that has a dict file and calls {@code BlockValSet#getDictionaryIdsSV()} on it, which routes to
-   * {@code ForwardIndexReader#readDictIds} and throws {@code UnsupportedOperationException} on a RAW forward index.
-   */
+  /// Multi-column GROUP BY that mixes a dict-encoded column with a 
RAW+dictionary column. Forces the executor onto
+  /// the {@link 
org.apache.pinot.core.query.aggregation.groupby.NoDictionaryMultiColumnGroupKeyGenerator}
 path.
+  /// Before the {@code ColumnContext.isDictionaryEncoded()} gate, the 
per-column branch there picked the dict-id
+  /// path whenever {@code ColumnContext#getDictionary() != null} and then 
called
+  /// {@code BlockValSet#getDictionaryIdsSV()} on the RAW forward index, which 
throws
+  /// {@code UnsupportedOperationException}.
   @Test(dataProvider = "useBothQueryEngines")
   public void 
testMultiColumnGroupByWithRawDictColumnReturnsSameResults(boolean 
useMultiStageQueryEngine)
       throws Exception {
@@ -542,6 +541,170 @@ public class RawForwardIndexWithDictionaryTest extends CustomDataQueryClusterInt
         "Multi-column GROUP BY rows must match between dictionary-only and raw+dictionary columns");
   }
 
+  /// Multi-column DISTINCT exercises {@link org.apache.pinot.core.query.distinct.DistinctExecutorFactory}'s
+  /// multi-column path. Before the {@code ColumnContext.isDictionaryEncoded()} gate, the factory routed to
+  /// {@code DictionaryBasedMultiColumnDistinctExecutor} whenever every column had a non-null dictionary, then
+  /// that executor called {@code BlockValSet#getDictionaryIdsSV()} — which throws on a RAW+dictionary column.
+  @Test(dataProvider = "useBothQueryEngines")
+  public void testMultiColumnDistinctWithRawDictColumnReturnsSameResults(boolean useMultiStageQueryEngine)
+      throws Exception {
+    setUseMultiStageQueryEngine(useMultiStageQueryEngine);
+    JsonNode dictRows = postQuery(
+        String.format("SELECT DISTINCT %s, %s FROM %s ORDER BY %s, %s",
+            DICT_DIMENSION, DICT_INT_DIMENSION, getTableName(), DICT_DIMENSION, DICT_INT_DIMENSION))
+        .get("resultTable").get("rows");
+    JsonNode rawRows = postQuery(
+        String.format("SELECT DISTINCT %s, %s FROM %s ORDER BY %s, %s",
+            RAW_DICT_DIMENSION, RAW_DICT_INT_DIMENSION, getTableName(),
+            RAW_DICT_DIMENSION, RAW_DICT_INT_DIMENSION))
+        .get("resultTable").get("rows");
+    assertEquals(rawRows, dictRows,
+        "Multi-column DISTINCT rows must match between dictionary-only and raw+dictionary columns");
+  }
+
+  /// {@code DISTINCTCOUNT} on a RAW+dictionary column was previously crashing inside
+  /// {@link org.apache.pinot.core.query.aggregation.function.BaseDistinctAggregateAggregationFunction#svAggregate}:
+  /// the executor entered the dict-id path whenever {@code blockValSet.getDictionary() != null}, then called
+  /// {@code blockValSet.getDictionaryIdsSV()} on the RAW forward index. Now gated on
+  /// {@code BlockValSet#isDictionaryEncoded()}, the executor takes the value path instead. The {@code WHERE}
+  /// predicate is required so the query bypasses
+  /// {@link org.apache.pinot.core.operator.query.NonScanBasedAggregationOperator}, which would otherwise serve the
+  /// aggregation directly from the dictionary and hide the regression.
+  @Test(dataProvider = "useBothQueryEngines")
+  public void testDistinctCountWithFilterOnRawDictColumnReturnsSameResults(boolean useMultiStageQueryEngine)
+      throws Exception {
+    setUseMultiStageQueryEngine(useMultiStageQueryEngine);
+    long dictResult = scalarLong(
+        String.format("SELECT DISTINCTCOUNT(%s) FROM %s WHERE %s > 100",
+            DICT_DIMENSION, getTableName(), METRIC_COLUMN));
+    long rawResult = scalarLong(
+        String.format("SELECT DISTINCTCOUNT(%s) FROM %s WHERE %s > 100",
+            RAW_DICT_DIMENSION, getTableName(), METRIC_COLUMN));
+    assertEquals(dictResult, UNIQUE_DIMENSION_VALUES, "Dict baseline must equal the unique value count");
+    assertEquals(rawResult, dictResult,
+        "DISTINCTCOUNT must match between dictionary-only and raw+dictionary columns");
+  }
+
+  /// {@code DISTINCTCOUNTHLL} previously crashed inside {@link
+  /// org.apache.pinot.core.query.aggregation.function.DistinctCountHLLAggregationFunction#aggregate} for the same
+  /// reason as {@code DISTINCTCOUNT}; now gated on {@code BlockValSet#isDictionaryEncoded()}. The {@code WHERE}
+  /// predicate is required to bypass {@link
+  /// org.apache.pinot.core.operator.query.NonScanBasedAggregationOperator}.
+  @Test(dataProvider = "useBothQueryEngines")
+  public void testDistinctCountHLLWithFilterOnRawDictColumnReturnsSameResults(boolean useMultiStageQueryEngine)
+      throws Exception {
+    setUseMultiStageQueryEngine(useMultiStageQueryEngine);
+    long dictResult = scalarLong(
+        String.format("SELECT DISTINCTCOUNTHLL(%s) FROM %s WHERE %s > 100",
+            DICT_DIMENSION, getTableName(), METRIC_COLUMN));
+    long rawResult = scalarLong(
+        String.format("SELECT DISTINCTCOUNTHLL(%s) FROM %s WHERE %s > 100",
+            RAW_DICT_DIMENSION, getTableName(), METRIC_COLUMN));
+    // Sanity floor: catches a regression where both queries silently return 0 (e.g., planner short-circuit).
+    assertTrue(dictResult > 0, "Dict baseline DISTINCTCOUNTHLL must be > 0");
+    assertEquals(rawResult, dictResult,
+        "DISTINCTCOUNTHLL must match between dictionary-only and raw+dictionary columns");
+  }
+
+  /// {@code DISTINCTCOUNTBITMAP} previously crashed inside {@link
+  /// org.apache.pinot.core.query.aggregation.function.DistinctCountBitmapAggregationFunction#aggregate}; now gated
+  /// on {@code BlockValSet#isDictionaryEncoded()}. Unlike {@code DISTINCTCOUNT} / {@code DISTINCTCOUNTHLL}, this
+  /// function is NOT in {@code AggregationPlanNode#DICTIONARY_BASED_FUNCTIONS}, so the bug surfaced even without a
+  /// {@code WHERE}.
+  @Test(dataProvider = "useBothQueryEngines")
+  public void testDistinctCountBitmapOnRawDictColumnReturnsSameResults(boolean useMultiStageQueryEngine)
+      throws Exception {
+    setUseMultiStageQueryEngine(useMultiStageQueryEngine);
+    long dictResult = scalarLong(
+        String.format("SELECT DISTINCTCOUNTBITMAP(%s) FROM %s", DICT_INT_DIMENSION, getTableName()));
+    long rawResult = scalarLong(
+        String.format("SELECT DISTINCTCOUNTBITMAP(%s) FROM %s", RAW_DICT_INT_DIMENSION, getTableName()));
+    assertEquals(dictResult, UNIQUE_DIMENSION_VALUES, "Dict baseline must equal the unique value count");
+    assertEquals(rawResult, dictResult,
+        "DISTINCTCOUNTBITMAP must match between dictionary-only and raw+dictionary columns");
+  }
+
+  /// {@code SEGMENTPARTITIONEDDISTINCTCOUNT} previously crashed inside {@link
+  /// org.apache.pinot.core.query.aggregation.function.SegmentPartitionedDistinctCountAggregationFunction#aggregate}
+  /// for the same reason as the other dict-id aggregators; now gated on {@code BlockValSet#isDictionaryEncoded()}.
+  /// The {@code WHERE} predicate is required to bypass
+  /// {@link org.apache.pinot.core.operator.query.NonScanBasedAggregationOperator}.
+  @Test(dataProvider = "useBothQueryEngines")
+  public void testSegmentPartitionedDistinctCountWithFilterOnRawDictColumnReturnsSameResults(
+      boolean useMultiStageQueryEngine)
+      throws Exception {
+    setUseMultiStageQueryEngine(useMultiStageQueryEngine);
+    long dictResult = scalarLong(
+        String.format("SELECT SEGMENTPARTITIONEDDISTINCTCOUNT(%s) FROM %s WHERE %s > 100",
+            DICT_DIMENSION, getTableName(), METRIC_COLUMN));
+    long rawResult = scalarLong(
+        String.format("SELECT SEGMENTPARTITIONEDDISTINCTCOUNT(%s) FROM %s WHERE %s > 100",
+            RAW_DICT_DIMENSION, getTableName(), METRIC_COLUMN));
+    // Sanity floor: catches a regression where both queries silently return 0 (e.g., planner short-circuit).
+    assertTrue(dictResult > 0, "Dict baseline SEGMENTPARTITIONEDDISTINCTCOUNT must be > 0");
+    assertEquals(rawResult, dictResult,
+        "SEGMENTPARTITIONEDDISTINCTCOUNT must match between dictionary-only and raw+dictionary columns");
+  }
+
+  /// {@code MODE} previously crashed inside {@link
+  /// org.apache.pinot.core.query.aggregation.function.ModeAggregationFunction#aggregate} for the same reason; now
+  /// gated on {@code BlockValSet#isDictionaryEncoded()}. {@code MODE} is NOT in
+  /// {@code AggregationPlanNode#DICTIONARY_BASED_FUNCTIONS}, so the bug surfaced even without a {@code WHERE}.
+  @Test(dataProvider = "useBothQueryEngines")
+  public void testModeOnRawDictColumnReturnsSameResults(boolean useMultiStageQueryEngine)
+      throws Exception {
+    setUseMultiStageQueryEngine(useMultiStageQueryEngine);
+    JsonNode dictRows = postQuery(
+        String.format("SELECT MODE(%s) FROM %s", DICT_INT_DIMENSION, getTableName()))
+        .get("resultTable").get("rows");
+    JsonNode rawRows = postQuery(
+        String.format("SELECT MODE(%s) FROM %s", RAW_DICT_INT_DIMENSION, getTableName()))
+        .get("resultTable").get("rows");
+    assertEquals(rawRows, dictRows,
+        "MODE rows must match between dictionary-only and raw+dictionary columns");
+  }
+
+  /// Single-column {@code DISTINCT} with a filter exercises
+  /// {@link org.apache.pinot.core.query.distinct.DistinctExecutorFactory}'s single-column path. Without a filter
+  /// the query routes to {@link org.apache.pinot.core.operator.query.DictionaryBasedDistinctOperator} which
+  /// iterates the dictionary directly and never hits the bug; with a filter it goes through
+  /// {@link org.apache.pinot.core.query.distinct.dictionary.DictionaryBasedSingleColumnDistinctExecutor}, which
+  /// previously called {@code BlockValSet#getDictionaryIdsSV()} and threw on the RAW forward index — now gated on
+  /// {@code ColumnContext#isDictionaryEncoded()}.
+  @Test(dataProvider = "useBothQueryEngines")
+  public void testDistinctWithFilterOnRawDictColumnReturnsSameResults(boolean useMultiStageQueryEngine)
+      throws Exception {
+    setUseMultiStageQueryEngine(useMultiStageQueryEngine);
+    JsonNode dictRows = postQuery(
+        String.format("SELECT DISTINCT %s FROM %s WHERE %s > 100 ORDER BY %s",
+            DICT_DIMENSION, getTableName(), METRIC_COLUMN, DICT_DIMENSION))
+        .get("resultTable").get("rows");
+    JsonNode rawRows = postQuery(
+        String.format("SELECT DISTINCT %s FROM %s WHERE %s > 100 ORDER BY %s",
+            RAW_DICT_DIMENSION, getTableName(), METRIC_COLUMN, RAW_DICT_DIMENSION))
+        .get("resultTable").get("rows");
+    assertEquals(rawRows, dictRows,
+        "DISTINCT (with filter) rows must match between dictionary-only and raw+dictionary columns");
+  }
+
+  /// Exercise the transform path: a non-identifier expression over a RAW+dictionary column. Goes through
+  /// {@link org.apache.pinot.core.operator.docvalsets.TransformBlockValSet}, whose {@code isDictionaryEncoded()}
+  /// returns whether the wrapping transform exposes its own dictionary — {@code UPPER} does not, so the executor
+  /// must take the value path. Regression coverage for raghavyadav01's review comment on PR #18504.
+  @Test(dataProvider = "useBothQueryEngines")
+  public void testDistinctOnTransformOfRawDictColumnReturnsSameResults(boolean useMultiStageQueryEngine)
+      throws Exception {
+    setUseMultiStageQueryEngine(useMultiStageQueryEngine);
+    JsonNode dictRows = postQuery(
+        String.format("SELECT DISTINCT UPPER(%s) FROM %s ORDER BY UPPER(%s)",
+            DICT_DIMENSION, getTableName(), DICT_DIMENSION)).get("resultTable").get("rows");
+    JsonNode rawRows = postQuery(
+        String.format("SELECT DISTINCT UPPER(%s) FROM %s ORDER BY UPPER(%s)",
+            RAW_DICT_DIMENSION, getTableName(), RAW_DICT_DIMENSION)).get("resultTable").get("rows");
+    assertEquals(rawRows, dictRows,
+        "DISTINCT(transform) rows must match between dictionary-only and raw+dictionary columns");
+  }
+
   @Test(dataProvider = "useBothQueryEngines")
   public void testAggregationWithGroupByOnRawDictColumnReturnsSameResults(boolean useMultiStageQueryEngine)
       throws Exception {
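
The gate that these tests exercise can be illustrated with a minimal, self-contained sketch. This is not Pinot code: `SketchValSet`, `DictEncoded`, `RawWithDictionary`, and `distinctCount` are hypothetical names invented for illustration, and the real `BlockValSet` / `ColumnContext` APIs are considerably richer. The sketch assumes only what the commit message states: the default `isDictionaryEncoded()` falls back to `getDictionary() != null`, while a RAW forward index with a dictionary on disk must report `false` so that callers take the value path.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class DictionaryGateSketch {

  /** Hypothetical, simplified stand-in for a block value set; not Pinot's actual interface. */
  interface SketchValSet {
    String[] getDictionary();   // null when the column has no dictionary
    int[] getDictionaryIds();   // unsupported on a RAW forward index
    String[] getValues();

    // Fallback described in the commit message: a dictionary implies dict encoding.
    default boolean isDictionaryEncoded() {
      return getDictionary() != null;
    }
  }

  /** Dictionary-encoded forward index: dict ids are readable. */
  static class DictEncoded implements SketchValSet {
    final String[] _values;
    final String[] _dictionary;

    DictEncoded(String... values) {
      _values = values;
      _dictionary = Arrays.stream(values).distinct().sorted().toArray(String[]::new);
    }

    @Override
    public String[] getDictionary() {
      return _dictionary;
    }

    @Override
    public String[] getValues() {
      return _values;
    }

    @Override
    public int[] getDictionaryIds() {
      return Arrays.stream(_values).mapToInt(v -> Arrays.binarySearch(_dictionary, v)).toArray();
    }
  }

  /** RAW forward index plus a dictionary: the dictionary exists, but dict ids cannot be read. */
  static class RawWithDictionary extends DictEncoded {
    RawWithDictionary(String... values) {
      super(values); // reuses the dictionary construction only
    }

    @Override
    public int[] getDictionaryIds() {
      throw new UnsupportedOperationException("RAW forward index has no dict ids");
    }

    @Override
    public boolean isDictionaryEncoded() {
      return false; // consult the forward-index encoding, not the dictionary's presence
    }
  }

  /** Gates the dict-id path on the encoding rather than on getDictionary() != null. */
  static int distinctCount(SketchValSet valSet) {
    if (valSet.isDictionaryEncoded()) {
      Set<Integer> ids = new HashSet<>();
      for (int id : valSet.getDictionaryIds()) {
        ids.add(id);
      }
      return ids.size();
    }
    return new HashSet<>(Arrays.asList(valSet.getValues())).size();
  }

  public static void main(String[] args) {
    System.out.println(distinctCount(new DictEncoded("a", "b", "a")));
    System.out.println(distinctCount(new RawWithDictionary("a", "b", "a")));
  }
}
```

In this sketch both calls in `main` return the same count, which mirrors the shape of the assertions above: the dictionary-only column and the RAW+dictionary column must agree. Gating `distinctCount` on `getDictionary() != null` instead would send the `RawWithDictionary` instance down the dict-id path and surface the `UnsupportedOperationException`.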

