Jackie-Jiang commented on code in PR #18368:
URL: https://github.com/apache/pinot/pull/18368#discussion_r3251590670


##########
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/StandardIndexes.java:
##########
@@ -142,4 +146,12 @@ public static IndexType<VectorIndexConfig, VectorIndexReader, VectorIndexCreator
     return (IndexType<VectorIndexConfig, VectorIndexReader, VectorIndexCreator>)
         IndexService.getInstance().get(VECTOR_ID);
   }
+
+  /// Returns the MAP index type, which materializes MAP column keys as virtual columns.
+  @SuppressWarnings("unchecked")
+  public static IndexType<MapIndexConfig, ColumnarMapIndexReader, ColumnarMapIndexCreator>
+      map() {

Review Comment:
   (minor) Reformat



##########
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/datasource/MapDataSource.java:
##########
@@ -33,6 +33,15 @@ public interface MapDataSource extends DataSource {
   /// Returns the DataSource for the given map key's values.
   DataSource getDataSource(String key);
 
+  /// Returns whether this segment MAY contain the given key. Implementations are allowed to return
+  /// `true` conservatively (i.e., when it is not possible to determine key presence without a
+  /// full scan). Callers must handle the case where the key is absent even when this returns
+  /// `true` — [#getDataSource(String)] will return a DataSource for an absent key
+  /// (forward-index reads return the column default value; null-value bitmap marks all rows as null).
+  default boolean mayContainKey(String key) {

Review Comment:
   I don't understand this API. How does a caller use it if it is not deterministic?
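
For context, a conservative check like this is usually consumed the way a Bloom filter is: `false` is a definite miss that allows skipping work, while `true` only means "go and look". The sketch below illustrates that contract under stated assumptions. `KeyedSegment`, `containsKeyExact`, and `KeyPruner` are hypothetical names for illustration and are not part of the PR or of Pinot.

```java
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a segment; this is NOT Pinot's MapDataSource.
interface KeyedSegment {
  // May return true for an absent key, but must never return false for a present key.
  boolean mayContainKey(String key);

  // Exact but potentially expensive check (for example, a full scan).
  boolean containsKeyExact(String key);
}

// A segment that knows its key set, so its cheap check is exact.
final class ExactSegment implements KeyedSegment {
  private final Set<String> _keys;

  ExactSegment(Set<String> keys) {
    _keys = keys;
  }

  public boolean mayContainKey(String key) {
    return _keys.contains(key);
  }

  public boolean containsKeyExact(String key) {
    return _keys.contains(key);
  }
}

// A segment that cannot answer cheaply, so it always answers "maybe".
final class ConservativeSegment implements KeyedSegment {
  private final Set<String> _keys;

  ConservativeSegment(Set<String> keys) {
    _keys = keys;
  }

  public boolean mayContainKey(String key) {
    return true;
  }

  public boolean containsKeyExact(String key) {
    return _keys.contains(key);
  }
}

final class KeyPruner {
  // Counts the segments that really contain the key, skipping definite misses.
  static long countSegmentsWithKey(List<KeyedSegment> segments, String key) {
    long count = 0;
    for (KeyedSegment segment : segments) {
      if (!segment.mayContainKey(key)) {
        continue; // Definite miss: safe to skip without any further work.
      }
      if (segment.containsKeyExact(key)) {
        count++; // "Maybe" was confirmed by the exact check.
      }
    }
    return count;
  }
}
```

With this shape the caller never relies on `true` being accurate; the only guarantee it uses is that `false` is never wrong.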



##########
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/creator/SegmentGeneratorConfig.java:
##########
@@ -530,6 +530,23 @@ public List<String> getComplexColumnNames() {
     return getQualifyingFields(FieldType.COMPLEX, true);
   }
 
+  /// Returns the names of columns that have MAP index enabled in the table config.
+  public List<String> getMapIndexColumnNames() {

Review Comment:
   `_indexConfigsByColName` is available. You can use
   `FieldIndexConfigsUtil.columnsWithIndexEnabled()` to return the MAP-enabled
   columns. It also checks whether the index is enabled, and you can return a
   `Set` directly here.
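
As a rough illustration of the shape being suggested (derive the column set from the per-column index configs and return a `Set`), here is a plain-Java sketch. It does not call Pinot: the `Map<String, Boolean>` stands in for the real per-column index configs, and `columnsWithMapIndexEnabled` is a hypothetical helper, not the actual `FieldIndexConfigsUtil` method.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class MapIndexColumns {
  // Assumption: the Boolean value models "MAP index is configured AND enabled" for the column.
  static Set<String> columnsWithMapIndexEnabled(Map<String, Boolean> mapIndexEnabledByColName) {
    Set<String> columns = new TreeSet<>();
    for (Map.Entry<String, Boolean> entry : mapIndexEnabledByColName.entrySet()) {
      if (Boolean.TRUE.equals(entry.getValue())) {
        columns.add(entry.getKey());
      }
    }
    return columns;
  }
}
```

Returning a `Set` makes membership checks cheap for callers and avoids implying an ordering that a `List` would suggest.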



##########
pinot-spi/src/main/java/org/apache/pinot/spi/data/ComplexFieldSpec.java:
##########
@@ -58,17 +60,29 @@ public final class ComplexFieldSpec extends FieldSpec {
 
   private final Map<String, FieldSpec> _childFieldSpecs;
 
+  @JsonProperty("keyTypes")
+  private Map<String, DataType> _keyTypes;

Review Comment:
   These are value types, right?
   How do you plan to use this config? Ideally the value types should be
   auto-detected.



##########
pinot-spi/src/main/java/org/apache/pinot/spi/data/ComplexFieldSpec.java:
##########
@@ -58,17 +60,29 @@ public final class ComplexFieldSpec extends FieldSpec {
 
   private final Map<String, FieldSpec> _childFieldSpecs;
 
+  @JsonProperty("keyTypes")
+  private Map<String, DataType> _keyTypes;

Review Comment:
   Also, `DataType` might not be enough. How do you handle SV/MV?



##########
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/creator/SegmentGeneratorConfig.java:
##########
@@ -530,6 +530,23 @@ public List<String> getComplexColumnNames() {
     return getQualifyingFields(FieldType.COMPLEX, true);
   }
 
+  /// Returns the names of columns that have MAP index enabled in the table config.
+  public List<String> getMapIndexColumnNames() {

Review Comment:
   Alternatively, the caller can use `getIndexConfigsByColName()` directly. I
   think that is more consistent with existing usages.



##########
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/V1Constants.java:
##########
@@ -173,6 +174,16 @@ public static class Column {
       // Optional, default false
       public static final String IS_AUTO_GENERATED = "isAutoGenerated";
 
+      // Optional, default false. True for materialized columns produced from a MAP parent column.
+      // Example: for a MAP column "metrics", materialized column "value_map_string$__latency" has:
+      //   mapMaterializedColumn = true
+      //   parentMapColumn       = metrics
+      // Materialized columns appear in index_map alongside regular columns with their own forward index,
+      // optional inverted index, and null-value vector.
+      public static final String IS_MAP_MATERIALIZED_COLUMN = "mapMaterializedColumn";

Review Comment:
   To keep it consistent:
   ```suggestion
         public static final String IS_MAP_MATERIALIZED_COLUMN = "isMapMaterializedColumn";
   ```
   
   Is this even needed? If `parentMapColumn` exists, it is a materialized
   column, right?
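
A minimal sketch of the alternative raised here, where "is materialized" is derived from the presence of `parentMapColumn` instead of being stored as a separate flag. The `column.<name>.<property>` key layout and the `MaterializedColumnCheck` class are assumptions made for illustration; only the `parentMapColumn` property name comes from the diff.

```java
import java.util.Properties;

final class MaterializedColumnCheck {
  static final String PARENT_MAP_COLUMN = "parentMapColumn";

  // Assumed metadata key layout for this sketch: "column.<name>.<property>".
  static String metadataKey(String column, String property) {
    return "column." + column + "." + property;
  }

  // A column counts as MAP-materialized exactly when it records a parent MAP column.
  static boolean isMapMaterialized(Properties metadata, String column) {
    return metadata.getProperty(metadataKey(column, PARENT_MAP_COLUMN)) != null;
  }
}
```

Deriving the flag removes the possibility of the two properties disagreeing, at the cost of making the check implicit.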



##########
pinot-spi/src/main/java/org/apache/pinot/spi/data/MapNaming.java:
##########
@@ -0,0 +1,57 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.spi.data;
+
+
+/// Naming convention for MAP materialized columns. Each dense MAP key is stored as
+/// a column named `<mapColumn>$__<key>`. Sparse keys share a single synthetic JSON column
+/// named `<mapColumn>$____sparse__`.
+public final class MapNaming {
+  public static final String SEPARATOR = "$__";

Review Comment:
   I don't think we need these underscores. We can treat `$` as a reserved
   character.
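
To make the suggestion concrete, the sketch below shows what naming could look like with a bare `$` separator, assuming `$` is rejected in MAP column names so that the first `$` always marks the boundary. `DollarMapNaming` and its methods are hypothetical and are not the PR's `MapNaming` class.

```java
final class DollarMapNaming {
  static final char SEPARATOR = '$';

  // Builds "<mapColumn>$<key>"; the MAP column name itself must not contain '$'.
  static String materializedName(String mapColumn, String key) {
    if (mapColumn.indexOf(SEPARATOR) >= 0) {
      throw new IllegalArgumentException("'$' is reserved in MAP column names: " + mapColumn);
    }
    return mapColumn + SEPARATOR + key;
  }

  // Returns the parent MAP column, or null if the name is not a materialized column name.
  static String parentOf(String columnName) {
    int idx = columnName.indexOf(SEPARATOR);
    return idx < 0 ? null : columnName.substring(0, idx);
  }

  // Returns the MAP key, or null if the name is not a materialized column name.
  // Splitting on the FIRST '$' lets keys themselves contain '$'.
  static String keyOf(String columnName) {
    int idx = columnName.indexOf(SEPARATOR);
    return idx < 0 ? null : columnName.substring(idx + 1);
  }
}
```

The scheme only round-trips if `$` really is reserved for regular column names, which would need to be enforced at schema validation time.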



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
