Jackie-Jiang commented on code in PR #18368:
URL: https://github.com/apache/pinot/pull/18368#discussion_r3251590670
##########
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/StandardIndexes.java:
##########
@@ -142,4 +146,12 @@ public static IndexType<VectorIndexConfig, VectorIndexReader, VectorIndexCreator
return (IndexType<VectorIndexConfig, VectorIndexReader, VectorIndexCreator>)
IndexService.getInstance().get(VECTOR_ID);
}
+
+ /// Returns the MAP index type, which materializes MAP column keys as virtual columns.
+ @SuppressWarnings("unchecked")
+ public static IndexType<MapIndexConfig, ColumnarMapIndexReader, ColumnarMapIndexCreator>
+ map() {
Review Comment:
(minor) Reformat
##########
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/datasource/MapDataSource.java:
##########
@@ -33,6 +33,15 @@ public interface MapDataSource extends DataSource {
/// Returns the DataSource for the given map key's values.
DataSource getDataSource(String key);
+ /// Returns whether this segment MAY contain the given key. Implementations are allowed to return
+ /// `true` conservatively (i.e., when it is not possible to determine key presence without a
+ /// full scan). Callers must handle the case where the key is absent even when this returns
+ /// `true` — [#getDataSource(String)] will return a DataSource for an absent
key
+ /// (forward-index reads return the column default value; null-value bitmap marks all rows as null).
+ default boolean mayContainKey(String key) {
Review Comment:
I don't understand this API. How does a caller use it if the result is not
deterministic?
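For reference, a "may contain" check of this shape is usually consumed the way a Bloom filter is: a `false` answer is definitive and lets the caller skip work, while a `true` answer still requires handling an absent key. The sketch below illustrates that contract only. `KeyedSource`, `ExactSource`, and `readOrDefault` are hypothetical stand-ins, not the PR's code and not Pinot's `MapDataSource`.

```java
import java.util.HashMap;
import java.util.Map;

public class MayContainKeySketch {
  /// Hypothetical stand-in for a per-segment MAP source; not Pinot's MapDataSource.
  interface KeyedSource {
    /// Returns the value stored for the key, or null when the key is absent.
    String get(String key);

    /// Conservative default: `true` means "unknown"; only `false` is definitive.
    default boolean mayContainKey(String key) {
      return true;
    }
  }

  /// Source backed by an in-memory map that can answer key presence exactly.
  static final class ExactSource implements KeyedSource {
    private final Map<String, String> _values = new HashMap<>();

    ExactSource put(String key, String value) {
      _values.put(key, value);
      return this;
    }

    @Override
    public String get(String key) {
      return _values.get(key);
    }

    @Override
    public boolean mayContainKey(String key) {
      return _values.containsKey(key);
    }
  }

  /// Skips the read when the key is definitely absent; otherwise still handles absence.
  static String readOrDefault(KeyedSource source, String key, String defaultValue) {
    if (!source.mayContainKey(key)) {
      return defaultValue;
    }
    String value = source.get(key);
    return value != null ? value : defaultValue;
  }

  public static void main(String[] args) {
    KeyedSource exact = new ExactSource().put("latency", "12");
    // Always answers "maybe" (inherits the default) but holds no keys.
    KeyedSource conservative = key -> null;
    System.out.println(readOrDefault(exact, "latency", "n/a"));
    System.out.println(readOrDefault(exact, "errors", "n/a"));
    System.out.println(readOrDefault(conservative, "latency", "n/a"));
  }
}
```

Under this reading the method is only useful as a pruning hint: correctness must never depend on a `true` result.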
##########
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/creator/SegmentGeneratorConfig.java:
##########
@@ -530,6 +530,23 @@ public List<String> getComplexColumnNames() {
return getQualifyingFields(FieldType.COMPLEX, true);
}
+ /// Returns the names of columns that have MAP index enabled in the table config.
+ public List<String> getMapIndexColumnNames() {
Review Comment:
`_indexConfigsByColName` is available. You can use
`FieldIndexConfigsUtil.columnsWithIndexEnabled()` to return the MAP-index-enabled
columns. It also checks whether the index is enabled. Here you can directly
return a `Set`.
##########
pinot-spi/src/main/java/org/apache/pinot/spi/data/ComplexFieldSpec.java:
##########
@@ -58,17 +60,29 @@ public final class ComplexFieldSpec extends FieldSpec {
private final Map<String, FieldSpec> _childFieldSpecs;
+ @JsonProperty("keyTypes")
+ private Map<String, DataType> _keyTypes;
Review Comment:
These are value types, right?
How do you plan to use this config? Ideally, the value types should be
auto-detected.
##########
pinot-spi/src/main/java/org/apache/pinot/spi/data/ComplexFieldSpec.java:
##########
@@ -58,17 +60,29 @@ public final class ComplexFieldSpec extends FieldSpec {
private final Map<String, FieldSpec> _childFieldSpecs;
+ @JsonProperty("keyTypes")
+ private Map<String, DataType> _keyTypes;
Review Comment:
Also, `DataType` might not be enough. How do you handle SV/MV?
##########
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/creator/SegmentGeneratorConfig.java:
##########
@@ -530,6 +530,23 @@ public List<String> getComplexColumnNames() {
return getQualifyingFields(FieldType.COMPLEX, true);
}
+ /// Returns the names of columns that have MAP index enabled in the table config.
+ public List<String> getMapIndexColumnNames() {
Review Comment:
Alternatively, the caller can directly use `getIndexConfigsByColName()`. I
think that is more consistent with existing usages.
##########
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/V1Constants.java:
##########
@@ -173,6 +174,16 @@ public static class Column {
// Optional, default false
public static final String IS_AUTO_GENERATED = "isAutoGenerated";
+ // Optional, default false. True for materialized columns produced from a MAP parent column.
+ // Example: for a MAP column "metrics", materialized column "value_map_string$__latency" has:
+ // mapMaterializedColumn = true
+ // parentMapColumn = metrics
+ // Materialized columns appear in index_map alongside regular columns with their own forward index,
+ // optional inverted index, and null-value vector.
+ public static final String IS_MAP_MATERIALIZED_COLUMN = "mapMaterializedColumn";
Review Comment:
To keep the naming consistent:
```suggestion
public static final String IS_MAP_MATERIALIZED_COLUMN = "isMapMaterializedColumn";
```
Is this even needed? If `parentMapColumn` exists, it is a materialized column,
right?
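A minimal sketch of that idea, assuming the flag can be derived from the presence of `parentMapColumn`. The `column.<name>.<property>` key layout, the class name, and the column name `metrics$__latency` are illustrative assumptions, not the PR's metadata code.

```java
import java.util.Properties;

public class MaterializedColumnSketch {
  // Property name taken from the diff; the key layout below is a simplified assumption.
  static final String PARENT_MAP_COLUMN = "parentMapColumn";

  /// Builds the hypothetical property key `column.<name>.<property>`.
  static String key(String column, String property) {
    return "column." + column + "." + property;
  }

  /// A column is treated as materialized exactly when it records a parent MAP column.
  static boolean isMapMaterialized(Properties metadata, String column) {
    return metadata.getProperty(key(column, PARENT_MAP_COLUMN)) != null;
  }

  public static void main(String[] args) {
    Properties metadata = new Properties();
    metadata.setProperty(key("metrics$__latency", PARENT_MAP_COLUMN), "metrics");
    System.out.println(isMapMaterialized(metadata, "metrics$__latency"));
    System.out.println(isMapMaterialized(metadata, "metrics"));
  }
}
```

With a single source of truth, the two properties cannot disagree (for example, a flag set to `true` with no parent recorded).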
##########
pinot-spi/src/main/java/org/apache/pinot/spi/data/MapNaming.java:
##########
@@ -0,0 +1,57 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.spi.data;
+
+
+/// Naming convention for MAP materialized columns. Each dense MAP key is stored as
+/// a column named `<mapColumn>$__<key>`. Sparse keys share a single synthetic JSON column
+/// named `<mapColumn>$____sparse__`.
+public final class MapNaming {
+ public static final String SEPARATOR = "$__";
Review Comment:
I don't think we need these underscores. We can treat `$` as a reserved
character.
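A rough sketch of what a bare `$` separator could look like, assuming `$` is rejected in regular (parent) column names so the first `$` always marks the boundary. The class and method names are hypothetical and this is not the PR's `MapNaming`.

```java
public class DollarSeparatorSketch {
  // Assumption: '$' never appears in a regular (parent) column name.
  static final char SEPARATOR = '$';

  /// Builds `<mapColumn>$<key>`, rejecting parent names that contain the separator.
  static String materializedName(String mapColumn, String key) {
    if (mapColumn.indexOf(SEPARATOR) >= 0) {
      throw new IllegalArgumentException("Column name must not contain '$': " + mapColumn);
    }
    return mapColumn + SEPARATOR + key;
  }

  /// Returns the parent MAP column, or null when the name is not a materialized column.
  static String parentOf(String columnName) {
    int index = columnName.indexOf(SEPARATOR);
    return index < 0 ? null : columnName.substring(0, index);
  }

  /// Returns the MAP key, or null when the name is not a materialized column.
  static String keyOf(String columnName) {
    int index = columnName.indexOf(SEPARATOR);
    return index < 0 ? null : columnName.substring(index + 1);
  }

  public static void main(String[] args) {
    String name = materializedName("metrics", "latency");
    System.out.println(name);
    System.out.println(parentOf(name));
    System.out.println(keyOf(name));
  }
}
```

Splitting on the first `$` keeps parsing unambiguous even when a key itself contains `$`, as long as parent column names never do.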