Jackie-Jiang commented on code in PR #11824:
URL: https://github.com/apache/pinot/pull/11824#discussion_r1381215533
##########
pinot-query-planner/src/test/java/org/apache/pinot/query/testutils/MockRoutingManagerFactory.java:
##########
@@ -63,25 +66,28 @@ public MockRoutingManagerFactory(int... ports) {
_schemaMap = new HashMap<>();
_hybridTables = new HashSet<>();
_serverInstances = new HashMap<>();
+ _nullHandlingMap = new HashMap<>();
_tableServerSegmentsMap = new HashMap<>();
for (int port : ports) {
_serverInstances.put(toHostname(port), getServerInstance(HOST_NAME,
port, port, port, port));
}
}
- public void registerTable(Schema schema, String tableName) {
+ public void registerTable(Schema schema, String tableName, boolean
enableNullHandling) {
Review Comment:
Is the intention to generate null value vector for the table? IMO we should
always generate null value vector. We will persist the null value vector only
when there are null values.
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/nullvalue/NullValueIndexType.java:
##########
@@ -77,11 +81,24 @@ public String getPrettyName() {
@Override
public ColumnConfigDeserializer<IndexConfig> createDeserializer() {
- return IndexConfigDeserializer.ifIndexingConfig(
- IndexConfigDeserializer.alwaysCall((TableConfig tableConfig, Schema
schema) ->
- tableConfig.getIndexingConfig().isNullHandlingEnabled()
- ? IndexConfig.ENABLED
- : IndexConfig.DISABLED));
+ return (TableConfig tableConfig, Schema schema) -> {
Review Comment:
> That is not correct. There is a special case for null value vector in
[SegmentColumnarIndexCreator](https://github.com/apache/pinot/blob/55f29fe04dc7ed0ff18737fcec5b272969d6aba1/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/SegmentColumnarIndexCreator.java#L220)
that creates the index depending on config.isNullHandlingEnabled without
actually reading index configuration.
Does that mean we create null value vector for all columns only when it is
configured in the table config, and the column level nullable is ignored; When
reading the null value vector, we do check the column level nullable flag?
This behavior seems inconsistent and there is no way to only create null
value vector for part of the columns. Can we make the write path logic the same
as read path? i.e. check if it is enabled at column level, then decide whether
to create the null value vector?
##########
pinot-query-runtime/src/test/resources/queries/CountDistinct.json:
##########
@@ -119,11 +119,13 @@
{
"comments": "table aren't actually partitioned by val thus all
segments can produce duplicate results, thus [[8]]",
"sql": "SELECT SEGMENT_PARTITIONED_DISTINCT_COUNT(val) FROM {tbl1}",
+ "ignored": true,
Review Comment:
What exception did you get for these queries?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]