Re: [PR] Add nullability to FieldSpec and use it in TypeFactory [pinot]

via GitHub Fri, 03 Nov 2023 00:12:59 -0700


Jackie-Jiang commented on code in PR #11824:
URL: https://github.com/apache/pinot/pull/11824#discussion_r1381215533



##########
pinot-query-planner/src/test/java/org/apache/pinot/query/testutils/MockRoutingManagerFactory.java:
##########
@@ -63,25 +66,28 @@ public MockRoutingManagerFactory(int... ports) {
     _schemaMap = new HashMap<>();
     _hybridTables = new HashSet<>();
     _serverInstances = new HashMap<>();
+    _nullHandlingMap = new HashMap<>();
     _tableServerSegmentsMap = new HashMap<>();
     for (int port : ports) {
       _serverInstances.put(toHostname(port), getServerInstance(HOST_NAME, 
port, port, port, port));
     }
   }
 
-  public void registerTable(Schema schema, String tableName) {
+  public void registerTable(Schema schema, String tableName, boolean 
enableNullHandling) {

Review Comment:
   Is the intention to generate null value vector for the table? IMO we should 
always generate null value vector. We will persist the null value vector only 
when there are null values.



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/nullvalue/NullValueIndexType.java:
##########
@@ -77,11 +81,24 @@ public String getPrettyName() {
 
   @Override
   public ColumnConfigDeserializer<IndexConfig> createDeserializer() {
-    return IndexConfigDeserializer.ifIndexingConfig(
-        IndexConfigDeserializer.alwaysCall((TableConfig tableConfig, Schema 
schema) ->
-            tableConfig.getIndexingConfig().isNullHandlingEnabled()
-                ? IndexConfig.ENABLED
-                : IndexConfig.DISABLED));
+    return (TableConfig tableConfig, Schema schema) -> {

Review Comment:
   > That is not correct. There is a special case for null value vector in 
[SegmentColumnarIndexCreator](https://github.com/apache/pinot/blob/55f29fe04dc7ed0ff18737fcec5b272969d6aba1/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/SegmentColumnarIndexCreator.java#L220)
 that creates the index depending on config.isNullHandlingEnabled without 
actually reading index configuration.
   
   Does that mean we create null value vector for all columns only when it is 
configured in the table config, and the column level nullable is ignored; When 
reading the null value vector, we do check the column level nullable flag?
   This behavior seems inconsistent and there is no way to only create null 
value vector for part of the columns. Can we make the write path logic the same 
as read path? i.e. check if it is enabled at column level, then decide whether 
to create the null value vector?



##########
pinot-query-runtime/src/test/resources/queries/CountDistinct.json:
##########
@@ -119,11 +119,13 @@
       {
         "comments": "table aren't actually partitioned by val thus all 
segments can produce duplicate results, thus [[8]]",
         "sql": "SELECT SEGMENT_PARTITIONED_DISTINCT_COUNT(val) FROM {tbl1}",
+        "ignored": true,

Review Comment:
   What exception did you get for these queries?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] Add nullability to FieldSpec and use it in TypeFactory [pinot]

Reply via email to