Re: [PR] Add nullability to FieldSpec and use it in TypeFactory [pinot]

via GitHub Wed, 01 Nov 2023 08:15:57 -0700


walterddr commented on code in PR #11824:
URL: https://github.com/apache/pinot/pull/11824#discussion_r1378937058



##########
pinot-spi/src/main/java/org/apache/pinot/spi/data/FieldSpec.java:
##########
@@ -300,6 +303,27 @@ public void setTransformFunction(@Nullable String 
transformFunction) {
     _transformFunction = transformFunction;
   }
 
+  /**
+   * Returns whether the column is nullable or not.
+   *
+   * This Java property itself is nullable. In case null is returned, actual 
nullability depends on the value of
+   * {@link IndexingConfig#isNullHandlingEnabled()}.
+   *
+   * @return true if the field is nullable, false if it is not and null if 
column level nullability should be delegated
+   * to {@link IndexingConfig#isNullHandlingEnabled()}.
+   */
+  @Nullable
+  public Boolean getNullable() {

Review Comment:
   sorry for following up late:
   
   > Phase2: ... otherwise delegate on `IndexingConfig.isNullHandlingEnabled()` 
?
   > Phase 3:
   > Deprecate `IndexingConfig.isNullHandlingEnabled()` (although we still need 
to support it for compatibility reasons)
   
   ^ when it is deprecated in phase 3 what's the logic originally in phase 2? 
is it false or true?
   
   > There are potential big performance impact
   
   IMO: 4 scenarios currently
   1. `IndexingConfig.isNullHandlingEnabled() = true` and `SET 
enableNullHandling=true;`
   2. `IndexingConfig.isNullHandlingEnabled() = false` and `SET 
enableNullHandling=true;`
   3. `IndexingConfig.isNullHandlingEnabled() = true` and `SET 
enableNullHandling=false;`
   4. `IndexingConfig.isNullHandlingEnabled() = false` and `SET 
enableNullHandling=false;`
   
   as long as the following are met we should be good: 
   - performance on query should be the same on both 3&4 --> we should 
guarantee the same if all columns are marked as non-nullable by the user 
   - performance on query should be the same on both 1&2 and will be worst than 
3&4 b/c it accesses the null vector --> we should guarantee better performance 
when some columns are marked as non-nullable by the user
   
   
   > I don't think that is a good idea. Without making this non-primitive we 
cannot know whether the value was explicitly defined as false or it has 
defaulted to false. We would also be breaking backward compatibility. By using 
a primitive boolean with false by default, in order to make a column nullable 
users would need to:
   > In Phase 1:
   >   - Set IndexingConfig.isNullHandlingEnabled() to true. This is required 
at write time, given in V1 write path ignores FieldSpec nullability.
   >   - Set each, for each nullable column, FieldSpec nullability to true. 
This is a breaking change. All tables already created would need to be 
modified, otherwise they would be suddenly treated as not null.
   > In Phase 2:
   >   - Set each, for each nullable column, FieldSpec nullability to true.
   
   there's no backward compatibility here, b/c the default behavior of V1 is 
all non-nullable, so setting it to primitive boolean with default to false 
exactly achieves that, 
   you don't need to set nullability = true for all columns, b/c
     - in v1 this can be controlled (and should still be controlled by `SET 
enableNullHandling=true` option regardless whether this field is `boolean` or 
`Boolean`
     - in v2 there's no such option, so it will honor the nullability setting 
--> which requires it to be set explicitly to true anyway otherwise nothing 
works.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] Add nullability to FieldSpec and use it in TypeFactory [pinot]

Reply via email to