Re: [PR] Update metadata.properties to have schema max-length [pinot]

2024-06-07 Thread via GitHub


Jackie-Jiang merged PR #13187:
URL: https://github.com/apache/pinot/pull/13187


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Update metadata.properties to have schema max-length [pinot]

2024-06-06 Thread via GitHub


tibrewalpratik17 commented on code in PR #13187:
URL: https://github.com/apache/pinot/pull/13187#discussion_r1629984437


##
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/metadata/ColumnMetadataImpl.java:
##
@@ -229,11 +229,17 @@ public static ColumnMetadataImpl 
fromPropertiesConfiguration(String column, Prop
 if (defaultNullValueString != null && storedType == DataType.STRING) {
   defaultNullValueString = 
CommonsConfigurationUtils.recoverSpecialCharacterInPropertyValue(defaultNullValueString);
 }
+int maxLength = config.getInt(Column.getKeyFor(column, 
Column.SCHEMA_MAX_LENGTH), FieldSpec.DEFAULT_MAX_LENGTH);
+String maxLengthExceedStrategyString =
+config.getString(Column.getKeyFor(column, 
Column.SCHEMA_MAX_LENGTH_EXCEED_STRATEGY), null);
+FieldSpec.MaxLengthExceedStrategy maxLengthExceedStrategy = 
maxLengthExceedStrategyString != null
+? 
FieldSpec.MaxLengthExceedStrategy.valueOf(maxLengthExceedStrategyString) : null;
 FieldSpec fieldSpec;
 switch (fieldType) {
   case DIMENSION:
 boolean isSingleValue = config.getBoolean(Column.getKeyFor(column, 
Column.IS_SINGLE_VALUED));
-fieldSpec = new DimensionFieldSpec(column, dataType, isSingleValue, 
defaultNullValueString);
+fieldSpec = new DimensionFieldSpec(column, dataType, isSingleValue, 
maxLength,
+defaultNullValueString, maxLengthExceedStrategy);
 break;
   case METRIC:
 fieldSpec = new MetricFieldSpec(column, dataType, 
defaultNullValueString);

Review Comment:
   I see the constructor was missing in MetricFieldSpec so added that as well



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Update metadata.properties to have schema max-length [pinot]

2024-06-06 Thread via GitHub


Jackie-Jiang commented on code in PR #13187:
URL: https://github.com/apache/pinot/pull/13187#discussion_r1629874567


##
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/metadata/ColumnMetadataImpl.java:
##
@@ -229,11 +229,17 @@ public static ColumnMetadataImpl 
fromPropertiesConfiguration(String column, Prop
 if (defaultNullValueString != null && storedType == DataType.STRING) {
   defaultNullValueString = 
CommonsConfigurationUtils.recoverSpecialCharacterInPropertyValue(defaultNullValueString);
 }
+int maxLength = config.getInt(Column.getKeyFor(column, 
Column.SCHEMA_MAX_LENGTH), FieldSpec.DEFAULT_MAX_LENGTH);
+String maxLengthExceedStrategyString =
+config.getString(Column.getKeyFor(column, 
Column.SCHEMA_MAX_LENGTH_EXCEED_STRATEGY), null);
+FieldSpec.MaxLengthExceedStrategy maxLengthExceedStrategy = 
maxLengthExceedStrategyString != null
+? 
FieldSpec.MaxLengthExceedStrategy.valueOf(maxLengthExceedStrategyString) : null;
 FieldSpec fieldSpec;
 switch (fieldType) {
   case DIMENSION:
 boolean isSingleValue = config.getBoolean(Column.getKeyFor(column, 
Column.IS_SINGLE_VALUED));
-fieldSpec = new DimensionFieldSpec(column, dataType, isSingleValue, 
defaultNullValueString);
+fieldSpec = new DimensionFieldSpec(column, dataType, isSingleValue, 
maxLength,
+defaultNullValueString, maxLengthExceedStrategy);
 break;
   case METRIC:
 fieldSpec = new MetricFieldSpec(column, dataType, 
defaultNullValueString);

Review Comment:
   This also applies to metric columns



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Update metadata.properties to have schema max-length [pinot]

2024-06-06 Thread via GitHub


tibrewalpratik17 commented on code in PR #13187:
URL: https://github.com/apache/pinot/pull/13187#discussion_r1629870264


##
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/SegmentColumnarIndexCreator.java:
##
@@ -585,6 +585,19 @@ public static void 
addColumnMetadataInfo(PropertiesConfiguration properties, Str
 String.valueOf(columnIndexCreationInfo.getTotalNumberOfEntries()));
 properties.setProperty(getKeyFor(column, IS_AUTO_GENERATED),
 String.valueOf(columnIndexCreationInfo.isAutoGenerated()));
+FieldSpec.MaxLengthExceedStrategy maxLengthExceedStrategy = 
fieldSpec.getMaxLengthExceedStrategy();
+if (dataType.equals(DataType.STRING)) {

Review Comment:
   Sounds good to me! Updated accordingly! 
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Update metadata.properties to have schema max-length [pinot]

2024-06-06 Thread via GitHub


Jackie-Jiang commented on code in PR #13187:
URL: https://github.com/apache/pinot/pull/13187#discussion_r1629848687


##
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/SegmentColumnarIndexCreator.java:
##
@@ -585,6 +585,19 @@ public static void 
addColumnMetadataInfo(PropertiesConfiguration properties, Str
 String.valueOf(columnIndexCreationInfo.getTotalNumberOfEntries()));
 properties.setProperty(getKeyFor(column, IS_AUTO_GENERATED),
 String.valueOf(columnIndexCreationInfo.isAutoGenerated()));
+FieldSpec.MaxLengthExceedStrategy maxLengthExceedStrategy = 
fieldSpec.getMaxLengthExceedStrategy();
+if (dataType.equals(DataType.STRING)) {

Review Comment:
   The overall goal should be to reconstruct the original field spec



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Update metadata.properties to have schema max-length [pinot]

2024-06-06 Thread via GitHub


Jackie-Jiang commented on code in PR #13187:
URL: https://github.com/apache/pinot/pull/13187#discussion_r1629847762


##
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/SegmentColumnarIndexCreator.java:
##
@@ -585,6 +585,19 @@ public static void 
addColumnMetadataInfo(PropertiesConfiguration properties, Str
 String.valueOf(columnIndexCreationInfo.getTotalNumberOfEntries()));
 properties.setProperty(getKeyFor(column, IS_AUTO_GENERATED),
 String.valueOf(columnIndexCreationInfo.isAutoGenerated()));
+FieldSpec.MaxLengthExceedStrategy maxLengthExceedStrategy = 
fieldSpec.getMaxLengthExceedStrategy();
+if (dataType.equals(DataType.STRING)) {

Review Comment:
   We want to set it for all var-length types so that the schema can be 
properly re-constructed.
   For `SCHEMA_MAX_LENGTH_EXCEED_STRATEGY`, we want to add it when it is 
configured



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Update metadata.properties to have schema max-length [pinot]

2024-06-06 Thread via GitHub


tibrewalpratik17 commented on code in PR #13187:
URL: https://github.com/apache/pinot/pull/13187#discussion_r1629444612


##
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/SegmentColumnarIndexCreator.java:
##
@@ -585,6 +585,7 @@ public static void 
addColumnMetadataInfo(PropertiesConfiguration properties, Str
 String.valueOf(columnIndexCreationInfo.getTotalNumberOfEntries()));
 properties.setProperty(getKeyFor(column, IS_AUTO_GENERATED),
 String.valueOf(columnIndexCreationInfo.isAutoGenerated()));
+properties.setProperty(getKeyFor(column, SCHEMA_MAX_LENGTH), 
String.valueOf(fieldSpec.getMaxLength()));

Review Comment:
   hey @Jackie-Jiang updated this PR based on #13103 changes. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Update metadata.properties to have schema max-length [pinot]

2024-05-21 Thread via GitHub


tibrewalpratik17 commented on code in PR #13187:
URL: https://github.com/apache/pinot/pull/13187#discussion_r1609276542


##
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/SegmentColumnarIndexCreator.java:
##
@@ -585,6 +585,7 @@ public static void 
addColumnMetadataInfo(PropertiesConfiguration properties, Str
 String.valueOf(columnIndexCreationInfo.getTotalNumberOfEntries()));
 properties.setProperty(getKeyFor(column, IS_AUTO_GENERATED),
 String.valueOf(columnIndexCreationInfo.isAutoGenerated()));
+properties.setProperty(getKeyFor(column, SCHEMA_MAX_LENGTH), 
String.valueOf(fieldSpec.getMaxLength()));

Review Comment:
   Agreed! During persistence, we can optimise the logic a bit to only include 
STRING fields with certain MaxLengthStrategies. Will update post #13103 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Update metadata.properties to have schema max-length [pinot]

2024-05-21 Thread via GitHub


Jackie-Jiang commented on code in PR #13187:
URL: https://github.com/apache/pinot/pull/13187#discussion_r1609078436


##
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/SegmentColumnarIndexCreator.java:
##
@@ -585,6 +585,7 @@ public static void 
addColumnMetadataInfo(PropertiesConfiguration properties, Str
 String.valueOf(columnIndexCreationInfo.getTotalNumberOfEntries()));
 properties.setProperty(getKeyFor(column, IS_AUTO_GENERATED),
 String.valueOf(columnIndexCreationInfo.isAutoGenerated()));
+properties.setProperty(getKeyFor(column, SCHEMA_MAX_LENGTH), 
String.valueOf(fieldSpec.getMaxLength()));

Review Comment:
   We don't need to store this property for all fields. I'd suggest revisiting 
this PR after wrapping up #13103 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Update metadata.properties to have schema max-length [pinot]

2024-05-21 Thread via GitHub


codecov-commenter commented on PR #13187:
URL: https://github.com/apache/pinot/pull/13187#issuecomment-2122012004

   ## 
[Codecov](https://app.codecov.io/gh/apache/pinot/pull/13187?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 Report
   Attention: Patch coverage is `0%` with `3 lines` in your changes are missing 
coverage. Please review.
   > Project coverage is 0.00%. Comparing base 
[(`59551e4`)](https://app.codecov.io/gh/apache/pinot/commit/59551e45224f1535c4863fd577622b37366ccc97?dropdown=coverage&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 to head 
[(`9233b8c`)](https://app.codecov.io/gh/apache/pinot/pull/13187?dropdown=coverage&src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache).
   > Report is 471 commits behind head on master.
   
   | 
[Files](https://app.codecov.io/gh/apache/pinot/pull/13187?dropdown=coverage&src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 | Patch % | Lines |
   |---|---|---|
   | 
[...segment/spi/index/metadata/ColumnMetadataImpl.java](https://app.codecov.io/gh/apache/pinot/pull/13187?src=pr&el=tree&filepath=pinot-segment-spi%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fpinot%2Fsegment%2Fspi%2Findex%2Fmetadata%2FColumnMetadataImpl.java&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3Qtc2VnbWVudC1zcGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3Bpbm90L3NlZ21lbnQvc3BpL2luZGV4L21ldGFkYXRhL0NvbHVtbk1ldGFkYXRhSW1wbC5qYXZh)
 | 0.00% | [2 Missing :warning: 
](https://app.codecov.io/gh/apache/pinot/pull/13187?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 |
   | 
[...ment/creator/impl/SegmentColumnarIndexCreator.java](https://app.codecov.io/gh/apache/pinot/pull/13187?src=pr&el=tree&filepath=pinot-segment-local%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fpinot%2Fsegment%2Flocal%2Fsegment%2Fcreator%2Fimpl%2FSegmentColumnarIndexCreator.java&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2NyZWF0b3IvaW1wbC9TZWdtZW50Q29sdW1uYXJJbmRleENyZWF0b3IuamF2YQ==)
 | 0.00% | [1 Missing :warning: 
](https://app.codecov.io/gh/apache/pinot/pull/13187?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 |
   
   Additional details and impacted files
   
   
   ```diff
   @@  Coverage Diff  @@
   ## master   #13187   +/-   ##
   =
   - Coverage 61.75%0.00%   -61.76% 
   =
 Files  2436 2454   +18 
 Lines133233   134626 +1393 
 Branches  2063620848  +212 
   =
   - Hits  822740-82274 
   - Misses44911   134626+89715 
   + Partials   60480 -6048 
   ```
   
   | 
[Flag](https://app.codecov.io/gh/apache/pinot/pull/13187/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 | Coverage Δ | |
   |---|---|---|
   | 
[custom-integration1](https://app.codecov.io/gh/apache/pinot/pull/13187/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 | `?` | |
   | 
[integration](https://app.codecov.io/gh/apache/pinot/pull/13187/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 | `0.00% <0.00%> (-0.01%)` | :arrow_down: |
   | 
[integration1](https://app.codecov.io/gh/apache/pinot/pull/13187/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 | `?` | |
   | 
[integration2](https://app.codecov.io/gh/apache/pinot/pull/13187/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 | `0.00% <0.00%> (ø)` | |
   | 
[java-11](https://app.codecov.io/gh/apache/pinot/pull/13187/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 | `?` | |
   | 
[java-21](https://app.codecov.io/gh/apache/pinot/pull/13187/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 | `0.00% <0.00%> (-61.63%)` | :arrow_down: |
   | 
[skip-bytebuffers-false](https://app.codecov.io/gh/apache/pinot/pull/13187/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campa