abhioncbr commented on code in PR #12440: URL: https://github.com/apache/pinot/pull/12440#discussion_r1525985967
##########
pinot-spi/src/main/java/org/apache/pinot/spi/env/SegmentMetadataPropertyWriter.java:
##########
@@ -0,0 +1,67 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.spi.env;
+
+import java.io.IOException;
+import java.io.Writer;
+import org.apache.commons.configuration2.PropertiesConfiguration.PropertiesWriter;
+import org.apache.commons.configuration2.convert.ListDelimiterHandler;
+
+
+/**
+ * SegmentMetadataPropertyWriter extends the PropertiesWriter
+ * <p>
+ * Purpose: custom property writer for writing the segment metadata faster by skipping the escaping of key.
+ */
+public class SegmentMetadataPropertyWriter extends PropertiesWriter {
+  private boolean _skipEscapePropertyName;
+  private final String _segmentMetadataVersionHeader;
+
+  public SegmentMetadataPropertyWriter(final Writer writer, ListDelimiterHandler handler,
+      String segmentMetadataVersionHeader) {
+    super(writer, handler);
+    _segmentMetadataVersionHeader = segmentMetadataVersionHeader;
+  }
+
+  @Override
+  protected String escapeKey(final String key) {
+    // skip the escapeKey functionality, if segment metadata has a newer version
+    // if not newer version, follow the escape for backward compatibility.
+    if (_skipEscapePropertyName) {
+      return key;
+    }
+    return super.escapeKey(key);
+  }
+
+  @Override
+  public void writeln(final String s) throws IOException {

Review Comment:
Let me provide a little more context on how commons-configuration2 writes properties to a file and reads them back:
- Each `PropertiesConfiguration` object in config2 is associated with a [PropertiesConfigurationLayout](https://commons.apache.org/proper/commons-configuration/apidocs/org/apache/commons/configuration2/PropertiesConfigurationLayout.html), which preserves the comments of the `PropertiesConfiguration`, including its header and footer comments.
- The header is a global comment that can be set for the properties file. It is written at the very start of the file, followed by an empty line. [[reference](https://commons.apache.org/proper/commons-configuration/apidocs/org/apache/commons/configuration2/PropertiesConfigurationLayout.html#setHeaderComment(java.lang.String))]
- Given the above, your concern
  > which may be too late as we already parse/write many lines in the expensive ways

  does not apply: the header comment lines are the first lines written to the file, so the flag (`_skipEscapePropertyName`) is set before the first property is written. [[code reference](https://github.com/apache/commons-configuration/blob/master/src/main/java/org/apache/commons/configuration2/PropertiesConfigurationLayout.java#L489)]

However, I understand it's not a clean implementation. I think introducing a separate new method (say `saveSegmentMetadata`) in `CommonsConfigurationUtils` can work for us. Before saving, we would set the IOFactory again based on the table or cluster config, and by that point the version header content would also be available.
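
To make the ordering argument concrete, here is a minimal, stdlib-only sketch of the idea. It is not the PR code and does not use the commons-configuration2 classes: `HeaderFirstWriterSketch`, the `version=2` header text, the `# ` comment prefix check, and the simplified escaping rules are all assumptions made for illustration. It only shows that when the header comment is the first line passed to `writeln`, the flag is already set by the time any key reaches `escapeKey`.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Map;

/** Illustrative stand-in for a properties writer; NOT the commons-configuration2 API. */
class HeaderFirstWriterSketch {
  // Hypothetical version header text, chosen only for this example.
  static final String VERSION_HEADER = "version=2";

  private final Writer _writer;
  private boolean _skipEscapePropertyName;

  HeaderFirstWriterSketch(Writer writer) {
    _writer = writer;
  }

  /** Writes one line; flips the flag when the line is the version header comment. */
  void writeln(String s) throws IOException {
    if (s != null) {
      if (s.equals("# " + VERSION_HEADER)) {
        _skipEscapePropertyName = true;
      }
      _writer.write(s);
    }
    _writer.write('\n');
  }

  /** Returns the key untouched once the header was seen; otherwise applies a simplified escape. */
  String escapeKey(String key) {
    if (_skipEscapePropertyName) {
      return key;
    }
    StringBuilder sb = new StringBuilder();
    for (char c : key.toCharArray()) {
      // Simplified rule (assumption): backslash-escape space, ':', '=' and '\'.
      if (c == ' ' || c == ':' || c == '=' || c == '\\') {
        sb.append('\\');
      }
      sb.append(c);
    }
    return sb.toString();
  }

  /** Mimics a layout-driven save: header comment first, then an empty line, then the properties. */
  static String save(String headerOrNull, Map<String, String> properties) {
    StringWriter out = new StringWriter();
    HeaderFirstWriterSketch writer = new HeaderFirstWriterSketch(out);
    try {
      if (headerOrNull != null) {
        writer.writeln("# " + headerOrNull);
        writer.writeln(null);
      }
      for (Map.Entry<String, String> entry : properties.entrySet()) {
        writer.writeln(writer.escapeKey(entry.getKey()) + " = " + entry.getValue());
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return out.toString();
  }
}
```

In this sketch, saving a property with key `a b` and the header present leaves the key as `a b`, while saving without the header writes it as `a\ b`. Passing the decision into the constructor instead (as a dedicated save method could do) would remove the need to inspect lines in `writeln` at all.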

I have added test cases with segment metadata resources [with](https://github.com/apache/pinot/pull/12440/files#diff-9e5c3dcfcbdf8a9ae4d2cbdef02df3805e931990151c47d4d3030f2ecf39b63a) and [without](https://github.com/apache/pinot/pull/12440/files#diff-8d406f553720a36462fae94ddd6c682ea4d9d5b1eebae3f9425d14d10537aa63) the header; please refer to them.

For `SegmentMetadataPropertyReader`, I think the current implementation is correct: the header comment is the first thing read from the file, and it is the only comment present in the segment metadata properties file, since we don't set any other comment. Also, IMO, opening and closing the file first just to determine the header would be more expensive.

Please let me know if this makes sense or if you have any further comments. Thanks

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.