mcvsubbu commented on a change in pull request #6710:
URL: https://github.com/apache/incubator-pinot/pull/6710#discussion_r601741649
##########
File path:
pinot-common/src/main/java/org/apache/pinot/common/utils/DataTable.java
##########
@@ -45,9 +51,140 @@
String NUM_RESIZES_METADATA_KEY = "numResizes";
String RESIZE_TIME_MS_METADATA_KEY = "resizeTimeMs";
String EXECUTION_THREAD_CPU_TIME_NS_METADATA_KEY =
"executionThreadCpuTimeNs";
+ String RESPONSE_SERIALIZATION_CPU_TIME_NS_METADATA_KEY =
"responseSerializationCpuTimeNs";
Review comment:
Why do we need this?
##########
File path:
pinot-core/src/main/java/org/apache/pinot/core/common/datatable/DataTableImplV2.java
##########
@@ -209,24 +190,22 @@ public DataTableImplV2(ByteBuffer byteBuffer)
}
}
- private static String decodeString(DataInputStream dataInputStream)
- throws IOException {
- int length = dataInputStream.readInt();
- if (length == 0) {
- return StringUtils.EMPTY;
- } else {
- byte[] buffer = new byte[length];
- int numBytesRead = dataInputStream.read(buffer);
- assert numBytesRead == length;
- return StringUtil.decodeUtf8(buffer);
- }
- }
-
@Override
public void addException(ProcessingException processingException) {
_metadata.put(EXCEPTION_METADATA_KEY + processingException.getErrorCode(),
processingException.getMessage());
}
+ @Override
+ public Map<Integer, String> getExceptions() {
+ Map<Integer, String> exceptions = new HashMap<>();
+ for (String key : _metadata.keySet()) {
+ if (key.startsWith(DataTable.EXCEPTION_METADATA_KEY)) {
+ exceptions.put(Integer.parseInt(key.substring(9)), _metadata.get(key));
Review comment:
what is `9`? Can we have a `static final int` and an example here?
##########
File path:
pinot-common/src/main/java/org/apache/pinot/common/utils/DataTable.java
##########
@@ -45,9 +51,140 @@
String NUM_RESIZES_METADATA_KEY = "numResizes";
String RESIZE_TIME_MS_METADATA_KEY = "resizeTimeMs";
String EXECUTION_THREAD_CPU_TIME_NS_METADATA_KEY =
"executionThreadCpuTimeNs";
+ String RESPONSE_SERIALIZATION_CPU_TIME_NS_METADATA_KEY =
"responseSerializationCpuTimeNs";
+
+ /* The TrailerKeys is used in V3, where we put all metadata as part of
trailer and use enum keys as metadata keys.
+ * Currently all trailer keys are metadata keys, but in future we may add
trailer key which is not a metadata key.
+ *
+ * NOTE:
+ * if you add a new key in TrailerKeys enum
+ * - you need add it's corresponding string to
TrailerKeyToMetadataKeyMap/MetadataKeyToTrailerKeyMap also.
+ * - if it happen to be a metadata key, add it into MetadataKeys also.
+ * - if it has a long/int type value, add it into
LongValueTrailerKeys/LongValueTrailerKeys also.
+ *
+ * ATTENTION:
+ * - Always add new key to the end of enum.
+ * - Don't remove existing keys.
+ * Otherwise, backward compatibility will be broken.
+ */
+ enum TrailerKeys {
+ TABLE_KEY, // NOTE: this key is only used in PrioritySchedulerTest
Review comment:
Add an UNKNOWN as the first one (=0). You will then be able to code
better around things easier when exceptions are thrown in valueOf() methods.
##########
File path:
pinot-core/src/main/java/org/apache/pinot/core/common/datatable/DataTableBuilder.java
##########
@@ -263,6 +263,14 @@ public void finishRow()
}
public DataTable build() {
+ return new DataTableImplV3(_numRows, _dataSchema, _reverseDictionaryMap,
+ _fixedSizeDataByteArrayOutputStream.toByteArray(),
_variableSizeDataByteArrayOutputStream.toByteArray());
+ }
+
+ // buildV2() is only used in V2V3Compatibility test
Review comment:
It may be better to make it configurable on the server to generate
either version? (default the config to V3).
Alternative is that we insist that the brokers be upgraded before picking up
this feature. That may be ok, but may cause other constraints (e.g. if someone
needs a fix on the server that is urgently needed in production, forcing them
to pull a newer version of the server).
##########
File path:
pinot-common/src/main/java/org/apache/pinot/common/utils/DataTable.java
##########
@@ -45,9 +51,140 @@
String NUM_RESIZES_METADATA_KEY = "numResizes";
String RESIZE_TIME_MS_METADATA_KEY = "resizeTimeMs";
String EXECUTION_THREAD_CPU_TIME_NS_METADATA_KEY =
"executionThreadCpuTimeNs";
+ String RESPONSE_SERIALIZATION_CPU_TIME_NS_METADATA_KEY =
"responseSerializationCpuTimeNs";
+
+ /* The TrailerKeys is used in V3, where we put all metadata as part of
trailer and use enum keys as metadata keys.
+ * Currently all trailer keys are metadata keys, but in future we may add
trailer key which is not a metadata key.
+ *
+ * NOTE:
+ * if you add a new key in TrailerKeys enum
+ * - you need add it's corresponding string to
TrailerKeyToMetadataKeyMap/MetadataKeyToTrailerKeyMap also.
+ * - if it happen to be a metadata key, add it into MetadataKeys also.
+ * - if it has a long/int type value, add it into
LongValueTrailerKeys/LongValueTrailerKeys also.
+ *
+ * ATTENTION:
+ * - Always add new key to the end of enum.
+ * - Don't remove existing keys.
+ * Otherwise, backward compatibility will be broken.
+ */
+ enum TrailerKeys {
+ TABLE_KEY, // NOTE: this key is only used in PrioritySchedulerTest
+ EXCEPTION_METADATA_KEY,
+ NUM_DOCS_SCANNED_METADATA_KEY,
+ NUM_ENTRIES_SCANNED_IN_FILTER_METADATA_KEY,
+ NUM_ENTRIES_SCANNED_POST_FILTER_METADATA_KEY,
+ NUM_SEGMENTS_QUERIED,
+ NUM_SEGMENTS_PROCESSED,
+ NUM_SEGMENTS_MATCHED,
+ NUM_CONSUMING_SEGMENTS_PROCESSED,
+ MIN_CONSUMING_FRESHNESS_TIME_MS,
+ TOTAL_DOCS_METADATA_KEY,
+ NUM_GROUPS_LIMIT_REACHED_KEY,
+ TIME_USED_MS_METADATA_KEY,
+ TRACE_INFO_METADATA_KEY,
+ REQUEST_ID_METADATA_KEY,
+ NUM_RESIZES_METADATA_KEY,
+ RESIZE_TIME_MS_METADATA_KEY,
+ EXECUTION_THREAD_CPU_TIME_NS_METADATA_KEY,
+ RESPONSE_SERIALIZATION_CPU_TIME_NS_METADATA_KEY,
+ }
+
+ // LongValueTrailerKeys contains all trailer keys which has value of long
type.
+ Set<TrailerKeys> LongValueTrailerKeys = ImmutableSet.of(
+ TrailerKeys.NUM_DOCS_SCANNED_METADATA_KEY,
+ TrailerKeys.NUM_ENTRIES_SCANNED_IN_FILTER_METADATA_KEY,
+ TrailerKeys.NUM_ENTRIES_SCANNED_POST_FILTER_METADATA_KEY,
+ TrailerKeys.MIN_CONSUMING_FRESHNESS_TIME_MS,
+ TrailerKeys.TOTAL_DOCS_METADATA_KEY,
+ TrailerKeys.TIME_USED_MS_METADATA_KEY,
+ TrailerKeys.REQUEST_ID_METADATA_KEY,
+ TrailerKeys.RESIZE_TIME_MS_METADATA_KEY,
+ TrailerKeys.EXECUTION_THREAD_CPU_TIME_NS_METADATA_KEY,
+ TrailerKeys.RESPONSE_SERIALIZATION_CPU_TIME_NS_METADATA_KEY
+ );
+
+ // IntValueTrailerKeys contains all trailer keys which has value of int type.
+ Set<TrailerKeys> IntValueTrailerKeys = ImmutableSet.of(
+ TrailerKeys.NUM_SEGMENTS_QUERIED,
+ TrailerKeys.NUM_SEGMENTS_PROCESSED,
+ TrailerKeys.NUM_SEGMENTS_MATCHED,
+ TrailerKeys.NUM_RESIZES_METADATA_KEY,
+ TrailerKeys.NUM_CONSUMING_SEGMENTS_PROCESSED,
+ TrailerKeys.NUM_RESIZES_METADATA_KEY
+ );
+
+ // MetadataKeys contains all trailer keys which is also metadata key.
+ Set<TrailerKeys> MetadataKeys = ImmutableSet.of(
+ TrailerKeys.TABLE_KEY, // NOTE: this key is only used in
PrioritySchedulerTest
+ TrailerKeys.EXCEPTION_METADATA_KEY,
+ TrailerKeys.NUM_DOCS_SCANNED_METADATA_KEY,
+ TrailerKeys.NUM_ENTRIES_SCANNED_IN_FILTER_METADATA_KEY,
+ TrailerKeys.NUM_ENTRIES_SCANNED_POST_FILTER_METADATA_KEY,
+ TrailerKeys.NUM_SEGMENTS_QUERIED,
+ TrailerKeys.NUM_SEGMENTS_PROCESSED,
+ TrailerKeys.NUM_SEGMENTS_MATCHED,
+ TrailerKeys.NUM_CONSUMING_SEGMENTS_PROCESSED,
+ TrailerKeys.MIN_CONSUMING_FRESHNESS_TIME_MS,
+ TrailerKeys.TOTAL_DOCS_METADATA_KEY,
+ TrailerKeys.NUM_GROUPS_LIMIT_REACHED_KEY,
+ TrailerKeys.TIME_USED_MS_METADATA_KEY,
+ TrailerKeys.TRACE_INFO_METADATA_KEY,
+ TrailerKeys.REQUEST_ID_METADATA_KEY,
+ TrailerKeys.NUM_RESIZES_METADATA_KEY,
+ TrailerKeys.RESIZE_TIME_MS_METADATA_KEY,
+ TrailerKeys.EXECUTION_THREAD_CPU_TIME_NS_METADATA_KEY,
+ TrailerKeys.RESPONSE_SERIALIZATION_CPU_TIME_NS_METADATA_KEY
+ );
+
+ // TrailerKeyToMetadataKeyMap is used to convert enum key to metadata
key(string).
+ Map<TrailerKeys, String> TrailerKeyToMetadataKeyMap =
ImmutableMap.<TrailerKeys, String>builder()
Review comment:
Will BiMap work better?
##########
File path:
pinot-common/src/main/java/org/apache/pinot/common/utils/DataTable.java
##########
@@ -45,9 +51,140 @@
String NUM_RESIZES_METADATA_KEY = "numResizes";
String RESIZE_TIME_MS_METADATA_KEY = "resizeTimeMs";
String EXECUTION_THREAD_CPU_TIME_NS_METADATA_KEY =
"executionThreadCpuTimeNs";
+ String RESPONSE_SERIALIZATION_CPU_TIME_NS_METADATA_KEY =
"responseSerializationCpuTimeNs";
+
+ /* The TrailerKeys is used in V3, where we put all metadata as part of
trailer and use enum keys as metadata keys.
+ * Currently all trailer keys are metadata keys, but in future we may add
trailer key which is not a metadata key.
+ *
+ * NOTE:
+ * if you add a new key in TrailerKeys enum
+ * - you need add it's corresponding string to
TrailerKeyToMetadataKeyMap/MetadataKeyToTrailerKeyMap also.
+ * - if it happen to be a metadata key, add it into MetadataKeys also.
+ * - if it has a long/int type value, add it into
LongValueTrailerKeys/LongValueTrailerKeys also.
+ *
+ * ATTENTION:
+ * - Always add new key to the end of enum.
+ * - Don't remove existing keys.
+ * Otherwise, backward compatibility will be broken.
+ */
+ enum TrailerKeys {
+ TABLE_KEY, // NOTE: this key is only used in PrioritySchedulerTest
+ EXCEPTION_METADATA_KEY,
+ NUM_DOCS_SCANNED_METADATA_KEY,
+ NUM_ENTRIES_SCANNED_IN_FILTER_METADATA_KEY,
+ NUM_ENTRIES_SCANNED_POST_FILTER_METADATA_KEY,
+ NUM_SEGMENTS_QUERIED,
+ NUM_SEGMENTS_PROCESSED,
+ NUM_SEGMENTS_MATCHED,
+ NUM_CONSUMING_SEGMENTS_PROCESSED,
+ MIN_CONSUMING_FRESHNESS_TIME_MS,
+ TOTAL_DOCS_METADATA_KEY,
+ NUM_GROUPS_LIMIT_REACHED_KEY,
+ TIME_USED_MS_METADATA_KEY,
+ TRACE_INFO_METADATA_KEY,
+ REQUEST_ID_METADATA_KEY,
+ NUM_RESIZES_METADATA_KEY,
+ RESIZE_TIME_MS_METADATA_KEY,
+ EXECUTION_THREAD_CPU_TIME_NS_METADATA_KEY,
+ RESPONSE_SERIALIZATION_CPU_TIME_NS_METADATA_KEY,
+ }
+
+ // LongValueTrailerKeys contains all trailer keys which has value of long
type.
+ Set<TrailerKeys> LongValueTrailerKeys = ImmutableSet.of(
+ TrailerKeys.NUM_DOCS_SCANNED_METADATA_KEY,
+ TrailerKeys.NUM_ENTRIES_SCANNED_IN_FILTER_METADATA_KEY,
+ TrailerKeys.NUM_ENTRIES_SCANNED_POST_FILTER_METADATA_KEY,
+ TrailerKeys.MIN_CONSUMING_FRESHNESS_TIME_MS,
+ TrailerKeys.TOTAL_DOCS_METADATA_KEY,
+ TrailerKeys.TIME_USED_MS_METADATA_KEY,
+ TrailerKeys.REQUEST_ID_METADATA_KEY,
+ TrailerKeys.RESIZE_TIME_MS_METADATA_KEY,
+ TrailerKeys.EXECUTION_THREAD_CPU_TIME_NS_METADATA_KEY,
+ TrailerKeys.RESPONSE_SERIALIZATION_CPU_TIME_NS_METADATA_KEY
+ );
+
+ // IntValueTrailerKeys contains all trailer keys which has value of int type.
+ Set<TrailerKeys> IntValueTrailerKeys = ImmutableSet.of(
+ TrailerKeys.NUM_SEGMENTS_QUERIED,
+ TrailerKeys.NUM_SEGMENTS_PROCESSED,
+ TrailerKeys.NUM_SEGMENTS_MATCHED,
+ TrailerKeys.NUM_RESIZES_METADATA_KEY,
+ TrailerKeys.NUM_CONSUMING_SEGMENTS_PROCESSED,
+ TrailerKeys.NUM_RESIZES_METADATA_KEY
+ );
+
+ // MetadataKeys contains all trailer keys which is also metadata key.
+ Set<TrailerKeys> MetadataKeys = ImmutableSet.of(
Review comment:
Instead of duplicating keys, can we just union the set?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]