mqliang commented on a change in pull request #6710:
URL: https://github.com/apache/incubator-pinot/pull/6710#discussion_r599306832
##########
File path:
pinot-core/src/main/java/org/apache/pinot/core/common/datatable/DataTableImplV2V3.java
##########
@@ -33,12 +33,15 @@
import org.apache.pinot.common.utils.DataTable;
import org.apache.pinot.common.utils.StringUtil;
import org.apache.pinot.core.common.ObjectSerDeUtils;
+import org.apache.pinot.core.query.request.context.ThreadTimer;
import org.apache.pinot.spi.utils.ByteArray;
import org.apache.pinot.spi.utils.BytesUtils;
-public class DataTableImplV2 implements DataTable {
- private static final int VERSION = 2;
+public class DataTableImplV2V3 implements DataTable {
Review comment:
I name it as DataTableImplV2V3 since V2 and V3 share a lot of common
logic. If V2 and V3 has major changes, as you suggest:
> Since we are anyway bumping up the version, how about we move the existing
metadata of key-value pairs to the end of file to keep consistency in the
format. So, all the metadata stuff (aka key-value pairs) + new positional stuff
can be a file footer.
If we do that, I vote for put V2 logic into DataTableImplV2 and V3 logic
into DataTableImplV3, and extract common logic (e.g. serialize/de-serialize
metada/dictionaryMap into DataTableUtils.java)
> move the existing metadata of key-value pairs to the end of file
Actually I considered that. I also considered to make metadata as a
`String[]` instead of `Map<String, String>` and make all meta data keys as enum
value. Also make "serialization_cpu_times_ns" as part of metadata. In other
words, "serialization_cpu_times_ns" is part of mate data and footer section
only contains meta data. In this way:
* all meta data is positional, we can replace values in metadata even after
data table is serialized. (`Map<String, String>` is not positional because when
loop over a hashmap, the order of items is not deterministic, but loop over of
an array, the order is deterministic)
* meta data previously is `Map<String, String>`, where we need to write
keys(type string) to byte buffer. When replaces as `String[]`, we don't write
the enum constant itself. Just the value (length+bytes) corresponding to the
ordinal/position of the constant. So less data is transfered between
server/broker.
But if we change in this way, as I previously stated, I vote to keep the
current DataTableImplV2.java as it is, and create a DataTableImplV3.java to put
all V3 logic (with extracting common into DataTableUtils.java ). Otherwise,
puting all V2/V3 logic in same file will make the code hard to read.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]