mqliang commented on pull request #6710:
URL: https://github.com/apache/incubator-pinot/pull/6710#issuecomment-805121867
@mcvsubbu
> Any reason we are restricting the trailer (or footer) to have only
key-value pairs? We don't need to place that restriction as long as the length
is also encoded up front. It can be any serialized object, right?
You are right, it can be any serialized object, but restricting the footer to
contain only KV pairs has the following benefits:
* Any object can be added as a KV pair, simply as (key, serialized_object), so
it's easy to add a new section to the footer in the future.
* The keys of all KV pairs in the footer are defined in an enum, so when the
footer is serialized, the order of the KV pairs is deterministic. This makes
every KV pair positional/locatable, so we are able to replace the value of a
given key in the footer even after serialization.
* If we want to add a new object to the data table and are OK with putting it
in the footer as a KV pair, we don't need to bump the version.
Here is the pseudocode for serializing/deserializing the footer:
```
enum footerkeys {
  k0,
  k1,
  k2,
}

// String name of each key, in enum order.
String[] footerkeysToStr = new String[]{
  "k0",
  "k1",
  "k2",
}

// Write every value as (length, bytes), in enum order.
function byte[] serializeFooter() {
  byte[] bytes;
  for (key in footerkeys) {
    String data = encode_to_str(value_of_key(key));
    bytes = append(bytes, len(data.toBytes()));
    bytes = append(bytes, data.toBytes());
  }
  return bytes;
}

// Read back exactly len(footerkeys) values; any trailing bytes are ignored.
function String[] deSerializeFooter(byte[] bytes) {
  String[] values = new String[len(footerkeys)];
  for (int i = 0; i < len(footerkeys); i++) {
    int data_len = bytes.nextInt();
    values[i] = bytes.nextBytesofLens(data_len);
  }
  return values;
}

// If values[i] is a complex object instead of a string, we can deserialize
// it even further:
String[] footerKVpairs = deSerializeFooter(bytes);
Object_i = deserialize(footerKVpairs[i].toBytes());
```
So if we want to add a new object to the footer, we add it as a KV pair. As
long as its key is added as the last entry of the enum, an old broker will
simply ignore the extra pair, so the change is backward compatible.
If we instead allow the footer to contain not only KV pairs but also other
arbitrary serializable objects:
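To make the compatibility argument concrete, here is a minimal Java sketch of the pseudocode above. The class and method names are hypothetical (they are not from the Pinot code base), and it assumes values are UTF-8 strings prefixed with a 4-byte length. The reader only consumes as many entries as it knows keys for, so entries appended by a newer writer are skipped.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; names are illustrative, not Pinot's actual API.
class FooterSketch {

  // Writes each value as (4-byte length, UTF-8 bytes), in key-ordinal order.
  static byte[] serialize(String[] valuesInKeyOrder) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (String value : valuesInKeyOrder) {
      byte[] data = value.getBytes(StandardCharsets.UTF_8);
      out.write(ByteBuffer.allocate(4).putInt(data.length).array(), 0, 4);
      out.write(data, 0, data.length);
    }
    return out.toByteArray();
  }

  // Reads only the first knownKeyCount entries; any trailing entries
  // written by a newer writer are left unread.
  static String[] deserialize(byte[] bytes, int knownKeyCount) {
    ByteBuffer in = ByteBuffer.wrap(bytes);
    String[] values = new String[knownKeyCount];
    for (int i = 0; i < knownKeyCount; i++) {
      byte[] data = new byte[in.getInt()];
      in.get(data);
      values[i] = new String(data, StandardCharsets.UTF_8);
    }
    return values;
  }

  public static void main(String[] args) {
    // The new writer knows four keys (k0..k3); the old reader knows only three.
    byte[] footer = serialize(new String[] {"v0", "v1", "v2", "v3"});
    String[] seenByOldBroker = deserialize(footer, 3);
    System.out.println(String.join(",", seenByOldBroker)); // v0,v1,v2
  }
}
```

This only works because new keys are appended at the end of the enum; inserting a key in the middle would shift the position of every entry after it.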
```
+------------------------------------+
|                                    |
| serializable object 1              |
|                                    |
+------------------------------------+
|                                    |
| serializable object 2              |
|                                    |
+------------------------------------+
|                                    |
| KV pairs                           |
|                                    |
+------------------------------------+
```
It's not extensible: if we want to add a serializable_object_3 between
serializable_object_2 and KV_pairs, we need to bump the version. (And if we
bump the version, we can just as well add it in the middle of the data table,
not necessarily in the footer.)
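As a rough illustration of why the mixed layout breaks old readers, the Java sketch below assumes every section is written as a 4-byte length followed by UTF-8 bytes. All names and the layout itself are hypothetical and only serve the example. An old reader that expects two objects followed by the KV values misreads the footer as soon as a third object is inserted before the KV pairs.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; names and layout are illustrative, not Pinot's format.
class MixedFooterSketch {

  // Writes each section as (4-byte length, UTF-8 bytes), in the given order.
  static byte[] write(String... sections) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (String section : sections) {
      byte[] data = section.getBytes(StandardCharsets.UTF_8);
      out.write(ByteBuffer.allocate(4).putInt(data.length).array(), 0, 4);
      out.write(data, 0, data.length);
    }
    return out.toByteArray();
  }

  // Old reader: assumes exactly two objects come first, then the KV values.
  static String[] oldReaderKvValues(byte[] footer, int kvCount) {
    ByteBuffer in = ByteBuffer.wrap(footer);
    next(in); // serializable object 1
    next(in); // serializable object 2
    String[] values = new String[kvCount];
    for (int i = 0; i < kvCount; i++) {
      values[i] = next(in);
    }
    return values;
  }

  // Reads one length-prefixed section.
  static String next(ByteBuffer in) {
    byte[] data = new byte[in.getInt()];
    in.get(data);
    return new String(data, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    byte[] oldLayout = write("obj1", "obj2", "v0", "v1");
    byte[] newLayout = write("obj1", "obj2", "obj3", "v0", "v1");
    System.out.println(String.join(",", oldReaderKvValues(oldLayout, 2))); // v0,v1
    System.out.println(String.join(",", oldReaderKvValues(newLayout, 2))); // obj3,v0
  }
}
```

In the second call the old reader silently treats obj3 as the value of k0, which is why this kind of change needs a version bump.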
That's the reason I prefer that the footer contain only KV pairs: if we want
to add a new simple section to the data table and don't want to bump the
version, we add it as a KV pair in the footer. If we want to add a new, very
complex section or rearrange the current sections, we add it in the middle of
the data table and bump the version.