[ https://issues.apache.org/jira/browse/HBASE-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13713153#comment-13713153 ]
stack commented on HBASE-8693:
------------------------------
IIRC, their Avro IDL covers everything but the description of the rowkey. When
they talk about a rowkey 'schema', it is accepted that it cannot evolve, for
the reasons discussed above. Adding to the right of a rowkey should be fine
though. Ditto when serializing column qualifiers.
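
A minimal sketch of why appending to the right is safe, assuming fixed-width
fields and a sign-flipped big-endian int32 encoding (a common order-preserving
trick, not necessarily what the patch does). The class and method names are
hypothetical. A key written with the old struct stays a byte prefix of the key
written with the extended struct, so the leading fields still sort and scan
the same way.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

class RowkeyAppendSketch {
    // Order-preserving int32 (assumed encoding): flip the sign bit and write
    // big-endian, so unsigned byte order matches signed numeric order.
    static byte[] encodeInt(int v) {
        return ByteBuffer.allocate(4).putInt(v ^ 0x80000000).array();
    }

    // Concatenate already-encoded fields, left to right.
    static byte[] concat(byte[]... parts) {
        int len = 0;
        for (byte[] p : parts) len += p.length;
        ByteBuffer buf = ByteBuffer.allocate(len);
        for (byte[] p : parts) buf.put(p);
        return buf.array();
    }

    // True if key begins with exactly the bytes of prefix.
    static boolean startsWith(byte[] key, byte[] prefix) {
        return key.length >= prefix.length
            && Arrays.equals(Arrays.copyOf(key, prefix.length), prefix);
    }

    public static void main(String[] args) {
        // Old struct: two fields. Extended struct: one more field on the right.
        byte[] v1 = concat(encodeInt(7), encodeInt(-3));
        byte[] v2 = concat(encodeInt(7), encodeInt(-3), encodeInt(42));
        System.out.println(startsWith(v2, v1)); // prints true
    }
}
```

Inserting a field anywhere but the right-hand end would shift the bytes of the
existing fields and break this prefix property.
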
High in this issue you raise: "Do you think we should have a similar kind of
dichotomy for encoding into order-preserving context vs non-order-preserving
context? My initial thinking is probably not (due to additional API surface
area), but I want to have the conversation."
You allow that there are two contexts (and indeed Matteo asks for clarification
on this). In one, there is no way around it: you need to rewrite the data if
you want to refer to it using a different struct/'schema'; e.g. a rowkey
(caveat: adding fields to the right). In the other, you should be able to
evolve the content; e.g. cell content, and even at a higher level where you
might impose a schema spanning the content of multiple columns (or a full
row), and so on.
This seems like a good split. In the cell context, the area where you would
like to be able to evolve, sort order preservation is not required. In the
simple case, an int16 type, you probably don't need versioning either? Its
serialization is unlikely to change, but you might want to version even these
primitive types just in case? If it is a compound type in a cell, you would
like to be able to evolve it; to add fields, etc. So you could add a version
to structs here? (But why would a user use this lib over pb in this case?) Now
you bleed over into higher-level issues: schema and its follow-ons, where to
store it and how to evolve it, etc. (Matteo's concerns).
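
To make the "add a version to structs" idea concrete, here is a rough sketch
of a cell value with a leading version byte. It is only an illustration: the
layout, the class name, and the fields (a count and a timestamp) are all made
up, and the patch does not define anything like this. Because cell values need
not sort, the version can sit at the front.

```java
import java.nio.ByteBuffer;

class VersionedCellSketch {
    // Hypothetical layouts:
    //   v1: [version=1][int32 count]
    //   v2: [version=2][int32 count][int64 timestamp]
    static byte[] encodeV1(int count) {
        return ByteBuffer.allocate(5).put((byte) 1).putInt(count).array();
    }

    static byte[] encodeV2(int count, long timestamp) {
        return ByteBuffer.allocate(13)
            .put((byte) 2).putInt(count).putLong(timestamp).array();
    }

    // Returns {count, timestamp}. The timestamp defaults to 0 when the cell
    // was written with the v1 layout, which has no such field.
    static long[] decode(byte[] cell) {
        ByteBuffer buf = ByteBuffer.wrap(cell);
        byte version = buf.get();
        int count = buf.getInt();
        long timestamp = 0L;
        if (version >= 2) {
            timestamp = buf.getLong();
        }
        return new long[] { count, timestamp };
    }

    public static void main(String[] args) {
        long[] old = decode(encodeV1(5));
        long[] cur = decode(encodeV2(5, 99L));
        System.out.println(old[0] + " " + old[1]); // prints 5 0
        System.out.println(cur[0] + " " + cur[1]); // prints 5 99
    }
}
```

The reader handles both layouts, so old cells do not have to be rewritten when
a field is added. This is roughly what pb already gives you, hence the
question above.
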
I suppose we are fine given you have 'schema' and 'schema evolution' as
out-of-scope in your answer to Matteo. We should be clear that these problems
remain as to-be-solved (or solved by others -- see kiji) after this patch is
done and be sure folks don't get the wrong impression. Just saying.
On adding fields to the right of your struct, where you have the application
pick the right struct version: pity your lib couldn't do that for the app. PB
has a leading serialized length, which saves it from reading off the end of
the record. You can't do that because you'll mess up your ordering. You can't
lead the record with a version either, since that will also mess up your sort
order (as you say above). A buffer where you check what is available would be
expensive...
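
A small sketch of the ordering problem with a leading length. Rowkeys compare
as unsigned bytes, left to right, so a length prefix makes shorter values sort
first regardless of content. The one-byte prefix and the class and method
names are assumptions for illustration only; this is not pb's actual wire
format.

```java
import java.nio.charset.StandardCharsets;

class LengthPrefixOrderSketch {
    // Unsigned lexicographic comparison, the order in which rowkeys sort.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) return x - y;
        }
        return a.length - b.length;
    }

    static byte[] raw(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Hypothetical framing: one length byte, then the value bytes.
    // Assumes the value is shorter than 256 bytes.
    static byte[] lengthPrefixed(String s) {
        byte[] body = raw(s);
        byte[] out = new byte[body.length + 1];
        out[0] = (byte) body.length;
        System.arraycopy(body, 0, out, 1, body.length);
        return out;
    }

    public static void main(String[] args) {
        // Raw bytes: "ab" sorts before "b", as expected.
        System.out.println(compareBytes(raw("ab"), raw("b")) < 0); // prints true
        // With a length prefix the order flips: "b" (length 1) now sorts first.
        System.out.println(
            compareBytes(lengthPrefixed("b"), lengthPrefixed("ab")) < 0); // prints true
    }
}
```

A leading version byte has the same effect: every key of version 1 would sort
ahead of every key of version 2, whatever the fields hold.
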
> Implement extensible type API based on serialization primitives
> ---------------------------------------------------------------
>
> Key: HBASE-8693
> URL: https://issues.apache.org/jira/browse/HBASE-8693
> Project: HBase
> Issue Type: Sub-task
> Components: Client
> Reporter: Nick Dimiduk
> Assignee: Nick Dimiduk
> Fix For: 0.95.2
>
> Attachments: 0001-HBASE-8693-Extensible-data-types-API.patch,
> 0001-HBASE-8693-Extensible-data-types-API.patch,
> 0001-HBASE-8693-Extensible-data-types-API.patch,
> 0002-HBASE-8693-example-Use-DataType-API-to-build-regionN.patch,
> KijiFormattedEntityId.java
>
>