[ https://issues.apache.org/jira/browse/HBASE-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13713834#comment-13713834 ]
Nick Dimiduk commented on HBASE-8693:
-------------------------------------
This {{HDataType}} interface and the two codecs upon which the implementations
rely are not schema management for HBase. {{HDataType}} can be used to manage
encoding data into rowkeys, column qualifiers, or values. Use an instance of
{{Struct}}, or don't, in any of those contexts. The use of {{Struct}} in the
order-sensitive context has driven more design thought, but it generates a
{{byte[]}} wherever it's used. Would an example of an Avro, Thrift, or
Protobuf {{HDataType}} implementation help to drive this idea home?
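To make the "generates a {{byte[]}} wherever it's used" point concrete, here is a minimal, self-contained sketch. It is not the patch's API: {{SketchType}}, {{ORDERED_INT}}, {{pairOf}}, and {{toBytes}} are names invented for illustration, and the sign-bit flip is just one common order-preserving trick, not necessarily what either codec does.

```java
import java.nio.ByteBuffer;
import java.util.AbstractMap;
import java.util.Map;

public class DataTypeSketch {

    /** Hypothetical stand-in for the type interface; not the patch's actual signatures. */
    public interface SketchType<T> {
        void encode(ByteBuffer dst, T value);
        T decode(ByteBuffer src);
    }

    /** Order-preserving int: flipping the sign bit makes unsigned byte order match numeric order. */
    public static final SketchType<Integer> ORDERED_INT = new SketchType<Integer>() {
        @Override public void encode(ByteBuffer dst, Integer value) {
            dst.putInt(value ^ Integer.MIN_VALUE);
        }
        @Override public Integer decode(ByteBuffer src) {
            return src.getInt() ^ Integer.MIN_VALUE;
        }
    };

    /** Struct-like composite (assumed shape): members are encoded back to back, in order. */
    public static <A, B> SketchType<Map.Entry<A, B>> pairOf(final SketchType<A> first,
                                                            final SketchType<B> second) {
        return new SketchType<Map.Entry<A, B>>() {
            @Override public void encode(ByteBuffer dst, Map.Entry<A, B> value) {
                first.encode(dst, value.getKey());
                second.encode(dst, value.getValue());
            }
            @Override public Map.Entry<A, B> decode(ByteBuffer src) {
                A a = first.decode(src);
                B b = second.decode(src);
                return new AbstractMap.SimpleImmutableEntry<>(a, b);
            }
        };
    }

    /** Encodes a value to a byte[]; the fixed 64-byte scratch buffer is a simplification. */
    public static <T> byte[] toBytes(SketchType<T> type, T value) {
        ByteBuffer buf = ByteBuffer.allocate(64);
        type.encode(buf, value);
        buf.flip();
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        return out;
    }

    /** Lexicographic comparison treating bytes as unsigned, the way rowkeys are sorted. */
    public static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xff, b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        SketchType<Map.Entry<Integer, Integer>> key = pairOf(ORDERED_INT, ORDERED_INT);
        Map.Entry<Integer, Integer> lowKey = new AbstractMap.SimpleImmutableEntry<>(-5, 100);
        Map.Entry<Integer, Integer> highKey = new AbstractMap.SimpleImmutableEntry<>(3, -100);
        byte[] low = toBytes(key, lowKey);
        byte[] high = toBytes(key, highKey);
        // The same bytes could be used as a rowkey, a qualifier, or a cell value.
        System.out.println(compareUnsigned(low, high) < 0);
        System.out.println(key.decode(ByteBuffer.wrap(low)));
    }
}
```

Nothing in the sketch knows about tables, column families, or what the two integers mean to the application; it only fixes how a value becomes bytes and back, which is the scope I am arguing for.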
My trouble with using the word "schema" for key-values is that that context is
too narrow a scope. Being able to consistently read a value out of a cell does not
tell me what the schema of the database is. HBase provides basic *table*
definition management but not *data* definition management, the effective
meaning of schema. Phoenix and Kiji both provide a layer of schema management
on top of HBase. Through them you define the logical layout of data in tables,
and you leave it to them to decide how that data is physically arranged and encoded.
{{HDataType}} provides an API with which its user can control how data is
physically arranged and encoded. Its user is still left to manage the logical
layout and its meaning to their application for themselves.
This patch is not schema management. It provides a common set of primitives
that other applications can consume -- be they user applications developed
directly against HBase, or Phoenix or Kiji themselves. The consumers I've
had in mind have always been myself and application developers like me, Hive,
Pig, and Phoenix. The primary benefit is that all those applications gain
some level of interoperability through data in HBase. That I was able to read
Kiji's avdl file and, in an afternoon, understand how {{HDataType}} could be used
to make its implementation simpler and more extensible is validation of its utility.
> Implement extensible type API based on serialization primitives
> ---------------------------------------------------------------
>
> Key: HBASE-8693
> URL: https://issues.apache.org/jira/browse/HBASE-8693
> Project: HBase
> Issue Type: Sub-task
> Components: Client
> Reporter: Nick Dimiduk
> Assignee: Nick Dimiduk
> Fix For: 0.95.2
>
> Attachments: 0001-HBASE-8693-Extensible-data-types-API.patch,
> 0001-HBASE-8693-Extensible-data-types-API.patch,
> 0001-HBASE-8693-Extensible-data-types-API.patch,
> 0002-HBASE-8693-example-Use-DataType-API-to-build-regionN.patch,
> KijiFormattedEntityId.java
>
>