[ 
https://issues.apache.org/jira/browse/HBASE-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13713834#comment-13713834
 ] 

Nick Dimiduk commented on HBASE-8693:
-------------------------------------

This {{HDataType}} interface and the two codecs upon which the implementations 
rely are not schema management for HBase. {{HDataType}} can be used to manage 
encoding values into rowkeys, column qualifiers, or values. Use an instance of 
{{Struct}}, or don't, in any of those contexts. The use of {{Struct}} in the 
order-sensitive context has driven more design thought, but it generates a 
{{byte[]}} wherever it's used. Would an example of an Avro, Thrift, or 
Protobuf {{HDataType}} implementation help to drive this idea home?
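
For illustration, here is a minimal sketch of the idea in plain Java. The names 
{{SimpleDataType}}, {{SimpleStruct}}, and {{ORDERED_INT}} are hypothetical 
stand-ins, not the {{HDataType}} or {{Struct}} API from the patch. The 
sign-bit-flipped integer encoding is one assumed way to get an order-preserving 
{{byte[]}}. The point is only that a type, or a struct of types, turns a value 
into bytes that can be dropped into a rowkey, a qualifier, or a cell value.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical sketch of the idea; NOT the HDataType API from the patch. */
final class DataTypeSketch {

  /** A type knows how to write a value to bytes and read it back. */
  interface SimpleDataType<T> {
    int encodedLength(T value);
    void encode(ByteBuffer dst, T value);
    T decode(ByteBuffer src);
  }

  /** 4-byte big-endian int with the sign bit flipped, so that unsigned
   *  byte-wise comparison of encodings matches numeric order (an assumption
   *  made for this sketch). */
  static final SimpleDataType<Integer> ORDERED_INT = new SimpleDataType<Integer>() {
    public int encodedLength(Integer value) { return 4; }
    public void encode(ByteBuffer dst, Integer value) { dst.putInt(value ^ Integer.MIN_VALUE); }
    public Integer decode(ByteBuffer src) { return src.getInt() ^ Integer.MIN_VALUE; }
  };

  /** Concatenates the encodings of its member types, in declaration order. */
  static final class SimpleStruct implements SimpleDataType<Object[]> {
    private final SimpleDataType<?>[] fields;

    SimpleStruct(SimpleDataType<?>... fields) { this.fields = fields; }

    @SuppressWarnings("unchecked")
    private static SimpleDataType<Object> untyped(SimpleDataType<?> t) {
      return (SimpleDataType<Object>) t;
    }

    public int encodedLength(Object[] values) {
      int total = 0;
      for (int i = 0; i < fields.length; i++) total += untyped(fields[i]).encodedLength(values[i]);
      return total;
    }

    public void encode(ByteBuffer dst, Object[] values) {
      for (int i = 0; i < fields.length; i++) untyped(fields[i]).encode(dst, values[i]);
    }

    public Object[] decode(ByteBuffer src) {
      Object[] out = new Object[fields.length];
      for (int i = 0; i < fields.length; i++) out[i] = fields[i].decode(src);
      return out;
    }
  }

  /** Encodes a value into a fresh byte[] sized by the type itself. */
  static <T> byte[] toBytes(SimpleDataType<T> type, T value) {
    ByteBuffer buf = ByteBuffer.allocate(type.encodedLength(value));
    type.encode(buf, value);
    return buf.array();
  }

  /** Unsigned lexicographic comparison, the way byte[] rowkeys are sorted. */
  static int compareBytes(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int d = (a[i] & 0xff) - (b[i] & 0xff);
      if (d != 0) return d;
    }
    return a.length - b.length;
  }

  public static void main(String[] args) {
    SimpleStruct rowkey = new SimpleStruct(ORDERED_INT, ORDERED_INT);
    byte[] k1 = toBytes(rowkey, new Object[] {-5, 10});
    byte[] k2 = toBytes(rowkey, new Object[] {3, 1});
    System.out.println(compareBytes(k1, k2) < 0);  // k1 sorts before k2
    System.out.println(Arrays.toString(rowkey.decode(ByteBuffer.wrap(k1))));
  }
}
```

Nothing here says what the two integers mean or which table they live in. That 
part stays with the application or with a layer above.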

My trouble with using the word "schema" for key-values is that this context is 
too narrow a scope. Being able to consistently read a value out of a cell does 
not tell me what the schema of the database is. HBase provides basic *table* 
definition management but not *data* definition management, which is the 
effective meaning of schema. Phoenix and Kiji both provide a layer of schema 
management 
on top of HBase. Through them you define the logical layout of data in tables, 
and you abandon to them how that data is physically arranged and encoded. 
{{HDataType}} provides an API with which its user can control how data is 
physically arranged and encoded. Its user is still left to manage the logical 
layout and its meaning to their application for themselves.

This patch is not schema management. It provides a common set of primitives 
that other applications can consume -- be they user applications developed 
directly against HBase or Phoenix or Kiji themselves. The consumers I've always 
had in mind are myself and application developers like me, Hive, Pig, and 
Phoenix. The primary benefit is that all those applications gain some level of 
interoperability through data in HBase. That I was able to read Kiji's avdl 
file and in an afternoon understand how {{HDataType}} could be used to make 
its implementation simpler and more extensible is validation of its utility.
                
> Implement extensible type API based on serialization primitives
> ---------------------------------------------------------------
>
>                 Key: HBASE-8693
>                 URL: https://issues.apache.org/jira/browse/HBASE-8693
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Client
>            Reporter: Nick Dimiduk
>            Assignee: Nick Dimiduk
>             Fix For: 0.95.2
>
>         Attachments: 0001-HBASE-8693-Extensible-data-types-API.patch, 
> 0001-HBASE-8693-Extensible-data-types-API.patch, 
> 0001-HBASE-8693-Extensible-data-types-API.patch, 
> 0002-HBASE-8693-example-Use-DataType-API-to-build-regionN.patch, 
> KijiFormattedEntityId.java
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
