[ https://issues.apache.org/jira/browse/HBASE-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13713834#comment-13713834 ]
Nick Dimiduk commented on HBASE-8693:
-------------------------------------

This {{HDataType}} interface and the two codecs upon which the implementations rely are not schema management for HBase. {{HDataType}} can be used to manage encoding values into rowkeys, column qualifiers, or values. Use an instance of {{Struct}}, or don't, in any of those contexts. The use of {{Struct}} in the order-sensitive context has driven more design thought, but it generates a {{byte[]}} wherever it's used. Would an example of an Avro, Thrift, or Protobuf {{HDataType}} implementation help to drive this idea home?

My trouble with using the word "schema" for key-values is that this context is too narrow in scope. Being able to consistently read a value out of a cell does not tell me what the schema of the database is. HBase provides basic *table* definition management but not *data* definition management, which is the effective meaning of schema. Phoenix and Kiji both provide a layer of schema management on top of HBase. Through them you define the logical layout of data in tables, and you leave to them how that data is physically arranged and encoded. {{HDataType}} provides an API with which its user can control how data is physically arranged and encoded. Its user is still left to manage the logical layout, and its meaning to their application, for themselves.

This patch is not schema management. It provides a common set of primitives that other applications can consume, be they user applications developed directly against HBase, or Phoenix or Kiji themselves. The consumers I've had in mind have always been myself and application developers like me, Hive, Pig, and Phoenix. The primary benefit is that all those applications gain some level of interoperability through data in HBase. That I was able to read Kiji's avdl file and in an afternoon understand how {{HDataType}} could be used to make its implementation simpler and more extensible is validation of its utility.
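
To make the "physical encoding, not schema" distinction concrete, here is a minimal sketch of the idea. It is not the interface from the attached patch: {{SketchDataType}}, {{OrderedInt32}}, {{SketchStruct}} and {{DataTypeSketch}} are hypothetical names with simplified signatures, assumed only for illustration. The sketch shows a type that turns a value into a {{byte[]}}, and a struct that concatenates its members so that the resulting rowkey sorts byte-wise in the same order as the values.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified stand-in for the HDataType idea.
// It is NOT the interface from the attached patch.
interface SketchDataType<T> {
    boolean isOrderPreserving();   // does byte order match value order?
    int encodedLength(T value);
    byte[] encode(T value);
    T decode(byte[] src, int offset);
}

// 32-bit int, big-endian with the sign bit flipped, so that unsigned
// byte-wise comparison agrees with signed integer comparison.
final class OrderedInt32 implements SketchDataType<Integer> {
    @Override public boolean isOrderPreserving() { return true; }
    @Override public int encodedLength(Integer value) { return 4; }
    @Override public byte[] encode(Integer value) {
        return ByteBuffer.allocate(4).putInt(value ^ Integer.MIN_VALUE).array();
    }
    @Override public Integer decode(byte[] src, int offset) {
        return ByteBuffer.wrap(src, offset, 4).getInt() ^ Integer.MIN_VALUE;
    }
}

// Concatenates its members in declaration order. It knows how the bytes are
// laid out, but nothing about what the members mean to the application.
final class SketchStruct implements SketchDataType<Object[]> {
    private final List<SketchDataType<?>> fields;

    SketchStruct(List<SketchDataType<?>> fields) { this.fields = fields; }

    @SuppressWarnings("unchecked")
    private static SketchDataType<Object> untyped(SketchDataType<?> t) {
        return (SketchDataType<Object>) t;
    }

    // Assumption: members are fixed-width. Variable-length members would
    // need terminators for plain concatenation to stay order-preserving.
    @Override public boolean isOrderPreserving() {
        for (SketchDataType<?> f : fields) {
            if (!f.isOrderPreserving()) return false;
        }
        return true;
    }

    @Override public int encodedLength(Object[] values) {
        int total = 0;
        for (int i = 0; i < fields.size(); i++) {
            total += untyped(fields.get(i)).encodedLength(values[i]);
        }
        return total;
    }

    @Override public byte[] encode(Object[] values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < fields.size(); i++) {
            byte[] b = untyped(fields.get(i)).encode(values[i]);
            out.write(b, 0, b.length);
        }
        return out.toByteArray();
    }

    @Override public Object[] decode(byte[] src, int offset) {
        Object[] values = new Object[fields.size()];
        int pos = offset;
        for (int i = 0; i < fields.size(); i++) {
            SketchDataType<Object> f = untyped(fields.get(i));
            values[i] = f.decode(src, pos);
            pos += f.encodedLength(values[i]);
        }
        return values;
    }
}

final class DataTypeSketch {
    // Unsigned lexicographic comparison of raw bytes.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        List<SketchDataType<?>> members = new ArrayList<>();
        members.add(new OrderedInt32());
        members.add(new OrderedInt32());
        SketchStruct rowkey = new SketchStruct(members);

        byte[] k1 = rowkey.encode(new Object[] {-5, 100});
        byte[] k2 = rowkey.encode(new Object[] {3, -1});
        System.out.println(compareBytes(k1, k2) < 0);            // true
        System.out.println(Arrays.toString(rowkey.decode(k1, 0))); // [-5, 100]
    }
}
```

Nothing in the sketch says what the two integers mean or which table they belong to. That logical layout is the part a schema layer such as Phoenix or Kiji would own.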

> Implement extensible type API based on serialization primitives
> ---------------------------------------------------------------
>
>                 Key: HBASE-8693
>                 URL: https://issues.apache.org/jira/browse/HBASE-8693
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Client
>            Reporter: Nick Dimiduk
>            Assignee: Nick Dimiduk
>             Fix For: 0.95.2
>
>         Attachments: 0001-HBASE-8693-Extensible-data-types-API.patch,
>                      0001-HBASE-8693-Extensible-data-types-API.patch,
>                      0001-HBASE-8693-Extensible-data-types-API.patch,
>                      0002-HBASE-8693-example-Use-DataType-API-to-build-regionN.patch,
>                      KijiFormattedEntityId.java
>