[jira] [Commented] (HBASE-8693) Implement extensible type API based on serialization primitives

Matteo Bertozzi (JIRA) Thu, 18 Jul 2013 16:25:36 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13713076#comment-13713076
 ]


Matteo Bertozzi commented on HBASE-8693:
----------------------------------------

Thanks for continuing to follow up on my out-of-scope questions.

Again, I think I'm focusing more on the cell-value side than on the key side. 
The key side is the one that will benefit from the ordered-byte encoding, and 
it will probably have more restrictions on evolution, since this code is 
client-side only and you have to deal with HBase's raw byte sorting.

{quote}It's quite out of scope for my purposes, but I'm curious what you think 
about the future direction with schema. I think the Phoenix and Kiji folk will 
have some good insights.{quote}

(I'll talk only about cell values here, so the ordered encoding is not 
relevant in this case.)
Say I want to write my app today with this library.
I'll start off using a Struct, and that's fine until I have to add or remove a 
field. So I can add a version/schema id, but now I have to keep all the 
schemas around and then project each cell onto the schema that I want to use.

Example:
- get row0 -> cell with schema 1
- get row1 -> cell with schema 2
- get row2 -> cell with schema 3
- Now the user/API has to handle these 3 different rows and project each one 
onto a user-provided schema to get something useful out of them.

In this case, you have to store all the schemas and provide a mapping from 
each schema to the one that the user wants.
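
To make the cost of the version-id approach concrete, here is a minimal sketch 
in Java. Everything in it is hypothetical and not part of the attached patch: 
the class name, the SCHEMAS registry, the field names and the project() helper 
are all made up for illustration. It assumes each cell carries a schema 
version id and that a struct is just an ordered list of values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; none of these names come from the patch. */
class VersionedStructProjection {
    // Every schema version that was ever written has to be kept around:
    // version id -> ordered field names of the struct at that version.
    static final Map<Integer, List<String>> SCHEMAS = new HashMap<>();
    static {
        SCHEMAS.put(1, Arrays.asList("name", "age"));
        SCHEMAS.put(2, Arrays.asList("name", "age", "email"));
        SCHEMAS.put(3, Arrays.asList("name", "email"));
    }

    /**
     * Projects the positional values of a cell written with writerVersion
     * onto the field order requested by readSchema. Fields that the writer
     * schema does not contain come back as null.
     */
    static List<Object> project(int writerVersion, List<Object> values,
                                List<String> readSchema) {
        List<String> writerSchema = SCHEMAS.get(writerVersion);
        if (writerSchema == null) {
            throw new IllegalArgumentException(
                "unknown schema version " + writerVersion);
        }
        List<Object> projected = new ArrayList<>();
        for (String field : readSchema) {
            int index = writerSchema.indexOf(field);
            projected.add(index >= 0 ? values.get(index) : null);
        }
        return projected;
    }

    public static void main(String[] args) {
        List<String> readSchema = Arrays.asList("name", "email");
        // row0, row1 and row2 were written with schema 1, 2 and 3.
        System.out.println(project(1, Arrays.<Object>asList("ann", 30), readSchema));
        System.out.println(project(2, Arrays.<Object>asList("bob", 41, "bob@example.org"), readSchema));
        System.out.println(project(3, Arrays.<Object>asList("cat", "cat@example.org"), readSchema));
    }
}
```

The point of the sketch is that the registry has to contain every version 
that was ever written, and a reader cannot decode a cell whose version it 
does not know.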

The other approach, more protobuf-like, is to give each field an id that must 
be unique. On read you provide your "read schema" and load only the fields 
present in that "read schema".
Note that this can also work with just an API similar to what you have, e.g. 
"getField(field_id)", where the id is the unique id and not the index.
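
A rough sketch of the field-id idea in Java, under loud assumptions: the wire 
layout ([field id: int][length: int][UTF-8 bytes], repeated), the class name 
FieldIdCell and the encode/decode helpers are all invented for this example 
and say nothing about how the patch or protobuf actually encode data. Only 
the getField(field_id) shape comes from the discussion above. Values are 
limited to strings to keep it short.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration only; the layout is an assumption. */
class FieldIdCell {
    /** Writes each field as [id:int][length:int][utf-8 bytes]. */
    static byte[] encode(Map<Integer, String> fields) {
        int size = 0;
        Map<Integer, byte[]> raw = new LinkedHashMap<>();
        for (Map.Entry<Integer, String> e : fields.entrySet()) {
            byte[] value = e.getValue().getBytes(StandardCharsets.UTF_8);
            raw.put(e.getKey(), value);
            size += 4 + 4 + value.length;
        }
        ByteBuffer out = ByteBuffer.allocate(size);
        for (Map.Entry<Integer, byte[]> e : raw.entrySet()) {
            out.putInt(e.getKey());
            out.putInt(e.getValue().length);
            out.put(e.getValue());
        }
        return out.array();
    }

    /**
     * Loads only the fields whose id is in readSchema. Ids the reader does
     * not ask for are skipped using the length prefix, so the reader never
     * needs to know the writer's full schema.
     */
    static Map<Integer, String> decode(byte[] cell, Set<Integer> readSchema) {
        Map<Integer, String> result = new LinkedHashMap<>();
        ByteBuffer in = ByteBuffer.wrap(cell);
        while (in.hasRemaining()) {
            int fieldId = in.getInt();
            int length = in.getInt();
            if (readSchema.contains(fieldId)) {
                byte[] value = new byte[length];
                in.get(value);
                result.put(fieldId, new String(value, StandardCharsets.UTF_8));
            } else {
                in.position(in.position() + length); // skip unrequested field
            }
        }
        return result;
    }

    /** Lookup by unique field id, not by position; null if absent. */
    static String getField(byte[] cell, int fieldId) {
        return decode(cell, Collections.singleton(fieldId)).get(fieldId);
    }

    public static void main(String[] args) {
        Map<Integer, String> written = new LinkedHashMap<>();
        written.put(1, "ann");
        written.put(2, "30");
        written.put(3, "ann@example.org");
        byte[] cell = encode(written);

        // The reader asks for ids 1, 3 and 4; id 4 was never written.
        Set<Integer> readSchema = new HashSet<>(Arrays.asList(1, 3, 4));
        System.out.println(decode(cell, readSchema));
        System.out.println(getField(cell, 3));
    }
}
```

With this shape, adding a field means picking a new id and removing one means 
no longer writing it; old cells stay readable without keeping a per-version 
schema registry.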

Again, I think your focus at the moment is more on the key side, and my guess 
is that the Struct is fine for that.
But this JIRA is titled "serialization primitives" without "row-keys" in 
front, so I assume you plan to use this for cell values too. From what I said 
above, I don't see an easy way to evolve my cell data without rewriting it 
every time or doing "manual" mappings for each struct version.
                
> Implement extensible type API based on serialization primitives
> ---------------------------------------------------------------
>
>                 Key: HBASE-8693
>                 URL: https://issues.apache.org/jira/browse/HBASE-8693
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Client
>            Reporter: Nick Dimiduk
>            Assignee: Nick Dimiduk
>             Fix For: 0.95.2
>
>         Attachments: 0001-HBASE-8693-Extensible-data-types-API.patch, 
> 0001-HBASE-8693-Extensible-data-types-API.patch, 
> 0001-HBASE-8693-Extensible-data-types-API.patch, 
> 0002-HBASE-8693-example-Use-DataType-API-to-build-regionN.patch, 
> KijiFormattedEntityId.java
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
