[hbase] VOTE: should row keys be less restrictive than hadoop.io.Text?
----------------------------------------------------------------------

                 Key: HADOOP-2334
                 URL: https://issues.apache.org/jira/browse/HADOOP-2334
             Project: Hadoop
          Issue Type: Wish
          Components: contrib/hbase
    Affects Versions: 0.16.0
            Reporter: Jim Kellerman
            Assignee: Jim Kellerman
            Priority: Minor
             Fix For: 0.16.0


I have heard from several people that row keys in HBase should be less 
restricted than hadoop.io.Text.

What do you think?

At the very least, a row key has to be a WritableComparable. This would lead to 
the most general case being either hadoop.io.BytesWritable or 
hbase.io.ImmutableBytesWritable. The primary difference between these two 
classes is that hadoop.io.BytesWritable by default allocates 100 bytes, and if 
you do not pay attention to the length (BytesWritable.getSize()), converting a 
String to a BytesWritable and vice versa can become problematic. 

hbase.io.ImmutableBytesWritable, in contrast, only allocates as many bytes as 
you pass in and then does not allow the size to be changed.
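
To illustrate the length pitfall, here is a minimal sketch that uses plain byte 
arrays rather than the real Hadoop or HBase classes. RowKeyBufferSketch and all 
of its methods are hypothetical names, and the 100-byte capacity is assumed 
from the figure quoted above. It only models the two allocation styles: a 
padded buffer whose valid length must be tracked separately, and an exact-size 
array.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for illustration only; not a Hadoop or HBase class.
final class RowKeyBufferSketch {

    // Assumed default capacity, taken from the 100-byte figure quoted above.
    static final int DEFAULT_CAPACITY = 100;

    // Padded style: the key is copied into a larger backing array,
    // so the array length no longer equals the key length.
    static byte[] padded(String key) {
        byte[] src = key.getBytes(StandardCharsets.UTF_8);
        byte[] buf = new byte[Math.max(DEFAULT_CAPACITY, src.length)];
        System.arraycopy(src, 0, buf, 0, src.length);
        return buf;
    }

    // Exact-size style: the array holds the key bytes and nothing else.
    static byte[] exact(String key) {
        return key.getBytes(StandardCharsets.UTF_8);
    }

    // Decodes every byte in the array, including any zero padding.
    static String decodeAll(byte[] buf) {
        return new String(buf, StandardCharsets.UTF_8);
    }

    // Decodes only the first validLength bytes.
    static String decode(byte[] buf, int validLength) {
        return new String(buf, 0, validLength, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String key = "row-0001";
        int validLength = exact(key).length;
        byte[] buf = padded(key);

        // Ignoring the valid length yields the key followed by NUL characters.
        System.out.println(decodeAll(buf).equals(key));           // false
        // Honouring the valid length recovers the key.
        System.out.println(decode(buf, validLength).equals(key)); // true
        // With an exact-size array there is no separate length to track.
        System.out.println(decodeAll(exact(key)).equals(key));    // true
    }
}
```

With the padded style, every caller has to carry the valid length alongside the 
array; with the exact-size style, the array length is the key length.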

If we were to change from Text to a non-text key, my preference would be for 
ImmutableBytesWritable, because it has a fixed size once set, and operations 
like get, etc. do not have to do something like System.arraycopy, where you 
specify the number of bytes to copy.

Your comments and questions are welcome on this issue. If we receive enough 
feedback that Text is too restrictive, we are willing to change it, but we also 
need to hear what would be the most useful thing to change it to.



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
