[jira] Commented: (HBASE-82) [hbase] VOTE: should row keys be less restrictive than hadoop.io.Text?

Kevin Beyer (JIRA) Tue, 19 Feb 2008 17:37:21 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-82?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12570531#action_12570531
 ]


Kevin Beyer commented on HBASE-82:
----------------------------------

I'm referring to org.apache.hadoop.io.WritableComparator which has these 
methods:

  /** Optimization hook.  Override this to make SequenceFile.Sorter's scream.
   *
   * <p>The default implementation reads the data into two [EMAIL PROTECTED]
   * WritableComparable}s (using [EMAIL PROTECTED]
   * Writable#readFields(DataInput)}, then calls [EMAIL PROTECTED]
   * #compare(WritableComparable,WritableComparable)}.
   */
  public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
    try {
      buffer.reset(b1, s1, l1);                   // parse key1
      key1.readFields(buffer);
      
      buffer.reset(b2, s2, l2);                   // parse key2
      key2.readFields(buffer);
      
    } catch (IOException e) {
      throw new RuntimeException(e);
    }
    
    return compare(key1, key2);                   // compare them
  }

  /** Compare two WritableComparables.
   *
   * <p> The default implementation uses the natural ordering, calling [EMAIL 
PROTECTED]
   * Comparable#compareTo(Object)}. */
  @SuppressWarnings("unchecked")
  public int compare(WritableComparable a, WritableComparable b) {
    return a.compareTo(b);
  }

The default is to deserialize and call the compare method on the 
WritableComparable,  but many WritableComparable supply their own 
WritableComparator to improve sort performance.


> [hbase] VOTE: should row keys be less restrictive than hadoop.io.Text?
> ----------------------------------------------------------------------
>
>                 Key: HBASE-82
>                 URL: https://issues.apache.org/jira/browse/HBASE-82
>             Project: Hadoop HBase
>          Issue Type: Wish
>            Reporter: Jim Kellerman
>
> I have heard from several people that row keys in HBase should be less 
> restricted than hadoop.io.Text.
> What do you think?
> At the very least, a row key has to be a WritableComparable. This would lead 
> to the most general case being either hadoop.io.BytesWritable or 
> hbase.io.ImmutableBytesWritable. The primary difference between these two 
> classes is that hadoop.io.BytesWritable by default allocates 100 bytes and if 
> you do not pay attention to the length, (BytesWritable.getSize()), converting 
> a String to a BytesWritable and vice versa can become problematic. 
> hbase.io.ImmutableBytesWritable, in contrast only allocates as many bytes as 
> you pass in and then does not allow the size to be changed.
> If we were to change from Text to a non-text key, my preference would be for 
> ImmutableBytesWritable, because it has a fixed size once set, and operations 
> like get, etc do not have to something like System.arrayCopy where you 
> specify the number of bytes to copy.
> Your comments, questions are welcome on this issue. If we receive enough 
> feedback that Text is too restrictive, we are willing to change it, but we 
> need to hear what would be the most useful thing to change it to as well.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HBASE-82) [hbase] VOTE: should row keys be less restrictive than hadoop.io.Text?

Reply via email to