[ 
https://issues.apache.org/jira/browse/HBASE-10531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932698#comment-13932698
 ] 

stack commented on HBASE-10531:
-------------------------------

So not copying is faster using YCSB?

You need to add good tests for the additions to CellComparator.

bq. +    public int compareFlatKey(Cell left, Cell right) {

What is a 'flat' key?   Why do we have a Cell compare in KV?  Should it be in 
CellUtil or CellComparator?  Or this is how KV is now?  It takes Cells?  It 
seems so.

Here too.  Why in KV are we talking Cells?

-    public int compareTimestamps(final KeyValue left, final KeyValue right) {
+    public int compareTimestamps(final Cell left, final Cell right) {

... and here:

-    public int compareRows(final KeyValue left, final KeyValue right) {
+    public int compareRows(final Cell left, final Cell right) {

That said, nice to see our use of Cell in KV compares.

But I'd say KV methods should take KVs, not Cells?  If it takes Cells, could be 
comparing a non-KV and it could actually work?  Is this what we want?  Maybe 
this is just uglyness left over from how KV has been used/abused up to this?  
But I'm thinking these compares would be Cell compares out in a CellUtil or 
CellCompare class? 

You have long lines in here Mr. [~ram_krish]

You should mark these with @VisibleForTesting

+      // Only for test case purpose

It is good that the byte compares added are not public.

Shouldn't this be unsupportedoperationexception in your new key only class?

+    public byte[] getValueArray() {
+      return HConstants.EMPTY_BYTE_ARRAY;
+    }

Same for value offset and length.

Tags too?

What is difference between KeyAloneKeyValue and a copy of the original Key but 
with no value?  I am wary introducing new type?  Or, a new type that just threw 
UnsupportedOperationException when you tried to get a value...

Why we have to create new key when we pass to a comparator?  Is it because we 
need to parse the bytes so can pass in an Object?  That makes sense. I see now 
why it is needed.  Suggest rename it as KeyOnlyKeyValue rather than KeyAlongKV. 
   Also, the inner class needs a bit of a class comment on why it is needed: 
i.e. you have a byte array but comparator wants a Cell of some type.... rather 
than parse whole KV byte buffer....  This is used in the case where we have key 
only bytes; i.e. the block index?

Is passing 0 correct in the below?

+      return comparator.compareFlatKey(key,
+          new KeyValue.KeyAloneKeyValue(current.keyBuffer, 0, 
current.keyLength));

Should it be an offset?  We do this '0' in a few places.

Yeah, what is a 'flat' key?  Is it key only?

So, this is the replacement: seekToKeyInBlock ?  Purge the old stuff!!!!

We have tests for the above?

Should this be a CellComparator rather than a KVComparator:

+
+    public int compareKey(KVComparator comparator, Cell key);

(Sorry if you answered this already).

Needs doc:  +  public static int binarySearch(byte[][] arr, Cell leftCell, 
RawComparator<Cell> comparator) {

The array of byte arrays has Cells in it or it seems KVs?    Will it always be 
arrays of byte arrays?  Or will the binary search be in a block?  Or is the 
'arr' here a block?

Formatting:

-        forceBeforeOnExactMatch);
-    }else{
+          forceBeforeOnExactMatch);
+    } else {
       return seekToOrBeforeUsingPositionAtOrAfter(keyOnlyBytes, offset, length,
-        forceBeforeOnExactMatch);
+          forceBeforeOnExactMatch);

We creating a KV where we did not before here?

-        return this.delegate.seekBefore(key, offset, length);
+        return seekBefore(new KeyValue.KeyAloneKeyValue(key, offset, length));

Or is it that I am just misreading? (It is being created elsewhere anyways)

Why we add these byte-based methods?

+      public int reseekTo(byte[] key) throws IOException {

+      public int reseekTo(byte[] key, int offset, int length)

We should let this out?

+      public org.apache.hadoop.hbase.io.hfile.HFile.Reader getReader() {

Any chance of micro benchmarks on difference between this patch and what was 
there before?

Why we adding a method that passes key as bytes?

+    public BlockWithScanInfo loadDataBlockWithScanInfo(final byte[] key, int 
keyOffset,
+        int keyLength, HFileBlock currentBlock, boolean cacheBlocks,
+        boolean pread, boolean isCompaction)
+        throws IOException {

The ByteBuffers here come from where?

+    static int locateNonRootIndexEntry(ByteBuffer nonRootBlock, Cell key, 
KVComparator comparator) {
+      int entryIndex = binarySearchNonRootIndex(key, nonRootBlock, comparator);


Yeah, why you add these byte array versions?

+    public boolean seekBefore(byte[] key, int offset, int length) throws 
IOException {
+      HFileBlock seekToBlock = 
reader.getDataBlockIndexReader().seekToDataBlock(key, offset,
+          length, block, cacheBlocks, pread, isCompaction);


Is there duplicated code between HFile2 and HFile3?

Just remove?

+  @Deprecated
   int seekTo(byte[] key) throws IOException;
+  @Deprecated
   int seekTo(byte[] key, int offset, int length) throws IOException;

This looks good:

-    int result = s.reseekTo(k.getBuffer(), k.getKeyOffset(), k.getKeyLength());
+    int result = s.reseekTo(k);

... and this:

-    int ret = scanner.reseekTo(getLastOnCol(curr).getKey());
+    int ret = scanner.reseekTo(getLastOnCol(curr));


We need this: TestNewReseekTo  ? Why not just change the current test to use 
Cells instead?

Is TestNewSeekTo a copy of the old code?  If so, just replace the byte based 
with the new Cell based?

Patch looks great otherwise.  Good on you [~ram_krish]








> Revisit how the key byte[] is passed to HFileScanner.seekTo and reseekTo
> ------------------------------------------------------------------------
>
>                 Key: HBASE-10531
>                 URL: https://issues.apache.org/jira/browse/HBASE-10531
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.99.0
>
>         Attachments: HBASE-10531.patch, HBASE-10531_1.patch, 
> HBASE-10531_2.patch
>
>
> Currently the byte[] key passed to HFileScanner.seekTo and 
> HFileScanner.reseekTo, is a combination of row, cf, qual, type and ts.  And 
> the caller forms this by using kv.getBuffer, which is actually deprecated.  
> So see how this can be achieved considering kv.getBuffer is removed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to