[ 
https://issues.apache.org/jira/browse/HBASE-18471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16129211#comment-16129211
 ] 

ramkrishna.s.vasudevan commented on HBASE-18471:
------------------------------------------------

My suspicion was correct.
Assume we have two qualifiers, qual1 and qual0.
We first do a put for qual1 and then add a DeleteFamily.
If, in the same test case, we do puts for qual0 after the DeleteFamily is added 
(instead of the empty qualifier used now), everything works fine. I think the 
simple reason is that with the sequence 
Put(qual1), DeleteFamily, Put(qual0, val0), Put(qual0, val1) the 
DeleteFamily always sorts first, because qual0 is less than qual1, so while 
scanning the DeleteFamily is always the first Cell we peek when we call 
StoreScanner#next().
But when an empty qualifier is added, the sorting follows a different pattern.
Ideally, given a cell with qualifier qual0 and a cell without a qualifier, the 
cell without a qualifier should sort first, followed by the cell with the 
qualifier.
I think that if we handle this in CellComparator#compareColumns, we should be 
able to solve this issue?
{code}
      if (lclength != 0 && rclength == 0) {
        // Left has a qualifier and right has none: right sorts lower (first).
        return 1;
      }
      if (lclength == 0 && rclength != 0) {
        // Left has no qualifier and right has one: right sorts higher (later).
        return -1;
      }
{code}
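To make the intended ordering concrete, here is a minimal standalone sketch of the check above. It does not use the HBase API: the class name QualifierOrderSketch and the method compareQualifiers are hypothetical, and the fallback byte-wise comparison is an assumption about what the rest of the method would do. It models only qualifiers, not cell types or timestamps.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical standalone sketch; not the actual CellComparator code.
public class QualifierOrderSketch {

    // Compares two qualifiers so that an empty qualifier sorts before
    // any non-empty one; otherwise falls back to unsigned byte order.
    static int compareQualifiers(byte[] left, byte[] right) {
        int lclength = left.length;
        int rclength = right.length;
        if (lclength != 0 && rclength == 0) {
            // Left has a qualifier, right has none: right sorts first.
            return 1;
        }
        if (lclength == 0 && rclength != 0) {
            // Left has no qualifier, right has one: left sorts first.
            return -1;
        }
        // Assumed fallback: unsigned lexicographic comparison.
        int n = Math.min(lclength, rclength);
        for (int i = 0; i < n; i++) {
            int a = left[i] & 0xff;
            int b = right[i] & 0xff;
            if (a != b) {
                return a - b;
            }
        }
        return lclength - rclength;
    }

    public static void main(String[] args) {
        List<byte[]> qualifiers = new ArrayList<>();
        qualifiers.add("qual1".getBytes(StandardCharsets.UTF_8));
        qualifiers.add("qual0".getBytes(StandardCharsets.UTF_8));
        qualifiers.add(new byte[0]); // the empty qualifier
        qualifiers.sort(QualifierOrderSketch::compareQualifiers);
        for (byte[] q : qualifiers) {
            System.out.println("'" + new String(q, StandardCharsets.UTF_8) + "'");
        }
        // prints '', 'qual0', 'qual1' (one per line)
    }
}
```

In this isolated sketch the byte-wise fallback alone would already put the empty qualifier first, so the explicit length checks only make the rule visible. Whether they change anything in the real comparator depends on how it combines qualifier, timestamp and type.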
What do you think [~chia7712]?

> The DeleteFamily cell is skipped when StoreScanner seeks to next column
> -----------------------------------------------------------------------
>
>                 Key: HBASE-18471
>                 URL: https://issues.apache.org/jira/browse/HBASE-18471
>             Project: HBase
>          Issue Type: Bug
>          Components: Deletes, hbase, scan
>    Affects Versions: 3.0.0, 1.3.0, 1.3.1, 2.0.0-alpha-1
>            Reporter: Thomas Martens
>            Assignee: Chia-Ping Tsai
>            Priority: Critical
>             Fix For: 2.0.0, 1.4.0, 1.3.2, 1.5.0, 1.2.7
>
>         Attachments: HBASE-18471.branch-1.2.v0.patch, HBASE-18471.v0.patch, 
> HBASE-18471.v1.patch, HBaseDmlTest.java
>
>
> The qualifier of a deleted row (with keep deleted cells true) re-appears 
> after re-inserting the same row multiple times (with different timestamp) 
> with an empty qualifier.
> Scenario:
> # Put row with family and qualifier (timestamp 1).
> # Delete entire row (timestamp 2).
> # Put same row again with family without qualifier (timestamp 3).
> A scan (latest version) returns the row with family without qualifier, 
> version 3 (which is correct).
> # Put the same row again with family without qualifier (timestamp 4).
> A scan (latest version) returns multiple rows:
> * the row with family without qualifier, version 4 (which is correct).
> * the row with family with qualifier, version 1 (which is wrong).
> There is a test scenario attached.
> output:
> <LOG> 13:42:53,952 [main] client.HBaseAdmin - Started disable of test_dml
> <LOG> 13:42:55,801 [main] client.HBaseAdmin - Disabled test_dml
> <LOG> 13:42:57,256 [main] client.HBaseAdmin - Deleted test_dml
> <LOG> 13:42:58,592 [main] client.HBaseAdmin - Created test_dml
> Put row: 'myRow' with family: 'myFamily' with qualifier: 'myQualifier' with 
> timestamp: '1'
> Scan printout =>
>   Row: 'myRow', Timestamp: '1', Family: 'myFamily', Qualifier: 'myQualifier', 
> Value: 'myValue'
> Delete row: 'myRow'
> Scan printout =>
> Put row: 'myRow' with family: 'myFamily' with qualifier: 'null' with 
> timestamp: '3'
> Scan printout =>
>   Row: 'myRow', Timestamp: '3', Family: 'myFamily', Qualifier: '', Value: 
> 'myValue'
> Put row: 'myRow' with family: 'myFamily' with qualifier: 'null' with 
> timestamp: '4'
> Scan printout =>
>   Row: 'myRow', Timestamp: '4', Family: 'myFamily', Qualifier: '', Value: 
> 'myValue'
>   {color:red}Row: 'myRow', Timestamp: '1', Family: 'myFamily', Qualifier: 
> 'myQualifier', Value: 'myValue'{color}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
