[ https://issues.apache.org/jira/browse/HBASE-18471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16129975#comment-16129975 ]
Chia-Ping Tsai commented on HBASE-18471: ---------------------------------------- bq. why a normal kv with empty column sorted before delete mark {code:title=CellComparator#compareWithoutRow} ... diff = compareTimestamps(left, right); if (diff != 0) return diff; // Compare types. Let the delete types sort ahead of puts; i.e. types // of higher numbers sort before those of lesser numbers. Maximum (255) // appears ahead of everything, and minimum (0) appears after // everything. return (0xff & right.getTypeByte()) - (0xff & left.getTypeByte()); {code} We compare the TS first, so normal kv with empty column can be sorted before delete mark if the kv's ts is larger(newer) than delete mark. > The DeleteFamily cell is skipped when StoreScanner seeks to next column > ----------------------------------------------------------------------- > > Key: HBASE-18471 > URL: https://issues.apache.org/jira/browse/HBASE-18471 > Project: HBase > Issue Type: Bug > Components: Deletes, hbase, scan > Affects Versions: 3.0.0, 1.3.0, 1.3.1, 2.0.0-alpha-1 > Reporter: Thomas Martens > Assignee: Chia-Ping Tsai > Priority: Critical > Fix For: 2.0.0, 1.4.0, 1.3.2, 1.5.0, 1.2.7 > > Attachments: HBASE-18471.branch-1.2.v0.patch, HBASE-18471.v0.patch, > HBASE-18471.v1.patch, HBaseDmlTest.java > > > The qualifier of a deleted row (with keep deleted cells true) re-appears > after re-inserting the same row multiple times (with different timestamp) > with an empty qualifier. > Scenario: > # Put row with family and qualifier (timestamp 1). > # Delete entire row (timestamp 2). > # Put same row again with family without qualifier (timestamp 3). > A scan (latest version) returns the row with family without qualifier, > version 3 (which is correct). > # Put the same row again with family without qualifier (timestamp 4). > A scan (latest version) returns multiple rows: > * the row with family without qualifier, version 4 (which is correct). > * the row with family with qualifier, version 1 (which is wrong). > There is a test scenario attached. > output: > <LOG> 13:42:53,952 [main] client.HBaseAdmin - Started disable of test_dml > <LOG> 13:42:55,801 [main] client.HBaseAdmin - Disabled test_dml > <LOG> 13:42:57,256 [main] client.HBaseAdmin - Deleted test_dml > <LOG> 13:42:58,592 [main] client.HBaseAdmin - Created test_dml > Put row: 'myRow' with family: 'myFamily' with qualifier: 'myQualifier' with > timestamp: '1' > Scan printout => > Row: 'myRow', Timestamp: '1', Family: 'myFamily', Qualifier: 'myQualifier', > Value: 'myValue' > Delete row: 'myRow' > Scan printout => > Put row: 'myRow' with family: 'myFamily' with qualifier: 'null' with > timestamp: '3' > Scan printout => > Row: 'myRow', Timestamp: '3', Family: 'myFamily', Qualifier: '', Value: > 'myValue' > Put row: 'myRow' with family: 'myFamily' with qualifier: 'null' with > timestamp: '4' > Scan printout => > Row: 'myRow', Timestamp: '4', Family: 'myFamily', Qualifier: '', Value: > 'myValue' > {color:red}Row: 'myRow', Timestamp: '1', Family: 'myFamily', Qualifier: > 'myQualifier', Value: 'myValue'{color} -- This message was sent by Atlassian JIRA (v6.4.14#64029)