[ 
https://issues.apache.org/jira/browse/HBASE-15487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202745#comment-15202745
 ] 

Mathias Herberts commented on HBASE-15487:
------------------------------------------

From thinking more about it, I guess the problem is probably related more to 
the way 'VERSIONS' is enforced than to the delete operation itself.

Setting 'VERSIONS' at table creation time is expected to cap the number of 
versions retained for each cell. For the purpose of this explanation, assume 
'VERSIONS' was set to 1.

The 'VERSIONS' parameter seems to be enforced at Scan time, even when the data 
being scanned is still in the memstore: a Scan that requests more than 
'VERSIONS' versions won't return more than 'VERSIONS' versions of the cell.

When issuing a Delete against a cell which was written more than 'VERSIONS' 
times, one would expect the deletion to remove all versions of the cell, since 
no versions past the last one will be retained at compaction time.

But it seems that only the last version is deleted, and the one prior to it 
then becomes visible again even though it was not visible before the delete (a 
Scan with setMaxVersions() would not return it).
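
To make the suspected mechanism concrete, here is a minimal toy model in plain 
Java. It is not HBase code and uses no HBase API: the class ToyCell and its 
methods are hypothetical names invented for this illustration. It assumes, as 
guessed above, that every write is kept in memory, that the 'VERSIONS' limit is 
applied only when reading, and that a delete of type VERSION affects only the 
newest version. Real HBase writes a delete marker rather than removing data; 
the sketch simplifies that to a plain removal.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Toy model only: NOT HBase code. All names here are hypothetical.
final class ToyCell {
    // Stands in for the column family's 'VERSIONS' setting.
    private final int maxVersions;
    // timestamp -> value, ordered newest first. Nothing is pruned on write
    // (assumption: pruning only happens later, e.g. at compaction time).
    private final TreeMap<Long, Integer> versions =
            new TreeMap<>(Comparator.reverseOrder());

    ToyCell(int maxVersions) {
        this.maxVersions = maxVersions;
    }

    void put(long timestamp, int value) {
        versions.put(timestamp, value);
    }

    // Read-time enforcement: never return more than maxVersions values,
    // whatever the caller asks for.
    List<Integer> scan(int requestedVersions) {
        int limit = Math.min(requestedVersions, maxVersions);
        List<Integer> out = new ArrayList<>();
        for (Integer value : versions.values()) {
            if (out.size() == limit) {
                break;
            }
            out.add(value);
        }
        return out;
    }

    // Models a delete of type VERSION: only the newest stored version goes
    // away. Returns the number of versions removed.
    int deleteLatestVersion() {
        if (versions.isEmpty()) {
            return 0;
        }
        versions.pollFirstEntry();
        return 1;
    }

    public static void main(String[] args) {
        ToyCell cell = new ToyCell(1);   // VERSIONS => 1
        cell.put(1L, 0x0);               // first write
        cell.put(2L, 0x1);               // second write to the same cell
        System.out.println("before delete: " + cell.scan(Integer.MAX_VALUE));
        System.out.println("deleted: " + cell.deleteLatestVersion());
        System.out.println("after delete: " + cell.scan(Integer.MAX_VALUE));
    }
}
```

Under these assumptions the older value 0x0, which the scan hid before the 
delete, is the one returned afterwards. That matches the symptom reported in 
the issue, but it only shows that the guess is consistent with the symptom, 
not that HBase is actually implemented this way.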


> Deletions done via BulkDeleteEndpoint make past data re-appear
> --------------------------------------------------------------
>
>                 Key: HBASE-15487
>                 URL: https://issues.apache.org/jira/browse/HBASE-15487
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.0.3
>            Reporter: Mathias Herberts
>         Attachments: HBaseTest.java, HBaseTest.java
>
>
> The Warp10 (www.warp10.io) time series database uses HBase as its underlying 
> data store. The deletion of ranges of cells is performed using the 
> BulkDeleteEndpoint.
> In the following scenario the deletion does not appear to be working properly:
> The table 't' is created with a single version using:
> create 't', {NAME => 'v', DATA_BLOCK_ENCODING => 'FAST_DIFF', BLOOMFILTER => 
> 'NONE', REPLICATION_SCOPE => '0', VERSIONS=> '1', MIN_VERSIONS => '0', TTL => 
> '2147483647', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY 
> =>'false', BLOCKCACHE => 'true'}
> We write a cell at row '0x00', colfam 'v', colq '', value 0x0
> We write the same cell again with value 0x1
> A scan will return a single value 0x1
> We then perform a delete using the BulkDeleteEndpoint and a Scan with a 
> DeleteType of 'VERSION'
> The reported number of deleted versions is 1 (which is coherent given the 
> table was created with MAX_VERSIONS=1)
> The same scan as the one performed before the delete returns a single value 
> 0x0.
> This seems to happen when all operations are performed against the memstore.
> A regular delete will remove the cell and a later scan won't show it.
> I'll attach a test which demonstrates the problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
