Gabriel Reid created PHOENIX-1108:
-------------------------------------

             Summary: Clarify, verify, and document intended behavior from 
using HColumnDescriptor.KEEP_DELETED_CELLS
                 Key: PHOENIX-1108
                 URL: https://issues.apache.org/jira/browse/PHOENIX-1108
             Project: Phoenix
          Issue Type: Improvement
            Reporter: Gabriel Reid
            Assignee: Gabriel Reid


The current default for all Phoenix tables is to enable the KEEP_DELETED_CELLS 
flag on all column families. The rationale for this default should be 
reviewed, and it should be verified that it works as intended (particularly 
with the ChunkedResultIterator, which uses multiple scans).

The general idea of the KEEP_DELETED_CELLS flag is that it prevents deleted 
cells from being permanently removed during a (major) compaction. If the number 
of versions to keep for a cell is small (3 is the default) then this won’t 
cause a major problem, and it might be needed in order to function correctly 
(i.e. to handle deletes and a major compaction occurring while a query is being 
run).

On the other hand, if the number of versions to keep for a column family is 
large (e.g. Integer.MAX_VALUE), the default of KEEP_DELETED_CELLS=true will 
mean that a delete in Phoenix never actually deletes data.
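
The interaction between the version limit and the flag can be sketched with a 
small toy model. This is not HBase or Phoenix code: the class 
KeepDeletedCellsModel and the method majorCompact are hypothetical names, and 
the model assumes a single cell, a single delete marker covering every version 
at or before its timestamp, no TTL, and that deleted versions count toward the 
version limit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model only (assumption): it mimics the retention rule described above
// and does not call or reproduce any HBase or Phoenix API.
public class KeepDeletedCellsModel {

    /**
     * Simulates a major compaction over the versions of one cell.
     *
     * @param versionTimestamps timestamps of the written versions, in any order
     * @param deleteTimestamp   timestamp of a delete marker that covers every
     *                          version at or before it, or -1 for no delete
     * @param maxVersions       number of versions the column family keeps
     * @param keepDeletedCells  modelled value of the KEEP_DELETED_CELLS flag
     * @return timestamps of the versions that survive, newest first
     */
    public static List<Long> majorCompact(List<Long> versionTimestamps,
                                          long deleteTimestamp,
                                          int maxVersions,
                                          boolean keepDeletedCells) {
        List<Long> newestFirst = new ArrayList<>(versionTimestamps);
        newestFirst.sort(Collections.reverseOrder());

        List<Long> survivors = new ArrayList<>();
        for (long ts : newestFirst) {
            boolean deleted = deleteTimestamp >= 0 && ts <= deleteTimestamp;
            if (deleted && !keepDeletedCells) {
                continue; // without the flag, deleted versions are dropped
            }
            if (survivors.size() >= maxVersions) {
                break; // versions beyond the limit are dropped either way
            }
            survivors.add(ts);
        }
        return survivors;
    }

    public static void main(String[] args) {
        List<Long> puts = Arrays.asList(10L, 20L, 30L, 40L, 50L);
        long deleteAt = 60L; // the delete is issued after all five writes

        // Flag off: the delete removes everything.
        System.out.println(majorCompact(puts, deleteAt, 3, false));
        // Flag on, 3 versions: only the newest 3 deleted versions linger.
        System.out.println(majorCompact(puts, deleteAt, 3, true));
        // Flag on, unbounded versions: nothing is ever removed.
        System.out.println(majorCompact(puts, deleteAt, Integer.MAX_VALUE, true));
    }
}
```

Under these assumptions, the retained data is bounded by the version limit 
when the flag is on, which is why a limit of Integer.MAX_VALUE leaves every 
deleted version in place.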

Tasks to be performed are:
* clear up (and document) the intended behavior of using 
KEEP_DELETED_CELLS=true as a default in Phoenix
* add tests to verify that this intended behavior still works with the 
ChunkedResultIterator
* document the implications and/or workaround if a large number of versions is 
configured for a column family



--
This message was sent by Atlassian JIRA
(v6.2#6252)
