[ https://issues.apache.org/jira/browse/PHOENIX-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230830#comment-14230830 ]
James Taylor commented on PHOENIX-1498:
---------------------------------------

In theory, it takes two major compactions for delete markers to go away with KEEP_DELETED_CELLS=true (assuming there are no earlier cell values). Just curious if you've tried this?

If we set KEEP_DELETED_CELLS=false by default, shouldn't we also set VERSIONS=1 by default? I believe for backup and restore, users may want KEEP_DELETED_CELLS=true. [~lhofhansl] has been doing some work in this area lately.

> Turn KEEP_DELETED_CELLS off by default
> --------------------------------------
>
>                 Key: PHOENIX-1498
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1498
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.0.0, 5.0.0
>            Reporter: Jeffrey Zhong
>
> Phoenix tables are created with KEEP_DELETED_CELLS enabled by default. This
> is only needed for flashback queries to work correctly. Flashback queries
> aren't used often in the field, and we found that query performance degraded
> with the option on. This is likely an HBase scan issue, though (will create
> a JIRA once we have more info).
> Either way, keeping deleted cells adds a performance penalty and the feature
> isn't used often, so I'm suggesting we turn it off by default.
> We have a test where a table is loaded with > 5m rows and then some are
> deleted/reinserted. COUNT(*) performance got worse and worse:
> {code}
> +------------+
> |  COUNT(1)  |
> +------------+
> | 5078242    |
> +------------+
> 1 row selected (33.273 seconds)
> +------------+
> |  COUNT(1)  |
> +------------+
> | 5078242    |
> +------------+
> 1 row selected (174.771 seconds)
> +------------+
> |  COUNT(1)  |
> +------------+
> | 5078242    |
> +------------+
> 1 row selected (458.251 seconds)
> {code}
> I think we can provide a table property in the CREATE TABLE and ALTER TABLE
> statements so people can enable KEEP_DELETED_CELLS if there is a need, but by
> default it should be turned off.
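
For illustration, here is a minimal sketch of what setting these options per table might look like. It assumes Phoenix passes HBase column family properties such as KEEP_DELETED_CELLS and VERSIONS through in CREATE TABLE and in ALTER TABLE ... SET; whether that holds depends on the Phoenix version in use. The table name EXAMPLE_TABLE and its columns are hypothetical and do not come from the issue.

```sql
-- Hypothetical table; the name and columns are for illustration only.
-- Assumes Phoenix passes these HBase column family properties through.
CREATE TABLE EXAMPLE_TABLE (
    ID BIGINT NOT NULL PRIMARY KEY,
    VAL VARCHAR
) KEEP_DELETED_CELLS=false, VERSIONS=1;

-- Turn deleted-cell retention back on for a table that needs it,
-- for example for flashback queries or backup and restore.
ALTER TABLE EXAMPLE_TABLE SET KEEP_DELETED_CELLS=true;
```

With deleted cells kept only on the tables that need them, other tables would avoid the extra delete markers that appear to slow down the COUNT(*) scans reported in the issue.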