Jeffrey Zhong created PHOENIX-1498:
--------------------------------------

             Summary: Turn KEEP_DELETED_CELLS off by default
                 Key: PHOENIX-1498
                 URL: https://issues.apache.org/jira/browse/PHOENIX-1498
             Project: Phoenix
          Issue Type: Bug
    Affects Versions: 4.0.0, 5.0.0
            Reporter: Jeffrey Zhong


Phoenix tables are created with "KEEP_DELETED_CELLS" enabled by default. The 
option is only needed for flashback queries to work correctly. Flashback 
queries aren't used often in the field, and we found that query performance 
degrades with the option on. This is likely an HBase scan issue, though (I will 
create a JIRA once I have more info). 

In any case, keeping deleted cells adds a performance penalty, and the feature 
is rarely used. Therefore, I suggest turning it off by default. 

We have a test where a table is loaded with > 5m rows and then some rows are 
deleted and reinserted. COUNT(*) performance got progressively worse:

{code}
+------------+
|  COUNT(1)  |
+------------+
| 5078242    |
+------------+
1 row selected (33.273 seconds)

+------------+
|  COUNT(1)  |
+------------+
| 5078242    |
+------------+
1 row selected (174.771 seconds)

+------------+
|  COUNT(1)  |
+------------+
| 5078242    |
+------------+
1 row selected (458.251 seconds)
{code}
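
The exact schema and statements used in the test are not given here. A minimal 
sketch of this kind of delete/reinsert workload, with a hypothetical table name 
and columns (PERF_TEST, ID, VAL) chosen purely for illustration, might look like:

{code}
-- Hypothetical table; the real test schema is not shown in this issue.
CREATE TABLE IF NOT EXISTS PERF_TEST (
    ID  BIGINT NOT NULL PRIMARY KEY,
    VAL VARCHAR
);

-- After loading the table, delete a range of rows...
DELETE FROM PERF_TEST WHERE ID < 100000;

-- ...and re-insert rows with the same keys (one shown here).
UPSERT INTO PERF_TEST (ID, VAL) VALUES (1, 'reloaded');

-- With KEEP_DELETED_CELLS on, the delete markers and the deleted cells
-- stay in the store, so each delete/reinsert cycle leaves more data
-- for this scan to read.
SELECT COUNT(1) FROM PERF_TEST;
{code}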

I think we can provide a table property in the CREATE TABLE and ALTER TABLE 
statements so that people can enable KEEP_DELETED_CELLS when they need it, but 
by default it should be turned off.
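
As a rough sketch of the proposal above, opting in per table could look 
something like the following. The table name (EVENTS) and its columns are 
hypothetical, and the property syntax shown is an assumption about how the 
option might be exposed, not a confirmed final form:

{code}
-- Assumed syntax: enable the option explicitly when creating a table
-- that needs flashback queries.
CREATE TABLE EVENTS (
    ID  BIGINT NOT NULL PRIMARY KEY,
    VAL VARCHAR
) KEEP_DELETED_CELLS=true;

-- Assumed syntax: switch the option off later on an existing table.
ALTER TABLE EVENTS SET KEEP_DELETED_CELLS=false;
{code}

Tables created without the property would then use the new default (off) and 
avoid the scan overhead described above.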




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)