Jeffrey Zhong created PHOENIX-1498:
--------------------------------------
Summary: Turn KEEP_DELETED_CELLS off by default
Key: PHOENIX-1498
URL: https://issues.apache.org/jira/browse/PHOENIX-1498
Project: Phoenix
Issue Type: Bug
Affects Versions: 4.0.0, 5.0.0
Reporter: Jeffrey Zhong
Phoenix tables are created with "KEEP_DELETED_CELLS" enabled by default. The
option is only needed for flashback queries to work correctly. Flashback
queries aren't used often in the field, and we found that query performance
degrades with the option on. This is likely an HBase scan issue, though (I will
create a JIRA once I have more info).
In any case, keeping deleted cells adds a performance penalty for a feature
that is rarely used. Therefore, I'm suggesting that it be turned off by default.
We have a test where a table is loaded with > 5m rows and then some rows are
deleted and reinserted. The COUNT(*) performance got progressively worse:
{code}
+------------+
|  COUNT(1)  |
+------------+
| 5078242    |
+------------+
1 row selected (33.273 seconds)
+------------+
|  COUNT(1)  |
+------------+
| 5078242    |
+------------+
1 row selected (174.771 seconds)
+------------+
|  COUNT(1)  |
+------------+
| 5078242    |
+------------+
1 row selected (458.251 seconds)
{code}
I think we can provide a table property in the CREATE TABLE and ALTER TABLE
statements so that people can enable KEEP_DELETED_CELLS if they need it, but by
default it should be turned off.
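As a rough sketch of what opting in might look like, assuming the property is
exposed under the same name as the HBase column family attribute (the table and
column names here are hypothetical and only for illustration):
{code:sql}
-- Hypothetical table; enable KEEP_DELETED_CELLS explicitly at creation time
-- for users who need flashback queries.
CREATE TABLE example_events (
    id BIGINT NOT NULL PRIMARY KEY,
    payload VARCHAR
) KEEP_DELETED_CELLS=true;

-- Or turn it on later for an existing table.
ALTER TABLE example_events SET KEEP_DELETED_CELLS=true;
{code}
Tables created without the property would then get the proposed default (off).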
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)