[ 
https://issues.apache.org/jira/browse/PHOENIX-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14241901#comment-14241901
 ] 

Hudson commented on PHOENIX-1498:
---------------------------------

SUCCESS: Integrated in Phoenix-master #514 (See 
[https://builds.apache.org/job/Phoenix-master/514/])
PHOENIX-1498: Turn KEEP_DELETED_CELLS off by default (jeffreyz: rev 
5722a4d31318cb04473fed8bf72c5202b33d1e0d)
* phoenix-core/src/main/java/org/apache/phoenix/query/QueryServices.java
* phoenix-core/src/main/java/org/apache/phoenix/query/QueryConstants.java
* phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataProtocol.java
* phoenix-core/src/it/java/org/apache/phoenix/end2end/QueryIT.java
* phoenix-core/src/it/java/org/apache/phoenix/end2end/QueryDatabaseMetaDataIT.java
* phoenix-core/src/it/java/org/apache/phoenix/end2end/BaseQueryIT.java
* phoenix-core/src/main/java/org/apache/phoenix/query/QueryServicesOptions.java
* phoenix-core/src/it/java/org/apache/phoenix/end2end/GroupByIT.java
* phoenix-core/src/main/java/org/apache/phoenix/query/ConnectionQueryServicesImpl.java


> Turn KEEP_DELETED_CELLS off by default
> --------------------------------------
>
>                 Key: PHOENIX-1498
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1498
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.0.0, 5.0.0
>            Reporter: Jeffrey Zhong
>            Assignee: Jeffrey Zhong
>             Fix For: 5.0.0, 4.3
>
>         Attachments: PHOENIX-1498-v2.patch, PHOENIX-1498-v3.patch, 
> PHOENIX-1498.patch
>
>
> Phoenix tables are created with "KEEP_DELETED_CELLS" enabled by default. The 
> option is only needed for flashback queries to work correctly. Flashback 
> queries aren't used often in the field, and we found that query performance 
> degraded with the option on. This is likely an HBase scan issue, though (I will 
> create a JIRA once I have more info). 
> Either way, keeping deleted cells adds a performance penalty for a feature 
> that is rarely used. Therefore, I'm suggesting turning it off by default. 
> We have a test where a table is loaded with > 5m rows and then some are 
> deleted/reinserted. The COUNT(*) time got progressively worse across runs, 
> from about 33 seconds to about 458 seconds for the same row count:
> {code}
> +------------+
> |  COUNT(1)  |
> +------------+
> | 5078242    |
> +------------+
> 1 row selected (33.273 seconds)
> +------------+
> |  COUNT(1)  |
> +------------+
> | 5078242    |
> +------------+
> 1 row selected (174.771 seconds)
> +------------+
> |  COUNT(1)  |
> +------------+
> | 5078242    |
> +------------+
> 1 row selected (458.251 seconds)
> {code}
> I think we can provide a table property in the CREATE TABLE and ALTER TABLE 
> statements so people can enable KEEP_DELETED_CELLS when they need it, but it 
> should be turned off by default.
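
As a rough illustration of the per-table property suggested in the description above, the sketch below builds the DDL one might issue to turn KEEP_DELETED_CELLS back on for a single table. The class name, method names, and the EVENTS table are hypothetical, and the exact property syntax accepted by CREATE TABLE and ALTER TABLE ... SET is an assumption that should be checked against the Phoenix grammar for the release in use.

```java
public class KeepDeletedCellsDdl {

    // Hypothetical helper: builds a CREATE TABLE statement that sets the
    // KEEP_DELETED_CELLS property explicitly instead of relying on the default.
    // Assumes the property is passed as a trailing NAME=value table option.
    static String createTable(String table, String columns, boolean keepDeletedCells) {
        return "CREATE TABLE " + table + " (" + columns + ") KEEP_DELETED_CELLS=" + keepDeletedCells;
    }

    // Hypothetical helper: builds an ALTER TABLE statement that changes the
    // property on an existing table. The SET form is an assumption.
    static String alterTable(String table, boolean keepDeletedCells) {
        return "ALTER TABLE " + table + " SET KEEP_DELETED_CELLS=" + keepDeletedCells;
    }

    public static void main(String[] args) {
        // Opt in at creation time for a table that needs flashback queries.
        System.out.println(createTable("EVENTS",
                "ID BIGINT NOT NULL PRIMARY KEY, VAL VARCHAR", true));
        // Opt out later once flashback queries are no longer needed.
        System.out.println(alterTable("EVENTS", false));
    }
}
```

The resulting strings would normally be run through a Phoenix JDBC connection with java.sql.Statement.execute; the sketch only builds them so it can be read and run without a cluster.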



