[ 
https://issues.apache.org/jira/browse/PHOENIX-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14237074#comment-14237074
 ] 

Jeffrey Zhong commented on PHOENIX-1498:
----------------------------------------

[~jamestaylor] Thanks for your comments. My answers are below:

{quote}
We need to set KEEP_DELETED_CELLS=true for SYSTEM.CATALOG, SYSTEM.SEQUENCE, and 
SYSTEM.STATS - just add it to the DDL statements in QueryConstants. Otherwise, 
tables that want to use KEEP_DELETED_CELLS will no longer maintain the proper 
versions for schema changes.
Tell me more about the test changes, in particular why the system tables need 
to be deleted after each test suite? If you do the above, is that no longer 
necessary?
{quote}
This is a good point. However, we would also have to set VERSIONS=unlimited; 
otherwise it won't work for some data tables anyway. IMO, it's better for users 
to set these themselves if there is a need, so I don't turn KEEP_DELETED_CELLS 
on for all tests.
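
As a rough sketch of what setting it per table could look like: the helper 
below only builds the DDL strings. The class, table, and column names are made 
up for illustration, Integer.MAX_VALUE is an assumed stand-in for "unlimited" 
versions, and whether a given Phoenix version accepts these exact statements 
(the ALTER TABLE form in particular) has not been verified here.

```java
// Hypothetical helper: builds Phoenix DDL strings that opt a single table in
// to KEEP_DELETED_CELLS. Table and column names are illustrative only.
final class KeepDeletedCellsDdl {

    // CREATE TABLE carrying the two properties discussed above.
    static String createTable(String table, boolean keepDeletedCells, int versions) {
        return "CREATE TABLE " + table
                + " (K VARCHAR PRIMARY KEY, V VARCHAR)"
                + " KEEP_DELETED_CELLS=" + keepDeletedCells
                + ", VERSIONS=" + versions;
    }

    // ALTER TABLE form for a table that already exists.
    static String alterTable(String table, boolean keepDeletedCells, int versions) {
        return "ALTER TABLE " + table
                + " SET KEEP_DELETED_CELLS=" + keepDeletedCells
                + ", VERSIONS=" + versions;
    }

    public static void main(String[] args) {
        // Assumption: Integer.MAX_VALUE stands in for "unlimited" versions.
        System.out.println(createTable("MY_TABLE", true, Integer.MAX_VALUE));
        System.out.println(alterTable("MY_TABLE", true, Integer.MAX_VALUE));
    }
}
```

The resulting strings could then be passed to java.sql.Statement#execute on a 
Phoenix JDBC connection; tables that don't need flashback queries would simply 
omit both properties.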

In addition, do we need to enable it for SYSTEM.STATS? There we should always 
use the latest info, since region splits are happening all the time.

{quote}
Also, any reason why these are all added here? I suppose they were getting set 
on the super class before?
{quote}
This is because the base class's doSetup() is overridden, so the base class 
version of the method won't be called. I overrode it because I want the test 
to use system tables with a different KEEP_DELETED_CELLS setting.

{quote}
DEFAULT_KEEP_DELETED_CELLS
{quote}
Good point. I can add this constant.

{quote}
Is it necessary to set the queue size depth here?
{quote}
This is just copied over from the base class, since we override its doSetup(). 
I could move the property to the default props.


> Turn KEEP_DELETED_CELLS off by default
> --------------------------------------
>
>                 Key: PHOENIX-1498
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1498
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.0.0, 5.0.0
>            Reporter: Jeffrey Zhong
>            Assignee: Jeffrey Zhong
>         Attachments: PHOENIX-1498-v2.patch, PHOENIX-1498.patch
>
>
> Phoenix tables are created with "KEEP_DELETED_CELLS" enabled by default. This 
> is only used to allow flashback queries to work correctly. However, flashback 
> queries aren't used often in the field, and we found that query performance 
> degraded with the option on. This is likely an HBase scan issue, though (will 
> create a JIRA once we have more info). 
> Anyway, keeping deleted cells adds a performance penalty and it's not used 
> often. Therefore, I'm suggesting turning it off by default. 
> We have a test where a table is loaded with > 5m rows and then some are 
> deleted/reinserted. The count ( * ) performance became worse & worse:
> {code}
> +------------+
> |  COUNT(1)  |
> +------------+
> | 5078242    |
> +------------+
> 1 row selected (33.273 seconds)
> +------------+
> |  COUNT(1)  |
> +------------+
> | 5078242    |
> +------------+
> 1 row selected (174.771 seconds)
> +------------+
> |  COUNT(1)  |
> +------------+
> | 5078242    |
> +------------+
> 1 row selected (458.251 seconds)
> {code}
> I think we can provide a table property in the CREATE TABLE & ALTER TABLE 
> statements for people to enable KEEP_DELETED_CELLS if there is a need, but by 
> default it should be turned off.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
