[ 
https://issues.apache.org/jira/browse/PHOENIX-4701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16448519#comment-16448519
 ] 

James Taylor edited comment on PHOENIX-4701 at 4/23/18 5:38 PM:
----------------------------------------------------------------

Also, I don't think a PK of only QUERY_ID is particularly useful. It should be 
part of the PK (for uniqueness), but at the end, as we won't ever query by 
QUERY_ID. What would be the most common query against the table? That'll drive 
what the PK should be. Perhaps a PK of (START_TIME, TOTAL_EXECUTION_TIME, 
QUERY_ID). We'd want to salt the table as well to prevent write hotspotting. 
This would let us efficiently query for queries that occurred within a given 
time range that were slow. For example:
{code:java}
SELECT * FROM SYSTEM.LOG WHERE START_TIME > CURRENT_DATE()-1.0/24.0 AND 
START_TIME < CURRENT_DATE() AND TOTAL_EXECUTION_TIME > 1000;{code}
Without a better PK, the above would be a full table scan.

Also, START_TIME should be declared as a DATE, not a TIMESTAMP. TIMESTAMP is 
nanosecond granularity which we don't need (and wouldn't capture anyway) and 
it'd cause more overhead. DATE is millisecond granularity which is what we'd 
want.
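
As a rough sketch of how the suggestions above could fit together in DDL (not the actual SYSTEM.LOG definition: the table name QUERY_LOG_SKETCH and the QUERY column are made up for illustration, the column types other than DATE are assumptions, and the bucket count is arbitrary), with the immutable/encoding options taken from the issue description:
{code:sql}
-- Hypothetical table; only START_TIME, TOTAL_EXECUTION_TIME and QUERY_ID
-- are names taken from the discussion.
CREATE TABLE QUERY_LOG_SKETCH (
    START_TIME DATE NOT NULL,              -- DATE: millisecond granularity
    TOTAL_EXECUTION_TIME BIGINT NOT NULL,  -- type is an assumption
    QUERY_ID VARCHAR NOT NULL,             -- last in the PK, for uniqueness only
    QUERY VARCHAR,                         -- illustrative non-PK column
    CONSTRAINT PK PRIMARY KEY (START_TIME, TOTAL_EXECUTION_TIME, QUERY_ID)
)
SALT_BUCKETS = 32,                         -- arbitrary; spreads time-ordered writes
IMMUTABLE_ROWS = true,
COLUMN_ENCODED_BYTES = 1,
IMMUTABLE_STORAGE_SCHEME = SINGLE_CELL_ARRAY_WITH_OFFSETS;

-- Check how the time-range query would be executed against this layout.
EXPLAIN SELECT * FROM QUERY_LOG_SKETCH
WHERE START_TIME > CURRENT_DATE() - 1.0/24.0
  AND START_TIME < CURRENT_DATE()
  AND TOTAL_EXECUTION_TIME > 1000;{code}
With START_TIME leading the PK, the plan should be able to bound the scan by the time range in each salt bucket rather than reading the whole table.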


was (Author: jamestaylor):
Also, I don't think a PK of only QUERY_ID is particularly useful. It should be 
part of the PK (for uniqueness), but at the end, as we won't ever query by 
QUERY_ID. What would be the most common query against the table? That'll drive 
what the PK should be. Perhaps a PK of (TOTAL_EXECUTION_TIME, START_TIME, 
QUERY_ID).

Also, START_TIME should be declared as a DATE, not a TIMESTAMP. TIMESTAMP is 
nanosecond granularity which we don't need (and wouldn't capture anyway) and 
it'd cause more overhead. DATE is millisecond granularity which is what we'd 
want.

> Declare SYSTEM.LOG table as immutable with compact storage format
> -----------------------------------------------------------------
>
>                 Key: PHOENIX-4701
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4701
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: James Taylor
>            Priority: Major
>             Fix For: 4.14.0, 5.0.0
>
>
> If possible, the SYSTEM.LOG table would benefit greatly  (3-5x perf gain) 
> from being declared as immutable with a column encoding of 1 byte and a 
> storage format of SINGLE_CELL_ARRAY_WITH_OFFSETS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
