[ 
https://issues.apache.org/jira/browse/PHOENIX-2666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15143998#comment-15143998
 ] 

James Taylor commented on PHOENIX-2666:
---------------------------------------

Looking at this more closely, I don't think it's a regression. The prior Phoenix 
version would use the default column family to determine the guideposts, 
essentially over-chunking the query based on the guidepost width (28 guideposts 
versus 12 now). So the prior version was throwing more than 2x the number of 
threads at it.

An interesting test (and one which PHOENIX-1312 targets) would be a query that 
targets a particular range of data that may have high skew for the column being 
queried (versus even data distribution for the default column family). In the 
prior version, Phoenix wouldn't take that into account, but instead would use 
the guideposts based on the default column family. In the new version, you'd 
get an equal distribution of work for each thread and hence better performance.
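
To make that scenario concrete, here is a toy sketch. It is not Phoenix's 
actual code; the row sizes, the 2500-byte width, and the helper names are all 
made up for illustration. It emits a guidepost every time a column family 
accumulates the guidepost width in bytes, then measures how many bytes of the 
queried column family land in each resulting chunk.

```python
from bisect import bisect_left

def guideposts(sizes, width):
    """Return the row keys at which `width` bytes have accumulated.

    `sizes[k]` is the (hypothetical) byte size of row k in one column family.
    """
    posts, acc = [], 0
    for key, size in enumerate(sizes):
        acc += size
        if acc >= width:
            posts.append(key)  # this row closes the current chunk
            acc = 0
    return posts

def chunk_bytes(sizes, posts):
    """Bytes of `sizes` that fall into each chunk delimited by `posts`."""
    chunks = [0] * (len(posts) + 1)
    for key, size in enumerate(sizes):
        # Guideposts are inclusive upper bounds, hence bisect_left.
        chunks[bisect_left(posts, key)] += size
    return chunks

# Assumed data: the default column family is evenly distributed, while the
# queried column family is heavily skewed toward the first 100 rows.
default_cf = [10] * 1000
queried_cf = [100] * 100 + [1] * 900
WIDTH = 2500  # assumed guidepost width in bytes

# Prior behaviour in this toy model: chunk by the default column family.
old_chunks = chunk_bytes(queried_cf, guideposts(default_cf, WIDTH))
# New behaviour in this toy model: chunk by the column family being queried.
new_chunks = chunk_bytes(queried_cf, guideposts(queried_cf, WIDTH))

print("chunked by default CF:", old_chunks)
print("chunked by queried CF:", new_chunks)
```

In this model, chunking by the default column family leaves one thread with 
nearly all of the skewed data, while chunking by the queried column family 
caps each chunk at the guidepost width.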

> Performance regression: Aggregate query with filter on table with multiple 
> column families
> ------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-2666
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2666
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.7.0
>            Reporter: Mujtaba Chohan
>            Assignee: Thomas D'Silva
>             Fix For: 4.7.0
>
>
> In the test, the table contains a total of 6 columns, with one column per 
> column family.
> Running a query {code}select count(*) from T where last_column < ?{code} is 
> 4x slower after the commit for PHOENIX-1312 
> (https://git-wip-us.apache.org/repos/asf?p=phoenix.git;a=commit;h=3fdaecdaaa2a2f07070df67f861252fd44e338c3)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
