[
https://issues.apache.org/jira/browse/PHOENIX-2666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15143998#comment-15143998
]
James Taylor commented on PHOENIX-2666:
---------------------------------------
Looking at this more closely, I don't think it's a regression. The prior Phoenix
version would use the default column family to determine the guideposts,
essentially over-chunking the query based on the guidepost width (28 guideposts
versus 12). So we'd be throwing more than 2x the number of threads at it.
An interesting test (and one which PHOENIX-1312 targets) would be a query that
targets a particular range of data that may have high skew for the column being
queried (versus even data distribution for the default column family). In the
prior version, Phoenix wouldn't take that into account, but instead would use
the guideposts based on the default column family. In the new version, you'd
get an equal distribution of work for each thread and hence better performance.
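
To make the skew scenario concrete, here is a toy sketch in Python. It is not
Phoenix's statistics code: the helper names, the row sizes, and the guidepost
width are all made up for illustration, and it assumes a guidepost is simply
dropped each time roughly WIDTH bytes of a column family have accumulated. It
compares how much of the queried column family each chunk has to scan when the
chunk boundaries come from the default column family versus from the queried
column family itself.

```python
def guideposts(row_sizes, width):
    """Row indexes where a guidepost falls: one each time roughly
    `width` bytes of this column family have accumulated (toy model)."""
    posts, acc = [], 0
    for i, size in enumerate(row_sizes):
        acc += size
        if acc >= width:
            posts.append(i + 1)  # chunk boundary falls after row i
            acc = 0
    return posts


def work_per_chunk(row_sizes, posts):
    """Bytes of `row_sizes` scanned by each chunk delimited by `posts`."""
    bounds = [0] + posts
    if bounds[-1] != len(row_sizes):
        bounds.append(len(row_sizes))  # trailing chunk after the last guidepost
    return [sum(row_sizes[a:b]) for a, b in zip(bounds, bounds[1:])]


# Hypothetical table of 1000 rows. The default column family is uniform
# (10 bytes per row); the queried column family is skewed (the first 100
# rows hold 100 bytes each, the remaining 900 rows hold 1 byte each).
default_cf = [10] * 1000
queried_cf = [100] * 100 + [1] * 900
WIDTH = 1000  # made-up guidepost width in bytes

# Chunk boundaries taken from the default column family.
old = work_per_chunk(queried_cf, guideposts(default_cf, WIDTH))
# Chunk boundaries taken from the queried column family.
new = work_per_chunk(queried_cf, guideposts(queried_cf, WIDTH))

print("default-CF guideposts:", len(old), "chunks, min", min(old), "max", max(old))
print("queried-CF guideposts:", len(new), "chunks, min", min(new), "max", max(new))
```

In this made-up data set, boundaries from the default column family leave one
chunk with nearly all of the skewed data, while boundaries from the queried
column family give every chunk about the same amount to scan.
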
> Performance regression: Aggregate query with filter on table with multiple
> column families
> ------------------------------------------------------------------------------------------
>
> Key: PHOENIX-2666
> URL: https://issues.apache.org/jira/browse/PHOENIX-2666
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.7.0
> Reporter: Mujtaba Chohan
> Assignee: Thomas D'Silva
> Fix For: 4.7.0
>
>
> In the test, the table contains a total of 6 columns with one column per
> column family.
> Running a query {code}select count(*) from T where last_column < ?{code} is
> 4x slower after commit for PHOENIX-1312
> (https://git-wip-us.apache.org/repos/asf?p=phoenix.git;a=commit;h=3fdaecdaaa2a2f07070df67f861252fd44e338c3)