[ https://issues.apache.org/jira/browse/PHOENIX-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14144436#comment-14144436 ]
James Taylor commented on PHOENIX-1278:
---------------------------------------
That's a good point, [~lhofhansl]. The slowdown shows up every time for a
salted table because of the merge sort among the chunks, but it would show up
for an ORDER BY or a GROUP BY too. At least that's my theory on the cause: the
merge sort is more expensive now that there are more chunks. We should verify
that. There's also overhead on the server in doing one scan versus many. How
would you characterize the difference on the server between, say, doing 10
scans over 1/10 of the data each versus 2 scans over 1/2 each?
> Performance degradation for salted tables with guideposts
> ---------------------------------------------------------
>
> Key: PHOENIX-1278
> URL: https://issues.apache.org/jira/browse/PHOENIX-1278
> Project: Phoenix
> Issue Type: Bug
> Reporter: James Taylor
> Assignee: Anoop Sam John
>
> When a table is salted, we're seeing a degradation in performance using our
> new guidepost-based parallelization. With salted tables, we do a merge sort
> with the results from all the parallel scans. I suspect the cause here is
> that we're doing a merge sort now between more chunks than before (since we
> chunk everything up more now than we used to). We should group the scans
> we're doing for the same bucket together and do a concat with those results
> and then do a merge sort only with the concatenated batches.
> Please revert PHOENIX-1279 when we implement this.
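
The grouping proposed in the description can be sketched roughly as follows.
This is a minimal illustration in Python, not Phoenix code: the function name
merge_by_bucket and the (bucket, start_key, rows) tuple layout are hypothetical,
and it assumes that each chunk's rows are already sorted and that chunks within
one salt bucket cover disjoint key ranges, so concatenating them in start-key
order keeps the bucket sorted.

```python
import heapq
from itertools import chain, groupby
from operator import itemgetter


def merge_by_bucket(chunk_results):
    """Concatenate chunks of the same bucket, then merge sort across buckets.

    chunk_results: iterable of (bucket, start_key, rows) tuples (hypothetical
    layout), where rows is a sorted list of row keys without the salt byte.
    """
    # Order chunks by bucket, then by start key within the bucket.
    ordered = sorted(chunk_results, key=itemgetter(0, 1))

    per_bucket = []
    for _, group in groupby(ordered, key=itemgetter(0)):
        # Materialize the group before groupby advances to the next bucket.
        rows_lists = [rows for _, _, rows in group]
        # Plain concatenation: no comparisons needed inside one bucket.
        per_bucket.append(chain.from_iterable(rows_lists))

    # The merge sort now has one input per bucket instead of one per chunk.
    return heapq.merge(*per_bucket)


# Two buckets, each split into two chunks by (assumed) guideposts.
chunks = [
    (0, "a", ["a", "c"]),
    (1, "b", ["b", "d"]),
    (0, "e", ["e", "g"]),
    (1, "f", ["f", "h"]),
]
merged = list(merge_by_bucket(chunks))
print(merged)
```

With this input the merge step sees two sorted streams (one per bucket) rather
than four (one per chunk), which is the reduction the description is after.
Whether that accounts for the observed slowdown still needs to be verified.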