[
https://issues.apache.org/jira/browse/PHOENIX-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14145150#comment-14145150
]
James Taylor commented on PHOENIX-1278:
---------------------------------------
bq. Do you mean 10 scans in parallel vs. 2 scans in parallel? Or just breaking
up a scan into 10 chunks vs 2 and executing them serially?
The former.
In the case we've benchmarked, it's 9 chunks versus 46 chunks. The time
difference is 2.3 sec versus 3.1 sec. We can certainly dial up the # of bytes
after which we create a guidepost. Rather than disable the setting of
guideposts for salted tables (PHOENIX-1279), I think it'd be pretty easy to
combine together our ranges in the case where we're doing a merge sort. I'll
take a stab at this. Worst case, I can just combine them in the salted case, as
that's pretty trivial.
> Performance degradation for salted tables with guideposts
> ---------------------------------------------------------
>
> Key: PHOENIX-1278
> URL: https://issues.apache.org/jira/browse/PHOENIX-1278
> Project: Phoenix
> Issue Type: Bug
> Reporter: James Taylor
> Assignee: Anoop Sam John
>
> When a table is salted, we're seeing a degradation in performance using our
> new guidepost-based parallelization. With salted tables, we do a merge sort
> with the results from all the parallel scans. I suspect the cause here is
> that we're doing a merge sort now between more chunks than before (since we
> chunk everything up more now than we used to). We should group the scans
> we're doing for the same bucket together and do a concat with those results
> and then do a merge sort only with the concatenated batches.
> Please revert PHOENIX-1279 when we implement this.
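
The grouping proposed in the description can be sketched roughly as follows. This is an illustrative Python model, not Phoenix code: the function name `merge_by_bucket`, the `(bucket_id, rows)` chunk shape, and the `(key, value)` row shape are all assumptions made for the example. It also assumes the chunks of a given bucket arrive in ascending key order and cover disjoint key ranges, which is what makes a plain concat within a bucket safe.

```python
import heapq
from itertools import chain


def merge_by_bucket(chunks):
    """Concat chunks of the same salt bucket, then merge sort across buckets.

    chunks: list of (bucket_id, rows), where rows is a list of (key, value)
    sorted by key. Hypothetical shape, chosen for illustration only.
    Assumes chunks of one bucket are listed in ascending, disjoint key order.
    """
    # Group the per-chunk results by the bucket they were scanned from.
    per_bucket = {}
    for bucket_id, rows in chunks:
        per_bucket.setdefault(bucket_id, []).append(rows)

    # Within a bucket the chunks are already in key order, so a cheap
    # concat yields one sorted stream per bucket.
    streams = [chain.from_iterable(parts) for parts in per_bucket.values()]

    # The merge sort now runs over one stream per bucket instead of one
    # stream per chunk.
    return list(heapq.merge(*streams, key=lambda row: row[0]))


# Four chunks spread over two buckets: the merge sees 2 streams, not 4.
chunks = [
    (0, [("a", 1), ("d", 2)]),
    (0, [("f", 3)]),
    (1, [("b", 4)]),
    (1, [("c", 5), ("e", 6)]),
]
merged = merge_by_bucket(chunks)
```

With 46 chunks, the merge sort in this model would be bounded by the number of salt buckets rather than the number of guidepost chunks, which is the effect the description is after.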