[ 
https://issues.apache.org/jira/browse/PHOENIX-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14145150#comment-14145150
 ] 

James Taylor commented on PHOENIX-1278:
---------------------------------------

bq. Do you mean 10 scans in parallel vs. 2 scans in parallel? Or just breaking 
up a scan into 10 chunks vs 2 and executing them serially?
The former.

In the case we've benchmarked, it's 9 chunks versus 46 chunks. The time 
difference is 2.3 sec versus 3.1 sec. We can certainly dial up the # of bytes 
after which we create a guidepost. Rather than disable the setting of 
guideposts for salted tables (PHOENIX-1279), I think it'd be pretty easy to 
combine our ranges in the case where we're doing a merge sort. I'll 
take a stab at this. Worst case, I can just combine them in the salted case, as 
that's pretty trivial.
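
As a rough illustration of the grouping idea (this is not Phoenix code; the 
function names and the in-memory lists standing in for scan results are 
hypothetical), the sketch below concatenates the chunks that belong to the same 
salt bucket and then merge sorts only across buckets. It assumes that chunks 
arrive in start-key order within each bucket and that rows compare by their key 
with the salt byte already stripped.

```python
import heapq
from itertools import chain


def concat_by_bucket(chunks):
    """Group scan chunks by salt bucket, keeping arrival order within a bucket.

    `chunks` is an iterable of (bucket, rows) pairs, where `rows` is already
    sorted. Assumption: chunks of one bucket cover consecutive key ranges and
    are listed in start-key order, so concatenating them stays sorted.
    """
    by_bucket = {}
    for bucket, rows in chunks:
        by_bucket.setdefault(bucket, []).append(rows)
    # One lazily concatenated stream per bucket.
    return {b: chain.from_iterable(parts) for b, parts in by_bucket.items()}


def merge_salted_scans(chunks):
    """Merge sort across buckets only, instead of across every chunk."""
    streams = concat_by_bucket(chunks)
    return heapq.merge(*streams.values())


# Four chunks spread over two buckets: the merge sees 2 streams, not 4.
example_chunks = [
    (0, ["a", "d"]),
    (1, ["b", "c"]),
    (0, ["g", "k"]),
    (1, ["h"]),
]
merged_rows = list(merge_salted_scans(example_chunks))
```

With this shape the merge width is bounded by the number of salt buckets, no 
matter how many guidepost chunks each bucket is split into.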

> Performance degradation for salted tables with guideposts
> ---------------------------------------------------------
>
>                 Key: PHOENIX-1278
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1278
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: Anoop Sam John
>
> When a table is salted, we're seeing a degradation in performance using our 
> new guidepost-based parallelization. With salted tables, we do a merge sort 
> with the results from all the parallel scans. I suspect the cause here is 
> that we're doing a merge sort now between more chunks than before (since we 
> chunk everything up more now than we used to). We should group the scans 
> we're doing for the same bucket together and do a concat with those results 
> and then do a merge sort only with the concatenated batches.
> Please revert PHOENIX-1279 when we implement this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
