[ https://issues.apache.org/jira/browse/PHOENIX-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14144302#comment-14144302 ]
James Taylor commented on PHOENIX-1278:
---------------------------------------
A quick-and-dirty fix would be to not collect guideposts for salted tables.
Guideposts only help a salted table when a range scan over a single salt
bucket is big enough to span several guideposts. One example is a query over a
time range that holds a lot of data: each bucket might have enough data that,
without guideposts, you wouldn't be able to chunk up the work very well.
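
To make the "span a few guideposts" case concrete, here is a minimal sketch of
splitting one bucket's scan range at the guideposts that fall inside it. This is
an illustration only, not Phoenix code: the function name is hypothetical, and
integers stand in for row keys within a single salt bucket.

```python
from bisect import bisect_left, bisect_right


def split_range_by_guideposts(start, stop, guideposts):
    """Split the half-open key range [start, stop) at each guidepost inside it.

    `guideposts` must be sorted. Returns a list of (chunk_start, chunk_stop)
    pairs that together cover [start, stop) with no gaps or overlaps.
    Hypothetical helper for illustration; integers stand in for row keys.
    """
    lo = bisect_right(guideposts, start)  # index of first guidepost > start
    hi = bisect_left(guideposts, stop)    # index of first guidepost >= stop
    boundaries = [start] + guideposts[lo:hi] + [stop]
    return list(zip(boundaries, boundaries[1:]))


# A scan that spans two guideposts is split into three chunks.
chunks = split_range_by_guideposts(15, 35, [10, 20, 30, 40])
# A scan that spans no guidepost stays as a single chunk.
single = split_range_by_guideposts(21, 29, [10, 20, 30, 40])
```

When a bucket's range contains no guidepost, the scan is not split at all, which
is the case where collecting guideposts for a salted table buys nothing.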
> Performance degradation for salted tables with guideposts
> ---------------------------------------------------------
>
> Key: PHOENIX-1278
> URL: https://issues.apache.org/jira/browse/PHOENIX-1278
> Project: Phoenix
> Issue Type: Bug
> Reporter: James Taylor
>
> When a table is salted, we're seeing a degradation in performance using our
> new guidepost-based parallelization. With salted tables, we do a merge sort
> with the results from all the parallel scans. I suspect the cause here is
> that we're now doing a merge sort across more chunks than before (since we
> chunk everything up more than we used to). We should group the scans we're
> doing for the same bucket together, concatenate those results, and then do a
> merge sort only across the concatenated batches.
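
The grouping proposed in the description can be sketched roughly as follows.
This is an illustration under stated assumptions, not the Phoenix
implementation: the function name and the (bucket_id, rows) input shape are
hypothetical, rows are compared by key with the salt byte already stripped, and
the chunks of one bucket are assumed to arrive in key order.

```python
from heapq import merge
from itertools import chain


def merge_salted_results(chunks):
    """Concatenate chunks per bucket, then merge sort across buckets only.

    `chunks` is a list of (bucket_id, rows) pairs. Rows are sorted within
    each chunk, and chunks of the same bucket are listed in key order, so
    plain concatenation keeps each bucket's rows sorted.
    Hypothetical helper for illustration.
    """
    by_bucket = {}
    for bucket_id, rows in chunks:
        by_bucket.setdefault(bucket_id, []).append(rows)
    # One concatenated, already-sorted stream per bucket.
    per_bucket = [chain.from_iterable(parts) for parts in by_bucket.values()]
    # The merge now has one input per bucket instead of one per chunk.
    return merge(*per_bucket)


# Four chunks across two buckets: the merge sees 2 inputs, not 4.
result = list(merge_salted_results([
    (0, [1, 4]), (0, [7, 9]),
    (1, [2, 3]), (1, [5, 8]),
]))
```

The point of the sketch is that the number of merge inputs equals the number of
salt buckets, no matter how finely the guideposts split each bucket.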