Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/3#discussion_r14899559
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/iterate/DefaultParallelIteratorRegionSplitter.java ---
    @@ -140,7 +142,14 @@ public boolean apply(HRegionLocation location) {
             // distributed across regions, using this scheme compensates for regions that
             // have more rows than others, by applying tighter splits and therefore spawning
             // off more scans over the overloaded regions.
    -        int splitsPerRegion = regions.size() >= targetConcurrency ? 1 : (regions.size() > targetConcurrency / 2 ? maxConcurrency : targetConcurrency) / regions.size();
    +        PTable table = tableRef.getTable();
    --- End diff --
    
    That's what the splitsPerRegion variable and the subsequent logic in 
ParallelIterators do: they create additional split points within the range so 
that multiple scans run over a single region. We'd want to prefix each of 
these with the same start region key. I'll open a separate JIRA for this too. 
It's not a big deal; most of the time the parallelization slots would be used 
up by having to do a scan in each region anyway.
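
For reference, here is a minimal standalone sketch of the splitsPerRegion expression from the removed line in the diff above. The parameter names follow the diff; the class name, the method wrapper, the argument check, and the sample values are assumptions made for illustration, not Phoenix code or Phoenix defaults.

```java
// Hypothetical standalone wrapper around the expression in the removed diff line.
final class SplitsPerRegionSketch {

    // Returns how many scans to run over each region.
    // regionCount stands in for regions.size() in the original expression.
    static int splitsPerRegion(int regionCount, int targetConcurrency, int maxConcurrency) {
        if (regionCount <= 0) {
            // Guard added for this sketch only; the original divides directly.
            throw new IllegalArgumentException("regionCount must be positive");
        }
        // Enough regions already: one scan per region uses up the slots.
        if (regionCount >= targetConcurrency) {
            return 1;
        }
        // More than half the target: spread maxConcurrency over the regions,
        // otherwise spread targetConcurrency over them (integer division).
        int budget = regionCount > targetConcurrency / 2 ? maxConcurrency : targetConcurrency;
        return budget / regionCount;
    }

    public static void main(String[] args) {
        // Illustrative values only.
        System.out.println(splitsPerRegion(4, 8, 16));
        System.out.println(splitsPerRegion(5, 8, 16));
        System.out.println(splitsPerRegion(8, 8, 16));
    }
}
```

With a result greater than 1, each region's key range would be cut into that many sub-ranges, which is the case where the extra split points mentioned above come into play.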

