[ 
https://issues.apache.org/jira/browse/PHOENIX-2724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15363760#comment-15363760
 ] 

Lars Hofhansl commented on PHOENIX-2724:
----------------------------------------

How do you actually run a query like this with the default client settings?
By default I get:
{code}
java.util.concurrent.RejectedExecutionException: Task 
org.apache.phoenix.job.JobManager$InstrumentedJobFutureTask@492fc69e rejected 
from org.apache.phoenix.job.JobManager$1@74a9c4b0[Running, pool size = 128, 
active threads = 128, queued tasks = 5000, completed tasks = 10267
{code}

which makes no sense in my case, since I'm only running against a few region 
servers anyway.
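
The numbers in that exception match a bounded executor with 128 threads and a 
5000-slot queue. The sketch below reproduces the same failure mode with a plain 
java.util.concurrent.ThreadPoolExecutor. It is only an illustration: the class 
and method names are made up, it is not Phoenix's JobManager, and it assumes the 
client submits roughly one task per guidepost chunk.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of Phoenix.
class BoundedPoolDemo {

    /**
     * Submits `tasks` blocking tasks to a fixed-size pool with a bounded queue
     * and returns how many submissions were rejected.
     */
    static int countRejected(int poolSize, int queueSize, int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(queueSize)); // default handler is AbortPolicy
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < tasks; i++) {
                try {
                    // Each task blocks until released, so nothing completes
                    // while we are still submitting (the worst case).
                    pool.execute(() -> {
                        try {
                            release.await();
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++; // all threads busy and the queue is full
                }
            }
        } finally {
            release.countDown(); // let the accepted tasks finish
            pool.shutdown();
        }
        return rejected;
    }
}
```

With poolSize = 128 and queueSize = 5000, only 5128 tasks can be outstanding at 
once, so a plan with several hundred thousand chunks overflows the queue unless 
tasks complete about as fast as they are submitted.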

That's another reason why it is important (IMHO) to do guidepost grouping on 
the server, or simply as part of the scan.
It might be as simple as having a target scan size and scanning forward through 
the STATS table accumulating guideposts until the aggregate range covers the 
target size.
On the other hand, the smaller the scan chunks, the fairer the execution will 
be between multiple large queries (or between large and small queries).
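
A minimal sketch of that accumulation, assuming guideposts arrive in row-key 
order and each one carries an estimate of the bytes since the previous 
guidepost. The Guidepost and Chunk types and the group method are hypothetical 
and do not correspond to Phoenix's actual stats classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Phoenix code.
class GuidepostGrouping {

    /** A guidepost key plus the estimated bytes since the previous guidepost. */
    static final class Guidepost {
        final String key;
        final long bytes;
        Guidepost(String key, long bytes) { this.key = key; this.bytes = bytes; }
    }

    /** A scan chunk covering [startKey, endKey) with its estimated size. */
    static final class Chunk {
        final String startKey;
        final String endKey;
        final long bytes;
        Chunk(String startKey, String endKey, long bytes) {
            this.startKey = startKey; this.endKey = endKey; this.bytes = bytes;
        }
    }

    /**
     * Walks the guideposts in order and closes a chunk as soon as the
     * accumulated size reaches targetBytes. The data after the last guidepost
     * is not covered by this sketch.
     */
    static List<Chunk> group(List<Guidepost> guideposts, String tableStart, long targetBytes) {
        if (targetBytes <= 0) {
            throw new IllegalArgumentException("targetBytes must be positive");
        }
        List<Chunk> chunks = new ArrayList<>();
        String start = tableStart;
        String lastKey = tableStart;
        long acc = 0;
        for (Guidepost gp : guideposts) {
            acc += gp.bytes;
            lastKey = gp.key;
            if (acc >= targetBytes) {
                chunks.add(new Chunk(start, gp.key, acc));
                start = gp.key;
                acc = 0;
            }
        }
        if (acc > 0) {
            chunks.add(new Chunk(start, lastKey, acc)); // trailing partial chunk
        }
        return chunks;
    }
}
```

With 1MB guideposts and a 4MB target, ten guideposts collapse into three chunks 
(4MB, 4MB, 2MB). The target size is the knob for the trade-off above: a larger 
target means fewer tasks, a smaller one means fairer interleaving.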

> Query with large number of guideposts is slower compared to no stats
> --------------------------------------------------------------------
>
>                 Key: PHOENIX-2724
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2724
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.7.0
>         Environment: Phoenix 4.7.0-RC4, HBase-0.98.17 on an 8-node cluster
>            Reporter: Mujtaba Chohan
>            Assignee: Lars Hofhansl
>             Fix For: 4.8.0
>
>         Attachments: 2724.txt, PHOENIX-2724.patch, 
> PHOENIX-2724_addendum.patch, PHOENIX-2724_v2.patch
>
>
> With a 1MB guidepost width on a ~900GB/500M-row table, queries with a short 
> scan range get significantly slower.
> Without stats:
> {code}
> select * from T limit 10; // query execution time <100 msec
> {code}
> With stats:
> {code}
> select * from T limit 10; // query execution time >20 seconds
> Explain plan: CLIENT 876085-CHUNK 476569382 ROWS 876060986727 BYTES SERIAL 
> 1-WAY FULL SCAN OVER T SERVER 10 ROW LIMIT CLIENT 10 ROW LIMIT
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
