[jira] [Commented] (PHOENIX-3073) Fast path for single-key point lookups

Junegunn Choi (JIRA) Thu, 14 Jul 2016 01:48:54 -0700

    [ 
https://issues.apache.org/jira/browse/PHOENIX-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376593#comment-15376593
 ]


Junegunn Choi commented on PHOENIX-3073:
----------------------------------------

bq. Can you measure CPU usage for point lookups without any change to 4.8 with 
and without the /*+ SERIAL */ hint? 

Actually the hint has no effect ({{plan.isSerial() == false}}), so there is no 
difference in CPU usage. The plan is the same with and without it:

{code}
> explain select /*+ small serial */ id, docid from image where id = 1;
+-----------------------------------------------------------------------------------+
|                                       PLAN                                        |
+-----------------------------------------------------------------------------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN SMALL POINT LOOKUP ON 1 KEY OVER IMAGE  |
+-----------------------------------------------------------------------------------+

> explain select /*+ small */ id, docid from image where id = 1;
+-----------------------------------------------------------------------------------+
|                                       PLAN                                        |
+-----------------------------------------------------------------------------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN SMALL POINT LOOKUP ON 1 KEY OVER IMAGE  |
+-----------------------------------------------------------------------------------+
{code}

Anyway, this issue addresses the computation cost of preparing scans on the 
client side before actually submitting them. How the optimizer should execute 
them, whether in parallel or serially, is a separate issue to be discussed.

> Fast path for single-key point lookups
> --------------------------------------
>
>                 Key: PHOENIX-3073
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3073
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Junegunn Choi
>            Assignee: Junegunn Choi
>         Attachments: PHOENIX-3073.patch
>
>
> While comparing Phoenix JDBC client to the native HBase Java client, I 
> noticed that Phoenix client uses significantly more CPU time on the client 
> machine. Profiling revealed that the majority of the time was spent on 
> {{BaseResultIterators.getParallelScans()}}. This was surprising to me as I 
> was only testing with simple point lookup queries.
>
> Here's how I tested:
> - {{SELECT /*+ SMALL SERIAL */ ID, DOCID FROM IMAGE WHERE ID = ?}}
>     - {{IMAGE}} is a salted table with 100 salt buckets
>     - {{ID}}, the primary key, was randomly selected in a small range so that 
> the requests are served without disk I/O
> - 20K/sec concurrent requests using 128 threads
>
> {{getParallelScans()}} is quite expensive as it iterates over all regions of 
> the table, which can be many, only to return a single Scan object for this 
> query. Since such a single-key point lookup is one of the most frequent types 
> of requests in a typical OLTP application, I believe it makes sense to have a 
> fast path for it. With the patch, the average CPU usage of the client during 
> the workload dropped from 56.7% to 18.8%.
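
To illustrate the idea in the description above, here is a minimal sketch of why a single-key lookup does not need to visit every region. It is not the attached patch and does not use Phoenix or HBase classes. The class {{PointLookupSketch}} and all of its methods are hypothetical, and it assumes one region per salt bucket with the salt byte as the first byte of the row key. The linear version walks every region start key. The binary-search version finds the same region in O(log n) comparisons.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Phoenix or HBase.
final class PointLookupSketch {

    // Compare row keys as unsigned bytes in lexicographic order.
    static int compareKeys(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    // Assumption: one region per salt bucket. The first region starts at the
    // empty key, and bucket i (i >= 1) starts at the one-byte key {i}.
    static List<byte[]> saltBucketStartKeys(int buckets) {
        List<byte[]> startKeys = new ArrayList<>();
        startKeys.add(new byte[0]);
        for (int i = 1; i < buckets; i++) {
            startKeys.add(new byte[] {(byte) i});
        }
        return startKeys;
    }

    // Prefix the key with its salt byte (the salt is passed in, not hashed here).
    static byte[] saltedKey(byte salt, byte[] key) {
        byte[] out = new byte[key.length + 1];
        out[0] = salt;
        System.arraycopy(key, 0, out, 1, key.length);
        return out;
    }

    // Slow path: look at every region start key, keep the last one <= rowKey.
    static int findRegionLinear(List<byte[]> startKeys, byte[] rowKey) {
        int found = -1;
        for (int i = 0; i < startKeys.size(); i++) {
            if (compareKeys(startKeys.get(i), rowKey) <= 0) {
                found = i;
            }
        }
        return found;
    }

    // Fast path: binary search for the last region start key <= rowKey.
    static int findRegionBinary(List<byte[]> startKeys, byte[] rowKey) {
        int lo = 0;
        int hi = startKeys.size() - 1;
        int found = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (compareKeys(startKeys.get(mid), rowKey) <= 0) {
                found = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return found;
    }

    public static void main(String[] args) {
        List<byte[]> startKeys = saltBucketStartKeys(100);
        byte[] rowKey = saltedKey((byte) 42, new byte[] {0, 0, 0, 1});
        // Both lookups resolve to the same region index (42).
        System.out.println("linear: " + findRegionLinear(startKeys, rowKey));
        System.out.println("binary: " + findRegionBinary(startKeys, rowKey));
    }
}
```

With 100 salt buckets the linear version does 100 comparisons per lookup and the binary search does at most 7. The real {{getParallelScans()}} does considerably more work per region than a key comparison, so this only shows the shape of the saving, not its size.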



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
