[jira] [Commented] (PHOENIX-3073) Fast path for single-key point lookups

Junegunn Choi (JIRA) Thu, 14 Jul 2016 19:09:52 -0700

    [ https://issues.apache.org/jira/browse/PHOENIX-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15378752#comment-15378752 ]


Junegunn Choi commented on PHOENIX-3073:
----------------------------------------

[~samarthjain]

bq. ... or if the user has phoenix.query.force.rowkeyorder set to true in the 
config. ... can you verify if you have the config set to true?

I don't set it explicitly, so it should be at its default. I followed the code 
to find out why the hint is ignored, and here's the relevant part:

https://github.com/apache/phoenix/blob/e5a8dca/phoenix-core/src/main/java/org/apache/phoenix/execute/ScanPlan.java#L130

{{isAmountOfDataToScanWithinThreshold}} returns false because of the condition 
at the linked line, and {{ScanPlan.isSerial}} returns false as a result. I 
couldn't tell whether that was intentional, though.

bq. Having said that, Junegunn Choi, do you see this with the latest on the 4.x 
branch too?

I'm testing with the latest master branch where the fix from [~lhofhansl] is 
included.

[~lhofhansl] I ran the same test with your patch (and without my patch), but 
CPU usage did not change significantly and stayed around 55%. However, the JFR 
recording confirmed that the lock contention on {{SecureRandom}} disappeared, 
so it should help with scalability to some degree.

> Fast path for single-key point lookups
> --------------------------------------
>
>                 Key: PHOENIX-3073
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3073
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Junegunn Choi
>            Assignee: Junegunn Choi
>         Attachments: 3073-try.txt, PHOENIX-3073.patch
>
>
> While comparing Phoenix JDBC client to the native HBase Java client, I 
> noticed that Phoenix client uses significantly more CPU time on the client 
> machine. Profiling revealed that the majority of the time was spent on 
> {{BaseResultIterators.getParallelScans()}}. This was surprising to me as I 
> was only testing with simple point lookup queries.
> Here's how I tested:
> - {{SELECT /*+ SMALL SERIAL */ ID, DOCID FROM IMAGE WHERE ID = ?}}
>     - {{IMAGE}} is a salted table with 100 salt buckets
>     - {{ID}}, the primary key, was randomly selected in a small range so that 
> the requests are served without disk I/O
> - 20K/sec concurrent requests using 128 threads
> {{getParallelScans()}} is quite expensive as it iterates over all regions of 
> the table, which can be many, only to return a single Scan object for this 
> query. Since such a single-key point lookup is one of the most frequent types 
> of requests in a typical OLTP application, I believe it makes sense to have a 
> fast path for it. With the patch, the average CPU usage of the client during 
> the workload dropped to 18.8% from 56.7% before the patch.
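
To illustrate the idea in the description, here is a minimal, self-contained sketch of why a fast path helps. It is not Phoenix code and not the attached patch: {{Region}}, {{KeyRange}}, {{generalPath}}, {{fastPath}} and {{pointRange}} are all hypothetical names made up for this example, and salting is ignored (the row key is assumed to be fully known already). The general path visits every region to end up with one scan, while the fast path builds that scan directly from the key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class PointLookupSketch {
    /** Hypothetical region covering row keys in [start, end); an empty end means "no upper bound". */
    static final class Region {
        final byte[] start, end;
        Region(byte[] start, byte[] end) { this.start = start; this.end = end; }
        boolean contains(byte[] key) {
            return compare(start, key) <= 0 && (end.length == 0 || compare(key, end) < 0);
        }
    }

    /** Hypothetical stand-in for a scan over [startRow, stopRow). */
    static final class KeyRange {
        final byte[] startRow, stopRow;
        KeyRange(byte[] startRow, byte[] stopRow) { this.startRow = startRow; this.stopRow = stopRow; }
    }

    /** Unsigned lexicographic comparison of two byte arrays. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }

    /** General path: visits every region, so the cost grows with the region count. */
    static List<KeyRange> generalPath(List<Region> regions, byte[] rowKey) {
        List<KeyRange> scans = new ArrayList<>();
        for (Region r : regions) {
            if (r.contains(rowKey)) scans.add(pointRange(rowKey));
        }
        return scans;
    }

    /** Fast path: a fully specified row key maps to exactly one scan, with no region walk. */
    static List<KeyRange> fastPath(byte[] rowKey) {
        return Collections.singletonList(pointRange(rowKey));
    }

    /** [key, key + 0x00) is the smallest half-open range containing exactly this row key. */
    static KeyRange pointRange(byte[] key) {
        byte[] stop = Arrays.copyOf(key, key.length + 1); // copyOf pads with a trailing 0x00
        return new KeyRange(key, stop);
    }
}
```

Both paths produce the same single range for a point lookup; the difference is only that {{generalPath}} does work proportional to the number of regions on every request, which is the cost the profile attributed to {{getParallelScans()}}.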



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
