[ 
https://issues.apache.org/jira/browse/PHOENIX-5494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16975344#comment-16975344
 ] 

Lars Hofhansl edited comment on PHOENIX-5494 at 11/16/19 12:13 AM:
-------------------------------------------------------------------

Thanks [~comnetwork] and [~kozdemir], I had expected this to become a giant 
project, but you both proved me wrong :)

Somehow I had missed {{ScanRanges.createPointLookup(...)}}. I'd expect the 
performance of HBase's MultiRowRangeFilter and Phoenix's SkipScanFilter to 
be in the same ballpark. Perhaps for consistency it might be better to use the 
SkipScanFilter here.

Setting the start key makes sense; we'd save one SEEK at the beginning (but I 
doubt it makes a huge difference in practice).

As for whether we can use the same caching for both normal upsert and index 
replay... I'll let you two work that out. :)

Also, [~kozdemir] ran some long-running tests with real-life workloads and found 
that the patch makes things slower (which, looking at the patch, I find hard to 
understand).



> Batched, mutable Index updates are unnecessarily run one-by-one
> ---------------------------------------------------------------
>
>                 Key: PHOENIX-5494
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5494
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Lars Hofhansl
>            Assignee: Kadir OZDEMIR
>            Priority: Major
>              Labels: performance
>         Attachments: 5494-4.x-HBase-1.5.txt, 
> PHOENIX-5494-4.x-HBase-1.4.patch, PHOENIX-5494.master.001.patch, 
> PHOENIX-5494.master.002.patch, PHOENIX-5494.master.003.patch, 
> Screenshot_20191110_160243.png, Screenshot_20191110_160351.png, 
> Screenshot_20191110_161453.png
>
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> I just noticed that index updates on mutable tables retrieve their deletes 
> (to invalidate the old index entry) one-by-one.
> For batches, this can be *the* major time spent during an index update. The 
> cost is mostly incurred by the repeated setup (and seeking) of the new region 
> scanner (for each row).
> We can instead do a skip scan and get all updates in a single scan per region.
> (Logically that is simple, but it will require some refactoring)
> I won't be getting to this, but recording it here in case someone feels 
> inclined.
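
To make the cost argument in the description concrete, here is a minimal toy model in plain Java. It is not Phoenix or HBase code: {{ToyRegion}}, {{ToyScanner}}, {{oneByOne}} and {{singleScan}} are hypothetical names invented for this sketch, and the two counters only model the "scanner setup" and "seek" costs mentioned above, on the assumption that opening a scanner costs one setup plus one initial seek.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.TreeSet;

public class SkipScanSketch {

    /** Toy stand-in for a region: sorted rows plus counters for the two modeled costs. */
    static final class ToyRegion {
        final NavigableMap<String, String> rows = new TreeMap<>();
        int scannerSetups = 0;
        int seeks = 0;

        /** Assumption: opening a scanner costs one setup and one initial seek to startKey. */
        ToyScanner open(String startKey) {
            scannerSetups++;
            seeks++;
            return new ToyScanner(this, startKey);
        }
    }

    static final class ToyScanner {
        private final ToyRegion region;
        private String position;

        ToyScanner(ToyRegion region, String startKey) {
            this.region = region;
            this.position = startKey;
        }

        /** Returns the value at key (or null); seeks only if not already positioned there. */
        String lookup(String key) {
            if (!key.equals(position)) {
                region.seeks++;
                position = key;
            }
            return region.rows.get(key);
        }
    }

    /** One scanner per row: the one-by-one pattern the description complains about. */
    static List<String> oneByOne(ToyRegion region, Collection<String> keys) {
        List<String> found = new ArrayList<>();
        for (String key : keys) {
            String value = region.open(key).lookup(key);
            if (value != null) found.add(value);
        }
        return found;
    }

    /** One scanner for the whole batch, visiting the keys in sorted order. */
    static List<String> singleScan(ToyRegion region, Collection<String> keys, boolean useStartKey) {
        List<String> found = new ArrayList<>();
        TreeSet<String> sorted = new TreeSet<>(keys);
        if (sorted.isEmpty()) return found;
        // "" sorts before every non-empty key, so it models "start of the region".
        ToyScanner scanner = region.open(useStartKey ? sorted.first() : "");
        for (String key : sorted) {
            String value = scanner.lookup(key);
            if (value != null) found.add(value);
        }
        return found;
    }

    public static void main(String[] args) {
        List<String> batch = Arrays.asList("row03", "row07", "row11", "row42");
        for (int mode = 0; mode < 3; mode++) {
            ToyRegion region = new ToyRegion();
            for (int i = 0; i < 50; i++) region.rows.put(String.format("row%02d", i), "v" + i);
            List<String> result = mode == 0
                    ? oneByOne(region, batch)
                    : singleScan(region, batch, mode == 2);
            System.out.println("mode=" + mode + " rows=" + result
                    + " setups=" + region.scannerSetups + " seeks=" + region.seeks);
        }
    }
}
```

In this model a batch of four rows costs 4 setups and 4 seeks one-by-one, 1 setup and 5 seeks with a single scan that starts at the beginning of the region, and 1 setup and 4 seeks once the start key is set to the first row of the batch. That matches the intuition in the thread: the main saving is the repeated scanner setup, and the start key saves only one further seek. Real numbers will depend on the actual scanner and filter implementation.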



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
