[jira] [Commented] (PHOENIX-3271) Distribute UPSERT SELECT across cluster

Samarth Jain (JIRA) Mon, 16 Jan 2017 11:39:17 -0800

    [ 
https://issues.apache.org/jira/browse/PHOENIX-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15824482#comment-15824482
 ]


Samarth Jain commented on PHOENIX-3271:
---------------------------------------

[[email protected]] - one thing to make sure of is that the thread pool used 
for cross-region-server UPSERTs is different from the thread pool whose 
threads are doing the scans for SELECTs. Otherwise, it could lead to 
deadlock-like scenarios.
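
To illustrate the concern, here is a minimal, self-contained Java sketch. It 
is not Phoenix code: the class and method names (PoolSeparationSketch, run, 
sharedPool, separatePools) are hypothetical, and plain java.util.concurrent 
executors stand in for the real scan and UPSERT pools. It assumes each scan 
task blocks until the write it handed off has finished.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class PoolSeparationSketch {

    /**
     * Runs `scans` scan tasks on scanPool. Each task hands its rows to
     * upsertPool and blocks until that write completes. Returns the total
     * rows written, or -1 if the work is still stuck after timeoutMs.
     * `scans` must not exceed the scan pool size, or the latch never opens.
     */
    static int run(ExecutorService scanPool, ExecutorService upsertPool,
                   int scans, int rowsPerScan, long timeoutMs) throws Exception {
        // Make every scan task occupy a thread before any write is submitted,
        // so the outcome does not depend on scheduling luck.
        CountDownLatch allScansRunning = new CountDownLatch(scans);
        List<Future<Integer>> results = new ArrayList<>();
        for (int i = 0; i < scans; i++) {
            results.add(scanPool.submit(() -> {
                allScansRunning.countDown();
                allScansRunning.await();
                // The scan thread keeps its pool slot while waiting here.
                Future<Integer> write = upsertPool.submit(() -> rowsPerScan);
                return write.get();
            }));
        }
        int total = 0;
        try {
            for (Future<Integer> f : results) {
                total += f.get(timeoutMs, TimeUnit.MILLISECONDS);
            }
            return total;
        } catch (TimeoutException e) {
            return -1; // scans are waiting on writes that can never start
        }
    }

    /** Scans and writes share one bounded pool: every thread ends up waiting. */
    static int sharedPool(int threads) throws Exception {
        ExecutorService shared = Executors.newFixedThreadPool(threads);
        try {
            return run(shared, shared, threads, 10, 500);
        } finally {
            shared.shutdownNow(); // interrupt the stuck scan threads
        }
    }

    /** Writes get their own pool, so a blocked scan never starves them. */
    static int separatePools(int threads) throws Exception {
        ExecutorService scans = Executors.newFixedThreadPool(threads);
        ExecutorService upserts = Executors.newFixedThreadPool(threads);
        try {
            return run(scans, upserts, threads, 10, 500);
        } finally {
            scans.shutdownNow();
            upserts.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("shared pool:    " + sharedPool(2));
        System.out.println("separate pools: " + separatePools(2));
    }
}
```

In this sketch the shared-pool run times out and reports -1 because both 
threads are held by scans waiting on writes queued behind them, while the 
separate-pool run finishes and reports 20 rows.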

> Distribute UPSERT SELECT across cluster
> ---------------------------------------
>
>                 Key: PHOENIX-3271
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3271
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: James Taylor
>            Assignee: Ankit Singhal
>             Fix For: 4.10.0
>
>         Attachments: PHOENIX-3271.patch, PHOENIX-3271_v1.patch, 
> PHOENIX-3271_v2.patch, PHOENIX-3271_v3.patch
>
>
> Based on some informal testing we've done, it seems that creation of a local 
> index is orders of magnitude faster than creation of global indexes (17 
> seconds versus 10-20 minutes - though more data is written in the global 
> index case). Under the covers, a global index is created through the running 
> of an UPSERT SELECT. Also, UPSERT SELECT provides an easy way of copying a 
> table. In both of these cases, the data being upserted must all flow back to 
> the same client which can become a bottleneck for a large table. Instead, 
> what can be done is to push each separate, chunked UPSERT SELECT call out to 
> a different region server for execution there. One way we could implement 
> this would be to have an endpoint coprocessor push the chunked UPSERT SELECT 
> out to each region server and return the number of rows that were upserted 
> back to the client.
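
The data flow proposed in the description can be sketched roughly as follows. 
This is a toy model in plain Java, not the Phoenix coprocessor API: 
DistributedUpsertSketch, RegionChunk and upsertSelectLocally are hypothetical 
names, an in-memory list stands in for the target table, and worker threads 
stand in for region servers. The point it shows is that only row counts 
travel back to the client, not the rows themselves.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class DistributedUpsertSketch {

    /** Hypothetical stand-in for one region's share of the source table. */
    static final class RegionChunk {
        final List<String> rows;

        RegionChunk(List<String> rows) {
            this.rows = rows;
        }

        // Stand-in for the server-side work: read the local rows, write them
        // to the target, and report only how many rows were written.
        long upsertSelectLocally(List<String> target) {
            synchronized (target) {
                target.addAll(rows);
            }
            return rows.size();
        }
    }

    /** Client side: fan the chunks out, then sum the returned row counts. */
    static long upsertSelect(List<RegionChunk> chunks, List<String> target)
            throws Exception {
        // One worker per chunk models "a different region server" per chunk.
        ExecutorService regionServers =
                Executors.newFixedThreadPool(Math.max(1, chunks.size()));
        try {
            List<Future<Long>> counts = new ArrayList<>();
            for (RegionChunk chunk : chunks) {
                Callable<Long> task = () -> chunk.upsertSelectLocally(target);
                counts.add(regionServers.submit(task));
            }
            long total = 0;
            for (Future<Long> count : counts) {
                total += count.get(); // only a number crosses back to the client
            }
            return total;
        } finally {
            regionServers.shutdown();
        }
    }
}
```

Contrast this with the current behaviour described above, where every 
upserted row flows back through the single client.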



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
