[jira] [Commented] (PHOENIX-5998) Paged server side ungrouped aggregate operations

ASF GitHub Bot (Jira) Mon, 26 Oct 2020 20:26:03 -0700


    [ 
https://issues.apache.org/jira/browse/PHOENIX-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17221095#comment-17221095
 ]


ASF GitHub Bot commented on PHOENIX-5998:
-----------------------------------------

kadirozde commented on pull request #936:
URL: https://github.com/apache/phoenix/pull/936#issuecomment-716946162


   > @kadirozde The changes are substantial and I will need some heads-down 
time to review them. If it is urgent, please feel free to rely on others' 
reviews and don't wait for me. I plan on taking a look in detail within the 
next couple of days.
   
   No, it is not urgent. I was going to start working on PHOENIX-6207 and there 
is some dependency between them and so I wanted to push this before starting 
the other. But it is okay and I do not have to wait for this PR to be checked 
in. Please take your time.
   
   Yes, the changes are substantial but mostly mechanical. The core of the 
change is that instead of scanning the entire table region in the 
postScannerOpen hook and returning the result of the aggregate operation for 
the entire table region in one result iteration, this PR just returns a region 
scanner (i.e., a new scanner called UngroupedAggregateRegionScanner) in the 
postScannerOpen hook for the UngroupedAggregateRegionObserver coproc, and then 
applies the aggregate operation on a chunk (i.e., a page) of a table region in 
each result iteration. This means the client needs to do many iterations in 
order to process a table region, and it aggregates the results of these pages 
on the client side. Please note that previously the client also needed to 
aggregate the results of server side aggregations, but there was one result 
for each table region (not one for each table region page). Hope this helps.
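
   To illustrate the paging idea described above, here is a minimal, 
self-contained sketch. It is not the code from this PR and does not use the 
HBase or Phoenix APIs: the names PagedAggregateSketch, PagedSumScanner, 
pageSize and aggregateRegion are hypothetical, a table region is modeled as a 
plain array of values, and SUM stands in for any ungrouped aggregate. Each 
call to next() processes at most one page and returns a partial aggregate, and 
the client merges the partial results.

```java
// Illustrative sketch only: none of these names come from Phoenix or HBase.
class PagedAggregateSketch {

    /** Hypothetical stand-in for a region scanner that aggregates one page per next() call. */
    static final class PagedSumScanner {
        private final long[] regionRows; // values of the column being summed
        private final int pageSize;      // maximum number of rows handled per next() call
        private int position = 0;        // index of the next unprocessed row

        PagedSumScanner(long[] regionRows, int pageSize) {
            if (pageSize <= 0) {
                throw new IllegalArgumentException("pageSize must be positive");
            }
            this.regionRows = regionRows;
            this.pageSize = pageSize;
        }

        /** True while the region still has unprocessed rows. */
        boolean hasNext() {
            return position < regionRows.length;
        }

        /** Sums at most pageSize rows and returns that partial aggregate. */
        long next() {
            int end = Math.min(position + pageSize, regionRows.length);
            long partial = 0;
            while (position < end) {
                partial += regionRows[position++];
            }
            return partial;
        }
    }

    /** Client side: iterate over the pages of one region and merge the partial sums. */
    static long aggregateRegion(long[] regionRows, int pageSize) {
        PagedSumScanner scanner = new PagedSumScanner(regionRows, pageSize);
        long total = 0;
        while (scanner.hasNext()) {
            total += scanner.next();
        }
        return total;
    }

    public static void main(String[] args) {
        long[] region = {3, 1, 4, 1, 5, 9, 2};
        // With a page size of 3 the region is processed in three next() calls.
        System.out.println(aggregateRegion(region, 3)); // prints 25
    }
}
```

   In this sketch the amount of work per next() call is bounded by pageSize 
rather than by the size of the region, which is the property that keeps each 
call short. The cost is more round trips and one extra merge step per page on 
the client.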




> Paged server side ungrouped aggregate operations 
> -------------------------------------------------
>
>                 Key: PHOENIX-5998
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5998
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Kadir OZDEMIR
>            Assignee: Kadir OZDEMIR
>            Priority: Major
>             Fix For: 4.x
>
>         Attachments: PHOENIX-5998.4.x.001.patch, PHOENIX-5998.4.x.002.patch, 
> PHOENIX-5998.4.x.003.patch
>
>
> Phoenix provides the option of performing upsert select and delete query 
> operations on the client or server side. This is decided by the Phoenix 
> optimizer based on configuration parameters. For the server side option, the 
> table operation (upsert select/delete query) is parallelized such that 
> multiple table regions are scanned and the mutations derived from these scans 
> can also be executed in parallel on the server side. However, currently there 
> is no paging capability and the server side operation can take long enough 
> to lead to HBase client timeouts. When this happens, Phoenix can return 
> failure to its applications while the rest of the parallel scans and 
> mutations on the server side still continue, since Phoenix has no mechanism 
> in place to stop these operations before returning failure to applications. 
> This can create unexpected race conditions between these left-over operations 
> and the new operations issued by applications. Putting a limit on the number 
> of rows to be processed within a single RPC call (i.e., the next operation on 
> the scanner) on the server side using Phoenix level paging is highly 
> desirable and a required step to prevent the possible race conditions. This 
> paging mechanism has already been implemented for index rebuild and 
> verification operations and has proven to be effective in preventing 
> timeouts. This paging can be implemented for all server side operations 
> including aggregates, upsert selects, delete queries and so on.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
