[ 
https://issues.apache.org/jira/browse/PHOENIX-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16089865#comment-16089865
 ] 

Ankit Singhal commented on PHOENIX-4018:
----------------------------------------

Yes, only the last scanner should update the progress (which you can do easily 
in BaseScannerRegionObserver.RegionScannerHolder#next(List<Cell>, ScannerContext)). 
The heartbeat will also work fine if we update the time progress in the same 
manner.
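
To make the "only the last scanner updates the progress" idea concrete, here is 
a minimal, self-contained sketch. It does not use the HBase or Phoenix classes: 
Progress, RowScanner, over, matching and tracking are all hypothetical names 
made up for illustration, and the real ScannerContext API is different. The 
point is only that the inner scanners never touch the progress, while the 
outermost wrapper records size and time based on what is actually returned.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class OutermostProgressSketch {
    /** Hypothetical stand-in for the progress fields of the RPC-level scanner context. */
    static final class Progress {
        long bytesReturned;
        long lastUpdateNanos;
    }

    /** Minimal scanner shape: appends one row to out, returns false when exhausted. */
    interface RowScanner {
        boolean next(List<String> out);
    }

    /** Innermost scanner over in-memory rows; it never touches Progress. */
    static RowScanner over(List<List<String>> rows) {
        int[] pos = {0};
        return out -> {
            if (pos[0] >= rows.size()) {
                return false;
            }
            out.addAll(rows.get(pos[0]++));
            return true;
        };
    }

    /** Join-like scanner: skips rows whose first cell differs from key; leaves Progress alone. */
    static RowScanner matching(RowScanner inner, String key) {
        return out -> {
            List<String> row = new ArrayList<>();
            while (inner.next(row)) {
                if (!row.isEmpty() && row.get(0).equals(key)) {
                    out.addAll(row);
                    return true;
                }
                row.clear(); // skipped rows are never counted as returned data
            }
            return false;
        };
    }

    /** Outermost wrapper: the only place where size and time progress are recorded. */
    static RowScanner tracking(RowScanner inner, Progress progress) {
        return out -> {
            int before = out.size();
            boolean produced = inner.next(out);
            for (String cell : out.subList(before, out.size())) {
                progress.bytesReturned += cell.length();
            }
            progress.lastUpdateNanos = System.nanoTime();
            return produced;
        };
    }

    public static void main(String[] args) {
        List<List<String>> rows = Arrays.asList(
            Arrays.asList("k1", "xx"),
            Arrays.asList("k2", "yy"),
            Arrays.asList("k3", "zz"));
        Progress progress = new Progress();
        RowScanner scanner = tracking(matching(over(rows), "k3"), progress);
        List<String> out = new ArrayList<>();
        scanner.next(out);
        System.out.println(out + " bytesReturned=" + progress.bytesReturned);
    }
}
```

In this toy model the two skipped rows contribute nothing to bytesReturned; 
only the cells of the matching row are counted, and the timestamp is refreshed 
on every call to the outermost wrapper.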





> HashJoin may produce nulls for LHS table columns
> ------------------------------------------------
>
>                 Key: PHOENIX-4018
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4018
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.11.0
>            Reporter: Sergey Soldatov
>            Assignee: Sergey Soldatov
>            Priority: Critical
>         Attachments: PHOENIX-4018-1.patch
>
>
> Here is the problem: in HashJoinRegionScanner methods (nextRow, for example) 
> we use the same scanner context that was created in RSRpcServices. It 
> has limits (e.g. a 2 MB size limit). Let's say that we have a 3 MB region and 
> the only key that matches the join condition is located at the end of the 
> region. In HashJoinRegionScanner#nextRow, when we iterate through the region 
> rows, once we reach the 2 MB limit, every region scanner nextRow call will 
> return a single cell and the scanner context will have the 
> SIZE_LIMIT_REACHED_MID_ROW state. But we don't have any logic that checks for 
> that, so this single cell is treated as a complete row with all nulls except 
> one column. 
> How to fix it: 
> 1. For the region scanner we may provide a NoLimitScannerContext, so we will 
> never get a partial result.  
> 2. We need to update the scanner context that we got from RSRpcServices with 
> the real data, based on the size of the results we are going to return. 
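
The failure mode described in the issue can be reproduced with a toy model. 
This is a sketch under stated assumptions, not HBase or Phoenix code: ToyContext, 
ToyScanner and readNaively are hypothetical names, cell sizes are just string 
lengths, and only the state name SIZE_LIMIT_REACHED_MID_ROW is borrowed from 
the description. The caller below never inspects the context state, so with a 
shared size limit it mistakes row fragments for complete rows; with a no-limit 
context (option 1) the same caller sees whole rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PartialRowSketch {
    /** Toy stand-in for the scanner context states mentioned in the issue. */
    enum State { MORE_VALUES, SIZE_LIMIT_REACHED_MID_ROW }

    /** Toy stand-in for a scanner context; a negative sizeLimit means "no limit". */
    static final class ToyContext {
        final long sizeLimit;
        long bytesSeen;
        State state = State.MORE_VALUES;

        ToyContext(long sizeLimit) { this.sizeLimit = sizeLimit; }

        boolean limitReached() { return sizeLimit >= 0 && bytesSeen >= sizeLimit; }
    }

    /** Toy scanner over in-memory rows; each cell costs its string length in "bytes". */
    static final class ToyScanner {
        private final List<List<String>> rows;
        private int row;
        private int cell;

        ToyScanner(List<List<String>> rows) { this.rows = rows; }

        /** Appends cells of the current row to out; returns false once all rows are consumed. */
        boolean next(List<String> out, ToyContext ctx) {
            if (row >= rows.size()) {
                return false;
            }
            List<String> cells = rows.get(row);
            ctx.state = State.MORE_VALUES;
            while (cell < cells.size()) {
                String c = cells.get(cell++);
                out.add(c);
                ctx.bytesSeen += c.length();
                // Stop mid-row as soon as the shared limit is hit and cells remain.
                if (cell < cells.size() && ctx.limitReached()) {
                    ctx.state = State.SIZE_LIMIT_REACHED_MID_ROW;
                    return true;
                }
            }
            row++;
            cell = 0;
            return true;
        }
    }

    /** Treats every next() result as a complete row and never inspects ctx.state. */
    static List<List<String>> readNaively(ToyScanner scanner, ToyContext ctx) {
        List<List<String>> result = new ArrayList<>();
        List<String> buffer = new ArrayList<>();
        while (scanner.next(buffer, ctx)) {
            result.add(new ArrayList<>(buffer));
            buffer.clear();
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> rows = Arrays.asList(
            Arrays.asList("a1", "b1", "c1"),
            Arrays.asList("a2", "b2", "c2"));
        // Shared 3-byte limit: the rows come back as fragments.
        System.out.println(readNaively(new ToyScanner(rows), new ToyContext(3)));
        // No limit (option 1 above): the rows come back whole.
        System.out.println(readNaively(new ToyScanner(rows), new ToyContext(-1)));
    }
}
```

With the 3-byte limit the two rows are reported as five fragments, and once 
the limit has been passed every call yields a single cell, which mirrors the 
"single cell treated as a complete row" symptom. With the no-limit context the 
same caller gets two rows of three cells each.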


