[ https://issues.apache.org/jira/browse/PHOENIX-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16089865#comment-16089865 ]
Ankit Singhal commented on PHOENIX-4018:
----------------------------------------
Yes, only the last scanner should update the progress (which you can do easily in
BaseScannerRegionObserver.RegionScannerHolder#next(List<Cell>, ScannerContext)).
The heartbeat will also work fine if we update the time progress in the same
manner.
> HashJoin may produce nulls for LHS table columns
> ------------------------------------------------
>
> Key: PHOENIX-4018
> URL: https://issues.apache.org/jira/browse/PHOENIX-4018
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.11.0
> Reporter: Sergey Soldatov
> Assignee: Sergey Soldatov
> Priority: Critical
> Attachments: PHOENIX-4018-1.patch
>
>
> Here is the problem: in HashJoinRegionScanner methods (nextRow, for example)
> we use the same scanner context that was created in RSRpcServices. It
> has limits (e.g. a 2 MB size limit). Let's say we have a 3 MB region and the
> only key that matches the join condition is located at the end of the region.
> In HashJoinRegionScanner#nextRow, as we iterate through the region rows, once
> we reach the 2 MB limit, every region scanner nextRow call returns a single
> cell and the scanner context has the SIZE_LIMIT_REACHED_MID_ROW state. But
> we don't have any logic that checks for that, so this single cell is treated
> as a complete row with all nulls except one column.
> How to fix it:
> 1. For the region scanner we may provide NoLimitScannerContext, so we will
> never get a partial result.
> 2. We need to update the scanner context that we got from RSRpcServices with
> the real data, based on the size of the results we are going to return.
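The failure mode and the two proposed fixes can be illustrated with a small toy model. This is only a sketch under stated assumptions: ToyContext, ToyRegionScanner and the PartialRowDemo methods are hypothetical stand-ins written for this illustration, not the HBase or Phoenix API, and every cell is assumed to cost one size unit. For simplicity the fixed variant scans the whole region in one call instead of stopping when the outer limit is reached.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these types imitate the idea of a size-limited scanner
// context. They are hypothetical and are NOT the HBase/Phoenix classes.
final class PartialRowDemo {

    /** Hypothetical stand-in for a scanner context with a size budget. */
    static final class ToyContext {
        final long sizeLimit;              // Long.MAX_VALUE means "no limit"
        long sizeProgress = 0;             // size units consumed so far
        boolean midRowLimitReached = false;

        ToyContext(long sizeLimit) { this.sizeLimit = sizeLimit; }

        static ToyContext noLimit() { return new ToyContext(Long.MAX_VALUE); }
    }

    /** Toy region scanner: a row is an array of cells, each costing 1 unit. */
    static final class ToyRegionScanner {
        private final String[][] rows;
        private int rowIdx = 0;
        private int cellIdx = 0;

        ToyRegionScanner(String[][] rows) { this.rows = rows; }

        /** Adds cells of the current row to out; may stop mid-row at the limit. */
        boolean nextRow(List<String> out, ToyContext ctx) {
            ctx.midRowLimitReached = false;
            if (rowIdx >= rows.length) {
                return false;
            }
            String[] row = rows[rowIdx];
            while (cellIdx < row.length) {
                out.add(row[cellIdx++]);
                ctx.sizeProgress += 1;
                if (ctx.sizeProgress >= ctx.sizeLimit && cellIdx < row.length) {
                    ctx.midRowLimitReached = true;   // partial row handed back
                    return true;
                }
            }
            rowIdx++;
            cellIdx = 0;
            return rowIdx < rows.length;
        }
    }

    static String[][] sampleRows() {
        return new String[][] {
            {"r1:a", "r1:b", "r1:c"},
            {"r2:a", "r2:b", "r2:c"},
            {"r3:a", "r3:b", "r3:c"},
        };
    }

    /** The bug: shares the limited context and never checks the mid-row flag. */
    static List<List<String>> scanIgnoringPartialState(ToyRegionScanner s, ToyContext shared) {
        List<List<String>> rows = new ArrayList<>();
        boolean more = true;
        while (more) {
            List<String> cells = new ArrayList<>();
            more = s.nextRow(cells, shared);
            if (!cells.isEmpty()) {
                rows.add(cells);   // a single cell may be taken for a whole row
            }
        }
        return rows;
    }

    /** Fix 1 + fix 2: unlimited inner context, then charge the outer one. */
    static List<List<String>> scanWithNoLimitInner(ToyRegionScanner s, ToyContext outer) {
        List<List<String>> rows = new ArrayList<>();
        ToyContext inner = ToyContext.noLimit();
        boolean more = true;
        while (more) {
            List<String> cells = new ArrayList<>();
            more = s.nextRow(cells, inner);
            if (!cells.isEmpty()) {
                rows.add(cells);
                outer.sizeProgress += cells.size();   // report what we return
            }
        }
        return rows;
    }
}
```

With a limit of 4 units, scanIgnoringPartialState hands back the first row whole and every later cell as its own one-cell "row", which is the shape of the reported null columns. scanWithNoLimitInner always sees complete rows, and the outer context still reflects the amount of data returned.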