[ 
https://issues.apache.org/jira/browse/PHOENIX-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Soldatov updated PHOENIX-4018:
-------------------------------------
    Attachment: PHOENIX-4018-1.patch

Implemented the suggested fix.

> HashJoin may produce nulls for LHS table columns
> ------------------------------------------------
>
>                 Key: PHOENIX-4018
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4018
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.11.0
>            Reporter: Sergey Soldatov
>            Assignee: Sergey Soldatov
>            Priority: Critical
>         Attachments: PHOENIX-4018-1.patch
>
>
> Here is the problem: the HashJoinRegionScanner methods (nextRow, for example)
> use the same scanner context that was created in RSRpcServices, and that
> context carries limits (e.g. a 2 MB size limit). Suppose we have a 3 MB region
> and the only key that matches the join condition is located at the end of the
> region. When HashJoinRegionScanner#nextRow iterates through the region rows
> and the 2 MB limit is reached, every subsequent region scanner nextRow call
> returns a single cell and leaves the scanner context in the
> SIZE_LIMIT_REACHED_MID_ROW state. We have no logic that checks for this
> state, so that single cell is treated as a complete row in which every
> column except one is null.
> How to fix it:
> 1. For the region scanner we can pass NoLimitScannerContext, so we never
> get a partial result.
> 2. We need to update the scanner context that we got from RSRpcServices with
> the real data, based on the size of the results we are going to return.
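
The failure mode and the two-part fix described above can be illustrated with a small standalone model. This is only a sketch and does not use the HBase or Phoenix classes: ToyContext, ToyScanner, readRows and readRowsFixed are hypothetical names invented for this example, and the toy scanner only imitates the "stop mid-row once the size limit is reached" behaviour described in the issue. Cell sizes are assumed to be fixed.

```java
import java.util.ArrayList;
import java.util.List;

public class PartialRowSketch {
    enum NextState { MORE_VALUES, SIZE_LIMIT_REACHED_MID_ROW }

    /** Hypothetical stand-in for a scanner context that tracks a size limit. */
    static class ToyContext {
        final long sizeLimit;      // Long.MAX_VALUE models a "no limit" context
        long sizeProgress = 0;
        NextState state = NextState.MORE_VALUES;
        ToyContext(long sizeLimit) { this.sizeLimit = sizeLimit; }
    }

    /** Hypothetical region scanner: every row is an array of fixed-size cells. */
    static class ToyScanner {
        final String[][] rows;
        final long cellSize;
        int row = 0, col = 0;
        ToyScanner(String[][] rows, long cellSize) { this.rows = rows; this.cellSize = cellSize; }

        /** Appends cells of the current row; may stop mid-row when the limit is hit. */
        boolean nextRaw(List<String> out, ToyContext ctx) {
            ctx.state = NextState.MORE_VALUES;
            if (row >= rows.length) return false;
            while (col < rows[row].length) {
                out.add(rows[row][col++]);
                ctx.sizeProgress += cellSize;
                if (ctx.sizeProgress >= ctx.sizeLimit && col < rows[row].length) {
                    ctx.state = NextState.SIZE_LIMIT_REACHED_MID_ROW;
                    return true;   // partial row: the caller must not treat it as complete
                }
            }
            row++;
            col = 0;
            return row < rows.length;
        }
    }

    /** Problem case: every nextRaw result is treated as a complete row. */
    static List<List<String>> readRows(ToyScanner scanner, ToyContext ctx) {
        List<List<String>> result = new ArrayList<>();
        boolean more = true;
        while (more) {
            List<String> cells = new ArrayList<>();
            more = scanner.nextRaw(cells, ctx);
            if (!cells.isEmpty()) result.add(cells);
        }
        return result;
    }

    /** Fix 1: scan with a no-limit context. Fix 2: report the real size to the RPC context. */
    static List<List<String>> readRowsFixed(ToyScanner scanner, ToyContext rpcCtx) {
        ToyContext noLimit = new ToyContext(Long.MAX_VALUE);
        List<List<String>> result = readRows(scanner, noLimit);
        long returnedCells = 0;
        for (List<String> r : result) returnedCells += r.size();
        rpcCtx.sizeProgress += returnedCells * scanner.cellSize;
        return result;
    }

    public static void main(String[] args) {
        String[][] data = {{"k1", "a1", "b1"}, {"k2", "a2", "b2"}};
        ToyContext limited = new ToyContext(2);
        System.out.println("limited context: " + readRows(new ToyScanner(data, 1), limited));
        ToyContext rpcCtx = new ToyContext(2);
        System.out.println("no-limit context: " + readRowsFixed(new ToyScanner(data, 1), rpcCtx));
        System.out.println("size reported to RPC context: " + rpcCtx.sizeProgress);
    }
}
```

In this model the limited context splits the two logical rows into five fragments, each of which a caller would mistake for a row with missing columns, while the no-limit variant returns both rows whole and still reports the size of the returned data to the outer context.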



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
