[ 
https://issues.apache.org/jira/browse/HBASE-26863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reid Chan resolved HBASE-26863.
-------------------------------
    Fix Version/s: hbase-connectors-1.0.1
     Hadoop Flags: Reviewed
       Resolution: Fixed

> Rowkey pushdown does not work with complex conditions
> -----------------------------------------------------
>
>                 Key: HBASE-26863
>                 URL: https://issues.apache.org/jira/browse/HBASE-26863
>             Project: HBase
>          Issue Type: Bug
>          Components: hbase-connectors
>    Affects Versions: connector-1.0.0
>            Reporter: Yohei Kishimoto
>            Priority: Major
>             Fix For: hbase-connectors-1.0.1
>
>
> When using the pushdown column filter feature of hbase-spark-connector, a 
> complex query that contains rowkey conditions does not get the expected 
> rowkey pushdown.
> {code:json}
> {
>   "table":{"namespace":"default", "name":"t1"},
>   "rowkey":"key",
>   "columns":{
>     "KEY_FIELD":{"cf":"rowkey", "col":"key", "type":"string"},
>     "A_FIELD":{"cf":"c", "col":"a", "type":"string"},
>     "B_FIELD":{"cf":"c", "col":"b", "type":"string"}
>   }
> }
> {code}
> For example, given the catalog above, the query `spark.sql("SELECT * FROM table WHERE 
> KEY_FIELD >= 'get1' AND KEY_FIELD <= 'get3' AND A_FIELD IS NOT NULL")` gets 
> an incomplete rowkey pushdown in which the lower bound is missing 
> (ScanRange:(upperBound:get3,isUpperBoundEqualTo:true,lowerBound:,isLowerBoundEqualTo:true)).
> If the query is `spark.sql("SELECT * FROM table WHERE KEY_FIELD >= 'get1' AND 
> KEY_FIELD <= 'get3'")`, we get the normal rowkey pushdown 
> (ScanRange:(upperBound:get3,isUpperBoundEqualTo:true,lowerBound:get1,isLowerBoundEqualTo:true)).
> I found that ScanRange#getOverlapScanRange and ScanRange#mergeIntersect 
> return incorrect results if the range passed as the argument is wider than the 
> instance (i.e. scanRange1.getOverlapScanRange(scanRange2) where 
> scanRange1⊂scanRange2). Depending on the order of the Filters that Spark's 
> optimizer produces, these methods can receive the scan ranges in an order 
> that triggers the problem.
> I will create a PR later.
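
The property the report relies on is that intersecting two scan ranges should give the same result regardless of which range is the receiver and which is the argument. The sketch below illustrates that property only; it is not the connector's code. `Range`, `intersect`, and the helper names are hypothetical, `None` is assumed to mean "unbounded", and Python string comparison stands in for HBase's byte-wise rowkey ordering.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Bound = Tuple[Optional[str], bool]  # (value, inclusive); value None = unbounded


@dataclass(frozen=True)
class Range:
    """Hypothetical stand-in for a scan range (not the connector's ScanRange)."""
    lower: Optional[str]
    lower_inclusive: bool
    upper: Optional[str]
    upper_inclusive: bool


def _tighter_lower(a: Bound, b: Bound) -> Bound:
    """Return the more restrictive of two lower bounds."""
    if a[0] is None:
        return b
    if b[0] is None:
        return a
    if a[0] != b[0]:
        return a if a[0] > b[0] else b
    return (a[0], a[1] and b[1])  # same value: an exclusive bound wins


def _tighter_upper(a: Bound, b: Bound) -> Bound:
    """Return the more restrictive of two upper bounds."""
    if a[0] is None:
        return b
    if b[0] is None:
        return a
    if a[0] != b[0]:
        return a if a[0] < b[0] else b
    return (a[0], a[1] and b[1])


def intersect(r1: Range, r2: Range) -> Optional[Range]:
    """Intersect two ranges; the result does not depend on argument order."""
    lo, lo_inc = _tighter_lower((r1.lower, r1.lower_inclusive),
                                (r2.lower, r2.lower_inclusive))
    hi, hi_inc = _tighter_upper((r1.upper, r1.upper_inclusive),
                                (r2.upper, r2.upper_inclusive))
    if lo is not None and hi is not None:
        if lo > hi or (lo == hi and not (lo_inc and hi_inc)):
            return None  # the ranges do not overlap
    return Range(lo, lo_inc, hi, hi_inc)


# KEY_FIELD >= 'get1' AND KEY_FIELD <= 'get3', combined with an unbounded range
narrow = Range("get1", True, "get3", True)
full = Range(None, True, None, True)
print(intersect(narrow, full) == intersect(full, narrow))
```

Because each bound is chosen by comparing both inputs symmetrically, a narrower receiver and a wider argument (the case described above) yield the narrower range either way.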



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
