Hristo Iliev created HBASE-26211:
------------------------------------

             Summary: [hbase-connectors] Pushdown filters in Spark do not work correctly with long types
                 Key: HBASE-26211
                 URL: https://issues.apache.org/jira/browse/HBASE-26211
             Project: HBase
          Issue Type: Bug
          Components: hbase-connectors
    Affects Versions: 1.0.0
            Reporter: Hristo Iliev
Reading from an HBase table and filtering on a LONG column does not work correctly:

{code:java}
Dataset<Row> df = spark.read()
    .format("org.apache.hadoop.hbase.spark")
    .option("hbase.columns.mapping", "id STRING :key, v LONG cf:v")
    ...
    .load();
df.filter("v > 100").show();
{code}

The expected behaviour is to show the rows where cf:v > 100, but instead an empty dataset is shown. Moreover, replacing {{"v > 100"}} with {{"v >= 100"}} results in a dataset in which some rows have values of v less than 100.

The problem appears to be that long values are decoded incorrectly as integers in {{NaiveEncoder.filter}}:

{code:scala}
case LongEnc | TimestampEnc =>
  val in = Bytes.toInt(input, offset1)
  val value = Bytes.toInt(filterBytes, offset2 + 1)
  compare(in.compareTo(value), ops)
{code}

It looks like the error has not been caught so far because {{DynamicLogicExpressionSuite}} lacks test cases with long values. The erroneous code is also present in the master branch.

We have extended the test suite and implemented a quick fix, and will open a PR on GitHub.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
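For illustration, a minimal standalone Java sketch of why decoding a serialized long with a 4-byte read produces the symptoms above. The class and helper names ({{LongDecodeBug}}, {{toLong}}, {{toInt}}, {{longBytes}}) are hypothetical; they merely mimic the big-endian semantics of HBase's {{Bytes.toLong}}/{{Bytes.toInt}} using {{java.nio.ByteBuffer}}:

```java
import java.nio.ByteBuffer;

public class LongDecodeBug {
    // Hypothetical stand-ins mimicking HBase Bytes.toLong / Bytes.toInt (big-endian).
    static long toLong(byte[] b) { return ByteBuffer.wrap(b).getLong(); }
    static int toInt(byte[] b)   { return ByteBuffer.wrap(b).getInt(); }
    static byte[] longBytes(long v) { return ByteBuffer.allocate(8).putLong(v).array(); }

    public static void main(String[] args) {
        byte[] cell   = longBytes(150L); // stored cell value cf:v = 150
        byte[] filter = longBytes(100L); // filter constant 100

        // Correct decoding: 150 > 100, so the row should match.
        System.out.println(toLong(cell) > toLong(filter)); // true

        // Buggy decoding: toInt reads only the high 4 bytes, which are
        // all zero for any small long, so the comparison degenerates to
        // 0 > 0 and the row is silently dropped.
        System.out.println(toInt(cell));                   // 0
        System.out.println(toInt(cell) > toInt(filter));   // false
    }
}
```

This also explains the {{"v >= 100"}} behaviour: with both sides truncated to 0, the comparison 0 >= 0 holds for rows whose values are below 100.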