Hristo Iliev created HBASE-26211:
------------------------------------

             Summary: [hbase-connectors] Pushdown filters in Spark do not work correctly with long types
                 Key: HBASE-26211
                 URL: https://issues.apache.org/jira/browse/HBASE-26211
             Project: HBase
          Issue Type: Bug
          Components: hbase-connectors
    Affects Versions: 1.0.0
            Reporter: Hristo Iliev
Reading from an HBase table and filtering on a LONG column does not work correctly:

{code:java}
Dataset<Row> df = spark.read()
    .format("org.apache.hadoop.hbase.spark")
    .option("hbase.columns.mapping", "id STRING :key, v LONG cf:v")
    ...
    .load();
df.filter("v > 100").show();
{code}

The expected behaviour is to show the rows where cf:v > 100, but instead an empty dataset is shown. Moreover, replacing {{"v > 100"}} with {{"v >= 100"}} results in a dataset in which some rows have values of v less than 100.

The problem appears to be that long values are decoded incorrectly as integers in {{NaiveEncoder.filter}}:

{code:scala}
case LongEnc | TimestampEnc =>
  val in = Bytes.toInt(input, offset1)
  val value = Bytes.toInt(filterBytes, offset2 + 1)
  compare(in.compareTo(value), ops)
{code}

It looks like the error has not been caught so far because {{DynamicLogicExpressionSuite}} lacks test cases with long values. The erroneous code is also present in the master branch.

We have extended the test suite and implemented a quick fix, and will open a PR on GitHub.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
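For illustration, a minimal standalone Java sketch of why decoding a serialized long with a 4-byte read produces the symptoms above. The class and helper names ({{LongDecodeBug}}, {{toLong}}, {{toInt}}, {{longBytes}}) are hypothetical; they merely mimic the big-endian semantics of HBase's {{Bytes.toLong}}/{{Bytes.toInt}} using {{java.nio.ByteBuffer}}:

```java
import java.nio.ByteBuffer;

public class LongDecodeBug {
    // Hypothetical stand-ins mimicking HBase Bytes.toLong / Bytes.toInt (big-endian).
    static long toLong(byte[] b) { return ByteBuffer.wrap(b).getLong(); }
    static int toInt(byte[] b)   { return ByteBuffer.wrap(b).getInt(); }
    static byte[] longBytes(long v) { return ByteBuffer.allocate(8).putLong(v).array(); }

    public static void main(String[] args) {
        byte[] cell   = longBytes(150L); // stored cell value cf:v = 150
        byte[] filter = longBytes(100L); // filter constant 100

        // Correct decoding: 150 > 100, so the row should match.
        System.out.println(toLong(cell) > toLong(filter)); // true

        // Buggy decoding: toInt reads only the high 4 bytes, which are
        // all zero for any small long, so the comparison degenerates to
        // 0 > 0 and the row is silently dropped.
        System.out.println(toInt(cell));                   // 0
        System.out.println(toInt(cell) > toInt(filter));   // false
    }
}
```

This also explains the {{"v >= 100"}} behaviour: with both sides truncated to 0, the comparison 0 >= 0 holds for rows whose values are below 100.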