[
https://issues.apache.org/jira/browse/PHOENIX-5236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18065198#comment-18065198
]
Xavier Fernandis edited comment on PHOENIX-5236 at 3/11/26 9:50 PM:
--------------------------------------------------------------------
Hi [~chandra_r] , [~karanmehta93] Not sure if this is still useful but i did
some digging.
I discovered that Phoenix chooses different filtering strategies depending on
how many columns you're filtering on. The bug lives in the
_*WhereCompiler.java*_ file
+*For Single Column Filters*+
When you filter on just one column, Phoenix uses
{_}*SingleCQKeyValueComparisonFilter*{_}. This tool is flexible - it treats
column names as regular strings, so it works perfectly with both regular
columns and dynamic columns.
*+For Multiple Column Filters+*
When you filter on multiple columns, Phoenix uses
{_}*MultiEncodedCQKeyValueComparisonFilter*{_}. This tool assumes all column
names are encoded for performance optimization. This works great for regular
schema columns, but it breaks with dynamic columns.
*Dynamic columns* use the actual column name as the identifier ,
_*MultiEncodedCQKeyValueComparisonFilter*_ expects a tiny encoded number around
2 bytes. When the filter tries to read dynamic column name as a 2-byte number,
it fails with "Expected 2 bytes, got 10 bytes" This is why it fails
Phoenix should actually do something like this
How many columns in WHERE?
↓
ONE → Use SingleCQKeyValueComparisonFilter
↓
MULTIPLE → Are any columns dynamic?
↓
YES → Use MultiCQKeyValueComparisonFilter
↓
NO → Does table support encoding?
↓
YES → Use MultiEncodedCQKeyValueComparisonFilter
↓
NO → Use MultiCQKeyValueComparisonFilter
was (Author: JIRAUSER311168):
This issue is referenced in this Jira :
https://issues.apache.org/jira/browse/PHOENIX-4816
I discovered that Phoenix chooses different filtering strategies depending on
how many columns you're filtering on. The bug lives in the
_*WhereCompiler.java*_ file
+*For Single Column Filters*+
When you filter on just one column, Phoenix uses
{_}*SingleCQKeyValueComparisonFilter*{_}. This tool is flexible - it treats
column names as regular strings, so it works perfectly with both regular
columns and dynamic columns.
*+For Multiple Column Filters+*
When you filter on multiple columns, Phoenix uses
{_}*MultiEncodedCQKeyValueComparisonFilter*{_}. This tool assumes all column
names are encoded for performance optimization. This works great for regular
schema columns, but it breaks with dynamic columns.
*Dynamic columns* use the actual column name as the identifier ,
_*MultiEncodedCQKeyValueComparisonFilter*_ expects a tiny encoded number around
2 bytes. When the filter tries to read dynamic column name as a 2-byte number,
it fails with "Expected 2 bytes, got 10 bytes" This is why it fails
> Multiple dynamic columns in WHERE clause is not working
> -------------------------------------------------------
>
> Key: PHOENIX-5236
> URL: https://issues.apache.org/jira/browse/PHOENIX-5236
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.14.0
> Reporter: Chandrasekar
> Priority: Major
>
> When the query uses multiple dynamic columns in the WHERE clause it's
> failing. Consider the following example:
>
> CREATE TABLE CITY (ID INTEGER, NAME VARCHAR CONSTRAINT PK PRIMARY KEY (ID))
> And the following upserts:
> UPSERT INTO CITY (ID, NAME, POPULATION INTEGER, DISTRICT VARCHAR) VALUES (1,
> 'XYZ', 1000, 'XYZ')
> UPSERT INTO CITY (ID, NAME, POPULATION INTEGER) VALUES (2, 'ABC', 2000)
>
> And now when the following query is executed:
> SELECT ID, NAME FROM CITY (POPULATION INTEGER, DISTRICT VARCHAR) WHERE
> DISTRICT IS NULL AND POPULATION > 1000
> It throws the following exception:
> org.apache.phoenix.schema.PTable$QualifierEncodingScheme$InvalidQualifierBytesException:
> Invalid number of qualifier bytes. Expected length: 2. Actual: 8
> Even comparing different dynamic columns is not working. For example in the
> same table, following upsert is executed:
> UPSERT INTO CITY (ID, NAME, CURRENT_POPULATION INTEGER) VALUES (2, 'ABC',
> 2000)
> And when the following query is executed:
> SELECT ID, NAME FROM CITY (POPULATION INTEGER, CURRENT_POPULATION INTEGER)
> WHERE POPULATION != CURRENT_POPULATION
> It again throws the same error (
> org.apache.phoenix.schema.PTable$QualifierEncodingScheme$InvalidQualifierBytesException:
> Invalid number of qualifier bytes. Expected length: 2. Actual: 10)
> Does phoenix allows querying by conditions on multiple dynamic columns? The
> phoenix version used is 4.14.0 and HBase version is 1.4.8.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)