[
https://issues.apache.org/jira/browse/PHOENIX-3156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15410344#comment-15410344
]
Lars Hofhansl commented on PHOENIX-3156:
----------------------------------------
Or we could allow this optimization only when there is no WHERE clause at all,
which would make it far less useful.
Lastly, we could check for the 2nd condition and simply skip the optimization in that case.
> Bug in DistinctPrefixFilter
> ---------------------------
>
> Key: PHOENIX-3156
> URL: https://issues.apache.org/jira/browse/PHOENIX-3156
> Project: Phoenix
> Issue Type: Bug
> Reporter: Lars Hofhansl
> Priority: Blocker
> Fix For: 4.8.0
>
> Attachments: 3156.txt
>
>
> There's a corner case I found where a DISTINCT and GROUP BY query along a
> prefix of a compound row key might return incorrect results.
> The filter relies on seeing the _0 column absolutely last, and not seeing all
> Cells that should be filtered. That breaks in two scenarios:
> # we have a table with key (key1, key2, key3) and columns (c1 and c2). Now
> construct a WHERE <a clause that always matches c1>, <a clause that filters
> by c2> GROUP BY key1, key2. Now the filter would mis-skip when it sees the
> Cell for c1.
> # we force lowercase column names. In that case those would sort after the _0
> column. The DistinctPrefixFilter would see the _0 column first and skip.
> I can fix #1 (by ignoring all Cells other than the _0 one). I do not know
> how to fix case #2.
> I think this is a blocker and we may have to undo the entire DISTINCT and
> GROUP BY prefix optimization.
> [[email protected]], [~giacomotaylor], [~samarthjain].
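The second scenario above hinges on how column qualifiers sort. Within a row and column family, HBase orders Cells by the unsigned bytes of the qualifier, so upper-case names such as C1 sort before _0 while lower-case names such as c1 sort after it. The following is only a minimal sketch of that ordering, not Phoenix code: the class QualifierOrder and its compare method are hypothetical names, and it assumes all columns share one column family.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not part of Phoenix or HBase.
public class QualifierOrder {

    // Unsigned lexicographic comparison of the qualifier bytes.
    // A negative result means a sorts before b.
    static int compare(String a, String b) {
        byte[] x = a.getBytes(StandardCharsets.UTF_8);
        byte[] y = b.getBytes(StandardCharsets.UTF_8);
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int d = (x[i] & 0xff) - (y[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return x.length - y.length;
    }

    public static void main(String[] args) {
        // Upper-case qualifier: 'C' (0x43) < '_' (0x5F), so C1 is seen before _0.
        System.out.println("C1 before _0: " + (compare("C1", "_0") < 0));
        // Lower-case qualifier: 'c' (0x63) > '_' (0x5F), so c1 is seen after _0.
        System.out.println("c1 after _0: " + (compare("c1", "_0") > 0));
    }
}
```

With lower-case names the _0 Cell is no longer the last one in the row, which is the assumption the filter depends on.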
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)