[ https://issues.apache.org/jira/browse/PHOENIX-3156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lars Hofhansl updated PHOENIX-3156:
-----------------------------------
    Attachment: 3156.txt

Patch that shows problem #1 with a fix. I think #2 is not solvable.

> Bug in DistinctPrefixFilter
> ---------------------------
>
>                 Key: PHOENIX-3156
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3156
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>            Priority: Blocker
>             Fix For: 4.8.0
>
>         Attachments: 3156.txt
>
>
> I found a corner case where a DISTINCT or GROUP BY query along a
> prefix of a compound row key might return incorrect results.
> The filter relies on seeing the _0 column absolutely last, and on not seeing
> all Cells that should be filtered. That breaks in two scenarios:
> # We have a table with key (key1, key2, key3) and columns (c1 and c2). Now
> construct a WHERE <a clause that always matches c1>, <a clause that filters
> by c2> GROUP BY key1, key2. Now the filter would mis-skip when it sees the
> Cell for c1.
> # We force lower-case column names. In that case those would sort after the
> _0 column. The DistinctPrefixFilter would see the _0 column first and skip.
> I can fix #1 (by ignoring all Cells other than the _0 one). I do not know
> how to fix case #2.
> I think this is a blocker and we may have to undo the entire DISTINCT and
> GROUP BY prefix optimization.
> [~an...@apache.org], [~giacomotaylor], [~samarthjain].

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
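
Scenario #2 in the report comes down to where the _0 column sorts relative to the other columns of a row. The following is a minimal sketch of that ordering, not Phoenix code. It assumes that the Cells of a row reach the filter in plain byte-wise order of their column qualifiers and that all columns share one column family. The names C1, C2, c1 and c2 are hypothetical, and the helper functions are invented for illustration.

```python
# Sketch: byte-wise ordering of column qualifiers within one row.
# "_0" is the column named in the report; all other names are hypothetical.
EMPTY_COLUMN = b"_0"


def scan_order(qualifiers):
    """Return the qualifiers in lexicographic byte order."""
    return sorted(qualifiers)


def empty_column_is_last(qualifiers):
    """True if the _0 column would be the last one seen in the row."""
    return scan_order(qualifiers)[-1] == EMPTY_COLUMN


# Upper-case names: 'C' (0x43) sorts before '_' (0x5F), so _0 comes last.
upper = [b"C1", b"C2", EMPTY_COLUMN]

# Lower-case names: 'c' (0x63) sorts after '_' (0x5F), so _0 comes first.
lower = [b"c1", b"c2", EMPTY_COLUMN]

print(scan_order(upper), empty_column_is_last(upper))
print(scan_order(lower), empty_column_is_last(lower))
```

Under these assumptions, _0 is seen last only when every other qualifier sorts before the underscore byte, which holds for upper-case names and fails for lower-case ones.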