[ https://issues.apache.org/jira/browse/PHOENIX-3439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15635076#comment-15635076 ]
Hudson commented on PHOENIX-3439: --------------------------------- FAILURE: Integrated in Jenkins build Phoenix-master #1471 (See [https://builds.apache.org/job/Phoenix-master/1471/]) PHOENIX-3439 Query using an RVC based on the base table PK is (jamestaylor: rev b47973a219228ff3ec190bb7e6facf6ba589e10d) * (edit) phoenix-core/src/main/java/org/apache/phoenix/compile/ScanRanges.java * (edit) phoenix-core/src/test/java/org/apache/phoenix/compile/QueryOptimizerTest.java > Query using an RVC based on the base table PK is incorrectly using an index > and doing a full scan instead of a point query > -------------------------------------------------------------------------------------------------------------------------- > > Key: PHOENIX-3439 > URL: https://issues.apache.org/jira/browse/PHOENIX-3439 > Project: Phoenix > Issue Type: Bug > Affects Versions: 4.8.1 > Reporter: Jan Fernando > Assignee: James Taylor > Fix For: 4.9.0 > > Attachments: PHOENIX-3439.patch > > > We use Phoenix RVCs to support paginated queries. This performance of this > functionality relies on Phoenix predictably generating scans against a table > or index with a PK that matches the RVC specified for each page. > What we do is that on the initial query we use > PhoenixRuntime.getPkColsDataTypesForSql() to get the list of PK columns and > persist that and use those to generate RVCs for paginated queries. > We have discovered that for queries where: > a) the user doesn't specify an ORDER BY > b) for tables where secondary indexes are present > Phoenix returns pk cols for the base table via getPkColsDataTypesForSql() but > then subsequent queries using the RVCs to paginate execute against a > secondary index doing a full scan. > We have a table with a secondary index where this is an issue. The base table > has a PK of PKCOL1, PKCOL2, PKCOL3 and > PKCOL4. We have an immutable secondary index where the PK is PKCOL1, PKCOL3, > PKCOL2, PKCOL4. > Here's what happens: > Here is our query we run to get the Query plan from which we generate the > RVCs to be used for paging: > EXPLAIN SELECT pkcol1, pkcol2, pkcol3, pkcol4, col1, col2 > FROM MY_TABLES."MYTABLE" > LIMIT 501; > I get the following explain: > CLIENT 6-CHUNK SERIAL 1-WAY ROUND ROBIN RANGE SCAN OVER MY_TABLES.MY_TABLE > ['00Dxx0000001gFA'] > SERVER 501 ROW LIMIT > CLIENT 501 ROW LIMIT > Therefore the columns we record for RVCs for paging are PK1, PK2, PK3, PK4 > from MY_TABLES.MY_TABLE > However when I generate the RVC query to page through the data: > EXPLAIN SELECT pkcol1, pkcol2, pkcol3, pkcol4, col1, col2 > FROM MY_TABLES."MYTABLE" > (pkcol1, pkcol2, pkcol3,pkcol4) > > ('001','001xx000003DHml',to_date('2015-10-21 09 > (tel:2015102109):50:55.0'),'017xx0000022FuI') > LIMIT 501; > I get the follow explain plan: > CLIENT 24-CHUNK 7811766 ROWS 6291457403 BYTES PARALLEL 1-WAY ROUND ROBIN > RANGE SCAN OVER MY_TABLES.MY_SECONDARY_INDEX ['00Dxx0000001gFA','001'] - > ['00Dxx0000001gFA',*] > SERVER FILTER BY ("PKCOL1", "PKCOL2, "PKCOL3", "PKCOL4") > (TO_CHAR('001'), > TO_CHAR('001xx000003DHml'), DATE '2015-10-21 09 (tel:2015102109):50:55.000', > TO_CHAR('017xx0000022FuI')) > SERVER 501 ROW LIMIT > CLIENT 501 ROW LIMIT > We expected that the second query with RVCs above would execute against the > base table as the base table PK is PKCOL1, PKCOL2, PKCOL3, PKCOL4 and the > index PK is PKCOL1, PKCOL3, PKCOL2, PKCOL4. -- This message was sent by Atlassian JIRA (v6.3.4#6332)