[ https://issues.apache.org/jira/browse/PHOENIX-3836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041133#comment-16041133 ]
James Taylor commented on PHOENIX-3836:
---------------------------------------

+1. Nice find, [~samarthjain]!

> Estimated row count is twice the actual row count when stats are updated via major compaction
> ---------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-3836
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3836
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Mujtaba Chohan
>            Assignee: Samarth Jain
>             Fix For: 4.11.0
>
>         Attachments: PHOENIX-3836.patch
>
>
> Estimated row count for a 2M table is 3986498 after stats updated via major compaction vs 1993250 with {{update statistics}}.
> {noformat}
> Explain plan for count(*) on 2M row table after major compaction:
> +--------------------------------------------------------------------------------------+
> |                                         PLAN                                         |
> +--------------------------------------------------------------------------------------+
> | CLIENT 364-CHUNK 3986498 ROWS 3774892993 BYTES PARALLEL 1-WAY FULL SCAN OVER T       |
> | SERVER FILTER BY FIRST KEY ONLY                                                      |
> | SERVER AGGREGATE INTO SINGLE ROW                                                     |
> +--------------------------------------------------------------------------------------+
> Explain plan for count(*) on 2M row table after update statistics:
> +--------------------------------------------------------------------------------------+
> |                                         PLAN                                         |
> +--------------------------------------------------------------------------------------+
> | CLIENT 364-CHUNK 1993250 ROWS 3774892993 BYTES PARALLEL 1-WAY FULL SCAN OVER T       |
> | SERVER FILTER BY FIRST KEY ONLY                                                      |
> | SERVER AGGREGATE INTO SINGLE ROW                                                     |
> +--------------------------------------------------------------------------------------+
> {noformat}
> Following schema was used with 2M rows and 10MB guidepost width:
> {noformat}
> CREATE TABLE IF NOT EXISTS T (PKA CHAR(15) NOT NULL, PKF CHAR(3) NOT NULL,
>  PKP CHAR(15) NOT NULL, CRD DATE NOT NULL, EHI CHAR(15) NOT NULL, STD_COL VARCHAR, INDEXED_COL INTEGER,
>  CONSTRAINT PK PRIMARY KEY ( PKA, PKF, PKP, CRD DESC, EHI))
>  VERSIONS=1,MULTI_TENANT=true,IMMUTABLE_ROWS=true
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)