Bin Shi created PHOENIX-4916:
--------------------------------

             Summary: When collecting statistics, the estimated size of a guide 
post may only count part of cells of the last row
                 Key: PHOENIX-4916
                 URL: https://issues.apache.org/jira/browse/PHOENIX-4916
             Project: Phoenix
          Issue Type: Bug
            Reporter: Bin Shi
            Assignee: Bin Shi


In DefaultStatisticsCollector.collectStatistics(...), the code iterates over 
all cells of the current row. Once the accumulated estimated size plus the 
size of the current cell is >= the guide post width, it skips all the 
remaining cells. As a result, the estimated size of a guide post may count 
only some of the cells of the last row.
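
The effect can be illustrated with a small standalone sketch. This is not the 
actual Phoenix code: the class name, method names, and the use of plain int 
arrays for cell sizes are all hypothetical, and it assumes that the cell which 
reaches the width is itself counted. It only contrasts stopping in the middle 
of a row with finishing the row before closing the guide post.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Phoenix.
public class GuidePostSizeSketch {

    // Described behavior: stop as soon as the accumulated size reaches the
    // guide post width, so later cells of the same row are never counted.
    static long sizeSkippingRemainingCells(List<int[]> rows, long guidePostWidth) {
        long accumulated = 0;
        for (int[] cellSizes : rows) {
            for (int cellSize : cellSizes) {
                accumulated += cellSize; // assumption: the crossing cell is counted
                if (accumulated >= guidePostWidth) {
                    return accumulated;  // remaining cells of this row are skipped
                }
            }
        }
        return accumulated;
    }

    // Alternative: note that the width was reached, but keep counting until
    // the end of the current row before closing the guide post.
    static long sizeCountingWholeRows(List<int[]> rows, long guidePostWidth) {
        long accumulated = 0;
        for (int[] cellSizes : rows) {
            boolean widthReached = false;
            for (int cellSize : cellSizes) {
                accumulated += cellSize;
                if (accumulated >= guidePostWidth) {
                    widthReached = true;
                }
            }
            if (widthReached) {
                return accumulated;
            }
        }
        return accumulated;
    }

    public static void main(String[] args) {
        // Two rows of three 10-byte cells each, with a guide post width of 35.
        List<int[]> rows = Arrays.asList(new int[] {10, 10, 10}, new int[] {10, 10, 10});
        System.out.println(sizeSkippingRemainingCells(rows, 35)); // prints 40
        System.out.println(sizeCountingWholeRows(rows, 35));      // prints 60
    }
}
```

With a width far larger than a row, the two results differ by at most one 
row's worth of bytes, which is why the gap only becomes visible with the very 
small widths used in tests.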

This problem can be ignored in clusters with real data, where the guide post 
width is much bigger than the row size, but it does affect unit tests and 
integration tests, because the tests use a very small guide post width, which 
makes the estimated size of the query inaccurate.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
