[ https://issues.apache.org/jira/browse/CARBONDATA-3593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jacky Li resolved CARBONDATA-3593.
----------------------------------
    Fix Version/s: 2.0.0
       Resolution: Fixed

> total_blocklets in query statistic always the same with valid_blocklets
> -----------------------------------------------------------------------
>
>                 Key: CARBONDATA-3593
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-3593
>             Project: CarbonData
>          Issue Type: Improvement
>          Components: core
>            Reporter: Hong Shen
>            Priority: Major
>             Fix For: 2.0.0
>
>          Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> When I run SQL on a CarbonData table with "enable.query.statistics=true",
> total_blocklets in the query statistics is always the same as valid_blocklets.
> Below is an example.
> Tables test_table_hdfs_sort_city and test_table_hdfs_no_sort contain the same
> data; the only difference is that test_table_hdfs_sort_city has
> SORT_COLUMN='city_name', while test_table_hdfs_no_sort has no sort column.
> {code}
> carbon.sql("select * from test_table_hdfs_sort_city where city_name='city1'")
> {code}
> |scan_blocks_num|total_blocklets|valid_blocklets|total_pages|scanned_pages|valid_pages|
> |1|1|1|193|4|4|
> {code}
> carbon.sql("select * from test_table_hdfs_no_sort where city_name='city1'")
> {code}
> |scan_blocks_num|total_blocklets|valid_blocklets|total_pages|scanned_pages|valid_pages|
> |1|3|3|193|193|193|
> After reading the code, I found that both TOTAL_BLOCKLET_NUM and
> VALID_SCAN_BLOCKLET_NUM are incremented by 1 in
> BlockletFilterScanner.executeFilter(),
> BlockletFilterScanner.executeFilterForPages, and
> BlockletFullScanner.scanBlocklet, so the two counters can never differ.
> I think total_blocklets should be the total number of blocklets, and
> valid_blocklets should be the number of blocklets that remain after
> filtering. If this needs to be changed, I can provide a patch, since I have
> already modified it locally.
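
For illustration only, the distinction the reporter asks for could be sketched as below. This is a minimal stand-alone sketch, not CarbonData's scanner code and not the patch that resolved this issue: the class BlockletCounterSketch, the nested Blocklet type, its min/max fields, and the count() helper are all hypothetical names invented for the example. The idea is that the total counter is incremented for every blocklet considered, while the valid counter is incremented only for blocklets whose min/max range may contain the filter value.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: NOT CarbonData's actual scanner implementation.
final class BlockletCounterSketch {

    /** Stand-in for a blocklet, holding only the min/max of one string column. */
    static final class Blocklet {
        final String min;
        final String max;

        Blocklet(String min, String max) {
            this.min = min;
            this.max = max;
        }

        /** True if an equality filter on this value could match rows in the blocklet. */
        boolean mayContain(String value) {
            return min.compareTo(value) <= 0 && max.compareTo(value) >= 0;
        }
    }

    /** Returns {totalBlocklets, validBlocklets} for an equality filter. */
    static int[] count(List<Blocklet> blocklets, String filterValue) {
        int total = 0;
        int valid = 0;
        for (Blocklet b : blocklets) {
            total++;                          // counted for every blocklet considered
            if (b.mayContain(filterValue)) {
                valid++;                      // counted only if the blocklet survives filtering
            }
        }
        return new int[] {total, valid};
    }

    public static void main(String[] args) {
        // Made-up min/max ranges, chosen only to show the two counters diverging.
        List<Blocklet> blocklets = Arrays.asList(
                new Blocklet("city1", "city3"),
                new Blocklet("city4", "city6"),
                new Blocklet("city7", "city9"));
        int[] stats = count(blocklets, "city1");
        System.out.println("total_blocklets=" + stats[0] + ", valid_blocklets=" + stats[1]);
    }
}
```

With counters kept apart like this, a filter that prunes blocklets reports a valid count lower than the total, which is the behaviour the issue describes as missing.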