I think the query plan info is not right, 1. Total blocklet from carbondata cli is 233 + 86 = 319 2. But query plan tell me that I have 560 blocklet
I hope below info could help you to locate issue. *********************************************************************************** I use carbondata cli print the blocklet summary like below: java -cp "/home/hadoop/carbontool/*:/opt/spark/jars/*" org.apache.carbondata.tool.CarbonCli -cmd summary -a -p hdfs://ec2-dca-aa-p-sdn-16.appannie.org:9000/usr/carbon/data/default/storev3/ ## Summary total: 80 blocks, 9 shards, 233 blocklets, 62,698 pages, 2,006,205,228 rows, 12.40GB avg: 158.72MB/block, 54.50MB/blocklet, 25,077,565 rows/block, 8,610,322 rows/blocklet java -cp "/home/hadoop/carbontool/*:/opt/spark/jars/*" org.apache.carbondata.tool.CarbonCli -cmd summary -a -p hdfs://ec2-dca-aa-p-sdn-16.appannie.org:9000/usr/carbon/data/default/usage_basickpi/ ## Summary total: 30 blocks, 14 shards, 86 blocklets, 3,498 pages, 111,719,467 rows, 4.24GB avg: 144.57MB/block, 50.43MB/blocklet, 3,723,982 rows/block, 1,299,063 rows/blocklet ************************************************************************************ But at the same time, I run a sql, carbon told me below info: |== CarbonData Profiler == Table Scan on storev3 - total: 194 blocks, 560 blocklets - filter: (((((((((granularity <> null and date <> null) and date >= 1472688000000000 between date <= 1475280000000000) and true) and granularity = monthly) and country_code in (LiteralExpression(US);LiteralExpression(CN);LiteralExpression(JP);)) and device_code in (LiteralExpression(ios-phone);)) and product_id <> null) and country_code <> null) and device_code <> null) - pruned by Main DataMap - skipped: 192 blocks, 537 blocklets ************************************************************************************ The select sql like is SELECT f.country_code, f.date, f.product_id, f.category_id, f.arpu FROM ( SELECT a.country_code, a.date, a.product_id, a.category_id, a.revenue/a.average_active_users as arpu FROM( SELECT r.device_code, r.category_id, r.country_code, r.date, r.product_id, r.revenue, u.average_active_users FROM ( SELECT b.device_code, b.country_code, b.product_id, b.date, b.category_id, sum(b.revenue) as revenue FROM ( SELECT v.device_code, v.country_code, v.product_id, v.revenue, v.date, p.category_id FROM ( SELECT device_code, country_code, product_id, est_revenue as revenue, timeseries(date, 'month') as date FROM storev3 WHERE market_code='apple-store' AND date BETWEEN '2016-09-01' AND '2016-10-01' and device_code in ('ios-phone') and country_code in ('US', 'CN', 'JP') ) as v JOIN( SELECT DISTINCT product_id, category_id FROM storev3 WHERE market_code='apple-store' AND date BETWEEN '2016-09-01' AND '2016-10-01' and device_code in ('ios-phone') and category_id in (100000, 100001, 100021) and country_code in ('US', 'CN', 'JP') ) as p ON p.product_id = v.product_id ) as b GROUP BY b.device_code, b.country_code, b.product_id, b.date, b.category_id ) AS r JOIN ( SELECT country_code, date, product_id, (CASE WHEN est_average_active_users is not NULL THEN est_average_active_users ELSE 0 END) as average_active_users, device_code FROM usage_basickpi WHERE date BETWEEN '2016-09-01' AND '2016-10-01'and granularity ='monthly' and country_code in ('US', 'CN', 'JP') AND device_code in ('ios-phone') ) AS u ON r.country_code=u.country_code AND r.date=u.date AND r.product_id=u.product_id AND r.device_code=u.device_code ) AS a )AS f ORDER BY f.arpu DESC LIMIT 10 Thanks Aaron -- Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/