I think the query plan info is not right,

1. Total blocklet from carbondata cli is 233 + 86 = 319
2. But query plan tell me that I have 560 blocklet

I hope below info could help you to locate issue.

***********************************************************************************
I use carbondata cli print the blocklet summary like below:

java -cp "/home/hadoop/carbontool/*:/opt/spark/jars/*"
org.apache.carbondata.tool.CarbonCli -cmd summary -a -p
hdfs://ec2-dca-aa-p-sdn-16.appannie.org:9000/usr/carbon/data/default/storev3/

## Summary
total: 80 blocks, 9 shards, 233 blocklets, 62,698 pages, 2,006,205,228 rows,
12.40GB
avg: 158.72MB/block, 54.50MB/blocklet, 25,077,565 rows/block, 8,610,322
rows/blocklet

java -cp "/home/hadoop/carbontool/*:/opt/spark/jars/*"
org.apache.carbondata.tool.CarbonCli -cmd summary -a -p
hdfs://ec2-dca-aa-p-sdn-16.appannie.org:9000/usr/carbon/data/default/usage_basickpi/

## Summary
total: 30 blocks, 14 shards, 86 blocklets, 3,498 pages, 111,719,467 rows,
4.24GB
avg: 144.57MB/block, 50.43MB/blocklet, 3,723,982 rows/block, 1,299,063
rows/blocklet


************************************************************************************
But at the same time, I run a sql, carbon told me below info:

|== CarbonData Profiler ==
Table Scan on storev3
 - total: 194 blocks, 560 blocklets
 - filter: (((((((((granularity <> null and date <> null) and date >=
1472688000000000 between date <= 1475280000000000) and true) and granularity
= monthly) and country_code in
(LiteralExpression(US);LiteralExpression(CN);LiteralExpression(JP);)) and
device_code in (LiteralExpression(ios-phone);)) and product_id <> null) and
country_code <> null) and device_code <> null)
 - pruned by Main DataMap
    - skipped: 192 blocks, 537 blocklets


************************************************************************************
The select sql like is

SELECT f.country_code, f.date, f.product_id, f.category_id, f.arpu FROM (
    SELECT a.country_code, a.date, a.product_id, a.category_id,
a.revenue/a.average_active_users as arpu
    FROM(
        SELECT r.device_code, r.category_id, r.country_code, r.date,
r.product_id, r.revenue, u.average_active_users
        FROM
        (
            SELECT b.device_code, b.country_code, b.product_id,  b.date,
b.category_id, sum(b.revenue) as revenue
            FROM (
                SELECT v.device_code, v.country_code, v.product_id,
v.revenue, v.date, p.category_id FROM
                (
                    SELECT device_code, country_code, product_id,
est_revenue as revenue, timeseries(date, 'month') as date
                    FROM storev3
                    WHERE market_code='apple-store' AND date BETWEEN
'2016-09-01' AND '2016-10-01' and device_code in ('ios-phone') and
country_code in ('US', 'CN', 'JP')
                ) as v
                JOIN(
                    SELECT DISTINCT product_id, category_id
                    FROM storev3
                    WHERE market_code='apple-store' AND date BETWEEN
'2016-09-01' AND '2016-10-01' and device_code in ('ios-phone') and
category_id in (100000, 100001, 100021) and country_code in ('US', 'CN',
'JP')
                ) as p
                ON p.product_id = v.product_id
            ) as b
            GROUP BY b.device_code, b.country_code, b.product_id, b.date,
b.category_id
        ) AS r
        JOIN
        (
            SELECT country_code, date, product_id, (CASE WHEN
est_average_active_users is not NULL THEN est_average_active_users ELSE 0
END) as average_active_users, device_code
            FROM usage_basickpi
            WHERE date BETWEEN '2016-09-01' AND '2016-10-01'and granularity
='monthly' and country_code in ('US', 'CN', 'JP') AND device_code in
('ios-phone')
        ) AS u
        ON r.country_code=u.country_code AND r.date=u.date AND
r.product_id=u.product_id AND r.device_code=u.device_code
    ) AS a
)AS f
ORDER BY f.arpu DESC
LIMIT 10

Thanks
Aaron




--
Sent from: 
http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/

Reply via email to