I recently ran the following test with the latest code.

1. Create the table:

    CREATE TABLE rx5_tbox_parquet_all(
      carid STRING,
      inputstime TIMESTAMP,
      carsyspwrmod INT,
      cardofrontpas INT,
      cardofrontdrv INT,
      cardorearleft INT,
      cardorearright INT,
      carbonnet INT,
      carboot INT,
      carwinfrontleft INT,
      carwinrearleft INT,
      carwinfrontright INT,
      carwinrearright INT,
      carsunroof INT,
      carcsactive INT,
      carcsenabled INT,
      carseatbeltdrv INT
    )
    STORED BY 'carbondata'
    TBLPROPERTIES('SORT_COLUMNS'='carid', 'DICTIONARY_INCLUDE'='carid')

2. Load 0.1 billion records.

3. Run the SQL below:

    select carid, inputstime, carsyspwrmod, cardofrontpas, cardofrontdrv,
           cardorearleft, cardorearright, carbonnet, carboot,
           carwinfrontleft, carwinrearleft, carwinfrontright, carwinrearright
    from rx5_tbox_parquet_all2
    order by carid
    limit 10

Result with carbondata1.2 master code + spark2.1:

+-----------------+----------+------------+-------------+-------------+-------------+--------------+---------+-------+---------------+--------------+----------------+---------------+
|carid            |inputstime|carsyspwrmod|cardofrontpas|cardofrontdrv|cardorearleft|cardorearright|carbonnet|carboot|carwinfrontleft|carwinrearleft|carwinfrontright|carwinrearright|
+-----------------+----------+------------+-------------+-------------+-------------+--------------+---------+-------+---------------+--------------+----------------+---------------+
|LSJA24790HS020662|null      |2           |0            |0            |0            |0             |0        |0      |0              |0             |0               |0              |
|LSJA24790HS020662|null      |2           |0            |0            |0            |0             |0        |0      |0              |0             |0               |0              |
|LSJA24790HS020662|null      |2           |0            |0            |0            |0             |0        |0      |0              |0             |0               |0              |
|LSJA24790HS020662|null      |2           |0            |0            |0            |0             |0        |0      |0              |0             |0               |0              |
|LSJA24790HS020662|null      |2           |0            |0            |0            |0             |0        |0      |0              |0             |0               |0              |
|LSJA24790HS020662|null      |2           |0            |0            |0            |0             |0        |0      |0              |0             |0               |0              |
|LSJA24790HS020662|null      |2           |0            |0            |0            |0             |0        |0      |0              |0             |0               |0              |
|LSJA24790HS020662|null      |2           |0            |0            |0            |0             |0        |0      |0              |0             |0               |0              |
|LSJA24790HS020662|null      |2           |0            |0            |0            |0             |0        |0      |0              |0             |0               |0              |
|LSJA24790HS020662|null      |2           |0            |0            |0            |0             |0        |0      |0              |0             |0               |0              |
+-----------------+----------+------------+-------------+-------------+-------------+--------------+---------+-------+---------------+--------------+----------------+---------------+

limit 10 query time: *28777*

Result with the order by + limit optimized carbondata1.2 master code + spark1.6.3:

+-----------------+----------+------------+-------------+-------------+-------------+--------------+---------+-------+---------------+--------------+----------------+---------------+
|carid            |inputstime|carsyspwrmod|cardofrontpas|cardofrontdrv|cardorearleft|cardorearright|carbonnet|carboot|carwinfrontleft|carwinrearleft|carwinfrontright|carwinrearright|
+-----------------+----------+------------+-------------+-------------+-------------+--------------+---------+-------+---------------+--------------+----------------+---------------+
|LSJA24790HS020662|null      |2           |0            |0            |0            |0             |0        |0      |0              |0             |0               |0              |
|LSJA24790HS020662|null      |2           |0            |0            |0            |0             |0        |0      |0              |0             |0               |0              |
|LSJA24790HS020662|null      |2           |0            |0            |0            |0             |0        |0      |0              |0             |0               |0              |
|LSJA24790HS020662|null      |2           |0            |0            |0            |0             |0        |0      |0              |0             |0               |0              |
|LSJA24790HS020662|null      |2           |0            |0            |0            |0             |0        |0      |0              |0             |0               |0              |
|LSJA24790HS020662|null      |2           |0            |0            |0            |0             |0        |0      |0              |0             |0               |0              |
|LSJA24790HS020662|null      |2           |0            |0            |0            |0             |0        |0      |0              |0             |0               |0              |
|LSJA24790HS020662|null      |2           |0            |0            |0            |0             |0        |0      |0              |0             |0               |0              |
|LSJA24790HS020662|null      |2           |0            |0            |0            |0             |0        |0      |0              |0             |0               |0              |
|LSJA24790HS020662|null      |2           |0            |0            |0            |0             |0        |0      |0              |0             |0               |0              |
+-----------------+----------+------------+-------------+-------------+-------------+--------------+---------+-------+---------------+--------------+----------------+---------------+

limit 10 query time: *1640*

Clearly, with the optimization the query time drops by more than 90% (28777 to 1640), even when running on spark1.6.3.

Thanks
马云
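
P.S. For anyone who wants to check the improvement figure, here is a minimal sketch of the arithmetic, using only the two query times reported above. The function name `improvement` is just an illustrative helper, not part of CarbonData or Spark, and the two runs used different Spark versions, so the number is a rough comparison rather than a controlled benchmark.

```python
def improvement(before, after):
    """Return (percent reduction in query time, speedup factor)."""
    reduction_pct = (before - after) / before * 100
    speedup = before / after
    return reduction_pct, speedup

# Query times taken from the two runs above (same unit for both).
reduction_pct, speedup = improvement(28777, 1640)
print(f"{reduction_pct:.1f}% less time, {speedup:.1f}x faster")
# prints "94.3% less time, 17.5x faster"
```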