Hello!

Test result:
When I load CSV data into a CarbonData table 3 times, the executors are distributed unevenly. My goal is one task per node, but in practice some nodes get 2 tasks while others get none.
See the attached load data 1.png, load data 2.png, and load data 3.png.
The attached carbondata data.PNG shows the data layout in Hadoop.
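
For reference, the load is issued from the spark-shell roughly as follows; this is a minimal sketch, and the store path, CSV path, and table name are placeholders rather than my real ones:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.CarbonSession._

// Create a CarbonSession backed by the HDFS store path (placeholder path).
val carbon = SparkSession.builder()
  .config(sc.getConf)
  .getOrCreateCarbonSession("hdfs://namenode:8020/user/carbon/store")

// One load; this is repeated 3 times in the test above.
carbon.sql("LOAD DATA INPATH 'hdfs://namenode:8020/user/carbon/input/data.csv' INTO TABLE test_table")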


Loading 400,000,000 records into the CarbonData table takes 2629 seconds (roughly 152,000 rows per second), which is too long.


Question:
How can I make the executors distribute evenly across the nodes?


The environment:

Spark 2.1 + CarbonData 1.1, with 7 datanodes.


./bin/spark-shell \
--master yarn \
--deploy-mode client \
--num-executors n \
--executor-cores 10 \
--executor-memory 40G \
--driver-memory 8G

(n was 7 the first time, giving load data 1.png; 6 the second time, giving load data 2.png; and 8 the third time, giving load data 3.png)
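
For example, the first run expands to the command below. Adding spark.dynamicAllocation.enabled=false is my own assumption of a way to keep the executor count pinned at 7; the original runs did not necessarily set it:

./bin/spark-shell \
--master yarn \
--deploy-mode client \
--num-executors 7 \
--executor-cores 10 \
--executor-memory 40G \
--driver-memory 8G \
--conf spark.dynamicAllocation.enabled=false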


carbon.properties
######## DataLoading Configuration ########
carbon.sort.file.buffer.size=20
carbon.graph.rowset.size=10000
carbon.number.of.cores.while.loading=10
carbon.sort.size=50000
carbon.number.of.cores.while.compacting=10
carbon.number.of.cores=10
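
These load settings can also be applied from the spark-shell with the CarbonProperties API instead of carbon.properties; a minimal sketch, mirroring the values above:

import org.apache.carbondata.core.util.CarbonProperties

// Apply the same data-loading knobs programmatically
// (same values as in the carbon.properties file above).
val props = CarbonProperties.getInstance()
props.addProperty("carbon.number.of.cores.while.loading", "10")
props.addProperty("carbon.sort.size", "50000")
props.addProperty("carbon.sort.file.buffer.size", "20")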


Best regards!




