Hi Mayur, thanks for replying. I know a Spark application should take all cores by default. My question is how to set the number of tasks on each core. If one slice corresponds to one task, how can I set the slice (split) file size?
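For context, a minimal spark-shell sketch of the knobs involved (the Tachyon path below is just an example, not my actual table): the `minPartitions` argument to `textFile` is what controls how many slices, and therefore how many tasks, the input produces, and `rdd.partitions.size` shows the partition count of an RDD.

```scala
// In spark-shell; the path is a hypothetical example.
// The second argument to textFile is a *minimum* number of
// partitions (slices); each partition becomes one task per stage.
val rdd = sc.textFile("tachyon://master:19998/bigtable", 40)

// Where to see how many partitions one RDD has:
println(rdd.partitions.size)

// An existing RDD can also be re-split explicitly,
// at the cost of a shuffle:
val rdd40 = rdd.repartition(40)
```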
2014-05-23 16:37 GMT+08:00 Mayur Rustagi <mayur.rust...@gmail.com>:

> How many cores do you see on your Spark master (port 8080)?
> By default a Spark application should take all cores when you launch it,
> unless you have set a max-cores configuration.
>
> Mayur Rustagi
> Ph: +1 (760) 203 3257
> http://www.sigmoidanalytics.com
> @mayur_rustagi <https://twitter.com/mayur_rustagi>
>
> On Thu, May 22, 2014 at 4:07 PM, qingyang li <liqingyang1...@gmail.com> wrote:
>
>> My aim in setting the task number is to increase query speed, and I
>> have also found that "mapPartitionsWithIndex at
>> Operator.scala:333 <http://192.168.1.101:4040/stages/stage?id=17>"
>> is costing much time. So my other question is: how to tune
>> mapPartitionsWithIndex <http://192.168.1.101:4040/stages/stage?id=17>
>> to bring that cost down?
>>
>> 2014-05-22 18:09 GMT+08:00 qingyang li <liqingyang1...@gmail.com>:
>>
>>> I have added SPARK_JAVA_OPTS+="-Dspark.default.parallelism=40 " in
>>> shark-env.sh, but I find there are only 10 tasks on the cluster and
>>> 2 tasks on each machine.
>>>
>>> 2014-05-22 18:07 GMT+08:00 qingyang li <liqingyang1...@gmail.com>:
>>>
>>>> I have added SPARK_JAVA_OPTS+="-Dspark.default.parallelism=40 " in
>>>> shark-env.sh.
>>>>
>>>> 2014-05-22 17:50 GMT+08:00 qingyang li <liqingyang1...@gmail.com>:
>>>>
>>>>> I am using Tachyon as the storage system and Shark to query a
>>>>> large table. I have 5 machines in a Spark cluster, with 4 cores
>>>>> on each machine.
>>>>> My questions are:
>>>>> 1. How to set the task number on each core?
>>>>> 2. Where to see how many partitions one RDD has?
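For reference, a sketch of the shark-env.sh setting discussed above (the value 40 is just the example from this thread, not a recommendation). Note that spark.default.parallelism only sets the default partition count for shuffle (reduce-side) operations; the number of tasks on the map side is driven by the input file's splits, which is why the setting alone may not change the task count observed above. Also note the leading space inside the quotes, so the option does not get concatenated onto any earlier contents of SPARK_JAVA_OPTS.

```shell
# shark-env.sh -- example value only
# Default number of partitions for shuffle operations:
SPARK_JAVA_OPTS+=" -Dspark.default.parallelism=40"
export SPARK_JAVA_OPTS
```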