My aim in setting the task number is to increase query speed. I have
also found that "mapPartitionsWithIndex at
Operator.scala:333<http://192.168.1.101:4040/stages/stage?id=17>"
is costing a lot of time. So my other question is:
how can I tune
mapPartitionsWithIndex<http://192.168.1.101:4040/stages/stage?id=17>
to bring its cost down?
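One likely reason only 10 tasks appear despite the setting: spark.default.parallelism generally affects the task count of shuffle stages, while the number of tasks in an input (scan) stage follows the number of input splits/blocks of the table. A hedged sketch of the shark-env.sh fragment from this thread, with that caveat noted (exact behavior depends on your Shark/Spark version):

```shell
# shark-env.sh -- sketch only; exact effect depends on the Shark/Spark version.
# spark.default.parallelism typically sets the task count for SHUFFLE stages;
# an input (scan) stage gets roughly one task per input split/block, so a
# table stored in 10 blocks yields 10 tasks regardless of this setting.
export SPARK_JAVA_OPTS+="-Dspark.default.parallelism=40 "
```

To see how many partitions an RDD has, one can call rdd.partitions.size in the Spark shell, or look at the task count per stage in the web UI on port 4040 (as in the stage links above). Repartitioning the underlying data into more blocks, or repartitioning the RDD itself, is what actually raises the task count of the scan stage.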




2014-05-22 18:09 GMT+08:00 qingyang li <liqingyang1...@gmail.com>:

> I have added  SPARK_JAVA_OPTS+="-Dspark.default.parallelism=40 "  in
> shark-env.sh, but I find there are only 10 tasks on the cluster, 2
> tasks per machine.
>
>
> 2014-05-22 18:07 GMT+08:00 qingyang li <liqingyang1...@gmail.com>:
>
> I have added  SPARK_JAVA_OPTS+="-Dspark.default.parallelism=40 "  in
>> shark-env.sh
>>
>>
>> 2014-05-22 17:50 GMT+08:00 qingyang li <liqingyang1...@gmail.com>:
>>
>> I am using Tachyon as the storage system and Shark to query a large
>>> table. I have 5 machines in a Spark cluster, with 4 cores on each
>>> machine.
>>> My questions are:
>>> 1. How do I set the number of tasks per core?
>>> 2. Where can I see how many partitions an RDD has?
>>>
>>
>>
>
