You need to use `spark.sql.shuffle.partitions`; `spark.default.parallelism` does not apply to Spark SQL shuffles, whose partition count defaults to 200.

// maropu
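For example, a minimal sketch (assuming a Spark 1.6-era spark-shell, where `sqlContext` is predefined; the table and column names below are hypothetical):

```scala
// Lower the reduce-task count for SQL/DataFrame shuffles at runtime.
// spark.default.parallelism has no effect here.
sqlContext.setConf("spark.sql.shuffle.partitions", "20")

// Any shuffle-producing query (GROUP BY, JOIN, ...) now runs with
// 20 reduce tasks instead of the default 200.
val counts = sqlContext.sql(
  "SELECT userid, COUNT(*) AS cnt FROM userprofile GROUP BY userid")
println(counts.rdd.partitions.length)  // 20
```

Equivalently, you can put `spark.sql.shuffle.partitions 20` in spark-defaults.conf, which also means only 20 output files are written instead of 200 (many of them empty when the data is small).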
On Fri, May 20, 2016 at 8:17 PM, <251922...@qq.com> wrote:

> Hi all.
> I set `spark.default.parallelism` to 20 in spark-defaults.conf and sent the
> file to all nodes, but the reduce task count is still the default value, 200.
> Has anyone else encountered this problem? Can anyone give some advice?
>
> ############
> [Stage 9:> (0 + 0) / 200]
> [Stage 9:> (0 + 2) / 200]
> [Stage 9:> (1 + 2) / 200]
> [Stage 9:> (2 + 2) / 200]
> #######
>
> This also results in many empty files: because my data is small, only some
> of the 200 output files contain data.
> #######
> 2016-05-20 17:01 /warehouse/dmpv3.db/datafile/tmp/output/userprofile/20160519/part-00000
> 2016-05-20 17:01 /warehouse/dmpv3.db/datafile/tmp/output/userprofile/20160519/part-00001
> 2016-05-20 17:01 /warehouse/dmpv3.db/datafile/tmp/output/userprofile/20160519/part-00002
> 2016-05-20 17:01 /warehouse/dmpv3.db/datafile/tmp/output/userprofile/20160519/part-00003
> 2016-05-20 17:01 /warehouse/dmpv3.db/datafile/tmp/output/userprofile/20160519/part-00004
> 2016-05-20 17:01 /warehouse/dmpv3.db/datafile/tmp/output/userprofile/20160519/part-00005
> ########

--
---
Takeshi Yamamuro