Re: is repartition very cost

2015-12-09 Thread Daniel Siegmann
ral you need to do performance testing to see if a repartition is worth the shuffle time. A common model is to repartition the data once after ingest to achieve parallelism and avoid shuffles whenever possible later. *From:* Zhiliang Zhu [mailto:zchl.j...@yahoo.com.IN

Re: is repartition very cost

2015-12-08 Thread Zhiliang Zhu
r 08, 2015 5:05 AM To: User Subject: is repartition very cost Hi All, I need to do optimize objective function with some linear constraints by genetic algorithm. I would like to make as much parallelism for it by spark. repartition / shuffle may be used sometimes in it, however, is

RE: is repartition very cost

2015-12-08 Thread Young, Matthew T
repartition is worth the shuffle time. A common model is to repartition the data once after ingest to achieve parallelism and avoid shuffles whenever possible later. From: Zhiliang Zhu [mailto:zchl.j...@yahoo.com.INVALID] Sent: Tuesday, December 08, 2015 5:05 AM To: User Subject: is repartition very cost
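
The "repartition once after ingest, then avoid shuffles" model described above can be illustrated with a toy model in plain Python. This is only a sketch and not Spark itself: the helpers `repartition` and `map_partitions` are hypothetical stand-ins for Spark's `repartition` and for narrow transformations such as `map` / `mapPartitions`, and the round-robin placement and the 12-record input are assumptions made for illustration. It counts how many records have to move, which is roughly what a shuffle pays for.

```python
def repartition(partitions, num_partitions):
    """Toy full shuffle: spread all records round-robin over num_partitions.

    Returns the new partitions and the number of records that ended up
    in a different partition index than the one they started in.
    """
    out = [[] for _ in range(num_partitions)]
    moved = 0
    i = 0
    for src_index, part in enumerate(partitions):
        for record in part:
            dest = i % num_partitions
            out[dest].append(record)
            if dest != src_index:
                moved += 1
            i += 1
    return out, moved


def map_partitions(partitions, fn):
    """Toy narrow transformation: each partition is processed where it is."""
    return [[fn(record) for record in part] for part in partitions]


# Skewed ingest (assumed): everything landed in one partition out of four.
ingested = [list(range(12)), [], [], []]

# Pay the shuffle cost once, right after ingest.
balanced, moved = repartition(ingested, 4)
sizes = [len(part) for part in balanced]

# Later steps stay narrow, so no further records move between partitions.
squared = map_partitions(balanced, lambda x: x * x)

print("partition sizes after repartition:", sizes)
print("records moved by the one-time repartition:", moved)
```

Whether that one-time cost pays off in a real job still depends on data size and cluster layout, which is why the reply recommends performance testing.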

is repartition very cost

2015-12-08 Thread Zhiliang Zhu
Hi All, I need to optimize an objective function with some linear constraints using a genetic algorithm. I would like to get as much parallelism as possible for it with Spark. repartition / shuffle may sometimes be used in it; however, is the repartition API very costly? Thanks in advance! Zhiliang