Re: RDD: Execution and Scheduling

2015-09-20 Thread gsvic
Concerning answers 1 and 2: 1) How Spark determines a node as a "slow node" and how slow is that? 2) How an RDD chooses a location as a preferred location and with which criteria? Could you please also include the links of the source files for the two questions above? -- View this message

Re: RDD: Execution and Scheduling

2015-09-20 Thread Reynold Xin
On Sun, Sep 20, 2015 at 3:58 PM, gsvic wrote: > Concerning answers 1 and 2: > > 1) How Spark determines a node as a "slow node" and how slow is that? > There are two cases here: 1. If a node is busy (e.g. all slots are already occupied), the scheduler cannot schedule

Re: Using scala-2.11 when making changes to spark source

2015-09-20 Thread Ted Yu
Maybe the following can be used for changing Scala version: http://maven.apache.org/archetype/maven-archetype-plugin/ I played with it a little bit but didn't get far. FYI On Sun, Sep 20, 2015 at 6:18 AM, Stephen Boesch wrote: > > The dev/change-scala-version.sh [2.11]

Using scala-2.11 when making changes to spark source

2015-09-20 Thread Stephen Boesch
The dev/change-scala-version.sh [2.11] script modifies in-place the pom.xml files across all of the modules. This is a git-visible change. So if we wish to make changes to spark source in our own fork's - while developing with scala 2.11 - we would end up conflating those updates with our own.