I have a basic question: how does Spark assign partitions to executors? Does it
respect data locality? Does this behaviour depend on the cluster manager, i.e.
YARN vs. standalone?
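
For example (a rough sketch in spark-shell; the HDFS path is just a placeholder),
what I am asking is whether the task for each partition of an RDD like the one
below gets scheduled on an executor running on the DataNode that holds the
corresponding block:

    val rdd = sc.textFile("hdfs:///data/events")   // placeholder path
    rdd.partitions.take(3).foreach { p =>
      // hosts holding the HDFS block behind this partition; the scheduler tries
      // to run the task on one of them (NODE_LOCAL) before falling back
      println(s"partition ${p.index} -> ${rdd.preferredLocations(p).mkString(", ")}")
    }
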
On 22 Jun 2015 22:45, "Akhil Das" <ak...@sigmoidanalytics.com> wrote:

> Option 1 should be fine; Option 2 would be heavily bound by the network as the
> data grows over time.
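>
> For instance (just an illustrative sketch, hostnames are placeholders), Option 1
> on the standalone manager would mean starting the workers on the DataNode hosts,
> e.g. by listing them in conf/slaves, so that HDFS reads stay node-local:
>
>     # conf/slaves -- one Spark worker per DataNode host (placeholder hostnames)
>     datanode1.example.com
>     datanode2.example.com
>     datanode3.example.com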
>
> Thanks
> Best Regards
>
> On Mon, Jun 22, 2015 at 5:59 PM, Ashish Soni <asoni.le...@gmail.com>
> wrote:
>
>> Hi All,
>>
>> What is the best way to install a Spark cluster alongside a Hadoop
>> cluster? Any recommendation on the deployment topologies below would be a
>> great help.
>>
>> *Also, is it necessary to put the Spark Workers on the DataNodes so that when
>> they read blocks from HDFS the reads are local to the server/worker, or can I
>> put the Workers on other nodes? If I do that, will it affect the performance
>> of the Spark data processing?*
>>
>> Hadoop Option 1
>>
>> Server 1 - NameNode   & Spark Master
>> Server 2 - DataNode 1  & Spark Worker
>> Server 3 - DataNode 2  & Spark Worker
>> Server 4 - DataNode 3  & Spark Worker
>>
>> Hadoop Option 2
>>
>> Server 1 - NameNode
>> Server 2 - Spark Master
>> Server 2 - DataNode 1
>> Server 3 - DataNode 2
>> Server 4 - DataNode 3
>> Server 5 - Spark Worker 1
>> Server 6 - Spark Worker 2
>> Server 7 - Spark Worker 3
>>
>> Thanks.
>>
>
