Note that with AWS, you can use Placement Groups
<http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/placement-groups.html>
and EC2 instances with Enhanced Networking
<http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/enhanced-networking.html>
to lower network latency and increase network throughput within the same AZ
(data center).
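
For reference, a minimal sketch of setting this up with boto3 (the placement
group name, AMI id, and instance type below are placeholders, not something
from this thread):

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Cluster placement groups pack instances close together within one AZ.
    ec2.create_placement_group(GroupName="spark-cluster-pg", Strategy="cluster")

    # Launch into the placement group; pick an HVM AMI and an instance type
    # that supports Enhanced Networking (e.g. the c4 / r3 families).
    ec2.run_instances(
        ImageId="ami-xxxxxxxx",      # placeholder
        InstanceType="c4.2xlarge",
        MinCount=4,
        MaxCount=4,
        Placement={"GroupName": "spark-cluster-pg"},
    )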

On Tue, Dec 22, 2015 at 12:11 AM, Eran Witkon <eranwit...@gmail.com> wrote:

> I'll check it out.
>
> On Tue, 22 Dec 2015 at 00:30 Michal Klos <michal.klo...@gmail.com> wrote:
>
>> If you are running on Amazon, then it's always a crapshoot as well.
>>
>> M
>>
>> On Dec 21, 2015, at 4:41 PM, Josh Rosen <joshro...@databricks.com> wrote:
>>
>> @Eran, are Server 1 and Server 2 both part of the same cluster / do they
>> have similar positions in the network topology w.r.t the Spark executors?
>> If Server 1 had fast network access to the executors but Server 2 was
>> across a WAN, then I'd expect the job to run slower from Server 2 due to
>> the extra network latency / reduced bandwidth. This is assuming that you're
>> running the driver in non-cluster deploy mode (so the driver process runs
>> on the machine which submitted the job).
>>
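
A quick way to check which case you are in is to read the deploy mode back
from the running application; a minimal pyspark sketch (assuming a working
installation):

    from pyspark import SparkConf, SparkContext

    sc = SparkContext(conf=SparkConf().setAppName("deploy-mode-check"))
    # "client" means the driver runs on the machine that submitted the job,
    # so that machine's network position relative to the executors matters.
    print(sc.getConf().get("spark.submit.deployMode", "client"))
    print(sc.getConf().get("spark.master"))
    sc.stop()
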
>> On Mon, Dec 21, 2015 at 1:30 PM, Igor Berman <igor.ber...@gmail.com>
>> wrote:
>>
>>> look for differences: packages versions, cpu/network/memory diff etc etc
>>>
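
One way to start that comparison is to dump the same basic environment info
on both servers and diff the output; a rough sketch using only the Python
standard library:

    import platform, multiprocessing, sys

    # Run this on both servers as the same user and diff the results.
    print("python :", platform.python_version())
    print("system :", platform.platform())
    print("cpus   :", multiprocessing.cpu_count())
    print("prefix :", sys.prefix)  # surfaces virtualenv / interpreter differences
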
>>>
>>> On 21 December 2015 at 14:53, Eran Witkon <eranwit...@gmail.com> wrote:
>>>
>>>> Hi,
>>>> I know it is a broad question, but can you think of reasons why a pyspark
>>>> job which runs from server 1 using user 1 will run faster than the same
>>>> job when running on server 2 with user 1?
>>>> Eran
>>>>
>>>
>>>
>>


-- 

*Chris Fregly*
Principal Data Solutions Engineer
IBM Spark Technology Center, San Francisco, CA
http://spark.tc | http://advancedspark.com
