Re: default parallelism in trunk

2014-02-02 Thread Koert Kuipers
After the upgrade, spark-shell still behaved properly, but a Scala program
that defined its own SparkContext and did not set spark.default.parallelism
was suddenly stuck with a parallelism of 2. I "fixed it" for now by setting
the desired spark.default.parallelism system property explicitly, rather than
relying on the default.
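
For anyone hitting the same issue, here is a minimal sketch of that workaround,
assuming a Spark 0.9-era standalone program. The master URL, app name, and the
value 8 are placeholders, not details from this thread:

import org.apache.spark.SparkContext

object ParallelismWorkaround {
  def main(args: Array[String]): Unit = {
    // Set the property explicitly before the SparkContext is created,
    // instead of relying on the scheduler's default.
    System.setProperty("spark.default.parallelism", "8") // placeholder value

    val sc = new SparkContext("spark://master:7077", "parallelism-workaround")

    // Operations that do not pass an explicit number of partitions
    // now pick up the value set above.
    println(sc.defaultParallelism)

    sc.stop()
  }
}

Passing the same key via a SparkConf (new in 0.9) to the SparkContext
constructor would be an equivalent approach.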


On Sun, Feb 2, 2014 at 7:48 PM, Aaron Davidson  wrote:

> Sorry, I meant to say we will use the maximum between (the total number of
> cores in the cluster) and (2) if spark.default.parallelism is not set. So
> this should not be causing your problem unless your cluster thinks it has
> less than 2 cores.
>
>
> On Sun, Feb 2, 2014 at 4:46 PM, Aaron Davidson  wrote:
>
>> Could you give an example where default parallelism is set to 2 where it
>> didn't use to be?
>>
>> Here is the relevant section for the spark standalone mode:
>> CoarseGrainedSchedulerBackend.scala#L211.
>> If spark.default.parallelism is set, it will override anything else. If it
>> is not set, we will use the total number of cores in the cluster and 2,
>> which is the same logic that has been used since spark-0.7.
>>
>> The simplest possibility is that you're setting spark.default.parallelism;
>> otherwise, a bug may have been introduced somewhere so that the default is
>> no longer computed correctly.
>>
>>
>> On Sat, Feb 1, 2014 at 12:30 AM, Koert Kuipers  wrote:
>>
>>> i just managed to upgrade my 0.9-SNAPSHOT from the last scala 2.9.x
>>> version to the latest.
>>>
>>>
>>> everything seems good except that my default parallelism for jobs is now
>>> set to 2 instead of some smart number based on the number of cores (i think
>>> that is what it used to do). is this change on purpose?
>>>
>>> i am running spark standalone.
>>>
>>> thx, koert
>>>
>>
>>
>
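
For reference, a rough sketch of the defaulting behavior described in the quoted
messages: in standalone mode, spark.default.parallelism wins if it is set, and
otherwise the scheduler backend falls back to the maximum of the cluster's total
core count and 2. The names below are meant to mirror the
CoarseGrainedSchedulerBackend code referenced above, but treat them as
illustrative rather than as the literal source:

import java.util.concurrent.atomic.AtomicInteger

// Illustrative stand-in for the standalone scheduler backend's defaulting logic.
class SchedulerBackendSketch(conf: Map[String, String]) {
  // Total cores registered across all executors in the cluster.
  private val totalCoreCount = new AtomicInteger(0)

  def registerExecutor(cores: Int): Unit = { totalCoreCount.addAndGet(cores) }

  def defaultParallelism(): Int =
    conf.get("spark.default.parallelism") match {
      case Some(value) => value.toInt                        // explicit setting always wins
      case None        => math.max(totalCoreCount.get(), 2)  // otherwise at least 2
    }
}

object SchedulerBackendSketch {
  def main(args: Array[String]): Unit = {
    val backend = new SchedulerBackendSketch(Map.empty)
    backend.registerExecutor(16)
    println(backend.defaultParallelism()) // 16 once cores have registered, never below 2
  }
}

Which matches Aaron's point: with this logic a job only ends up at a parallelism
of 2 if the property is explicitly set to 2 or the backend sees fewer than 2
registered cores.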

