Thanks Iulian, I'll retest with 1.6.x once it's released (probably won't
have enough spare time to test with the RC).
On 11/12/2015 15:00, Iulian Dragoș wrote:
On Wed, Dec 9, 2015 at 4:29 PM, Adrian Bridgett <adr...@opensignal.com> wrote:
(Resending as text only, since my first post on the 2nd never seemed to make it.)
Using parallelize() on a dataset, I'm only seeing two tasks rather
than the number of cores in the Mesos cluster. This is with Spark
1.5.1 and the Mesos coarse-grained scheduler.
Running pyspark in a console seems to show that it takes a
while before the Mesos executors come online (at which point the
default parallelism changes). If I add "sleep 30" after
initialising the SparkContext I get the "right" number (42 by
coincidence!)
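Rather than a fixed "sleep 30", one option is to poll until the default parallelism reaches the expected value, with a timeout as a safety net. The following is only a minimal sketch: the helper name wait_for_parallelism and its parameters are hypothetical, not part of Spark, and it assumes the expected core count (42 here) is known in advance.

```python
import time


def wait_for_parallelism(get_parallelism, expected, timeout=60.0,
                         poll_interval=1.0):
    """Poll get_parallelism() until it reaches `expected` or `timeout` expires.

    get_parallelism is any zero-argument callable; in pyspark it could be
    `lambda: sc.defaultParallelism` (assumption: sc is the SparkContext).
    Returns the last observed value, which may be below `expected` on timeout.
    """
    deadline = time.monotonic() + timeout
    current = get_parallelism()
    while current < expected and time.monotonic() < deadline:
        time.sleep(poll_interval)
        current = get_parallelism()
    return current
```

In a pyspark session this might be called as wait_for_parallelism(lambda: sc.defaultParallelism, expected=42) right after creating the SparkContext. Because of the timeout, the caller should still check the returned value before relying on it.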
I've just tried increasing minRegisteredResourcesRatio to 0.5 but
this affects neither the test case below nor my code.
This limit seems to be implemented only in the coarse-grained Mesos
scheduler, but the fix will be available starting with Spark 1.6.0
(1.5.2 doesn't have it).
iulian
Is there something else I can do instead? Perhaps it should be
seeing how many tasks _should_ be available rather than how many
are (I'm also using dynamicAllocation).
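One way to base the partition count on how many cores should be available, rather than how many have registered so far, is to pass numSlices to parallelize() explicitly. This is a rough sketch under stated assumptions: choose_num_slices is a hypothetical helper, and the expected core count (42, taken from the output below) has to be supplied by the caller since nothing here queries Mesos.

```python
def choose_num_slices(expected_cores, current_default, slices_per_core=1):
    """Pick a partition count from the cores we expect to get.

    expected_cores:  cores the cluster should eventually provide (assumed known)
    current_default: sc.defaultParallelism at the time of the call
    slices_per_core: optional multiplier for finer-grained tasks
    """
    return max(expected_cores * slices_per_core, current_default)


# Hypothetical usage in a pyspark session (sc is the SparkContext):
#   n = choose_num_slices(42, sc.defaultParallelism)
#   rdd = sc.parallelize(range(1000), numSlices=n)
```

Setting spark.default.parallelism in the Spark configuration is another way to pin the value, at the cost of no longer tracking the cluster size.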
15/12/02 14:34:09 INFO mesos.CoarseMesosSchedulerBackend:
SchedulerBackend is ready for scheduling beginning after reached
minRegisteredResourcesRatio: 0.0
>>>
>>>
>>> print (sc.defaultParallelism)
2
>>> 15/12/02 14:34:12 INFO mesos.CoarseMesosSchedulerBackend:
Mesos task 5 is now TASK_RUNNING
15/12/02 14:34:13 INFO mesos.MesosExternalShuffleClient:
Successfully registered app
20151117-115458-164233482-5050-24333-0126 with external shuffle
service.
....
15/12/02 14:34:15 INFO mesos.CoarseMesosSchedulerBackend:
Registered executor:
AkkaRpcEndpointRef(Actor[akka.tcp://sparkExecutor@ip-10-1-200-147.ec2.internal:41194/user/Executor#-1021429650])
with ID 20151117-115458-164233482-5050-24333-S22/5
15/12/02 14:34:15 INFO spark.ExecutorAllocationManager: New
executor 20151117-115458-164233482-5050-24333-S22/5 has registered
(new total is 1)
....
>>> print (sc.defaultParallelism)
42
Thanks
Adrian Bridgett
--
Iulian Dragos
------
Reactive Apps on the JVM
www.typesafe.com <http://www.typesafe.com>
--
*Adrian Bridgett* | Sysadmin Engineer, OpenSignal
<http://www.opensignal.com>