Have you tried to repartition() your original data to make more partitions
before you aggregate?
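A minimal sketch of that suggestion in PySpark, assuming the data is an RDD of (key, value) pairs and the aggregation is a per-key sum. The function name `aggregate_with_more_partitions` and the default of 200 partitions are illustrative and not taken from this thread:

```python
from operator import add


def aggregate_with_more_partitions(rdd, num_partitions=200):
    """Spread a pair RDD over more partitions, then sum the values per key.

    `rdd` is assumed to be a PySpark RDD of (key, value) tuples;
    `num_partitions` is a hypothetical value to tune for your cluster.
    """
    # repartition() performs a full shuffle, so each task afterwards
    # handles a smaller slice of the data.
    repartitioned = rdd.repartition(num_partitions)
    # reduceByKey() merges the values for each key with the given function.
    return repartitioned.reduceByKey(add)


# Example call, assuming an existing SparkContext named `sc`:
#   pairs = sc.parallelize([("a", 1), ("b", 2), ("a", 3)])
#   totals = aggregate_with_more_partitions(pairs, num_partitions=8)
#   totals.collect()
```

reduceByKey() also accepts a numPartitions argument, so if only the aggregation stage needs more partitions, passing it there may avoid the extra shuffle.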
--
Martin Goodson | VP Data Science
(0)20 3397 1240
On Mon, Mar 23, 2015 at 4:12 PM, Yiannis Gkoufas johngou...@gmail.com
wrote:
Hi Yin,
Yes, I have set
per month and are second only to Google in the contextual advertising
space (ok - a distant second!).
Details here:
http://grnh.se/rl8f25
--
Martin Goodson | VP Data Science
Thank you Nishkam,
I have read your code. So, for the sake of my understanding, it seems that
for each SparkContext there is one executor per node? Can anyone confirm
this?
--
Martin Goodson | VP Data Science
On Thu, Jul 24, 2014 at 6:12 AM, Nishkam
Great - thanks for the clarification Aaron. The offer stands for me to
write some documentation and an example that covers this without leaving
*any* room for ambiguity.
--
Martin Goodson | VP Data Science
On Thu, Jul 24, 2014 at 6:09 PM, Aaron
available (daemon memory, worker memory etc). Perhaps a
worked example could be added to the docs? I would be happy to provide some
text as soon as someone can enlighten me on the technicalities!
Thank you
--
Martin Goodson | VP Data Science
by Spark.)
Am I reading this incorrectly?
Anyway, our configuration is 21 machines (one master and 20 slaves), each
with 60 GB of memory. We would like to use 4 cores per machine. This is
PySpark, so we want to leave, say, 16 GB on each machine for Python processes.
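The arithmetic behind those numbers can be sketched roughly as follows. This assumes a 2 GB reserve per machine for the OS and Spark daemons, which is a guess rather than a figure from this thread, and that `spark.executor.memory` and `spark.cores.max` are the relevant properties for a standalone cluster. The helper name `executor_memory_gb` is hypothetical:

```python
def executor_memory_gb(machine_gb, python_reserve_gb, os_reserve_gb=2):
    """Memory left for the executor JVM after the stated reserves.

    os_reserve_gb is an assumed allowance for the OS and Spark daemons.
    """
    remaining = machine_gb - python_reserve_gb - os_reserve_gb
    if remaining <= 0:
        raise ValueError("reserves exceed machine memory")
    return remaining


workers = 20           # slave machines, from the description above
cores_per_worker = 4   # cores we would like to use on each machine

memory_gb = executor_memory_gb(machine_gb=60, python_reserve_gb=16)

# Candidate property values; treat them as a starting point, not a recommendation.
settings = {
    "spark.executor.memory": "%dg" % memory_gb,
    "spark.cores.max": str(workers * cores_per_worker),
}
print(settings)
```

Values like these could then be passed to SparkConf.set() or placed in spark-defaults.conf.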
Thanks again for the advice!
--
Martin Goodson
I am also having exactly the same problem when calling it from PySpark. Has
anyone managed to get this script to work?
--
Martin Goodson | VP Data Science
On Wed, Jul 16, 2014 at 2:10 PM, Ian Wilkinson ia...@me.com wrote:
Hi,
I’m trying to run the Spark
My experience is that acquiring 20 spot instances accounts for a tiny
fraction of the total time of provisioning a cluster with spark-ec2. This
is not (solely) an AWS issue.
--
Martin Goodson | VP Data Science
On Thu, Jun 26, 2014 at 10:14 PM, Nicholas
How about London?
--
Martin Goodson | VP Data Science
On Mon, Mar 31, 2014 at 6:28 PM, Andy Konwinski andykonwin...@gmail.com wrote:
Hi folks,
We have seen a lot of community growth outside of the Bay Area and we are
looking to help spur even more