Hi,
My iterative program written in Spark shows quite variable running times
across iterations, although the computation load is supposed to be roughly
the same: the program adds a batch of tuples and deletes roughly the same
number of tuples in each iteration. I suspect part of the reason is data
locality.
I guess it could be solved by extending an existing RDD and overriding the
getPreferredLocations() definition.
But I am not sure; I will wait for the answer.
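For reference, a minimal, untested sketch of that idea: a wrapper RDD that
delegates partitioning and computation to its parent but reports a
caller-supplied host list per partition. The class name and the
hostsForPartition function are made up for illustration; only
getPreferredLocations(), getPartitions(), and compute() are the real RDD
hooks.

```scala
import scala.reflect.ClassTag

import org.apache.spark.{Partition, TaskContext}
import org.apache.spark.rdd.RDD

// Illustrative sketch (names are invented): wrap a parent RDD and pin each
// partition to a fixed host list, so the scheduler prefers the same nodes
// in every iteration.
class PinnedLocationsRDD[T: ClassTag](
    parent: RDD[T],
    hostsForPartition: Int => Seq[String])
  extends RDD[T](parent) {

  // Reuse the parent's partitioning unchanged.
  override def getPartitions: Array[Partition] = parent.partitions

  // Delegate the actual computation to the parent partition.
  override def compute(split: Partition, context: TaskContext): Iterator[T] =
    parent.iterator(split, context)

  // Consulted by the scheduler when placing tasks for this RDD.
  override protected def getPreferredLocations(split: Partition): Seq[String] =
    hostsForPartition(split.index)
}
```

Note that getPreferredLocations() is only a hint: under delay scheduling the
task can still run on another node if the preferred one is busy, so this
would not by itself guarantee placement.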
On Thu, Oct 31, 2013 at 10:44 PM, Wenlei Xie wrote:
> Hi,
>
> My iterative program written in Spark shows quite variable running times
> across iterations...
Any official answer from the developers? Is a partition guaranteed to be
generated on its preferred location?
Best,
Wenlei
On Thu, Oct 31, 2013 at 7:53 PM, dachuan wrote:
> I guess it could be solved by extending an existing RDD and overriding the
> getPreferredLocations() definition.
>
> But I am not sure; I will wait for the answer.
Thank you for this suggestion! :)
On Thu, Oct 31, 2013 at 7:53 PM, dachuan wrote:
> I guess it could be solved by extending an existing RDD and overriding the
> getPreferredLocations() definition.
>
> But I am not sure; I will wait for the answer.