Hi,
I would like to support data locality in Spark data source v2. How can I
provide Spark the ability to read and process data on the same node?
I didn't find any interface that supports 'getPreferredLocations' (or
equivalent).
Thanks!
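For what it's worth, the Data Source V2 read path does expose a locality hint: in Spark 3.x, org.apache.spark.sql.connector.read.InputPartition has a preferredLocations() method (empty by default). A minimal sketch, with made-up partition fields, assuming Spark 3.x:

  import org.apache.spark.sql.connector.read.InputPartition

  // Hypothetical partition for a custom source; `hosts` would come from the
  // storage layer's chunk metadata (this class is an illustration, not part
  // of the original question).
  case class DataChunkInputPartition(chunkId: Int, hosts: Array[String])
      extends InputPartition {
    // The scheduler uses these hostnames as locality hints when assigning
    // the read task for this partition.
    override def preferredLocations(): Array[String] = hosts
  }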
Dear all,
I tried to implement my own RDD. In its getPreferredLocations() function, I
used the following code to query another RDD, which was used as an input to
initialize this customized RDD:

  val results: Array[Array[DataChunkPartition]] =
    context.runJob(partitionsRDD, (iter: Iterator[DataChunkPartition]) => iter.toArray)
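For reference, a minimal sketch of a pattern that avoids this: compute the hosts when the RDD is constructed and have getPreferredLocations() simply return them (DataChunkPartition and its fields are assumptions here, not from the original code). Calling context.runJob inside getPreferredLocations() is fragile, since the DAGScheduler invokes that method while it is planning a job:

  import org.apache.spark.{Partition, SparkContext, TaskContext}
  import org.apache.spark.rdd.RDD

  // Hypothetical partition carrying its data and precomputed host names.
  case class DataChunkPartition(index: Int, hosts: Seq[String], data: Array[Byte])
    extends Partition

  class DataChunkRDD(sc: SparkContext, chunks: Seq[DataChunkPartition])
    extends RDD[Byte](sc, Nil) {

    override protected def getPartitions: Array[Partition] =
      chunks.toArray[Partition]

    override def compute(split: Partition, context: TaskContext): Iterator[Byte] =
      split.asInstanceOf[DataChunkPartition].data.iterator

    // Return precomputed hosts; do not launch jobs (context.runJob) from
    // here, as this is called on the scheduler's side during planning.
    override protected def getPreferredLocations(split: Partition): Seq[String] =
      split.asInstanceOf[DataChunkPartition].hosts
  }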
Andy Sloane wrote:

> We are seeing something that looks a lot like a regression from Spark 1.2.
> When we run jobs with multiple threads, we have a crash somewhere inside
> getPreferredLocations, as was fixed in SPARK-4454. Except now it's inside
> org.apache.spark.MapOutputTrackerMaster.getLocationsWithLargestOutputs
> instead of DAGScheduler directly.
>
> I tried Spark 1.2 post-SPARK-4454 (before this patch it's only slightl
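As a sketch of the pattern being described, assuming nothing about the original workload beyond "multiple driver threads submitting jobs on one SparkContext" (the dataset, thread count, and master URL below are made up):

  import org.apache.spark.{SparkConf, SparkContext}

  // Several driver threads share one SparkContext and submit shuffle jobs
  // concurrently, which is the scenario the report above describes.
  object MultiThreadedJobs {
    def main(args: Array[String]): Unit = {
      val sc = new SparkContext(
        new SparkConf().setAppName("concurrent-jobs-sketch").setMaster("local[*]"))
      val data = sc.parallelize(1 to 1000000).map(i => (i % 100, 1))
      val threads = (1 to 8).map { _ =>
        new Thread(() => {
          // reduceByKey adds a shuffle, so task placement consults
          // MapOutputTrackerMaster for locations with the largest map outputs.
          data.reduceByKey(_ + _).count()
        })
      }
      threads.foreach(_.start())
      threads.foreach(_.join())
      sc.stop()
    }
  }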
> 1) Is there a guarantee that a partition will only be processed on a node
> which is in the "getPreferredLocations" set of nodes returned by the RDD?

No, there isn't; by default Spark may schedule a task in a "non-preferred"
location after `spark.locality.wait` has expired.
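As an illustration of the knob involved (the value below is arbitrary): raising spark.locality.wait makes the scheduler hold out longer for a preferred node, but it delays rather than prevents off-node scheduling.

  import org.apache.spark.SparkConf

  // Wait up to 10 minutes at each locality level before relaxing it
  // (the default is 3s). This postpones, but does not rule out, running
  // the task on a non-preferred node.
  val conf = new SparkConf().set("spark.locality.wait", "10m")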
I am building my own custom RDD class.
1) Is there a guarantee that a partition will only be processed on a node
which is in the "getPreferredLocations" set of nodes returned by the RDD?
2) I am implementing this custom RDD in Java and plan to extend JavaRDD.
However, I