If you are talking about a tree, then the RDDs are nodes, and the dependencies
are the edges.
If you are talking about a DAG, then the partitions in the RDDs are the nodes,
and the dependencies between the partitions are the edges.
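
The two views above can be illustrated with a toy model. This is only a sketch in plain Python, not Spark code: ToyRDD, ToyDependency, and the helper functions are hypothetical names made up for illustration, and the "wide" flag is a simplifying assumption that stands in for a shuffle, where every child partition reads from every parent partition. In Spark itself, the lineage of an RDD can be inspected with rdd.dependencies and rdd.toDebugString.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyDependency:
    """Hypothetical edge from a child RDD to one parent RDD."""
    parent: "ToyRDD"
    wide: bool = False  # assumption: True models a shuffle (all-to-all)


@dataclass
class ToyRDD:
    """Hypothetical stand-in for an RDD: a name, a partition count, and dependencies."""
    name: str
    num_partitions: int
    deps: List[ToyDependency] = field(default_factory=list)


def walk(rdd):
    """Yield each reachable RDD once, starting from the final one."""
    seen, stack = set(), [rdd]
    while stack:
        node = stack.pop()
        if node.name in seen:
            continue
        seen.add(node.name)
        yield node
        stack.extend(dep.parent for dep in node.deps)


def rdd_edges(rdd):
    """RDD-level view: nodes are RDDs, edges are (child, parent) pairs."""
    return sorted((n.name, d.parent.name) for n in walk(rdd) for d in n.deps)


def partition_edges(rdd):
    """Partition-level view: nodes are (rdd name, partition index) pairs."""
    edges = []
    for node in walk(rdd):
        for dep in node.deps:
            for i in range(node.num_partitions):
                # Narrow: partition i reads parent partition i.
                # Wide: partition i reads every parent partition.
                parents = range(dep.parent.num_partitions) if dep.wide else [i]
                edges.extend(((node.name, i), (dep.parent.name, p)) for p in parents)
    return sorted(edges)


source = ToyRDD("source", 2)
mapped = ToyRDD("mapped", 2, [ToyDependency(source)])
reduced = ToyRDD("reduced", 1, [ToyDependency(mapped, wide=True)])

print(rdd_edges(reduced))        # two RDD-level edges
print(partition_edges(reduced))  # four partition-level edges
```

With three RDDs the RDD-level graph has two edges, while the partition-level graph has four: two one-to-one edges for the narrow step and two more for the single "reduced" partition reading both "mapped" partitions.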
On Thu, Apr 16, 2020 at 4:02 PM, Mania Abdi wrote:
Is it correct to say that the nodes in the DAG are RDDs and the edges are
the computations?
On Thu, Apr 16, 2020 at 6:21 PM Reynold Xin wrote:
> The RDD is the DAG.
The RDD is the DAG.
On Thu, Apr 16, 2020 at 3:16 PM, Mania Abdi < abdi...@husky.neu.edu > wrote:
>
> Hello everyone,
>
> I am implementing a caching mechanism for analytic workloads running on
> top of Spark and I need to retrieve the Spark DAG right after it is
> generated and the DAG
Hello everyone,
I am implementing a caching mechanism for analytic workloads running on top
of Spark and I need to retrieve the Spark DAG right after it is generated
and the DAG scheduler. I would appreciate it if you could give me some
hints or reference me to some documents about where the DAG
Hi again,
Does anyone have thoughts on either the idea or the implementation?
Thanks,
Andrew
On Thu, Apr 9, 2020 at 11:32 PM Andrew Melo wrote:
>
> Hi all,
>
> I've opened a WIP PR here https://github.com/apache/spark/pull/28159
> I'm a novice at Scala, so I'm sure the code isn't idiomatic,
Hi,
While trying to understand the relationship between BlockManager
and ShuffleManager, I found that ShuffleManager is used for shuffle block
data [1] (which makes sense).
What I found quite surprising is that BlockManager can call getLocalBytes
for non-shuffle blocks that in turn does...fetching
Hi Jungtaek,
Thanks a lot for your answer. What you're saying reflects my understanding
perfectly. It's a small change, but it makes understanding where the rules
are used much simpler (= less confusing). I'll propose a PR and see where it
goes from there. Thanks!
Regards,
Jacek Laskowski