Yes, sorry I meant DAG. I fixed it in my message but not the subject. The terminology of "leaf" wasn't helpful I know so hopefully my visual example was enough. Anyway, I noticed what you said in a local-mode test. I can try that in a cluster, too. Thank you!
On Thu, Sep 18, 2014 at 10:28 PM, Tobias Pfeiffer <t...@preferred.jp> wrote: > Hi, > > On Thu, Sep 18, 2014 at 8:55 PM, Victor Tso-Guillen <v...@paxata.com> >> wrote: >> >>> Is it possible to express a diamond DAG and have the leaf dependency >>> evaluate only once? >>> >> > Well, strictly speaking your graph is not a "tree", and also the meaning > of "leaf" is not totally clear, I'd say. > > >> So say data flows left to right (and the dependencies are oriented right >>> to left): >>> >>> [image: Inline image 1] >>> Is it possible to run d.collect() and have a evaluate its iterator only >>> once? >>> >> > If you say a.cache() (or a.persist()) then it will be evaluated only once > and then the cached data will be used for later accesses. > > Tobias >