Could we alternatively use a state mapping function to keep track of the computation so far, instead of outputting V each time? (Also, the progress so far is probably of a different type R rather than V.)
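
To make that a bit more concrete, here is a rough sketch of what I have in mind. It is plain Python rather than Spark, so the filter / re-run / union steps from the design quoted below are simulated with lists. Everything named here (`run_round`, `run_to_completion`, `sum_to_n`, and the convention that a step returns `None` as its output until it is finished) is my own assumption for illustration, not an existing API. It only covers the "run up to K seconds" case; the signal-on-a-port variant is left out.

```python
import time
from typing import Callable, Iterable, List, Optional, Tuple, TypeVar

T = TypeVar("T")  # input element
R = TypeVar("R")  # progress so far (hypothetical intermediate state type)
V = TypeVar("V")  # final output

# Hypothetical step function: takes the element and the state so far
# (None on the first call) and returns (new_state, output). The output
# is None until the computation for that element has finished.
Step = Callable[[T, Optional[R]], Tuple[R, Optional[V]]]


def run_round(
    pending: List[Tuple[T, Optional[R]]], step: Step, budget_seconds: float
) -> Tuple[List[Tuple[T, V]], List[Tuple[T, R]]]:
    """Run one bounded round and split into finished / unfinished."""
    deadline = time.monotonic() + budget_seconds
    finished: List[Tuple[T, V]] = []
    unfinished: List[Tuple[T, R]] = []
    for element, state in pending:
        # Always take at least one step so every round makes progress,
        # even if the time budget is already used up.
        while True:
            state, output = step(element, state)
            if output is not None or time.monotonic() >= deadline:
                break
        if output is not None:
            finished.append((element, output))
        else:
            unfinished.append((element, state))
    return finished, unfinished


def run_to_completion(
    inputs: Iterable[T], step: Step, budget_seconds: float
) -> List[Tuple[T, V]]:
    """Re-run the unfinished elements until none are left, then union."""
    pending: List[Tuple[T, Optional[R]]] = [(x, None) for x in inputs]
    results: List[Tuple[T, V]] = []
    while pending:
        finished, pending = run_round(pending, step, budget_seconds)
        results.extend(finished)  # stands in for the "union" step
    return results


def sum_to_n(n: int, state: Optional[Tuple[int, int]]):
    """Toy resumable computation: sum 1..n, one addend per step."""
    i, acc = state if state is not None else (1, 0)
    if i <= n:
        acc, i = acc + i, i + 1
    return (i, acc), (acc if i > n else None)


if __name__ == "__main__":
    print(sorted(run_to_completion([3, 10, 1], sum_to_n, budget_seconds=0.0)))
```

The point of carrying R separately is that an unfinished element resumes from its saved state in the next round instead of starting over, and V only shows up once, when the element is actually done.
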
On Wed, Mar 14, 2018 at 4:28 PM Holden Karau <hol...@pigscanfly.ca> wrote:

> So we had a quick chat about what it would take to add something like
> SplittableDoFns to Spark. I'd done some sketchy thinking about this last
> year but didn't get very far.
>
> My back-of-the-envelope design was as follows:
> For input type T
> Output type V
>
> Implement a mapper which outputs type (T, V)
> and if the computation finishes T will be populated otherwise V will be
>
> For determining how long to run we'd run up to either K seconds or listen
> for a signal on a port
>
> Once we're done running we take the result and filter for the ones with T
> and V into separate collections re-run until finished
> and then union the results
>
>
> This is maybe not a great design but it was minimally complicated and I
> figured terrible was a good place to start and improve from.
>
>
> Let me know your thoughts, especially the parts where this is worse than I
> remember because it's been awhile since I thought about this.
>
>
> --
> Twitter: https://twitter.com/holdenkarau
>