Thanks for the pointer, Paris. Finding the right abstraction level for distributed streaming ML is definitely a worthy (and non-trivial) task.
We are currently working on some improvements for VHT. Once that's done, re-working it on a window-based abstraction with proper support for iterations could be a nice project. We would need to drop support for S4 (not sure about Samza), but that's on the roadmap anyway.

Cheers,

-- Gianmarco

On Sat, Feb 6, 2016 at 1:42 PM, Márton Balassi <[email protected]> wrote:

> Great suggestion, Paris. I would love to see Samoa building on these
> concept once they are stable enough in the supported data processing
> engines.
>
> On Fri, Feb 5, 2016 at 6:15 PM, Paris Carbone <[email protected]> wrote:
>
> > Hello Samoans,
> >
> > It seems that system semantics in stream processing are converging lately.
> > Apache Storm has now explicit state and windows [1], almost identical to
> > Flink and Beam. Samza is also moving in a similar direction.
> >
> > This is really exciting and it feels natural to start moving the Samoa
> > programming model a level up on top these establishing concepts. For
> > example, there is no more need for custom buffering to implement windowing
> > and ML models etc. can be re-defined and engineered as operator state to be
> > durable. There are quite many cool things to be done and I believe there
> > can be a very attractive roadmap for Samoa in that direction. What do you
> > think?
> >
> > [1]
> > https://community.hortonworks.com/articles/14171/windowing-and-state-checkpointing-in-apache-storm.html
> >
> > Paris
