Hello everyone,

First of all, congratulations on the effort you put into creating SAMOA and 
making it open source. While there is a lot of research around distributed 
streaming ML, there is a great lack of actual tools, frameworks and programming 
models that let people use such techniques and compose custom pipelines.
My main experience with SAMOA so far is at its runtime level, since I worked on 
adding Apache Flink as one of the main adapters together with Faye (CCed). We 
generally did not face any great challenges, apart from the fact that we had to 
do manual cycle detection in order to support Flink's iterations, but maybe we 
could sum up some observations here to consider:


  *   There was no documentation regarding the integration of new backend 
systems. We had to extract information from a thesis report and from the 
source code and structure of the existing adapters. A documentation guide for 
backend contributors would be very useful in the future.
  *   If you look across all the adapters, a lot of logic is replicated for 
each system. That means there is room for more abstractions that generalise 
things such as parametrising, instantiating and deploying tasks.
  *   Regarding the programming model, I noticed that you sometimes use action 
triggers (e.g. ‘evaluate every x records’). You could perhaps abstract these 
triggers so that they can be reused across components. Another advantage is 
that you could even expose them downwards from tasks, so that systems like 
Apache Flink, Crunch or Google Dataflow can override (and optimise) some of 
this logic using built-in windowing semantics. This is just a simple idea, but 
I think there is in general potential in abstracting and exposing as much as 
possible while still keeping implementation complexity to a minimum.
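
To make the trigger idea a bit more concrete, here is a minimal Java sketch of 
what a reusable trigger could look like. All names (Trigger, CountTrigger, 
TriggeredAction) are hypothetical and are not part of the existing SAMOA API; 
this is only one possible shape, not a proposal for the actual interfaces.

```java
// Hypothetical sketch: none of these types exist in SAMOA today.

/** Decides, record by record, whether an action should fire. */
interface Trigger<T> {
    /** Returns true if the action should fire after seeing this record. */
    boolean onRecord(T record);
}

/** Fires once every `every` records, e.g. "evaluate every x records". */
final class CountTrigger<T> implements Trigger<T> {
    private final long every;
    private long seen;

    CountTrigger(long every) {
        if (every <= 0) {
            throw new IllegalArgumentException("every must be positive");
        }
        this.every = every;
    }

    /** Exposed so that a backend could map this onto its own count window. */
    long every() {
        return every;
    }

    @Override
    public boolean onRecord(T record) {
        seen++;
        if (seen == every) {
            seen = 0; // reset the counter and fire
            return true;
        }
        return false;
    }
}

/** Couples a trigger with an action so components can reuse both. */
final class TriggeredAction<T> {
    private final Trigger<T> trigger;
    private final Runnable action;

    TriggeredAction(Trigger<T> trigger, Runnable action) {
        this.trigger = trigger;
        this.action = action;
    }

    /** Default behaviour: evaluate the trigger locally for each record. */
    void process(T record) {
        if (trigger.onRecord(record)) {
            action.run();
        }
    }
}
```

A component would then call process(record) for each incoming record, while an 
adapter that recognises a CountTrigger could in principle read every() and 
replace the local counting with its own windowing, assuming the backend offers 
an equivalent count-based window.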

We will keep an eye on the dev list and participate actively with more 
feedback as we use SAMOA more and discover further needs. Currently, we are 
working on an experimental ML pipeline prototype that also works on streams, so 
we will try to keep it as much in sync with SAMOA as possible.

cheers
Paris, Faye


