Hi All,

Here is the status of the integration of Apache Apex into Apache Samoa.

   - The Samoa API is implemented, and a Samoa topology can be converted
   into an Apex DAG.
   - Implemented partitioning support using parallelism hints from the
   Samoa API.
   - Implemented stream multiplexing:
      - Added an All-based partitioner: upstream tuples go to all
      downstream partitions.
      - Stream codec for key-based partitioning.
      - Stream codec for random partitioning.
   - Able to launch a Samoa task on the cluster, but this still needs work.
   Currently it relies on a hack that calls DTCli explicitly from the main
   entry point in the Samoa code, and the jars have to be bundled manually.
   This will be addressed in this sprint.
   - Tested the following algorithms on a local cluster:
      - Prequential Evaluation using the Vertical Hoeffding Tree
      classifier, a decision-tree-based classifier.
      - Clustering using the CluStream algorithm.
   - I have asked for clarification on some details of these algorithms,
   as well as on serialization issues with Samoa classes, and am waiting
   for a response from the Samoa community.
   - Although Samoa does not have many algorithms currently (it is a
   framework for developing algorithms), more algorithms are expected as
   part of its roadmap:
   https://cwiki.apache.org/confluence/display/SAMOA/Roadmap
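
For anyone unfamiliar with the three routing schemes mentioned under stream
multiplexing (all, key-based, random), here is a minimal sketch in plain
Java of how each one picks downstream partitions. This is only an
illustration, not the actual Apex stream codec or Samoa code: the class and
method names (RoutingSketch, keyPartition, randomPartition, allPartitions)
are hypothetical, and it assumes partitions are simply numbered from 0.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; not the Apex or Samoa API.
final class RoutingSketch {
    private RoutingSketch() {}

    // Key-based: tuples with the same key always go to the same partition.
    // floorMod keeps the result non-negative even for negative hash codes.
    static int keyPartition(Object key, int partitions) {
        return Math.floorMod(key.hashCode(), partitions);
    }

    // Random (shuffle): each tuple goes to one arbitrarily chosen partition.
    static int randomPartition(Random rng, int partitions) {
        return rng.nextInt(partitions);
    }

    // All (broadcast): every downstream partition receives the tuple.
    static List<Integer> allPartitions(int partitions) {
        List<Integer> targets = new ArrayList<>(partitions);
        for (int i = 0; i < partitions; i++) {
            targets.add(i);
        }
        return targets;
    }
}
```

With 4 downstream partitions, keyPartition("user-1", 4) returns the same
index on every call, randomPartition may return any index from 0 to 3, and
allPartitions(4) lists all four indices.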

Thanks,

Bhupesh
