[
https://issues.apache.org/jira/browse/SAMOA-16?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536502#comment-14536502
]
ASF GitHub Bot commented on SAMOA-16:
-------------------------------------
Github user senorcarbone commented on the pull request:
https://github.com/apache/incubator-samoa/pull/11#issuecomment-100484501
Hello again @gdfm and @abifet,
I did a lot of cross-profiling between Storm and Flink, running the same
`VerticalHoeffdingTree` task under different configurations over the last two
days, and I think the results are quite interesting.
It looks like the algorithm's performance (and accuracy) depends heavily on
the ingestion speed of the local statistics processors. The paradox is that the
higher the speed, the slower the whole computation gets over time, since
attribute events are sent to the local statistics processors at a higher rate
and the model aggregator in turn gets more updates back.
The average processing delay (measured as the number of flattened instances
processed by the aggregator between sending a process event and receiving the
respective local statistics) is ~2k instances for Flink and ~400k instances for
Storm.
Also, in Storm the aggregator continuously broadcasts ~100-200 attribute
messages to the local processors on average, while in Flink it broadcasts ~2100
attribute messages, which I assume is due to the rate at which it gets results
back. These numbers were collected locally on each component, and there was no
message duplication.
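To make the delay metric above concrete, here is a minimal sketch of how it could be counted inside the aggregator: the delay is the number of instances seen between sending a process event and receiving the matching local statistics. The class `ProcessingDelayTracker`, its method names, and the `requestId` key are all hypothetical and are not part of SAMOA or of the profiling code actually used.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical helper (not a SAMOA class): measures processing delay in
 * number of instances rather than in wall-clock time.
 */
final class ProcessingDelayTracker {
    // Instance count at the moment each process event was sent, keyed by an assumed request id.
    private final Map<Long, Long> sentAt = new HashMap<>();
    private long instancesSeen = 0;
    private long completed = 0;
    private long totalDelay = 0;

    /** Call once for every instance the aggregator processes. */
    void onInstance() {
        instancesSeen++;
    }

    /** Call when a process event is sent to the local statistics processors. */
    void onProcessEventSent(long requestId) {
        sentAt.put(requestId, instancesSeen);
    }

    /**
     * Call when the matching local statistics come back.
     * Returns the delay in instances, or -1 if the request id is unknown.
     */
    long onLocalResultReceived(long requestId) {
        Long start = sentAt.remove(requestId);
        if (start == null) {
            return -1;
        }
        long delay = instancesSeen - start;
        totalDelay += delay;
        completed++;
        return delay;
    }

    /** Average delay over all completed round trips; 0.0 if none completed yet. */
    double averageDelay() {
        return completed == 0 ? 0.0 : (double) totalDelay / completed;
    }
}
```

Counting in instances rather than milliseconds keeps the figure comparable across engines with different throughput, which is how the ~2k and ~400k numbers above are expressed.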
Since you worked on the algorithm, do you find this behavior reasonable?
> Add an adapter for Apache Flink-Streaming
> -----------------------------------------
>
> Key: SAMOA-16
> URL: https://issues.apache.org/jira/browse/SAMOA-16
> Project: SAMOA
> Issue Type: New Feature
> Reporter: Paris Carbone
> Assignee: Gianmarco De Francisci Morales
>
> Apache Flink-Streaming is a new system for distributed stream processing
> built for unique and flexible high level stream transformations. A Flink
> adapter for Samoa should be able to translate a Samoa Task topology into
> Flink streaming transformations. Some of the challenges are the compositional
> topology support, circle detection and their translation to Flink iterations.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)