Github user gdfm commented on the pull request:
https://github.com/apache/incubator-samoa/pull/11#issuecomment-96609902
Thanks, I managed to make it run on Flink.
I am testing the VHT algorithm with:
```
bin/samoa flink target/SAMOA-Flink-0.3.0-SNAPSHOT.jar "PrequentialEvaluation -d /tmp/dump.csv -i 1000000 -f 100000 -l (classifiers.trees.VerticalHoeffdingTree -p 4) -s (generators.RandomTreeGenerator -c 2 -o 10 -u 10)"
```
I had to increase the task manager slots considerably to make it run. I see
many "Filter" tasks in Flink's dashboard. Is that normal?
The accuracy of the VHT is reasonable, and comparable with what we get with
Storm. It's lower than with the local execution engine, but that's an
algorithmic problem we are working on.
It also feels a bit slow. I didn't get the console output reporting the
elapsed time, but the run took a while and my laptop's fan started spinning.
Do you think there is room for improvement in this respect?