Hello Fabio,

What you describe sounds entirely feasible. The easiest way to do it would be to save your incoming data in HDFS, as you already do if I understand correctly, and then use the batch ALS algorithm [1] to create your recommendations from that static data. You could rerun that batch job at regular intervals.
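As a rough illustration, here is a minimal sketch of such a batch job, following the example in the ALS documentation [1]. The HDFS paths, the CSV layout, and the object name are assumptions of mine, not something from your setup. It also assumes your click logs have already been reduced to (userId, itemId, rating) triples with integer ids, for example by counting clicks per user and button.

```scala
import org.apache.flink.api.scala._
import org.apache.flink.ml.common.ParameterMap
import org.apache.flink.ml.recommendation.ALS

// Hypothetical job name; run it as a normal Flink batch program.
object BatchRecommendations {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    // Assumed input: one "userId,itemId,rating" triple per line.
    // The path is a placeholder for wherever your logs end up in HDFS.
    val ratings: DataSet[(Int, Int, Double)] =
      env.readCsvFile[(Int, Int, Double)]("hdfs:///path/to/ratings.csv")

    // Hyperparameter values are taken from the docs example, not tuned.
    val als = ALS()
      .setIterations(10)
      .setNumFactors(10)
      .setBlocks(100)

    val parameters = ParameterMap()
      .add(ALS.Lambda, 0.9)
      .add(ALS.Seed, 42L)

    // Train the matrix factorization on the static data.
    als.fit(ratings, parameters)

    // Assumed input: the (userId, itemId) pairs you want scores for.
    val candidates: DataSet[(Int, Int)] =
      env.readCsvFile[(Int, Int)]("hdfs:///path/to/candidates.csv")

    // Each result is (userId, itemId, predictedRating).
    val predictions: DataSet[(Int, Int, Double)] = als.predict(candidates)

    predictions.writeAsCsv("hdfs:///path/to/predictions")
    env.execute("Batch ALS recommendations")
  }
}
```

To get the hourly behaviour, one option is to package this as a jar and let an external scheduler (cron, for instance) submit it once an hour. That is only a suggestion on my side; any scheduler you already use should work.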
Regards,
Theodore

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/libs/ml/als.html

On Fri, Mar 31, 2017 at 4:10 PM, Fábio Dias <fabiodio...@gmail.com> wrote:

> Hi to all,
>
> I'm building a recommendation system for my application.
> I have a set of logs (containing the user info, the hour, the button
> that was clicked, etc.) that arrive in Flink via Kafka. I then save
> every log in HDFS (Hadoop), but now I have a problem: I want to apply ML
> to (all) my data.
>
> I can think of 2 scenarios:
> First: Transform my DataStream into a DataSet and perform the ML task.
> Is that possible?
> Second: Run a task in Flink that gets the data from Hadoop and performs
> the ML task.
>
> What is the best way to do it?
>
> I already checked the IncrementalLearningSkeleton, but I didn't understand
> how to apply it to an actual real case. Is there a more complex example
> that I could look at?
> (https://github.com/apache/flink/tree/master/flink-examples/flink-examples-streaming/src/main/java/org/apache/flink/streaming/examples/ml)
>
> Another thing I would like to ask is how to perform the second
> scenario, where I need to run this task every hour. What is the best
> way to do it?
>
> Thanks,
> Fábio Dias.