Re: Flink Scheduling and FlinkML

Theodore Vasiloudis Mon, 03 Apr 2017 00:52:43 -0700

Hello Fabio,

What you describe sounds very possible. The easiest way to do it would be
to save your incoming data in HDFS, as you already do if I understand
correctly, and then use the batch ALS algorithm [1] to create your
recommendations from the static data, which you could do at regular
intervals.
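
As a rough sketch of this batch approach, the Scala job below trains ALS on
data read from HDFS and writes the predicted ratings back out. It assumes the
click logs have already been converted into CSV lines of the form
userId,itemId,rating with integer ids. The paths, the object name
BatchAlsJob, and the parameter values are placeholders for illustration, not
recommendations.

```scala
import org.apache.flink.api.scala._
import org.apache.flink.ml.recommendation.ALS

// Hypothetical job name; not part of Flink or of the original thread.
object BatchAlsJob {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    // Assumed locations: adjust to wherever the logs are stored in HDFS.
    val ratingsPath = "hdfs:///logs/ratings.csv"
    val candidatesPath = "hdfs:///logs/candidates.csv"
    val outputPath = "hdfs:///recommendations/latest"

    // (userId, itemId, rating) triples derived from the click logs.
    val ratings: DataSet[(Int, Int, Double)] =
      env.readCsvFile[(Int, Int, Double)](ratingsPath)

    // Placeholder hyperparameters; tune them for the actual data.
    val als = ALS()
      .setNumFactors(10)
      .setIterations(10)
      .setLambda(0.1)

    // Train the factorization model on the static data set.
    als.fit(ratings)

    // (userId, itemId) pairs for which a predicted rating is wanted.
    val candidates: DataSet[(Int, Int)] =
      env.readCsvFile[(Int, Int)](candidatesPath)

    // Yields (userId, itemId, predictedRating) triples.
    val predictions = als.predict(candidates)

    predictions.writeAsCsv(outputPath)
    env.execute("Batch ALS recommendations")
  }
}
```

A job like this could be packaged as a jar and submitted with `flink run`
from an external scheduler such as cron to get the regular intervals.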


Regards,
Theodore

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/libs/ml/als.html

On Fri, Mar 31, 2017 at 4:10 PM, Fábio Dias wrote:

> Hi to all,
>
> I'm building a recommendation system for my application.
> I have a set of logs (containing the user info, the hour, the button
> that was clicked, etc.) that arrive at my Flink job via Kafka. I then save
> every log in HDFS (Hadoop), but now I have a problem: I want to apply ML
> to (all) my data.
>
> I can think of 2 scenarios:
> First: Transform my DataStream into a DataSet and perform the ML task. Is
> this possible?
> Second: Run a task in Flink that gets the data from Hadoop and performs
> the ML task.
>
> What is the best way to do it?
>
> I already checked the IncrementalLearningSkeleton but I didn't understand
> how to apply it to an actual real case. Is there a more complex example
> that I could look at?
> (https://github.com/apache/flink/tree/master/flink-examples/flink-examples-streaming/src/main/java/org/apache/flink/streaming/examples/ml)
>
> Another thing I would like to ask is how to perform the second
> scenario when I need to run this task every hour. What is the best
> way to do it?
>
> Thanks,
> Fábio Dias.
>
