Hi Felix,

Thanks for the idea. But doesn't this mean that I can only train one model
per partition? The problem is that I have far more models than partitions :(
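
To make concrete what I am after, here is a minimal sketch in plain Python
(not Flink code). It groups the records by model_ID and fits one independent
least-squares model per key, so the number of models is decoupled from the
number of partitions. The names fit_line and fit_per_model are hypothetical,
and I simplified to a single feature instead of a full DataVector. In Flink
I would guess this corresponds to a groupBy followed by a group-wise
function, but that is an assumption on my side.

```python
from collections import defaultdict


def fit_line(points):
    """Ordinary least squares for y = intercept + slope * x.

    `points` is a list of (x, y) pairs with at least two distinct x values.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_per_model(records):
    """Group (model_id, x, y) records by model_id and fit one model per group."""
    groups = defaultdict(list)
    for model_id, x, y in records:
        groups[model_id].append((x, y))
    return {model_id: fit_line(pts) for model_id, pts in groups.items()}


# Made-up example data: two model IDs, three observations each.
records = [
    ("a", 0.0, 1.0), ("a", 1.0, 3.0), ("a", 2.0, 5.0),
    ("b", 0.0, 0.0), ("b", 1.0, -1.0), ("b", 2.0, -2.0),
]
models = fit_per_model(records)  # dict: model_id -> (intercept, slope)
```

Each key gets its own model regardless of how the records are distributed,
which is the behaviour I would like to get from the grouped DataSet.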

Best regards,
Felix

2015-07-07 22:37 GMT+02:00 Felix Schüler <fschue...@posteo.de>:

> Hi Felix!
>
> We had a similar use case, and I trained multiple models on partitions of
> my data with mapPartition, passing the model parameters (weights) as a
> broadcast variable. If I understood broadcast variables in Flink
> correctly, you should end up with one model on each TaskManager.
>
> Does that work?
>
> Felix
>
> On 07.07.2015 at 17:32, Felix Neutatz wrote:
> > Hi,
> >
> > At the moment I have a dataset which looks like this:
> >
> > DataSet[model_ID, DataVector] data
> >
> > What I want to do is group by the model_ID and build one regression
> > model for each model_ID.
> >
> > In pseudocode:
> > data.groupBy(model_ID)
> >         --> MultipleLinearRegression().fit(data_grouped)
> >
> > Is there any way to do this at the moment other than with an iteration?
> >
> > Thanks for your help,
> >
> > Felix Neutatz
> >
>
