Looks fine to me.
The key is the interfaces for learning and predicting, so we should define
some vectors and matrices.
It would be enough to define the algorithms via those interfaces; a
generic BSP should then just run them based on the given input.
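
To make that a bit more concrete, here is a rough sketch of what such
interfaces could look like. Everything in it is hypothetical: the names
Learner, Predictor, LinearRegressionLearner, LinearRegressionPredictor and
MlSketch are made up for illustration and are not existing Hama classes,
plain double arrays stand in for the vector and matrix types we would still
have to define, and the model is passed in memory where the real jobs would
write it to and read it from HDFS. Linear regression with batch gradient
descent is only a placeholder algorithm.

```java
/** Hypothetical interface: builds a model of type M from training data. */
interface Learner<M> {
    // features is an n x d matrix (one row per example), targets has length n.
    M learn(double[][] features, double[] targets);
}

/** Hypothetical interface: applies a trained model to one new input vector. */
interface Predictor<M> {
    double predict(M model, double[] input);
}

/** Placeholder algorithm: linear regression fitted by batch gradient descent. */
final class LinearRegressionLearner implements Learner<double[]> {
    private final double learningRate;
    private final int iterations;

    LinearRegressionLearner(double learningRate, int iterations) {
        this.learningRate = learningRate;
        this.iterations = iterations;
    }

    @Override
    public double[] learn(double[][] features, double[] targets) {
        int n = features.length;
        int d = features[0].length;
        double[] w = new double[d + 1]; // w[0] is the bias term
        for (int it = 0; it < iterations; it++) {
            double[] grad = new double[d + 1];
            for (int i = 0; i < n; i++) {
                double err = LinearRegressionPredictor.apply(w, features[i]) - targets[i];
                grad[0] += err;
                for (int j = 0; j < d; j++) {
                    grad[j + 1] += err * features[i][j];
                }
            }
            // Step against the mean gradient of the squared error.
            for (int j = 0; j <= d; j++) {
                w[j] -= learningRate * grad[j] / n;
            }
        }
        return w;
    }
}

final class LinearRegressionPredictor implements Predictor<double[]> {
    /** Computes bias + weights . x for a model laid out as [bias, w1, ..., wd]. */
    static double apply(double[] model, double[] x) {
        double sum = model[0];
        for (int j = 0; j < x.length; j++) {
            sum += model[j + 1] * x[j];
        }
        return sum;
    }

    @Override
    public double predict(double[] model, double[] input) {
        return apply(model, input);
    }
}

/** Stand-in for the generic runner: it only ever sees the two interfaces. */
final class MlSketch {
    static <M> double trainAndPredict(Learner<M> learner, Predictor<M> predictor,
                                      double[][] features, double[] targets,
                                      double[] input) {
        M model = learner.learn(features, targets); // "learning" job
        return predictor.predict(model, input);     // "predicting" job
    }

    public static void main(String[] args) {
        double[][] x = {{0}, {1}, {2}, {3}};
        double[] y = {1, 3, 5, 7}; // generated from y = 2x + 1
        double p = trainAndPredict(new LinearRegressionLearner(0.05, 5000),
                new LinearRegressionPredictor(), x, y, new double[] {4});
        System.out.println(p);
    }
}
```

The point of the split is that trainAndPredict never refers to a concrete
algorithm, so a new algorithm would only need its own Learner/Predictor
pair.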

2012/7/7 Tommaso Teofili <[email protected]>

> Hi all,
>
> in my spare time I started writing some basic BSP-based machine learning
> algorithms for our ml module. Now I'm wondering, from a design point of
> view, where it'd make sense to put the training data / model. I'd assume
> the obvious answer would be HDFS, so this makes me think we should come up
> with (at least) two BSP jobs for each algorithm: one for learning and one
> for "predicting", each to be run separately.
> This would allow us to read the training data from HDFS, and consequently
> create a model (also on HDFS) and then the created model could be read
> (again from HDFS) in order to predict an output for a new input.
> Does that make sense?
> I'm just wondering what a general purpose design for Hama based ML stuff
> would look like so this is just to start the discussion, any opinion is
> welcome.
>
> Cheers,
> Tommaso
>
