On Fri, Feb 28, 2014 at 1:37 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:

> Here are some goals that I think would be good in the area of numerics,
> classifiers and clustering:


> - simple programming model
>

+1


>
> - programmable via Java or R
>

+1


>
> - runs clustered or not
>

I think both.


>
>
> What does everybody think?
>

Good thread. Some of the comments on specific topics are a bit above my
head, but here are my 2 cents.

I come from the perspective of a Java developer who would like to add text
clustering, classification and recommendation algorithms to an existing
application and its data, whether that is smallish data from a SQL database
or larger amounts of data that require distributed computing.

So ideally I would like to see:

1. A JavaBeans API for every algorithm.
2. A unified way to vectorize data, no matter where it comes from (SQL
database, NoSQL store, filesystem, Lucene index, etc.).
3. The option to use Hadoop or some other distributed computing
framework to scale out.
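
To make points 1 and 2 a bit more concrete, here is a rough sketch of what
I have in mind. All names in it (Vectorizer, TermCountVectorizer,
TextCentroidClassifier) are hypothetical and not taken from Mahout or any
other library. The nearest-centroid rule is only a stand-in algorithm to
keep the example short, and point 3 (a distributed backend) is left out.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical: turns a record from any source into a fixed-length vector. */
interface Vectorizer<T> {
    double[] vectorize(T record);
}

/** Hypothetical term-count vectorizer over a fixed vocabulary. */
class TermCountVectorizer implements Vectorizer<String> {
    private final Map<String, Integer> index = new LinkedHashMap<>();

    TermCountVectorizer(List<String> vocabulary) {
        // Assign each distinct term the next free vector position.
        for (String term : vocabulary) {
            index.putIfAbsent(term, index.size());
        }
    }

    @Override
    public double[] vectorize(String text) {
        double[] v = new double[index.size()];
        // Lower-case, split on non-word characters, count known terms.
        for (String token : text.toLowerCase().split("\\W+")) {
            Integer i = index.get(token);
            if (i != null) {
                v[i] += 1.0;
            }
        }
        return v;
    }
}

/** Hypothetical bean-style classifier: no-arg constructor plus get/set property. */
class TextCentroidClassifier {
    private Vectorizer<String> vectorizer;
    private final Map<String, double[]> centroids = new LinkedHashMap<>();

    public TextCentroidClassifier() { }

    public Vectorizer<String> getVectorizer() { return vectorizer; }

    public void setVectorizer(Vectorizer<String> vectorizer) { this.vectorizer = vectorizer; }

    /** Stores the mean vector of the given example texts under the label. */
    public void train(String label, List<String> texts) {
        double[] centroid = null;
        for (String text : texts) {
            double[] x = vectorizer.vectorize(text);
            if (centroid == null) {
                centroid = new double[x.length];
            }
            for (int i = 0; i < x.length; i++) {
                centroid[i] += x[i] / texts.size();
            }
        }
        centroids.put(label, centroid);
    }

    /** Returns the label whose centroid is closest in squared Euclidean distance. */
    public String classify(String text) {
        double[] x = vectorizer.vectorize(text);
        String best = null;
        double bestDist = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, double[]> e : centroids.entrySet()) {
            double dist = 0.0;
            double[] c = e.getValue();
            for (int i = 0; i < x.length; i++) {
                double diff = x[i] - c[i];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = e.getKey();
            }
        }
        return best;
    }
}
```

The idea is that the classifier only ever sees a Vectorizer, so a
vectorizer backed by a SQL query or a Lucene index could be swapped in via
setVectorizer() without touching the algorithm.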

I have some ideas on these topics but maybe that's better for another
thread.
