On Mar 24, 2008, at 4:48 PM, Isabel Drost wrote:
On Thursday 20 March 2008, Grant Ingersoll wrote:
In the longer run, intro to ML would be cool, but there is lots
available on that.
I think Mahout is not really suitable to build demos that explain
the inner workings of the algorithms implemented.
I agree, but as we develop, we will probably have programmer's guides,
etc. which may go into some of the theory in a practical way.
I don't think it should be that large, as I don't think we
can really show scale.
I agree with that. I think once we offer enough functionality to be
usable for commercial projects it would be nice to gather a list of
links to users.
I added a PoweredBy page on the Wiki.
I would also love to see our name mentioned in a few research
publications or at one of the machine learning competitions - the blog
track would be a really great start ;)
+1
Just something that shows how to get the source, set it up to run
against a test set of data and somehow see the results, even if it is
trivial cmd. line stuff.
I think for that we should rely on datasets that are manageable with
a few machines. I would guess people evaluating our library or wanting
to add more functionality do not necessarily have a huge cluster of
machines at their disposal.
Definitely, even a single machine would be fine, as long as the demo
can then easily scale up (in other words, it does all the Hadoop
setup). I don't think you really want a demo that runs for more than
a few minutes.
Something simple like Hadoop's WordCount example comes to mind.
Isabel
--
God must have loved calories, she made so many of them.
|\ _,,,---,,_ Web: <http://www.isabel-drost.de>
/,`.-'`' -. ;-;;,_
|,4- ) )-,_..;\ ( `'-'
'---''(_/--' `-'\_) (fL) IM: <xmpp://[EMAIL PROTECTED]>
--------------------------
Grant Ingersoll
http://www.lucenebootcamp.com
Next Training: April 7, 2008 at ApacheCon Europe in Amsterdam
Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ