On 28.12.2011 Lance Norskog wrote:
> Or you can take a small set of good data and generate variations to
> get a big set with the same distribution curves.
... and motivate users to evaluate upcoming releases against their setup to
spot regressions that slipped through performance tests.
Isabel
On Dec 28, 2011, at 7:28 PM, Jeff Eastman wrote:
> This is something that I'm enthusiastic about investigating right now. I'm
> heartened that K-Means seems to scale well in your tests and I think I've
> just improved Dirichlet a lot.
I suspect we found out why before, at least for Dirichlet,
This is something that I'm enthusiastic about investigating right now.
I'm heartened that K-Means seems to scale well in your tests and I think
I've just improved Dirichlet a lot. I'd like to test it again with your
data. FuzzyK is problematic as its clusters always end up with dense
vectors fo
On Dec 28, 2011, at 1:47 PM, Ted Dunning wrote:
> I have nearly given up on getting publicly available large data sets and
> have started to specify synthetic datasets for development projects. The
> key is to build reasonably realistic generation algorithms and for that
> there are always some
Or you can take a small set of good data and generate variations to
get a big set with the same distribution curves.
On Wed, Dec 28, 2011 at 10:47 AM, Ted Dunning wrote:
> I have nearly given up on getting publicly available large data sets and
> have started to specify synthetic datasets for deve
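One way to read the "generate variations" suggestion above is a smoothed bootstrap: draw rows from the small data set with replacement and add a little Gaussian noise to each value. The sketch below is only an illustration under that assumption. The class name, method names, and the noiseFraction parameter are hypothetical and are not part of Mahout. It works on dense rows and only approximates the original distribution curves, more closely as noiseFraction shrinks.

```java
import java.util.Random;

class SyntheticVariations {

    // Per-column standard deviation of the seed rows (population form).
    static double[] columnStdDev(double[][] rows) {
        int cols = rows[0].length;
        double[] mean = new double[cols];
        double[] std = new double[cols];
        for (double[] row : rows) {
            for (int j = 0; j < cols; j++) {
                mean[j] += row[j] / rows.length;
            }
        }
        for (double[] row : rows) {
            for (int j = 0; j < cols; j++) {
                double d = row[j] - mean[j];
                std[j] += d * d / rows.length;
            }
        }
        for (int j = 0; j < cols; j++) {
            std[j] = Math.sqrt(std[j]);
        }
        return std;
    }

    // Draw seed rows with replacement and jitter each value with Gaussian
    // noise scaled to noiseFraction of that column's standard deviation.
    // noiseFraction is a hypothetical tuning knob; 0.0 gives a plain bootstrap.
    static double[][] expand(double[][] seed, int targetRows,
                             double noiseFraction, long rngSeed) {
        int cols = seed[0].length;
        double[] std = columnStdDev(seed);
        Random rng = new Random(rngSeed);
        double[][] out = new double[targetRows][cols];
        for (int i = 0; i < targetRows; i++) {
            double[] base = seed[rng.nextInt(seed.length)];
            for (int j = 0; j < cols; j++) {
                out[i][j] = base[j] + rng.nextGaussian() * std[j] * noiseFraction;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up seed values, purely for illustration.
        double[][] seed = {{1.0, 10.0}, {2.0, 20.0}, {3.0, 30.0}};
        double[][] big = expand(seed, 1000, 0.1, 42L);
        System.out.println("generated rows: " + big.length);
    }
}
```

Resampling whole rows keeps the relationships between columns, while the per-column jitter avoids producing exact duplicates of the seed rows.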
I have nearly given up on getting publicly available large data sets and
have started to specify synthetic datasets for development projects. The
key is to build reasonably realistic generation algorithms and for that
there are always some serious difficulties.
For simple scaling tests, however,
To me, the big thing we continue to be missing is the ability for those of us
working on the project to reliably test the algorithms at scale. For instance,
I've seen hints of several places where our clustering algorithms don't appear
to scale very well (which are all M/R -- K-Means does scale
On Tue, Dec 27, 2011 at 3:24 PM, Tom Pierce wrote:
> ...
>
> They discover Mahout, which does specifically bill itself as scalable
> (from http://mahout.apache.org, in some of the largest letters: "What
> is Apache Mahout? The Apache Mahout™ machine learning library's goal
> is to build scalable
Tom,
Thanks for your input. I have nothing to argue with, but I think the
project can use the help of the people who are kicking the tires, in a way
that they make those problems (in particular, scale problems)
available to the list.
> They discover Mahout, which does specifically bill itself as sc
The users I'm talking about are often quite advanced in many ways -
familiar with R, SAS, etc., capable of coding up their own
implementations based on papers, etc. They don't know Mahout, they
aren't eager to study a new API out of curiosity, but they would like
to find a suite of super-scalable
On Tue, Dec 27, 2011 at 2:13 PM, Dmitriy Lyubimov wrote:
> Yes, i think this one is in terms of documentation.
I meant, this patch is going in for its effects on the API
and its docs.
>
> Wiki technically doesn't require annotation to be useful in describing
> method use though.
>
> N
Yes, i think this one is in terms of documentation.
Wiki technically doesn't require annotation to be useful in describing
method use though.
No plans for command line at the moment as far as i know. What would
you suggest people should see there beyond what they can already
see on the wiki?
>
Is there a plan to bubble these annotations out further? Say to the
wiki or as command-line feedback?
I think it would be really helpful (and promote uptake of Mahout) to
have metadata and prominent documentation that describes the general
scaling/stability properties of the different methods. I
Hmm... this looks promising:
http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/annotation/Documented.html
See the documentation section here:
http://docs.oracle.com/javase/tutorial/java/javaOO/annotations.html
On Thu, Dec 22, 2011 at 2:43 PM, Ted Dunning wrote:
> I think annotations are s
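For illustration, a maturity marker built on @Documented might look like the sketch below. The annotation name Maturity, its Level values, and ExampleClusterer are hypothetical and are not taken from Mahout or from MAHOUT-831. @Documented asks the standard javadoc tool to show the annotation on the elements it marks, and RUNTIME retention is an extra assumption here so that a command-line tool could also read the level through reflection.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

class MaturityDemo {

    // Hypothetical marker annotation; the name and levels are illustrative only.
    @Documented
    @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.TYPE, ElementType.METHOD})
    @interface Maturity {
        Level value();

        enum Level { EXPERIMENTAL, STABLE }
    }

    // Hypothetical algorithm class carrying the marker.
    @Maturity(Maturity.Level.EXPERIMENTAL)
    static class ExampleClusterer { }

    // Returns the declared maturity level, or null if the type is unmarked.
    static Maturity.Level levelOf(Class<?> type) {
        Maturity marker = type.getAnnotation(Maturity.class);
        return marker == null ? null : marker.value();
    }

    public static void main(String[] args) {
        System.out.println(levelOf(ExampleClusterer.class));
    }
}
```

A custom javadoc tag, by contrast, lives only in comments, so it cannot be read back at runtime this way.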
I think annotations are significantly better. The integration with javadoc
isn't impossible and the integration from javadoc markup to annotation is
impossible.
Interestingly, the javadoc tool documentation tends to recommend an
annotation *and* a javadoc tag. That does make the integration simp
We just use @lucene.experimental (or something like that)
On Dec 22, 2011, at 3:54 PM, Dmitriy Lyubimov wrote:
> Well it looks like lucene people were talking about custom javadoc
> tags, not annotations.
>
> i did a brief scan and it looks like it would require a specific
> doclet developed to
Well it looks like lucene people were talking about custom javadoc
tags, not annotations.
i did a brief scan and it looks like it would require a specific
doclet developed to handle annotations. The documentation is not terribly
clear on which of the standard doclets to subclass.
just a custom javadoc tag wo
Yes. Could be due to my lacking maven skills :)
On 22.12.2011 21:33, Dmitriy Lyubimov wrote:
> you mean you couldn't make them come up in javadocs?
>
> On Thu, Dec 22, 2011 at 12:25 PM, Sebastian Schelter wrote:
>> There is still a ticket open for those ->
>> https://issues.apache.org/jira/brow
you mean you couldn't make them come up in javadocs?
On Thu, Dec 22, 2011 at 12:25 PM, Sebastian Schelter wrote:
> There is still a ticket open for those ->
> https://issues.apache.org/jira/browse/MAHOUT-831. I tried to integrate
> the javadoc "annotations" like proposed by the lucene guys, but f
There is still a ticket open for those ->
https://issues.apache.org/jira/browse/MAHOUT-831. I tried to integrate
the javadoc "annotations" like proposed by the lucene guys, but for some
reason I didn't get them working. Would be great if someone could help here.
--sebastian
On 22.12.2011 21:03, D
Hi,
what happened to these annotations to mark maturity level? Did we ever
commit those?
thank you.