I'm inclined towards Sean's perspective. Making the kinds of significant changes to the vector implementation that 165 entails strikes me as non-trivial and likely to delay 0.2 significantly. I vote not to include it in this point release, so that the functionality which is ready to go public can be released. What we have now seems to work adequately, even if it does not scale as well as we imagine it should. Support for 100k-cardinality sparse vectors would be a fine focus point for 0.3, and I'm willing to help make it happen.

Jeff



Grant Ingersoll wrote:

On Oct 12, 2009, at 6:18 PM, Jake Mannix wrote:

Yeah, I'm suggesting that any discussion about Colt/cMath/etc be for 0.3, not now. The changes in M-165 don't require any library changes - they're all internal to Mahout's vector impls.

Yeah, except Shashi says it doesn't perform.


 -jake

On Mon, Oct 12, 2009 at 3:09 PM, Sean Owen <sro...@gmail.com> wrote:

I don't have a strong view on Colt vs anything else. The only thing
that would concern me here would be to let this block 0.2, if it's not
even fully clear what the change will be, or implemented or tested.
This is months off at this rate? Without a clear picture that this is
getting wrapped up in a week, I'd strongly push the modest suggestion
that it simply not be part of 0.2. Absolutely not saying it shouldn't
be done. Not even saying it should be done soon -- I think 0.3 should
follow soon and in general we should release more often.

We're another week on in the discussion about releasing 0.2. Two folks
seem ready to go. May I ask again what it seems 0.2 can't be released
without? Having put a load of changes I'm keen to get into the wild
myself, I'm aware of the drawbacks to letting this drag on a while. I
really feel like people have "1.0" in mind when they say "0.2". This
definitely doesn't need to be perfect, just roughly stable and a
significant iteration over 0.1, and it is.

Could I ask anyone that really wants this issue to be in 0.2 to at
least name a deadline and create a plan to make it happen? Seems like
a reasonable request now. Otherwise it's 0.3.

On Mon, Oct 12, 2009 at 9:43 PM, Grant Ingersoll <gsing...@apache.org> wrote:
I think 165 needs to be in this release, it is a pretty big performance
issue.  I'm leaning towards the Colt stuff at the moment.  Perhaps in 0.3,
we can refocus on how we want to attack the matrix stuff.



--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene:
http://www.lucidimagination.com/search



