On Thu, Oct 15, 2009 at 6:47 AM, Grant Ingersoll <gsing...@apache.org> wrote:
>
> On Oct 15, 2009, at 8:22 AM, Sean Owen wrote:
>
>> On Thu, Oct 15, 2009 at 4:57 AM, Grant Ingersoll <gsing...@apache.org>
>> wrote:
>>
>>>> MAHOUT-165 Using better primitives hash for sparse vector for
>>>> performance gains Open 14/Oct/09
>>>>
>>>> Per discussion, move the remainder (migration to Colt or something) to
>>>> 0.3
>>>>
>>>
>>> I will try to get to this, as I think it is important.
>>>
>>
>> I agree with Jeff that the migration to a new framework is a big
>> change and should be left to 0.3. (Vote?) There is a whole lot of
>> change already, more than might normally go into a point release.
>> Since you have another blocker below, and limited time, I say don't
>> kill yourself to work on this. It's going to be hard to get it done in
>> a weekend.
>>
>
> I don't think it is that big. We can likely just make another
> implementation of Vector. We don't have to convert everything to Colt.
>

Ted's patch (since monkeyed with by you and myself) has the other
implementation of Vector, but testing showed it's slower? This patch also
had a significant refactoring of the Vector hierarchy, so it's not just
"a new class".

I'm all for getting this in as soon as we can, because this issue (well,
finalizing on a linear API) pretty much blocks my donating decomposer to
Mahout, but it looks like you're the only one who feels strongly about
resolving M-165 for 0.2, Grant. Can we not just have 0.3 in another 6-8
weeks or so which covers this? What Mahout user is getting blocked by
having too-slow sparse vectors currently?

-jake