On Mon, Mar 11, 2013 at 6:07 PM, Ted Dunning <[email protected]> wrote:
> On Mon, Mar 11, 2013 at 5:44 PM, Jake Mannix <[email protected]>
> wrote:
>
> > On Mon, Mar 11, 2013 at 5:14 PM, Ted Dunning <[email protected]>
> > wrote:
> >
> > > [mvn compile|test|package] will do the trick.
> > >...
> > > Not that it matters much since the compile is so fast.
> > >
> >
> > Ok, I'll try that.  For some reason, it wasn't doing anything (I think?)
> > before, as we hardcode dependency on mahout-collections-1.0 in a lot of
> > poms, I think?
> >
>
> Shouldn't be any more.
>
> I don't see these dependencies.
>
>
> > I was imagining doing something very similar to what we have in our
> > vectors: truly implement Iterable<${KeyTypeCap}${ValueTypeCap}Pair>, by
> > instantiating exactly *one* ${KeyTypeCap}${ValueTypeCap}Pair per iterator,
> > and having it serve as a layer of indirection to fetch keys/values
> > directly from the underlying primitive arrays (and keeping the simple
> > state of the index offset into the arrays, which is incremented as
> > iteration commences).
> >
>
> That would work just as well.  Even better since it is very well
> understood.  My guess is that the JIT will see through the re-used object.
>
> The only downside is if somebody naively retains the Pair object imagining
> that it is not re-used.  This *will* cause some odd bugs, but we have faced
> that before.  Hiding the data inside the Iterator will make this a little
> less likely since people can't get a reference to a re-used object.  The
> extra method will be a little easier for the JIT to figure out as well, but
> I don't expect any practical difference.
>
> If you would like, I will mirror your implementation with my approach and
> we can measure to see if there is any important difference in speed.
>

Oooooh, am I being offered a chance to go head-to-head in an inner-loop
performance challenge against Ted Dunning?  How can I pass *that* up? ;)

See me on MAHOUT-1160, and bring your pinks!

-- 

  -jake
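
For reference, here is a minimal sketch of the reused-Pair iterator described
in the quoted text above.  It is only an illustration under assumptions: the
names ReusedPairSketch, IntDoublePair, IntDoublePairs and weightedSum are
hypothetical and are not the generated mahout-collections classes, the types
are fixed to int keys and double values rather than templated, and the
backing arrays are assumed to be dense (a real open-addressing hash map would
also have to skip free slots).

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

public class ReusedPairSketch {

  /** Hypothetical stand-in for a generated ${KeyTypeCap}${ValueTypeCap}Pair. */
  public static final class IntDoublePair {
    private final int[] keys;
    private final double[] values;
    private int index = -1; // offset into the backing arrays, advanced by the iterator

    IntDoublePair(int[] keys, double[] values) {
      this.keys = keys;
      this.values = values;
    }

    // The pair holds no copies; it reads straight from the primitive arrays.
    public int getKey() { return keys[index]; }
    public double getValue() { return values[index]; }
  }

  /** Iterable view over parallel primitive arrays (assumed dense for simplicity). */
  public static final class IntDoublePairs implements Iterable<IntDoublePair> {
    private final int[] keys;
    private final double[] values;

    public IntDoublePairs(int[] keys, double[] values) {
      if (keys.length != values.length) {
        throw new IllegalArgumentException("keys and values must have equal length");
      }
      this.keys = keys;
      this.values = values;
    }

    @Override
    public Iterator<IntDoublePair> iterator() {
      // Exactly one Pair is allocated per iterator and returned on every call to next().
      final IntDoublePair pair = new IntDoublePair(keys, values);
      return new Iterator<IntDoublePair>() {
        @Override
        public boolean hasNext() {
          return pair.index + 1 < keys.length;
        }

        @Override
        public IntDoublePair next() {
          if (!hasNext()) {
            throw new NoSuchElementException();
          }
          pair.index++;
          return pair;
        }

        @Override
        public void remove() {
          throw new UnsupportedOperationException();
        }
      };
    }
  }

  /** Example inner loop: sum of key * value over all entries. */
  public static double weightedSum(IntDoublePairs pairs) {
    double sum = 0.0;
    for (IntDoublePair p : pairs) {
      sum += p.getKey() * p.getValue();
    }
    return sum;
  }
}
```

The pitfall Ted mentions shows up directly in this sketch: every call to
next() hands back the same object, so code that stores the returned pairs in
a list ends up with many references to one pair, all reporting the last
entry visited.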
