On Mon, Mar 11, 2013 at 5:44 PM, Jake Mannix <[email protected]> wrote:

>
>
>
> On Mon, Mar 11, 2013 at 5:14 PM, Ted Dunning <[email protected]> wrote:
>
>> [mvn compile|test|package] will do the trick.
>>
>> Everything is built-in.  The code generator is a maven plug-in that runs
>> whenever you build math.  That is why the build isn't really incremental.
>> Not that it matters much since the compile is so fast.
>>
>
> Ok, I'll try that.  For some reason, it wasn't doing anything before (I
> think?), as we hardcode a dependency on mahout-collections-1.0 in a lot of
> poms.
>

In particular, when I build, I see:

Downloading:
http://repo1.maven.org/maven2/org/apache/mahout/mahout-collections/1.0/mahout-collections-1.0.jar

Which implies to me that the build is going to use the released 1.0 jar
rather than my newly minted code...


>
>
>> Getting a good iterator would be awesome.  Should be easy to have internal
>> state variables with getters to avoid cons'ing up temporary values or to
>> avoid boxing.  For key/value pairs, the iterator can nominally be over the
>> keys but have an extra method to be called after next() which will give the
>> value without a real lookup.
>>
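
A rough sketch of that suggestion, to make it concrete. This is not code from
the mahout-collections templates: the class name IntDoubleCursor, its methods,
and the boolean "full" array marking occupied slots are all made up for
illustration. The cursor walks the keys and exposes value() for the slot that
the last nextKey() call landed on, so nothing is boxed and no second hash
lookup is needed.

```java
import java.util.NoSuchElementException;

/** Hypothetical cursor over parallel primitive arrays (names are illustrative). */
final class IntDoubleCursor {
    private final int[] keys;
    private final double[] values;
    private final boolean[] full;   // assumed marker: true where a slot holds an entry
    private int current = -1;       // slot returned by the last nextKey()
    private int next;               // next occupied slot, or keys.length when done

    IntDoubleCursor(int[] keys, double[] values, boolean[] full) {
        this.keys = keys;
        this.values = values;
        this.full = full;
        this.next = advance(0);
    }

    /** Returns the first occupied slot at or after 'from', or the array length. */
    private int advance(int from) {
        int i = from;
        while (i < full.length && !full[i]) {
            i++;
        }
        return i;
    }

    boolean hasNext() {
        return next < full.length;
    }

    /** Moves to the next entry and returns its key as a primitive int. */
    int nextKey() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        current = next;
        next = advance(next + 1);
        return keys[current];
    }

    /** Value for the key returned by the last nextKey(); a plain array read. */
    double value() {
        if (current < 0) {
            throw new IllegalStateException("call nextKey() first");
        }
        return values[current];
    }
}
```

A caller would loop with while (cursor.hasNext()) { int k = cursor.nextKey();
double v = cursor.value(); ... } and can break out at any point.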
>
> I was imagining doing something very similar to what we have in our vectors:
> truly implement Iterable<${KeyTypeCap}${ValueTypeCap}Pair> by instantiating
> exactly *one* ${KeyTypeCap}${ValueTypeCap}Pair per iterator, and having it
> serve as a layer of indirection to fetch keys/values directly from the
> underlying primitive arrays (keeping only the simple state of the index
> offset into the arrays, which is incremented as iteration proceeds).
>
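
To illustrate the single-pair idea, here is a minimal sketch for the
int/double case. It is a toy, not the real OpenIntDoubleHashMap or its
template: TinyIntDoubleMap, IntDoublePair, the fixed-capacity linear probing,
and the anyAbove() helper are all assumptions made up for this example. The
point is only that one pair object per iterator can read straight out of the
primitive arrays, and that a caller can stop early without a full O(n) walk
first.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical reusable view onto one slot of the backing arrays. */
final class IntDoublePair {
    private final int[] keys;
    private final double[] values;
    int slot = -1;  // moved by the iterator; no per-entry allocation

    IntDoublePair(int[] keys, double[] values) {
        this.keys = keys;
        this.values = values;
    }

    int getKey() { return keys[slot]; }
    double getValue() { return values[slot]; }
}

/** Toy fixed-capacity open-addressing map (illustrative only, no resizing). */
final class TinyIntDoubleMap implements Iterable<IntDoublePair> {
    private final int[] keys;
    private final double[] values;
    private final boolean[] full;

    TinyIntDoubleMap(int capacity) {
        keys = new int[capacity];
        values = new double[capacity];
        full = new boolean[capacity];
    }

    void put(int key, double value) {
        int i = Math.floorMod(key, keys.length);
        for (int probes = 0; probes < keys.length; probes++) {
            if (!full[i] || keys[i] == key) {
                keys[i] = key;
                values[i] = value;
                full[i] = true;
                return;
            }
            i = (i + 1) % keys.length;  // linear probing
        }
        throw new IllegalStateException("toy map is full");
    }

    @Override
    public Iterator<IntDoublePair> iterator() {
        // Exactly one pair per iterator; next() only repoints it.
        final IntDoublePair pair = new IntDoublePair(keys, values);
        return new Iterator<IntDoublePair>() {
            private int next = advance(0);

            private int advance(int from) {
                int i = from;
                while (i < full.length && !full[i]) {
                    i++;
                }
                return i;
            }

            @Override
            public boolean hasNext() { return next < full.length; }

            @Override
            public IntDoublePair next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                pair.slot = next;
                next = advance(next + 1);
                return pair;
            }
        };
    }

    /** Early-exit query: stops at the first value above the threshold. */
    static boolean anyAbove(TinyIntDoubleMap map, double threshold) {
        for (IntDoublePair p : map) {
            if (p.getValue() > threshold) {
                return true;
            }
        }
        return false;
    }
}
```

Because the same pair instance is handed back on every next(), a caller that
wants to keep an entry has to copy the key and value out before advancing.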
>
>
>>
>> On Mon, Mar 11, 2013 at 4:42 PM, Jake Mannix <[email protected]>
>> wrote:
>>
>> > On Mon, Mar 11, 2013 at 4:21 PM, Ted Dunning <[email protected]>
>> > wrote:
>> >
>> > > It is part of math now since we had zero pull for it separate from math.
>> > >
>> >
>> > I see the code templates living in math, yes, but how to build it?
>> >
>> >
>> > > What did you need?
>> > >
>> >
>> > Iterators.
>> >
>> > The way we use OpenIntDoubleHashMap in our primary sparse vector impl is
>> > to use forEachPair() to fill a secondary structure with the keys and
>> > values, and then iterate over this.  In addition to being wasteful in the
>> > usual case of iterating over all values (both for CPU and memory), it's
>> > super wasteful if your iteration terminates early: you've already done the
>> > full O(n) walk, but the "second pass" might terminate after a few values.
>> > Say you want to know whether the vector has any values > 1.0.  You might
>> > find out that the first one does, but instead of being an O(1) operation,
>> > it's O(n).
>> >
>> > For raw OpenIntDoubleHashMap, you can use the forEachXYZ methods, but
>> > exposing these in the Vector interface is a bit heavy-handed.  It would be
>> > better to implement the iterateAllNonZero() method so that it delegates to
>> > an efficient iterator() method on OpenIntDoubleHashMap.  It's not hard to
>> > write (it's basically what we have in RandomAccessSparseVector), it just
>> > needs to be implemented in the templates.
>> >
>> >
>> > >
>> > > On Mon, Mar 11, 2013 at 1:43 PM, Jake Mannix <[email protected]>
>> > > wrote:
>> > >
>> > > > Question which I ought to know the answer to, but don't: if we want
>> > > > to make changes to mahout-collections, what's the build process /
>> > > > maven target to do this?
>> > > >
>> > > > --
>> > > >
>> > > >   -jake
>> > > >
>> > >
>> >
>> >
>> >
>> > --
>> >
>> >   -jake
>> >
>>
>
>
>
> --
>
>   -jake
>



-- 

  -jake
