strictly speaking, it would be A.rdd.map(_._2).reduce(_ + _), since reduce
needs a (T, T) => T. oh well
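
A self-contained sketch of that, assuming (as elsewhere in this thread) that
A.rdd is an RDD[(K, Vector)] and that Mahout's R-like scala bindings are in
scope:

  import org.apache.mahout.math.Vector
  import org.apache.mahout.math.scalabindings.RLikeOps._ // gives Vector a + operator
  import org.apache.spark.rdd.RDD

  // reduce's contract is (T, T) => T, so project the vectors out of the
  // (key, vector) pairs before folding them together
  val rowSum: Vector = A.rdd.map(_._2).reduce(_ + _)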

On Sun, Jul 13, 2014 at 10:08 PM, Dmitriy Lyubimov <[email protected]>
wrote:

> the only problem I see with that is that it would not be algebra any more.
> That would be functional programming, and as such there are probably better
> frameworks to address these kinds of things than a DRM. DRM currently
> suggests just exiting to engine-level primitives, i.e. doing something like
> A.rdd.reduce(_+_).
>
>
> On Sun, Jul 13, 2014 at 10:02 PM, Anand Avati <[email protected]> wrote:
>
>> How about a new drm API:
>>
>>
>>   type ReduceFunc = (Vector, Vector) => Vector
>>
>>   def reduce(rf: ReduceFunc): Vector = { ... }
>>
>> The row keys in this case are ignored/erased, but I'm not sure if they are
>> useful (or even meaningful) for reduction. Such an API should be
>> sufficient
>> for kmeans (in combination with mapBlock). But does this feel generic
>> enough? Maybe a good start? Feedback welcome.
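>>
>> A minimal sketch of an implementation on the Spark bindings, assuming the
>> backing rdd is an RDD[(K, Vector)] as used elsewhere in this thread (the
>> names below are illustrative, not a committed API):
>>
>>   import org.apache.mahout.math.Vector
>>   import org.apache.mahout.math.scalabindings.RLikeOps._
>>   import org.apache.spark.rdd.RDD
>>
>>   type ReduceFunc = (Vector, Vector) => Vector
>>
>>   // erase the row keys and fold the row vectors pairwise;
>>   // rf must be associative and commutative for a distributed reduce
>>   def reduce[K](rows: RDD[(K, Vector)])(rf: ReduceFunc): Vector =
>>     rows.map(_._2).reduce(rf)
>>
>>   // e.g. column sums of A:
>>   // val colSums = reduce(A.rdd)(_ + _)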
>>
>>
>>
>> On Sun, Jul 13, 2014 at 6:34 PM, Ted Dunning <[email protected]>
>> wrote:
>>
>> >
>> > Yeah.  Collect was where I had gotten to, and I was rather sulky about
>> > the results.
>> >
>> > It does seem like a reduce is going to be necessary.
>> >
>> > Anybody else have thoughts on this?
>> >
>> > Sent from my iPhone
>> >
>> > > On Jul 13, 2014, at 17:58, Anand Avati <[email protected]> wrote:
>> > >
>> > > collect(), hoping the result fits in memory, and do the reduction
>> > > in-core. I think some kind of a reduce operator needs to be introduced
>> > > for doing even simple things like scalable kmeans. Haven't thought of
>> > > how it would look yet.
>> >
>>
>
>
