Oh-ha, that's simple. :)
/Edward J. Yoon
On Tue, Oct 7, 2008 at 7:14 PM, Miles Osborne <[EMAIL PROTECTED]> wrote:
> this is a well known problem. basically, you want to aggregate values
> computed at some previous step.
>
> --emit pairs and have the reducer simply sum-up
> the probabilities for
this is a well-known problem. basically, you want to aggregate values
computed at some previous step.
--emit pairs and have the reducer simply sum up
the probabilities for a given category
(it is the same task as summing up the word counts)
Miles
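A minimal sketch of the suggestion above, in plain Java with no Hadoop API so it runs on its own. The class name CategorySum, the map/reduce method names, and the sample records are all hypothetical, chosen only for illustration. The "map" step emits one (category, value) pair per record, and the "reduce" step sums the values that share a category, exactly as a word count would.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class CategorySum {

    /** "Map" step: emit one (category, value) pair per input record. */
    public static List<Map.Entry<String, Double>> map(String[][] records) {
        List<Map.Entry<String, Double>> pairs = new ArrayList<>();
        for (String[] record : records) {
            // record[0] is the category, record[1] the value from the previous step
            pairs.add(new SimpleEntry<>(record[0], Double.parseDouble(record[1])));
        }
        return pairs;
    }

    /** "Reduce" step: sum all values that share the same category key. */
    public static Map<String, Double> reduce(List<Map.Entry<String, Double>> pairs) {
        Map<String, Double> totals = new TreeMap<>();
        for (Map.Entry<String, Double> pair : pairs) {
            // Same accumulation as a word count, keyed by category
            totals.merge(pair.getKey(), pair.getValue(), Double::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: (category, value) records
        String[][] records = {
            {"bad", "0.5"}, {"good", "0.25"}, {"bad", "0.25"}, {"good", "0.5"}
        };
        System.out.println(reduce(map(records)));
    }
}
```

In a real job the grouping by key is done by the framework between the map and reduce phases; here the TreeMap stands in for that shuffle step.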
2008/10/7 Edward J. Yoon <[EMAIL PROTECTED]>:
I would like to get the spam probability P(word|category) of the words
from the files of each category (bad/good e-mails), as described below. BTW,
to compute it in the reducer, I need the sum of "spamTotal" across all map
tasks. How can I get it?
Map:
/**
* Counts word frequency
*/
public void