Edit: instead of buffering in a hash map and then emitting at cleanup, you can
use a combiner. It is likely slower, but easier to code if speed is not your
main concern.
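
To make the two options concrete, here is a minimal sketch in plain Java (deliberately not the Hadoop API, so it runs on its own). The class and method names are made up for illustration, the keys "min" and "max" are an assumption, and the input is just the first column of the sample rows from the original question. It only aims to show that buffering in a hash map and combining after the fact give the same local result.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LocalAggregationSketch {

    // Option 1: keep a running min/max in a hash map, emit it once at the end.
    static Map<String, Double> bufferThenCleanup(double[] values) {
        Map<String, Double> buffer = new HashMap<>();
        for (double v : values) {               // each iteration stands in for a map() call
            buffer.merge("min", v, Math::min);
            buffer.merge("max", v, Math::max);
        }
        return buffer;                          // stands in for emitting in cleanup()
    }

    // Option 2: map() emits every value under both keys, a combiner folds them later.
    static Map<String, Double> emitThenCombine(double[] values) {
        Map<String, List<Double>> emitted = new HashMap<>();
        for (double v : values) {
            emitted.computeIfAbsent("min", k -> new ArrayList<>()).add(v);
            emitted.computeIfAbsent("max", k -> new ArrayList<>()).add(v);
        }
        Map<String, Double> combined = new HashMap<>();
        for (Map.Entry<String, List<Double>> e : emitted.entrySet()) {
            combined.put(e.getKey(), combine(e.getKey(), e.getValue()));
        }
        return combined;
    }

    // Combiner logic: fold a non-empty list of values for one key into a single value.
    static double combine(String key, List<Double> values) {
        double result = values.get(0);
        for (double v : values) {
            result = key.equals("min") ? Math.min(result, v) : Math.max(result, v);
        }
        return result;
    }

    public static void main(String[] args) {
        double[] split = {7.4, 10.0, 7.8, 6.9}; // first column of the sample rows
        System.out.println(bufferThenCleanup(split));
        System.out.println(emitThenCombine(split));
    }
}
```

Option 2 materialises every value before folding, which is the extra work that makes the combiner route likely slower, while the map side stays trivial to write.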
On 01/03/2015 13:41, Ulul wrote:
Hi
I probably misunderstood your question because my impression is that
it's typically a job for a reducer. Emit the local min and max under two
keys from each mapper and you will easily get the global min and max in the reducer.
Ulul
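
A rough sketch of that idea in plain Java (not the Hadoop API, so it can be run standalone). The class name, the method names and the "min"/"max" keys are assumptions for illustration, and the two splits are simply the first-column values of the sample rows divided between two pretend map tasks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GlobalMinMaxSketch {

    // Stand-in for one map task: scan the split, emit local min and max under two keys.
    static Map<String, Double> mapTask(double[] split) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : split) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        Map<String, Double> out = new HashMap<>();
        out.put("min", min);
        out.put("max", max);
        return out;
    }

    // Stand-in for the reducer: called once per key with every mapper's value for it.
    static double reduce(String key, List<Double> values) {
        double result = values.get(0);
        for (double v : values) {
            result = key.equals("min") ? Math.min(result, v) : Math.max(result, v);
        }
        return result;
    }

    // Stand-in for the whole job: run the map tasks, group by key, reduce each group.
    static Map<String, Double> runJob(List<double[]> splits) {
        Map<String, List<Double>> grouped = new HashMap<>();
        for (double[] split : splits) {
            mapTask(split).forEach((key, value) ->
                grouped.computeIfAbsent(key, k -> new ArrayList<>()).add(value));
        }
        Map<String, Double> result = new HashMap<>();
        grouped.forEach((key, values) -> result.put(key, reduce(key, values)));
        return result;
    }

    public static void main(String[] args) {
        List<double[]> splits = Arrays.asList(
            new double[] {7.4, 10.0},
            new double[] {7.8, 6.9});
        System.out.println(runJob(splits));
    }
}
```

Each map task contributes only two values, so the reducer sees very little data no matter how large the input is.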
On 28/02/2015 14:10, Shahab Yunus wrote:
As far as I understand, cleanup is called per task, i.e. per map task in
your case. To get an overall count or measure, you need to aggregate
it yourself after the job is done.
One way to do that is to use counters and then merge them programmatically
at the end of the job.
Regards,
Shahab
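
The counter idea can be sketched roughly as follows, again in plain Java rather than the Hadoop API. Everything named here is hypothetical: each map stands in for one task's counters, the counter name is the class label taken from the last column, and the two tasks are assumed to have received two of the sample rows each. Counters hold long values, so this suits counts and sums.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LabelCounterSketch {

    // Stand-in for one task: bump a counter named after the class label of each row.
    static Map<String, Long> countLabels(List<String> lines) {
        Map<String, Long> counters = new HashMap<>();
        for (String line : lines) {
            String[] fields = line.trim().split("\\s+");
            String label = fields[fields.length - 1]; // last column is the class label
            counters.merge(label, 1L, Long::sum);
        }
        return counters;
    }

    // Stand-in for the end-of-job step: sum the per-task counters by name.
    static Map<String, Long> mergeCounters(List<Map<String, Long>> perTask) {
        Map<String, Long> total = new HashMap<>();
        for (Map<String, Long> task : perTask) {
            task.forEach((name, value) -> total.merge(name, value, Long::sum));
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Long> task1 = countLabels(Arrays.asList(
            "7.4 0.29 0.5 1.8 0.042 35 127 0.9937 3.45 0.5 10.2 7 1",
            "10 0.41 0.45 6.2 0.071 6 14 0.99702 3.21 0.49 11.8 7 -1"));
        Map<String, Long> task2 = countLabels(Arrays.asList(
            "7.8 0.26 0.27 1.9 0.051 52 195 0.9928 3.23 0.5 10.9 6 1",
            "6.9 0.32 0.3 1.8 0.036 28 117 0.99269 3.24 0.48 11 6 1"));
        System.out.println(mergeCounters(Arrays.asList(task1, task2)));
    }
}
```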
On
I have an input file whose last column is the class label:
7.4 0.29 0.5 1.8 0.042 35 127 0.9937 3.45 0.5 10.2 7 1
10 0.41 0.45 6.2 0.071 6 14 0.99702 3.21 0.49 11.8 7 -1
7.8 0.26 0.27 1.9 0.051 52 195 0.9928 3.23 0.5 10.9 6 1
6.9 0.32 0.3 1.8 0.036 28 117 0.99269 3.24 0.48 11 6 1