I have a large number of key/value pairs, and I don't actually care whether
the data goes in the key or the value. To be more exact: after the combiner
there are about 1 million (k,v) pairs, with roughly 1 KB of data per pair,
which I can put in either the key or the value.
I have experimented with both options, (heavy key, light value) vs. (light
key, heavy value), and it turns out the (hk,lv) option is much, much faster
than (lk,hv).
Has anyone else noticed this?
Is there a way to make things faster in the (light key, heavy value) option?
Some applications will need that layout as well.
Remember, in both cases we are talking about at least a dozen or so million
pairs.
The time difference is in the shuffle phase, which is weird, since the
amount of data transferred is the same.

-gyanit