Mark,

Thanks for the pointer. As far as I understand, you are not using Hadoop's
default split; instead you define your own split as one record, that is,
everything between the start tag and the end tag in your XML. So in effect
you have one map per record? In my case that would not be efficient, since my
XML files are small. What I want is a split that includes multiple files, so
that one map handles around 64 MB of data, with the parsing done inside the
map. I will update you once it makes more sense even to me.

-- 
Vipul Sharma
sharmavipul AT gmail DOT com
