Thanks, Dmitriy and Vitalii!
I am able to control the number of mappers by setting the split size. And yes,
there isn't any reason for re-reading the dictionary, except that I was
porting existing code. I will re-implement it to read the dictionary once and
check the performance.
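
For reference, reading the dictionary once could look roughly like the sketch below. It is only an illustration, not the actual UDF: PhraseChecker, loadCount, and the one-entry-per-line file format are assumptions made for the example, and the Pig wiring is left out so that the snippet compiles with the JDK alone. In a real UDF the same lazy load would sit behind the EvalFunc's exec() method.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Pig: loads the dictionary on first use
// and keeps it in memory for every later lookup on the same instance.
final class PhraseChecker {
    private final Path dictionaryPath;
    private Set<String> dictionary;   // stays null until the first lookup
    private int loadCount = 0;        // only here to show the file is read once

    PhraseChecker(Path dictionaryPath) {
        this.dictionaryPath = dictionaryPath;
    }

    // Assumed file format: one dictionary entry per line, blank lines ignored.
    private void loadDictionary() throws IOException {
        Set<String> entries = new HashSet<>();
        try (BufferedReader reader =
                 Files.newBufferedReader(dictionaryPath, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String entry = line.trim();
                if (!entry.isEmpty()) {
                    entries.add(entry.toLowerCase(Locale.ROOT));
                }
            }
        }
        dictionary = entries;
        loadCount++;
    }

    // Case-insensitive membership check; triggers the single load if needed.
    boolean isKnown(String phrase) throws IOException {
        if (dictionary == null) {
            loadDictionary();
        }
        return dictionary.contains(phrase.trim().toLowerCase(Locale.ROOT));
    }

    int getLoadCount() {
        return loadCount;
    }
}
```

With this shape the 6 MB file is parsed once per checker instance rather than once per input phrase, and each later phrase costs only a hash lookup.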
Regards,
Dipesh
On Mon, Jan
Well, if you set the split size to 1, you should get a per-line split.
2013/1/13 Dipesh Kumar Singh
> Hello users,
>
> I have an input file (1.2 MB) which contains a list of words/phrases, one
> per line. I am reading each phrase and passing it to a UDF to
> correct/check that phrase.
>
"The udf (simple extends eval func) refers and reads a dictionary file of 6
MB for each input phrase."
Any reason to keep re-reading the dictionary instead of just reading it
once?
D
On Sun, Jan 13, 2013 at 4:47 AM, Dipesh Kumar Singh
wrote:
> The udf (simple extends eval func) refers and reads