Hi folks,
I have a huge text file, terabytes in size, that contains multiline records.
We are not told how many lines each record takes: one record may span 5
lines, another 6, another 4. The number of lines varies from record to
record.
Since we cannot use default TextI
Hi guys,
I'm confused about NLineInputFormat.
I have written an MR job using NLineInputFormat, and the output is fine, but
only 2 map tasks are running.
According to the documentation of NLineInputFormat:
If you want your mappers to receive a fixed number of lines of input, then
NLi
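
For reference, a rough sketch of the arithmetic behind the expected map task count. It assumes each split holds at most N lines, so one input file of `total_lines` lines yields ceil(total_lines / N) splits and one map task per split. The function name and the sample numbers are made up for illustration; this is plain Python, not Hadoop code.

```python
def expected_map_tasks(total_lines: int, lines_per_split: int) -> int:
    """Estimate map tasks for one input file split every N lines.

    Assumption: one map task per split, and the last split may hold
    fewer than N lines.
    """
    if total_lines < 0:
        raise ValueError("total_lines must be non-negative")
    if lines_per_split <= 0:
        raise ValueError("lines_per_split must be positive")
    # Integer ceiling division, avoids float rounding on very large files.
    return (total_lines + lines_per_split - 1) // lines_per_split


if __name__ == "__main__":
    # Hypothetical file: 1,001 lines, 100 lines per split.
    print(expected_map_tasks(1001, 100))
```

If the count the job actually reports is far below this estimate, the split size the job sees is probably not the N that was intended.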
Hi guys,
I read the following advice on performance somewhere:
For maximum performance, the number of reducers should be slightly less than
the number of reduce slots in the cluster. This allows the reducers to
finish in one wave and fully utilizes the cluster during the reduce phase.
I don't quite under
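
A small sketch of the "one wave" arithmetic in the quoted advice, assuming each reduce slot runs one reducer at a time and all reducers take about the same time. The function name and the slot counts are hypothetical.

```python
def reduce_waves(num_reducers: int, reduce_slots: int) -> tuple[int, int]:
    """Return (waves, idle_slots_in_last_wave).

    Assumption: a wave is one round in which every available slot runs
    at most one reducer.
    """
    if num_reducers <= 0 or reduce_slots <= 0:
        raise ValueError("both arguments must be positive")
    # Ceiling division: a partially filled round still counts as a wave.
    waves = (num_reducers + reduce_slots - 1) // reduce_slots
    # Reducers left for the final wave (a full wave if it divides evenly).
    in_last_wave = num_reducers - (waves - 1) * reduce_slots
    return waves, reduce_slots - in_last_wave


if __name__ == "__main__":
    # Hypothetical cluster with 100 reduce slots.
    print(reduce_waves(95, 100))   # slightly fewer reducers than slots
    print(reduce_waves(105, 100))  # slightly more reducers than slots
```

With 95 reducers on 100 slots everything finishes in a single wave. With 105 reducers a second wave is needed for just 5 of them, so 95 slots sit idle while the job waits. Keeping the count slightly below the slot total presumably also leaves a few slots free for re-running failed tasks, though the quoted text does not say so.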