Hello,
I'm able to run a Nutch job on a single node in pseudo-distributed mode,
where each Hadoop daemon runs in its own Java process. To make this work,
I have to set mapred.min.split.size to a large value. If
mapred.min.split.size is too small, the local disk fills up after a large
number of maps and the job never completes.
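For reference, this is roughly how I'm setting it, in mapred-site.xml (the value below is just an example; I picked it to keep the number of map tasks small):

```xml
<!-- Example fragment from conf/mapred-site.xml; the 512 MB value is illustrative. -->
<property>
  <name>mapred.min.split.size</name>
  <value>536870912</value>
</property>
```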
My problem is that when I run in non-distributed (local) mode, as a single
Java process, setting mapred.min.split.size appears to be ignored. Is this
the expected behavior?
Thanks,
Ken