A server not running a DataNode can only be a NameNode or JobTracker; I think running the copier jobs on such a server can bring uncertain risks.
I found something on this in the book Hadoop: The Definitive Guide:

"When copying data into HDFS, it's important to consider cluster balance. HDFS works best when the
You can set mapred.child.java.opts in mapred-site.xml, but it affects both
mappers and reducers, while I need to specify JVM options for mappers and
reducers separately.
Regards,
Vitaliy S
On Tue, Oct 5, 2010 at 5:14 PM, Jeff Zhang zjf...@gmail.com wrote:
You can set mapred.child.java.opts in
I think you can change that in your conf/mapred-site.xml, since it's
a site specific config
file (see: http://hadoop.apache.org/common/docs/current/cluster_setup.html)
e.g.:
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx8G</value>
</property>
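A note on the separate-options question: the single mapred.child.java.opts property applies to both map and reduce tasks. Later Hadoop releases (0.21 and the 2.x line) add per-task-type properties, so mappers and reducers can be given different heap sizes. A sketch, assuming one of those versions (the 1g/4g values are illustrative only):

```xml
<!-- JVM options for map tasks only (assumes Hadoop 0.21+/2.x property names) -->
<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx1g</value>
</property>
<!-- JVM options for reduce tasks only -->
<property>
  <name>mapreduce.reduce.java.opts</name>
  <value>-Xmx4g</value>
</property>
```

When set, these take precedence over mapred.child.java.opts for the corresponding task type.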
I hope this helps
Yours
Pablo Cingolani
I have a collection of dirty data files, which I can detect during the
setup() phase of my Map job. It would be best if I could quit the map job
and prevent it from being retried. What is the best practice for doing
this?
Thanks in advance.
How about deleting/moving the dirty files in your mapper, or in another job?
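Alternatively, if the mapper throws an exception when setup() detects a dirty file, the framework will by default retry the task several times. Lowering the maximum attempts to 1 for that job makes the failure immediate and prevents retries. A sketch using the old-style property name (in Hadoop 2.x it is mapreduce.map.maxattempts):

```xml
<!-- Fail the job on the first map-task failure instead of retrying
     (assumes the pre-2.x property name mapred.map.max.attempts) -->
<property>
  <name>mapred.map.max.attempts</name>
  <value>1</value>
</property>
```

This can also be set per job on the command line with -D rather than cluster-wide in mapred-site.xml.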
On Fri, Oct 8, 2010 at 4:30 PM, Steve Kuo kuosen...@gmail.com wrote:
I have a collection of dirty data files, which I can detect during the
setup() phase of my Map job. It would be best if I could quit the map job