Hi All,
We have an HDFS cluster with ~200 nodes, and for some reason it's
divided into 4 MR clusters which share the same HDFS.
Recently, we saw a lot of SocketTimeoutException warnings in the datanode
logs, such as:
2012-02-24 11:57:51,882 WARN datanode.DataNode
(DataXceiver.java:readBlock(236))
Hello Mohit,
I am looking at some Hadoop tuning parameters like io.sort.mb,
mapred.child.java.opts etc.
- My question was where to look for the current settings
The default settings, as well as their documentation, can be found in the
Hadoop directory:
src/mapred/mapred-default.xml
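
If it helps, here is a minimal sketch of how one might list the name/value
pairs from such a file, using only the JDK's XML parser. It assumes the usual
Hadoop configuration layout (<configuration> containing <property> elements
with <name> and <value> children). The class name ConfigDefaults and the
inline SAMPLE are made up for illustration, and the sample values are not
copied from any particular release. Note that values in mapred-site.xml or in
the job configuration override these defaults, so this shows the defaults
only, not necessarily what a running job uses.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class ConfigDefaults {

    // Illustrative sample in the Hadoop configuration XML layout (assumed values).
    public static final String SAMPLE =
        "<configuration>"
        + "<property><name>io.sort.mb</name><value>100</value></property>"
        + "<property><name>mapred.child.java.opts</name><value>-Xmx200m</value></property>"
        + "</configuration>";

    // Parse configuration XML into a name -> value map, keeping file order.
    public static Map<String, String> parse(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Map<String, String> props = new LinkedHashMap<>();
        NodeList nodes = doc.getElementsByTagName("property");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element property = (Element) nodes.item(i);
            String name = childText(property, "name");
            if (name != null) {
                props.put(name, childText(property, "value"));
            }
        }
        return props;
    }

    // Return the trimmed text of the first child element with this tag, or null.
    private static String childText(Element parent, String tag) {
        NodeList list = parent.getElementsByTagName(tag);
        return list.getLength() == 0 ? null : list.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        // Pass a path such as src/mapred/mapred-default.xml; falls back to SAMPLE.
        String xml = args.length > 0
            ? new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8)
            : SAMPLE;
        for (Map.Entry<String, String> e : parse(xml).entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

Run it with the path to mapred-default.xml as the only argument to print every
default in that file; without an argument it prints the two sample entries.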
Hi Pavel,
It seems your team has spent some time on performance and tuning issues. I
wonder whether an automatic Hadoop tuning tool like Starfish would be
of interest to you. We'd like to exchange tuning experience with you.
Thanks,
Jie
Starfish Group, Duke
Hi Jinyan,
I'd like to introduce you to our system, Starfish, which can be used to
analyze and estimate Hadoop performance and memory usage.
With Starfish, you can analyze the performance of your Hadoop job at a
fine-grained level, e.g. the time for map processing, spilling, merging,
shuffling,
Use a search engine to find the Hadoop best practices blog by Arun Murthy.
Sriram
On Feb 24, 2012, at 10:36 PM, Mohit Anchlia mohitanch...@gmail.com wrote:
I am looking at some Hadoop tuning parameters like io.sort.mb,
mapred.child.java.opts etc.
- My question was where to look at for
If I want to change the block size, can I set it through the Configuration
in my MapReduce job when writing the sequence file, or does it need to be a
cluster-wide setting in the .xml files?
Also, is there a way to check the block size of a given file?
Yes, it is supported by Hadoop sequence files. They are splittable
by default. If you have installed and specified LZO correctly,
use these:
org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat.setCompressOutput(job, true);
Thanks. Does it mean LZO is not installed by default? How can I install LZO?
On Sat, Feb 25, 2012 at 6:27 PM, Shi Yu sh...@uchicago.edu wrote:
Yes, it is supported by Hadoop sequence files. They are splittable
by default. If you have installed and specified LZO correctly,
use these: