My understanding is that HDFS places blocks randomly. Consistent with that, when I use "hadoop fsck" to look at the block placements for my files, I see that some nodes hold noticeably more blocks than the average. I would expect these hot spots to cause a performance hit relative to a more even placement of blocks.
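
For reference, the same per-node counts can also be pulled programmatically via the standard FileSystem client API. A quick sketch (the class name is just for illustration; the path comes from the command line):

import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockCounter {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        FileStatus status = fs.getFileStatus(new Path(args[0]));

        // One BlockLocation per block; each lists the datanodes
        // holding a replica of that block.
        BlockLocation[] locations =
            fs.getFileBlockLocations(status, 0, status.getLen());

        // Tally how many block replicas each host holds.
        Map<String, Integer> blocksPerHost = new HashMap<String, Integer>();
        for (BlockLocation location : locations) {
            for (String host : location.getHosts()) {
                Integer count = blocksPerHost.get(host);
                blocksPerHost.put(host, count == null ? 1 : count + 1);
            }
        }
        for (Map.Entry<String, Integer> entry : blocksPerHost.entrySet()) {
            System.out.println(entry.getKey() + "\t" + entry.getValue());
        }
    }
}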

I'd like to experiment with non-random block placement to see whether it yields a performance improvement. Where in the code should I start looking to find the existing random-placement logic?
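
To make "non-random" concrete, here is a toy model of the kind of deterministic, least-loaded chooser I have in mind. This is not Hadoop's actual placement interface (the class and method names are made up for illustration); it only sketches the policy I'd like to try plugging in:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a least-loaded placement strategy; not Hadoop's real API. */
public class EvenPlacementSketch {
    // How many blocks each (hypothetical) datanode currently holds.
    private final Map<String, Integer> blocksPerNode =
        new HashMap<String, Integer>();

    public EvenPlacementSketch(List<String> nodes) {
        for (String node : nodes) {
            blocksPerNode.put(node, 0);
        }
    }

    /** Pick the numReplicas nodes holding the fewest blocks so far. */
    public List<String> chooseTargets(int numReplicas) {
        List<String> nodes = new ArrayList<String>(blocksPerNode.keySet());
        Collections.sort(nodes, new Comparator<String>() {
            public int compare(String a, String b) {
                return blocksPerNode.get(a) - blocksPerNode.get(b);
            }
        });
        List<String> targets = new ArrayList<String>(
            nodes.subList(0, Math.min(numReplicas, nodes.size())));
        for (String node : targets) {
            blocksPerNode.put(node, blocksPerNode.get(node) + 1);
        }
        return targets;
    }

    public static void main(String[] args) {
        EvenPlacementSketch policy =
            new EvenPlacementSketch(Arrays.asList("dn1", "dn2", "dn3", "dn4"));
        for (int block = 0; block < 5; block++) {
            System.out.println("block " + block + " -> " + policy.chooseTargets(3));
        }
    }
}

A real policy would of course also need to respect rack awareness and the other constraints the default placement honors; the sketch is just to show the direction.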

Cheers,
John
