This is mostly disk-bound on the NameNode. I think it ends up being one fsync per file. If you have multiple directories, you could start several commands in parallel: because of the way the NameNode syncs, having multiple clients helps.
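A rough sketch of the parallel approach, assuming the tree splits into subdirectories that can be chowned independently. The names HADOOP_BIN, OWNER, MAX_JOBS, and parallel_chown are made up for illustration and are not Hadoop settings; the hadoop path and the owner are taken from the command in the original message, and `xargs -P` is assumed to be available (GNU and BSD xargs both have it).

```shell
# Illustrative variables only; override them in the environment as needed.
HADOOP_BIN="${HADOOP_BIN:-/home/hadoop/hadoop-0.18.1/bin/hadoop}"
OWNER="${OWNER:-frank:frank}"
MAX_JOBS="${MAX_JOBS:-4}"   # number of concurrent client processes (a guess, tune it)

# parallel_chown DIR...
# Runs one "dfs -chown -R" client per directory argument,
# with at most MAX_JOBS of them running at the same time.
parallel_chown() {
  printf '%s\n' "$@" |
    xargs -P "$MAX_JOBS" -I{} "$HADOOP_BIN" dfs -chown -R "$OWNER" {}
}
```

It would be called with the subdirectories as arguments, for example `parallel_chown /home/frank/proj100/dirA /home/frank/proj100/dirB` (dirA and dirB are placeholders). The parent directory itself, and any files sitting directly in it, would still need their own chown. Whether this actually improves on 60 files/sec depends on how the NameNode batches its syncs, so it is worth timing with a small MAX_JOBS first.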
Raghu.

Frank Singleton wrote:
Hi,

Did a test of recursive chown on a Fedora 9 box here (2 x quad core, 16 GB RAM).
It took about 12.5 minutes to complete for 45000 files (hmm, approx. 60 files/sec).
This was the namenode that I executed the command on.

Q1. Is this rate (60 files/sec) typical of what other folks are seeing?

Q2. Are there any dfs/jvm parameters I should look at to see if I can improve this time?

/home/hadoop/hadoop-0.18.1/bin/hadoop dfs -chown -R frank:frank /home/frank/proj100

real 12m38.631s
user 1m54.662s
sys 0m33.124s

time /home/hadoop/hadoop-0.18.1/bin/hadoop dfs -count /home/frank/proj100
220 45891 3965996260 hdfs://namenode:9000/home/frank/proj100

real 0m1.579s
user 0m0.686s
sys 0m0.129s

cheers / frank