[ http://issues.apache.org/jira/browse/HADOOP-470?page=comments#action_12429859 ]

Yoram Arnon commented on HADOOP-470:
------------------------------------

du is inexpensive. Below are timings of ls and du on the root of a DFS containing 4500 directories and 250000 files. 'top' on the namenode showed no discernible difference between the two. lsr is a different story; see its timing at the bottom.

>time hadoop dfs -du /
real    0m2.217s
user    0m0.473s
sys     0m0.100s

>time hadoop dfs -ls /
real    0m2.036s
user    0m0.469s
sys     0m0.096s

>time hadoop dfs -lsr /
real    0m55.100s
user    0m25.186s
sys     0m4.105s

> Some improvements in the DFS content browsing UI
> ------------------------------------------------
>
>                 Key: HADOOP-470
>                 URL: http://issues.apache.org/jira/browse/HADOOP-470
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>            Reporter: Devaraj Das
>            Priority: Minor
>
> Some improvement requests from Yoram:
> 1. directory browsing: the size, replication and block size fields are
> unused, and indeed the replication field contains random junk. It would be
> useful to use these fields to represent the size of the folder (recursive,
> like du -s), and possibly the number of files in the folder.
> 2. since file sizes are typically very large, introducing a comma thousands
> separator will make them more readable.
> 3. For a particular file I have the list of blocks that make it up. It would
> be useful to see the block placement information - which datanodes are
> holding that block. That's arguably more relevant than the block contents
> when clicking on the block.
> 4. a nit - 2048 may be too small a chunk size by default. The overhead of
> getting the first byte is so high (redirect, connect, handshake etc.) that
> you may as well get 10-20k as your first shot.
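
For item 2 of the quoted request, a minimal sketch of how a size could be rendered with a comma thousands separator. This is only an illustration, not the actual browsing UI code: the class SizeFormat and the method withCommas are hypothetical names, and it assumes the standard java.text.NumberFormat with a US locale is acceptable for the display.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical helper, not part of Hadoop: formats a byte count for display.
class SizeFormat {
    // Returns the value with comma grouping every three digits.
    // Locale.US is an assumption; it fixes the separator to a comma
    // regardless of the server's default locale.
    static String withCommas(long bytes) {
        NumberFormat nf = NumberFormat.getIntegerInstance(Locale.US);
        nf.setGroupingUsed(true);
        return nf.format(bytes);
    }

    public static void main(String[] args) {
        // A 256 MB file size as it might appear in a listing.
        System.out.println(withCommas(268435456L)); // prints 268,435,456
    }
}
```

Pinning the locale keeps the output the same on every machine; a UI that should follow the viewer's locale would pass that locale instead.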