[ 
https://issues.apache.org/jira/browse/HADOOP-6502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208203#comment-13208203
 ] 

Aaron T. Myers commented on HADOOP-6502:
----------------------------------------

Tiny nit: In "} else {//check already performed on this class name" please put 
a space between "{" and "//".

Otherwise the patch looks good to me. +1.
                
> DistributedFileSystem#listStatus is very slow when listing a directory with a 
> size of 1300
> ------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-6502
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6502
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.20.0
>            Reporter: Hairong Kuang
>            Assignee: Todd Lipcon
>            Priority: Critical
>         Attachments: 6502.patch, 6502_v2.patch, hadoop-6502-trunk.txt, 
> hadoop-6502-trunk.txt
>
>
> When listing a directory of around 1300 children, listStatus takes hundreds of 
> milliseconds. The slowdown is caused by the change made in HADOOP-4187. The 
> return value of listStatus is an array of FileStatus. When deserializing each 
> element of the array, ReflectionUtils#newInstance(Class<T>, Configuration) is 
> called, which calls setConf, which in turn calls setJobConf. setJobConf checks 
> whether JobConf is on the class path by calling Configuration#getClassByName. 
> Although Configuration#getClassByName tries to optimize the lookup with a 
> cached map, JobConf is not on the class path, so it never enters the cache. 
> Every lookup therefore ends up calling Class.forName, which is very expensive. 
> Deserializing an array of 1300 entries requires calling Class#forName 1300 
> times!
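
For illustration, the repeated-miss problem described in the quoted report can be avoided by remembering failed lookups as well as successful ones. The following is a minimal sketch of that idea, not the code from the attached patch: the class ClassLookupCache, the method getClassByNameOrNull, the NOT_FOUND sentinel, the forNameCalls counter, and the class name org.example.DoesNotExist are all hypothetical names chosen for this example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper for illustration only; not taken from the Hadoop patch.
final class ClassLookupCache {
    // Marker type stored for names that failed to load, so misses are cached too.
    private static final class NotFoundMarker {}
    private static final Class<?> NOT_FOUND = NotFoundMarker.class;

    private final Map<String, Class<?>> cache = new ConcurrentHashMap<>();
    private final ClassLoader loader;

    // Counts real Class.forName calls, only to make the effect observable.
    final AtomicInteger forNameCalls = new AtomicInteger();

    ClassLookupCache(ClassLoader loader) {
        this.loader = loader;
    }

    /** Returns the named class, or null if it cannot be found. */
    Class<?> getClassByNameOrNull(String name) {
        Class<?> cached = cache.get(name);
        if (cached == null) {
            // First lookup for this name: pay the cost of Class.forName once.
            forNameCalls.incrementAndGet();
            try {
                cached = Class.forName(name, true, loader);
            } catch (ClassNotFoundException e) {
                cached = NOT_FOUND; // remember the miss
            }
            cache.put(name, cached);
        }
        // Otherwise the check was already performed on this class name.
        return cached == NOT_FOUND ? null : cached;
    }

    public static void main(String[] args) {
        ClassLookupCache c =
            new ClassLookupCache(ClassLookupCache.class.getClassLoader());
        // Simulate deserializing 1300 entries, each probing for a missing class.
        for (int i = 0; i < 1300; i++) {
            c.getClassByNameOrNull("org.example.DoesNotExist");
        }
        System.out.println("Class.forName calls: " + c.forNameCalls.get());
    }
}
```

With the sentinel in place, the 1300 lookups in main trigger a single Class.forName call instead of 1300. A sentinel is used rather than a null map value because ConcurrentHashMap does not permit null values.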


        
