[ 
https://issues.apache.org/jira/browse/HADOOP-5958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12775180#action_12775180
 ] 

Aaron Kimball commented on HADOOP-5958:
---------------------------------------

|  DF.java:37 - "has most of the functionality, and has better performance" - 
makes sense as a jira comment, but when that's the only implementation people 
may be left wondering "Better than what?" Best to specifically compare with 
PosixDF.

Agreed.

| Now that there are two getDFs, one with a conf and one without, shouldn't one 
either be marked deprecated or private? I'd say we should leave the one that 
takes a Configuration and just ignore the configuration variable, unless we're 
certain we'll never want Configuration here again.

{{getDF()}} never existed before; I created those as a replacement for the 
{{DF()}} constructors, now that {{DF}} itself is abstract. I'm ok with 
providing the Configuration-handling version only.
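
A minimal sketch of that single-factory shape, not the code in the attached 
patches. The names {{DFSketch}} and {{JavaDFSketch}}, the method signatures, 
and the use of a plain {{Map}} in place of Hadoop's {{Configuration}} are all 
assumptions made to keep the example self-contained.

```java
import java.io.File;
import java.util.Map;

// Hypothetical stand-in for the abstract DF class discussed above.
abstract class DFSketch {
    protected final File dir;

    protected DFSketch(File dir) {
        this.dir = dir;
    }

    /** Total size of the partition holding dir, in bytes. */
    abstract long getCapacity();

    /** Bytes available to this JVM on the partition holding dir. */
    abstract long getAvailable();

    /**
     * Single factory method. The conf argument (a Map here, standing in for
     * a Configuration object) is accepted but currently ignored, so callers
     * would not need to change if a configurable choice were added later.
     */
    static DFSketch getDF(File dir, Map<String, String> conf) {
        return new JavaDFSketch(dir);
    }
}

// Hypothetical implementation backed only by the JDK 1.6 java.io.File calls.
class JavaDFSketch extends DFSketch {
    JavaDFSketch(File dir) {
        super(dir);
    }

    @Override
    long getCapacity() {
        return dir.getTotalSpace();
    }

    @Override
    long getAvailable() {
        return dir.getUsableSpace();
    }
}
```

Keeping only the factory that takes a configuration argument leaves one entry 
point for callers while the argument itself stays unused.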

| We used to have a limit on how often df would be called. That's gone with the 
new implementation - I don't know whether the interval was due to the fork 
overhead or to some overhead in the calls themselves. Are the java.io.File 
implementations fast enough that we don't have to worry about it, or should 
JavaDF do some caching?

I just ran a quick benchmark of calling {{File.getFreeSpace()}} a million 
times; an individual call to {{f = new File(path); f.getFreeSpace()}} takes on 
average 45 microseconds. By comparison, forking the {{df}} executable takes 
2.83 milliseconds, roughly 60 times longer. I don't think we need to worry 
about caching in JavaDF.
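
A comparison of this kind could be reproduced with a crude timing loop like 
the one below. This is a sketch, not the benchmark that produced the figures 
above: the class name {{DfTiming}}, the iteration counts, and the {{df -k}} 
invocation are assumptions, there is no JIT warm-up, and {{df}} is assumed to 
be on the PATH.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical timing harness; all names are illustrative.
class DfTiming {
    // Written on every iteration so the calls cannot be optimized away.
    static volatile long sink;

    /** Average microseconds per call, creating a new File object each time. */
    static double avgFreeSpaceMicros(String path, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            File f = new File(path);
            sink = f.getFreeSpace();
        }
        return (System.nanoTime() - start) / 1_000.0 / iterations;
    }

    /** Average milliseconds to fork "df -k path", drain its output, and wait. */
    static double avgForkDfMillis(String path, int iterations)
            throws IOException, InterruptedException {
        byte[] buf = new byte[4096];
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            Process p = new ProcessBuilder("df", "-k", path)
                    .redirectErrorStream(true)
                    .start();
            // Drain stdout so the child never blocks on a full pipe.
            try (InputStream in = p.getInputStream()) {
                while (in.read(buf) != -1) {
                    // discard output
                }
            }
            p.waitFor();
        }
        return (System.nanoTime() - start) / 1_000_000.0 / iterations;
    }

    public static void main(String[] args) throws Exception {
        String path = args.length > 0 ? args[0] : ".";
        System.out.printf("File.getFreeSpace(): %.2f us/call%n",
                avgFreeSpaceMicros(path, 1_000_000));
        System.out.printf("fork df:             %.2f ms/call%n",
                avgForkDfMillis(path, 100));
    }
}
```

Absolute numbers will vary with the machine, filesystem, and JVM; only the 
relative gap between the two approaches is of interest.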


> Use JDK 1.6 File APIs in DF.java wherever possible
> --------------------------------------------------
>
>                 Key: HADOOP-5958
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5958
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>            Reporter: Devaraj Das
>            Assignee: Aaron Kimball
>             Fix For: 0.21.0
>
>         Attachments: HADOOP-5958-hdfs.patch, HADOOP-5958-mapred.patch, 
> HADOOP-5958.2.patch, HADOOP-5958.3.patch, HADOOP-5958.patch
>
>
> JDK 1.6 has File APIs like File.getFreeSpace() which should be used instead 
> of spawning a command process to get the various disk/partition-related 
> attributes. This would avoid spikes in memory consumption by tasks when 
> things like LocalDirAllocator are used for creating paths on the filesystem.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.