I am using FileSystem.get(URI uri, Configuration conf, String user) to create
instances of a FileSystem implementation (LocalFileSystem in this case). From
what I know, FileSystem internally keeps a cache of these objects keyed on URI
and user, so if I call FileSystem.get(..) multiple times with the same URI and
user, only one LocalFileSystem instance should be created and cached. However,
I observed (with hadoop-core-1.0.0) that each call creates a new instance of
LocalFileSystem and puts it in the cache, leading to memory issues.
 
Please see the code below and let me know if I am doing
something wrong.
 
Thanks
 
 
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class FileSystemCacheIssue {

    private static FileSystem getFileSystem(String user) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.default.name", "file:///");
        return FileSystem.get(new URI("file:///"), conf, user);
    }

    public static void main(String[] args) throws Exception {
        for (int i = 0; i < 1000; i++) {
            getFileSystem("himanshg");
        }

        FileSystem fs = getFileSystem("himanshg");
        System.out.println(fs.getClass().getCanonicalName());

        // put a breakpoint here and look at the heap dump for number of
        // LocalFileSystem instances. Ideally I expect it to be 1, but there are 1001
        System.out.println("Keep your debugger here and check.");
    }
}
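
To make the expectation concrete, here is a minimal sketch of a cache keyed on URI and user that uses only the Java standard library. It is not Hadoop's implementation: FsCacheSketch, ValueKey, IdentityKey and cachedInstances are hypothetical names made up for this illustration, and a plain Object stands in for a file system instance. It shows the behavior I expect (one cached instance for 1001 lookups with an equal key), and it shows how a key whose equality is identity-based would, in general, give one entry per call. Whether something like that is what happens in hadoop-core-1.0.0 is an assumption I have not verified.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for a cache keyed on (uri, user); NOT Hadoop's code.
final class FsCacheSketch {

    // Key with value-based equality: keys built from an equal uri and user are equal.
    static final class ValueKey {
        final URI uri;
        final String user;

        ValueKey(URI uri, String user) {
            this.uri = uri;
            this.user = user;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof ValueKey)) {
                return false;
            }
            ValueKey other = (ValueKey) o;
            return uri.equals(other.uri) && user.equals(other.user);
        }

        @Override
        public int hashCode() {
            return Objects.hash(uri, user);
        }
    }

    // Key that inherits Object's identity-based equals/hashCode:
    // every newly constructed key is distinct from all others.
    static final class IdentityKey {
        final URI uri;
        final String user;

        IdentityKey(URI uri, String user) {
            this.uri = uri;
            this.user = user;
        }
    }

    // Performs `calls` lookups with a freshly built key each time and
    // returns how many instances ended up in the cache.
    static int cachedInstances(int calls, boolean valueEquality) {
        Map<Object, Object> cache = new HashMap<>();
        URI uri = URI.create("file:///");
        for (int i = 0; i < calls; i++) {
            Object key = valueEquality
                    ? new ValueKey(uri, "himanshg")
                    : new IdentityKey(uri, "himanshg");
            // Object stands in for a file system instance.
            cache.computeIfAbsent(key, k -> new Object());
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println(cachedInstances(1001, true));   // prints 1
        System.out.println(cachedInstances(1001, false));  // prints 1001
    }
}
```

The first number is the count I expected to see in the heap dump; the second matches what I actually observe.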
