Timothy Potter created SOLR-6305:
------------------------------------

             Summary: Ability to set the replication factor for index files created by HDFSDirectoryFactory
                 Key: SOLR-6305
                 URL: https://issues.apache.org/jira/browse/SOLR-6305
             Project: Solr
          Issue Type: Improvement
          Components: hdfs
         Environment: hadoop-2.2.0
            Reporter: Timothy Potter

HdfsFileWriter doesn't allow us to create files in HDFS with a replication factor different from the configured DFS default, because it uses:

{{FsServerDefaults fsDefaults = fileSystem.getServerDefaults(path);}}

Since we have two forms of replication going on when using HDFSDirectoryFactory (Solr's own replicas and HDFS block replication), it would be nice to be able to set the HDFS replication factor for the Solr directories to a lower value than the default. I realize this might reduce the chance of data locality, but since each Solr core has its own path in HDFS, we should give operators the option to reduce it.

My original thinking was to just use Hadoop setrep to customize the replication factor, but that's a one-time operation and doesn't affect files created afterwards. For instance, I did:

{{hadoop fs -setrep -R 1 solr49/coll1}}

(My default DFS replication is set to 3; I'm setting it to 1 just as an example.)

Then I added some more docs to coll1 and did:

{{hadoop fs -stat %r solr49/hdfs1/core_node1/data/index/segments_3}}

3 <-- should be 1

So it looks like new files don't inherit the replication factor from their parent directory.

Not sure if we need to go as far as allowing a different replication factor per collection, but that should be considered if possible.

I looked at the Hadoop 2.2.0 code to see if there was a way to work through this using the Configuration object, but nothing jumped out at me ... and the implementation for getServerDefaults(path) is just:

  public FsServerDefaults getServerDefaults(Path p) throws IOException {
    return getServerDefaults();
  }

Path is ignored ;-)
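
To make the requested option concrete, here is a minimal sketch of how a writer might choose between an operator-supplied replication factor and the server default. It is only an illustration under stated assumptions: the class name ReplicationChoice and the property name solr.hdfs.replication are hypothetical and do not exist in Solr or Hadoop, and the Hadoop calls are mentioned only in comments so the example runs without Hadoop on the classpath.

```java
import java.util.Properties;

// Sketch only: picks the replication factor to use for a new index file.
class ReplicationChoice {
    // Hypothetical property name, invented for this example; it is not an
    // existing Solr or Hadoop setting.
    static final String OVERRIDE_KEY = "solr.hdfs.replication";

    // Returns the operator-supplied override when one is set, otherwise the
    // server default (for example the value of FsServerDefaults.getReplication()).
    static short resolve(Properties props, short serverDefault) {
        String raw = props.getProperty(OVERRIDE_KEY);
        if (raw == null || raw.trim().isEmpty()) {
            return serverDefault;
        }
        short value;
        try {
            value = Short.parseShort(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(
                OVERRIDE_KEY + " must be a small integer: " + raw, e);
        }
        if (value < 1) {
            throw new IllegalArgumentException(
                OVERRIDE_KEY + " must be >= 1: " + raw);
        }
        return value;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        short dfsDefault = 3;
        System.out.println(resolve(props, dfsDefault)); // no override set
        props.setProperty(OVERRIDE_KEY, "1");
        System.out.println(resolve(props, dfsDefault)); // override applied
    }
}
```

The resolved value could then be passed as the replication argument of FileSystem.create(Path f, boolean overwrite, int bufferSize, short replication, long blockSize), which accepts a per-file replication factor, instead of always taking the value from getServerDefaults(path).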