On 24/09/11 15:48, Harsh J wrote:
There are specific derivatives of Configuration class that each read
certain *-site.xml files. This is because the XML files are service
specific.


I'm confused now.

My understanding is that once a default configuration file has been pushed onto the list via Configuration.addDefaultResource(), every Configuration instance created after that picks up the config, whether it is a plain Configuration or a subclass thereof.

For example, JobConf explicitly adds the MR files:

  static{
    Configuration.addDefaultResource("mapred-default.xml");
    Configuration.addDefaultResource("mapred-site.xml");
  }


If the resource hasn't been added already, adding it triggers a reload of all existing configurations that have the loadDefaults flag set:

/* in org.apache.hadoop.conf.Configuration */

  public static synchronized void addDefaultResource(String name) {
    if(!defaultResources.contains(name)) {
      defaultResources.add(name);
      for(Configuration conf : REGISTRY.keySet()) {
        if(conf.loadDefaults) {
          conf.reloadConfiguration();
        }
      }
    }
  }

Configuration.loadDefaults is true unless you construct an instance with new Configuration(false); the flag is copied across when you create a new Configuration instance from another.
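To make the mechanism concrete, here is a toy model of it. This is not the Hadoop source: MiniConf and loadedResources() are names I made up for illustration, and the class only tracks resource names rather than parsing any XML.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;

// Toy stand-in for Configuration; hypothetical, not the Hadoop class.
class MiniConf {
    private static final List<String> defaultResources = new ArrayList<String>();
    // Weak keys, so registering an instance does not keep it alive.
    private static final Map<MiniConf, Object> REGISTRY = new WeakHashMap<MiniConf, Object>();

    private final boolean loadDefaults;
    private List<String> loaded = new ArrayList<String>();

    MiniConf() {
        this(true);
    }

    MiniConf(boolean loadDefaults) {
        this.loadDefaults = loadDefaults;
        synchronized (MiniConf.class) {
            REGISTRY.put(this, null); // every instance is registered, flag or not
        }
        reload();
    }

    // Copy constructor: the flag travels with the copy.
    MiniConf(MiniConf other) {
        this(other.loadDefaults);
    }

    static synchronized void addDefaultResource(String name) {
        if (!defaultResources.contains(name)) {
            defaultResources.add(name);
            for (MiniConf conf : REGISTRY.keySet()) {
                if (conf.loadDefaults) {
                    conf.reload();
                }
            }
        }
    }

    private void reload() {
        synchronized (MiniConf.class) {
            if (loadDefaults) {
                loaded = new ArrayList<String>(defaultResources);
            } else {
                loaded = new ArrayList<String>();
            }
        }
    }

    List<String> loadedResources() {
        return Collections.unmodifiableList(loaded);
    }
}
```

With this model, an instance built before addDefaultResource("mapred-site.xml") and one built after it both end up listing that resource, while new MiniConf(false) and any copy made from it stay empty.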

The way the constructor adds all Configuration instances to the static (weak ref) REGISTRY map is inefficient, as the loadDefaults flag is only ever set at construction time in the ASF codebase; it would be better to make that flag static and only register instances with loadDefaults = true.

Now, for some extra fun, Configuration.reloadConfiguration() is not final. That allows subclasses to override it, and the override can then run before the subclass's own static construction/initialisation is fully complete. I know this as I have done it, and would not recommend it to anyone. You can end up in that weird world of class initialisation time stack traces.
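A stripped-down sketch of how that can happen, using hypothetical classes (BaseConf and LiveConf are invented for this example and are not my real code or Hadoop's). The subclass creates an instance and registers a default resource from its own static initialiser, so the overridden reload runs while a later static field is still null:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base class standing in for Configuration.
class BaseConf {
    private static final List<BaseConf> REGISTRY = new ArrayList<BaseConf>();

    BaseConf() {
        REGISTRY.add(this);
    }

    static void addDefaultResource(String name) {
        // Reload every registered instance, as the real method does.
        for (BaseConf conf : REGISTRY) {
            conf.reloadConfiguration();
        }
    }

    // Not final, so subclasses may override it.
    void reloadConfiguration() {
    }
}

// Hypothetical subclass whose static initialisation triggers a reload of itself.
class LiveConf extends BaseConf {
    // Step 1 of static init: an instance exists and is already registered.
    static final LiveConf SHARED = new LiveConf();

    // Step 2: registering a resource reloads SHARED right now.
    static {
        BaseConf.addDefaultResource("live-site.xml");
    }

    // Step 3: only assigned after the static block above has finished.
    static final List<String> SOURCES = new ArrayList<String>();

    // No initialiser, so a value set during step 2 is not overwritten.
    static boolean reloadedTooEarly;

    @Override
    void reloadConfiguration() {
        if (SOURCES == null) {
            reloadedTooEarly = true; // a real override would hit an NPE here
        }
    }
}
```

After LiveConf has been initialised, reloadedTooEarly is true even though SOURCES is non-null by then, which is exactly the kind of ordering bug that turns into stack traces at class initialisation time.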

To clean up Configuration, then, I would
 -make reloadConfiguration final
 -make loadDefaults static
 -only add confs to REGISTRY if loadDefaults = true
 -add some debug strings to see what's going on/wrong.
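Roughly what I have in mind, as a toy class rather than a patch against Configuration. TidyConf, registeredCount() and reloadCount() are invented names. The sketch covers the final method, the conditional registration and the debug output; loadDefaults stays a per-instance final field here purely to keep the example small.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.logging.Logger;

// Hypothetical cleaned-up variant; not the Hadoop class.
class TidyConf {
    private static final Logger LOG = Logger.getLogger(TidyConf.class.getName());
    private static final List<String> defaultResources = new ArrayList<String>();
    private static final Map<TidyConf, Object> REGISTRY = new WeakHashMap<TidyConf, Object>();

    private final boolean loadDefaults;
    private int reloadCount;

    TidyConf(boolean loadDefaults) {
        this.loadDefaults = loadDefaults;
        // Only instances that want defaults are tracked for reloads.
        if (loadDefaults) {
            synchronized (TidyConf.class) {
                REGISTRY.put(this, null);
            }
        }
    }

    static synchronized void addDefaultResource(String name) {
        if (defaultResources.contains(name)) {
            LOG.fine("default resource already registered: " + name);
            return;
        }
        defaultResources.add(name);
        LOG.fine("added default resource " + name + ", reloading "
                + REGISTRY.size() + " configuration(s)");
        // No flag check needed: everything in REGISTRY loads defaults.
        for (TidyConf conf : REGISTRY.keySet()) {
            conf.reloadConfiguration();
        }
    }

    // final, so a subclass cannot hook into the reload.
    final void reloadConfiguration() {
        reloadCount++;
    }

    static synchronized int registeredCount() {
        return REGISTRY.size();
    }

    int reloadCount() {
        return reloadCount;
    }
}
```

An instance created with new TidyConf(false) never enters the registry and is never reloaded, so the loop in addDefaultResource only touches instances that can actually change.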

This would break my code, but that's OK. What I did was not something I'd recommend to anyone else, and that class of mine is now marked as @Deprecated in my own codebase, as it was more trouble than it was worth. What was it trying to do? Get a live config from a Configuration Management service, and retain that bonding to the CM infrastructure even when cloned. This stops working once you start serializing/deserializing them, so it's not worth the hassle.
