Hi Folks,

I have implemented the Hadoop FileSystem abstract class for a storage system at work. The implementation uses config files that are similar in structure to the Hadoop config files: a *-default.xml with the default properties and a *-site.xml in which users can override them. In the class that implements FileSystem, I added these files as default resources in a static block, using Configuration.addDefaultResource("my-default.xml") and Configuration.addDefaultResource("my-site.xml"). This worked fine, and we were able to run the Hadoop FileSystem CLI and map-reduce jobs against our storage system.

However, when we tried to use this storage system in Pig scripts, we saw errors indicating that our configuration parameters were not available. On further debugging, we found that the config files had been added to the Configuration object, but as part of defaultResources rather than as regular resources. In Main.java in the Pig source, the Configuration object is created with Configuration conf = new Configuration(false); which sets loadDefaults to false on the conf object. As a result, properties from the default resources (including my config files) were not loaded and were therefore unavailable.
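To make the behaviour concrete, here is a minimal, self-contained sketch of what we think is happening. Note that MiniConfiguration is a hypothetical stand-in written for this mail, not Hadoop's Configuration class: it models each resource as an in-memory map instead of an XML file, and the property name my.storage.endpoint is made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-in for org.apache.hadoop.conf.Configuration; NOT the real class.
// Each "resource" is an in-memory map rather than an XML file on the classpath.
class MiniConfiguration {
    // Shared by all instances, like resources registered via addDefaultResource.
    private static final List<Map<String, String>> DEFAULT_RESOURCES = new ArrayList<>();

    // Per-instance resources, like those registered via addResource.
    private final List<Map<String, String>> resources = new ArrayList<>();
    private final boolean loadDefaults;

    MiniConfiguration(boolean loadDefaults) {
        this.loadDefaults = loadDefaults;
    }

    static void addDefaultResource(Map<String, String> resource) {
        DEFAULT_RESOURCES.add(resource);
    }

    void addResource(Map<String, String> resource) {
        resources.add(resource);
    }

    // Returns the property value, or null when no loaded resource defines it.
    String get(String name) {
        Map<String, String> merged = new HashMap<>();
        if (loadDefaults) {
            // Default resources are consulted only when loadDefaults is true.
            for (Map<String, String> r : DEFAULT_RESOURCES) {
                merged.putAll(r);
            }
        }
        // Instance resources are always consulted and override the defaults.
        for (Map<String, String> r : resources) {
            merged.putAll(r);
        }
        return merged.get(name);
    }

    public static void main(String[] args) {
        Map<String, String> myDefaults = new HashMap<>();
        myDefaults.put("my.storage.endpoint", "example-host"); // hypothetical property

        // What our FileSystem class did in its static block.
        MiniConfiguration.addDefaultResource(myDefaults);

        MiniConfiguration withDefaults = new MiniConfiguration(true);
        MiniConfiguration withoutDefaults = new MiniConfiguration(false); // as in Pig's Main.java
        System.out.println(withDefaults.get("my.storage.endpoint"));
        System.out.println(withoutDefaults.get("my.storage.endpoint"));

        // Registering the same resource on the instance makes it visible again.
        withoutDefaults.addResource(myDefaults);
        System.out.println(withoutDefaults.get("my.storage.endpoint"));
    }
}
```

In this model, the instance created with loadDefaults set to false returns null for the property until the resource is added to that instance directly, which matches what we observed with Pig.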
We solved the problem by using Configuration.addResource instead of Configuration.addDefaultResource, but we still could not figure out why Pig does not load the default resources. Could someone on the list explain why this is the case?

Thanks,
-- Bhooshan