Reloading configuration when using inputstream resources results in org.xml.sax.SAXParseException
-------------------------------------------------------------------------------------------------

                 Key: HADOOP-7614
                 URL: https://issues.apache.org/jira/browse/HADOOP-7614
             Project: Hadoop Common
          Issue Type: Bug
          Components: conf
    Affects Versions: 0.21.0
            Reporter: Ferdy
            Priority: Minor

When an inputstream is used as a configuration resource, reloading the configuration throws the following exception:

Exception in thread "main" java.lang.RuntimeException: org.xml.sax.SAXParseException: Premature end of file.
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1576)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1445)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1381)
    at org.apache.hadoop.conf.Configuration.get(Configuration.java:569)
    ...
Caused by: org.xml.sax.SAXParseException: Premature end of file.
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:249)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:284)
    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:124)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1504)
    ... 4 more

To reproduce, see the following test code:

    Configuration conf = new Configuration();
    ByteArrayInputStream bais = new ByteArrayInputStream("<configuration></configuration>".getBytes());
    conf.addResource(bais);
    // First lookup: the stream is read once, so this works.
    System.out.println(conf.get("blah"));
    // Just add a named resource, doesn't matter which one; this triggers a reload.
    conf.addResource("core-site.xml");
    // Second lookup: the already-consumed stream is read again and the exception is thrown.
    System.out.println(conf.get("blah"));

Allowing inputstream resources is flexible, but in cases such as this it can lead to difficult-to-debug problems. What do you think is the best solution? We could:

A) reset the inputstream after it is read instead of closing it (but what to do when the stream does not support marking?)
B) leave it up to the client (for example, make sure you implement close() so that it resets the stream)
C) when reading the inputstream for the first time, cache or wrap the contents somehow so that it can be read multiple times (let's at least document it)
D) remove the inputstream method altogether
E) something else?

For now I have attached a patch for solution A.
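
To make option C a bit more concrete, here is a rough sketch of what "cache the contents" could look like. It is not the attached patch and not Hadoop code: ReplayableResource is a hypothetical helper name, and the sketch only assumes that the stream is small enough to hold in memory. The stream is drained once into a byte array, and every later load gets a fresh stream over those bytes, so it does not matter whether the original stream supports mark/reset.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper (not part of Hadoop): buffers a stream once so that
// every later load can read the same bytes again.
class ReplayableResource {
    private final byte[] contents;

    ReplayableResource(InputStream in) {
        this.contents = readFully(in);
    }

    // Drains and closes the caller's stream; this happens exactly once.
    private static byte[] readFully(InputStream in) {
        try (InputStream source = in) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = source.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns a fresh stream positioned at the start of the cached bytes.
    ByteArrayInputStream open() {
        return new ByteArrayInputStream(contents);
    }

    public static void main(String[] args) throws Exception {
        InputStream original = new ByteArrayInputStream(
                "<configuration></configuration>".getBytes(StandardCharsets.UTF_8));
        ReplayableResource resource = new ReplayableResource(original);

        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // Parse twice, as a configuration reload would; each parse gets its own stream.
        for (int i = 0; i < 2; i++) {
            Document doc = builder.parse(resource.open());
            System.out.println(doc.getDocumentElement().getNodeName());
        }
    }
}
```

The trade-off compared to option A is memory: the whole resource stays in memory for as long as the configuration holds on to it, but the behaviour no longer depends on InputStream.markSupported().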