Hi,

We're planning a JAXP project to address usability issues in the JAXP API. A common complaint about the API is the amount of code needed to implement a simple task: what should take one or two lines often takes ten or twelve instead. Consider the following example:

        File file = new File(FILEPATH + "results.xml");
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);

        XPathFactory xf = XPathFactory.newInstance();
        XPath xp = xf.newXPath();
        xp.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                return "http://example.com/users/";
            }

            @Override
            public String getPrefix(String namespaceURI) {
                return "ns1";
            }

            @Override
            public Iterator<String> getPrefixes(String namespaceURI) {
                ArrayList<String> list = new ArrayList<>();
                list.add("ns1");
                return list.iterator();
            }

        });
        try {
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document document = builder.parse(file);
            String s = (String) xp.evaluate(
                    "/ns1:Results/ns1:Row[ns1:USERID=2]/ns1:NAME[text()]",
                    document, XPathConstants.STRING);
            System.out.println("Company: " + s);
        } catch (ParserConfigurationException e) {
            //creating DocumentBuilder
        } catch (SAXException ex) {
            //parsing
        } catch (IOException ex) {
            //parsing
        } catch (XPathExpressionException ex) {
            //xpath evaluation
        }

The issues reflected in the above sample include:

*1. Too many abstractions*

As the sample shows, the DOM and XPath APIs involve multiple layers of abstraction: DocumentBuilderFactory, DocumentBuilder, and Document on the DOM side, plus XPathFactory and XPath.

*2. Unnecessary pluggability*

The pluggability layer makes it easy to plug in third-party implementations. However, in the many use cases where pluggability is not needed, the implementation lookup it performs can become a performance bottleneck for applications.

*3. Too many unrelated checked exceptions*

There are four unrelated checked exceptions in the above example, and none of them is recoverable. ParserConfigurationException actually reflects a design flaw in DocumentBuilderFactory, which allowed setting both a Schema and a Schema Language. In practice, developers often catch the generic Exception to avoid having to handle each of the checked exceptions separately.

*4. Lack of integration*

JAXP is an umbrella for several libraries, and the sample code above demonstrates the lack of integration among them. First, a DOM Document and an XPath have to be created separately. Second, as in the above case, the Document may be either namespace-aware or namespace-unaware depending on the setting of the DocumentBuilderFactory, and that setting is unknown to the XPath created later.
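
To make the second point concrete, here is a minimal sketch using only the existing API. The class name NamespaceMismatchDemo, the helper eval, and the inline XML are made up for illustration. The same unprefixed expression is evaluated against the same XML twice, and only the factory setting differs:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of JAXP.
final class NamespaceMismatchDemo {

    // Illustrative input: all elements live in a default namespace.
    static final String XML =
        "<Results xmlns=\"http://example.com/users/\"><NAME>Acme</NAME></Results>";

    // Parses XML with the given namespace-awareness, then evaluates the
    // expression with a plain XPath that has no NamespaceContext set.
    static String eval(boolean namespaceAware, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(namespaceAware);
            Document document = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            // The XPath object is created independently of the factory
            // above, so it cannot know how the document was parsed.
            return XPathFactory.newInstance().newXPath()
                    .evaluate(expression, document);
        } catch (Exception e) {
            // Demo only: collapse the checked exceptions into one unchecked one.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("aware:   [" + eval(true, "/Results/NAME") + "]");
        System.out.println("unaware: [" + eval(false, "/Results/NAME") + "]");
    }
}
```

With the JDK's built-in implementation, the namespace-aware run should come back empty, because the unprefixed names do not match elements in a namespace, while the namespace-unaware run should find the NAME element. Nothing in the XPath API signals which of the two situations the caller is in.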



This project may have two aspects: one is to provide APIs that get straight to objects such as a DOM Document, and the other is to provide convenience methods for some common use cases. (Note that there is no intention to replace the existing API or to duplicate all of its features.)

The example above could potentially be done in a couple of lines (this is just an illustration of how the existing APIs could be simplified and may not reflect what the final API would look like):

        String company =
            XMLDocument.fromFile(FILEPATH + "results.xml")
                .evalXPath("/Results/Row[USERID=2]/NAME[text()]");
        System.out.printf("Company: %s%n", company);
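
For discussion purposes, here is a rough sketch of how such a wrapper might be layered on top of the existing API. Everything in it is an assumption rather than a proposal: the class shape, the extra fromString method, the choice of unchecked exceptions, and in particular the decision to parse without namespace awareness so that unprefixed paths match.

```java
import java.io.File;
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical convenience wrapper; not an existing JAXP class.
final class XMLDocument {

    private final Document document;

    private XMLDocument(Document document) {
        this.document = document;
    }

    // Parses the file at the given path.
    static XMLDocument fromFile(String path) {
        return parse(new InputSource(new File(path).toURI().toString()));
    }

    // Parses XML held in a string (added here only to keep the sketch testable).
    static XMLDocument fromString(String xml) {
        return parse(new InputSource(new StringReader(xml)));
    }

    private static XMLDocument parse(InputSource source) {
        try {
            // Assumption: the factory default (not namespace-aware) is kept,
            // so unprefixed XPath names match regardless of declared namespaces.
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            return new XMLDocument(factory.newDocumentBuilder().parse(source));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Assumption: unrecoverable failures surface as unchecked exceptions.
            throw new IllegalStateException("Unable to parse XML", e);
        }
    }

    // Evaluates the expression against the document and returns the
    // result as a string.
    String evalXPath(String expression) {
        try {
            return XPathFactory.newInstance().newXPath()
                    .evaluate(expression, document);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(
                    "Invalid XPath expression: " + expression, e);
        }
    }
}
```

A sketch like this hides the factory and builder layers and the four checked exceptions, but it also hard-codes decisions (namespace handling, error reporting) that a real API would need to settle deliberately.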


We would love to hear from you. Any thoughts, concerns, experiences/complaints would be very welcome.

Thanks,
Joe
