Three possible bugs here, and one fix.

We have been using JXPath 1.1 to extract elements from webpages, parsed
into Document format by JTidy (or sometimes NekoHTML).  JXPath is great.
=)

But when running JXPath at the same time in many different threads, we
sometimes get:

java.util.ConcurrentModificationException
        at java.util.HashMap$HashIterator.nextEntry(HashMap.java:782)
        at java.util.HashMap$EntryIterator.next(HashMap.java:824)
        at org.apache.commons.jxpath.ri.JXPathContextReferenceImpl.cleanupCache(JXPathContextReferenceImpl.java:270)
        at org.apache.commons.jxpath.ri.JXPathContextReferenceImpl.compileExpression(JXPathContextReferenceImpl.java:252)
        at org.apache.commons.jxpath.ri.JXPathContextReferenceImpl.getValue(JXPathContextReferenceImpl.java:283)

At the moment we avoid this problem by synchronizing all our calls to
JXPath so that they never execute concurrently.  Maybe some
synchronization should be introduced inside the JXPath library (e.g.
around compileExpression or cleanupCache) to prevent this from
happening in general.
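
For anyone wanting the same workaround, here is a minimal sketch of it.
SerializedXPath is a hypothetical helper of ours, not part of JXPath, and
it assumes the expression cache seen in the trace is shared between
contexts, so it uses one lock for the whole process.  The evaluator is
passed in as a function, which keeps the sketch independent of JXPath.

```java
import java.util.function.BiFunction;

/**
 * Hypothetical helper (not part of JXPath): funnels every XPath
 * evaluation through one shared lock so no two run concurrently.
 */
final class SerializedXPath {
    // One lock for the whole process.  Assumption: the cache that
    // cleanupCache() iterates over is shared across contexts, so a
    // per-context lock would not be enough.
    private static final Object LOCK = new Object();

    // Takes (document root, XPath string) and returns the result.
    private final BiFunction<Object, String, Object> evaluator;

    SerializedXPath(BiFunction<Object, String, Object> evaluator) {
        this.evaluator = evaluator;
    }

    Object getValue(Object root, String xpath) {
        synchronized (LOCK) {
            return evaluator.apply(root, xpath);
        }
    }

    public static void main(String[] args) {
        // Stand-in evaluator for demonstration only; it just echoes
        // its arguments instead of calling JXPath.
        SerializedXPath sx =
            new SerializedXPath((root, xpath) -> root + " -> " + xpath);
        System.out.println(sx.getValue("doc", "//p"));
    }
}
```

With JXPath on the classpath the evaluator would presumably be
(root, xpath) -> JXPathContext.newContext(root).getValue(xpath).
This costs all parallelism in XPath evaluation, which is why a fix
inside the library would be preferable.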



Another issue we encountered: the "last()" function throws a
NullPointerException every time we try to use it:

java.lang.NullPointerException
        at org.apache.commons.jxpath.ri.model.dom.DOMNodeIterator.previous(DOMNodeIterator.java:131)
        at org.apache.commons.jxpath.ri.model.dom.DOMNodeIterator.setPosition(DOMNodeIterator.java:121)
        at org.apache.commons.jxpath.ri.axes.ChildContext.setPosition(ChildContext.java:152)
        at org.apache.commons.jxpath.ri.compiler.CoreFunction.functionLast(CoreFunction.java:335)

We have been working around this by doing "<path>[count(<path>)]", but
it would be nice to fix "last()" so that it always works.  (See fix
below.)
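
A minimal sketch of how we build that workaround expression.  The class
and method names are hypothetical, and it only does string
concatenation; it assumes <path> is an absolute path whose matches are
all children of one parent, which is how we use it.

```java
/** Hypothetical helper: builds "<path>[count(<path>)]" in place of "<path>[last()]". */
final class LastWorkaround {
    static String lastOf(String path) {
        // count(<path>) is evaluated as its own expression, so
        // last() is never called.
        return path + "[count(" + path + ")]";
    }

    public static void main(String[] args) {
        System.out.println(lastOf("/html/body/p"));
    }
}
```

For paths that match nodes under several different parents (e.g.
"//p"), this is probably not equivalent to "[last()]", since the
predicate is applied per parent while the count is over the whole
document.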



I thought I might poke around in the code and try to fix this, but
before I started I tried to upgrade to JXPath 1.2.

Unfortunately, I encountered a new problem with this upgrade.  The
XPaths returned to me no longer referred to HTML elements (e.g.
"/html[1]/body[1]/p[3]/br[2]").  Searches for "//p" were now failing,
and a search for "//*" revealed that the XPath results now looked like:
"/node[1]/node[1]/node[5]/node[4]".

I tried again with the source code in subversion, and the results were
the same.

Do you have any idea how to fix this problem?



OK well back to the fix for "last()".  I just added this second
"context.reset()" in JXPath 1.1's
org.apache.commons.jxpath.ri.compiler.CoreFunction.functionLast():

    protected Object functionLast(EvalContext context) {
        assertArgCount(0);
        // Move the position to the beginning and iterate through
        // the context to count nodes.
        int old = context.getCurrentPosition();
        context.reset();
        int count = 0;
        while (context.nextNode()) {
            count++;
        }

        // Restore the current position.
        // First reset, since the counting has probably sent us off the
        // end of the node list.
        context.reset();
        if (old != 0) {
            context.setPosition(old);
        }
        return new Double(count);
    }

Now using "last()" no longer causes an Exception.  :)

If this seems a valid fix, you may wish to add it to your latest
source.

Thanks, Joey.

