Hi,

I am running the latest version of Nutch. While crawling one particular
site, I get an AbstractMethodError in the cyberneko plugin for all of its
pages during the fetch step.
As I understand it, this error comes from a difference between the runtime
and compile-time versions of a class. However, I am running it afresh after
an ant clean.

Any suggestions would be helpful. By the way, I am using Java version
"1.6.0_18" in a Windows environment.
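
One way to check for such a version mismatch is to print which jar each of the two classes at the top of the stack trace is loaded from. The following is only a minimal sketch, assuming it is run with the same classpath as the Nutch job. The class name JarLocator and the method locate are made up for illustration and are not part of Nutch or NekoHTML.

```java
import java.security.CodeSource;

// Hypothetical helper, not part of Nutch: reports where a class was loaded from.
class JarLocator {

    /** Returns a one-line description of the origin of the named class. */
    static String locate(String className) {
        try {
            // initialize=false: load the class without running its static initializers
            Class<?> c = Class.forName(className, false, JarLocator.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // Classes loaded by the bootstrap loader have no code source
            return className + " -> "
                    + (src == null ? "bootstrap/JDK classes" : String.valueOf(src.getLocation()));
        } catch (ClassNotFoundException e) {
            return className + " -> not found on classpath";
        } catch (LinkageError e) {
            return className + " -> failed to load: " + e;
        }
    }

    public static void main(String[] args) {
        // The two classes involved in the AbstractMethodError, taken from the trace
        String[] names = {
            "org.apache.xerces.xni.parser.XMLParseException",
            "org.cyberneko.html.HTMLScanner"
        };
        for (String name : names) {
            System.out.println(locate(name));
        }
    }
}
```

If the two locations point at jars from unrelated releases, or if more than one Xerces jar is on the classpath, that would be consistent with the compile/runtime difference described above.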


java.lang.AbstractMethodError: org.cyberneko.html.HTMLScanner.getCharacterOffset()I
        at org.apache.xerces.xni.parser.XMLParseException.<init>(Unknown Source)
        at org.cyberneko.html.HTMLConfiguration$ErrorReporter.createException(HTMLConfiguration.java:673)
        at org.cyberneko.html.HTMLConfiguration$ErrorReporter.reportError(HTMLConfiguration.java:662)
        at org.cyberneko.html.HTMLScanner$ContentScanner.scanAttribute(HTMLScanner.java:2404)
        at org.cyberneko.html.HTMLScanner$ContentScanner.scanAttribute(HTMLScanner.java:2360)
        at org.cyberneko.html.HTMLScanner$ContentScanner.scanStartElement(HTMLScanner.java:2267)
        at org.cyberneko.html.HTMLScanner$ContentScanner.scan(HTMLScanner.java:1820)
        at org.cyberneko.html.HTMLScanner.scanDocument(HTMLScanner.java:789)
        at org.cyberneko.html.HTMLConfiguration.parse(HTMLConfiguration.java:478)
        at org.cyberneko.html.HTMLConfiguration.parse(HTMLConfiguration.java:431)
        at org.cyberneko.html.parsers.DOMFragmentParser.parse(DOMFragmentParser.java:164)
        at org.apache.nutch.parse.html.HtmlParser.parseNeko(HtmlParser.java:249)
        at org.apache.nutch.parse.html.HtmlParser.parse(HtmlParser.java:212)
        at org.apache.nutch.parse.html.HtmlParser.getParse(HtmlParser.java:145)
        at org.apache.nutch.parse.ParseUtil.parse(ParseUtil.java:82)
        at org.apache.nutch.fetcher.Fetcher$FetcherThread.output(Fetcher.java:879)
        at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:646)
