I uploaded a new patch on NUTCH-1253 for this. It would be greatly
appreciated if someone could look into it, as it seems that the
TestDOMContentUtils tests are all broken following the neko version
upgrade!

https://issues.apache.org/jira/browse/NUTCH-1253

Thank you very much.

Best
Lewis

On Wed, Feb 6, 2013 at 9:35 AM, Lewis John Mcgibbney <
[email protected]> wrote:

> Hi,
> Two observations here
> 1) Did you try any versions more recent than 1.9.12? I assume you are
> talking about the net.sourceforge.nekohtml groupId artifact [0] as opposed
> to the nekohtml groupId artifact [1]?
> 2) We need to completely update the nekohtml dependency altogether. We
> currently use a completely outdated artifact [2] which was rather
> embarrassingly released in 2005!
> Great work, and great persistence; this is something which we definitely
> need to address.
> Thanks
> Lewis
>
> [0]
> http://search.maven.org/#search|gav|1|g%3A%22net.sourceforge.nekohtml%22%20AND%20a%3A%22nekohtml%22
> [1]
> http://search.maven.org/#search|gav|1|g%3A%22nekohtml%22%20AND%20a%3A%22nekohtml%22
> [2] http://search.maven.org/#artifactdetails|nekohtml|nekohtml|0.9.5|jar
>
>
> On Wed, Feb 6, 2013 at 8:28 AM, mbehlok <[email protected]> wrote:
>
>> I fixed it; the nutch source comes with an outdated nekohtml.jar. I went
>> through many neko versions by trial and error until this one worked for me:
>>
>> nekohtml-1.9.12.tar.gz
>>
>> mitch
>>
>>
>>
>>
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/Parsing-error-java-lang-NoClassDefFoundError-org-cyberneko-html-LostText-tp4029958p4038809.html
>> Sent from the Nutch - User mailing list archive at Nabble.com.
>>
>
>
>
> --
> *Lewis*
>



-- 
*Lewis*
