Chirag Chaman wrote:
Andrzej,
Thank you -- and here we were going nuts thinking the problem might have
been with the plugin!
Would it be possible to post the patch file once you have made the
changes? Our version of Nutch differs from the one in SVN.
I suggest keeping a vanilla version around and porting diffs to your
tree; otherwise your version will drift further and further out of sync.
The change itself is trivial (available as 'svn diff -r 179640
DOMContentUtils.java'):
Index: DOMContentUtils.java
===================================================================
--- DOMContentUtils.java (revision 179640)
+++ DOMContentUtils.java (working copy)
@@ -102,25 +102,9 @@
boolean abortOnNestedAnchors,
int anchorDepth) {
if ("script".equalsIgnoreCase(node.getNodeName())) {
- Node n = node.getAttributes().getNamedItem("language");
- if (n != null) {
- String text = n.getNodeValue();
- sb.append(text);
- }
return false;
}
if ("style".equalsIgnoreCase(node.getNodeName())) {
- Node n = node.getAttributes().getNamedItem("rel");
- if (n != null) {
- String text = n.getNodeValue();
- sb.append(text);
- }
- n = node.getAttributes().getNamedItem("type");
- if (n != null) {
- String text = n.getNodeValue();
- if (sb.length() > 0) sb.append(", ");
- sb.append(text);
- }
return false;
}
if (abortOnNestedAnchors &&
"a".equalsIgnoreCase(node.getNodeName())) {
Thanks again.
You're welcome.
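For readers without the Nutch tree at hand, the effect of the patch above can be illustrated with a minimal standalone sketch. This is not the Nutch code: the class name SkipScriptStyleSketch and the methods collectText and extractText are hypothetical, and the JDK's XML parser stands in for the HTML parser that Nutch actually uses. It only shows the behaviour the patch leaves behind, namely that "script" and "style" nodes contribute nothing to the extracted text, not even their attribute values.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration only; names are not taken from Nutch.
public class SkipScriptStyleSketch {

  /** Appends the text under node to sb, ignoring script and style subtrees. */
  public static void collectText(StringBuilder sb, Node node) {
    String name = node.getNodeName();
    // As in the patched code: return without appending attribute values
    // such as language="..." or type="...".
    if ("script".equalsIgnoreCase(name) || "style".equalsIgnoreCase(name)) {
      return;
    }
    if (node.getNodeType() == Node.TEXT_NODE) {
      String text = node.getNodeValue().trim();
      if (!text.isEmpty()) {
        if (sb.length() > 0) sb.append(' ');
        sb.append(text);
      }
    }
    NodeList children = node.getChildNodes();
    for (int i = 0; i < children.getLength(); i++) {
      collectText(sb, children.item(i));
    }
  }

  /** Parses a well-formed XHTML string and returns its text content. */
  public static String extractText(String xhtml) throws Exception {
    DocumentBuilder builder =
        DocumentBuilderFactory.newInstance().newDocumentBuilder();
    Document doc = builder.parse(new InputSource(new StringReader(xhtml)));
    StringBuilder sb = new StringBuilder();
    collectText(sb, doc.getDocumentElement());
    return sb.toString();
  }

  public static void main(String[] args) throws Exception {
    String page = "<html><head>"
        + "<script language=\"JavaScript\">var x = 1;</script>"
        + "<style type=\"text/css\">p { color: red; }</style>"
        + "</head><body><p>Hello</p><p>world</p></body></html>";
    System.out.println(extractText(page)); // prints "Hello world"
  }
}
```

Before the patch, the equivalent of this sample page would also have leaked "JavaScript" and "text/css" into the extracted text.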
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com
_______________________________________________
Nutch-developers mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-developers