Hi,
in the fetcher (line 192), when the status is NOTMODIFIED we collect
null as the content, but we already have the content.
I'm worried about what happens to a page that does not change for 60
days, since the concept of Nutch is to delete segments that are older
than "db.default.fetch.interval", isn't it?
If this is true, maybe someone with write access can change null to
content.
Thanks for any comments.
Stefan
_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers