Sure, I don't see why not.
pages: <url, <status, contentHash, lastFetchDate, numFailures> >
Is this list of storable fields extendable by plugins?
Great. I was not sure about that.
I am not exactly sure at the moment. Currently I think I would store the length of the document in one field.
For example it might be interesting to monitor changes on websites and prefer more up-to-date pages in ranking.
So you'd add a lastChangedDate?
So I could calculate the size of a change from the difference in length when the page is fetched again. There may be better ways to measure the size of a change, but I am not familiar with them yet.
Second, I would store a value describing how frequently the page changes.
If the page changes by more than 10% or 10 words in length, I would increment this value; otherwise I would decrement it.
I would use this value to influence ranking, so that frequently changing pages are preferred.
So, 2 key/value pairs should be enough.
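
To make the idea concrete, here is a rough sketch of how the two values could be updated on each refetch. It is only an illustration in Python, not Nutch code: the function name update_page_metadata, the keys "length" and "changeScore", and the treatment of the first fetch (store the length, leave the score at 0) are all assumptions of mine.

```python
def update_page_metadata(metadata, new_length, rel_threshold=0.10, abs_threshold=10):
    """Return a copy of the per-page metadata updated after a (re)fetch.

    metadata   -- dict holding the two hypothetical keys "length" (document
                  length in words) and "changeScore" (change-frequency value)
    new_length -- length in words of the freshly fetched document
    """
    old_length = metadata.get("length")
    score = metadata.get("changeScore", 0)

    # On the first fetch there is nothing to compare against (assumption:
    # the score is left untouched in that case).
    if old_length is not None:
        diff = abs(new_length - old_length)
        # "Changed" means the length differs by more than 10 words
        # or by more than 10% of the previous length.
        changed = diff > abs_threshold or (
            old_length > 0 and diff / old_length > rel_threshold
        )
        score += 1 if changed else -1

    return {**metadata, "length": new_length, "changeScore": score}
```

A ranking step could then read "changeScore" and boost pages with a higher value. Whether the score should be clamped, for example so that it never drops below zero, is left open here.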
But storing the lastChangedDate could also be interesting.
Matthias
