On Mon, Nov 01, 2004 at 01:25:34PM -0800, Andrew Chen wrote:
> IIRC, there's no field for date in the DB entries though - so I assume
> I would use DocId as a proxy for the date, since a higher one would
> indicate it was crawled later?
> 
> I'm holding the "Last Modified" http field in the ParseData metadata,
> but I assume there's no efficient way to use one of these parsedata
> fields in the comparable.

Currently, the index-more plugin saves "Last Modified" as a long in a
Field.UnIndexed() field, so it is stored but not searchable. You need to
make it indexed first and then write a query plugin to search on it. I
planned to do a query-more plugin (for content-type and content-length as
well as last-modified), but never got around to it and won't have time
soon. It would be great if you could contribute along those lines.
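
To illustrate the "make it indexed" part: index terms are strings and are
compared lexicographically, so a long timestamp has to be encoded so that
string order matches numeric order before range searches or sorting on it
can work. Below is a rough sketch of one such encoding, not the plugin's
actual code. The class LastModifiedTerm and its methods are hypothetical
names, and it deliberately uses no Lucene or Nutch calls. The assumption is
that the encoded string would then be added as an indexed, untokenized field
instead of Field.UnIndexed().

```java
// Hypothetical helper: not part of Nutch or Lucene.
public final class LastModifiedTerm {
    // Long.MAX_VALUE needs 13 digits in base 36, so pad every value to 13.
    private static final int WIDTH = 13;

    private LastModifiedTerm() {}

    /** Encode epoch milliseconds as a zero-padded, lowercase base-36 string. */
    public static String encode(long millis) {
        if (millis < 0) {
            throw new IllegalArgumentException("negative time: " + millis);
        }
        String digits = Long.toString(millis, Character.MAX_RADIX);
        StringBuilder sb = new StringBuilder(WIDTH);
        for (int i = digits.length(); i < WIDTH; i++) {
            sb.append('0');
        }
        return sb.append(digits).toString();
    }

    /** Decode a term produced by encode() back into epoch milliseconds. */
    public static long decode(String term) {
        return Long.parseLong(term, Character.MAX_RADIX);
    }

    /** Inclusive range check done purely on the encoded strings. */
    public static boolean inRange(String term, String lower, String upper) {
        return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        long older = 1099000000000L; // example timestamps in epoch millis
        long newer = 1099344334000L;
        System.out.println(encode(older));
        System.out.println(encode(newer));
        // String comparison agrees with numeric comparison.
        System.out.println(encode(older).compareTo(encode(newer)) < 0);
    }
}
```

Because every term has the same width and the digits '0'-'9' sort before
'a'-'z', comparing two encoded values as strings gives the same answer as
comparing the original longs. A query plugin could then express a
last-modified range as a range over these terms.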

John


_______________________________________________
Nutch-developers mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/nutch-developers