On Mon, Nov 01, 2004 at 01:25:34PM -0800, Andrew Chen wrote:
> IIRC, there's no field for date in the DB entries though - so I assume
> I would use DocId as a proxy for the date, since a higher one would
> indicate it was crawled later?
>
> I'm holding the "Last Modified" http field in the ParseData metadata,
> but I assume there's no efficient way to use one of these parsedata
> fields in the comparable.

Currently, the index-more plugin saves "Last Modified" as a long in a
Field.UnIndexed() field, so it is stored but not searchable. You would
first need to make it indexed, then write a query plugin to search on it.

I had planned to write a query-more plugin (covering content-type and
content-length as well as last-modified), but never got around to it and
won't have time soon. It would be great if you could contribute along
those lines.

John

_______________________________________________
Nutch-developers mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
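
A minimal sketch of what "making it indexed" involves, assuming the
"Last Modified" value is a non-negative epoch-millisecond long. Lucene
compares indexed terms as strings, so the long needs a fixed-width
encoding before a range over it behaves like a numeric range. The class
and method names below (LastModifiedTerm, encode, inRange) are
hypothetical and use only the JDK; this is not the index-more plugin's
code, and the actual Lucene field and query plugin wiring is left out.

```java
public class LastModifiedTerm {
    // Long.MAX_VALUE has 19 decimal digits, so 19 is wide enough for any
    // non-negative long.
    private static final int WIDTH = 19;

    // Zero-pad the value so that string order matches numeric order.
    // Assumption: the timestamp is non-negative.
    static String encode(long millis) {
        if (millis < 0) {
            throw new IllegalArgumentException("negative timestamp: " + millis);
        }
        return String.format("%0" + WIDTH + "d", millis);
    }

    // Recover the original long from an encoded term.
    static long decode(String term) {
        return Long.parseLong(term);
    }

    // Inclusive range check done purely by string comparison, which is
    // how a term range over the indexed field would be evaluated.
    static boolean inRange(String term, String lower, String upper) {
        return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        long sample = 1099344334000L; // an arbitrary sample timestamp
        String term = encode(sample);
        System.out.println(term);
        System.out.println(inRange(term, encode(0L), encode(Long.MAX_VALUE)));
    }
}
```

Without the padding, "9" would sort after "10" as a string, which is why
the raw long cannot be indexed as-is and then searched by range.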
