chris sleeman wrote:
Hi Andrzej,

Thanks for your response. However, I still have a couple of doubts.

In your case, I would recommend setting a very short interval for the
main page, and setting longer (default) intervals for other pages.

Isn't the fetch interval a system-wide setting? Or can we set it for
individual urls?

That's true. A workaround for this would be to inject urls in batches - for the first batch you would set a certain fetch interval, and for the next batch you would set a different one (seems a bit ugly, I admit, but it works).

Perhaps we should add a command-line option to Injector to specify the fetch interval for urls?

Or we could use a line-oriented text format for Injector that allows specifying the fetch interval and/or other metadata, something like this:

seedFile ::= {lineEOL} ;
lineEOL ::= line <EOL> ;
line ::= url [{"|" meta}] ;
url ::= ? valid url characters except pipe symbol ? ;
meta ::= type " " name " " value ;
type ::= "S" | "F" | "I" | "L" (* string, float, int, long *) ;
name ::= ? any string without whitespace or pipe symbol ? ;
value ::= ? any string except pipe symbol ? ;
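
To make the proposed format concrete, here is a minimal sketch of how one line of such a seed file might be parsed. It is only an illustration of the grammar above, not existing Injector code: the class name SeedLine, the metadata names "fetchInterval" and "label", and the example URL are all assumptions made for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical parser for one line of the proposed seed format; not part of Nutch. */
class SeedLine {
    final String url;
    // Metadata in the order it appeared on the line, already converted to typed values.
    final Map<String, Object> meta = new LinkedHashMap<>();

    SeedLine(String url) {
        this.url = url;
    }

    /** Parses "url|type name value|type name value..." as described by the grammar. */
    static SeedLine parse(String line) {
        // The pipe symbol separates the url from each meta entry.
        String[] parts = line.split("\\|", -1);
        if (parts[0].isEmpty()) {
            throw new IllegalArgumentException("missing url: " + line);
        }
        SeedLine seed = new SeedLine(parts[0]);
        for (int i = 1; i < parts.length; i++) {
            // meta ::= type " " name " " value ; the value may itself contain spaces,
            // so split into at most three fields.
            String[] fields = parts[i].split(" ", 3);
            if (fields.length != 3) {
                throw new IllegalArgumentException("bad meta entry: " + parts[i]);
            }
            seed.meta.put(fields[1], convert(fields[0], fields[2]));
        }
        return seed;
    }

    /** Converts the raw value according to the one-letter type code. */
    private static Object convert(String type, String value) {
        switch (type) {
            case "S": return value;
            case "F": return Float.valueOf(value);
            case "I": return Integer.valueOf(value);
            case "L": return Long.valueOf(value);
            default: throw new IllegalArgumentException("unknown type: " + type);
        }
    }
}
```

With this sketch, a call such as SeedLine.parse("http://www.example.com/|I fetchInterval 1|S label front page") would yield the url plus two metadata entries, an Integer and a String. How Injector would then apply a value like "fetchInterval" to the url is left open here.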

What I would basically need is a different fetch interval for injected
urls (seed urls) as compared to the other urls. Since this may not be
available out of the box, I was thinking of just modifying the injector
code and using a much different value for the fetch interval in this
case. Would such an approach work? And will the same fetch value, set
once per url, be used throughout?

The fetch interval value is set by calling FetchSchedule.initializeSchedule(), so you should probably modify the implementation of this method in your active FetchSchedule.


--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
