The Scrape tag seems to need the page reloaded at least 3 times (regardless of the time attribute's value or status) before it does a fresh scrape. I tried modifying the tag to have a default minimum time of one minute instead of the 10 minutes it ships with.

Regardless of the time attribute's value, the page always needs to be reloaded several times before a fresh scrape occurs
(unless the page has been modified). With my modified version of the taglib, it takes several reloads after one minute has passed rather than after 10 minutes have passed.

The behavior as outlined in the doc* seems to imply this in a roundabout way. That is, the time attribute does not seem to be taken into account at all, and only an expired header or a modification triggers a rescrape. Am I reading the chain correctly?

The Scrape tag's 10-minute default minimum seems a bit long. The ability to scrape every second or every minute is useful (I'm using it to pull in current weather). The caching behavior is strange too. I've looked at the sources for PageTag and PageData but can't figure out why the page needs to be evaluated three times (or so) before a rescrape is triggered (and no, it's not my browser or container). Can anyone point me to the block(s) I should modify to make it rescrape every second and not be cached?

* From the scrape doc:

1> The status of the scrape tags and attributes in the JSP is examined. Any modifications to the tags or attributes trigger a rescrape. If the tags have not been modified, the JSP proceeds to step 2.

2> The minimum time for rescraping, specified by the time attribute of the page tag, is examined. The default time is 10 minutes. If this time has not passed since the last scrape, cached results are returned. If this time has passed, the JSP proceeds to step 3.

3> The expired header of the scraped document is examined. If the expiration date/time has not passed, cached results are returned. If the expiration date/time is not specified or the document has expired, the JSP proceeds to step 4.

4> The headers for the scraped document are requested and examined. If the document has not been modified since the last scrape, cached results are returned. If the document has been modified, it is rescraped and the new results are returned.
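
To make the chain concrete, here is a minimal Java sketch of the four steps as the doc describes them. It is not the taglib's actual code: the class ScrapeCacheSketch, the decide method, and all of its parameters are hypothetical names made up for illustration, and the sketch assumes the doc is an accurate description of what PageTag and PageData do.

```java
// Hypothetical sketch of the four-step decision chain from the scrape doc.
// None of these names come from the taglib sources.
class ScrapeCacheSketch {

    enum Decision { RESCRAPE, CACHED }

    /**
     * @param tagsModified      step 1: scrape tags or attributes changed in the JSP
     * @param nowMillis         current time in milliseconds
     * @param lastScrapeMillis  time of the last scrape in milliseconds
     * @param minIntervalMillis value of the time attribute in milliseconds
     * @param expiresMillis     expiration time of the scraped document, or null if not specified
     * @param documentModified  step 4: headers report a change since the last scrape
     */
    static Decision decide(boolean tagsModified,
                           long nowMillis,
                           long lastScrapeMillis,
                           long minIntervalMillis,
                           Long expiresMillis,
                           boolean documentModified) {
        // Step 1: any change to the tags or attributes forces a rescrape.
        if (tagsModified) {
            return Decision.RESCRAPE;
        }
        // Step 2: inside the minimum interval, always serve cached results.
        if (nowMillis - lastScrapeMillis < minIntervalMillis) {
            return Decision.CACHED;
        }
        // Step 3: an expiration time that has not yet passed keeps the cache.
        if (expiresMillis != null && nowMillis < expiresMillis) {
            return Decision.CACHED;
        }
        // Step 4: rescrape only if the document was actually modified.
        return documentModified ? Decision.RESCRAPE : Decision.CACHED;
    }

    public static void main(String[] args) {
        long tenMinutes = 10 * 60 * 1000L;
        long elevenMinutes = 11 * 60 * 1000L;
        // Minimum interval elapsed, no expiration header, document unchanged.
        System.out.println(decide(false, elevenMinutes, 0L, tenMinutes, null, false));
        // Same situation, but the headers report a modification.
        System.out.println(decide(false, elevenMinutes, 0L, tenMinutes, null, true));
    }
}
```

If the doc is accurate, the time attribute in this reading is only a lower bound: once it has elapsed, cached results are still returned whenever the expiration time has not passed or the document reports no modification. A rescrape happens only at step 1 or at the very end of step 4.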
