Hello, has no one ever needed such a thing?

If it doesn't exist, I should build it. One possible implementation is to use a 
priority queue: instead of just appending each new page to crawl to a FIFO, you 
assign it a priority in the queue. Does anyone have an idea about how to 
implement that?
Some pointers to the relevant part of the Scrapy code would be much appreciated! Thanks
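
A minimal sketch of the priority-queue idea, in plain Python rather than actual Scrapy code. It assumes the priority of a page is simply the time at which it is next due, and that the refresh interval is halved when the content changed and doubled when it did not, clamped to the min/max range. The class name RefreshScheduler, its methods, and the factor of 2 are all hypothetical choices for illustration, not part of Scrapy.

```python
import heapq
import itertools


class RefreshScheduler:
    """Toy recrawl scheduler: pages come out in order of their next due time."""

    def __init__(self, min_interval, max_interval, factor=2.0):
        self.min_interval = min_interval
        self.max_interval = max_interval
        self.factor = factor  # assumed multiplicative step, not from the thread
        self._heap = []  # entries are (due_time, sequence_number, url)
        self._counter = itertools.count()  # tie-breaker for equal due times
        self._interval = {}  # current refresh interval per url

    def add(self, url, now):
        """Register a new page, due immediately, starting at the min interval."""
        self._interval[url] = self.min_interval
        heapq.heappush(self._heap, (now, next(self._counter), url))

    def pop_due(self, now):
        """Return the most overdue url, or None if nothing is due yet."""
        if self._heap and self._heap[0][0] <= now:
            return heapq.heappop(self._heap)[2]
        return None

    def report(self, url, changed, now):
        """Adapt the interval after a fetch and reschedule the page."""
        interval = self._interval[url]
        if changed:
            # Content changed: move toward the minimum refresh period.
            interval = max(self.min_interval, interval / self.factor)
        else:
            # Content unchanged: move toward the maximum refresh period.
            interval = min(self.max_interval, interval * self.factor)
        self._interval[url] = interval
        heapq.heappush(self._heap, (now + interval, next(self._counter), url))
        return interval
```

The crawl loop would call pop_due() to pick the next page, fetch it, compare it with the stored copy (for example by hashing the body), and then call report() with the result. Scrapy's Request does accept a priority argument, but how to hook a due-time queue like this into Scrapy's own scheduler is not shown here.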



On Thursday, 17 July 2014 12:30:01 UTC+2, Magikmeuh wrote:
>
> What I called a 'smart refresh algorithm' (though I realise that may not be 
> the right term) is the ability to schedule and adjust the crawl refresh 
> period of each page depending on how often its content changes.
> You specify two bounds, the minimum and maximum refresh period. If the 
> content of a page never changes, its period tends toward the maximum. 
> If the content has changed every time you fetch it again, its period tends 
> toward the minimum. 
> If it changes only some of the time, the period settles somewhere in between.
>
> Is there something similar? It would be very strange if it didn't 
> exist, because I just can't imagine crawling a big site without this 
> functionality (while keeping a good refresh rate for the pages, of course).
>
> As for your DeltaFetch, if I understand correctly, it's a way to avoid 
> recrawling pages that have already been fetched.
>
>
> On Wednesday, 16 July 2014 11:01:04 UTC+2, Paul Tremberth wrote:
>>
>> Hi Frédéric,
>>
>> What do you mean by "smart refresh crawling"?
>> scrapylib has the DeltaFetch spider middleware:
>>
>> https://github.com/scrapinghub/scrapylib/blob/master/scrapylib/deltafetch.py
>>
>> Paul.
>>
>> On Wednesday, July 16, 2014 10:15:11 AM UTC+2, Magikmeuh wrote:
>>>
>>> Hello everyone, 
>>>
>>> Does Scrapy have a smart refresh crawling algorithm?
>>>
>>> I don't see any trace of it in the documentation or in this Google group.
>>>
>>> Has someone already implemented it?
>>>
>>> Thanks
>>>
>>>
>>> -- 
>>> Frédéric Passaniti
>>>  
>>
