Using allow=('\.egg', ) matches the 4 ".egg" links on the page, but 
allow=('\.zip', ) does not match the ".zip" link.
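
One possible explanation, which nobody in this thread has confirmed: besides the allow patterns, LinkExtractor also applies a deny_extensions filter, and its default list of ignored extensions includes "zip" but not "egg". The sketch below is a plain-Python model of that two-stage check, not Scrapy's own code. The function would_extract, the shortened extension set, and the two file names are all made up for illustration.

```python
import re
from posixpath import splitext
from urllib.parse import urlparse

# Hypothetical, shortened stand-in for Scrapy's default ignored-extension list.
IGNORED_EXTENSIONS = {"zip", "rar", "pdf", "exe"}


def would_extract(url, allow, deny_extensions=IGNORED_EXTENSIONS):
    """Model of the check: drop ignored extensions first, then test the allow patterns."""
    ext = splitext(urlparse(url).path)[1].lower().lstrip(".")
    if ext in deny_extensions:
        return False  # filtered out before the allow patterns are even consulted
    return any(re.search(pattern, url) for pattern in allow)


base = "http://download.zope.org/Zope2/index/2.13.22/AccessControl/"
egg_url = base + "AccessControl-2.13.13-py2.6-win32.egg"  # hypothetical file name
zip_url = base + "AccessControl-2.13.13.zip"              # hypothetical file name

print(would_extract(egg_url, [r"\.egg"]))                         # True
print(would_extract(zip_url, [r"\.zip"]))                         # False
print(would_extract(zip_url, [r"\.zip"], deny_extensions=set()))  # True
```

If that is what is happening here, the thing to try in the spider would be LinkExtractor(allow=(r'\.zip$', ), deny_extensions=[]), which turns the extension filter off for that rule.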

On Wednesday, September 23, 2015 at 19:20:00 UTC-3, Travis Leleu 
wrote:
>
> What happens if you use allow=('\.egg', ) instead?
>
> On Wed, Sep 23, 2015 at 2:59 PM, Piter Oliveira Vergara <
> [email protected]> wrote:
>
>> I'm having trouble figuring out why the following code does not work. 
>> I want to download a zip file from a page but my LinkExtractor never seems 
>> to match the link. I've tried various patterns with no luck. Everything 
>> works fine if I try to match the other links in that same page.
>>
>> from scrapy.spiders import CrawlSpider, Rule
>> from scrapy.linkextractors import LinkExtractor
>>
>> class Zope2Spider(CrawlSpider):
>>     """docstring for Zope2Spider"""
>>     name = "zope2"
>>
>>     start_urls = [
>>         "http://download.zope.org/Zope2/index/2.13.22/AccessControl/",
>>     ]
>>
>>     rules = (
>>         # This does not work
>>         Rule(LinkExtractor(allow=(r'.*\.zip', ), ), callback='parse_pacote'),
>>
>>         # This works
>>         # Rule(LinkExtractor(allow=(r'.*\.egg', ), ), callback='parse_pacote'),
>>     )
>>
>>     def parse_pacote(self, response):
>>         raise Exception(response.url)  # just to see if it enters here
>>
>>         # omitted
>>
>>
>> Can anyone help me with that?
>>
>> thanks!
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "scrapy-users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> Visit this group at http://groups.google.com/group/scrapy-users.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>

