Setting CONCURRENT_REQUESTS=1 isn't a good fix. Your code looks fine in
principle. This line
<https://github.com/scrapy/scrapy/blob/ebef6d7c6dd8922210db8a4a44f48fe27ee0cd16/scrapy/spiders/crawl.py#L42>
is the default CrawlSpider parse() method. It handles any response whose
request has no explicit callback, e.g. the response to your
FormRequest.from_response(). From that response onwards, your responses are
handled by the Rules you set; before that, they aren't. Since you have my
e-mail, give me access to the code so I can run it and have a look, and we
can update this thread afterwards.


On Thursday, May 26, 2016 at 11:32:45 AM UTC+1, Massimo Canonico wrote:
>
> Hi all, 
>
> I've made some progress with my problem by setting
> "CONCURRENT_REQUESTS=1" in , but I would like to:
>
> - run the authentication request first, and
>
> - then run the requests for the pages to scrape concurrently
>
> Could you help me, please? 
>
> M 
>
>
> On 19/05/16 16:59, Massimo Canonico wrote: 
> > Hi all, 
> > 
> > I'm using scrapy for a site with a bulletin board (phpBB) and I would
> > like to start "scraping" the pages ONLY after the authentication has
> > succeeded.
> >
> > In my code, the authentication is done inside the start_requests method:
> > 
> >     def start_requests(self):
> >         self.log("start_requests called")
> >
> >         return [
> >             Request(
> >                 "http://<mysite>/phpBB3",
> >                 callback=self.parse_welcome,
> >                 priority=100
> >             )
> >         ]
> >
> >     def parse_welcome(self, response):
> >         self.log("parse_welcome called")
> >
> >         request = FormRequest.from_response(
> >             response,
> >             formnumber=1,
> >             formdata={"username": "rightusername", "password": "rightpassword"}
> >         )
> >
> >         return request
> >
> >     rules = (
> >         Rule(LinkExtractor(), callback='parse_standard', follow=True),
> >     )
> > 
> >   [cut] 
> > 
> > 
> > From the output that I got, it seems that some pages are scraped
> > without authenticating first.
> > 
> > Am I wrong/missing something? Am I using "priority" in the right way? 
> > 
> > Thanks, 
> > 
> > Massimo 
> > 
>
>
