Yeah, you're right, the HTTP request is made inside the Twisted reactor, not through httplib. See the download handler here:

https://github.com/scrapy/scrapy/blob/0.24/scrapy/core/downloader/handlers/http10.py
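
If hooking into the handler itself turns out to be awkward, the downloader middleware idea from earlier in the thread can still give you a before/after hook around each download. Below is a rough sketch only, not something taken from Scrapy itself: the class name, the `apm_start_time` meta key, and the `report` callback are all made up for illustration, and you would swap `report` for whatever your APM tool expects. Only the `process_request`, `process_response`, and `process_exception` method names come from Scrapy's middleware interface. Note that the measured time covers everything between the two hooks (other middlewares further down the chain, download delays, queueing), so treat it as an upper bound on the network time.

```python
import time


class TimingDownloaderMiddleware(object):
    """Hypothetical middleware that times each request from the moment it
    enters the downloader until a response (or an exception) comes back."""

    META_KEY = "apm_start_time"  # made-up key, not a Scrapy built-in

    def __init__(self, report=None, clock=time.time):
        # `report` is a placeholder for your monitoring hook.
        self.report = report or self._default_report
        self.clock = clock

    def _default_report(self, url, seconds, outcome):
        print("%s %s %.3fs" % (outcome, url, seconds))

    def process_request(self, request, spider):
        # Called for every request on its way to the downloader.
        request.meta[self.META_KEY] = self.clock()
        return None  # None lets Scrapy continue processing the request

    def process_response(self, request, response, spider):
        # Called once the download has finished.
        self._finish(request, "status=%s" % response.status)
        return response

    def process_exception(self, request, exception, spider):
        # Called when the download (or a later middleware) raises.
        self._finish(request, "error=%s" % type(exception).__name__)
        return None

    def _finish(self, request, outcome):
        started = request.meta.pop(self.META_KEY, None)
        if started is not None:
            self.report(request.url, self.clock() - started, outcome)
```

To try it, you would add the class to the `DOWNLOADER_MIDDLEWARES` setting in your project's settings.py, using whatever module path you saved it under.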

On Thursday, May 21, 2015 at 2:47:59 PM UTC-7, Philipp Bussche wrote:
>
> Thanks Daniel,
>
> that sounds like a good idea and I will have a look at that.
>
> But I would also be interested to instrument the call to crawl the actual
> URL so I can put some monitoring code before and after it.
> Do you know how the actual crawl is being done ? Is it done via twisted ?
> It does not look like httplib is being used for that.
>
> Thanks
> Philipp
>
> On Thursday, May 21, 2015 at 10:29:44 PM UTC+2, Daniel Fockler wrote:
>>
>> Hey,
>>
>> Not sure exactly what you are looking for, but you can implement a Scrapy
>> Downloader Middleware and run a process_request function that will pass
>> each request into that function so you can examine it. Here's the docs for
>> that.
>>
>> http://scrapy.readthedocs.org/en/latest/topics/downloader-middleware.html
>>
>> On Thursday, May 21, 2015 at 7:03:35 AM UTC-7, Philipp Bussche wrote:
>>>
>>> Hi there,
>>> I am working on some monitoring for my python/scrapy deployment using
>>> one of the commercial APM tools.
>>> I was able to instrument the parsing of the response as well as the
>>> pipeline which pushes the items into an ElasticSearch instance.
>>> You can see in the attached screenshot how that is visualized in the
>>> tool.
>>> I would now also like to see the outgoing calls that Scrapy is making
>>> through the downloader to actually crawl the http pages (which is obviously
>>> happening before parsing and pipelining).
>>> But I can't figure out where in the code the actual http call is made so
>>> that I could put my monitoring hook around it.
>>> Could you guys please point me to the class that is actually doing the
>>> http calls ?
>>>
>>> Thanks
>>> Philipp
>>>
>>
