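There are two ways: either read it from the Referer header of the request
that produced the response, or pass it as metadata on the Request instance.
# First method: read the Referer header inside parse_contact. Scrapy's
# RefererMiddleware (enabled by default) sets it to the URL of the page
# that generated the request; the value is bytes, or None if unset.
url = response.request.headers.get('Referer')
# Second method: pass the URL along in meta when creating the request
yield scrapy.Request(response.urljoin(u), callback=self.parse_contact,
                     meta={'url': response.request.url})
# Then, inside your other function, just get it like this
url = response.meta['url']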
On Saturday, August 27, 2016 at 1:56:13 AM UTC-7, Raf Roger wrote:
>
> Hi,
>
> I would like to know how to get the calling/reference URL once we are
> parsing data from a child URL.
>
> e.g.:
>
> We are on the URL "http://myweb.com/list?page=2", which lists 20 contact URLs.
>
> Once we are on a contact URL (e.g. http://myweb.com/contact?id=123),
> how can we get the calling URL (e.g. http://myweb.com/list?page=2)?
>
> here is a sample of my code:
> def parse_start_url(self, response):
>     urls = Selector(response).xpath(
>         '//div[contains(@class, "lien-ville")]/ul/li/a/@href').extract()
>     for u in urls:
>         yield scrapy.Request(response.urljoin(u),
>                              callback=self.parse_contact)
>
> def parse_contact(self, response):
>     yield {
>         "count" : self.counter,
>         "page number" : self.page_num,
>         "contact page url" : response.url,
>         "reference url" : self.url
>     }
>
> In my code, I would like parse_contact to be able to get the URL passed to
> parse_start_url.
> How can I do that?
>
> thx
>