Re: FilesPipeline Not Downloading

Szymon Roziewski Fri, 17 Oct 2014 05:48:46 -0700

Hi Drew,

The problem could be the tbody in your selector; try dropping it.
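
For context on why tbody can matter: browsers add a tbody element to tables when they build the DOM, so a selector copied from the browser's inspector may contain a tbody step that the HTML Scrapy downloads does not have. The sketch below is only an illustration using the standard library's ElementTree rather than Scrapy's selectors, and the markup and file paths in it are made up, not taken from domu.com.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup standing in for a page as served: the table has no <tbody>.
RAW_HTML = """
<div id="available">
  <table class="sticky-enabled">
    <tr><td>1 bed</td><td><a href="/files/plan-a.pdf">Floor plan</a></td></tr>
    <tr><td>2 bed</td><td><a href="/files/plan-b.pdf">Floor plan</a></td></tr>
  </table>
</div>
""".strip()

root = ET.fromstring(RAW_HTML)  # root is the <div id="available"> element


def hrefs(path):
    """Return the href of every <a> element matched by an ElementTree path."""
    return [a.get("href") for a in root.findall(path)]


# A path that insists on <tbody> finds nothing in this markup.
with_tbody = hrefs("./table/tbody/tr/td/a")
# The same path without the <tbody> step finds both links.
without_tbody = hrefs("./table/tr/td/a")

print(with_tbody)
print(without_tbody)
```

If the same selector already returns links in scrapy shell, as in the quoted message, tbody is probably present in the served HTML and the cause is likely elsewhere.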

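One more thing that may be worth checking: the quoted settings.py sets IMAGES_STORE but no FILES_STORE. FilesPipeline raises NotConfigured and is left out of the enabled pipelines when FILES_STORE is empty, so items pass through without any files being downloaded. A possible fix, assuming a files/ directory next to the existing images/ one (that path is my guess; any writable directory should do):

```python
# settings.py (sketch): same pipelines as in the quoted message.
ITEM_PIPELINES = {
    'scrapy.contrib.pipeline.images.ImagesPipeline': 100,
    'scrapy.contrib.pipeline.files.FilesPipeline': 200,
    'scrapy_mongodb.MongoDBPipeline': 300,
}

IMAGES_STORE = '/home/dfriestedt/PycharmProjects/domu/images/'

# Assumed location, not from the original message. Without FILES_STORE,
# FilesPipeline is not enabled at all.
FILES_STORE = '/home/dfriestedt/PycharmProjects/domu/files/'
```

After restarting the crawl, the log line listing the enabled item pipelines should include FilesPipeline.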

Best wishes

On Sunday, 21 September 2014 15:07:53 UTC+2, Drew Friestedt wrote:
>
> I have a working scrape that targets this site:
>
> start_urls = ["http://www.domu.com/chicago/apartment-search/list?"]
>
> Here is one target page where I'm trying to download files:
>
>
> http://www.domu.com/chicago/neighborhoods/arlington-heights/central-park-east
>
> I am able to download images no problem.  I'm using ImagesPipeline built 
> into scrapy.  I'm trying to download some pdfs with the FilesPipeline and 
> cannot seem to get it working.  Here is my relevant code:
>
> ----
> settings.py
>
> ITEM_PIPELINES = {'scrapy.contrib.pipeline.images.ImagesPipeline': 100,
>                   'scrapy.contrib.pipeline.files.FilesPipeline': 200,
>                   'scrapy_mongodb.MongoDBPipeline': 300}
>
> IMAGES_STORE = '/home/dfriestedt/PycharmProjects/domu/images/'
>
> -----
> items.py
>
> import scrapy
>
> class BuildingData(scrapy.Item):
>     file_urls = scrapy.Field()
>     files = scrapy.Field()
>
> ------
> spider.py
>
> def parse_building(self, response):
>
>         file_urls = response.css(
>             '#available table.sticky-enabled > tbody > tr > '
>             'td:nth-child(6) a::attr(href)').extract()
>
>         yield BuildingData(file_urls=file_urls)
>
> ------
> I'm confident the response.css is working.  Here is the output in scrapy 
> shell.
>
> >>> response.css('#available table.sticky-enabled > tbody > tr > td:nth-child(6) a::attr(href)').extract()
>
> [u'http://www.domu.com/sites/default/files/filefield/field_units/11-26-2013%201-07-01%20PM.png',
>  u'http://www.domu.com/sites/default/files/filefield/field_units/11-26-2013%201-07-49%20PM.png',
>  u'http://www.domu.com/sites/default/files/filefield/field_units/11-26-2013%201-08-22%20PM.png',
>  u'http://www.domu.com/sites/default/files/filefield/field_units/11-26-2013%201-10-01%20PM.png',
>  u'http://www.domu.com/sites/default/files/filefield/field_units/11-26-2013%201-10-48%20PM.png',
>  u'http://www.domu.com/sites/default/files/filefield/field_units/11-26-2013%201-11-09%20PM.png']
>
> What am I missing?  
>
> I was careful to follow the instructions here:
> https://groups.google.com/forum/#!msg/scrapy-users/kzGHFjXywuY/O6PIhoT3thsJ 

