Re: Can't save urls/items

Paul Tremberth Thu, 16 Jan 2014 08:13:48 -0800

Hi,
Try with
allowed_domains = ["monsterboard.nl"]
Otherwise, I'm afraid you'll have many offsite requests filtered.
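
To illustrate why, here is a minimal sketch of the kind of host matching an
offsite filter performs. It is a simplified stand-in written with the standard
library only, not Scrapy's actual code; the helper name is_offsite and the
example URL are made up for illustration.

```python
from urllib.parse import urlparse


def is_offsite(url, allowed_domains):
    """Return True if the URL's host matches none of the allowed domains."""
    host = urlparse(url).hostname or ""
    for domain in allowed_domains:
        # A host counts as on-site if it equals the domain
        # or is a subdomain of it.
        if host == domain or host.endswith("." + domain):
            return False
    return True


url = "http://www.monsterboard.nl/vacatures"  # hypothetical example URL

# A full URL never matches a bare hostname, so the request looks offsite.
print(is_offsite(url, ["http://www.monsterboard.nl/"]))  # True

# A plain domain name matches the host and its subdomains.
print(is_offsite(url, ["monsterboard.nl"]))  # False
```

The point is that the entries in allowed_domains are compared against the
host part of each request, so they should be bare domain names without a
scheme or trailing slash.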


/Paul.


On Thursday, January 16, 2014 5:00:22 PM UTC+1, pkforlol3 wrote:
>
> *This is my items.py:*
>
> from scrapy.item import Item, Field
>
> class Cvcrawler1(Item):
>     title = Field()
>     titel = Field()
>     strong = Field()
>     paragraaf = Field()
>     link = Field()
>
> *This is my crawlcsv.py:*
>
> from scrapy.spider import BaseSpider
> from scrapy.selector import HtmlXPathSelector
> from cvcrawler1.items import Cvcrawler1
>
> class MySpider(BaseSpider):
>     name = "crawlcsv"
>     allowed_domains = ["http://www.monsterboard.nl/"]
>     f = open("url.txt")
>     start_urls = [url.strip() for url in f.readlines()]
>     f.close()
>     def parse(self, response):
>         hxs = HtmlXPathSelector(response)
>         titles = hxs.select("//div[@class='body']")
>         items = []
>         for titles in titles:
>             item = JobCrawlItems()
>             item ["titel"] = titles.select("h2/text()").extract()
>             item ["strong"] = titles.select("p/strong/text()").extract()
>             item ["paragraaf"] = titles.select("p/text()").extract()
>
>         return items
>
> *Command I use in cmd:*
>
> scrapy crawl crawlcsv -o data.csv -t csv
>
> *How can I save the data it finds? What did I do wrong?*
>
>
