Thanks for your reply, Bill. I will go with the good option and use a database. Where should I place the SQL insert? I am thinking of placing it inside the parse function of the spider.
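
To make the question concrete, here is a rough sketch of the check-and-insert step I have in mind. It uses only Python's built-in sqlite3 module. The table name `seen_urls` and the helpers `open_store` and `mark_seen` are names I made up for illustration, not anything from Scrapy, and SQLite is just an assumption about which database to use.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the SQLite database that records URLs already seen.

    The default ":memory:" database is only for trying this out; a real
    crawl would pass a file path so the list survives between runs.
    """
    conn = sqlite3.connect(path)
    # PRIMARY KEY makes the database itself reject duplicate URLs.
    conn.execute("CREATE TABLE IF NOT EXISTS seen_urls (url TEXT PRIMARY KEY)")
    conn.commit()
    return conn


def mark_seen(conn, url):
    """Record url. Return True if it was new, False if it was already stored."""
    # INSERT OR IGNORE skips the row silently when the URL already exists,
    # so rowcount is 1 for a new URL and 0 for a duplicate.
    cur = conn.execute("INSERT OR IGNORE INTO seen_urls (url) VALUES (?)", (url,))
    conn.commit()
    return cur.rowcount == 1
```

The idea would be to open the connection once, call `mark_seen(conn, response.url)` for each page, and only save the page when it returns True. Whether that call belongs in the spider's parse function or somewhere else, such as an item pipeline, is exactly what I am unsure about.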
On Monday, May 5, 2014 10:01:59 PM UTC+5:30, Bill Ebeling wrote:
> Good option: Sounds like a case for a database.
>
> Very Bad Option: The only other option I can think of is storing a hash of
> the URLs in a flat file, reading in that file, and checking whether a hash
> of the current URL is in that list; if not, save it and add the URL to
> that list. This leads to many other problems.
