I'm trying to crawl a forum and count how many times each member uses a certain word (let's say "the"). Rather than exporting every post from every user (say, as a .csv file) and then writing a second program to read that file and count the occurrences, how can I use Scrapy to produce a file listing each user and their number of "the"s?
One way I've thought of is to have an Item Pipeline count the occurrences of "the" each time it receives an Item. For that, the pipeline would need to hold a dictionary mapping each user to their count, and I would need to figure out how to print or export that dictionary once the spider has finished. How can I do this? Also, would this be any more efficient than writing all the posts to a file and having another program read it and count the occurrences?
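For illustration, here is a minimal sketch of the pipeline idea described above. It relies on the `open_spider`, `process_item`, and `close_spider` methods that Scrapy calls on item pipelines. The item field names `user` and `text`, the class name `WordCountPipeline`, and the output file `the_counts.csv` are assumptions made up for this example, not anything taken from an actual spider.

```python
import csv
import re
from collections import Counter

# Matches the standalone word "the", case-insensitively
# (so "The" counts, but "then" and "there" do not).
WORD_RE = re.compile(r"\bthe\b", re.IGNORECASE)


class WordCountPipeline:
    """Keeps a running per-user tally and writes it out when the spider closes."""

    # Hypothetical output file name; change as needed.
    output_path = "the_counts.csv"

    def open_spider(self, spider):
        # Called once when the spider starts: begin with an empty tally.
        self.counts = Counter()

    def process_item(self, item, spider):
        # Assumes each item has "user" and "text" fields (hypothetical names).
        self.counts[item["user"]] += len(WORD_RE.findall(item["text"]))
        return item

    def close_spider(self, spider):
        # Called once when the spider finishes: dump the totals to CSV,
        # highest count first.
        with open(self.output_path, "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            writer.writerow(["user", "the_count"])
            for user, count in self.counts.most_common():
                writer.writerow([user, count])
```

The pipeline would still need to be enabled through the `ITEM_PIPELINES` setting, for example `{"myproject.pipelines.WordCountPipeline": 300}`, where the module path is a placeholder. With this approach only one integer per user is held in memory, and if no feed export is configured the posts themselves are never written to disk.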
