Dear all, I have the following problem, for which I couldn't find a
solution on the Internet or develop one of my own... :(
I would like to write an RSS feed to a file with PyRSS2Gen from within an
item pipeline. The problem is that the pipeline code runs with only one
Scrapy item at a time. So I cannot loop over the RSS items inside
PyRSS2Gen.RSS2 and, before that, write the language, copyright, title, ...
information only once (see the comments in the code example).
As a result, the header is repeated with each item in the generated RSS
file. The file is currently opened with the "a" (append) option; this
should rather be "w".
How can I write the RSS file without repeating the header? Where is the
best place in the Scrapy framework to put this code, so that all items are
collected and written to the file with one header? Thanks for any hints!
# Code as seen in pipelines.py
import datetime
import PyRSS2Gen

def write_to_rss(item):
    # This should be done only once per rss-file write operation
    rss = PyRSS2Gen.RSS2(
        language = "en-US",
        copyright = "None",
        title = "My Feed",
        link = "http://www.mysite.com/",
        description = "My Feed",  # placeholder; RSS 2.0 channels need a description
        lastBuildDate = datetime.datetime.utcnow(),
        # This should be done for each Scrapy item; without Scrapy this part
        # loops over each thing in the rss feed.
        items = [
            PyRSS2Gen.RSSItem(
                title = str(item['headline']),
                description = str(item['article_content'])),
        ])
    # In the original the "w" option is used to replace the feed, i.e. to
    # have the correct lastBuildDate
    rss.write_xml(open("pyrss2gen.xml", "a"))

class WriteToRSS(object):
    def process_item(self, item, spider):
        write_to_rss(item)
        return item
It is also possible to append items to the rss variable before the whole
variable gets written to the file. PyRSS2Gen then wraps the items in XML
and places the header before them in the RSS feed, and after that the
WriteToRSS class is called. But I don't know where and how to place this
code in Scrapy.
# Add item to the rss feed
rss.items.append(PyRSS2Gen.RSSItem(
    title = str(item['headline']),
    description = str(item['article_content'])))
...
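One possible way to do this, as a minimal sketch rather than a tested
solution: Scrapy calls a pipeline's open_spider and close_spider methods
once per crawl, so the items could be collected in process_item and the
feed written a single time in close_spider. The entries attribute, the
description text and the utf-8 encoding are my own assumptions and not part
of the code above. PyRSS2Gen is imported inside close_spider only so that
the sketch can be loaded where the library is not installed.

```python
import datetime


class WriteToRSS(object):
    """Collect every item during the crawl, then write the feed once."""

    def open_spider(self, spider):
        # Called by Scrapy once when the spider starts: begin with an empty list.
        self.entries = []

    def process_item(self, item, spider):
        # Called once per item: only remember the fields, write nothing yet.
        self.entries.append(
            (str(item['headline']), str(item['article_content'])))
        return item

    def close_spider(self, spider):
        # Called by Scrapy once when the spider finishes.
        # Imported here so this sketch loads without PyRSS2Gen installed.
        import PyRSS2Gen

        # The channel header is built a single time, with all items attached.
        rss = PyRSS2Gen.RSS2(
            title="My Feed",
            link="http://www.mysite.com/",
            description="My Feed",  # placeholder text (assumption)
            language="en-US",
            copyright="None",
            lastBuildDate=datetime.datetime.now(datetime.timezone.utc),
            items=[
                PyRSS2Gen.RSSItem(title=headline, description=content)
                for headline, content in self.entries
            ],
        )
        # "w" replaces the previous feed, so the header appears only once.
        with open("pyrss2gen.xml", "w", encoding="utf-8") as outfile:
            rss.write_xml(outfile, encoding="utf-8")
```

As with any other pipeline, the class would still have to be listed in
ITEM_PIPELINES in settings.py to be used.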