Bug#275510: #275510: URL improperly de-htmlentitied
Aaron Swartz wrote:
> > An xml dump of the feed was included in the bug report
>
> A full one? I see only a very partial one

Apparently it's http://nu.nl/deeplink_rss2/index.jsp?r=Algemeen

-- 
see shy jo
Bug#275510: #275510: URL improperly de-htmlentitied
> Apparently it's http://nu.nl/deeplink_rss2/index.jsp?r=Algemeen

Yeah, so that actually says:

<link>http://www.nu.nl/news.jsp?n=496739&amp;amp;c=11</link>

which once decoded becomes:

<link>http://www.nu.nl/news.jsp?n=496739&amp;c=11</link>

so it looks like r2e is doing the right thing.
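The decoding step described in this message can be checked with a short sketch. It assumes the feed's link element really contains a doubly escaped ampersand (`&amp;amp;`), which is how the thread reads it; `RAW_ITEM` and `decode_link` are hypothetical names for illustration, and the standard-library XML parser stands in for feedparser here.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical one-item fragment modelled on the link quoted in this thread.
# The "&amp;amp;" is the double escaping under discussion (an assumption
# about the raw feed, not a verified copy of it).
RAW_ITEM = (
    "<item>"
    "<link>http://www.nu.nl/news.jsp?n=496739&amp;amp;c=11</link>"
    "</item>"
)


def decode_link(raw_item):
    """Return the link text after the XML parser's single round of unescaping."""
    return ET.fromstring(raw_item).findtext("link")


# One round of decoding, which is all an XML parser performs.
once = decode_link(RAW_ITEM)

# A second, manual round is what it would take to reach the intended URL.
twice = html.unescape(once)

print(once)
print(twice)
```

If the assumption holds, a conforming parser leaves a literal `&amp;` in the URL, so the mail generated from it shows the entity rather than a bare ampersand.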
Bug#275510: #275510: URL improperly de-htmlentitied
Aaron Swartz wrote:
> > Apparently it's http://nu.nl/deeplink_rss2/index.jsp?r=Algemeen
>
> Yeah, so that actually says:
>
> <link>http://www.nu.nl/news.jsp?n=496739&amp;amp;c=11</link>
>
> which once decoded becomes:
>
> <link>http://www.nu.nl/news.jsp?n=496739&amp;c=11</link>
>
> so it looks like r2e is doing the right thing.

Looks like double-escaping in the feed is at fault, think I can close this bug?

-- 
see shy jo
Bug#275510: #275510: URL improperly de-htmlentitied
On Mon, Mar 14, 2005 at 09:25:38PM -0500, Joey Hess wrote:
> Aaron Swartz wrote:
> > > Apparently it's http://nu.nl/deeplink_rss2/index.jsp?r=Algemeen
> >
> > Yeah, so that actually says:
> >
> > <link>http://www.nu.nl/news.jsp?n=496739&amp;amp;c=11</link>
> >
> > which once decoded becomes:
> >
> > <link>http://www.nu.nl/news.jsp?n=496739&amp;c=11</link>
> >
> > so it looks like r2e is doing the right thing.
>
> Looks like double-escaping in the feed is at fault, think I can close this bug?

Well, shouldn't it? I'm not an RSS expert, but I thought you escape once for the XML, and the result should be valid HTML. Implying &-escaping again. Same for the body of the text, afaik.

But if you find out that the feed is actually wrong w.r.t. the specs, yeah, then you can close it. I don't have the RSS specs handy though, and eh, I'm a lazy bastard. So, I've said that :-/. Mostly I'm just really, really not an expert at all, but I guess the feed, which belongs to the most popular internet news source in the Netherlands, has surely made sure to actually work for like, eh, most RSS readers. It's a really big site.

--Jeroen

-- 
Jeroen van Wolffelaar
[EMAIL PROTECTED] (also for Jabber & MSN; ICQ: 33944357)
http://Jeroen.A-Eskwadraat.nl
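On the question of how many times the link should be escaped, here is a minimal sketch of the single-escaping case. It assumes the link element is meant to carry a plain URL rather than HTML, which is my reading of the RSS 2.0 spec and not something settled in this thread; the `c=11` URL is taken from the discussion above and the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# The URL the feed presumably intends to publish (assumption based on the
# link quoted earlier in the thread).
url = "http://www.nu.nl/news.jsp?n=496739&c=11"

# Let the XML library do the escaping: the ampersand is escaped exactly once.
link = ET.Element("link")
link.text = url
serialized = ET.tostring(link, encoding="unicode")
print(serialized)

# Parsing the serialized element gives back the original URL unchanged.
roundtrip = ET.fromstring(serialized).text
print(roundtrip == url)
```

Under that assumption, a feed that writes `&amp;amp;` has escaped the URL one time too many, and a reader that decodes only once is behaving correctly.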
Bug#275510: #275510: URL improperly de-htmlentitied
Aaron Swartz wrote:
> This shouldn't normally happen. Can you ask the person who filed the bug for the URL that causes this?
>
> In any event, the bug needs to be passed upstream to feedparser.sf.net.

An xml dump of the feed was included in the bug report (http://bugs.debian.org/275510), but AFAICS, the URL: bit comes from rss2email and is not data that is processed by the feed parser:

    else:
        message += "text/plain"
        content = unu(content).strip() + "\n\nURL: "+link

-- 
see shy jo
Bug#275510: #275510: URL improperly de-htmlentitied
> An xml dump of the feed was included in the bug report

A full one? I see only a very partial one

> (http://bugs.debian.org/275510), but AFAICS, the URL: bit comes from rss2email and is not data that is processed by the feed parser:
>
>     else:
>         message += "text/plain"
>         content = unu(content).strip() + "\n\nURL: "+link

yes, but link comes from feedparser