[backstage] BBC Breaking News RSS feeds query
Hi,

I'm doing some work around breaking news and had a few queries about the available feeds. Looking on the backstage site there appear to be a few sources of breaking news feeds:

http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/breaking_news/rss.xml
http://newsrss.bbc.co.uk/rss/newsonline_world_edition/breaking_news/rss.xml

These feeds don't contain links to a story on the BBC news site; they only link through to news.bbc.co.uk. I had a couple of questions:

1. Are these alerts in any way linked to a news story (i.e. is there a story published at the same time as the alert goes out via RSS)? If so, could the RSS feed be altered to contain the link to the story?

2. Does the guid in the feed (e.g. <guid isPermaLink="false">urn:news_bbc_co_uk:breaking_news:33400</guid>) have any relationship to a story (so in this example does 33400 map to a story on the site)?

Looking at the RSS entry titles and then the corresponding story on the BBC news site, they seem to be very similar or identical in many cases (I suspect the different ones are where the story has been subsequently updated), so I'm guessing in the worst case scenario I could match the website story title to the alert title to identify a story as 'breaking'.

Is there an easy way to identify a particular story as 'breaking news' at the moment?

Thanks,
Rob

- Sent via the backstage.bbc.co.uk discussion group. To unsubscribe, please visit http://backstage.bbc.co.uk/archives/2005/01/mailing_list.html. Unofficial list archive: http://www.mail-archive.com/backstage@lists.bbc.co.uk/
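For anyone wanting to try the worst-case fallback mentioned above (matching alert titles against story titles), here is a rough sketch. It rests on several assumptions: the feed is treated as plain RSS 2.0 with item/title and item/guid elements, the sample XML and headlines are made up for illustration, and the 0.85 similarity cut-off is an arbitrary starting point rather than anything the BBC publishes.

```python
import difflib
import xml.etree.ElementTree as ET

# Made-up sample shaped like an RSS 2.0 breaking-news feed (assumption).
SAMPLE_FEED = """<rss version="2.0"><channel>
  <item>
    <title>Example headline about a breaking story</title>
    <link>http://news.bbc.co.uk</link>
    <guid isPermaLink="false">urn:news_bbc_co_uk:breaking_news:33400</guid>
  </item>
</channel></rss>"""


def alert_titles(feed_xml):
    """Return (guid, title) pairs for every item in the feed."""
    root = ET.fromstring(feed_xml)
    return [
        (item.findtext("guid", default=""), item.findtext("title", default=""))
        for item in root.iter("item")
    ]


def normalise(title):
    """Lower-case and collapse whitespace so trivial differences are ignored."""
    return " ".join(title.lower().split())


def find_breaking(feed_xml, story_titles, cutoff=0.85):
    """Map each story title that closely matches an alert title to that alert's guid."""
    matches = {}
    for guid, alert in alert_titles(feed_xml):
        for story in story_titles:
            ratio = difflib.SequenceMatcher(
                None, normalise(alert), normalise(story)
            ).ratio()
            if ratio >= cutoff:
                matches[story] = guid
    return matches
```

A fuzzy ratio is used instead of exact equality because, as noted above, story titles may be edited after the alert goes out. Lowering the cut-off catches more edited titles at the cost of more false matches.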
Re: [backstage] Muddy Boots on Backstage
Fearghas McKay wrote:

Noah

On 27 Nov 2007, at 10:57, Noah Slater wrote:

To which I have two suggestions:

1) Leave the /discussion/ list you're on.
2) Move to the next message, trash the message and move on.
3) Filter all email with freedom in the body into /dev/null and be done with it.

My fourth suggestion would be that perhaps the discussion you want to have is not on topic for a list. As such continuing the discussion you want to have may be off topic for most list members. As to whether this list is an advocacy list for freedom I will leave as the list owners' call.

Or just change the post title and start a new post:

Free Software Nonsense was (Re: [backstage] Muddy Boots on Backstage)

That way this thread about MuddyBoots is actually useful to anyone who wants to find out about it and anybody who wants to talk about Free Software Nonsense can do.
Re: [backstage] Muddy Boots on Backstage
Hi, Rob - this is neat, though not entirely sure that it's working entirely as you might want...

http://muddyboots.rattleresearch.com/cgi-bin/mb.cgi?action=pageid=701

...a page about The Sun (and the News of the World) has lots of links off to the NASA website - presumably because of the use of the word Sun...

Nice, though - and something to think about.

Hi James,

Thanks for this. It highlights one of the challenges we face when trying to find the correct contextual meaning where ambiguity exists; we haven't got it right in all cases yet :) I thought I'd work it through and highlight areas that could be improved.

The initial story has been categorised as being related to the following tags (via the Yahoo term extraction service):

(http://muddyboots.rattleresearch.com/cgi-bin/mb.cgi?action=viewid=701)

* media ownership
* editorial control
* ownership laws
* communications committee
* independent board
* evening newspapers
* evidence http://en.wikipedia.org/wiki/Evidence
* news corporation http://en.wikipedia.org/wiki/News_Corporation
* chairman http://en.wikipedia.org/wiki/Chair_%28official%29
* mr http://en.wikipedia.org/wiki/MR
* house of lords http://en.wikipedia.org/wiki/House_of_Lords
* news of the world http://en.wikipedia.org/wiki/News_of_the_World
* mr murdoch
* parliamentary committee http://en.wikipedia.org/wiki/Committee
* murdoch http://en.wikipedia.org/wiki/Murdoch
* fox news http://en.wikipedia.org/wiki/Fox_News_Channel
* sky news http://en.wikipedia.org/wiki/Sky_News
* sun http://en.wikipedia.org/wiki/Sun_%28disambiguation%29
* news station http://en.wikipedia.org/wiki/News_station
* rupert murdoch http://en.wikipedia.org/wiki/Rupert_Murdoch

The obvious problem with this is the sun tag: it is an ambiguous term that has many meanings, as evidenced at:

http://en.wikipedia.org/wiki/Sun_(disambiguation)

Currently we only follow the links off these disambiguation pages to gather external links. However, if we were to improve our usage of the disambiguation pages we could cut down on these false positives (in fact that's top of the list of the things we'd like to experiment with).

The other problem here is that we display links if they have any matches in del.icio.us with the story tags listed above. We should probably put some metrics around the minimum number of tags a link must match to be recommended for a story. In this case a minimum match of 2 tags would have meant we wouldn't have recommended the 'planetary' sun links.

Thanks for the feedback!
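To make the minimum-match idea concrete, here is a small sketch. It is not how MuddyBoots is actually implemented; the function name, the example.org URLs and the tag sets attached to them are all hypothetical, and the default of 2 simply mirrors the figure suggested above.

```python
def recommend_links(story_tags, link_tags, min_matches=2):
    """Return links sharing at least `min_matches` tags with the story, best first.

    story_tags: iterable of tag strings extracted from the story.
    link_tags:  dict mapping a candidate URL to the tags it carries
                (e.g. from a bookmarking service; hypothetical data here).
    """
    story = {t.lower() for t in story_tags}
    scored = []
    for url, tags in link_tags.items():
        overlap = story & {t.lower() for t in tags}
        if len(overlap) >= min_matches:
            scored.append((len(overlap), url))
    # Most shared tags first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [url for _, url in scored]


# Hypothetical example: a link that only shares the ambiguous 'sun' tag is dropped.
story = ["sun", "news corporation", "rupert murdoch"]
candidates = {
    "http://example.org/solar-physics": ["sun", "astronomy"],
    "http://example.org/murdoch-profile": ["rupert murdoch", "news corporation", "media"],
}
print(recommend_links(story, candidates))
```

With min_matches=1 both links come back, which is the behaviour that produced the NASA links; raising it to 2 removes the link whose only overlap is the ambiguous tag.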
Re: [backstage] Muddy Boots on Backstage
Tom Loosemore wrote:

Thanks for the feedback !

Muddy boots is cool...

Thanks :)

TheyWorkForYou.com adds links to Hansard by matching Proper Names with Wikipedia entries.

http://www.theyworkforyou.com/debates/?id=2007-11-21a.1190.1

The number of false positives is acceptable and the Wikipedia links are miles better than the user-generated glossary with which the site was launched. But it's still limited since it only parses for Capitalised Phrases or ACRONYMS.

Shifting to term extraction seemed an obvious route, but as I think Muddy Boots shows, term extraction tends to throw up an unacceptably large number of 'false positive' terms - these result in crappy random links and are user experience poison.

However, you can minimise false positive terms by running the copy through several different flavours of term extractor, and only using terms thrown up by x or more of them (where x depends on your appetite for false positives vs false negatives).

I like this idea, as obviously the context for the story (i.e. the tags we use to define it) impacts the final link recommendations. It's one of the two weak points in the system at the moment (the other being the previously mentioned disambiguation issues). However, it's nice to have a platform that we can start to test these kinds of ideas out on ...

So, why not throw the copy through several more term extractors then only use the overlapping terms?

The BBC has at least one *excellent* term extractor in house which adds extra metadata like 'this term is a person/place/topic'... would be a lovely API to offer, hint hint...

Seconded! Anybody else have any other recommendations for term extraction services?
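The "x or more extractors" suggestion above could look something like the following. This is only a sketch: the extractor outputs are hard-coded, hypothetical lists standing in for real services, and min_votes plays the role of x.

```python
from collections import Counter


def consensus_terms(extractor_outputs, min_votes=2):
    """Keep only terms returned by at least `min_votes` of the extractors.

    extractor_outputs: list of term lists, one list per extractor.
    Terms are compared case-insensitively; each extractor gets one vote per term.
    """
    votes = Counter()
    for terms in extractor_outputs:
        # Using a set means a term repeated by one extractor still counts once.
        votes.update({t.strip().lower() for t in terms})
    return sorted(term for term, n in votes.items() if n >= min_votes)


# Hypothetical outputs from three different term extractors for one story.
outputs = [
    ["Sun", "Rupert Murdoch", "evidence"],
    ["rupert murdoch", "News Corporation", "sun"],
    ["Rupert Murdoch", "News Corporation"],
]
print(consensus_terms(outputs, min_votes=2))
print(consensus_terms(outputs, min_votes=3))
```

Raising min_votes trades false positives for false negatives, which is the appetite question raised above: at 3 only the term every extractor agrees on survives.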
Re: [backstage] Muddy Boots on Backstage
Brian Butterworth wrote:

How about using a two-frame page as the link, with a "rate this link" option shown as a one-line toolbar at the top of the page? Users could then rate the appropriateness of the link from wrong to fantastic, which would allow automatic removal of incorrect links and a simple administration list of links considered poor.

That was another idea we had, both from the perspective of feeding meta-data back to Wikipedia and also getting end-users to moderate links. In our use-case, though, we had the system helping journalists find relevant external link material: the ones they chose from the complete list were marked as known 'good' meta-data for the story and fed back into the system (and if they had the time they could mark 'bad' suggestions as well).

So for example if you choose a MuddyBoots 'red' report [1] (i.e. requires moderation) you'll see there are far more links that *could* be relevant to the article, and the journalists could choose from these and add them to a news story, thus creating a feedback mechanism into the system.

[1] http://muddyboots.rattleresearch.com/cgi-bin/mb.cgi?action=pageid=714report_type=red
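As a rough illustration of the rating idea above (automatic removal plus an admin list of poor links), here is a sketch. The 1 to 5 scale, the thresholds and the minimum number of ratings are all assumptions made for the example, not part of MuddyBoots.

```python
from statistics import mean


def triage_links(ratings, remove_below=2.0, review_below=3.0, min_ratings=3):
    """Sort links into (keep, review, remove) lists from user ratings.

    ratings: dict mapping URL to a list of scores, 1 (wrong) to 5 (fantastic).
    Links with fewer than `min_ratings` scores are kept, since there is
    too little evidence to judge them either way.
    """
    keep, review, remove = [], [], []
    for url, scores in sorted(ratings.items()):
        if len(scores) < min_ratings:
            keep.append(url)
            continue
        average = mean(scores)
        if average < remove_below:
            remove.append(url)      # removed automatically
        elif average < review_below:
            review.append(url)      # goes on the admin list of poor links
        else:
            keep.append(url)
    return keep, review, remove
```

The min_ratings guard is there so a single unhappy user cannot remove a link on their own; whether 3 is the right figure would depend on how much traffic the links get.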
[backstage] Muddy Boots on Backstage
Hi Everyone,

Just thought I'd accompany the latest post to the backstage blog (http://backstage.bbc.co.uk/news/archives/2007/11/from_last_years_1.html) with some examples of muddyboots in action. For those of you who aren't aware of the project it's probably best to look at http://muddyboots.rattleresearch.com/cgi-bin/mb.cgi?action=more.

Essentially we're attempting to use Wikipedia and other commons-authored data sources to augment the meta-data around BBC news stories. This ultimately took the form of automated, contextually relevant link recommendations based on data within Wikipedia and del.icio.us (although we have some other ideas about how this data could be used ...)

It's still a prototype so it's not production ready by any means. There are still stories where we are unable to recommend links, and there are others where ambiguity becomes a problem and identifying what context a story has can be difficult (although we have some ideas around using the disambiguation data within Wikipedia to improve this).

Here are a few links to stories where I thought muddyboots added some interest and hopefully a little of that Wikipedia 'browse experience':

http://muddyboots.rattleresearch.com/cgi-bin/mb.cgi?action=pageid=646
http://muddyboots.rattleresearch.com/cgi-bin/mb.cgi?action=pageid=630
http://muddyboots.rattleresearch.com/cgi-bin/mb.cgi?action=pageid=622
http://muddyboots.rattleresearch.com/cgi-bin/mb.cgi?action=pageid=643

If you'd like to see how those recommendations were arrived at, each story has a 'View' action which can be used to get a breakdown of each stage of the muddyboots process, for example:

http://muddyboots.rattleresearch.com/cgi-bin/mb.cgi?action=viewid=622

It's worth noting we only keep the last 50 story submissions in the system, so these links will eventually 'age' out.

(Disclaimer : I worked on the project)

Thanks,
Rob
Re: [backstage] More iPlayer protesting
Not that I'm condoning the choice, personally I'll always prefer an agnostic system, but, well, maybe the BBC were just realists when it came to the practicalities of development cost versus ROI from creating versions for (EXTREMELY) minority OSes? I mean, come on, hands up who here on the list uses Linux as their primary OS.

Me