Hi all,
I've been hanging around here for a while, but today
I've been thrust into the limelight. Ian has quite cleverly found my
latest mashup using BBC data and has already posted it as a prototype
on the backstage website
(http://backstage.bbc.co.uk/prototypes/archives/2006/10/how_in_touch_is.html),
and has asked me to mail the list with some background, so here goes.
(I think Jem mentioned it in an earlier email too.)
BBC Touch (http://cgriley.com/bbctouch/)
takes data from the BBC News RSS feed and compares it with data from
the (sort of hidden) popular news stories XML feed to get a % of how "in
touch" the BBC news editorial staff are with the public and what we are
actually reading. The about page I've already written explains it a
bit better than I have just now: http://cgriley.com/bbctouch/about/
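As a rough illustration of the comparison described above, here is a minimal Python sketch. It is not the formula the site uses (the about page has that); it simply assumes the touch value is the share of most-read headlines that also appear in the editorial feed. The function name touch_percent and the case-insensitive matching on headline text are assumptions made for the example.

```python
def touch_percent(editorial_headlines, popular_headlines):
    """Percentage of most-read headlines that also appear in the editorial feed.

    Assumption: a story "matches" when its headline text is identical
    after trimming whitespace and lowercasing.
    """
    editorial = {h.strip().lower() for h in editorial_headlines}
    popular = [h.strip().lower() for h in popular_headlines]
    if not popular:
        return 0.0  # nothing being read, so nothing to be in touch with
    matched = sum(1 for h in popular if h in editorial)
    return 100.0 * matched / len(popular)
```

For example, if two stories are in the most-read list and only one of them is also in the editorial feed, this returns 50.0.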
Why did I do it? Not out of any desire to highlight any hidden agendas
the BBC's editorial staff might have (although I guess it could do that),
but more from an interest in how "in touch" the BBC are with what the
public actually reads and cares about, compared to what they think we
do. To help identify subjects we might care about more than others, I
put all the headlines I get through the Yahoo Content Analysis API to get
some useful keywords (subjects). On the web page you'll see the subjects
they want us to read about vs. what we're actually reading about, for
the past 24 hours and the past 2 weeks.
Hopefully over time we'll be able to identify trends in the
subject areas the BBC push but we never read, or vice versa, and use
the % as a measure of how in touch the BBC are at particular points in
time. People with a better understanding of this will no doubt have
better ways to investigate this data, but I hope this is a useful tool
in the analysis. In particular I think it's useful for highlighting
issues the public care more about. For instance, a couple of days ago, whilst
Pakistan was the headline, most of us were reading the climate change
story.
How have I done it? I explain this in the about page, but
essentially I poll the feeds every hour and store them, along with any
Yahoo-extracted subjects, in a SQL Server database. The touch page
then pulls this data out and works out the touch value from it.
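The poll-and-store step could be sketched roughly as follows. This is an illustration under stated assumptions, not the site's code: it uses Python's built-in sqlite3 as a stand-in for SQL Server, assumes both feeds are RSS-style XML with item/title elements, and the table name headlines and the function names are made up for the example. Fetching the feeds and the hourly scheduling are left out.

```python
import sqlite3
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def parse_item_titles(rss_xml):
    """Return the headline text of every <item> in an RSS-style document."""
    root = ET.fromstring(rss_xml)
    return [item.findtext("title", default="").strip() for item in root.iter("item")]


def store_snapshot(conn, feed_name, titles, polled_at=None):
    """Save one poll of one feed; returns the number of headlines stored."""
    polled_at = polled_at or datetime.now(timezone.utc).isoformat()
    # Hypothetical schema: one row per headline per poll.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS headlines (feed TEXT, title TEXT, polled_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO headlines VALUES (?, ?, ?)",
        [(feed_name, title, polled_at) for title in titles],
    )
    conn.commit()
    return len(titles)
```

Storing a timestamp with every row is what would make the "past 24 hours" and "past 2 weeks" views possible later, since the touch page can filter on polled_at.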
As for the visual representation, I thought about using a tag cloud for
the subjects, but found the decreasing size of text in a list a simpler
and cleaner approach (and Jem seems to like it, so that's good. ;o) )
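One way the decreasing text size could be worked out is sketched below. The scaling rule (font size proportional to how often a subject occurs, with a minimum size) and the pixel values are assumptions for illustration; the site may well size its list differently.

```python
def sized_subjects(counts, max_px=32, min_px=10):
    """Order subjects by frequency and give each a font size in pixels.

    counts maps subject -> number of occurrences. The most frequent
    subject gets max_px; the rest scale down linearly, never below min_px.
    """
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    if not ranked:
        return []
    top = ranked[0][1]
    return [(subject, max(min_px, round(max_px * n / top))) for subject, n in ranked]
```

Each (subject, size) pair can then be written out as one list entry with that font size, giving a list that shrinks as you read down it.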
Planned features - I already intend to add a page to this to show the
highest and lowest touch values achieved, and the headlines / subjects
that achieved them. However, if you want to suggest things I could do with
the site or other views on the data, or just have some general feedback,
then please do reply to this mail or email me using the email address
on the about page.
I hope this has inspired you to take another look at the feeds we already have access to and play with them in new ways.
Thanks,
Chris Riley