Hi Rob - this is neat, though I'm not sure that it's working entirely as you might want...

http://muddyboots.rattleresearch.com/cgi-bin/mb.cgi?action=page&id=701

...a page about "The Sun" (and the "News of the World") has lots of links off to the NASA website - presumably because of the use of the word "Sun"...

Nice, though - and something to think about.

Hi James,

Thanks for this. It highlights one of the challenges we face when trying to find the correct contextual meaning where ambiguity exists; we haven't got it right in all cases yet :)

I thought I'd work it through and highlight areas that could be improved. The initial story has been categorised as being related to the following tags (via the Yahoo term extraction service):

(http://muddyboots.rattleresearch.com/cgi-bin/mb.cgi?action=view&id=701)

   * media ownership
   * editorial control
   * ownership laws
   * communications committee
   * independent board
   * evening newspapers
   * evidence <http://en.wikipedia.org/wiki/Evidence>
   * news corporation <http://en.wikipedia.org/wiki/News_Corporation>
   * chairman <http://en.wikipedia.org/wiki/Chair_%28official%29>
   * mr <http://en.wikipedia.org/wiki/MR>
   * house of lords <http://en.wikipedia.org/wiki/House_of_Lords>
   * news of the world <http://en.wikipedia.org/wiki/News_of_the_World>
   * mr murdoch
   * parliamentary committee <http://en.wikipedia.org/wiki/Committee>
   * murdoch <http://en.wikipedia.org/wiki/Murdoch>
   * fox news <http://en.wikipedia.org/wiki/Fox_News_Channel>
   * sky news <http://en.wikipedia.org/wiki/Sky_News>
   * sun <http://en.wikipedia.org/wiki/Sun_%28disambiguation%29>
   * news station <http://en.wikipedia.org/wiki/News_station>
   * rupert murdoch <http://en.wikipedia.org/wiki/Rupert_Murdoch>

The obvious problem with this is the "sun" tag: it is an ambiguous term with many meanings, as evidenced at:

http://en.wikipedia.org/wiki/Sun_(disambiguation)

Currently we only follow the links off these disambiguation pages to gather external links. If we were to improve our usage of the disambiguation pages, we could cut down on these false positives (in fact, that's top of the list of things we'd like to experiment with).
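
One possible way to make better use of a disambiguation page is to pick the sense whose description overlaps most with the story's other tags. The sketch below is only an illustration under stated assumptions: `pick_sense`, the `STOPWORDS` set, and the hand-written candidate descriptions are all hypothetical and are not taken from the actual system or from Wikipedia's markup.

```python
# Hypothetical sketch: choose a disambiguation candidate by word overlap
# with the story's other tags. Not the actual implementation.

STOPWORDS = {"the", "of", "a", "an", "by", "at", "in", "and"}


def pick_sense(candidates, story_tags):
    """Return the candidate title whose description shares the most
    non-stopword words with the story tags, or None if nothing overlaps."""
    context = set()
    for tag in story_tags:
        context.update(tag.lower().split())
    context -= STOPWORDS

    best_title, best_score = None, 0
    for title, description in candidates.items():
        words = set(description.lower().replace(",", " ").split())
        score = len(words & context)
        if score > best_score:
            best_title, best_score = title, score
    return best_title


# Illustrative, hand-written descriptions (assumed, not scraped).
candidates = {
    "Sun (star)": "the star at the centre of the solar system",
    "The Sun (newspaper)": "a British tabloid newspaper owned by News Corporation",
}
# The story's other tags, leaving out the ambiguous "sun" tag itself.
story_tags = ["news corporation", "rupert murdoch", "sky news", "media ownership"]

print(pick_sense(candidates, story_tags))
```

With these inputs the newspaper sense is chosen, because its description shares "news" and "corporation" with the story tags while the star's description shares nothing.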

The other problem here is that we display links if they have any matches in del.icio.us with the story tags listed above. We should probably put some metrics around the minimum number of story tags a link must match to be recommended. In this case, a minimum match of 2 tags would have meant we wouldn't have recommended the 'planetary' sun links.
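
A minimal sketch of the minimum-match idea, assuming each candidate link comes with the set of tags users gave it. The function name `recommend_links`, the `min_matches` parameter, and the example URLs and tags are hypothetical, and the sketch does not call the del.icio.us API.

```python
# Hypothetical sketch: only recommend links that share at least
# `min_matches` tags with the story. Not the actual implementation.

def recommend_links(story_tags, link_tags, min_matches=2):
    """Return URLs sharing at least `min_matches` tags with the story,
    ordered by number of shared tags (most first), then by URL."""
    story = {tag.lower() for tag in story_tags}
    scored = []
    for url, tags in link_tags.items():
        overlap = story & {tag.lower() for tag in tags}
        if len(overlap) >= min_matches:
            scored.append((len(overlap), url))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [url for _, url in scored]


story_tags = ["sun", "news of the world", "rupert murdoch", "news corporation"]

# Made-up bookmarks and tags, for illustration only.
link_tags = {
    "http://example.org/solar-physics": ["sun", "astronomy", "nasa"],
    "http://example.org/murdoch-press": ["sun", "rupert murdoch", "news corporation"],
}

print(recommend_links(story_tags, link_tags, min_matches=1))
print(recommend_links(story_tags, link_tags, min_matches=2))
```

With a threshold of 1 both links are returned. With a threshold of 2 the solar link, which matches only on "sun", is dropped.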

Thanks for the feedback!


-
Sent via the backstage.bbc.co.uk discussion group.  To unsubscribe, please 
visit http://backstage.bbc.co.uk/archives/2005/01/mailing_list.html.  
Unofficial list archive: http://www.mail-archive.com/backstage@lists.bbc.co.uk/
