Brian Butterworth wrote:
> I thought they were trying to do OCR on the captions from the DVB-T stream.

No, we have clear text. As it says in the blog post :-)

> However, a clear text feed of the data would keep the data pure, surely?
Sadly not (trust me, I've spent some time on this!). Even ignoring some missing data (so we'd have to do this for those cases anyway), during a long debate the captioning sometimes simply shows a summary of what's going on rather than someone's name (especially if they're a minister, so we "know" who they are); captions don't cover quick interruptions, which can really mess things up if there's a lot of back and forth between two people; etc. etc. :)
ATB,
Matthew
-
Sent via the backstage.bbc.co.uk discussion group. To unsubscribe, please visit http://backstage.bbc.co.uk/archives/2005/01/mailing_list.html.
Unofficial list archive: http://www.mail-archive.com/backstage@lists.bbc.co.uk/