We have got some experimental tools that allow us to search archived content by the subtitles, but it's all a bit Heath Robinson, to be honest. Work is ongoing, and we may have a semi-public hackday-type event later in the year where we can invite people in to explore possible applications. More news as we are allowed!
a

On Tue, Aug 25, 2009 at 8:35 PM, Brian Butterworth <briant...@freeview.tv> wrote:
> Isn't it a shame that the iPlayer doesn't have this (100% of BBCTV is
> subtitled, too)
>
> It could be very search engine friendly.
>
> On Aug 21, 2009 7:17 PM, "Dan Brickley" <dan...@danbri.org> wrote:
>
> NPR transcripts are now - I read - easier to find. I had a quick look
> around and couldn't find one, but I didn't try that hard.
>
> Could be of interest when run through text-summarisers,
> auto-classifiers etc to make new routes to their content.
>
> More on NPR transcripts here -
> http://help.npr.org/ics/support/default.asp?deptID=5670&task=knowledge&questionID=464
>
> And googling for NPR API I find http://www.npr.org/api/index which
> mentions a Transcript API,
> http://www.npr.org/templates/apidoc/transcript.php as well as all
> kinds of other fun stuff (including topic lists eg.
> http://api.npr.org/list?id=3002). Also here's a blog post on their API
> - http://www.npr.org/blogs/inside/2008/07/npr_api_is_live_on_nprorg.html
>
> It'd be rather nice to see some work on cross-referencing stories
> across eg. BBC and NPR sites, to get different(-ish) perspectives on
> the same issues. Having textual transcripts should help with doing
> that at an approximate level, beyond the metadata NPR provide
> directly...
>
> Dan
>
> ---------- Forwarded message ----------
> From: kimo <k...@webnetic.net>
> Date: Fri, Aug 21, 2009 at 7:05 PM
> Subject: [sunlightlabs] Free Transcripts on NPR.org now
> To: sunlightl...@googlegroups.com
>
>
> http://www.npr.org/ombudsman/2009/08/free_transcripts_now_available.html?ft=1&f=17370252
>
>
> Free Transcripts now Available on NPR.org
>
> 3:32 pm
>
> August 19, 2009
>
> Transcripts of favorite, missed or maddening stories on NPR used to
> cost $3.95 each, but now they are free on NPR.org.
>
> Previously, NPR charged for transcripts because an outside contractor
> worked fast to prepare them to be available to the library within a
> few hours of a piece airing. It was a costly expense which NPR did for
> the benefit of classrooms and deaf audiences, or anyone who wrote to
> Listener Services and was willing to pay.
>
> As of the new NPR.org site re-launch on July 27, over 20,000 visitors
> had gone online to get transcripts.
>
> Now, all you have to do to get a story's text is visit www.NPR.org and
> click on the transcript link to the right of the audio button, located
> just below the story's title.
>
> Quotes from these transcripts are for non-commercial use only, and may
> not be used in any other media without attribution to NPR.
>
> Why now?
>
> "Transcripts were once largely the province of librarians and other
> specialists whose job was to find archival content, often for
> professional purposes," said Kinsey Wilson, the Senior VP of NPR's
> Digital Media department. "As Web content becomes easier to share and
> distribute, and search and social media have become important drivers
> of audience engagement, archival content -- whether in the form of
> stories or transcripts -- has an entirely different value than it did
> in the past."
>
> NPR took the new website launch as an opportunity to offer free
> transcripts, according to Laura Soto-Barra, NPR's Senior Librarian.
>
> "We made a decision to go ahead even though NPR pays a considerable
> amount of money to produce transcripts on deadline," said Soto-Barra.
> "Transcripts are posted six hours after the shows air, except for
> Morning Edition's transcripts which are posted four hours after the
> show is broadcast. We have offered free audio for a long time and we
> felt that free transcripts were long overdue."
>
> New software allows NPR's staff to receive daily metrics and supply
> data for "most popular transcripts yesterday", "most popular
> transcripts for the last seven days" and "most popular transcript
> ever".
>
> Keep in mind transcript coordinators do their best to catch and
> correct errors on the text. But since there is a quick turn-around
> time on transcripts, mistakes can occur. If you notice a spelling or
> typographical error, please email transcri...@npr.org, where it can be
> corrected.
> Soto-Barra said that NPR transcripts may contain minor or significant
> errors, ranging from the use of "ex-patriot" instead of "expatriate."
> In another example, a transcriber mistakenly quoted filmmaker John
> Waters as saying of former Manson follower Leslie Van Houten: "She's a
> yuppie," when what he really said was, "She's not a yuppie."
>
> Transcript coordinators "Dorothy Hickson and Laura Jeffrey do their
> best to find and correct errors but unfortunately, they cannot
> proofread every piece," said Soto-Barra. "Librarians and transcript
> coordinators appreciate when someone calls their attention to errors,
> particularly when they involve name spellings and use of
> (unintelligible)."
>
>
> --~--~---------~--~----~------------~-------~--~----~
> You received this message because you are subscribed to the Google
> Groups "sunlightlabs" group.
> To post to this group, send email to sunlightl...@googlegroups.com
> To unsubscribe from this group, send email to
> sunlightlabs+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/sunlightlabs?hl=en
> -~----------~----~----~----~------~----~------~--~---
> -
> Sent via the backstage.bbc.co.uk discussion group. To unsubscribe, please
> visit http://backstage.bbc.co.uk/archives/2005/01/mailing_list.html.
> Unofficial list archive:
> http://www.mail-archive.com/backstage@lists.bbc.co.uk/
>

--
Ant Miller

tel: 07709 265961
email: ant.mil...@gmail.com