Very interesting, thanks for posting! The TED dataset is also quite
interesting for Wikidata, because we are missing the generic concepts
behind many Wikipedia articles. Most people complain that Wikipedia tends
to dive into in-depth information without giving adequate coverage in an
overview article. Many overview articles have grown beyond what can be
comfortably read on a mobile phone and should probably be split into
second- and third-tier wiki pages that explain branches of the subject. To see
what I mean, try viewing the English Wikipedia article for "Insurance" on
your phone.

The TED talks touch on many such missing subject items, and it would be
nice to crowdsource their creation. Your project could possibly be
a way to direct contributors to quick explanations and/or uses of such
concepts. The fact that many TED talks are transcribed into so many
different languages means we may be able to harness these translations for
use in Wikidata labels. At least that is what I hope. Without labels,
nothing is findable on Wikidata, and that is why we are still so slow
at interlinking linkable items.
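
To make the label idea a bit more concrete, here is a rough sketch in
Python. Everything in it is an assumption on my part: the function name
propose_labels is hypothetical, the input is assumed to be a plain mapping
from language code to translated talk title, and the example data is made
up. It only shows the "fill in missing languages" step and does not talk
to Wikidata at all.

```python
from typing import Dict


def propose_labels(
    translated_titles: Dict[str, str],
    existing_labels: Dict[str, str],
) -> Dict[str, str]:
    """Return candidate labels for languages the item does not cover yet.

    Both arguments map a language code (e.g. "fr") to a string.
    This is a hypothetical helper, not part of any Wikidata tool.
    """
    proposed: Dict[str, str] = {}
    for lang, title in translated_titles.items():
        # Collapse stray whitespace that often comes with scraped titles.
        cleaned = " ".join(title.split())
        # Skip empty titles and never overwrite a label that already exists.
        if cleaned and lang not in existing_labels:
            proposed[lang] = cleaned
    return proposed


# Made-up example data, for illustration only.
titles = {
    "en": "Example talk",
    "fr": "Exemple de conférence",
    "nl": "  Voorbeeldlezing ",
}
existing = {"en": "Example talk"}

print(propose_labels(titles, existing))
```

A talk title is not necessarily a good label for the concept behind it,
so candidates like these would still need human review before anyone
adds them to an item.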

If your initiative takes off, it may be interesting to apply it to our own
set of film media on Commons, but very little of that has been linked to
Wikidata yet.

On Sun, Apr 24, 2016 at 1:15 PM, Raphaël Troncy <raphael.tro...@eurecom.fr>
wrote:

> Good news blog post:
>> https://blog.wikimedia.org/2016/04/22/ted-wikimedia-collaboration/
>>
>
> Great news! I also didn't know that Wikidata has unique identifiers for
> so many TED talks.
>
> FYI, my group worked on a prototype 18 months ago that we called HyperTED.
> You can read about it at
> http://linkedup-project.eu/2014/10/14/vici-shortlist-hyperted/. There is
> also a presentation at
> http://www.slideshare.net/JosLuisRedondoGarca/hyperted-40494120. And you
> can play directly with the HyperTED prototype at
> http://linkedtv.eurecom.fr/HyperTED/
>
> In a nutshell, we used the TED talk metadata (subtitles divided into
> paragraphs) in order to provide chapters to TED talks. We have annotated
> them automatically using named entity recognition and disambiguation tools
> and topic detection algorithms. Hence, entities are disambiguated to
> DBpedia (but these could also be Wikidata entities). Finally, we have
> developed an algorithm that detects hot spots in TED talks (read the
> scientific paper at
> http://www.eurecom.fr/~troncy/Publications/Redondo_Troncy-iswc14.pdf).
> Ultimately, as soon as you watch chapters of TED talks, we recommend
> other chapters of other TED talks that may be related (because of
> common entities and topics). Instead of being a traditional recommender
> system that suggests other TED talks, we perform recommendation at the
> fragment level.
>
> We are eager to receive any feedback. Be gentle with the demo; we are
> aware of some bugs and limitations.
> Best regards.
>
>   Raphaël
>
> --
> Raphaël Troncy
> EURECOM, Campus SophiaTech
> Data Science Department
> 450 route des Chappes, 06410 Biot, France.
> e-mail: raphael.tro...@eurecom.fr & raphael.tro...@gmail.com
> Tel: +33 (0)4 - 9300 8242
> Fax: +33 (0)4 - 9000 8200
> Web: http://www.eurecom.fr/~troncy/
>
> _______________________________________________
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>