Re: [Dbpedia-discussion] Fetching song and movie related information

Mohamed Morsey Fri, 12 Oct 2012 07:30:26 -0700

Hi Tom,

On 10/11/2012 05:24 PM, Tom Morris wrote:

On Thu, Oct 11, 2012 at 11:03 AM, Pablo N. Mendes <pablomen...@gmail.com> wrote:
    Good point, Tom.

        That article is in Category:Hindi_films, not
        Category:Hindi_songs and it's a Film, not a song, so it's not
        going to meet the requirements of your query.

    But maybe the class hierarchy comes to the rescue (Work is a
    supertype of Song and Film)?
The main point is that the extracted triples are semantic nonsense because they conflate multiple subjects under a single URI.
You've got

<http://dbpedia.org/resource/Jaane_Tu..._Ya_Jaane_Na>     <http://dbpedia.org/property/runtime> 
    "9300.0"^^<http://dbpedia.org/datatype/second> .
<http://dbpedia.org/resource/Jaane_Tu..._Ya_Jaane_Na>     <http://dbpedia.org/property/length>  
    "276.0"^^<http://dbpedia.org/datatype/second> .
<http://dbpedia.org/resource/Jaane_Tu..._Ya_Jaane_Na>     <http://dbpedia.org/property/length>  
    "1908.0"^^<http://dbpedia.org/datatype/second> .
<http://dbpedia.org/resource/Jaane_Tu..._Ya_Jaane_Na>     <http://dbpedia.org/property/length>  
    "221.0"^^<http://dbpedia.org/datatype/second> .
Which is the length of what?  They all refer to the same subject.
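
To make the ambiguity concrete, here is a minimal Python sketch that groups the values quoted above by (subject, predicate). The helper names (group_values, ambiguous_pairs) are hypothetical and are not part of any DBpedia tooling; the triples are simply hard-coded from the excerpt rather than parsed from a dump.

```python
from collections import defaultdict

PAGE = "http://dbpedia.org/resource/Jaane_Tu..._Ya_Jaane_Na"

# (subject, predicate, value in seconds), copied from the triples quoted above
triples = [
    (PAGE, "http://dbpedia.org/property/runtime", 9300.0),
    (PAGE, "http://dbpedia.org/property/length", 276.0),
    (PAGE, "http://dbpedia.org/property/length", 1908.0),
    (PAGE, "http://dbpedia.org/property/length", 221.0),
]


def group_values(triples):
    """Collect every value seen for each (subject, predicate) pair."""
    grouped = defaultdict(list)
    for subject, predicate, value in triples:
        grouped[(subject, predicate)].append(value)
    return dict(grouped)


def ambiguous_pairs(triples):
    """Return the (subject, predicate) pairs that carry more than one value."""
    return {
        key: values
        for key, values in group_values(triples).items()
        if len(values) > 1
    }


conflicts = ambiguous_pairs(triples)
for (subject, predicate), values in conflicts.items():
    print(predicate, "has", len(values), "values on one subject:", values)
```

Nothing in the data says which of the three length values belongs to which track, or to the album as a whole, which is exactly the problem.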
Similarly "Pappu Can't Dance" isn't a song. It's an (alternate?) title for the film according to the RDF. A human knows it's a song because of the
<http://dbpedia.org/resource/Jaane_Tu..._Ya_Jaane_Na>     
<http://dbpedia.org/property/title>       "Pappu Can't Dance"@en .
<http://dbpedia.org/resource/Jaane_Tu..._Ya_Jaane_Na>     
<http://dbpedia.org/ontology/Work/runtime>        
"155.0"^^<http://dbpedia.org/datatype/minute> .
To make what Venkatesh wants work, you'd need to teach the extractor to figure out what the "main" subject of a page was and then have it mint new subject URIs for all related concepts represented on the page which are different (and don't have their own Wikipedia page) such as sound track album, songs on a sound track album, etc. Then you'd also need to teach it that the physical proximity of the track listing and the soundtrack infobox implies that they refer to the same subject. Finally, you'd have to make this robust in the face of different editing & structuring styles by different Wikipedians.
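
As a rough illustration of the URI-minting step only, here is a Python sketch under loudly stated assumptions. The "__track__" naming scheme and the functions mint_uri and split_tracks are hypothetical and do not describe how the DBpedia extractor works. The pairing of the title with the 276-second length is also assumed purely for the example, since the extracted triples do not say which length belongs to which track.

```python
from urllib.parse import quote

PAGE = "http://dbpedia.org/resource/Jaane_Tu..._Ya_Jaane_Na"
TITLE = "http://dbpedia.org/property/title"
LENGTH = "http://dbpedia.org/property/length"


def mint_uri(page_uri, kind, label):
    """Hypothetical scheme: page URI + '__' + kind + '__' + escaped label."""
    slug = quote(label.replace(" ", "_"), safe="_")
    return f"{page_uri}__{kind}__{slug}"


def split_tracks(page_uri, tracks):
    """Attach each (title, length_seconds) pair to its own minted subject
    instead of to the URI of the page's main subject."""
    triples = []
    for title, seconds in tracks:
        track_uri = mint_uri(page_uri, "track", title)
        triples.append((track_uri, TITLE, title))
        triples.append((track_uri, LENGTH, seconds))
    return triples


# Assumed pairing, for illustration only.
out = split_tracks(PAGE, [("Pappu Can't Dance", 276.0)])
for triple in out:
    print(triple)
```

This covers only the easy part. Deciding what the main subject is, and linking a track listing to the nearby soundtrack infobox, are the hard problems described above and are not attempted here.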

That's a good idea. I agree that taking those subtopics into consideration as well would be a great achievement for DBpedia.

I'd love to see the extractor get this smart, but I'm not holding my breath.
Tom



--
Kind Regards
Mohamed Morsey
Department of Computer Science
University of Leipzig


