Hi Prashanth,

On 02/20/2013 11:37 PM, Prashanth Swaminathan wrote:


On Wed, Feb 20, 2013 at 8:34 PM, Mohamed Morsey <mor...@informatik.uni-leipzig.de <mailto:mor...@informatik.uni-leipzig.de>> wrote:

    Hi Prashanth,


    On 02/20/2013 03:42 PM, Prashanth Swaminathan wrote:
    Hi,

    I'm trying to work with NLPReduce
    
(https://files.ifi.uzh.ch/ddis/oldweb/ddis/research/talking-to-the-semantic-web/nlpreduce/index.html)

    That project comes with a few data sets in OWL.

    I want to try queries with respect to "people" (questions related
    to people).
    Does DBpedia have any test data set in OWL about people that I
    can use in NLPReduce?

    Do you mean a dataset containing data about people only (no other
    types)?



    Thanks!
--
Regards
    Prashanth Swaminathan
    www.prashanths.in <http://prashanths.in>


--
Kind Regards
    Mohamed Morsey
    Department of Computer Science
    University of Leipzig

Ah, yes, exactly.
Thanks!

Actually, there is no such dataset (i.e., a dataset for one specific type), but you can still generate it yourself, as follows:

1. Get a list of all people in DBpedia. This can be done with SPARQL
   queries, increasing the "OFFSET" in each call, as follows:

       a- SELECT * WHERE { ?s a dbpedia-owl:Person } LIMIT 1000
       b- SELECT * WHERE { ?s a dbpedia-owl:Person } LIMIT 1000 OFFSET 1000
       c- SELECT * WHERE { ?s a dbpedia-owl:Person } LIMIT 1000 OFFSET 2000
       and so on.

2. Append the results returned each time to a file.
3. After finishing the last step, you should have a file with a full
   list of all people in DBpedia.
4. Iterate through that file and pick a resource each time.
5. Replace the word "resource" in the URI with the word "data" and
   append ".ntriples" to the end if you want the output format to be
   N-Triples, e.g. for the resource
   "http://dbpedia.org/resource/Lionel_Messi" you should use
   "http://dbpedia.org/data/Lionel_Messi.ntriples" to get its full
   data in N-Triples.
6. Use the URI you have just created in the last step to fetch the
   data, using the Linux "curl" command or an equivalent tool if you
   are using another OS.
7. Append the returned triples to a file.
8. By the end of that step you should have a full dump of people only.
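To illustrate, the steps above can be sketched in Python. The endpoint URL, page size, and function names here are my own choices, not part of DBpedia's tooling, and in practice you would also need to URL-encode the SPARQL query and pick a result format when calling the endpoint:

```python
import urllib.request

ENDPOINT = "http://dbpedia.org/sparql"  # public DBpedia SPARQL endpoint
PAGE_SIZE = 1000                        # rows per query, as in a-, b-, c- above

def people_query(offset):
    # Build one paginated query (steps 1a-1c); OFFSET 0 matches query "a".
    return ("SELECT * WHERE { ?s a dbpedia-owl:Person } "
            "LIMIT %d OFFSET %d" % (PAGE_SIZE, offset))

def data_url(resource_uri):
    # Step 5: swap "resource" for "data" and append ".ntriples".
    return resource_uri.replace("/resource/", "/data/") + ".ntriples"

def fetch(url):
    # Step 6: download the triples (the equivalent of the "curl" call).
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8")
```

You would loop `people_query` until a page comes back empty (steps 1-3), then call `fetch(data_url(...))` for each resource and append the result to your dump file (steps 4-8).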

Hope that helps.

--
Kind Regards
Mohamed Morsey
Department of Computer Science
University of Leipzig

_______________________________________________
Dbpedia-discussion mailing list
Dbpedia-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion