Hi,

I'm a graduate student working on a research project using Wikipedia's
categorization system.  I stumbled onto DBpedia while trying to discover
the best way to access this data and have been immersing myself in the
wonderful world of the Semantic Web, RDF, SPARQL, DBpedia, and the inner
workings of Wikipedia ever since.

First, thanks for such a wonderfully structured access point to this
information!

Second, in the application I'm developing I'd like to combine two (or
potentially more) resources into one idea.  For example, the article "Pasta"
and the category "Category:Pasta" both describe (roughly) the same idea.
Right now I'm just matching labels.  This becomes a problem, though, when the
resources don't have the same name.  Pluralization is one of the most common
differences (e.g. "Sandwich" vs. "Category:Sandwiches"), but there are many
cases that can't be fixed by a simple pluralization change.  Wikipedia,
though, has a system for manually denoting these "eponymous categories" and
assigning main articles to categories.
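
As a rough illustration of the label matching described above and where it
breaks down, a minimal Python sketch might look like the following.  The
function names and the naive pluralization rules are assumptions made up for
this example; they are not taken from the actual application or from DBpedia.

```python
# Hypothetical helpers illustrating naive label matching between an
# article label and a category label. The pluralization rules are
# deliberately simplistic and are an assumption of this sketch.

def strip_category_prefix(label):
    """Remove a leading 'Category:' prefix if present."""
    prefix = "Category:"
    return label[len(prefix):] if label.startswith(prefix) else label


def naive_plural_forms(word):
    """Return the word plus a few regular English plural candidates."""
    forms = {word, word + "s", word + "es"}
    if word.endswith("y"):
        forms.add(word[:-1] + "ies")
    return forms


def labels_match(article_label, category_label):
    """True if the category name equals the article name or a naive plural."""
    article = article_label.strip().lower()
    category = strip_category_prefix(category_label.strip()).lower()
    return category in naive_plural_forms(article)
```

With these helpers, labels_match("Pasta", "Category:Pasta") and
labels_match("Sandwich", "Category:Sandwiches") both succeed, but an
irregular plural such as "Person" vs. "Category:People" does not, which is
the kind of case label matching alone cannot handle.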

For example, see Wikipedia's category page for Pasta [1].  At the top we are
given a link to the "main article for this category".  At the beginning of
the list of pages in the category we see "List of Pasta", and under the "*"
heading we see the articles "Pasta" and "Noodles".  The systems for denoting
these connections are described in detail at [2], [3], and [4].  This
information, however, is not present in DBpedia.
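
To make the idea concrete, here is a hedged Python sketch of how these two
signals might be pulled out of raw wikitext: the main-article template on a
category page, and blank or "*" sort keys on an article's category links.
The template names matched ("cat main" and "catmore"), the regular
expressions, and the function names are all assumptions for illustration;
this is not how the DBpedia extraction framework works, and real wikitext
has many more variations.

```python
import re

# Assumed template names for "main article of this category"; other
# spellings or redirects may exist on Wikipedia and are not handled.
CAT_MAIN_RE = re.compile(
    r"\{\{\s*(?:cat main|catmore)\s*\|\s*([^}|]+)", re.IGNORECASE
)

# Matches [[Category:Name]] and [[Category:Name|sortkey]].
CATEGORY_LINK_RE = re.compile(
    r"\[\[\s*Category\s*:\s*([^\]|]+?)\s*(?:\|([^\]]*))?\]\]", re.IGNORECASE
)


def main_articles(category_wikitext):
    """Return article titles named in main-article templates on a category page."""
    return [m.group(1).strip() for m in CAT_MAIN_RE.finditer(category_wikitext)]


def special_sort_keys(article_wikitext):
    """Map category name -> sort key for links whose key is blank or starts with '*'."""
    result = {}
    for m in CATEGORY_LINK_RE.finditer(article_wikitext):
        name, key = m.group(1).strip(), m.group(2)
        if key is not None and (key.strip() == "" or key.startswith("*")):
            result[name] = key
    return result
```

For instance, main_articles("{{Cat main|Pasta}}") yields ["Pasta"], and an
article containing "[[Category:Pasta| ]]" would be reported by
special_sort_keys as having a blank sort key in the category "Pasta".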

I feel like this would be a valuable addition to the category information
already available, but I don't want to pretend to know how to go about
extracting this information or how to denote it (some SKOS predicate
maybe?).  Does anyone else think this would be useful information?  Is
anyone familiar enough with the current extraction framework to know how
doable this is, or why it hasn't been done before?  I know I won't be able
to work it into my project (deadlines...), but this project has exposed me
to so many new ideas and technologies that I now feel invested in them all.

Cheers,

Matt

[1] http://en.wikipedia.org/wiki/Category:Pasta
[2] http://en.wikipedia.org/wiki/Wikipedia:Categorization#Eponymous_categories
[3] http://en.wikipedia.org/wiki/Wikipedia:FAQ/Categorization#The_.7B.7Bcatmore.7D.7D_template
[4] http://en.wikipedia.org/wiki/Wikipedia:Categorization#Typical_sort_keys

--
Matt Mullins
Computer Science Department, Western Washington University