I think you need to start redesigns by considering scenarios with specific
examples. What are the tasks that I want to do that I can't currently do? What
are the tasks that I do so often that they should be easier? Can you provide an
example of some DBpedia queries that are awkward due to the current ontology
and would be improved by COSMO? Things like that typically change gradually on
wiki projects rather than by starting over. Here are the beginnings of a visual
query interface for Wikidata: http://toolserver.org/~magnus/ts2/wdq/

Also, the English Wikipedia currently has about a million categories. Some of
them are used for infrastructure purposes, but the vast majority of them are
content categories. The original issue was that not all categories that contain
people have subcategories for males and females. With Wikidata we will be able
to say, "The vast majority of these articles have a statement describing their
sex, so we can place a filter for that property on this category page." Then we
can keep the organic nature of the category system without having to worry
about categories like "American male Democratic politicians". We could just
have a politicians category, and the system would be smart enough to know that
it should put gender, nationality, and party filters on that page. Then
eventually we might have enough structured data to eliminate the category
system. That is how I imagine the progression, at least.
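
To make the filter idea concrete, here is a rough Python sketch. It is only an
illustration under my own assumptions: the article titles, property names, and
the 90% coverage threshold are all invented, and nothing here is an existing
Wikidata or MediaWiki API. A property is offered as a filter on a category page
only when most member articles carry a statement for it.

```python
from collections import Counter


def suggest_filters(articles, min_coverage=0.9):
    """Return properties that at least min_coverage of the articles have."""
    if not articles:
        return []
    counts = Counter()
    for statements in articles.values():
        counts.update(set(statements))  # count each property once per article
    total = len(articles)
    return sorted(p for p, n in counts.items() if n / total >= min_coverage)


def apply_filters(articles, **wanted):
    """Return titles whose statements match every requested property value."""
    return sorted(
        title
        for title, statements in articles.items()
        if all(statements.get(prop) == value for prop, value in wanted.items())
    )


# Hypothetical members of a plain "politicians" category (invented data).
politicians = {
    "Article A": {"sex": "male", "nationality": "American", "party": "Democratic"},
    "Article B": {"sex": "female", "nationality": "American", "party": "Republican"},
    "Article C": {"sex": "female", "nationality": "Canadian", "party": "Liberal",
                  "height": "170 cm"},
    "Article D": {"sex": "male", "nationality": "American", "party": "Democratic"},
}

print(suggest_filters(politicians))
print(apply_filters(politicians, sex="male", nationality="American",
                    party="Democratic"))
```

With this toy data, "height" is not offered as a filter because only one
article has it, and the last call recovers what "American male Democratic
politicians" would have contained without that category having to exist.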

From: p...@micra.com
To: wikidata-l@lists.wikimedia.org
Date: Sat, 4 May 2013 19:25:25 -0400
Subject: Re: [Wikidata-l] Question about wikipedia categories.

If one is interested in a functional “category”
system, it would be very helpful to have a good logic-based ontology as the
backbone.

I haven’t looked recently, but when I inquired about the ontology used by
DBpedia a year ago, I was referred to “dbpedia-ontology.owl”, an ontology in
OWL, the “semantic web” ontology format.  The OWL format is excellent for
simple purposes, but dbpedia-ontology.owl (at that time) was not
well-structured (being very polite).  I did inquire as to who was maintaining
the ontology, and had a hard time figuring out how to help bring it up to
professional standards.  But it was like punching jello, nothing to grasp
onto.  I gave up, having other useful things to do with my time.

Perhaps it is time now, with more experience in hand, to rethink the category
system starting with basics.  This is not as hard as it sounds.  It may
require some changes where there is ambiguity or logical inconsistency, but
mostly it is only necessary to link the Wikipedia categories to an ontology
based on a well-structured and logically sound foundation ontology (also
referred to as an “upper ontology”) that supplies the basic categories and
relations.  Such an ontology can provide the basic concepts, whose labels can
be translated into any terminology that any local user wants to use.  There
are several well-structured foundation ontologies, based on over twenty years
of research, but the one I suggest is the one I am most familiar with (which I
created over the past seven years), called COSMO.  The files at
http://micra.com/COSMO will provide the ontology itself (“COSMO.owl”, in OWL)
and papers describing the basic principles.  COSMO is structured to be a
“primitives-based foundation ontology”, containing all of the “semantic
primitives” needed to describe anything one wants to talk about.  All other
categories are structured as logical combinations of the basic elements.  Its
inventory of primitives is probably incomplete, but it is able to describe
everything I have been concerned with for years (7000 categories and 800
relations thus far) and can always be supplemented as required for new fields.
With an OWL ontology, queries can be executed by any of several logic-based
utilities.  Making the query system easy for those who prefer not to build
SPARQL queries (including myself) would require some programming, but that is
a minuscule effort compared to what has already been put into the DBpedia
database.  Tools such as “Protégé” make it easy to work with an OWL ontology,
and there is a web site where an OWL ontology can be developed
collaboratively.
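
As a toy illustration of what “logical combinations of the basic elements”
can mean, the Python sketch below treats a category as a conjunction of
primitives and checks whether one category subsumes another.  This is not how
COSMO.owl is actually structured, and the primitive names are invented for
the example; it only shows the general idea.

```python
# Hypothetical primitives; these are NOT taken from COSMO.
PRIMITIVES = {"Person", "Female", "Male", "Writer", "Politician", "American"}


def define(*parts):
    """Define a category as a conjunction of known primitives."""
    unknown = set(parts) - PRIMITIVES
    if unknown:
        raise ValueError(f"not primitives: {sorted(unknown)}")
    return frozenset(parts)


def subsumes(general, specific):
    """A category subsumes another if all of its conditions are required by it."""
    return general <= specific


writer = define("Person", "Writer")
american_woman_writer = define("Person", "Writer", "Female", "American")

print(subsumes(writer, american_woman_writer))
print(subsumes(american_woman_writer, writer))
```

Under this reading, a narrow category such as “American women writers” never
has to be maintained by hand; membership in the broader category follows from
the definition.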

I will be willing to put some effort into this and assist anyone who wants to
use the COSMO ontology for this project.  If those who are in charge of
maintaining the ontology (is anyone?) would like to discuss this at greater
length, send me an email or telephone me.  All those who are interested in
this topic may also feel free to contact me, or to discuss this thread on the
list.  I suggest the thread title “Foundation Ontology”.

Pat

Patrick Cassidy
MICRA Inc.
cass...@micra.com
908-561-3416

From: wikidata-l-boun...@lists.wikimedia.org
[mailto:wikidata-l-boun...@lists.wikimedia.org] On Behalf Of Michael Hale
Sent: Saturday, May 04, 2013 2:57 AM
To: Discussion list for the Wikidata project.
Subject: Re: [Wikidata-l] Question about wikipedia categories.

I think it's important to consider the distinction between a category system
and semantic queries. I think it's very likely that DBpedia and Wikidata will
converge over time and develop a query interface simple enough that fewer
people use the category system, because we will be able to automatically
generate relevant queries related to a given article. DBpedia currently has a
lot more data, but Wikidata is important for many editing scenarios. Also, in
the future I think there will be a lot of content scenarios where it is natural
to start by putting data into Wikidata and then including it in articles
instead of just extracting information from articles. If you are familiar with
query languages you can get comfortable with the DBpedia SPARQL examples in a
few minutes, but for a typical reader who just wants to go from an article
about a person to a list of similar people it is hard to beat scrolling down
and just clicking on a category. I did a test query on DBpedia to plot all
sports cars by their engine sizes, and I think for the types of things it
enables you to do it is totally worth the learning curve. That being said, I
think the category system has a lot of potential for better browsing scenarios
as opposed to queries. I've been making a tool that mixes the article view data
with the category system. You can see a video of the basic idea, along with a
screenshot of football league popularity split by language, here:
http://en.wikipedia.org/wiki/User:Wakebrdkid/Popular_category_browsing
I'm currently multiplying the Chinese traffic by 30 to try to account for
Baidu Baike.
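
For what it's worth, the traffic adjustment amounts to something like the
Python sketch below. The view counts and topic are invented for illustration
and the function name is made up; only the factor of 30 for Chinese comes from
what I described above.

```python
def weighted_popularity(views_by_language, multipliers=None):
    """Scale raw page views per language edition by an optional multiplier."""
    multipliers = multipliers or {}
    return {
        lang: views * multipliers.get(lang, 1)
        for lang, views in views_by_language.items()
    }


# Invented monthly view counts for one topic, keyed by Wikipedia language code.
raw_views = {"en": 9000, "de": 1500, "zh": 200}

# Chinese traffic is multiplied by 30 as a crude correction for readers
# who use Baidu Baike instead of Wikipedia.
adjusted = weighted_popularity(raw_views, {"zh": 30})
ranking = sorted(adjusted, key=adjusted.get, reverse=True)

print(adjusted)
print(ranking)
```

A single constant multiplier is obviously crude, but it is enough to keep the
Chinese numbers from vanishing next to English in a by-language comparison.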



> Date: Sat, 4 May 2013 08:14:54 +0200
> From: jane...@gmail.com
> To: wikidata-l@lists.wikimedia.org
> Subject: Re: [Wikidata-l] Question about wikipedia categories.
>
> Wondering exactly the same thing - my frustrations with categories
> began about three years ago and it seems I am surprised monthly by
> severe limitations to this outdated apparatus. I am a heavy category
> user, but I would love to be able to kick it out the door in favour of
> a more structured method. As far as I can tell, there is very little
> synchronisation among language Wikipedias of category trees, and being
> able to apply a central structure to all Wikipedias through Wikidata
> sounds like a great idea, and one which would not disturb the current
> category trees we already have, but supplement them. As I see it, some
> category structures are OK, but when categories get big, people split
> them in non-standard ways, causing problems like this recent
> media-hype regarding female novelists. I think that it's great this
> is in the news in this way, because I am sure that most Wikipedia
> readers never knew we had categories, and this is a great introduction
> to them, as well as an invitation to edit Wikipedia.
>
> 2013/5/4, Chris Maloney <voldr...@gmail.com>:
> > I am just curious if there has ever been discussion about the
> > potential for reimplementing / replacing the category system in
> > Wikipedia with semantic tagging in Wikidata. It seems to me that the
> > recent kerfuffle with regards to "American women writers" would not
> > have happened if the pages were tagged with simple RDF assertions
> > instead of these convoluted categories. I know, of course, that it
> > would be a huge undertaking, but I just don't see how the category
> > system can continue to scale (I'm amazed it has scaled as well as it
> > has already, of course).
> >
> > I am trying to learn more about Wikidata, and have perused the various
> > infos and FAQs for the last two hours, and can't find any discussion
> > of this particular issue.
> >
> > -- Chris
> >
> > _______________________________________________
> > Wikidata-l mailing list
> > Wikidata-l@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wikidata-l