Doug from Wikisource started a page over at Meta: http://meta.wikimedia.org/wiki/Beyond_categories
I'll be trying to fill in some of my understanding of the problem and the
scope of a possible solution. I recognize there's been a lot of prior art on
this issue, and a lot of existing overlapping tools and infrastructure, and
I'm pretty new around here, and apt to be inaccurate and naive. So I do hope
others with more experience will come and help sort it out.

Chris

On Sun, May 5, 2013 at 11:06 AM, Michael Hale <hale.michael...@live.com> wrote:
> As far as checking the import progress of Wikidata, the category American
> women writers has 1479 articles. 651 of them currently have a main type
> (GND), 328 have a sex, 162 have an occupation, 111 have a country of
> citizenship, 49 have a sexual orientation, 39 have a place of birth, etc.
>
>> From: j...@sahnwaldt.de
>> Date: Sun, 5 May 2013 16:28:14 +0200
>> To: wikidata-l@lists.wikimedia.org
>> Subject: Re: [Wikidata-l] Question about wikipedia categories.
>>
>> Hi Pat,
>>
>> I've been involved with DBpedia for several years, so these are
>> interesting thoughts.
>>
>> On 5 May 2013 01:25, Patrick Cassidy <p...@micra.com> wrote:
>> > If one is interested in a functional “category” system, it would be very
>> > helpful to have a good logic-based ontology as the backbone.
>> >
>> > I haven’t looked recently, but when I inquired about the ontology used by
>> > DBpedia a year ago, I was referred to “dbpedia-ontology.owl”, an ontology
>> > in the “semantic web” ontology format OWL. The OWL format is excellent
>> > for simple purposes, but the dbpedia-ontology.owl (at that time) was not
>> > well-structured (being very polite).
>>
>> Do you mean just the file dbpedia-ontology.owl or the DBpedia ontology
>> in general? We still use OWL as our main format for publishing the
>> ontology. The file is generated automatically. Maybe the generation
>> process could be improved.
>>
>> > I did inquire as to who was maintaining the ontology, and had a hard time
>> > figuring out how to help bring it up to professional standards. But it was
>> > like punching jello, nothing to grasp onto. I gave up, having other useful
>> > things to do with my time.
>>
>> The ontology is maintained by a community that everyone can join at
>> http://mappings.dbpedia.org/ . An overview of the current class
>> hierarchy is here:
>> http://mappings.dbpedia.org/server/ontology/classes/ . You're more
>> than welcome to help! I think talk pages are not used enough on the
>> mappings wiki, so if you have ideas, misgivings or questions about the
>> DBpedia ontology, the place to go is probably the mailing list:
>> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>>
>> Thanks!
>>
>> Christopher
>>
>> >
>> > Perhaps it is time now, with more experience in hand, to rethink the
>> > category system starting with basics. This is not as hard as it sounds.
>> > It may require some changes where there is ambiguity or logical
>> > inconsistency, but mostly it is only necessary to link the Wikipedia
>> > categories to an ontology based on a well-structured and logically sound
>> > foundation ontology (also referred to as an “upper ontology”) that
>> > supplies the basic categories and relations. Such an ontology can provide
>> > the basic concepts, whose labels can be translated into any terminology
>> > that any local user wants to use. There are several well-structured
>> > foundation ontologies, based on over twenty years of research, but the one
>> > I suggest is the one I am most familiar with (which I created over the
>> > past seven years), called COSMO. The files at http://micra.com/COSMO will
>> > provide the ontology itself (“COSMO.owl”, in OWL) and papers describing
>> > the basic principles.
>> > COSMO is structured to be a “primitives-based foundation ontology”,
>> > containing all of the “semantic primitives” needed to describe anything
>> > one wants to talk about. All other categories are structured as logical
>> > combinations of the basic elements. Its inventory of primitives is
>> > probably incomplete, but it is able to describe everything I have been
>> > concerned with for years (7000 categories and 800 relations thus far) and
>> > can always be supplemented as required for new fields. With an OWL
>> > ontology, queries can be executed by any of several logic-based utilities.
>> > Making the query system easy for those who prefer not to build SPARQL
>> > queries (including myself) would require some programming, but that is a
>> > minuscule effort compared to what has already been put into the DBpedia
>> > database. Tools such as “Protégé” make it easy to work with an OWL
>> > ontology, and there is a web site where an OWL ontology can be developed
>> > collaboratively.
>> >
>> > I will be willing to put some effort into this and assist anyone who wants
>> > to use the COSMO ontology for this project. If those who are in charge of
>> > maintaining the ontology (is anyone?) would like to discuss this at
>> > greater length, send me an email or telephone me. All those who are
>> > interested in this topic may also feel free to contact me, or to discuss
>> > this thread on the list. I suggest the thread title “Foundation Ontology”.
>> >
>> > Pat
>> >
>> > Patrick Cassidy
>> > MICRA Inc.
>> > cass...@micra.com
>> > 908-561-3416
>> >
>> > From: wikidata-l-boun...@lists.wikimedia.org
>> > [mailto:wikidata-l-boun...@lists.wikimedia.org] On Behalf Of Michael Hale
>> > Sent: Saturday, May 04, 2013 2:57 AM
>> > To: Discussion list for the Wikidata project.
>> > Subject: Re: [Wikidata-l] Question about wikipedia categories.
>> >
>> > I think it's important to consider the distinction between a category
>> > system and semantic queries. I think it's very likely that DBpedia and
>> > Wikidata will converge over time and develop a query interface simple
>> > enough that fewer people use the category system, because we will be able
>> > to automatically generate relevant queries related to a given article.
>> > DBpedia currently has a lot more data, but Wikidata is important for many
>> > editing scenarios. Also, in the future I think there will be a lot of
>> > content scenarios where it is natural to start by putting data into
>> > Wikidata and then including it in articles instead of just extracting
>> > information from articles. If you are familiar with query languages you
>> > can get comfortable with the DBpedia SPARQL examples in a few minutes, but
>> > for a typical reader who just wants to go from an article about a person
>> > to a list of similar people it is hard to beat scrolling down and just
>> > clicking on a category. I did a test query on DBpedia to plot all sports
>> > cars by their engine sizes, and I think for the types of things it enables
>> > you to do it is totally worth the learning curve. That being said, I think
>> > the category system has a lot of potential for better browsing scenarios
>> > as opposed to queries. I've been making a tool that mixes the article view
>> > data with the category system. You can see a video of the basic idea here
>> > and a screenshot of football league popularity split by language.
>> > http://en.wikipedia.org/wiki/User:Wakebrdkid/Popular_category_browsing
>> > I'm currently multiplying the Chinese traffic by 30 to try and account for
>> > Baidu Baike.
>> >
>> >> Date: Sat, 4 May 2013 08:14:54 +0200
>> >> From: jane...@gmail.com
>> >> To: wikidata-l@lists.wikimedia.org
>> >> Subject: Re: [Wikidata-l] Question about wikipedia categories.
>> >>
>> >> Wondering exactly the same thing - my frustrations with categories
>> >> began about three years ago and it seems I am surprised monthly by
>> >> severe limitations to this outdated apparatus. I am a heavy category
>> >> user, but I would love to be able to kick it out the door in favour of
>> >> a more structured method. As far as I can tell, there is very little
>> >> synchronisation among language Wikipedias of category trees, and being
>> >> able to apply a central structure to all Wikipedias through Wikidata
>> >> sounds like a great idea, and one which would not disturb the current
>> >> category trees we already have, but supplement them. As I see it, some
>> >> category structures are OK, but when categories get big, people split
>> >> them in non-standard ways, causing problems like this recent
>> >> media-hype regarding female novelists. I think that it's great this
>> >> is in the news in this way, because I am sure that most Wikipedia
>> >> readers never knew we had categories, and this is a great introduction
>> >> to them, as well as an invitation to edit Wikipedia.
>> >>
>> >> 2013/5/4, Chris Maloney <voldr...@gmail.com>:
>> >> > I am just curious if there has ever been discussion about the
>> >> > potential for reimplementing / replacing the category system in
>> >> > Wikipedia with semantic tagging in Wikidata. It seems to me that the
>> >> > recent kerfuffle with regard to "American women writers" would not
>> >> > have happened if the pages were tagged with simple RDF assertions
>> >> > instead of these convoluted categories. I know, of course, that it
>> >> > would be a huge undertaking, but I just don't see how the category
>> >> > system can continue to scale (I'm amazed it has scaled as well as it
>> >> > has already, of course).
>> >> >
>> >> > I am trying to learn more about Wikidata, and have perused the
>> >> > various infos and FAQs for the last two hours, and can't find any
>> >> > discussion of this particular issue.
>> >> >
>> >> > -- Chris
>> >> >
>> >> > _______________________________________________
>> >> > Wikidata-l mailing list
>> >> > Wikidata-l@lists.wikimedia.org
>> >> > https://lists.wikimedia.org/mailman/listinfo/wikidata-l