Hello,

Omid Rouhani wrote:
> Hi,
>
> I'm curious about the "Cleanded Wikipedia Category Class (CWCC)
> Hierarchy" dataset.
> I read the quite short description available at
> "http://wiki.dbpedia.org/Downloads#cleandedwikipediacategoryclass(cwcc)hierarchy",
> however is there any other documentation about what exactly this is or
> what the status of the project is. Is someone currently working on it?
> Do we have some estimate of when we think a new version of the dataset
> might be released?
>
> In case no formal documentation exists as of today, perhaps some of
> the people behind the project are on this list and can share with us
> some informal description of what work has been done so far and what
> is planned for the future.
There is a piece of information on the download page:

"The aim of this class hierarchy is to be close to the Wikipedia category system, but without some of its obstacles, e.g. cycles of categories, administrative categories, categories which represent instances instead of classes etc. However, the current extraction script contains some bugs and data cleansing still insufficient to be useful in applications. For this reason, the data set is not published in the SPARQL endpoint."

Currently, no one is working on improving this data set, because we are busy with other activities. So I cannot make any estimate whether and when there will be a new version.

As always, the code is publicly available in the DBpedia SVN. If you are interested in improving it, I can give you advice on what needs to be done.

Kind regards,

Jens