Re: [Dbpedia-discussion] A problem with categories datasets in DB3.8

2013-01-29 Thread Dimitris Kontokostas
The framework is written in Java / Scala, so in order to expand a template instance inside an article there are two options: 1) re-implement (part of) the PHP MW engine in Java / Scala, or 2) use the MW API during the extraction process. Both approaches introduce complexities & dependencies fo
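
A minimal sketch of option 2, written in Python for brevity rather than in the framework's Java / Scala. It only builds a request URL for the MediaWiki API's `expandtemplates` action and does not send it. The function name `build_expand_url` and the choice of the English Wikipedia endpoint are assumptions made for illustration, not part of the DBpedia framework; the sample template call is the one quoted in another message of this thread.

```python
from urllib.parse import urlencode

# Assumed endpoint for illustration: the public English Wikipedia API.
DEFAULT_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_expand_url(wikitext, endpoint=DEFAULT_ENDPOINT):
    """Build a GET URL asking MediaWiki to expand the templates in `wikitext`.

    Hypothetical helper: it only assembles the query string, so it can be
    inspected or tested without any network access.
    """
    params = {
        "action": "expandtemplates",  # MediaWiki API module for template expansion
        "format": "json",             # ask for a JSON response
        "prop": "wikitext",           # return the expanded wikitext
        "text": wikitext,             # the wikitext containing the template call
    }
    return endpoint + "?" + urlencode(params)


# Template call quoted elsewhere in this thread.
template = "{{nationality by occupation|Country=United Kingdom|Nationality=British}}"
url = build_expand_url(template)
print(url)
```

Sending one such request per template instance during extraction is where the extra dependency on a running MediaWiki (or on the public API) would come from.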

Re: [Dbpedia-discussion] A problem with categories datasets in DB3.8

2013-01-28 Thread Ning Zhang
Thanks for the explanation :) I fully understand the situation and I am really impressed by the great framework you have built. However, I just want to understand what happens here. If this problem comes because the CategoryExtractor does not know how to expand some templates, how can we solve it by ch

Re: [Dbpedia-discussion] A problem with categories datasets in DB3.8

2013-01-28 Thread Dimitris Kontokostas
Hey Ning, This is not exactly a bug. The problem with templates is that they are usually very complicated and we would have to re-implement the MediaWiki engine in Java in order to parse and expand them correctly (or come up with another alternative like using the MW API somehow in the extraction pro

Re: [Dbpedia-discussion] A problem with categories datasets in DB3.8

2013-01-28 Thread Ning Zhang
Thank you all. @Dimitris & Andrea, if it turns out to be a bug in the extractor, could you give me a rough estimate of how long it will take to re-extract it? I will keep following this discussion; just let me know when a conclusion is reached or if there is any further work I can help with. Best, Ning

Re: [Dbpedia-discussion] A problem with categories datasets in DB3.8

2013-01-28 Thread Dimitris Kontokostas
Don't worry :) This is an old issue. In this case, Wikipedia applies categories through special templates, for instance: {{nationality by occupation|Country=United Kingdom|Nationality=British}} The framework c

Re: [Dbpedia-discussion] A problem with categories datasets in DB3.8

2013-01-28 Thread Andrea Di Menna
I am sorry :) I meant there exists a link in those pages (I picked the wrong words to express myself). There could be an issue in the SkosCategoriesExtractor. If I am not wrong, the triple should be collected when analysing the http://en.wikipedia.org/wiki/Category:British_people_by_occupation art

Re: [Dbpedia-discussion] A problem with categories datasets in DB3.8

2013-01-28 Thread Dimitris Kontokostas
Thanks Andrea. @Ning, DBpedia tries to be an exact semantic mirror of Wikipedia, so if you want to fix these "errors" you should try to fix them at the source (which is Wikipedia), and in the next DBpedia release they will be fixed. Best, Dimitris On Mon, Jan 28, 2013 at 4:14 PM, Andrea Di Menna

Re: [Dbpedia-discussion] A problem with categories datasets in DB3.8

2013-01-28 Thread Andrea Di Menna
Hi Dimitris, it does not seem so: http://en.wikipedia.org/wiki/Category:British_people_by_occupation?oldid=489570899 http://en.wikipedia.org/wiki/Category:British_people?oldid=494233120 Cheers Andrea 2013/1/28 Dimitris Kontokostas > Hi Ning, > > Can you please confirm that the same thing does n

Re: [Dbpedia-discussion] A problem with categories datasets in DB3.8

2013-01-28 Thread Dimitris Kontokostas
Hi Ning, Can you please confirm that the same thing does not happen in Wikipedia too? Best, Dimitris On Mon, Jan 28, 2013 at 7:06 AM, Ning Zhang wrote: > Hi Friends, > > I want to extract wiki articles category graph and find your datasets > fortunately to avoid parsing the huge dump by mysel

[Dbpedia-discussion] A problem with categories datasets in DB3.8

2013-01-27 Thread Ning Zhang
Hi Friends, I want to extract the category graph of Wikipedia articles and was fortunate to find your datasets, which saves me from parsing the huge dump myself. Thank you so much for the effort. However, I found something strange while doing a BFS on the graph built from the Categories (SKOS) dataset: there are lots of category nodes that canno
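
The kind of BFS check described in this message could be sketched as follows. This is an illustrative reconstruction, not the poster's actual script: it assumes the dataset is read as N-Triples lines with `skos:broader` edges pointing from a category to its parent, and the three sample triples are toy data made up to mimic a missing parent link.

```python
import re
from collections import defaultdict, deque

SKOS_BROADER = "http://www.w3.org/2004/02/skos/core#broader"
# Simplified N-Triples pattern: three IRIs in angle brackets, then a dot.
TRIPLE = re.compile(r"<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.")


def load_children(lines):
    """Return (parent -> set of children, set of all category nodes)."""
    children = defaultdict(set)
    nodes = set()
    for line in lines:
        match = TRIPLE.match(line.strip())
        if not match or match.group(2) != SKOS_BROADER:
            continue  # skip labels, types, comments and blank lines
        child, _, parent = match.groups()
        children[parent].add(child)
        nodes.update((child, parent))
    return children, nodes


def reachable_from(root, children):
    """Breadth-first search from `root` down the parent -> child edges."""
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


# Toy data (hypothetical): the edge from British_people_by_occupation
# to British_people is deliberately absent.
C = "http://dbpedia.org/resource/Category:"
sample = [
    f"<{C}British_people> <{SKOS_BROADER}> <{C}People> .",
    f"<{C}British_scientists> <{SKOS_BROADER}> <{C}British_people_by_occupation> .",
    f"<{C}People> <http://www.w3.org/2004/02/skos/core#prefLabel> <{C}ignored> .",
]

children, nodes = load_children(sample)
unreachable = nodes - reachable_from(C + "People", children)
print(sorted(unreachable))
```

With a real dump, `sample` would be replaced by the lines of the SKOS categories file, and the root by whichever top-level category the traversal starts from.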