[Dbpedia-discussion] Mapping extractor generates only 1 triple when a property has multiple objects

2012-09-24 Thread Marco Fossati
And here is another open issue. Cheers, Marco http://it.wikipedia.org/wiki/Glenn_Danzig contains: |genere = Heavy Metal |genere2 = Alternative Metal |genere3 = Punk rock |genere4 = Hardcore punk all properties map to the same dbpedia-owl:genre property, but only 2 are extracted i.e. 'Punk_rock'

Re: [Dbpedia-discussion] Mapping extractor generates only 1 triple when a property has multiple objects

2012-05-15 Thread Marco Fossati
On 5/15/12 11:59 AM, Marco Amadori wrote: I propose the following algorithm: Let's keep the current codebase that produces a triple if and only if a same cased wikilink is present elsewhere in the page. This time it does not trash the triple if it does not find the link, instead the code

Re: [Dbpedia-discussion] Mapping extractor generates only 1 triple when a property has multiple objects

2012-05-15 Thread Pablo Mendes
+1 on the proposed solution +1 on the beer* Also, -ambiguous can contain triples where the property is defined as ObjectProperty and the value is a String. I've heard that there is a tool out there called DBpedia Spotlight (or something like that) that could be used to disambiguate these links
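As an illustration of the "-ambiguous" idea discussed above, here is a minimal Python sketch. It is not the DBpedia extraction framework's code (which is Scala), and the function name `split_triples`, the tuple layout, and the sample data are all hypothetical. It assumes that values of an object property with a matching same-cased wikilink go to the main output, while the remaining string values are kept in a separate "-ambiguous" collection instead of being dropped.

```python
from typing import Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]

def split_triples(subject: str, prop: str, values: Iterable[str],
                  page_links: Set[str]) -> Tuple[List[Triple], List[Triple]]:
    """Partition values into resolved triples and ambiguous (string-valued) ones.

    Hypothetical helper: a value counts as resolved only if a same-cased
    wikilink target exists in page_links.
    """
    resolved: List[Triple] = []
    ambiguous: List[Triple] = []
    for value in values:
        value = value.strip()
        if value in page_links:
            # Resource-style object name, spaces replaced by underscores.
            resolved.append((subject, prop, value.replace(" ", "_")))
        else:
            # Keep the raw string so it can be disambiguated later.
            ambiguous.append((subject, prop, value))
    return resolved, ambiguous

# Invented sample data modeled on the Glenn Danzig infobox values.
main, unsure = split_triples(
    "Glenn_Danzig", "genre",
    ["Heavy Metal", "Alternative Metal", "Punk rock", "Hardcore punk"],
    {"Punk rock", "Heavy metal"},
)
print(main)
print(unsure)
```

With this sample data only "Punk rock" lands in the main list; the other three values stay available in the ambiguous list for a later disambiguation step.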

Re: [Dbpedia-discussion] Mapping extractor generates only 1 triple when a property has multiple objects

2012-05-15 Thread Marco Amadori
2012/5/15 Pablo Mendes pablomen...@gmail.com: +1 on the proposed solution +1 on the beer* Also, -ambiguous can contain triples where the property is defined as ObjectProperty and the value is a String. Wait, the algorithm mentioned above is exactly about that: -ambiguous contains triples

Re: [Dbpedia-discussion] Mapping extractor generates only 1 triple when a property has multiple objects

2012-05-15 Thread Marco Amadori
2012/5/15 Pablo Mendes pablomen...@gmail.com: Alright. Marco, so will you give a go on the -ambiguous creation? I was hoping that Jona could take that on, since he knows the DBpedia extractor codebase an order of magnitude better than I do. It could take ages for me to do that since I would need

Re: [Dbpedia-discussion] Mapping extractor generates only 1 triple when a property has multiple objects

2012-05-13 Thread Jona Christopher Sahnwaldt
On Thu, May 10, 2012 at 9:09 PM, Marco Amadori marco.amad...@gmail.com wrote: On Thursday 10 May 2012 21:06:30 Jona Christopher Sahnwaldt wrote: I think what Marco meant was: the mapping says it's an object property, so we should extract a URI, even if the property value is just a string.

Re: [Dbpedia-discussion] Mapping extractor generates only 1 triple when a property has multiple objects

2012-05-13 Thread Marco Amadori
On Sunday 13 May 2012 21:04:05 Jona Christopher Sahnwaldt wrote: On Thu, May 10, 2012 at 9:09 PM, Marco Amadori marco.amad...@gmail.com wrote: On Thursday 10 May 2012 21:06:30 Jona Christopher Sahnwaldt wrote: I think what Marco meant was: the mapping says it's an object property, so we

Re: [Dbpedia-discussion] Mapping extractor generates only 1 triple when a property has multiple objects

2012-05-13 Thread Jona Christopher Sahnwaldt
On Sun, May 13, 2012 at 9:13 PM, Marco Amadori marco.amad...@gmail.com wrote: On Sunday 13 May 2012 21:04:05 Jona Christopher Sahnwaldt wrote: On Thu, May 10, 2012 at 9:09 PM, Marco Amadori marco.amad...@gmail.com wrote: On Thursday 10 May 2012 21:06:30 Jona Christopher Sahnwaldt wrote: I

Re: [Dbpedia-discussion] Mapping extractor generates only 1 triple when a property has multiple objects

2012-05-13 Thread Marco Amadori
On Monday 14 May 2012 00:08:02 Jona Christopher Sahnwaldt wrote: In my requirements, this will happen if and only if the same happens in mediawiki code, or in other words the DBpedia heuristic is the same as Mediawiki's one. If the mediawiki template engine would produce a wikilink we

Re: [Dbpedia-discussion] Mapping extractor generates only 1 triple when a property has multiple objects

2012-05-10 Thread Marco Fossati
Hi Jona, Thanks for the exhaustive explanation. On 5/8/12 7:14 PM, Jona Christopher Sahnwaldt wrote: Even worse - there is a link to [[Heavy metal]], but in the infobox it's spelled Heavy Metal (with a capital M), so we don't find that link. This behavior could be considered a bug. Wikipedia

Re: [Dbpedia-discussion] Mapping extractor generates only 1 triple when a property has multiple objects

2012-05-10 Thread Pablo Mendes
Hi Marco, This smells like another thread: http://www.mail-archive.com/dbpedia-discussion@lists.sourceforge.net/msg02767.html Would it be possible to emulate Wikipedia renderer engine behavior? It is written in PHP, so it should be a piece of cake to implement it in powerful Scala. So... are

Re: [Dbpedia-discussion] Mapping extractor generates only 1 triple when a property has multiple objects

2012-05-10 Thread Marco Amadori
Would it be possible to emulate Wikipedia renderer engine behavior? It is written in PHP, so it should be a piece of cake to implement it in powerful Scala. So... are you volunteering to try it out? I think he just forgot to add an appropriate emoticon at the end of the sentence :-) --

Re: [Dbpedia-discussion] Mapping extractor generates only 1 triple when a property has multiple objects

2012-05-10 Thread Marco Amadori
2012/5/10 Pablo Mendes pablomen...@gmail.com: Hi Marco, This smells like another thread: http://www.mail-archive.com/dbpedia-discussion@lists.sourceforge.net/msg02767.html This is not similar. In this thread I was asking about a couple of bugs of misconfiguration of the DBpedia HTML page renderer

Re: [Dbpedia-discussion] Mapping extractor generates only 1 triple when a property has multiple objects

2012-05-10 Thread Marco Amadori
2012/5/10 Jona Christopher Sahnwaldt j...@sahnwaldt.de: I would guess that adding template expansion to DBpedia is a *major* task. May take several months. It would also be a *huge* benefit. :-) Sure it is, but here Marco Fossati was just asking about case insensitive wikilink discovery in

Re: [Dbpedia-discussion] Mapping extractor generates only 1 triple when a property has multiple objects

2012-05-10 Thread Pablo Mendes
Marco, My pointer to the thread was with reference to the Sweble parser usage. Rendering pages the way Wikipedia's PHP code seems to means either using their code or implementing template resolution. We've been doing the first; in that thread we suggested doing the second. Fixing this in the code

Re: [Dbpedia-discussion] Mapping extractor generates only 1 triple when a property has multiple objects

2012-05-10 Thread Jona Christopher Sahnwaldt
I know. :-) On Thu, May 10, 2012 at 1:50 PM, Marco Fossati hell.j@gmail.com wrote: On 5/10/12 12:24 PM, Jona Christopher Sahnwaldt wrote: sorry for the exhausting explanation. I meant exhaustive. :-) -- Live

Re: [Dbpedia-discussion] Mapping extractor generates only 1 triple when a property has multiple objects

2012-05-10 Thread Jona Christopher Sahnwaldt
I just made that little change. I had looked at the code before, so it was very simple. We now also get the triple for Heavy metal, but that's it: http://mappings.dbpedia.org/server/extraction/it/extract?title=Glenn+Danzig Let's hope that this doesn't introduce too many extraction errors. It's
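The effect of this change can be sketched in Python, assuming (as the rest of the thread suggests) that it makes the wikilink lookup case-insensitive. This is not the actual ObjectParser.scala code; `resolve_link`, the regular expression, and the sample page text are illustrative assumptions. The sketch tries an exact match first, so that distinct pages such as Neocon and NeoCon are not confused when both are linked.

```python
import re
from typing import Optional

# Matches [[Target]], [[Target|label]] and [[Target#section|label]]; captures the target.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")

def resolve_link(value: str, wikitext: str) -> Optional[str]:
    """Find a wikilink target in the page that matches the plain-string value."""
    value = value.strip()
    targets = [m.group(1).strip() for m in WIKILINK.finditer(wikitext)]
    # Prefer a same-cased link if the page contains one.
    if value in targets:
        return value
    # Otherwise fall back to a case-insensitive comparison and
    # return the spelling used by the link itself.
    folded = value.casefold()
    for target in targets:
        if target.casefold() == folded:
            return target
    return None

# Invented page text for illustration.
page = "Danzig moved from [[Punk rock]] towards [[Heavy metal]]."
print(resolve_link("Heavy Metal", page))
print(resolve_link("Alternative Metal", page))
```

In this sketch "Heavy Metal" now resolves to the linked page "Heavy metal", while "Alternative Metal" still yields nothing because the page text contains no link to it at all.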

Re: [Dbpedia-discussion] Mapping extractor generates only 1 triple when a property has multiple objects

2012-05-10 Thread Marco Amadori
2012/5/10 Jona Christopher Sahnwaldt j...@sahnwaldt.de: I just made that little change. I had looked at the code before, so it was very simple. We now also get the triple for Heavy metal, but that's it: http://mappings.dbpedia.org/server/extraction/it/extract?title=Glenn+Danzig It seems a

Re: [Dbpedia-discussion] Mapping extractor generates only 1 triple when a property has multiple objects

2012-05-10 Thread Jona Christopher Sahnwaldt
I think what Marco meant was: the mapping says it's an object property, so we should extract a URI, even if the property value is just a string. In the case of the musician infoboxes on it wiki, that would work, but in many other cases, it wouldn't. For example:

Re: [Dbpedia-discussion] Mapping extractor generates only 1 triple when a property has multiple objects

2012-05-10 Thread Jona Christopher Sahnwaldt
http://en.wikipedia.org/wiki/Neocon http://en.wikipedia.org/wiki/NeoCon Nice example on why case insensitiveness is bad. :-) Oh, I don't see a problem here. There would only be a problem if we extracted a page that contained the string NeoCon in the infobox but a link to [[Neocon]] in the

Re: [Dbpedia-discussion] Mapping extractor generates only 1 triple when a property has multiple objects

2012-05-10 Thread Marco Amadori
On Thursday 10 May 2012 21:06:30 Jona Christopher Sahnwaldt wrote: I think what Marco meant was: the mapping says it's an object property, so we should extract a URI, even if the property value is just a string. Right. In the case of the musician infoboxes on it wiki, that would work, but

Re: [Dbpedia-discussion] Mapping extractor generates only 1 triple when a property has multiple objects

2012-05-10 Thread Pablo Mendes
In the context of... When there is a link, then the object property MUST be generated directly FROM THE LINK PROVIDED BY THE USER. If there isn't a link FOR AN OBJECT PROPERTY, then we should try to find one in the page. Jona said... We could use a heuristic to split the string into
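The precedence described above (use the link the user provided; only if there is none, look for one in the page) can be sketched as follows. This is a hedged Python illustration, not DBpedia code: `object_target`, the `find_in_page` callback, and the sample values are hypothetical names introduced for the example.

```python
import re
from typing import Callable, Optional

# Matches [[Target]], [[Target|label]] and [[Target#section|label]]; captures the target.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")

def object_target(value: str,
                  find_in_page: Callable[[str], Optional[str]]) -> Optional[str]:
    """Resolve the object of an object property from an infobox value."""
    explicit = WIKILINK.search(value)
    if explicit:
        # The user provided a link: take its target directly.
        return explicit.group(1).strip()
    # No link in the value: fall back to whatever page-level lookup is supplied.
    return find_in_page(value.strip())

# Hypothetical page-level lookup over an invented set of link targets.
PAGE_LINKS = {"Heavy metal"}

def lookup(text: str) -> Optional[str]:
    return text if text in PAGE_LINKS else None

print(object_target("[[Punk rock|punk]]", lookup))
print(object_target("Heavy metal", lookup))
print(object_target("Hardcore punk", lookup))
```

Passing the page lookup in as a function keeps the precedence rule separate from the matching strategy, so an exact or case-insensitive search can be swapped in without touching the rule itself.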

Re: [Dbpedia-discussion] Mapping extractor generates only 1 triple when a property has multiple objects

2012-05-09 Thread Marco Amadori
2012/5/8 Jona Christopher Sahnwaldt j...@sahnwaldt.de: No, I'm not insane. It's not a bug, it's a feature. Let me explain. The ontology property 'genre' is an object property, so its values must be URIs and its parser is looking for links to other Wikipedia pages. Even worse - there is a

Re: [Dbpedia-discussion] Mapping extractor generates only 1 triple when a property has multiple objects

2012-05-09 Thread Dimitris Kontokostas
Sorry :) I got confused and thought we were talking about the */property/* namespace Cheers, Dimitris On Tue, May 8, 2012 at 7:53 PM, Jona Christopher Sahnwaldt j...@sahnwaldt.de wrote: No, this is a different problem. The wikitext already contains four different properties: |genere = Heavy

Re: [Dbpedia-discussion] Mapping extractor generates only 1 triple when a property has multiple objects

2012-05-08 Thread Dimitris Kontokostas
Hi, I am pasting an old developers-list thread between me and Max (I could not find it on the archive search for a link) I think it is more or less about the same bug. I don't remember if it was fixed or not Cheers, Dimitris --- Thanks for pointing to

Re: [Dbpedia-discussion] Mapping extractor generates only 1 triple when a property has multiple objects

2012-05-08 Thread Jona Christopher Sahnwaldt
No, this is a different problem. The wikitext already contains four different properties: |genere = Heavy Metal |genere2 = Alternative Metal |genere3 = Punk rock |genere4 = Hardcore punk And http://mappings.dbpedia.org/index.php/Mapping_it:Artista_musicale contains mappings for all of them. The

Re: [Dbpedia-discussion] Mapping extractor generates only 1 triple when a property has multiple objects

2012-05-08 Thread Jona Christopher Sahnwaldt
No, I'm not insane. It's not a bug, it's a feature. Let me explain. The ontology property 'genre' is an object property, so its values must be URIs and its parser is looking for links to other Wikipedia pages. In this case, the values are rendered as links in Wikipedia, but not entered as
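To make the behavior described here concrete, the following Python sketch mimics a lookup that only accepts a plain-string value when a same-cased wikilink to it exists elsewhere in the page. It is an approximation for illustration, not the heuristic in ObjectParser.scala; `link_targets`, `resolve_exact`, and the sample page text are assumptions.

```python
import re
from typing import Optional, Set

# Matches [[Target]], [[Target|label]] and [[Target#section|label]]; captures the target.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")

def link_targets(wikitext: str) -> Set[str]:
    """Collect the targets of all wikilinks that appear in the page text."""
    return {m.group(1).strip() for m in WIKILINK.finditer(wikitext)}

def resolve_exact(value: str, wikitext: str) -> Optional[str]:
    """Return the value if a same-cased wikilink to it exists in the page, else None."""
    value = value.strip()
    return value if value in link_targets(wikitext) else None

# Invented page text; the infobox values are the ones quoted in this thread.
page = "He played [[Punk rock]] and later [[Heavy metal]]."
for raw in ["Heavy Metal", "Alternative Metal", "Punk rock", "Hardcore punk"]:
    print(raw, "->", resolve_exact(raw, page))
```

Under these assumptions only "Punk rock" resolves: "Heavy Metal" differs in case from the linked "Heavy metal", and the other two values have no link in the page text, so a single triple is produced.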

Re: [Dbpedia-discussion] Mapping extractor generates only 1 triple when a property has multiple objects

2012-05-08 Thread Jona Christopher Sahnwaldt
Here's the heuristic code: http://dbpedia.hg.sourceforge.net/hgweb/dbpedia/extraction_framework/file/2ed85b439df2/core/src/main/scala/org/dbpedia/extraction/dataparser/ObjectParser.scala#l44 On Tue, May 8, 2012 at 7:14 PM, Jona Christopher Sahnwaldt j...@sahnwaldt.de wrote: No, I'm not insane.

[Dbpedia-discussion] Mapping extractor generates only 1 triple when a property has multiple objects

2012-05-07 Thread Marco Fossati
Hi Jona, We have just generated fresh dumps for the Italian DBpedia with the latest version of the extractor code and found that some data is lost in the mapping-based dataset. If you have a look at this example [1], 'dbprop-it:genere' property has 4 objects, while 'dbpedia-owl:genre' only has 1.