And here is another open issue.
Cheers,
Marco
http://it.wikipedia.org/wiki/Glenn_Danzig contains:
|genere = Heavy Metal
|genere2 = Alternative Metal
|genere3 = Punk rock
|genere4 = Hardcore punk
all properties map to the same dbpedia-owl:genre property, but only 2
are extracted i.e. 'Punk_rock'
On 5/15/12 11:59 AM, Marco Amadori wrote:
I propose the following algorithm:
Let's keep the current codebase that produces a triple if and only if
a same cased wikilink is present elsewhere in the page.
This time it does not trash the triple if it does not find the link,
instead the code
+1 on the proposed solution
+1 on the beer*
Also, -ambiguous can contain triples where the property is defined as
ObjectProperty and the value is a String. I've heard that there is a tool
out there called DBpedia Spotlight (or something like that) that could be
used to disambiguate these links
2012/5/15 Pablo Mendes pablomen...@gmail.com:
+1 on the proposed solution
+1 on the beer*
Also, -ambiguous can contain triples where the property is defined as
ObjectProperty and the value is a String.
Wait, the algorithm mentioned above is exactly about that: -ambiguous
contains triples
2012/5/15 Pablo Mendes pablomen...@gmail.com:
Alright. Marco, so will you give a go on the -ambiguous creation?
I was hoping that Jona could take that on; he knows the DBpedia
extractor codebase an order of magnitude better than I do.
It could take ages for me to do that since I would need
On Thu, May 10, 2012 at 9:09 PM, Marco Amadori marco.amad...@gmail.com wrote:
On Thursday 10 May 2012 21:06:30 Jona Christopher Sahnwaldt wrote:
I think what Marco meant was: the mapping says it's an object
property, so we should extract a URI, even if the property value is
just a string.
On Sunday 13 May 2012 21:04:05 Jona Christopher Sahnwaldt wrote:
On Thu, May 10, 2012 at 9:09 PM, Marco Amadori marco.amad...@gmail.com
wrote:
On Thursday 10 May 2012 21:06:30 Jona Christopher Sahnwaldt wrote:
I think what Marco meant was: the mapping says it's an object
property, so we
On Sun, May 13, 2012 at 9:13 PM, Marco Amadori marco.amad...@gmail.com wrote:
On Sunday 13 May 2012 21:04:05 Jona Christopher Sahnwaldt wrote:
On Thu, May 10, 2012 at 9:09 PM, Marco Amadori marco.amad...@gmail.com
wrote:
On Thursday 10 May 2012 21:06:30 Jona Christopher Sahnwaldt wrote:
I
On Monday 14 May 2012 00:08:02 Jona Christopher Sahnwaldt wrote:
In my requirements, this will happen if and only if the same happens in
the MediaWiki code; in other words, the DBpedia heuristic is the same as
MediaWiki's.
If the mediawiki template engine would produce a wikilink we
Hi Jona,
Thanks for the exhaustive explanation.
On 5/8/12 7:14 PM, Jona Christopher Sahnwaldt wrote:
Even worse - there is a link to [[Heavy metal]], but in the infobox
it's spelled Heavy Metal (with a capital M), so we don't find that
link. This behavior could be considered a bug. Wikipedia
Hi Marco,
This smells like another thread:
http://www.mail-archive.com/dbpedia-discussion@lists.sourceforge.net/msg02767.html
Would it be possible to emulate Wikipedia renderer engine behavior? It
is written in PHP, so it should be a piece of cake to implement it in
powerful Scala.
So... are
Would it be possible to emulate Wikipedia renderer engine behavior? It
is written in PHP, so it should be a piece of cake to implement it in
powerful Scala.
So... are you volunteering to try it out?
I think he just forgot to add an appropriate emoticon at the end of
the sentence :-)
--
2012/5/10 Pablo Mendes pablomen...@gmail.com:
Hi Marco,
This smells like another thread:
http://www.mail-archive.com/dbpedia-discussion@lists.sourceforge.net/msg02767.html
This is not similar. In this thread I was asking about a couple of
misconfiguration bugs in the DBpedia HTML page renderer
2012/5/10 Jona Christopher Sahnwaldt j...@sahnwaldt.de:
I would guess that adding template expansion to DBpedia is a *major*
task. May take several months. It would also be a *huge* benefit. :-)
Sure it is, but here Marco Fossati was just asking about
case-insensitive wikilink discovery in
Marco,
My pointer to the thread was with reference to the Sweble parser usage.
Rendering pages the way Wikipedia's PHP code does means either using
their code or implementing template resolution. We've been doing the first;
in that thread we suggested doing the second.
Fixing this in the code
I know. :-)
On Thu, May 10, 2012 at 1:50 PM, Marco Fossati hell.j@gmail.com wrote:
On 5/10/12 12:24 PM, Jona Christopher Sahnwaldt wrote:
sorry for the exhausting explanation.
I meant exhaustive. :-)
--
Live
I just made that little change. I had looked at the code before, so it
was very simple. We now also get the triple for Heavy metal, but
that's it:
http://mappings.dbpedia.org/server/extraction/it/extract?title=Glenn+Danzig
Let's hope that this doesn't introduce too many extraction errors.
It's
2012/5/10 Jona Christopher Sahnwaldt j...@sahnwaldt.de:
I just made that little change. I had looked at the code before, so it
was very simple. We now also get the triple for Heavy metal, but
that's it:
http://mappings.dbpedia.org/server/extraction/it/extract?title=Glenn+Danzig
It seems a
I think what Marco meant was: the mapping says it's an object
property, so we should extract a URI, even if the property value is
just a string.
In the case of the musician infoboxes on it wiki, that would work, but
in many other cases, it wouldn't. For example:
http://en.wikipedia.org/wiki/Neocon
http://en.wikipedia.org/wiki/NeoCon
Nice example of why case insensitivity is bad. :-)
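To make the Neocon / NeoCon concern concrete, here is a small Python sketch of case-insensitive matching. It is only an illustration, not the extraction framework's code: the function name `match_ignoring_case` and the idea of returning every candidate are assumptions made for this example.

```python
from collections import defaultdict


def match_ignoring_case(value, link_targets):
    """Return every link target equal to `value` when case is ignored.

    Hypothetical helper: more than one result means the match is ambiguous.
    """
    by_folded = defaultdict(set)
    for target in link_targets:
        # Group targets that differ only in letter case.
        by_folded[target.casefold()].add(target)
    return sorted(by_folded.get(value.strip().casefold(), ()))


# One candidate: the infobox string can be mapped safely.
print(match_ignoring_case("Heavy Metal", ["Heavy metal", "Punk rock"]))
# Two candidates: ignoring case cannot tell the two pages apart.
print(match_ignoring_case("NeoCon", ["Neocon", "NeoCon"]))
```

Returning all candidates rather than the first one keeps the ambiguous case visible, so a caller can decide whether to skip the triple or handle it separately.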
Oh, I don't see a problem here. There would only be a problem if we
extracted a page that contained the string NeoCon in the infobox but
a link to [[Neocon]] in the
On Thursday 10 May 2012 21:06:30 Jona Christopher Sahnwaldt wrote:
I think what Marco meant was: the mapping says it's an object
property, so we should extract a URI, even if the property value is
just a string.
Right.
In the case of the musician infoboxes on it wiki, that would work, but
In the context of...
When there is a link, then the object property MUST be generated directly
FROM THE LINK PROVIDED BY THE USER.
If there isn't a link FOR AN OBJECT PROPERTY, then we should try to find
one in the page.
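As a rough illustration of the rule above, here is a minimal Python sketch (not the framework's Scala code). The names `resolve_object_value` and `page_link_targets` and the simplified wikilink regex are assumptions made for this example; the fallback matches the value case-sensitively against links found elsewhere in the page.

```python
import re

# Simplified pattern: captures the target of [[Target]], [[Target|label]]
# or [[Target#section]]. Real wikitext has more cases than this.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def page_link_targets(wikitext):
    """Collect the targets of all wikilinks found in the page text."""
    return {m.group(1).strip() for m in WIKILINK.finditer(wikitext)}


def resolve_object_value(value, wikitext):
    """Return a link target for an object-property value, or None.

    1. If the user wrote a link in the value, use that link.
    2. Otherwise look for a same-cased link elsewhere in the page.
    """
    explicit = WIKILINK.search(value)
    if explicit:
        return explicit.group(1).strip()
    text = value.strip()
    if text in page_link_targets(wikitext):
        return text
    return None


page = "He plays [[Punk rock]] and [[Heavy metal]]."
print(resolve_object_value("[[Hardcore punk|hardcore]]", page))
print(resolve_object_value("Punk rock", page))
print(resolve_object_value("Heavy Metal", page))
```

With this sketch the plain string "Heavy Metal" stays unresolved, because the page only links to "Heavy metal" with a lowercase m.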
Jona said...
We could use a heuristic to
split the string into
2012/5/8 Jona Christopher Sahnwaldt j...@sahnwaldt.de:
No, I'm not insane.
It's not a bug, it's a feature. Let me explain.
The ontology property 'genre' is an object property, so its values
must be URIs and its parser is looking for links to other Wikipedia
pages.
Even worse - there is a
Sorry :)
I got confused and thought we were talking about the */property/* namespace
Cheers,
Dimitris
On Tue, May 8, 2012 at 7:53 PM, Jona Christopher Sahnwaldt
j...@sahnwaldt.de wrote:
No, this is a different problem. The wikitext already contains four
different properties:
|genere = Heavy
Hi,
I am pasting an old developers-list thread between Max and me (I could not
find it in the archive search to link to it).
I think it is more or less about the same bug. I don't remember whether it
was fixed or not.
Cheers,
Dimitris
---
Thanks for pointing to
No, this is a different problem. The wikitext already contains four
different properties:
|genere = Heavy Metal
|genere2 = Alternative Metal
|genere3 = Punk rock
|genere4 = Hardcore punk
And http://mappings.dbpedia.org/index.php/Mapping_it:Artista_musicale
contains mappings for all of them.
The
No, I'm not insane.
It's not a bug, it's a feature. Let me explain.
The ontology property 'genre' is an object property, so its values
must be URIs and its parser is looking for links to other Wikipedia
pages.
In this case, the values are rendered as links in Wikipedia, but not
entered as
Here's the heuristic code:
http://dbpedia.hg.sourceforge.net/hgweb/dbpedia/extraction_framework/file/2ed85b439df2/core/src/main/scala/org/dbpedia/extraction/dataparser/ObjectParser.scala#l44
On Tue, May 8, 2012 at 7:14 PM, Jona Christopher Sahnwaldt
j...@sahnwaldt.de wrote:
No, I'm not insane.
Hi Jona,
We have just generated fresh dumps for the Italian DBpedia with the
latest version of the extractor code and found that some data is lost in
the mapping-based dataset.
If you have a look at this example [1], the 'dbprop-it:genere' property has
4 objects, while 'dbpedia-owl:genre' only has 1.