So it confirms that the data is coming from the Infobox template and handled
directly at the template level...
I agree with the necessity of interpreting templates recursively in Infoboxes.
We actually had to add a similar feature in our Wikipedia structured data
extraction pipeline. It has great benefits but also typically introduces noise
that needs to be handled at the mapping/normalization phases. In the short
term, we (i.e. Yahoo) will be handling this exception (and similar ones) at the
mapping level but having a more generic solution that benefits everyone would
be better.
FYI, Simon found out that the Infobox_South_African_* templates have the same
"Southern Hemisphere" bias. =>
http://en.wikipedia.org/wiki/Category:South_Africa_subdivision_infobox_templates
Nicolas.
On Monday, September 22, 2014 11:34 PM, Dimitris Kontokostas
<jimk...@gmail.com> wrote:
Looking at the template source we find the following:
{{Geobox coor|{{#expr:abs({{{latitude|{{{latd|}}}}}})}}|{{{latm|}}}|{{{lats|}}}| S ....
which means that they don't care about the sign: they just take the absolute
value, add 'S' at the end, and later embed the coordinates template.
DBpedia cannot parse embedded templates; we can only get the outer mapping. As
Alexandru noticed, it looks like we have two options:
1) Handle this at the mapping level: We recently enabled prefix/suffix-ing
values in property mappings [1]. We could port part of the code in the Coord
templates and add a '-' prefix to solve this case, but we'd need an abs()
transformation as well to get it right. We'd also need people to change all the
affected templates.
This is a known problem in Wikipedia; in this case it is "legitimate", but there
are many other wrong coordinates. One of my todos was to find them using
RDFUnit and report them back. Now with this issue we can see what is fixed and
try to re-map the rest.
2) Fuse information from different sources. It's something we consider for the
next release and maybe Live too.
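To make option 1 concrete, here is a minimal Python sketch of the abs() plus '-' prefix idea. It is not DBpedia mapping syntax or extractor code; the names SOUTHERN_TEMPLATES, dms_to_decimal and mapped_latitude are hypothetical, and the template table is only an illustration of the assumption that some infoboxes always render their latitude as 'S'.

```python
# Hypothetical table: infobox templates assumed to store unsigned latitudes
# that the template itself renders as "S". The entry is illustrative only.
SOUTHERN_TEMPLATES = {"Infobox Australian place"}


def dms_to_decimal(deg: float, minutes: float = 0.0, seconds: float = 0.0) -> float:
    """Convert degrees/minutes/seconds to decimal degrees, keeping the sign of deg."""
    sign = -1.0 if deg < 0 else 1.0
    return sign * (abs(deg) + minutes / 60.0 + seconds / 3600.0)


def mapped_latitude(template: str, latd: float, latm: float = 0.0, lats: float = 0.0) -> float:
    """Mimic the template logic: for southern-only infoboxes, take abs() and negate."""
    if template in SOUTHERN_TEMPLATES:
        # abs() first, so an editor-supplied "-37" and a bare "37" give the same result.
        return -dms_to_decimal(abs(latd), latm, lats)
    # Any other template: trust the sign that was written in the article.
    return dms_to_decimal(latd, latm, lats)


if __name__ == "__main__":
    print(mapped_latitude("Infobox Australian place", 37, 48, 49))
    print(mapped_latitude("Infobox Australian place", -37, 48, 49))
```

The abs() step is what keeps the fix stable if some articles are later edited to carry a negative latd: both forms end up in the southern hemisphere.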
Any feedback is welcome
[1]
http://mappings.dbpedia.org/index.php/Mapping_commons:Chemical_structure_verified
On Mon, Sep 22, 2014 at 11:30 PM, Alexandru Todor <to...@inf.fu-berlin.de>
wrote:
Hi Nicolas,
I don't think the geo information you see in the top-right corner of the
parsed wiki page actually comes from the article. I think it actually comes
from another Wikimedia project or GeoNames. I see 2 possible ways for us to fix
it:
1) Correct the information in Wikipedia by restructuring the offending
"Infobox Australian place" to include orientation information so we can place
the longitude and latitude in the right hemisphere, and then go through each
article that uses that infobox and correct the info either manually or with a
wikibot.
2) Get that information from an external source, which we can't really do
since we would then face the problem of being out of sync with Wikipedia on
that issue.
You could try bringing up this issue on the Wikimedia mailing lists and asking
them where they get the coordinates from, and what piece of code is responsible
for parsing it. I don't see any way to fix it in the extractor since a) this is
infobox-specific and b) it would be a nasty hack. But maybe some other people
here have a better idea.
Cheers,
Alexandru
On 09/22/2014 10:12 PM, Nicolas Torzec wrote:
Hi Alexandru,
Unfortunately I am not sure it's that simple.
- The coordinates in the infobox are indeed wrong in terms of "absolute"
values (i.e. wrong latitude but correct longitude).
- However, the coordinates displayed at the top-right of the page are correct
(37_48_49_S_144_57_47_E) and Melbourne is correctly mapped.
This particular place uses the "Infobox_Australian_place" template [1], which
seems to be customized for Australian places (i.e. southern hemisphere). It
looks like those Australian places were populated with relative coordinates
(i.e. not absolute ones), and it looks like Wikipedia is aware of this
peculiarity.
Re: DBpedia and other geo-extractors, they probably have to take those
peculiarities into account to get higher precision. => Thoughts?
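To illustrate the difference between the "relative" infobox values and the page-level coordinates, here is a small Python sketch. It assumes the usual convention that S and W map to negative decimal degrees; the function name to_signed_decimal is made up for this example.

```python
def to_signed_decimal(deg, minutes, seconds, hemisphere):
    """Convert an unsigned D/M/S value plus an N/S/E/W letter to signed decimal degrees."""
    hemisphere = hemisphere.upper()
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere: {hemisphere!r}")
    value = abs(deg) + minutes / 60.0 + seconds / 3600.0
    # South and West are negative by convention.
    return -value if hemisphere in ("S", "W") else value


# Page-level coordinates quoted above: 37_48_49_S_144_57_47_E
lat = to_signed_decimal(37, 48, 49, "S")
lon = to_signed_decimal(144, 57, 47, "E")

# What an extractor gets if it reads latd=37 with no hemisphere and assumes "N".
naive_lat = to_signed_decimal(37, 48, 49, "N")

if __name__ == "__main__":
    print(round(lat, 4), round(lon, 4))
    print(round(naive_lat, 4))
```

The longitude is unaffected in both readings; only the latitude flips sign, which matches the "wrong latitude but correct longitude" observation.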
N.
Reference: [1]: http://en.wikipedia.org/wiki/Template:Infobox_Australian_place
On Monday, September 22, 2014 11:17 AM, Alexandru Todor
<to...@inf.fu-berlin.de> wrote:
Hi Nicolas,
From the Wikipedia wiki source (the part you get when you press edit at the top
of the page):
| latd =37 |latm =48 |lats =49
| longd =144 |longm =57 |longs =47
The correct entry in Wikipedia should be:
| latd =-37 |latm =48 |lats =49
| longd =144 |longm =57 |longs =47
This is an error in Wikipedia, not in DBpedia; you can correct it by editing
the source of the Melbourne Wiki page.
Cheers,
Alexandru
On 09/22/2014 07:45 PM, Nicolas Torzec wrote:
Hi,
It looks like an old problem with the geoparsers in DBpedia is still not
resolved.
Back in 2012, the geo-coordinates extractors were not leveraging the
hemisphere information correctly [3], causing some places from the southern
hemisphere (e.g. Melbourne) to appear in the northern hemisphere. Apparently it
is still there...
Compare the coordinates for Melbourne in Wikipedia [1] and DBpedia [2].
[1]: http://en.wikipedia.org/wiki/Melbourne
[2]: http://live.dbpedia.org/page/Melbourne
From Wikipedia: coordinates
={{Coord|37|48|49|S|144|57|47|E|type:city(4000000)_region:AU-VIC|display=inline,title}}
From DBpedia: | latd =37 |latm =48 |lats =49 | longd =144 |longm =57
|longs =47
Melbourne is correctly placed on Wikipedia but incorrectly placed in DBpedia.
Nicolas.
Reference: [3]: initial problem here
http://sourceforge.net/p/dbpedia/mailman/message/29588905/
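For comparison, the {{Coord}} call quoted in this message carries its hemisphere letters explicitly, so it can be read without any template-specific knowledge. The following is a rough Python sketch, not the DBpedia geo extractor: COORD_RE and parse_coord are hypothetical names, and the pattern only covers the full degrees|minutes|seconds|hemisphere form shown above (decimal-only or degrees-only variants are ignored).

```python
import re

# Matches only the D|M|S|H|D|M|S|H form of {{Coord}}; other variants are not handled.
COORD_RE = re.compile(
    r"\{\{\s*Coord\s*\|"
    r"(\d+)\|(\d+)\|(\d+(?:\.\d+)?)\|([NS])\|"
    r"(\d+)\|(\d+)\|(\d+(?:\.\d+)?)\|([EW])",
    re.IGNORECASE,
)


def parse_coord(wikitext):
    """Return (lat, lon) in signed decimal degrees, or None if no match is found."""
    match = COORD_RE.search(wikitext)
    if match is None:
        return None
    latd, latm, lats, ns, lond, lonm, lons, ew = match.groups()
    lat = int(latd) + int(latm) / 60.0 + float(lats) / 3600.0
    lon = int(lond) + int(lonm) / 60.0 + float(lons) / 3600.0
    if ns.upper() == "S":
        lat = -lat
    if ew.upper() == "W":
        lon = -lon
    return lat, lon


if __name__ == "__main__":
    sample = ("{{Coord|37|48|49|S|144|57|47|E|"
              "type:city(4000000)_region:AU-VIC|display=inline,title}}")
    print(parse_coord(sample))
```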
------------------------------------------------------------------------------
Meet PCI DSS 3.0 Compliance Requirements with EventLog Analyzer
Achieve PCI DSS 3.0 Compliant Status with Out-of-the-box PCI DSS Reports
Are you Audit-Ready for PCI DSS 3.0 Compliance? Download White paper
Comply to PCI DSS 3.0 Requirement 10 and 11.5 with EventLog Analyzer
http://pubads.g.doubleclick.net/gampad/clk?id=154622311&iu=/4140/ostg.clktrk
_______________________________________________
Dbpedia-discussion mailing list
Dbpedia-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
--
Kontokostas Dimitris