On Tue, Sep 23, 2014 at 9:06 PM, Nicolas Torzec <torz...@yahoo-inc.com>
wrote:

> So it confirms that the data is coming from the Infobox template and
> handled directly at the template level...
>
> I agree with the necessity of interpreting templates recursively in
> Infoboxes. We actually had to add a similar feature in our Wikipedia
> structured data extraction pipeline. It has great benefits but also
> typically introduces noise that needs to be handled at the
> mapping/normalization phases. In the short term, we (i.e. Yahoo) will be
> handling this exception (and similar ones) at the mapping level but
> having a more generic solution that benefits everyone would be better.
>

I had the impression that the noise cost would be more than the actual benefit,
and we never decided to go this way. The reason is that there are too many
styling templates and we somehow need to exclude them from the process.
Black/white listing could be an option, but both probably require a lot
of manual effort and have different costs (precision/coverage).
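
To illustrate the trade-off, here is a minimal Python sketch of how a
white/blacklist could gate recursive template expansion. It is not code from
the DBpedia extraction framework: the template names in both lists and the
should_expand helper are hypothetical, and the lists would have to be curated
by hand.

```python
from typing import Optional, Set

# Hypothetical lists, for illustration only.
STYLING_BLACKLIST: Set[str] = {"nowrap", "small", "flagicon"}
DATA_WHITELIST: Set[str] = {"coord", "geobox coor", "convert"}


def normalize_name(name: str) -> str:
    """Treat underscores as spaces; lower-casing is a simplification."""
    return name.replace("_", " ").strip().lower()


def should_expand(
    name: str,
    whitelist: Optional[Set[str]] = None,
    blacklist: Optional[Set[str]] = None,
) -> bool:
    """Decide whether a nested template should be interpreted recursively.

    A whitelist only expands known data templates (favours precision);
    a blacklist expands everything except known styling templates
    (favours coverage). With neither list, everything is expanded.
    """
    key = normalize_name(name)
    if whitelist is not None:
        return key in whitelist
    if blacklist is not None:
        return key not in blacklist
    return True
```

With the whitelist, an unknown template is skipped; with the blacklist, the
same unknown template is expanded, which is where the extra noise would come
from.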

It would be great if you could share your experience in this regard. Should
we move the discussion to the developer list?

Dimitris


>
> FYI, Simon found out that the Infobox_South_African_* templates have the
> same "South Hemisphere" bias.
> =>
> http://en.wikipedia.org/wiki/Category:South_Africa_subdivision_infobox_templates
>
>
> Nicolas.
>
>
>
>
>   On Monday, September 22, 2014 11:34 PM, Dimitris Kontokostas <
> jimk...@gmail.com> wrote:
>
>
> Looking at the template source we find the following:
> {{Geobox coor|{{#expr:abs({{{latitude|{{{latd|}}}}}})}}|{{{latm|}}}|{{{lats|}}}|S ....
> which means that they don't care about the sign, they just take the
> absolute value and add 'S' at the end and later they embed the coordinates
> template.
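
To make the effect concrete, here is a minimal Python sketch of what the
snippet above amounts to: the sign of the degrees is discarded and the
hemisphere letter alone decides it. This is not the template's or DBpedia's
actual code, and the function name dms_to_decimal is my own.

```python
def dms_to_decimal(
    degrees: float,
    minutes: float = 0.0,
    seconds: float = 0.0,
    hemisphere: str = "N",
) -> float:
    """Convert degrees/minutes/seconds plus a hemisphere letter to decimal degrees.

    Mirrors the abs() + fixed-letter behaviour: any sign on `degrees` is
    ignored, and S or W yields a negative result.
    """
    letter = hemisphere.strip().upper()
    if letter not in {"N", "S", "E", "W"}:
        raise ValueError(f"unknown hemisphere: {hemisphere!r}")
    magnitude = abs(degrees) + minutes / 60.0 + seconds / 3600.0
    return -magnitude if letter in {"S", "W"} else magnitude
```

With Melbourne's values, dms_to_decimal(37, 48, 49, "S") and
dms_to_decimal(-37, 48, 49, "S") give the same southern latitude, whereas
reading only the outer latd/latm/lats parameters without the hard-coded S
gives a positive, northern one.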
>
> DBpedia cannot parse embedded templates; we can only get the outer mapping.
> As Alexandru noticed, it looks like we have two options:
> 1) handle this at the mapping level:
> We recently enabled prefix/suffix-ing values in property mappings [1]. We
> could port part of the code in the Coord templates and add a '-' prefix to
> solve this case, but we'd need an abs() transformation as well to get it right.
> We'd also need people to change all the affected templates.
>
> This is a known problem in Wikipedia, in this case it is "legitimate" but
> there are many other wrong coordinates.
> One of my to-dos was to find them using RDFUnit and report them back.
> Now with this issue we can see what is fixed and try to re-map the rest.
>
> 2) Fuse information from different sources. It's something we consider for
> the next release and maybe Live too.
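
Option 1 could look roughly like the following Python sketch: apply abs() to
the raw latd value and then add the '-' prefix, but only for templates known
to hard-code the southern hemisphere. This is not the mapping wiki's syntax;
the function fix_latd is hypothetical, and the content of
SOUTHERN_ONLY_TEMPLATES is an assumption that would need checking against
each template's source.

```python
from typing import Set

# Assumption: templates that append 'S' regardless of the sign given.
SOUTHERN_ONLY_TEMPLATES: Set[str] = {"infobox australian place"}


def fix_latd(template: str, latd: str) -> str:
    """Return the latd string with abs() applied and a '-' prefix added.

    Values from templates outside SOUTHERN_ONLY_TEMPLATES are returned
    unchanged (apart from surrounding whitespace).
    """
    key = template.replace("_", " ").strip().lower()
    value = latd.strip()
    if key not in SOUTHERN_ONLY_TEMPLATES:
        return value
    # abs(): drop any existing sign so "-37" and "37" are treated alike.
    unsigned = value.lstrip("+-")
    return "-" + unsigned if unsigned else unsigned
```

The abs() step matters because some articles may already carry a negative
latd, and a bare prefix would turn "-37" into "--37".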
>
> Any feedback is welcome
>
> [1]
> http://mappings.dbpedia.org/index.php/Mapping_commons:Chemical_structure_verified
>
> On Mon, Sep 22, 2014 at 11:30 PM, Alexandru Todor <to...@inf.fu-berlin.de>
> wrote:
>
>  Hi Nicolas,
>
> I don't think the geo information you see in the top-right corner of the
> parsed wiki page actually comes from the article. I think it actually comes
> from another Wikimedia project or GeoNames. I see 2 possible ways for us to
> fix it:
>
> 1) Correct the information in wikipedia by restructuring the offending
> "Infobox Australian place" to include orientation information so we can
> place the longitude and latitude in the right hemisphere. And then go
> through each article that uses that infobox and correct the info either
> manually or with a wikibot.
> 2) Get that information from an external source, which we can't really do
> since we would then face the problem of being out of sync with wikipedia on
> that issue.
>
> You could try bringing up this issue on the wikimedia mailing lists and
> asking them where they get the coordinates from, and what piece of code is
> responsible for parsing it. I don't see any way to fix it in the extractor
> since a) this is infobox-specific and b) it would be a nasty hack. But maybe
> some other people here have a better idea.
>
> Cheers,
> Alexandru
>
>
> On 09/22/2014 10:12 PM, Nicolas Torzec wrote:
>
>  Hi Alexandru,
>
>  Unfortunately I am not sure it's that simple.
> - The coordinates in the infobox are indeed wrong in terms of "absolute"
> values (i.e. wrong latitude but correct longitude).
> - However, the coord. displayed at the top-right of the page are correct
> (37_48_49_S_144_57_47_E) and Melbourne is correctly mapped.
>
>  This particular place uses the "Infobox_Australian_place" template [1],
> which seems to be customized for Australian places (i.e. southern
> hemisphere). It looks like those Australian places were populated with
> relative coordinates (i.e. not absolute ones), and it looks like Wikipedia
> is aware of this peculiarity.
>
>  Re: DBpedia and other geo-extractors, they probably have to take those
> peculiarities into account to get higher precision.
> => Thoughts?
>
>  N.
>
>
>  Reference:
> [1]: http://en.wikipedia.org/wiki/Template:Infobox_Australian_place
>
>
>
>
>    On Monday, September 22, 2014 11:17 AM, Alexandru Todor
> <to...@inf.fu-berlin.de> wrote:
>
>
>   Hi Nicolas,
>
> From the Wikipedia wiki source (what you see when you press edit at the top
> of the page):
>
> | latd  =37  |latm =48 |lats  =49
> | longd =144 |longm =57 |longs =47
>
> The correct entry in Wikipedia should be
>
> | latd  =-37  |latm =48 |lats  =49
> | longd =144 |longm =57 |longs =47
>
> This is an error in Wikipedia not in DBpedia, you can correct it by
> editing the source of the Melbourne Wiki page.
>
> Cheers,
> Alexandru
>
>
> On 09/22/2014 07:45 PM, Nicolas Torzec wrote:
>
>  Hi,
> It looks like an old problem with the geoparsers in DBpedia is still not
> resolved.
>
>  Back in 2012, the geo-coordinates extractors were not leveraging the
> hemisphere information correctly [3], causing some places from the southern
> hemisphere (e.g. Melbourne) to appear in the northern hemisphere.
> Apparently it is still there...
>
>  Compare the coordinates for Melbourne in Wikipedia [1] and DBpedia [2].
>  [1]: http://en.wikipedia.org/wiki/Melbourne
>  [2]: http://live.dbpedia.org/page/Melbourne
>
>
>  From Wikipedia:
> coordinates =
> {{Coord|37|48|49|S|144|57|47|E|type:city(4000000)_region:AU-VIC|display=inline,title}}
>
>  From DBpedia:
> | latd  =37  |latm =48 |lats  =49
> | longd =144 |longm =57 |longs =47
>
>
>  Melbourne is correctly placed on Wikipedia but incorrectly placed in
> DBpedia.
>
>
>  Nicolas.
>
>
>  Reference:
> [3]: initial problem here http://sourceforge.net/p/dbpedia/mailman/message/29588905/
>
>
>
>
>
>
>
>
> ------------------------------------------------------------------------------
> Meet PCI DSS 3.0 Compliance Requirements with EventLog Analyzer
> Achieve PCI DSS 3.0 Compliant Status with Out-of-the-box PCI DSS Reports
> Are you Audit-Ready for PCI DSS 3.0 Compliance? Download White paper
> Comply to PCI DSS 3.0 Requirement 10 and 11.5 with EventLog Analyzer
> http://pubads.g.doubleclick.net/gampad/clk?id=154622311&iu=/4140/ostg.clktrk
>
>
>
> _______________________________________________
> Dbpedia-discussion mailing list
> Dbpedia-discussion@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>
> --
> Kontokostas Dimitris
>
>
>


-- 
Kontokostas Dimitris
