Hi Ruben,
Do you think this could improve on the current Gender extractor that Max
and I created? We'd love to have it improved. Why don't you send a pull
request over there?

https://github.com/dbpedia/extraction-framework/blob/master/core/src/main/scala/org/dbpedia/extraction/mappings/GenderExtractor.scala

I also like your idea to use this for anomaly detection. I wonder if we
already have a way to output suggested "negative triples" in a standard
fashion for the DEF? Meaning that we could have a bunch of "negative
extractors" suggesting which triples should be deleted. I think Heiko and
Dimitris have played with ideas related to this?
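
To make the "negative extractor" idea a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function name, the threshold of 1.0, and the idea of emitting plain (subject, predicate, object) tuples are hypothetical and not an existing DEF interface. The input reuses the two score examples from Ruben's mail below.

```python
from typing import Dict, Iterator, Tuple

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DBO_PERSON = "http://dbpedia.org/ontology/Person"

Triple = Tuple[str, str, str]


def suggest_negative_triples(
    scores: Dict[str, Dict[str, float]],
    min_score: float = 1.0,  # assumed threshold, would need tuning on real data
) -> Iterator[Triple]:
    """Yield 'subject rdf:type dbo:Person' triples that look suspicious.

    Hypothetical helper: an entity is flagged when neither its male nor its
    female score reaches min_score, i.e. the text barely reads like a
    description of a person.
    """
    for subject, gender_scores in scores.items():
        strongest = max(
            gender_scores.get("male", 0.0),
            gender_scores.get("female", 0.0),
        )
        if strongest < min_score:
            yield (subject, RDF_TYPE, DBO_PERSON)


# Scores as reported in the quoted mail; article titles stand in for subject IRIs.
scores = {
    "27e regering van Israël": {"male": 0.5, "female": 0.0},
    "A.H. Nijhoff": {"male": 3.5, "female": 33.3},
}

flagged = list(suggest_negative_triples(scores))
for triple in flagged:
    print("suggest deleting:", triple)
```

With that threshold only the government article would be suggested for deletion; whether such suggestions are serialized as a separate dataset or in some other form is exactly the open question above.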

Cheers
Pablo

On Mon, Dec 1, 2014, 04:07 Ruben Verborgh <ruben.verbo...@ugent.be> wrote:

> Dear all,
>
> This weekend, I quickly experimented with gender extraction from the Dutch
> Wikipedia.
> A summary of the approach and results is available here:
> http://ruben.verborgh.org/blog/2014/11/30/distinguishing-between-frank-and-nancy/
>
> The highlights are:
> - I extracted 52,686 gender indications with high confidence out of 80,499
> “people” articles.
>  44,614 (85%) are men; 8,072 (15%) are women.
> - A brief manual check didn't reveal any errors (yet).
> - The algorithm can also help to improve data quality.
>  For instance, the article “27th government of Israel” is incorrectly
> marked as a person. Its results are:
>  27e regering van Israël { male: 0.5, female: 0.0 }, compared to, for
> instance:
>  A.H. Nijhoff { male: 3.5, female: 33.3 }.
>  This is an indication the “Person” label might be incorrect.
>
> The resulting software and datasets are on GitHub:
> - https://github.com/RubenVerborgh/DBpediaGender
> - https://github.com/RubenVerborgh/DBpediaGenderResults
> The approach has now also been tested on the English version;
> the results are in the above repository.
>
> Please let me know if you have any feedback or questions.
>
> Best,
>
> Ruben
> _______________________________________________
> Dbpedia-discussion mailing list
> Dbpedia-discussion@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>