Thank you Aiko! This is excellent work. Thank you for helping us offer this
valuable new data service to the Wikimedia Movement.

Best,
Jonathan

On Sat, Mar 7, 2020 at 6:03 AM Ai-Jou Chou <qwanqwa...@gmail.com> wrote:

> Hi all,
>
> I’m happy to announce the outcome of an Outreachy internship
> <https://phabricator.wikimedia.org/T233707> that I’m finishing up. It is a
> new tool and public dataset named Citation Detective, which tool developers
> and researchers can now use for their projects.
>
> Citation Detective <https://meta.wikimedia.org/wiki/Citation_Detective>
> contains sentences that have been identified as needing a citation using a
> machine learning-based classifier published early last year
> <https://arxiv.org/pdf/1902.11116.pdf> by WMF researchers and
> collaborators. As part of Outreachy, I developed a tool
> <https://github.com/AikoChou/citationdetective> (hosted on Toolforge
> <https://tools.wmflabs.org>) to run through Wikipedia and extract
> high-scoring sentences along with contextual information.
>
> As an example use case for this data, I also created a proof of concept for
> integrating Citation Detective and Citation Hunt
> <https://tools.wmflabs.org/citationhunt>. Check out my prototype Citation
> Hunt <https://tools.wmflabs.org/aiko-citationhunt>, which uses Citation
> Detective to import sentences that would not normally be featured in
> Citation Hunt. The repository for that is here
> <https://github.com/AikoChou/citationhunt>.
>
> This dataset currently includes sentences from ~120,000 randomly selected
> articles from the English Wikipedia. In future work, we hope to expand this
> to more Wikipedia language editions and a greater number of articles. It is
> also possible to expand the database to contain more fields in a future
> version according to feedback from tool developers and researchers. More
> use cases for this type of data were identified in a design research
> project
> <https://meta.wikimedia.org/wiki/Research:Identification_of_Unsourced_Statements/API_design_research>
> conducted last year by Jonathan Morgan.
>
> You can find more information in our Wiki Workshop submission
> <https://commons.wikimedia.org/wiki/File:Citation_Detective_WikiWorkshop2020.pdf>
> and on my blog <https://rollingmist.home.blog/>, which documents the whole
> journey.
>
> Thank you very much!
>
> Kind regards,
> Aiko
> _______________________________________________
> Wiki-research-l mailing list
> Wiki-research-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>


-- 
Jonathan T. Morgan
Senior Design Researcher
Wikimedia Foundation
User:Jmorgan (WMF) <https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF)>
(Uses He/Him)

*Please note that I do not expect a response from you on evenings or
weekends*