I was mentioned as "the developer of ORES", so let me comment on that. Aaron
Halfaker is the creator of ORES.  It's been his work night and day for a
few years now. I've contributed around 20% of the code base.  But let's be
clear, ORES is his brainchild.  There is an army of other developers who
have contributed, e.g. He7d3r, Jonas.agx, Aetilley, Danilo, Yuvipanda,
Awight, Kenrick95, NealMCB, and countless translators.  The idea that a
single person can develop something like a production machine learning
service.  Yikes.

See:
https://github.com/wiki-ai/revscoring/graphs/contributors (the modeling
library)
https://github.com/wiki-ai/ores/graphs/contributors (the hosting service)
https://github.com/wiki-ai/ores-wmflabs-deploy/graphs/contributors (our
server configuration)
https://github.com/wiki-ai/wikilabels/graphs/contributors (the labeling
system)
https://github.com/wiki-ai/editquality/graphs/contributors (the set of
damage/vandalism detection models)
https://github.com/wikimedia/mediawiki-extensions-ORES/graphs/contributors
(mediawiki extension that highlights based on ORES predictions)

Also, I fail to see the relation between running a labeling script and what
ORES is doing.

Best

On Wed, Mar 22, 2017 at 8:51 PM John Erling Blad <jeb...@gmail.com> wrote:

> Only using sitelinks as a weak indication of quality seems correct to me.
> Also the idea that some languages are more important than others, and some
> large languages are more important than others. I would really like it if
> the reasoning behind the classes and the features could be spelled out.
>
> I have serious issues with the ORES training sets, but that is another
> discussion. ;/ (There are a lot of similar bot edits in the sets, and that
> will train a bot-detector, which is not what we need! Grumpf…)
>
> On Wed, Mar 22, 2017 at 3:33 PM, Aaron Halfaker <aaron.halfa...@gmail.com>
> wrote:
>
> Hey wiki-research-l folks,
>
> Gerard didn't actually link you to the quality criteria he takes issue
> with.  See https://www.wikidata.org/wiki/Wikidata:Item_quality  I think
> Gerard's argument basically boils down to Wikidata != Wikipedia, but it's
> unclear how that is relevant to the goal of measuring the quality of
> items.  This is something I've been talking to Lydia about for a long
> time.  It's been great for the few Wikis where we have models deployed in
> ORES[1] (English, French, and Russian Wikipedia).  So we'd like to have the
> same for Wikidata.   As Lydia said, we do all sorts of fascinating things
> with a model like this.  Honestly, I think the criteria are coming together
> quite nicely and we're just starting a pilot labeling campaign to work
> through a set of issues before starting the primary labeling drive.
>
> 1. https://ores.wikimedia.org
>
> -Aaron
>
>
>
> On Wed, Mar 22, 2017 at 6:39 AM, Gerard Meijssen <
> gerard.meijs...@gmail.com> wrote:
>
> Hoi,
> What I have read is that it will be individual items that are graded. That
> is not what helps you determine what items are lacking in something. When
> you want to determine if something is lacking you need a relational
> approach. When you approach an award like this one [1], it was added to make
> the award for a person [2] more complete. No real importance is given to
> this award, just a few more people were added because they are part of a
> group that gets more attention from me [3]. For yet another award [4], I
> added all the people who received the award because I was told by someone's
> expert opinion that they were all notable (in the Wikipedia sense of the
> word). I added several of these people in Wikidata. Arguably, the quality
> of the Wikidata item for the award is great; it has no article associated
> with it in Wikipedia, but that has nothing to do with the quality
> of the information it provides. It is easy and obvious to recognise in one
> level deeper that quality issues arise; the info for several people is
> meagre at best. You cannot deny their relevance though; removing them
> destroys the quality for the award.
>
> The point is that in relations you can describe quality, in the grading
> that is proposed there is nothing really that is actionable.
>
> When you add links to the mix, these same links have no bearing on the
> quality of the Wikidata item. Why would it? Links only become interesting
> when you compare the statements in Wikidata with the links to other
> articles in the same Wikipedia. This is not what this approach brings.
>
> Really, how will the grades for items make a difference? How will it help us
> understand that "items relating to railroads are lacking"? It does not.
>
> When you want to have indicators for quality; here is one.. an author (and
> its subclasses) should have a VIAF identifier. An artist with objects in
> the Getty Museum should have an ULAN number. The lack of such information
> is actionable. The number of interwiki links is not, the number of
> statements is not and even references are not that convincing.
> Thanks,
>       GerardM
>
> [1] https://tools.wmflabs.org/reasonator/?&q=29000734
> [2] https://tools.wmflabs.org/reasonator/?&q=7315382
> [3] https://tools.wmflabs.org/reasonator/?&q=3308284
> [4] https://tools.wmflabs.org/reasonator/?&q=28934266
>
> On 22 March 2017 at 11:56, Lydia Pintscher <lydia.pintsc...@wikimedia.de>
> wrote:
>
> > On Wed, Mar 22, 2017 at 10:03 AM, Gerard Meijssen
> > <gerard.meijs...@gmail.com> wrote:
> > > In your reply I find little argument why this approach is useful. I do not
> > > find a result that is actionable. There is little point to this approach
> > > and it does not fit well with much of the Wikidata practice.
> >
> > Gerard, the outcome will be very actionable. We will have the
> > groundwork needed to identify individual items and sets of items that
> > need improvement. If it for example turns out that our items related
> > to railroads are particularly lacking then that is something we can
> > concentrate on if we so choose. We can do editathons, data
> > partnerships, quality drives, and so on.
> >
> >
> > Cheers
> > Lydia
> >
> > --
> > Lydia Pintscher - http://about.me/lydia.pintscher
> > Product Manager for Wikidata
> >
> > Wikimedia Deutschland e.V.
> > Tempelhofer Ufer 23-24
> > 10963 Berlin
> > www.wikimedia.de
> >
> > Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
> >
> > Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
> > unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das
> > Finanzamt für Körperschaften I Berlin, Steuernummer 27/029/42207.
> >
> > _______________________________________________
> > Wikidata mailing list
> > Wikidata@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wikidata
> >
> _______________________________________________
> Wiki-research-l mailing list
> wiki-researc...@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l