FYI, the new models (BREAKING CHANGE) are now deployed.

On Sun, Apr 3, 2016 at 5:38 AM, Aaron Halfaker <aaron.halfa...@gmail.com>
wrote:

> Hey folks, we have a couple of announcements for you today. First is that
> ORES has a large set of new functionality that you might like to take
> advantage of. We'll also want to talk about a *BREAKING CHANGE on April
> 7th.*
>
> Don't know what ORES is?  See
> http://blog.wikimedia.org/2015/11/30/artificial-intelligence-x-ray-specs/
>
> *New functionality*
>
> *Scoring UI*
> Sometimes you just want to score a few revisions in ORES and remembering
> the URL structure is hard. So, we've built a simple scoring user-interface
> <https://ores.wmflabs.org/ui/> that will allow you to more easily score a
> set of edits.
>
> *New API version*
> We've been consistently getting requests to include more information in
> ORES' responses. In order to make space for this new information, we needed
> to change the structure of responses. But we wanted to do this without
> breaking the tools that are already using ORES. So, we've developed a
> versioning scheme that will allow you to take advantage of new
> functionality when you are ready. The same old API will continue to be
> available at https://ores.wmflabs.org/scores/, but we've added two
> additional paths on top of this.
>
>    - https://ores.wmflabs.org/v1/scores/ is a mirror of the old scoring
>    API which will henceforth be referred to as "v1"
>    - https://ores.wmflabs.org/v2/scores/ implements a new response format
>    that is consistent between all sub-paths and adds some new functionality
>
> *Swagger documentation*
> Curious about the new functionality available in "v2" or maybe what the
> change was from "v1"? We've implemented a structured description of both
> versions of the scoring API using Swagger -- which is becoming a de facto
> standard for this sort of thing. Visit https://ores.wmflabs.org/v1/ or
> https://ores.wmflabs.org/v2/ to see the Swagger user-interface.
> Visit https://ores.wmflabs.org/v1/spec/ or
> https://ores.wmflabs.org/v2/spec/ to get the specification in a
> machine-readable format.
>
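For anyone scripting against both versions, here is a minimal sketch of the URL layout described above. The helper name `api_urls` and the dictionary keys are my own (hypothetical); only the paths themselves come from the announcement.

```python
ORES_HOST = "https://ores.wmflabs.org"


def api_urls(version):
    """Return the documented entry points for one API version ("v1" or "v2").

    The keys are illustrative labels, not part of ORES itself.
    """
    root = f"{ORES_HOST}/{version}/"
    return {
        "docs": root,                # Swagger user-interface
        "spec": root + "spec/",      # machine-readable specification
        "scores": root + "scores/",  # scoring API
    }


for version in ("v1", "v2"):
    print(version, api_urls(version))
```

The unversioned https://ores.wmflabs.org/scores/ path is left out on purpose, since the announcement says it stays available as-is.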
> *Feature values & injection*
> Have you wondered what ORES uses to make its predictions? You can now ask
> ORES to show you the list of "feature" statistics it uses to score
> revisions. For example,
> https://ores.wmflabs.org/v2/scores/enwiki/wp10/34567892/?features will
> return the score with a mapping of feature values used by the "wp10"
> article quality model in English Wikipedia to score oldid=34567892
> <https://en.wikipedia.org/wiki/Special:Diff/34567892>. You can also
> "inject" features into the scoring process to see how that affects the
> prediction. E.g.,
> https://ores.wmflabs.org/v2/scores/enwiki/wp10/34567892?features&feature.wikitext.revision.chars=10000
>
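A rough sketch of how a tool might build these feature URLs, in case it helps. The function name `feature_url` and its arguments are hypothetical; the `?features` flag and the `feature.<name>=<value>` pattern are taken from the two example URLs above. I've assumed the trailing slash before `?features` shown in the first example.

```python
from urllib.parse import urlencode

V2_SCORES = "https://ores.wmflabs.org/v2/scores/"


def feature_url(wiki, model, rev_id, injected=None):
    """Build a v2 scoring URL that asks ORES to include feature values.

    `injected` optionally maps feature names to override values; each
    entry becomes a `feature.<name>=<value>` query parameter.
    """
    url = f"{V2_SCORES}{wiki}/{model}/{rev_id}/?features"
    if injected:
        params = {f"feature.{name}": value for name, value in injected.items()}
        url += "&" + urlencode(params)
    return url


# Plain feature listing, then the same revision with one injected feature.
print(feature_url("enwiki", "wp10", 34567892))
print(feature_url("enwiki", "wp10", 34567892,
                  {"wikitext.revision.chars": 10000}))
```

This only constructs the URLs; fetching and parsing the response is left to whatever HTTP client your tool already uses.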
> *Breaking change -- new models*
> We've been experimenting with new learning algorithms to make ORES work
> better and we've found that we get better results with gradient boosting
> <https://en.wikipedia.org/wiki/Gradient_boosting> and random forest
> <https://en.wikipedia.org/wiki/Random_forest> strategies than we do with
> the current linear SVC
> <https://en.wikipedia.org/wiki/Support_vector_machine> models. We'd like
> to get these new, better models deployed as soon as possible, but with the
> new algorithm comes a change in the range of probabilities returned by the
> model. So, when we deploy this change, any tool that uses hard-coded
> thresholds on ORES' prediction probabilities will suddenly start behaving
> strangely. Regretfully, we haven't found a way around this problem, so
> we're announcing the change now and we plan to deploy this *BREAKING
> CHANGE on April 7th*. Please subscribe to the AI mailing list
> <https://lists.wikimedia.org/mailman/listinfo/ai> or watch our project
> page [[:m:ORES <https://meta.wikimedia.org/wiki/ORES>]] to catch
> announcements of future changes and new functionality.
>
> In order to make sure we don't end up in the same situation the next time
> we want to change an algorithm, we've included a suite of evaluation
> statistics with each model. The filter_rate_at_recall(0.9),
> filter_rate_at_recall(0.75), and recall_at_fpr(0.1) statistics give
> three critical thresholds (should review, needs review, and definitely
> damaging, respectively) that can be used to automatically configure your
> wiki tool.  You can find these thresholds for your model of choice by
> adding the ?model_info parameter to requests.  So, once the breaking
> change lands, we strongly recommend basing your thresholds on these
> statistics. We'll be working to submit patches to tools that use ORES in
> the next week to implement this flexibility.  Hopefully, all you'll need
> to do is work with us on those.
>
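To illustrate the idea of configuring a tool from these statistics rather than hard-coding cut-offs, here is a minimal sketch. The statistic names and the three labels come from the announcement; everything else is an assumption on my part: the flat dict layout, the placeholder numbers, and the function name `review_level` are hypothetical, and the real ?model_info response may be shaped quite differently.

```python
# Hypothetical thresholds, as a tool might store them after reading
# ?model_info. The numbers are made-up placeholders, not real ORES output.
EXAMPLE_THRESHOLDS = {
    "filter_rate_at_recall(0.9)": 0.15,
    "filter_rate_at_recall(0.75)": 0.40,
    "recall_at_fpr(0.1)": 0.85,
}

# Checked from the most severe label to the least, so the first match wins.
LEVELS = [
    ("recall_at_fpr(0.1)", "definitely damaging"),
    ("filter_rate_at_recall(0.75)", "needs review"),
    ("filter_rate_at_recall(0.9)", "should review"),
]


def review_level(probability, thresholds):
    """Map a damaging-probability to a review label, or None if below all.

    `thresholds` maps each statistic name to the probability cut-off that
    the currently deployed model reports for it.
    """
    for stat, label in LEVELS:
        if probability >= thresholds[stat]:
            return label
    return None


for p in (0.05, 0.2, 0.5, 0.9):
    print(p, review_level(p, EXAMPLE_THRESHOLDS))
```

The point of the design is that only the thresholds dict changes when a model is swapped out; the tool's own logic stays the same.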
> -halfak & The Revision Scoring team
> <https://meta.wikimedia.org/wiki/Research:Revision_scoring_as_a_service>
>
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
