[Wikitech-l] Re: ORES To Lift Wing Migration

2023-09-23 Thread Strainu
Hi folks,

So glad to see the old and new ML teams having an open discussion about this
subject.

I understand that the team might prefer to have several tickets for
different issues, but the discussion about the general approach to the
different models is of interest to many people and is more easily digested
by email. I would suggest continuing to discuss the merits of the current
strategy (and not necessarily of one model or another) by email.

* One model per wiki or overall
This is a tough one. :) As a user, I remember how hard it was for Romanian
speakers to complete the training data for damaging/goodfaith and would
prefer to not have to do it again.

However, I'm also worried that some specificities of larger wikis would
creep into the output, leading to reverts that would not normally happen on
my wiki. For instance, smaller settlements are not accepted on enwp, while
they are accepted on rowp. I don't know how to test it myself, and I
haven't seen anything about it in the research.

Another problem I have is that I'm not sure how the revert-risk score should
be matched against custom damaging/goodfaith thresholds. Are there any
guidelines on this other than "test"?
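
One possible starting point, short of pure trial and error, is to pick the
revert-risk cutoff that flags about the same share of edits as a wiki's
existing damaging cutoff, using a sample of revisions scored by both models.
The sketch below only illustrates that idea: `matching_threshold` and its
inputs are hypothetical names, not part of ORES or Lift Wing, and matching
the flag rate says nothing about whether the same edits end up flagged.

```python
import math
from typing import Sequence


def matching_threshold(
    damaging: Sequence[float],
    revert_risk: Sequence[float],
    damaging_threshold: float,
) -> float:
    """Return a revert-risk cutoff that flags roughly as many edits in the
    sample as `damaging_threshold` does for the damaging scores.

    Both sequences are assumed to hold scores for the same revisions.
    """
    if not damaging or len(damaging) != len(revert_risk):
        raise ValueError("need two non-empty score lists of equal length")

    # How many edits does the existing damaging cutoff flag?
    flagged = sum(1 for score in damaging if score >= damaging_threshold)
    if flagged == 0:
        # Nothing is flagged today, so no finite cutoff is needed.
        return math.inf

    # Take the score of the flagged-th highest revert-risk edit; using
    # `score >= cutoff` then flags at least that many edits (more on ties).
    ranked = sorted(revert_risk, reverse=True)
    return ranked[flagged - 1]
```

The result is only a first guess; it would still need to be checked against
edits that local patrollers have actually reviewed.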

* Multiple criteria vs. a single score
I think the discussion has been very much about reverts, but as Sj said,
each of these scores is a slightly different facet. Is there data
available on the prevalence of other use cases, or is everyone just writing
revert bots?

In the long run, I believe a single, good-enough model can be developed for
revert bots. However, it would be great if there were some clear quality
criteria that the community can verify, and if the old models were maintained
for a wiki until we are sure the new model meets those criteria on that
wiki.

The guiding force in any team's roadmap should be the needs of its users,
not a change in hosting.

Have a good weekend,
 Strainu





[Wikitech-l] Re: ORES To Lift Wing Migration

2023-09-23 Thread Luca Toscano
On Fri, Sep 22, 2023 at 11:34 PM Aaron Halfaker wrote:

> All fine points.  As you can see, I've filed some phab tasks where I saw a
> clear opportunity to do so.
>

Thanks a lot! We are going to review them next week and decide on the next
steps, but we'd like to proceed anyway with migrating ORES to ores-legacy on
Monday (this will allow us to free up some old nodes that need to be
decommissioned, etc.). Adding features later on to the models on Lift Wing
should be doable, and our goal is to transition away from ores-legacy in a
few months (to avoid maintaining too many systems). The timeline is not yet
set in stone; we'll update this mailing list when the time comes (and we'll
follow up with the remaining users of ores-legacy as well). To summarize: we
start with ORES -> ores-legacy on Monday, and we'll do ores-legacy -> Lift
Wing in a second step.
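
For anyone planning client changes around these two steps, here is a rough
sketch of how the same score request might look against each service. The
base URLs and the `{wiki}-{model}:predict` naming are assumptions based on
my reading of the Wikitech documentation and should be checked there before
use; the helper names are hypothetical. The functions only build request
descriptions and do not send anything.

```python
import json
from urllib.parse import urlencode

# Assumed endpoints; verify against the current Wikitech documentation.
ORES_LEGACY_BASE = "https://ores-legacy.wikimedia.org/v3/scores"
LIFT_WING_BASE = "https://api.wikimedia.org/service/lw/inference/v1/models"


def ores_legacy_request(wiki: str, model: str, rev_id: int) -> dict:
    """Describe an ORES-style GET request (step 1: ORES -> ores-legacy)."""
    query = urlencode({"models": model, "revids": rev_id})
    return {
        "method": "GET",
        "url": f"{ORES_LEGACY_BASE}/{wiki}/?{query}",
        "body": None,
    }


def lift_wing_request(wiki: str, model: str, rev_id: int) -> dict:
    """Describe a Lift Wing POST request (step 2: ores-legacy -> Lift Wing)."""
    return {
        "method": "POST",
        "url": f"{LIFT_WING_BASE}/{wiki}-{model}:predict",
        "body": json.dumps({"rev_id": rev_id}),
    }
```

The returned dictionaries can be handed to whatever HTTP client a tool
already uses, so switching services comes down to swapping one helper for
the other and adapting to the new response format.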

>  as mentioned before all the models that currently run on ORES are
> available in both ores-legacy and Lift Wing.
>
> I thought I read that damaging and goodfaith models are going to be
> replaced.  Should I instead read that they are likely to remain available
> for the foreseeable future?   When I asked about a community discussion
> about the transition from damaging/goodfaith to revertrisk, I was imagining
> that many people who use those predictions might have an opinion about them
> going away.  E.g. people who use the relevant filters in RecentChanges.
> Maybe I missed the discussions about that.
>

This is a good point, I'll clarify the documentation on Wikitech. As long as
models are in use we won't remove them from Lift Wing, but we'll propose
using Revert Risk where it is suited, since it is a model family in which we
decided to invest time and effort. Basic maintenance will be performed on
the goodfaith/damaging/articlequality/etc. models on Lift Wing, but we
don't have (at the moment) the bandwidth to guarantee retraining or more
complex workflows for them. This is why we used the term "deprecated" on
Wikitech, but we need to specify what we mean to avoid confusion. Thanks
for the feedback :)


>
> I haven't seen a mention of the article quality or article topic models in
> the docs.  Are those also going to remain available?  I have some user
> scripts that use these models and are relatively widely used.  I didn't
> notice anyone reaching out. ... So I checked and setting a User-Agent on my
> user scripts doesn't actually change the User-Agent.  I've read that you
> need to set "Api-User-Agent" instead, but that causes a CORS error when
> querying ORES.  I'll file a bug.
>

I will update the docs as well. As mentioned above, we'll keep the current
ORES models available on Lift Wing. Eventually new models will be proposed
by Research and other teams (like Revert Risk), and at that point we (as the
ML team) will decide what recommendation to give. Nothing will be removed
from Lift Wing if there are active users on it, but we'll certainly try to
reduce the number of models to maintain (based on common functionality,
etc.), so some changes will be proposed in the future.

Luca