On Thu, Jun 20, 2024 at 9:00 AM Megan Neisler <mneis...@wikimedia.org> wrote:
> Steven, SJ, and Petr: I've provided responses to the questions about the quantitative findings below. Please let me know if any additional clarification would be helpful.
>
> "The report says 'On mobile, edit completion rate decreased by -24.3% (-13.5pp)' -- what's the difference between the first and (second) percentage figures?"
>
> Great question, SJ. Can you please let me know if the below helps clarify the uncertainty you were asking about?
>
> The first figure (-24.3%) is the *relative* change between the control and test groups. In other words: by what percentage did the edit completion rate observed in the test group differ from the rate observed in the control group? We observed an edit completion rate of 55.6% in the control group and 42.1% in the test group. This equates to a 24.3% decrease, calculated as the ratio of the absolute change between the two groups (42.1% minus 55.6%) to the reference value (55.6%).
>
> The second figure (-13.5pp) is the *absolute* change between the control and test groups. In this case, the difference is the test edit completion rate (42.1%) minus the control edit completion rate (55.6%), which equals -13.5 percentage points.
>
> Both values are provided in the report to clarify the degree of difference between the two groups. By either measure, these numbers indicate how much the edit completion rate changed between the test and control groups.
>
> "In other words we lose 24% of saved edits in order to decrease the revert rate by 8.6%. This tradeoff does not seem good."
>
> The interaction between these two metrics is worth clarifying -- thank you for drawing our collective attention to the need to do so, Steven.
>
> Below is an attempt to offer some additional clarity.
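For anyone who wants to double-check the two figures, the arithmetic can be reproduced in a few lines of Python (rates taken from the quoted message; the variable names are mine):

```python
control = 0.556  # control-group edit completion rate (55.6%)
treated = 0.421  # test-group edit completion rate (42.1%)

# Absolute change: simple difference, expressed in percentage points (pp).
absolute_change_pp = (treated - control) * 100

# Relative change: the same difference, as a fraction of the control rate.
relative_change_pct = (treated - control) / control * 100

print(f"absolute change: {absolute_change_pp:.1f}pp")   # -13.5pp
print(f"relative change: {relative_change_pct:.1f}%")   # -24.3%
```

The two numbers describe the same gap; they just use different baselines (100% of editors vs. the control group's 55.6%).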
> We'd value knowing if this brings any new questions to mind.
>
> The 24% decrease observed on mobile represents the relative change in the edit completion rates of the control and test groups, as explained in the clarification above. It does *not* reflect a 24% decrease in the total number of saved edits.
>
> If we look at the impact on saved edits, the total number of saved new content edits on mobile decreased from 3,924 in the control group to 3,468 in the test group (a decrease of 456 saved new content edits, or a 12% relative decrease). However, Reference Check increased the number of saved new content edits on mobile that include a reference from 60 in the control group to 1,012 in the test group (an increase of 952 edits, or roughly 16 times more saved new content edits with a reference). See Figure 18 of the analysis report for more details [1].
>
> The edit completion rates for this analysis were based on a specific subset of all the edits attempted during the A/B test. Specifically, we measured the proportion of edits where a person indicated intent to save that were then successfully published. We focused only on edits where a person indicated intent to save because this is the point in the workflow where Reference Check would be shown, and we wanted to exclude edits abandoned for other reasons before this point.
>
> If we instead look at all edits that were started and then successfully published, there was no significant change in edit completion rate on mobile or desktop, because Reference Check was presented in only a limited share of all edits that were started.
>
> Zooming out, we seem to be aligned in thinking that it will be important to actively monitor changes in edit completion rate to ensure future Edit Checks do not cause significant disruption to the editor experience.
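The saved-edit counts quoted above check out the same way (counts from Figure 18 of the report; the ~12% and ~16x figures in the message are rounded):

```python
control_saved = 3924   # saved new content edits on mobile, control group
treated_saved = 3468   # saved new content edits on mobile, test group
control_with_ref = 60  # of those, edits that include a reference, control
treated_with_ref = 1012  # edits that include a reference, test group

# Overall saved edits: a decrease of 456, about -11.6% (reported as ~12%).
print(treated_saved - control_saved)
print((treated_saved - control_saved) / control_saved * 100)

# Edits with a reference: an increase of 952, about 16.9x (reported as ~16x).
print(treated_with_ref - control_with_ref)
print(treated_with_ref / control_with_ref)
```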
> In fact, we'd value knowing if there are other metrics you think we should consider monitoring. The reason being: the Editing Team is actively defining the requirements for a dashboard (https://phabricator.wikimedia.org/T367130) that will help us track how edit session health evolves over time as more Checks are introduced.

Thanks for the follow-up explanation on this, Megan and other folks from the team. Your explanation makes a lot of sense, and it's much less concerning now.

> [1] https://mneisler.quarto.pub/reference-check-ab-test-report-2024/#number-of-new-content-edits-successfully-saved
>
> _______________________________________________
> Wikimedia-l mailing list -- wikimedia-l@lists.wikimedia.org
> Guidelines: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and https://meta.wikimedia.org/wiki/Wikimedia-l
> Public archives: https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/2AU5QA2HJVTHKRTIICSFRYV3EGDPEPP7/
> To unsubscribe send an email to wikimedia-l-le...@lists.wikimedia.org
_______________________________________________
Wikimedia-l mailing list -- wikimedia-l@lists.wikimedia.org
Guidelines: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and https://meta.wikimedia.org/wiki/Wikimedia-l
Public archives: https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/PHGB46VZAMGLKD5WERMSKUPZDM6GAPRG/
To unsubscribe send an email to wikimedia-l-le...@lists.wikimedia.org