On Thu, Jun 20, 2024 at 9:00 AM Megan Neisler <mneis...@wikimedia.org> wrote:
> Steven, SJ, and Petr: I've provided responses to the questions about the quantitative findings below. Please let me know if any additional clarification would be helpful.
>
> "The report says 'On mobile, edit completion rate decreased by -24.3% (-13.5pp)' -- what's the difference between the first and (second) percentage figures?"
>
> Great question, SJ. Can you please let me know if the below helps clarify the uncertainty you were asking about?
>
> The first figure (-24.3%) is the *relative* change between the control and test groups. In other words: by what percentage did the edit completion rate observed in the test group differ from the rate observed in the control group? We observed an edit completion rate of 55.6% in the control group and 42.1% in the test group. This equates to a 24.3% decrease, calculated as the ratio of the absolute change between the two groups (42.1% minus 55.6%) to the reference value (55.6%).
>
> The second figure (-13.5pp) is the *absolute* change between the control and test groups. In this case, the difference is the test edit completion rate (42.1%) minus the control edit completion rate (55.6%), which equals -13.5 percentage points.
>
> Both values are provided in the report to clarify the degree of difference between the two groups. By either measure, these numbers indicate how much the edit completion rate changed between the test and control groups.
>
> "In other words we lose 24% of saved edits in order to decrease the revert rate by 8.6%. This tradeoff does not seem good."
>
> The interaction between these two metrics is worth clarifying -- thank you for drawing our collective attention to the need to do so, Steven.
>
> Below is an attempt to offer some additional clarity.
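For anyone who wants to double-check the two figures, the arithmetic can be reproduced in a few lines of Python (rates taken from the quoted message; the variable names are mine):

```python
control = 0.556  # control-group edit completion rate (55.6%)
treated = 0.421  # test-group edit completion rate (42.1%)

# Absolute change: simple difference, expressed in percentage points (pp).
absolute_change_pp = (treated - control) * 100

# Relative change: the same difference, as a fraction of the control rate.
relative_change_pct = (treated - control) / control * 100

print(f"absolute change: {absolute_change_pp:.1f}pp")   # -13.5pp
print(f"relative change: {relative_change_pct:.1f}%")   # -24.3%
```

The two numbers describe the same gap; they just use different baselines (100% of editors vs. the control group's 55.6%).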
> We'd value knowing if this brings any new questions to mind.
>
> The 24% decrease observed on mobile represents the relative change in the edit completion rates of the control and test groups, as explained in the clarification above. It does *not* reflect a 24% decrease in the total number of saved edits.
>
> If we look at the impact on saved edits, the total number of saved new content edits on mobile decreased from 3,924 in the control group to 3,468 in the test group (a decrease of 456 saved new content edits, or a 12% relative decrease). However, Reference Check increased the number of saved new content edits on mobile that include a reference from 60 in the control group to 1,012 in the test group (an increase of 952 edits, or roughly 16 times more saved new content edits with a reference). See Figure 18 of the analysis report for more details [1].
>
> The edit completion rates for this analysis were based on a specific subset of all the edits attempted during the A/B test. Specifically, we measured the proportion of edits where a person indicated intent to save that were then successfully published. We focused only on edits where a person indicated intent to save because this is the point in the workflow where Reference Check would be shown, and we wanted to exclude edits abandoned for other reasons before this point.
>
> If we instead look at all edits that were started and then successfully published, there was no significant change in edit completion rate on mobile or desktop, because Reference Check was presented in only a limited share of all edits that were started.
>
> Zooming out, we seem to be aligned in thinking that it will be important to actively monitor changes in edit completion rate to ensure future Edit Checks do not cause significant disruption to the editor experience.
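The saved-edit counts quoted above check out the same way (counts from Figure 18 of the report; the ~12% and ~16x figures in the message are rounded):

```python
control_saved = 3924   # saved new content edits on mobile, control group
treated_saved = 3468   # saved new content edits on mobile, test group
control_with_ref = 60  # of those, edits that include a reference, control
treated_with_ref = 1012  # edits that include a reference, test group

# Overall saved edits: a decrease of 456, about -11.6% (reported as ~12%).
print(treated_saved - control_saved)
print((treated_saved - control_saved) / control_saved * 100)

# Edits with a reference: an increase of 952, about 16.9x (reported as ~16x).
print(treated_with_ref - control_with_ref)
print(treated_with_ref / control_with_ref)
```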
> In fact, we'd value knowing if there are other metrics you think we should consider monitoring. The reason being: the Editing Team is actively defining the requirements for a dashboard (https://phabricator.wikimedia.org/T367130) that will help us track how edit session health evolves over time as more Checks are introduced.

Thanks for the follow-up explanation on this, Megan and other folks from the team. Your explanation makes a lot of sense, and it's much less concerning now.

> [1] https://mneisler.quarto.pub/reference-check-ab-test-report-2024/#number-of-new-content-edits-successfully-saved
>
> _______________________________________________
> Wikimedia-l mailing list -- wikimedia-l@lists.wikimedia.org
> Guidelines: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and https://meta.wikimedia.org/wiki/Wikimedia-l
> Public archives: https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/2AU5QA2HJVTHKRTIICSFRYV3EGDPEPP7/
> To unsubscribe send an email to wikimedia-l-le...@lists.wikimedia.org
_______________________________________________
Wikimedia-l mailing list -- wikimedia-l@lists.wikimedia.org
Guidelines: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and https://meta.wikimedia.org/wiki/Wikimedia-l
Public archives: https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/PHGB46VZAMGLKD5WERMSKUPZDM6GAPRG/
To unsubscribe send an email to wikimedia-l-le...@lists.wikimedia.org