Hi all, tl;dr: we'd like to remove the rev_is_revert field from the mediawiki.revision-create stream to solve a missing event problem.
For years now, we've known that the mediawiki.revision-create stream <https://stream.wikimedia.org/?doc#/streams/get_v2_stream_mediawiki_revision_create> has been missing many real revision create events <https://phabricator.wikimedia.org/T215001> when compared with MediaWiki's MySQL databases. This makes the stream almost useless for anyone who wants to use it as a notification mechanism for all MediaWiki page changes.

The events are missing because the code that emits them subscribes to the wrong MediaWiki hook. This patch <https://gerrit.wikimedia.org/r/c/mediawiki/extensions/EventBus/+/679353/> will fix that. However, the correct hook does not give us the information we need to set the rev_is_revert and rev_revert_details fields. These fields are relatively new (added only in August 2020 <https://github.com/wikimedia/schemas-event-primary/commit/53b6480cb1045316ce7bf16987e6169fa386450f#diff-70a054c62940bbabcef7a38e58eb4bf4d9001ed46dd6277473509e5775ec5d34R53-R94>).

We think that including the missing revisions is more important than capturing the revert information, which really only records whether or not a user used the MediaWiki UI to issue a revert.

We plan on moving forward with this, but would like feedback before we do. If you have objections, or other ideas on how we can provide this data (for example, including it in mediawiki/revision-tags-change <https://schema.wikimedia.org/repositories//primary/jsonschema/mediawiki/revision/tags-change/current.yaml> and making that public), let us know by replying to this email or in this ticket: https://phabricator.wikimedia.org/T215001

Thanks!
-Andrew Otto
SRE, Data Engineering, WMF
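
Note for stream consumers: if these fields are removed, code that reads them should treat them as optional rather than required. A minimal sketch of that, assuming events have already been parsed from JSON into Python dicts; the helper name revert_info is hypothetical and is not part of any Wikimedia library:

```python
from typing import Any, Dict, Optional


def revert_info(event: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return revert details if the event carries them, else None.

    Hypothetical helper: treats rev_is_revert and rev_revert_details as
    optional, so it behaves the same whether or not the fields are emitted.
    """
    # A missing rev_is_revert is handled like an explicit False.
    if not event.get("rev_is_revert", False):
        return None
    # rev_revert_details may itself be absent; fall back to an empty dict.
    return event.get("rev_revert_details") or {}
```

With this approach, an event that lacks the fields is simply treated as "no revert information available", so a consumer should keep working both before and after the change.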
_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics