Isn't there anything we (or TS admins) can do about this? Like asking WMF to populate the SHA1s at a slower rate?
Petr Onderka
[[en:User:Svick]]

On Tue, Jul 10, 2012 at 4:19 PM, Russell Blau <russb...@imapmail.org> wrote:
> As most of us already know, replag on enwiki has been going up and up
> since about 30 June. As it says on status.toolserver.org, "Hight replag
> because of inserting of many SHA1-hashes." (Note to DaB.: the first
> word should be spelled "High".)
>
> I asked DaB. on IRC how long this might go on, and he replied one to two
> weeks. However, I've since done some independent investigation that
> suggests that his estimate might be a little low.
>
> It turns out that there are three large blocks of consecutive entries in
> the revision database that need to be populated with SHA1 hashes.
> Apparently there are three processes running in parallel on the WMF
> servers that are filling in each of these blocks from the bottom, by
> numerical order of rev_id. Knowing this, we can estimate how many
> revisions still need to be populated at any given point; and, taking
> such estimates at various points in time, can estimate how long the
> process will take. (Needless to say, this is only an estimate since the
> rate at which database changes are processed on the toolserver side is
> variable; also, the blocks of rev_ids are not actually consecutive due
> to deletions, but we can assume for our purposes that the deleted
> revisions are distributed uniformly throughout the database.)
>
> It further turns out that it is only possible to compute this estimate
> for sql-s1-user (thyme), because the enwiki_p view on sql-s1-rr
> (rosemary) does not have the rev_sha1 field at all (!). It appears that
> the server on rosemary is receiving millions of database updates each
> day from WMF and throwing them in the bit bucket.
>
> Anyway, based on four observations spaced at 6 hour intervals, it
> appears that thyme is populating about 353,000 revisions per hour, or
> 8.5 million per day.
> A simple trendline analysis shows that, at this
> rate, completing the 230,000,000 remaining unpopulated revisions will
> take about 27 more days (estimated completion Aug 6 at 17:48 UTC).
>
> Anyone who relies on use of the enwiki_p database should expect a
> prolonged continuation of degraded service and steadily increasing
> replag.
> --
> Russell Blau
> russb...@imapmail.org
>
>
> _______________________________________________
> Toolserver-l mailing list (Toolserver-l@lists.wikimedia.org)
> https://lists.wikimedia.org/mailman/listinfo/toolserver-l
> Posting guidelines for this list:
> https://wiki.toolserver.org/view/Mailing_list_etiquette
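
For anyone who wants to redo the estimate in the quoted message as new observations come in, here is a rough sketch of a trendline calculation in Python. It fits a least-squares line to (time, remaining revisions) pairs and extrapolates to zero. The function name estimate_completion is made up for illustration, and the four observations below are synthetic: they are built from the quoted figures (353,000 revisions per hour, 230,000,000 remaining, 6 hour spacing) with an arbitrary starting timestamp, and are not the actual measurements. This is not necessarily how the original estimate was computed.

```python
from datetime import datetime, timedelta, timezone


def estimate_completion(observations):
    """Fit a least-squares line to (timestamp, remaining) pairs.

    Returns (rate_per_hour, estimated_completion_time), where the
    completion time is the point at which the fitted line reaches zero.
    """
    t0 = observations[0][0]
    # Hours elapsed since the first observation.
    xs = [(t - t0).total_seconds() / 3600.0 for t, _ in observations]
    ys = [float(remaining) for _, remaining in observations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # revisions per hour; negative while work remains
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        raise ValueError("remaining count is not decreasing")
    hours_to_zero = -intercept / slope
    return -slope, t0 + timedelta(hours=hours_to_zero)


# Synthetic observations (ASSUMPTION, not real measurements): four points
# 6 hours apart, falling at the quoted rate and ending at 230,000,000.
RATE_PER_HOUR = 353_000
start = datetime(2012, 7, 9, 20, 0, tzinfo=timezone.utc)  # arbitrary
observations = [
    (start + timedelta(hours=6 * i), 230_000_000 + RATE_PER_HOUR * 6 * (3 - i))
    for i in range(4)
]

rate, done = estimate_completion(observations)
days_left = (done - observations[-1][0]).total_seconds() / 86400.0
print(f"rate: {rate:,.0f}/hour, days left: {days_left:.1f}, done: {done}")
```

With real data the points will not fall exactly on a line, so the fitted rate (and the completion date) will shift from one run to the next; that is the "only an estimate" caveat from the quoted message.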