Hi Amanda,

Here's a ticket you can upvote: https://phabricator.wikimedia.org/T87596

I added a link to this thread to the task. I also added an "Evil Spooky Haunted Tree" token to the task. Because... well it just felt like the right thing to do.

- J

On Tue, Feb 17, 2015 at 7:58 AM, Dan Andreescu <dandree...@wikimedia.org> wrote:

> I can't find a specific ticket, Nuria may know of one. In general, this
> is the project that LabsDB tickets are tagged with:
> https://phabricator.wikimedia.org/tag/wikimedia-labs-infrastructure/
>
> On Tue, Feb 17, 2015 at 10:53 AM, Amanda Bittaker <abitta...@wikimedia.org>
> wrote:
>
>> Good morning Dan,
>>
>> Thanks very much for the explanation. Is there a Phabricator task we can
>> upvote (award a token?) to make this issue more visible?
>>
>> As always, we really appreciate your help with this.
>>
>> Best,
>> Amanda
>>
>> On Tue, Feb 17, 2015 at 7:20 AM, Dan Andreescu <dandree...@wikimedia.org>
>> wrote:
>>
>>> Sorry for the trouble, Amanda. The problem is solely with the
>>> underlying database, which we don't maintain. It's a sanitized replica of
>>> all the changes being made to all the wikis so it's a fairly complicated
>>> piece of infrastructure that sometimes has problems. The folks who
>>> maintain it are aware of the issues, but we'll continue representing them
>>> until they're solved.
>>>
>>> On Mon, Feb 16, 2015 at 3:49 PM, Amanda Bittaker <
>>> abitta...@wikimedia.org> wrote:
>>>
>>>> Oop, thanks for the ping, Nuria. Wikimetrics seems to be working
>>>> better now. I still get failures, especially when running three or four
>>>> reports in one batch, but the reports work if you rerun them (sometimes a
>>>> couple times.)
>>>>
>>>> I'm still getting "PENDING"s that turn into "FAILURE"s sometimes, which
>>>> I just noticed for the first time last Thursday. Also, sometimes the
>>>> "FAILURE"s change position in the Current Report Inbox list, moving up or
>>>> down a spot. Not sure if that helps diagnose what might be happening...
>>>>
>>>> In any case, Wikimetrics is mostly functioning but seems to be having
>>>> recurring troubles that sometimes blow up to freeze the whole tool. It
>>>> would be great to resolve the troubles before the next explosion--is there
>>>> anything I can do to help? Dan H and I still have plenty of reports to
>>>> run; we can keep you updated on the reports run and failure rate while you
>>>> are fixing, if that would be useful.
>>>>
>>>> Many thanks,
>>>> Amanda
>>>>
>>>> On Mon, Feb 16, 2015 at 10:15 AM, Nuria Ruiz <nu...@wikimedia.org>
>>>> wrote:
>>>>
>>>>> Ping ....
>>>>>
>>>>> On Fri, Feb 13, 2015 at 2:19 PM, Nuria Ruiz <nu...@wikimedia.org>
>>>>> wrote:
>>>>>
>>>>>> Amanda,
>>>>>>
>>>>>> Looks like wikimetrics was able to run automatic reports last night
>>>>>> w/o big issues, are your reports still failing?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Nuria
>>>>>>
>>>>>> On Thu, Feb 12, 2015 at 1:42 PM, Amanda Bittaker <
>>>>>> abitta...@wikimedia.org> wrote:
>>>>>>
>>>>>>> Alright, thanks so much for your help once again, Nuria.
>>>>>>>
>>>>>>> If there's anything I can do or any information I can contribute,
>>>>>>> please don't hesitate to ping me.
>>>>>>>
>>>>>>> Best,
>>>>>>> Amanda
>>>>>>>
>>>>>>> On Thu, Feb 12, 2015 at 1:36 PM, Nuria Ruiz <nu...@wikimedia.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> DB connections in labs look to be failing, unfortunately I think
>>>>>>>> besides asking for help on the labs list there is not much we can do
>>>>>>>> there.
>>>>>>>> I will start a thread in this regard.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Nuria
>>>>>>>>
>>>>>>>> On Thu, Feb 12, 2015 at 1:32 PM, Amanda Bittaker <
>>>>>>>> abitta...@wikimedia.org> wrote:
>>>>>>>>
>>>>>>>>> Thanks so much for the quick response, Nuria.
>>>>>>>>>
>>>>>>>>> I ran the exact same reports on the same cohort as one of the last
>>>>>>>>> batches that were failing. Last time 2/4 of the reports failed; when I
>>>>>>>>> reran them individually they succeeded.
>>>>>>>>> (But they don't always, I reran one report 3 times this morning
>>>>>>>>> before it worked.) This time, my failure rate got worse: 4/4 failed,
>>>>>>>>> although they said "PENDING" for a few seconds first, which is new.
>>>>>>>>>
>>>>>>>>> Is that useful information? Please do let me know what else I can
>>>>>>>>> do to help solve this.
>>>>>>>>>
>>>>>>>>> Thanks again,
>>>>>>>>> Amanda
>>>>>>>>>
>>>>>>>>> On Thu, Feb 12, 2015 at 1:09 PM, Jonathan Morgan <
>>>>>>>>> jmor...@wikimedia.org> wrote:
>>>>>>>>>
>>>>>>>>>> Thanks Nuria!
>>>>>>>>>>
>>>>>>>>>> On Thu, Feb 12, 2015 at 12:57 PM, Nuria Ruiz <nu...@wikimedia.org>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> If so a cohort + report to repro will be most useful.
>>>>>>>>>>
>>>>>>>>>> Translation:* try to run the exact same reports on the same
>>>>>>>>>> cohort again, to see if the same metrics fail. Let us know what you
>>>>>>>>>> find. ;)
>>>>>>>>>>
>>>>>>>>>> Same goes for anyone else who experiences these issues: the more
>>>>>>>>>> details we (users) can provide the engineers, the more effective
>>>>>>>>>> they can be at diagnosing and addressing the problems.
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> - J
>>>>>>>>>>
>>>>>>>>>> *for anyone who is not 100% familiar with that hip, new software
>>>>>>>>>> engineering lingo
>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Nuria
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Feb 12, 2015 at 12:35 PM, Dan Andreescu <
>>>>>>>>>>> dandree...@wikimedia.org> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Recently there was a restart of the labsdb cluster. I'm sorry
>>>>>>>>>>>> but I don't have time to check on it, but I bet that's the
>>>>>>>>>>>> problem. I'm off tomorrow unfortunately but I'll try to check
>>>>>>>>>>>> tomorrow night :( I hope someone else beats me to it.
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Feb 12, 2015 at 3:20 PM, Jonathan Morgan <
>>>>>>>>>>>> jmor...@wikimedia.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> (ping Kevin and Dan A.)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Amanda, I've had some problems with report failures
>>>>>>>>>>>>> recently when I ran a few test cohorts. On the same cohort, when
>>>>>>>>>>>>> I ran multiple concurrent reports (say, bytes added, edits, and
>>>>>>>>>>>>> pages created), some would fail and others succeed. It wasn't
>>>>>>>>>>>>> clear what the issue was.
>>>>>>>>>>>>>
>>>>>>>>>>>>> - J
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Feb 12, 2015 at 12:16 PM, Amanda Bittaker <
>>>>>>>>>>>>> abitta...@wikimedia.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I am getting failures again, both when uploading cohorts and
>>>>>>>>>>>>>> running reports. Strangely, it seems the more reports you try
>>>>>>>>>>>>>> to run in one batch the less likely it is any report will
>>>>>>>>>>>>>> succeed.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Is anyone else having these problems again? Wonderful
>>>>>>>>>>>>>> Analytics people, could you please work your magic again?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Many thanks,
>>>>>>>>>>>>>> Amanda
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>> Wikimetrics mailing list
>>>>>>>>>>>>>> Wikimetrics@lists.wikimedia.org
>>>>>>>>>>>>>> https://lists.wikimedia.org/mailman/listinfo/wikimetrics
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Jonathan T. Morgan
>>>>>>>>>>>>> Community Research Lead
>>>>>>>>>>>>> Wikimedia Foundation
>>>>>>>>>>>>> User:Jmorgan (WMF)
>>>>>>>>>>>>> <https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF)>
>>>>>>>>>>>>> jmor...@wikimedia.org

--
Jonathan T. Morgan
Community Research Lead
Wikimedia Foundation
User:Jmorgan (WMF) <https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF)>
jmor...@wikimedia.org