Pablo-WMDE added a comment.
We (@Michael + @toan + @Tarrow + myself) looked at the last 12 hours of those events. We could not immediately see a connection between the items of interest, but what stood out was that these errors (`"Request failed with status 0. Usually this means network failure or timeout"`) seem to happen in bursts.

This leads me to assume that the cause is not the termbox service in particular responding poorly to individual requests, but rather that the infrastructure is facing challenges at those moments which are outside of our team's control. I think we need someone with a better understanding of how the production hosting is currently set up (buzz words: tls-proxy, kubernetes), and with insight into the health of similar services at the same time, so that we do not look at this problem at the wrong layer of abstraction and lose the majority of relevant data points in the process (e.g. there is no point in searching for "termbox" in kibana if we are in fact dealing with a network problem).

//Possibly// related (proper write-up following): We are seeing our termbox service, when reaching out to the mediawiki API, occasionally receiving a 503 (e.g. <https://logstash.wikimedia.org/app/kibana#/doc/logstash-*/logstash-syslog-2020.09.24/syslog?id=AXTAe5h5LNRtRo5Xnvl8&_g=h@44136fa>). We are also seeing a very similar error from a TLS-proxy (e.g. <https://logstash.wikimedia.org/app/kibana#/doc/logstash-*/logstash-syslog-2020.09.24/syslog?id=AXTAe5h5LNRtRo5Xnvl8&_g=h@44136fa>). What we did not manage to track down is a corresponding error from an app server (MW; acting as API for the termbox service), which we would have expected to see. We wonder if the errors may have something to do with the "recent" move of termbox to TLS (T254581 <https://phabricator.wikimedia.org/T254581>), whose workings we do not know. How would we track down those errors?

(The few `Bad request. Language not existing` errors we saw while looking at this seem to be application-level problems and deserve a ticket.)

TASK DETAIL
https://phabricator.wikimedia.org/T255410

To: toan, Pablo-WMDE
Cc: Sakretsu, akosiaris, JMeybohm, WMDE-leszek, Pablo-WMDE, Tarrow, Jakob_WMDE, Addshore, Aklapper, Michael, Akuckartz, Iflorez, darthmon_wmde, alaa_wmde, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Wikidata-bugs, aude, Lydia_Pintscher, Mbch331