Pablo-WMDE added a comment.

  We (@Michael + @toan + @Tarrow + myself) looked at the last 12 hours of those 
events. We could not immediately see a connection between the items of 
interest, but what stood out was that these errors 
(`"Request failed with status 0. Usually this means network failure or 
timeout"`) seem to occur in bursts.
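
  A minimal sketch of how such bursts could be made visible, assuming the event 
timestamps have been exported (e.g. from kibana) as Python `datetime` objects. 
The function names and the threshold are hypothetical and not part of any 
existing tooling:

```python
from collections import Counter
from datetime import datetime


def bucket_by_minute(timestamps):
    """Count how many events fall into each calendar minute."""
    counts = Counter()
    for ts in timestamps:
        # Truncate to the start of the minute so events group together.
        counts[ts.replace(second=0, microsecond=0)] += 1
    return counts


def find_bursts(timestamps, threshold=3):
    """Return the minutes (sorted) with at least `threshold` events.

    The threshold is an assumption; a sensible value depends on the
    normal background rate of the error.
    """
    counts = bucket_by_minute(timestamps)
    return sorted(minute for minute, n in counts.items() if n >= threshold)
```

  The minutes returned by `find_bursts` could then be compared against the 
health of other services at the same moments.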
  
  This leads me to assume that the problem is not caused by the termbox service 
responding poorly to individual requests, but that the infrastructure is facing 
challenges at those moments which are outside of our team's control. I think 
we need someone with a better understanding of how the production hosting is 
currently set up (buzzwords: tls-proxy, kubernetes), and with insight into the 
health of similar services at the same time, so that we do not look at this 
problem at the wrong layer of abstraction and lose the majority of relevant 
data points in the process (e.g. there is no point in searching for "termbox" 
in kibana if we are in fact dealing with a network problem).
  
  //Possibly// related (proper write-up following):
  We are seeing our termbox service, when reaching out to the mediawiki API, 
occasionally receiving a 503 (e.g. 
<https://logstash.wikimedia.org/app/kibana#/doc/logstash-*/logstash-syslog-2020.09.24/syslog?id=AXTAe5h5LNRtRo5Xnvl8&_g=h@44136fa>).
  We are also seeing a very similar error from a TLS-proxy (e.g. 
<https://logstash.wikimedia.org/app/kibana#/doc/logstash-*/logstash-syslog-2020.09.24/syslog?id=AXTAe5h5LNRtRo5Xnvl8&_g=h@44136fa>).
  What we did not manage to track down is a corresponding error from an app 
server (MW; acting as API for the termbox service) which we would have expected 
to see.
  We wonder whether the errors have something to do with the "recent" move of 
termbox to TLS (T254581 <https://phabricator.wikimedia.org/T254581>). We do not 
know how that setup works; how would we track down those errors?
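
  One way to check whether the termbox 503s and the TLS-proxy errors belong 
together is to match them by time. A rough sketch, assuming both sets of 
timestamps are available as `datetime` objects; the function name and the 
2-second window are hypothetical choices:

```python
from bisect import bisect_left, bisect_right
from datetime import datetime, timedelta


def match_within(service_errors, proxy_errors, window=timedelta(seconds=2)):
    """Map each service error to the proxy errors within +/- `window` of it."""
    proxy_sorted = sorted(proxy_errors)
    matches = {}
    for ts in service_errors:
        # Binary search for the slice of proxy errors inside the window.
        lo = bisect_left(proxy_sorted, ts - window)
        hi = bisect_right(proxy_sorted, ts + window)
        matches[ts] = proxy_sorted[lo:hi]
    return matches
```

  The same matching could be run against app server logs; termbox errors with 
an empty match list there would be the ones we were unable to track down.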
  
  (The few `Bad request. Language not existing` errors we saw while looking at 
this seem to be application-level problems and deserve a separate ticket.)

TASK DETAIL
  https://phabricator.wikimedia.org/T255410

To: toan, Pablo-WMDE
Cc: Sakretsu, akosiaris, JMeybohm, WMDE-leszek, Pablo-WMDE, Tarrow, Jakob_WMDE, 
Addshore, Aklapper, Michael, Akuckartz, Iflorez, darthmon_wmde, alaa_wmde, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, 
rosalieper, Scott_WUaS, Jonas, Wikidata-bugs, aude, Lydia_Pintscher, Mbch331