[Wikidata-bugs] [Maniphest] [Updated] T212189: New Service Request: Wikidata Termbox SSR
akosiaris added a comment.

@WMDE-leszek Hi, sorry for not answering sooner, the last few weeks have been crazy indeed. Q4/Q2 has started. We can finally start work on this. The tracking task for this is T220402 <https://phabricator.wikimedia.org/T220402>. Barring various issues that might crop up and stall this, we could get this deployed by the end of April, hopefully even before that. I'll start posting updates on T220402 <https://phabricator.wikimedia.org/T220402>.

TASK DETAIL https://phabricator.wikimedia.org/T212189 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: akosiaris Cc: RazShuty, sbassett, thcipriani, Tarrow, Smalyshev, jijiki, fsero, CDanis, akosiaris, Krinkle, Milimetric, daniel, mobrovac, Joe, Matthias_Geisler_WMDE, Jakob_WMDE, Pablo-WMDE, Aklapper, Lydia_Pintscher, Lea_WMDE, Addshore, WMDE-leszek, alaa_wmde, holger.knust, Legado_Shulgin, Nandana, thifranc, AndyTan, Davinaclare77, Qtn1293, Lahi, Gq86, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, LawExplorer, Zppix, _jensen, rosalieper, Wong128hk, Eevans, Hardikj, Wikidata-bugs, aude, faidon, Jdforrester-WMF, Mbch331, Jay8g, fgiunchedi, Dzahn ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Edited] T217641: Stop analytics-wmde-scripts/blob/master/src/wikidata/social/googleplus.php script
akosiaris updated the task description.
[Wikidata-bugs] [Maniphest] [Closed] T217641: Stop analytics-wmde-scripts/blob/master/src/wikidata/social/googleplus.php script
akosiaris closed this task as "Resolved". akosiaris claimed this task. akosiaris added a comment.

The dummy private repo has been updated, and so has the actual private repo. Resolving, thanks!
[Wikidata-bugs] [Maniphest] [Commented On] T220402: Introduce wikidata termbox SSR to kubernetes
akosiaris added a comment.

@Tarrow, @WMDE-leszek. I've been working on the termbox helm chart and, while the service seems to be up and running, I see neither an `/_info` endpoint nor a swagger/openapi[1] spec published under `/?spec`. Both are crucial for deploying: the former is used as a kubernetes readiness probe (i.e. if an instance of the app can't serve it for any reason, that instance will temporarily not receive new traffic) and the latter is used by our monitoring, so we can't proceed without them. Could you please have a look and add them? Thanks!

[1] https://swagger.io/docs/specification/about/
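To illustrate the readiness-probe role of `/_info` described above, here is a minimal Python sketch. It assumes the usual kubernetes HTTP probe convention that any status from 200 up to (but not including) 400 counts as success. The function name `is_ready` and the injected `fetch_status` callable are hypothetical and are not part of termbox or its helm chart.

```python
from typing import Callable


def is_ready(fetch_status: Callable[[str], int], path: str = "/_info") -> bool:
    """Decide whether an instance should receive traffic.

    fetch_status is a hypothetical callable that performs the HTTP GET
    and returns the status code; it is injected so the decision logic
    can be exercised without a running service.
    """
    try:
        status = fetch_status(path)
    except OSError:
        # Connection refused, timeout, etc.: the instance is not ready.
        return False
    # Assumption: 2xx and 3xx count as success, as for kubernetes HTTP probes.
    return 200 <= status < 400
```

An instance that answers `/_info` with a 404, or does not answer at all, would be reported as not ready, which matches the behaviour described above: it temporarily stops receiving new traffic.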
[Wikidata-bugs] [Maniphest] [Updated] T220402: Introduce wikidata termbox SSR to kubernetes
akosiaris added a comment.

@WMDE-leszek. Yes I did. Using https://locust.io/, I wrote P8511 <https://phabricator.wikimedia.org/P8511> and benchmarked the service locally on my minikube instance. A rough howto (minus the locust part) is at https://wikitech.wikimedia.org/wiki/User:Alexandros_Kosiaris/Benchmarking_kubernetes_apps. Below are graphs with the results from the benchmarks:

CPU: F29003763: termbox_cpu_usage.png <https://phabricator.wikimedia.org/F29003763>
Memory: F29003768: termbox_mem_usage.png <https://phabricator.wikimedia.org/F29003768>
Requests: F29003766: termbox_locust_stats.png <https://phabricator.wikimedia.org/F29003766>

So a single-worker termbox installation uses at most about 1 CPU and ~235MB of memory, and is able to serve ~16 req/s. When idling we are at ~140MB and 0 CPU usage. Those numbers should be used as a guide to gauge how many instances of the app we will need. I've updated the chart with those numbers and will proceed with merging it.
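As a rough illustration of how the benchmark numbers above could drive instance counts, here is a small Python sketch. The per-worker figures (~16 req/s, ~1 CPU, ~235MB) come from the message above; the target request rate, the `headroom` parameter and the function names are assumptions made up for the example, not values from the helm chart.

```python
import math

# Per-worker figures taken from the benchmark above.
REQS_PER_WORKER = 16       # ~16 req/s served by one single-worker instance
CPU_PER_WORKER = 1.0       # ~1 CPU at peak
MEM_PER_WORKER_MB = 235    # ~235MB at peak


def instances_needed(target_rps: float, headroom: float = 0.0) -> int:
    """Instances required for target_rps, with optional fractional headroom.

    target_rps and headroom are hypothetical inputs; the thread does not
    state an expected production request rate.
    """
    return max(1, math.ceil(target_rps * (1 + headroom) / REQS_PER_WORKER))


def footprint(target_rps: float, headroom: float = 0.0) -> dict:
    """Peak CPU and memory needed by the instances computed above."""
    n = instances_needed(target_rps, headroom)
    return {
        "instances": n,
        "cpu": n * CPU_PER_WORKER,
        "memory_mb": n * MEM_PER_WORKER_MB,
    }
```

For example, a hypothetical 100 req/s with no headroom gives ceil(100 / 16) = 7 instances, i.e. about 7 CPUs and 1645MB at peak.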
[Wikidata-bugs] [Maniphest] [Commented On] T220402: Introduce wikidata termbox SSR to kubernetes
akosiaris added a comment.

@Tarrow, @WMDE-leszek I've noticed 3 things while working on the above:

- The service seems to be configurable to reach out directly to the wikidata endpoint, e.g. `WIKIBASE_REPO: '{env(WIKIBASE_REPO,https://www.wikidata.org/w)}'`. In the general case, this goes via the edge caches, possibly polluting them with artificial requests, and it limits operational flexibility (e.g. in the case of a datacenter switchover we can't really switch the app without switching the edge layer). For those reasons we've been defaulting to talking to internal endpoints and sending a `Host:` HTTP header to identify the exact project. Would it be possible to add that functionality? (Or guide me on how this is achieved if it is already implemented.)
- There was no `x-amples` stanza for the `/termbox` endpoint. We need this for monitoring checks, so I went ahead and added one in https://gerrit.wikimedia.org/r/#/c/wikibase/termbox/+/509391. Lemme know what you think.
- The app does not emit statsd metrics for requests (it does, however, emit nodejs heap and GC stats). This should be rather easy to add as service-runner wraps it very nicely. See https://github.com/wikimedia/service-runner#metric-reporting. What exactly you want to do for that is up to you, but up to now services have been using https://github.com/wikimedia/service-template-node/blob/master/lib/util.js#L107 to wrap the route handlers, which provides them out of the box with request rates per endpoint, latencies per endpoint, errors and so on. I highly suggest that. For service-specific stats, e.g. how many times language=de has been passed in a request, `options.metrics.increment` is what has been used up to now, as they have invariably been of a counter nature.

Overall I think we are ready to deploy to staging; production will have to wait a bit until we solve at least the first 2 items. The 3rd is not a hard blocker but I am willing to bet you want stats :-)

I also think we should schedule a training session on how deployers deploy code in the pipeline/kubernetes environment.
[Wikidata-bugs] [Maniphest] [Commented On] T220402: Introduce wikidata termbox SSR to kubernetes
akosiaris added a comment.

> With respect to the end point checks it would be great to hear what we are trying to achieve with them. Our service depends on the availability of another service. If the examples are to act as smoke tests then their reliability depends on the upstream service; a dependency which would need to be configured (are we going to point it against prod for this?) & modeled (how to express service inter-dependency in the config?) in order to be able to make sense of the information down the line (i.e. "no need to be alarmed that this service reported 500 while the mw api was down").

As per https://gerrit.wikimedia.org/r/#/c/wikibase/termbox/+/509391/, the `x-amples` spec defines checks that will be run using the service-checker[1] software in the wikimedia infrastructure. Those run on a cron, contacting the service every minute. The dependency on mediawiki is something we can work on at the alerting system level, but I have to say we don't currently do that for any service (and multiple ones rely on mediawiki), mostly for technical reasons related to our current monitoring infrastructure. That will not always be the case though. Also note that we know and accept that those monitoring checks create inorganic traffic and increase system load. It's a price that we have accepted to pay.
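The idea of an `x-amples` entry acting as a periodic smoke test can be sketched in Python as follows. This is not service-checker's implementation: the `CheckResult` type, the `evaluate_example` function and the message wording are hypothetical, and only the expected status is compared (header checks are left out).

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    ok: bool
    message: str


def evaluate_example(path: str, example: dict, actual_status: int) -> CheckResult:
    """Compare the status a service returned with the one an x-amples entry expects.

    example is a dict shaped like an x-amples entry, i.e. it has a "title"
    and a "response" with an expected "status".
    """
    title = example["title"]
    expected = example["response"]["status"]
    if actual_status == expected:
        return CheckResult(True, f"{path} ({title}) is OK")
    return CheckResult(
        False,
        f"{path} ({title}) is CRITICAL: returned status {actual_status} "
        f"(expecting: {expected})",
    )
```

Note that such a check cannot tell why the status differs: a 500 caused by the upstream mediawiki API being down is reported exactly like a 500 caused by the service itself, which is the dependency concern raised in the quoted question.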
[Wikidata-bugs] [Maniphest] [Commented On] T220402: Introduce wikidata termbox SSR to kubernetes
akosiaris added a comment.

In T220402#5177862 <https://phabricator.wikimedia.org/T220402#5177862>, @Pablo-WMDE wrote:
> Hi @akosiaris - thanks for getting back to us.
>
> > sending a Host: HTTP for the identification of the exact project. Would it be possible to add that functionality?
>
> It certainly //is// possible and depending on operational needs we certainly can make this happen. We quickly discussed this in the team and would like to first truly understand the goal to make sure we don't mix up the different layers of our proverbial sausage pizza without a valid reason. The service would be run in a container inside a k8s pod, controlling its DNS - why not use this option to make sure requests reach the intended endpoint? The correct host would then come "for free" per the host part configured in `WIKIBASE_REPO`.

The intended hostname is going to be (as it is for all services) `appservers.discovery.wmnet`. That is an internal hostname that points to all the wiki projects and allows us to easily do a set of operations like switching traffic from one datacenter to another. That being said, as you can see, nothing specific about a project is encoded in that hostname. That's on purpose, as we don't have a need - and don't really want - to treat e.g. enwiki operationally differently from dewiki. Technically it's just a set of apache webservers with many, many virtualhosts for all of the hundreds of projects we have. Hence the need for the `Host: wikidata.org` HTTP header. k8s can't help with that.

> FTR: One internal talking point here was that intentionally circumventing edge side caching might put additional strain on the infrastructure, increase response times - but we assume you have this covered.
>
> > `x-amples` stanza to `/termbox` endpoint
>
> I looked at the change in gerrit <https://gerrit.wikimedia.org/r/#/c/wikibase/termbox/+/509391>, please find my comment there.

Answered already, thanks for looking into it.

> > app does not emit statsd metrics for requests
>
> I will create a story for that => T223202 <https://phabricator.wikimedia.org/T223202>
>
> > service specific stats
>
> Will be added as we go per the requirements from the technical product owner(s)
>
> Thanks

Thanks!
[Wikidata-bugs] [Maniphest] [Commented On] T220402: Introduce wikidata termbox SSR to kubernetes
akosiaris added a comment.

In T220402#5180031 <https://phabricator.wikimedia.org/T220402#5180031>, @Pablo-WMDE wrote:
> Hi @akosiaris,
>
> thanks for taking the time to explain the way the `Host` header is intended to be used.
> If I understand correctly the goal is to ensure that requests originating from our service bear a header `Host: (www.)wikidata.org` and reach which ever IP(s) `appservers.discovery.wmnet` resolves to on the system running it. This sounds like a name resolution challenge and a case for HostAliases <https://kubernetes.io/docs/concepts/services-networking/add-entries-to-pod-etc-hosts-with-host-aliases/#adding-additional-entries-with-hostaliases> or, more traditionally, a CNAME record <https://en.wikipedia.org/wiki/CNAME_record>.

Unfortunately no. It is not a name resolution issue and hence HostAliases and/or CNAME records will not help. Note that what we want to do here IS NOT to direct the traffic to the correct endpoint, which is indeed a name resolution issue, but rather to inform the endpoint which of the hundreds of virtualhosts the request is for, something that needs to be done at the HTTP level.

> Again, if we are missing something, implementing this can be arranged, but we really rather not as it sounds like there are more appropriate tools for the job.

If there were, we would not be asking for it :-)
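The distinction between name resolution and the HTTP-level `Host` header can be shown with a short Python sketch. It only builds the request and does not send it. The hostname `appservers.discovery.wmnet` and the `Host: www.wikidata.org` value come from the thread; the path `/w/api.php`, the constant names and the `build_request` helper are illustrative assumptions.

```python
import urllib.request

# Where the connection goes: an internal endpoint shared by all projects.
INTERNAL_ENDPOINT = "https://appservers.discovery.wmnet"
# Which virtualhost the request is for: carried inside the HTTP request.
PROJECT_HOST = "www.wikidata.org"


def build_request(path: str) -> urllib.request.Request:
    """Build (but do not send) a request aimed at the internal endpoint.

    The URL decides which machines are contacted (name resolution), while
    the explicit Host header tells those machines which project is meant.
    """
    return urllib.request.Request(
        INTERNAL_ENDPOINT + path,
        headers={"Host": PROJECT_HOST},
    )


req = build_request("/w/api.php")  # illustrative path, an assumption
print(req.host)                # host part of the URL that would be resolved
print(req.get_header("Host"))  # Host header that would be sent
```

The two values differ on purpose: changing DNS (HostAliases, CNAME) only affects the first one, while the webservers choose the virtualhost based on the second.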
[Wikidata-bugs] [Maniphest] [Commented On] T199219: WDQS should use internal endpoint to communicate to Wikidata
akosiaris added a comment.

In T199219#4452041 <https://phabricator.wikimedia.org/T199219#4452041>, @Smalyshev wrote:
> @BBlack I am getting rather strange result with `appservers-ro.discovery.wmnet` - if I call the URL you provided, the call takes a lot of time:
>
> real 0m4.270s
>
> while if I call to `www.wikidata.org`, I get:
>
> real 0m0.127s
>
> Same with `api-ro`. `appservers-rw` is a bit faster:
>
> real 0m0.320s
>
> But still 3x from going through frontend (and it's not caching - I changed the URL, result is the same, and varnish settings all say "miss").

Is this still true? I see

    deploy1001:~$ for i in appservers-ro appservers-rw api-ro api-rw ; do echo -n $i; time curl -s -o /dev/null -X GET -H 'Host: www.wikidata.org' "https://${i}.discovery.wmnet/wiki/Special:EntityData/Q2408871.ttl?nocache=1530836328152&flavor=dump"; done
    appservers-ro
    real 0m0.097s
    user 0m0.020s
    sys 0m0.012s
    appservers-rw
    real 0m0.113s
    user 0m0.028s
    sys 0m0.000s
    api-ro
    real 0m0.113s
    user 0m0.024s
    sys 0m0.012s
    api-rw
    real 0m0.128s
    user 0m0.028s
    sys 0m0.004s

Note that it matters quite a lot where the tests are run from: the active DC is going to be faster anyway. Running those from codfw yields entirely different results, as there are round trips between the DCs that need to be served.

    deploy2001:~$ for i in appservers-ro appservers-rw api-ro api-rw ; do echo -n $i; time curl -s -o /dev/null -H 'Host: www.wikidata.org' "https://${i}.discovery.wmnet/wiki/Special:EntityData/Q2408871.ttl?nocache=1530836328152&flavor=dump"; done
    appservers-ro
    real 0m6.435s
    user 0m0.024s
    sys 0m0.012s
    appservers-rw
    real 0m0.323s
    user 0m0.012s
    sys 0m0.020s
    api-ro
    real 0m5.061s
    user 0m0.028s
    sys 0m0.004s
    api-rw
    real 0m0.276s
    user 0m0.032s
    sys 0m0.000s

Those are comparable to the numbers posted above.
[Wikidata-bugs] [Maniphest] [Commented On] T199219: WDQS should use internal endpoint to communicate to Wikidata
akosiaris added a comment.

In T199219#5234041 <https://phabricator.wikimedia.org/T199219#5234041>, @Smalyshev wrote:
> There's a change though that WDQS no longer uses `nocache` for cache-busting in most common cases (see T217897 <https://phabricator.wikimedia.org/T217897> for more details). So I am not sure using internal endpoint now makes sense.

It's not just about caches though. It's also about easier service level operations, e.g. switchover between DCs becomes easier and less error prone if the internal endpoint is used instead of the external one.
[Wikidata-bugs] [Maniphest] [Commented On] T220402: Introduce wikidata termbox SSR to kubernetes
akosiaris added a comment.

@tarrow, @WMDE-leszek Hi, sorry for taking so long to answer this, it's been really busy.

In T220402#5214471 <https://phabricator.wikimedia.org/T220402#5214471>, @Tarrow wrote:
> @mobrovac Thanks! I think we've now taken most of this onboard and merged it.
>
> @akosiaris could you take a look at out new x-amples block and Host header? Hopefully this meets our needs :).

I did have a look. Thanks for catering to the Host header support! I've updated the helm chart to honor the above and went through a testing phase. I've also added you both as maintainers of the helm chart: https://gerrit.wikimedia.org/r/515028

Almost everything is fine, except for an issue in the generator for the x-amples in the HEALTHCHECK_URL. The generated /openapi.json stanza in question is

    "x-amples": [
      {
        "title": "get rendered termbox",
        "request": {
          "query": {
            "language": "de",
            "entity": "Q1",
            "revision": 103,
            "editLink": "/edit/Q1347",
            "preferredLanguages": [ "de", "en" ]
          }
        },
        "response": {
          "status": 200,
          "headers": { "content-type": "text/html" }
        }
      }
    ]

Note how `preferredLanguages` is given as an array rather than as the pipeDelimited string the openapi spec requires. That in turn causes service-checker to error out with the following

    DEBUG:urllib3.connectionpool:http://192.168.99.100:18788 "GET /termbox?revision=103&editLink=%2Fedit%2FQ1347&entity=Q1&preferredLanguages=%5B%27de%27%2C+%27en%27%5D&language=de HTTP/1.1" 400 169
    /termbox (get rendered termbox) is CRITICAL: Test get rendered termbox returned the unexpected status 400 (expecting: 200)

as it tries to convert `[` and `]` into url-encoded strings, thus creating a parameter that violates the spec and making the health check fail. I had a quick look at the code but found no quick way to fix it; could you please have a look?
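The encoding difference behind the 400 above can be reproduced with Python's standard library. This is a sketch, not service-checker's code: the two helper names are hypothetical, and the parameter values are the ones from the `x-amples` stanza in the message above. Passing a list straight to `urlencode` stringifies it, brackets and quotes included, whereas joining the values with `|` first gives the pipe-delimited form.

```python
from urllib.parse import urlencode

# Query parameters from the x-amples stanza quoted in the thread.
params = {
    "language": "de",
    "entity": "Q1",
    "revision": 103,
    "editLink": "/edit/Q1347",
    "preferredLanguages": ["de", "en"],
}


def encode_naive(query: dict) -> str:
    """Encode as-is: a list value is turned into its string representation."""
    return urlencode(query)


def encode_pipe_delimited(query: dict) -> str:
    """Join list values with '|' before encoding (pipe-delimited style)."""
    flat = {
        key: "|".join(map(str, value)) if isinstance(value, list) else value
        for key, value in query.items()
    }
    return urlencode(flat)


print(encode_naive(params))
print(encode_pipe_delimited(params))
```

The naive form contains `preferredLanguages=%5B%27de%27%2C+%27en%27%5D`, the same value seen in the failing request, while the pipe-delimited form contains `preferredLanguages=de%7Cen`.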
[Wikidata-bugs] [Maniphest] [Updated] T220402: Introduce wikidata termbox SSR to kubernetes
akosiaris added a comment.

Indeed this was fixed. However, another regression has reared its head. Doing a

    curl 'http://192.168.99.100:18788/termbox?editLink=%2Fedit%2FQ1347&preferredLanguages=de&language=de&entity=Q1&revision=103'

returns a 500 with an error in the logs

    {
      "name": "termbox",
      "hostname": "termbox-termbox-64d46c6dcb-8jl86",
      "pid": 16,
      "level": 50,
      "err": {
        "message": "",
        "name": "wikibase-termbox",
        "stack": "Error: wikibase-shortcopyrightwarning-heading is not a valid message-key.\nat wikibase.termbox.main.js:18168:17\nat Array.forEach ()\nat AxiosWikibaseMessagesRepo.transformMessages (src/server/data-access/AxiosWikibaseMessagesRepo.ts:77:11)\nat AxiosWikibaseMessagesRepo.getMessageTranslationCollection (src/server/data-access/AxiosWikibaseMessagesRepo.ts:71:24)\nat wikibase.termbox.main.js:18147:27\nat process._tickCallback (internal/process/next_tick.js:68:7)",
        "levelPath": "error/service"
      },
      "msg": "wikibase-shortcopyrightwarning-heading is not a valid message-key.",
      "time": "2019-06-07T12:04:33.596Z",
      "v": 0
    }

Note that the previous image, `2019-06-03-162801-production` (aka git sha1 7c5347f48b4225a48e94bb846c7e4d8820cde347 <https://phabricator.wikimedia.org/rWBTB7c5347f48b4225a48e94bb846c7e4d8820cde347>), did not exhibit this. So this is a regression introduced since then.
[Wikidata-bugs] [Maniphest] [Commented On] T199219: WDQS should use internal endpoint to communicate to Wikidata
akosiaris added a comment.

In T199219#5234979 <https://phabricator.wikimedia.org/T199219#5234979>, @Smalyshev wrote:
> But won't we lose use of the varnish cache if we use the internal endpoint?

Yes that's true. That being said, is that particularly important? Will WDQS fail in spectacular ways if it requests objects over the uncached endpoints?
[Wikidata-bugs] [Maniphest] [Commented On] T220402: Introduce wikidata termbox SSR to kubernetes
akosiaris added a comment.

In T220402#5243209 <https://phabricator.wikimedia.org/T220402#5243209>, @Tarrow wrote:
> This should now be fixed. Sadly this was due to a mismatch between the code in wikibase master and that deployed on Wikidata.org

And yes, we are finally in greener pastures.

    helm test termbox
    RUNNING: termbox-termbox-service-checker
    PASSED: termbox-termbox-service-checker

Thanks for taking care of it so quickly!
[Wikidata-bugs] [Maniphest] [Commented On] T220402: Introduce wikidata termbox SSR to kubernetes
akosiaris added a comment.

Just as an FYI, everything looks ok on this end, but there's a train freeze this week, so we have to wait before deploying this. Patches are up and waiting to be merged on Monday the 17th.
[Wikidata-bugs] [Maniphest] [Closed] T220402: Introduce wikidata termbox SSR to kubernetes
akosiaris closed this task as "Resolved". akosiaris claimed this task. akosiaris added a comment.

    curl -s -I -X GET 'http://termbox.discovery.wmnet:3030/termbox?editLink=%2Fedit%2FQ1347&preferredLanguages=de&language=de&entity=Q1&revision=103'
    HTTP/1.1 200 OK
    X-Powered-By: Express
    Content-Type: text/html; charset=utf-8
    Content-Length: 1568
    ETag: W/"620-MeNQXY3hfRVxLzBPruUZ418lGUc"
    Vary: Accept-Encoding
    Date: Tue, 18 Jun 2019 14:30:54 GMT
    Connection: keep-alive

This is finally deployed. The canonical endpoint to talk to the service in production is `termbox.discovery.wmnet`, as per the command above.

@Tarrow, @WMDE-leszek, I'll resolve this, feel free to reopen.
[Wikidata-bugs] [Maniphest] [Commented On] T212189: New Service Request: Wikidata Termbox SSR
akosiaris added a comment. @WMDE-leszek, @Tarrow. I 've noticed we are missing one thing. We have a dashboard for the service's metrics in https://grafana.wikimedia.org/d/AJf0z_7Wz/termbox but it looks like the service isn't sending request metrics to the local statsd instance. It is sending however memory and nodejs GC metrics which already appear in the graphs. `service-runner` already has code for it, see https://github.com/wikimedia/service-template-node/blob/a92cccea9df8af7bda315b4eb41495c95bbfbdad/lib/util.js#L98 for how to wrap the `/termbox` endpoint (or any other endpoint, wrapping /_info is also helpful) in order to have traffic, error and latency graphs (and consequently SLIs) for it. TASK DETAIL https://phabricator.wikimedia.org/T212189 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: akosiaris Cc: Mholloway, RazShuty, sbassett, thcipriani, Tarrow, Smalyshev, jijiki, fsero, CDanis, akosiaris, Krinkle, Milimetric, daniel, mobrovac, Joe, Matthias_Geisler_WMDE, Jakob_WMDE, Pablo-WMDE, Aklapper, Lydia_Pintscher, Lea_WMDE, Addshore, WMDE-leszek, darthmon_wmde, holger.knust, Legado_Shulgin, Nandana, thifranc, AndyTan, Davinaclare77, Qtn1293, Lahi, Gq86, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, LawExplorer, Zppix, _jensen, rosalieper, Wong128hk, Eevans, Hardikj, Wikidata-bugs, aude, faidon, Jdforrester-WMF, Mbch331, Jay8g, fgiunchedi, Dzahn ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
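What "wrapping an endpoint" buys can be sketched independently of the actual Node.js code linked above. The following Python illustration is not service-runner's API: the metric names, the `with_request_metrics` wrapper and the `emit` callback are all hypothetical. It only shows how one wrapper around a handler yields the three signals mentioned (traffic, errors, latency), formatted as StatsD `name:value|type` lines.

```python
import time


def statsd_lines(name, status, elapsed_ms):
    """Format one request as StatsD lines: a counter per status class and a timer."""
    status_class = f"{status // 100}xx"
    return [
        f"{name}.{status_class}:1|c",           # traffic and errors (count by class)
        f"{name}.latency:{elapsed_ms:.1f}|ms",  # latency
    ]


def with_request_metrics(name, handler, emit, clock=time.monotonic):
    """Wrap handler so every call reports traffic, errors and latency via emit().

    handler must return a (status, body) tuple; emit receives one StatsD line
    at a time (in practice it would write to a UDP socket).
    """
    def wrapped(*args, **kwargs):
        start = clock()
        status = 500  # reported if the handler raises before returning a status
        try:
            status, body = handler(*args, **kwargs)
            return status, body
        finally:
            elapsed_ms = (clock() - start) * 1000.0
            for line in statsd_lines(name, status, elapsed_ms):
                emit(line)
    return wrapped


# Usage: collect the lines in a list instead of sending them over the network.
sent = []
termbox = with_request_metrics("termbox.get", lambda: (200, "<html>"), sent.append)
termbox()
print(sent)
```

Counting by status class rather than by exact status code keeps the number of metric names small while still separating successes from client and server errors.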
[Wikidata-bugs] [Maniphest] [Commented On] T212189: New Service Request: Wikidata Termbox SSR
akosiaris added a comment. @WMDE-leszek, @Tarrow. Any feedback on the comment above? TASK DETAIL https://phabricator.wikimedia.org/T212189 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: akosiaris Cc: Mholloway, RazShuty, sbassett, thcipriani, Tarrow, Smalyshev, jijiki, fsero, CDanis, akosiaris, Krinkle, Milimetric, daniel, mobrovac, Joe, Matthias_Geisler_WMDE, Jakob_WMDE, Pablo-WMDE, Aklapper, Lydia_Pintscher, Lea_WMDE, Addshore, WMDE-leszek, darthmon_wmde, holger.knust, Legado_Shulgin, Nandana, thifranc, AndyTan, Davinaclare77, Qtn1293, Techguru.pc, Lahi, Gq86, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, LawExplorer, Zppix, _jensen, rosalieper, Pchelolo, Wong128hk, Eevans, Hardikj, Wikidata-bugs, aude, faidon, Jdforrester-WMF, Mbch331, Jay8g, fgiunchedi, Dzahn ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T212189: New Service Request: Wikidata Termbox SSR
akosiaris added a comment. In T212189#5289724 <https://phabricator.wikimedia.org/T212189#5289724>, @Tarrow wrote: > @akosiaris Yep; we've interpreted it as something we really need before exposing it to real traffic. We've got a ticket open about it that we'll be picking up real soon: T226625 <https://phabricator.wikimedia.org/T226625> Cool, that is great. Thanks for the input. > We're assuming that it's still ok to continue in parallel with some integration with test.wikidata.org though As far as I am concerned, that's totally fine. TASK DETAIL https://phabricator.wikimedia.org/T212189 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: akosiaris Cc: Mholloway, RazShuty, sbassett, thcipriani, Tarrow, Smalyshev, jijiki, fsero, CDanis, akosiaris, Krinkle, Milimetric, daniel, mobrovac, Joe, Matthias_Geisler_WMDE, Jakob_WMDE, Pablo-WMDE, Aklapper, Lydia_Pintscher, Lea_WMDE, Addshore, WMDE-leszek, darthmon_wmde, holger.knust, Legado_Shulgin, Nandana, thifranc, AndyTan, Davinaclare77, Qtn1293, Techguru.pc, Lahi, Gq86, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, LawExplorer, Zppix, _jensen, rosalieper, Pchelolo, Wong128hk, Eevans, Hardikj, Wikidata-bugs, aude, faidon, Jdforrester-WMF, Mbch331, Jay8g, fgiunchedi, Dzahn ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T176875: Allow access to wdqs.svc.eqiad.wmnet on port 8888
akosiaris added a comment. In T176875#5317155 <https://phabricator.wikimedia.org/T176875#5317155>, @Ottomata wrote: > @Addshore, just saw T218710 <https://phabricator.wikimedia.org/T218710> and clicked through to here. If you use https://wikitech.wikimedia.org/wiki/HTTP_proxy, you can access wdqs.svc.eqiad.wmnet over HTTP from the analytics VLAN. Please don't do that. As the page very clearly says it's `To allow HTTP requests reach the outside world`, not to bypass internal restrictions TASK DETAIL https://phabricator.wikimedia.org/T176875 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: akosiaris Cc: akosiaris, Ottomata, elukey, Smalyshev, Gehel, Addshore, Aklapper, darthmon_wmde, ET4Eva, Legado_Shulgin, Nandana, thifranc, AndyTan, Davinaclare77, Qtn1293, Techguru.pc, Lahi, Gq86, Darkminds3113, Lucas_Werkmeister_WMDE, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, EBjune, merbst, LawExplorer, Avner, Zppix, _jensen, rosalieper, Jonas, FloNight, Xmlizer, Wong128hk, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, faidon, Mbch331, Jay8g, fgiunchedi ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T229236: Investigate if the code of Graphoid uses a proper user agent header
akosiaris added a comment. > It's unmaintained as you pointed out (see T211881 <https://phabricator.wikimedia.org/T211881> for the gory details). Even if we fixed the blubber file (I am guessing that's where the issue is), deploying anything merged would probably not happen. Plus, there is no one to even review the patches and get them merged right now. Per https://phabricator.wikimedia.org/T211881#5509001, graphoid is to be undeployed TASK DETAIL https://phabricator.wikimedia.org/T229236 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ladsgroup, akosiaris Cc: akosiaris, alaa_wmde, Ladsgroup, Aklapper, Yurik, Smalyshev, Lydia_Pintscher, Lea_Lacroix_WMDE, Iflorez, darthmon_wmde, Nandana, Lahi, Gq86, Capankajsmilyo, GoranSMilovanovic, QZanden, LawExplorer, SongTake, _jensen, rosalieper, Scott_WUaS, Jonas, Wikidata-bugs, aude, Ricordisamoa, Mbch331 ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Triaged] T247058: Deployment strategy and hardware requirement for new Flink based WDQS updater
akosiaris triaged this task as "Medium" priority. TASK DETAIL https://phabricator.wikimedia.org/T247058 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: akosiaris Cc: Aklapper, dcausse, Zbyszko, Gehel, darthmon_wmde, Legado_Shulgin, Nandana, Davinaclare77, Qtn1293, Techguru.pc, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, EBjune, merbst, LawExplorer, Zppix, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, Wong128hk, jkroll, Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, faidon, Mbch331, Rxy, Jay8g, fgiunchedi ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T210260: Stretch in docker registry forces ascii encoding
akosiaris added a comment. In T210260#4784708, @LarsWirzenius wrote: C.UTF8 does not exist. In every other locale I try, a UTF8 suffix is an alias to the UTF-8 suffix (with the dash). This works: docker run unitest env LC_ALL=C.UTF-8 python3 -c "print('étoile')" I'd suggest that we use the C.UTF-8 locale, but I see no strong reason to prefer it over the en_US.UTF-8 locale. Actually there is. It avoids all the other non-C locale-related things (LC_NUMERIC, LC_COLLATE, etc.) and just provides the UTF-8 functionality. Not only that, but the en_US and C locales are not fully interchangeable: the C locale is the standard locale, it implements the ISO C standard and is basically an en_US locale with a metric system and a 24-hour time format. There was even, for a while, a package in Debian to get just that (https://tracker.debian.org/pkg/open-infrastructure-locales-c.utf-8). It was not accepted in anything but sid and was removed about 1.5 years ago. That being said, as far as I am aware, this is a Debian (and derivatives) specific locale (not that it matters much, our images are based on Debian) and glibc is still ironing out the details (per https://sourceware.org/bugzilla/show_bug.cgi?id=17318). I would advise using C.UTF-8 unless there is a clear reason not to. 
And since this is generic enough, it should not be blubber's job to make sure it exists (but rather that it is used, potentially overridable, although I don't see a use case yet), so ping me if there is a problem with that and I'll make sure it exists in the base images. TASK DETAIL https://phabricator.wikimedia.org/T210260 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: akosiaris Cc: LarsWirzenius, dduvall, hashar, akosiaris, Aklapper, Ladsgroup, Nandana, NebulousIris, Lahi, Gq86, Vacio, GoranSMilovanovic, QZanden, Gstupp, LawExplorer, _jensen, D3r1ck01, Liudvikas, notconfusing, Luke081515, thcipriani, mobrovac, Wikidata-bugs, aude, Alchimista, Addshore, Mbch331, Jay8g, greg ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
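Why the locale matters for the `print('étoile')` test quoted above can be shown without touching the process locale. A minimal sketch, assuming that under a POSIX/C locale the Python 3 interpreter in the image ends up with an ASCII stream encoding (which is what the task title describes); `can_encode` is a hypothetical helper, not a library function.

```python
text = "étoile"

# UTF-8 represents "é" as the two bytes 0xC3 0xA9; ASCII has no code for it.
utf8_bytes = text.encode("utf-8")


def can_encode(s, encoding):
    """Return True when s can be written to a stream that uses this encoding."""
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


print("ascii:", can_encode(text, "ascii"))
print("utf-8:", can_encode(text, "utf-8"))
print(utf8_bytes)
```

A C.UTF-8 locale changes only the character encoding side of this, which is the argument made above for preferring it over en_US.UTF-8.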
[Wikidata-bugs] [Maniphest] [Commented On] T210260: Stretch in docker registry forces ascii encoding
akosiaris added a comment. I did just do a quick check on wikimedia-stretch image for this $ docker run --rm -it docker-registry.wikimedia.org/wikimedia-stretch:latest root@92dc0302edca:/# ls bin boot dev etc home lib lib64 media mnt opt proc root run sbin srv sys tmp usr var root@92dc0302edca:/# locale LANG= LANGUAGE= LC_CTYPE="POSIX" LC_NUMERIC="POSIX" LC_TIME="POSIX" LC_COLLATE="POSIX" LC_MONETARY="POSIX" LC_MESSAGES="POSIX" LC_PAPER="POSIX" LC_NAME="POSIX" LC_ADDRESS="POSIX" LC_TELEPHONE="POSIX" LC_MEASUREMENT="POSIX" LC_IDENTIFICATION="POSIX" LC_ALL= root@92dc0302edca:/# locale -a C C.UTF-8 POSIX root@92dc0302edca:/# export LC_ALL=C.UTF-8 root@92dc0302edca:/# echo "\303toile" étoile so support for C.UTF-8 already exists in our images. TASK DETAIL https://phabricator.wikimedia.org/T210260 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: akosiaris Cc: LarsWirzenius, dduvall, hashar, akosiaris, Aklapper, Ladsgroup, Nandana, NebulousIris, Lahi, Gq86, Vacio, GoranSMilovanovic, QZanden, Gstupp, LawExplorer, _jensen, D3r1ck01, Liudvikas, notconfusing, Luke081515, thcipriani, mobrovac, Wikidata-bugs, aude, Alchimista, Addshore, Mbch331, Jay8g, greg ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Closed] T210260: Stretch in docker registry forces ascii encoding
akosiaris closed this task as "Resolved". akosiaris claimed this task. akosiaris added a comment. In T210260#4820556, @hashar wrote: Following the merge of https://gerrit.wikimedia.org/r/478200 , can you possibly rebuild the two images please? :)
Repository                                         Tag     Image id      Created        Size
docker-registry.wikimedia.org/wikimedia-stretch    latest  ac576ceda671  13 months ago  56.1MB
docker-registry.wikimedia.org/wikimedia-jessie     latest  a81cc7ec7998  13 months ago  80.4MB
Done. Note, whatever images use a different LC_ALL without installing/generating locales first will probably fail so per T210260#4788262 the CI images will also have to be updated. TASK DETAIL https://phabricator.wikimedia.org/T210260 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: akosiaris Cc: gerritbot, LarsWirzenius, dduvall, hashar, akosiaris, Aklapper, Ladsgroup, CucyNoiD, Nandana, NebulousIris, Gaboe420, Versusxo, Majesticalreaper22, Giuliamocci, Adrian1985, Cpaulf30, Lahi, Gq86, Baloch007, Vacio, Darkminds3113, Bsandipan, Lordiis, GoranSMilovanovic, Adik2382, Th3d3v1ls, Ramalepe, Liugev6, QZanden, Gstupp, LawExplorer, Lewizho99, Maathavan, _jensen, D3r1ck01, Liudvikas, notconfusing, thcipriani, mobrovac, Wikidata-bugs, aude, Alchimista, Addshore, Mbch331, Jay8g, greg ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Reopened] T210260: Stretch in docker registry forces ascii encoding
akosiaris reopened this task as "Open". akosiaris added a comment. Oops, closed this by mistake. Re-opened, feel free to close when the issue is indeed resolved. TASK DETAIL https://phabricator.wikimedia.org/T210260 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: akosiaris Cc: gerritbot, LarsWirzenius, dduvall, hashar, akosiaris, Aklapper, Ladsgroup, CucyNoiD, Nandana, NebulousIris, Gaboe420, Versusxo, Majesticalreaper22, Giuliamocci, Adrian1985, Cpaulf30, Lahi, Gq86, Baloch007, Vacio, Darkminds3113, Bsandipan, Lordiis, GoranSMilovanovic, Adik2382, Th3d3v1ls, Ramalepe, Liugev6, QZanden, Gstupp, LawExplorer, Lewizho99, Maathavan, _jensen, D3r1ck01, Liudvikas, notconfusing, thcipriani, mobrovac, Wikidata-bugs, aude, Alchimista, Addshore, Mbch331, Jay8g, greg ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T212189: New Service Request: Wikidata Termbox SSR
akosiaris added a comment. In T212189#4833536, @daniel wrote: "We should not introduce a service that is called by MediaWiki, and itself calls MediaWiki." Slightly OT, but a +1000 YES to this. Been there, seen that antipattern, it's a mess to reason about. The coupling of the 2 components makes it near impossible to test/benchmark/debug the interactions. It is also a mess to untangle and fix when it is identified. TASK DETAIL https://phabricator.wikimedia.org/T212189 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: akosiaris Cc: akosiaris, Krinkle, Milimetric, daniel, mobrovac, Joe, Matthias_Geisler_WMDE, Jakob_WMDE, Pablo-WMDE, Aklapper, Lydia_Pintscher, Lea_WMDE, Addshore, WMDE-leszek, Legado_Shulgin, Nandana, thifranc, AndyTan, kostajh, Davinaclare77, Qtn1293, Lahi, Gq86, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, LawExplorer, Zppix, _jensen, D3r1ck01, SBisson, Wong128hk, Eevans, Hardikj, Wikidata-bugs, aude, GWicke, jayvdb, fbstj, faidon, santhosh, Jdforrester-WMF, Mbch331, Rxy, Jay8g, Ltrlg, bd808, fgiunchedi, Legoktm ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Changed Subscribers] T212189: New Service Request: Wikidata Termbox SSR
akosiaris added subscribers: Tarrow, thcipriani. akosiaris added a comment. Per some IRC discussions we had in #wikimedia-serviceops, the code should be updated to be service-runner compatible, as this will greatly increase homogeneity and allow for easy handling of things like logging and metrics, as well as potentially rate limiting and DNS cache management. As far as I have understood, @Tarrow is already working on that (many thanks!). Following that, we should enable the pipeline for the project so that it builds docker images for this service. The first part is easy: we will need just a `.pipeline/blubber.yaml` and enabling the pipeline. Adding @thcipriani for that. Docs are currently under https://wikitech.wikimedia.org/wiki/Blubber. I can help with the next step, which is the creation of a helm chart for the service. After that (and assuming all other prereqs are done), it's time for deployment. There are a number of questions to answer as well, regardless of all the technical questions above: - We will need contact details in case the service suffers an outage - We will need a person/team to be the service-owner (that can be the same as above) - The service owner will have to state what the required availability of this service will be (no, it can't be 100%). - In order to answer that question in a structured way, another question needs to be answered: "What will be the SLO(s) of this service?" (SLO stands for Service Level Objective). That in turn implies another question (I promise it's the last in this stack): "What are the SLIs for this service?" (SLI stands for Service Level Indicator, aka a metric). Assuming a service-runner integration, we will easily be able to have metrics (and graphs) for requests/sec, latency and errors. Any of these (or all of them, plus whatever else is deemed important to measure) can be chosen as SLIs, and a target (aka an SLO) can be set on those. 
For a better explanation of the terms SLI and SLO, for now please have a look at https://landing.google.com/sre/sre-book/chapters/service-level-objectives/, as we are still building the documentation for all of this. - An estimation of the traffic the service is expected to receive: Already given, it's ~1 req/s - A schedule for when we would like to have this deployed to production, as SRE will have to reserve some cycles for this. TASK DETAIL https://phabricator.wikimedia.org/T212189 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: akosiaris Cc: thcipriani, Tarrow, Smalyshev, jijiki, fsero, CDanis, akosiaris, Krinkle, Milimetric, daniel, mobrovac, Joe, Matthias_Geisler_WMDE, Jakob_WMDE, Pablo-WMDE, Aklapper, Lydia_Pintscher, Lea_WMDE, Addshore, WMDE-leszek, alaa_wmde, holger.knust, Legado_Shulgin, Nandana, thifranc, AndyTan, Davinaclare77, Qtn1293, Lahi, Gq86, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, LawExplorer, Zppix, _jensen, rosalieper, Wong128hk, Eevans, Hardikj, Wikidata-bugs, aude, faidon, Jdforrester-WMF, Mbch331, Jay8g, fgiunchedi ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
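The relationship between SLIs and SLOs described in the comment above can be made concrete with a small calculation. This is only a sketch: the request counts, latency samples, thresholds and targets below are invented for illustration, and the `Window` type and function names are hypothetical rather than taken from any monitoring tool.

```python
from dataclasses import dataclass, field


@dataclass
class Window:
    """Request statistics collected over one measurement window (made-up data)."""
    total_requests: int
    failed_requests: int
    latencies_ms: list = field(default_factory=list)


def availability_sli(w):
    """SLI: fraction of requests that did not fail."""
    return 1.0 - w.failed_requests / w.total_requests


def latency_sli(w, threshold_ms):
    """SLI: fraction of sampled requests answered within threshold_ms."""
    fast = sum(1 for latency in w.latencies_ms if latency <= threshold_ms)
    return fast / len(w.latencies_ms)


def meets_slo(sli, target):
    """An SLO is a target value for an SLI; it is met when the SLI reaches it."""
    return sli >= target


window = Window(total_requests=2000, failed_requests=1,
                latencies_ms=[120, 300, 480, 650])

availability = availability_sli(window)
within_threshold = latency_sli(window, threshold_ms=500)

print(availability, meets_slo(availability, target=0.999))
print(within_threshold, meets_slo(within_threshold, target=0.95))
```

The indicator is always a measured number; the objective is the agreed threshold it is compared against, which is why the SLIs have to be chosen before an SLO can be stated.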
[Wikidata-bugs] [Maniphest] [Commented On] T212189: New Service Request: Wikidata Termbox SSR
akosiaris added a comment. In T212189#4998182 <https://phabricator.wikimedia.org/T212189#4998182>, @Tarrow wrote: > I am indeed already working on it. > > Just so you know the current state: we are already using blubber for the CI i.e. we have 'service-pipeline-test' run in zuul/layout.yaml. I suppose this needs to soon include '-and-publish'? Yes, that's correct. > Our blubber will obviously need tweaking once we have the service-runner integration working though. TASK DETAIL https://phabricator.wikimedia.org/T212189 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: akosiaris Cc: thcipriani, Tarrow, Smalyshev, jijiki, fsero, CDanis, akosiaris, Krinkle, Milimetric, daniel, mobrovac, Joe, Matthias_Geisler_WMDE, Jakob_WMDE, Pablo-WMDE, Aklapper, Lydia_Pintscher, Lea_WMDE, Addshore, WMDE-leszek, alaa_wmde, holger.knust, Legado_Shulgin, Nandana, thifranc, AndyTan, Davinaclare77, Qtn1293, Lahi, Gq86, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, LawExplorer, Zppix, _jensen, rosalieper, Wong128hk, Eevans, Hardikj, Wikidata-bugs, aude, faidon, Jdforrester-WMF, Mbch331, Jay8g, fgiunchedi ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Edited] T187960: Rack/cable/configure asw2-a-eqiad switch stack
akosiaris updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T187960 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Cmjohnson, akosiaris Cc: Addshore, MMiller_WMF, Catrope, elukey, Marostegui, Stashbot, Paladox, gerritbot, Aklapper, BBlack, Cmjohnson, ayounsi, alaa_wmde, Legado_Shulgin, CucyNoiD, Nandana, NebulousIris, thifranc, AndyTan, kostajh, Gaboe420, Versusxo, Majesticalreaper22, Giuliamocci, Davinaclare77, Adrian1985, Qtn1293, Cpaulf30, Lahi, Gq86, Baloch007, Darkminds3113, Bsandipan, Lordiis, GoranSMilovanovic, Adik2382, Th3d3v1ls, Hfbn0, Ramalepe, Liugev6, QZanden, LawExplorer, Lewizho99, Zppix, Maathavan, _jensen, rosalieper, Soum213, Taiwania_Justo, Thibaut120094, Wong128hk, Wikidata-bugs, aude, Southparkfan, mark, Lydia_Pintscher, Darkdadaah, faidon, Nikerabbit, Arrbee, santhosh, KartikMistry, Jdforrester-WMF, Mbch331, Jay8g, Ltrlg, akosiaris, fgiunchedi ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Edited] T187960: Rack/cable/configure asw2-a-eqiad switch stack
akosiaris updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T187960 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Cmjohnson, akosiaris Cc: Addshore, MMiller_WMF, Catrope, elukey, Marostegui, Stashbot, Paladox, gerritbot, Aklapper, BBlack, Cmjohnson, ayounsi, alaa_wmde, Legado_Shulgin, CucyNoiD, Nandana, NebulousIris, thifranc, AndyTan, kostajh, Gaboe420, Versusxo, Majesticalreaper22, Giuliamocci, Davinaclare77, Adrian1985, Qtn1293, Cpaulf30, Lahi, Gq86, Baloch007, Darkminds3113, Bsandipan, Lordiis, GoranSMilovanovic, Adik2382, Th3d3v1ls, Hfbn0, Ramalepe, Liugev6, QZanden, LawExplorer, Lewizho99, Zppix, Maathavan, _jensen, rosalieper, Soum213, Taiwania_Justo, Thibaut120094, Wong128hk, Wikidata-bugs, aude, Southparkfan, mark, Lydia_Pintscher, Darkdadaah, faidon, Nikerabbit, Arrbee, santhosh, KartikMistry, Jdforrester-WMF, Mbch331, Jay8g, Ltrlg, akosiaris, fgiunchedi ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Changed Subscribers] T187960: Rack/cable/configure asw2-a-eqiad switch stack
akosiaris added subscribers: jijiki, ArielGlenn, Krinkle, hashar, fgiunchedi, Joe, akosiaris. akosiaris added a comment. I was looking at Special needs or unsorted. @ayounsi I 've updated a few, feel free to move them to other sections. Pinging: - ge-2/0/13 - tungsten - xhgui:app @Krinkle @Joe - ge-3/0/5 - prometheus1003 - @fgiunchedi - ge-4/0/3 - logstash1004 - @fgiunchedi - ge-4/0/5 - snapshot1005 - @ArielGlenn - ge-4/0/26 - contint1001 - Releng - @hashar - ge-4/0/31 - stat1004 - Analytics, @elukey - ge-3/0/19 - rdb1005 - @jijiki , @Joe - ge-4/0/43 - rdb1003 - @jijiki , @Joe for the rest TASK DETAIL https://phabricator.wikimedia.org/T187960 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Cmjohnson, akosiaris Cc: akosiaris, Joe, fgiunchedi, hashar, Krinkle, ArielGlenn, jijiki, Addshore, MMiller_WMF, Catrope, elukey, Marostegui, Stashbot, Paladox, gerritbot, Aklapper, BBlack, Cmjohnson, ayounsi, alaa_wmde, Legado_Shulgin, CucyNoiD, Nandana, NebulousIris, thifranc, AndyTan, kostajh, Gaboe420, Versusxo, Majesticalreaper22, Giuliamocci, Davinaclare77, Adrian1985, Qtn1293, Cpaulf30, Lahi, Gq86, Baloch007, Darkminds3113, Bsandipan, Lordiis, GoranSMilovanovic, Adik2382, Th3d3v1ls, Hfbn0, Ramalepe, Liugev6, QZanden, LawExplorer, Lewizho99, Zppix, Maathavan, _jensen, rosalieper, Soum213, Taiwania_Justo, Thibaut120094, Wong128hk, Wikidata-bugs, aude, Southparkfan, mark, Lydia_Pintscher, Darkdadaah, faidon, Nikerabbit, Arrbee, santhosh, KartikMistry, Jdforrester-WMF, Mbch331, Jay8g, Ltrlg ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T212189: New Service Request: Wikidata Termbox SSR
akosiaris added a comment. Thanks for this, it's appreciated. Note that we still haven't decided over which time window the availability will be calculated, but it's probably going to be quarterly (3 months, that is). I have to say I am wondering a bit about the latency as the low end seems to be quite high (500ms?). It's your service of course and I am fine with it. As far as "wikipedia" availability goes, that's a very difficult number to come up with (it's being asked of in the past) as there are many components that constitute "wikipedia". Anyway 99.9% looks fine to me at this point, and this is not anyway set in stone, we can reevaluate in a few quarters if it ends up being unfeasible. > - ASAP as feasible, preferably 2019-03-31 at the latest. Unfortunately that is not feasible. This is the last month of the quarter and there's goals already running, plus we are short handed. But I guess we can target early next quarter. TASK DETAIL https://phabricator.wikimedia.org/T212189 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: akosiaris Cc: sbassett, thcipriani, Tarrow, Smalyshev, jijiki, fsero, CDanis, akosiaris, Krinkle, Milimetric, daniel, mobrovac, Joe, Matthias_Geisler_WMDE, Jakob_WMDE, Pablo-WMDE, Aklapper, Lydia_Pintscher, Lea_WMDE, Addshore, WMDE-leszek, alaa_wmde, holger.knust, Legado_Shulgin, Nandana, thifranc, AndyTan, Davinaclare77, Qtn1293, Lahi, Gq86, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, LawExplorer, Zppix, _jensen, rosalieper, Wong128hk, Eevans, Hardikj, Wikidata-bugs, aude, faidon, Jdforrester-WMF, Mbch331, Jay8g, fgiunchedi ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
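To get a feel for what a 99.9% availability target means over a quarterly window, the allowed downtime can be worked out directly. A minimal sketch, assuming a 90-day quarter and a time-based availability definition (neither is fixed in the comment above, which notes the window is still undecided); the function name is hypothetical.

```python
def allowed_downtime_minutes(slo, window_days):
    """Downtime budget, in minutes, for a time-based availability target."""
    window_minutes = window_days * 24 * 60
    return (1.0 - slo) * window_minutes


# Assumption: a quarter is taken as 90 days.
budget = allowed_downtime_minutes(0.999, window_days=90)
print(f"{budget:.1f} minutes per quarter")
```

Under these assumptions the budget is a little over two hours per quarter; a request-based definition (failed requests over total requests) would give a budget in requests rather than minutes.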
[Wikidata-bugs] [Maniphest] [Commented On] T212189: New Service Request: Wikidata Termbox SSR
akosiaris added a comment. In T212189#5017053 <https://phabricator.wikimedia.org/T212189#5017053>, @Tarrow wrote: > In T212189#5011311 <https://phabricator.wikimedia.org/T212189#5011311>, @akosiaris wrote: > > > I have to say I am wondering a bit about the latency as the low end seems to be quite high (500ms?). It's your service of course and I am fine with it. > > > There is a chance we might want to revise this down in the future but right now it seems that being this high would not be unreasonable for us. It's difficult to gauge what is realistic right now. Sure. As I said, fine by me. > > >>> - ASAP as feasible, preferably 2019-03-31 at the latest. >> >> Unfortunately that is not feasible. This is the last month of the quarter and there's goals already running, plus we are short handed. But I guess we can target early next quarter. > > What more work/steps are needed? Is it: > > - Helm Chart > - Security Review There is also the LVS and DNS work, but that's in the SRE realm. It does take time, but I don't think there is anything you can do to expedite that. > Is there a "similar to production" test environment we could use check that everything is correctly setup on our end? There is a staging environment that might fit part of the "similar to production" definition. Things that are deployed in kubernetes are required to go through that environment anyway, so it's part of the process. > If the main work load on your end would be post deploy troubleshooting/monitoring etc.. could we consider a sooner deploy of the service without any expectations of service level (and not send any wikidata.org traffic there)? Just to increase our confidence that we haven't overlooked something. Unfortunately the main work load is pre-deploy. There is of course work post deploy, but I did not take that into consideration when I answered. 
In T212189#5017076 <https://phabricator.wikimedia.org/T212189#5017076>, @WMDE-leszek wrote: > In T212189#5011311 <https://phabricator.wikimedia.org/T212189#5011311>, @akosiaris wrote: > > > I have to say I am wondering a bit about the latency as the low end seems to be quite high (500ms?). It's your service of course and I am fine with it. > > > This indeed a relatively high number. We have come up with this estimate taking into account that our service is depending on others (MW API for time being) to fullfil its job, so we have taken this uncertainty into account, hence higher figure. We of course don't mind if the service works faster. > And FWIW, we understand "request latency" here as the time from client making a request and getting the response from the service, not a time between client making a request and the service becoming aware of it (just to be clear). Thanks for the clarification. Just noting that, assuming correct service-runner integration, we will anyway have metrics for the former, the latter would have been more difficult. > > >> As far as "wikipedia" availability goes, that's a very difficult number to come up with (it's being asked of in the past) as there are many components that constitute "wikipedia". Anyway 99.9% looks fine to me at this point, and this is not anyway set in stone, we can reevaluate in a few quarters if it ends up being unfeasible. > > Understood, thanks! > >> >> >>> - ASAP as feasible, preferably 2019-03-31 at the latest. >> >> Unfortunately that is not feasible. This is the last month of the quarter and there's goals already running, plus we are short handed. But I guess we can target early next quarter. > > This is understandable. We have had to try, though :) We're looking forward for this hopefully happening next quarter. Thanks for the understanding. We are drafting next quarter goals this week, I 'll make sure to add this. 
Final question and just for verification, this ain't going to be exposed directly to the internet after all, right? Rather this will be called by mediawiki, per the architectural diagrams attached to this task. TASK DETAIL https://phabricator.wikimedia.org/T212189 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: akosiaris Cc: sbassett, thcipriani, Tarrow, Smalyshev, jijiki, fsero, CDanis, akosiaris, Krinkle, Milimetric, daniel, mobrovac, Joe, Matthias_Geisler_WMDE, Jakob_WMDE, Pablo-WMDE, Aklapper, Lydia_Pintscher, Lea_WMDE, Addshore, WMDE-leszek, alaa_wmde, holger.knust, Legado_Shulgin, Nandana, thifranc, AndyTan, Davinaclare77, Qtn1293, Lahi, Gq86, GoranSMilovanovic, Th3d3v1ls, Hf
[Wikidata-bugs] [Maniphest] [Commented On] T212189: New Service Request: Wikidata Termbox SSR
akosiaris added a comment. In T212189#5044451 <https://phabricator.wikimedia.org/T212189#5044451>, @Addshore wrote: > In T212189#5020187 <https://phabricator.wikimedia.org/T212189#5020187>, @akosiaris wrote: > > > Thanks for the understanding. We are drafting next quarter goals this week, I 'll make sure to add this. > > > Just poking to double check that this was added (I would hate to see it missed). Yup. It's in already https://www.mediawiki.org/w/index.php?title=Wikimedia_Technology/Annual_Plans/FY2019/TEC3:_Deployment_Pipeline/Goals#Q4_Goals Draft, so not final wording, but it wasn't forgotten. Thanks for doublechecking! TASK DETAIL https://phabricator.wikimedia.org/T212189 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: akosiaris Cc: sbassett, thcipriani, Tarrow, Smalyshev, jijiki, fsero, CDanis, akosiaris, Krinkle, Milimetric, daniel, mobrovac, Joe, Matthias_Geisler_WMDE, Jakob_WMDE, Pablo-WMDE, Aklapper, Lydia_Pintscher, Lea_WMDE, Addshore, WMDE-leszek, alaa_wmde, holger.knust, Legado_Shulgin, Nandana, thifranc, AndyTan, Davinaclare77, Qtn1293, Lahi, Gq86, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, LawExplorer, Zppix, _jensen, rosalieper, Wong128hk, Eevans, Hardikj, Wikidata-bugs, aude, faidon, Jdforrester-WMF, Mbch331, Jay8g, fgiunchedi, Dzahn ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T229236: Investigate if the code of Graphoid uses a proper user agent header
akosiaris added a comment. In T229236#5376652 <https://phabricator.wikimedia.org/T229236#5376652>, @Ladsgroup wrote: > @Lydia_Pintscher @alaa_wmde So the current user agent of graphoid service is `graphoid (yurik at wikimedia)`. Yurik has left Wikimedia for a couple years I think. I made a patch <https://gerrit.wikimedia.org/r/526442> to fix it but it fails because blubber version 3 is not supported anymore (does it mean we can't merge anything in graphoid right now? @akosiaris knows better) It's unmaintained as you pointed out (see T211881 <https://phabricator.wikimedia.org/T211881> for the gory details). Even if we fixed the blubber file (I am guessing that's where the issue is), deploying anything merged would probably not happen. Plus, there is no one to even review the patches and get them merged right now. TASK DETAIL https://phabricator.wikimedia.org/T229236 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ladsgroup, akosiaris Cc: akosiaris, alaa_wmde, Ladsgroup, Aklapper, Yurik, Smalyshev, Lydia_Pintscher, Lea_Lacroix_WMDE, Hook696, Daryl-TTMG, RomaAmorRoma, 0010318400, E.S.A-Sheild, darthmon_wmde, joker88john, DannyS712, CucyNoiD, Nandana, NebulousIris, Gaboe420, Versusxo, Majesticalreaper22, Giuliamocci, Adrian1985, Cpaulf30, Lahi, Gq86, Af420, Darkminds3113, Bsandipan, Lordiis, Capankajsmilyo, GoranSMilovanovic, Adik2382, Th3d3v1ls, Ramalepe, Liugev6, QZanden, LawExplorer, SongTake, WSH1906, Lewizho99, Maathavan, _jensen, rosalieper, Jonas, Wikidata-bugs, aude, Ricordisamoa, Mbch331 ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T199219: WDQS should use internal endpoint to communicate to Wikidata
akosiaris added a comment. In T199219#5396104 <https://phabricator.wikimedia.org/T199219#5396104>, @Ladsgroup wrote:
> The other thing I want to mention and was missing here is overhead of encryption and TLS handshakes. In the @BBlack's example, we still use TLS but if you use plain http request, it's considerably faster (in both overhead of encryption and decryption):
>
> ladsgroup@mwmaint1002:~$ time curl -H 'Host: www.wikidata.org' 'http://appservers-ro.discovery.wmnet/wiki/Special:EntityData/Q7251.ttl?revision=992109551&flavor=dump' > /dev/null
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
> 100  123k    0  123k    0     0   508k      0 --:--:-- --:--:-- --:--:--  510k
> real 0m0.256s
> user 0m0.008s
> sys 0m0.004s
>
> Unless there's any reason to encrypt requests internally, I think this would help us greatly.

In the same DC the numbers are comparable. e.g.

akosiaris@deploy1001:$ ab -n 100 -c 5 -H "Host: www.wikidata.org" 'http://appservers-ro.discovery.wmnet/wiki/Special:EntityData/Q7251.ttl?revision=992109551&flavor=dump'
This is ApacheBench, Version 2.3 <$Revision: 1757674 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking appservers-ro.discovery.wmnet (be patient).done
Server Software:        mw1250.eqiad.wmnet
Server Hostname:        appservers-ro.discovery.wmnet
Server Port:            80
Document Path:          /wiki/Special:EntityData/Q7251.ttl?revision=992109551&flavor=dump
Document Length:        125989 bytes
Concurrency Level:      5
Time taken for tests:   5.984 seconds
Complete requests:      100
Failed requests:        0
Total transferred:      12655928 bytes
HTML transferred:       12598900 bytes
Requests per second:    16.71 [#/sec] (mean)
Time per request:       299.186 [ms] (mean)
Time per request:       59.837 [ms] (mean, across all concurrent requests)
Transfer rate:          2065.49 [Kbytes/sec] received
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       1
Processing:   209  281 156.7    244    1324
Waiting:      208  280 156.7    243    1323
Total:        209  281 156.7    244    1324
Percentage of the requests served within a certain time (ms)
  50%    244
  66%    253
  75%    264
  80%    271
  90%    317
  95%    375
  98%   1044
  99%   1324
 100%   1324 (longest request)

vs

akosiaris@deploy1001:$ ab -n 100 -c 5 -H "Host: www.wikidata.org" 'https://appservers-ro.discovery.wmnet/wiki/Special:EntityData/Q7251.ttl?revision=992109551&flavor=dump'
This is ApacheBench, Version 2.3 <$Revision: 1757674 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking appservers-ro.discovery.wmnet (be patient).done
Server Software:        mw1269.eqiad.wmnet
Server Hostname:        appservers-ro.discovery.wmnet
Server Port:            443
SSL/TLS Protocol:       TLSv1.2,ECDHE-ECDSA-AES256-GCM-SHA384,256,256
TLS Server Name:        www.wikidata.org
Document Path:          /wiki/Special:EntityData/Q7251.ttl?revision=992109551&flavor=dump
Document Length:        125989 bytes
Concurrency Level:      5
Time taken for tests:   5.385 seconds
Complete requests:      100
Failed requests:        0
Total transferred:      12653400 bytes
HTML transferred:       12598900 bytes
Requests per second:    18.57 [#/sec] (mean)
Time per request:       269.239 [ms] (mean)
Time per request:       53.848 [ms] (mean, across all concurrent requests)
Transfer rate:          2294.77 [Kbytes/sec] received
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        2    3   1.5      3      14
Processing:   220  262  39.2    255     485
Waiting:      218  260  39.2    254     483
Total:        223  265  39.2    258     488
Percentage of the requests served within a certain time (ms)
  50%    258
  66%    264
  75%    269
  80%    275
  90%    294
  95%    331
  98%    475
  99%    488
 100%    488 (longest request)

In fact, if anything, HTTPS requests were typically faster in these benchmarks, proving your point about mediawiki + caching. It seems like the slow thing in this is mediawiki, not TLS termination. Across DCs
[Wikidata-bugs] [Maniphest] [Closed] T212189: New Service Request: Wikidata Termbox SSR
akosiaris closed this task as "Resolved". akosiaris claimed this task. akosiaris added a comment. The service has for long been deployed and even has nice dashboards in grafana, resolving. TASK DETAIL https://phabricator.wikimedia.org/T212189 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: akosiaris Cc: Niedzielski, Mholloway, RazShuty, sbassett, thcipriani, Tarrow, Smalyshev, jijiki, fsero, CDanis, akosiaris, Krinkle, Milimetric, daniel, mobrovac, Joe, Matthias_Geisler_WMDE, Jakob_WMDE, Pablo-WMDE, Aklapper, Lydia_Pintscher, Lea_WMDE, Addshore, WMDE-leszek, darthmon_wmde, holger.knust, Legado_Shulgin, DannyS712, Nandana, thifranc, Davinaclare77, Qtn1293, Techguru.pc, Lahi, Gq86, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, LawExplorer, Zppix, _jensen, rosalieper, Pchelolo, Wong128hk, Eevans, Hardikj, Wikidata-bugs, aude, faidon, Jdforrester-WMF, Mbch331, Jay8g, fgiunchedi, Dzahn ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Retitled] T117398: number of database updates multiplied x3 since 29 October
akosiaris changed the title from "number of database updates multiplied x3 since 29 November" to "number of database updates multiplied x3 since 29 October". TASK DETAIL https://phabricator.wikimedia.org/T117398 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: akosiaris Cc: Performance-Team, Aklapper, jcrespo, Wikidata-bugs, aude, GWicke, Krenair ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Unblock] T119777: Track number of referenced and unreferenced statements in wikidata
akosiaris closed blocking task T120010: Add firewall exception to get to wdqs*:8888 from analytics cluster as "Resolved". TASK DETAIL https://phabricator.wikimedia.org/T119777 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Addshore, akosiaris Cc: gerritbot, Aklapper, StudiesWorld, Addshore, Wikidata-bugs, aude, Mbch331 ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Closed] T120010: Add firewall exception to get to wdqs*:8888 from analytics cluster
akosiaris closed this task as "Resolved". akosiaris claimed this task. akosiaris added a comment. ACLs updated. Just tested it from `stat1003` and it works fine akosiaris@stat1003:~$ telnet wdqs1002.eqiad.wmnet Trying 10.64.32.183... Connected to wdqs1002.eqiad.wmnet. Escape character is '^]'. Resolving TASK DETAIL https://phabricator.wikimedia.org/T120010 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: akosiaris Cc: BBlack, chasemp, Ottomata, akosiaris, Smalyshev, Addshore, Aklapper, Cmjohnson, jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, mark, faidon, Mbch331, fgiunchedi ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Edited] T249598: Wikibase schema updaters must not modify database directly
akosiaris updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T249598 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: akosiaris Cc: Aklapper, Krinkle, kchapman, Pablo-WMDE, Ladsgroup, alaa_wmde, Anomie, Addshore, WMDE-leszek, kostajh, daniel, Oblanco79, Alter-paule, Beast1978, Un1tY, Hook696, Daryl-TTMG, RomaAmorRoma, E.S.A-Sheild, darthmon_wmde, Kent7301, Meekrab2012, joker88john, CucyNoiD, Nandana, NebulousIris, Gaboe420, Versusxo, Majesticalreaper22, Giuliamocci, Adrian1985, Cpaulf30, Lahi, Gq86, Af420, Darkminds3113, Bsandipan, Lordiis, GoranSMilovanovic, Adik2382, Th3d3v1ls, Ramalepe, Liugev6, QZanden, LawExplorer, WSH1906, Lewizho99, Maathavan, _jensen, rosalieper, Scott_WUaS, Jonas, Wikidata-bugs, aude, Lydia_Pintscher, Mbch331 ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T255410: Termbox SSR connection terminated very often
akosiaris added a comment. In T255410#6224239 <https://phabricator.wikimedia.org/T255410#6224239>, @Michael wrote: > > Termbox/k8s: https://logstash.wikimedia.org/app/kibana#/doc/logstash-*/logstash-syslog-2020.06.15/syslog?id=AXK3_osLZmYAikdJbyT-&_g=h@e3739c2 > Mediawiki: https://logstash.wikimedia.org/app/kibana#/doc/logstash-*/logstash-mediawiki-2020.06.15/mediawiki?id=AXK3_omyK6TSR36GrxZd&_g=h@e1c60c6 Logstash links are unfortunately not copy-pastable. Anyone clicking on those links will get a `Unable to completely restore the URL, be sure to use the share functionality.` TASK DETAIL https://phabricator.wikimedia.org/T255410 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: akosiaris Cc: akosiaris, JMeybohm, WMDE-leszek, Pablo-WMDE, Tarrow, Jakob_WMDE, Addshore, Aklapper, Michael, darthmon_wmde, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Lydia_Pintscher, Mbch331 ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Created] T204083: wikibase_shared/-wikidatawiki-hhvm:CacheAwarePropertyInfoStore memcached key not well distributed, causing excessive traffic
akosiaris created this task. akosiaris added projects: Wikidata, wikiba.se, Operations. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION SRE team noticed that a specific host (mc1023) is close to saturating the uplink network connection [1]. More investigation into the grafana graphs for the entire cluster [2] showed that this is a recurring pattern that seems to follow hosts around. Doing a memkeys on mc1023 we found out that the key wikibase_shared/1_32_0-wmf_20-wikidatawiki-hhvm:CacheAwarePropertyInfoStore is doing >600Mbps of traffic. The fact the train version is coded in the key name supports the theory of the key name following the train and being hashed to a different server, explaining the fact the traffic seems to follow hosts around. This will cause an outage soon, needs to be fixed [1] https://grafana.wikimedia.org/dashboard/db/prometheus-machine-stats?panelId=8&fullscreen&orgId=1&var-server=mc1023&var-datasource=eqiad%20prometheus%2Fops&from=now-7d&to=now-1m [2] https://grafana.wikimedia.org/dashboard/db/prometheus-cluster-breakdown?orgId=1&from=now-30m&to=now&var-datasource=eqiad%20prometheus%2Fops&var-cluster=memcached&var-instance=All TASK DETAIL https://phabricator.wikimedia.org/T204083 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: akosiaris Cc: Aklapper, mark, Krinkle, Joe, akosiaris, AndyTan, Davinaclare77, Qtn1293, Lahi, Gq86, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, LawExplorer, Zppix, Wong128hk, Wikidata-bugs, aude, faidon, Mbch331, Jay8g, fgiunchedi ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
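The theory in the task description above (a version string inside the key name makes the key land on a different host after every train) can be illustrated with a small sketch. It is only a toy model: the md5-modulo placement, the pool size of 18 and the `mc-NN` host names are assumptions made for illustration, not the hashing scheme or host list actually used in production.

```python
import hashlib

def pick_server(key: str, servers: list[str]) -> str:
    """Map a key to one server by hashing the full key name (toy model)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]

# Hypothetical pool; real host names and pool size are not taken from the task.
servers = [f"mc-{i:02d}" for i in range(18)]

# Same logical cache entry, but the train version is part of the key name.
template = "wikibase_shared/1_32_0-wmf_{n}-wikidatawiki-hhvm:CacheAwarePropertyInfoStore"

# Each new train version yields a new key, which is hashed independently,
# so the single hot key (and its traffic) can move to another host.
placement = {n: pick_server(template.format(n=n), servers) for n in range(18, 24)}
for version, host in placement.items():
    print(f"wmf_{version} -> {host}")
```

Because all traffic for one key goes to whichever host that key hashes to, one very hot key saturates one host at a time, whichever scheme is used.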
[Wikidata-bugs] [Maniphest] [Triaged] T204083: wikibase_shared/-wikidatawiki-hhvm:CacheAwarePropertyInfoStore memcached key not well distributed, causing excessive traffic
akosiaris triaged this task as "High" priority. akosiaris added a project: Performance-Team. TASK DETAIL https://phabricator.wikimedia.org/T204083 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: akosiaris Cc: Aklapper, mark, Krinkle, Joe, akosiaris, AndyTan, Davinaclare77, Qtn1293, Imarlier, Lahi, Gq86, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, LawExplorer, Vali.matei, Zppix, Wong128hk, Wikidata-bugs, aude, faidon, Mbch331, Jay8g, fgiunchedi ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T204083: wikibase_shared/-wikidatawiki-hhvm:CacheAwarePropertyInfoStore memcached key not well distributed, causing excessive traffic
akosiaris added a comment. https://grafana.wikimedia.org/dashboard/db/t204083?orgId=1 shows the excessive traffic moving around the various memcached hosts for the last 1 year. TASK DETAIL https://phabricator.wikimedia.org/T204083 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: akosiaris Cc: Aklapper, mark, Krinkle, Joe, akosiaris, AndyTan, Davinaclare77, Qtn1293, Imarlier, Lahi, Gq86, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, LawExplorer, Vali.matei, Zppix, Wong128hk, Wikidata-bugs, aude, faidon, Mbch331, Jay8g, fgiunchedi ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] T255410: Termbox SSR connection terminated very often
akosiaris added a comment. In T255410#6491711 <https://phabricator.wikimedia.org/T255410#6491711>, @Pablo-WMDE wrote: > We are seeing our termbox service, when reaching out to the mediawiki API, occasionally receiving a 503 (e.g. <https://logstash.wikimedia.org/app/kibana#/doc/logstash-*/logstash-syslog-2020.09.24/syslog?id=AXTAe5h5LNRtRo5Xnvl8&_g=h@44136fa>). We are also seeing a very similar error from a TLS-proxy (e.g. <https://logstash.wikimedia.org/app/kibana#/doc/logstash-*/logstash-syslog-2020.09.24/syslog?id=AXTAe5h5LNRtRo5Xnvl8&_g=h@44136fa>) Logstash urls aren't copy pasteable I am afraid. I get "Unable to completely restore the URL, be sure to use the share functionality." Could you please update with a logstash shared url so we can be sure we are looking into the same thing before we delve deeper into this? TASK DETAIL https://phabricator.wikimedia.org/T255410 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: toan, akosiaris Cc: Lucas_Werkmeister_WMDE, Sakretsu, akosiaris, JMeybohm, WMDE-leszek, Pablo-WMDE, Tarrow, Jakob_WMDE, Addshore, Aklapper, Michael, Akuckartz, Iflorez, darthmon_wmde, alaa_wmde, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Wikidata-bugs, aude, Lydia_Pintscher, Mbch331 ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] T255410: Termbox SSR connection terminated very often
akosiaris added a project: serviceops-radar. akosiaris added a comment. Sorry for not answering earlier. In T255410#6494077 <https://phabricator.wikimedia.org/T255410#6494077>, @Pablo-WMDE wrote: > > I unfortunately don't know how to do this for single documents. The links show the warning for me as well but reproduce fine. Ok, then that should suffice. Thanks for confirming that. I can confirm there is a minor increase in errors during the time period of that document. That being said, I've been looking a bit more into this. Couple of notes: - Grafana for the last 7 days (https://grafana.wikimedia.org/d/JcMStTFGz/termbox-slo-panel?orgId=1&from=160133760&to=1602028799000) reports 517 HTTP 500s. - Logstash (https://logstash.wikimedia.org/goto/2d69a0f714d7fb66c1959da2f0e8b69a) says 533 errors. Note also that logstash includes eqiad, which isn't pooled in those 7d but is still receiving health check requests and has to reach over to codfw with added latencies. So, I am going to assume they are roughly equal, because that 16-entry discrepancy doesn't matter for the rest of this comment. - We are starting to work on some preliminary/draft SLOs for mediawiki. There is some work to be done on getting numbers, but when we come up with those, it would be prudent to adjust the SLO of termbox to those, as termbox is dependent on mediawiki and it doesn't make sense for it to provide stricter SLOs than mediawiki. So, we got an error rate of 0.01889% (or 0.0001889) with the SLO of the service being 0.1% (or 0.001) per T212189#5007579 <https://phabricator.wikimedia.org/T212189#5007579>. The flip side of that is an availability of 99.98111% which is something to be rather proud of (see https://en.wikipedia.org/wiki/High_availability#Percentage_calculation). If we increase the timespan a bit to 30d (https://grafana.wikimedia.org/d/JcMStTFGz/termbox-slo-panel?orgId=1&from=159935040&to=1602028799000) we get 0.07301%, roughly 4 times higher, but still below the SLO. 
Note that if we bump this before August 26th, the picture changes heavily, e.g. https://grafana.wikimedia.org/d/JcMStTFGz/termbox-slo-panel?orgId=1&from=159744960&to=1602028799000. However, as https://grafana.wikimedia.org/d/wJRbI7FGk/termbox?viewPanel=15&orgId=1&from=159822720&to=1598659199000 shows, a deployment (https://sal.toolforge.org/production?p=0&q=&d=2020-08-26 points out e03ee593f57adc7556f7a4 <https://phabricator.wikimedia.org/rDEPLOYCHARTSe03ee593f57adc7556f7a4af063caabea33c395c> - enabling the service proxy in fact) fixed that already, so corrective action has been taken since this task was created. Let's stick to the 7d timespan for now. Notes again: - Total of 533 errors in logstash - 273 are `timeout of 3000ms exceeded` - tracked in T255450 <https://phabricator.wikimedia.org/T255450>. This seems to me the most interesting one to look into. - 170 are `Request failed with status code 500` - All of those are confined to the timespan 2020-09-30T20:28:57 to 2020-09-30T22:01:05 and it's mediawiki that is returning those errors. https://logstash.wikimedia.org/goto/02a4bbcab3b7864b4b9a91fd7a26fb4a. - 77 are `Request failed with status code 503`. Those are from the sidecar envoy instance that termbox uses to connect to mediawiki. The reasons for adopting envoy are explained in https://wikitech.wikimedia.org/wiki/Envoy#Envoy_at_WMF (note that the same component also offers TLS termination so that termbox doesn't need to know or care about our internal TLS configuration). I guess this is also tracked in T263764 <https://phabricator.wikimedia.org/T263764>, so I'll add a bit more information to that. - The remaining 13 events don't seem worth looking into further. Overall, I am inclined to say that while the SLO isn't being violated over the course of the quarter, this should be a low priority. 
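The arithmetic in the comment above can be rechecked with a short sketch. All figures are copied from the comment; the function and variable names are hypothetical, and the "share of error budget" line is an added illustration rather than something the comment computes.

```python
SLO_ERROR_RATE_PCT = 0.1  # termbox SLO from T212189#5007579: at most 0.1% errors

# Observed error rates (percent) quoted in the comment for the two timespans.
observed_pct = {"7d": 0.01889, "30d": 0.07301}

def availability_pct(error_rate_pct: float) -> float:
    """Availability is the flip side of the error rate."""
    return 100.0 - error_rate_pct

def within_slo(error_rate_pct: float) -> bool:
    return error_rate_pct <= SLO_ERROR_RATE_PCT

# Breakdown of the 533 logstash errors over the 7d timespan.
breakdown = {
    "timeout of 3000ms exceeded": 273,
    "Request failed with status code 500": 170,
    "Request failed with status code 503": 77,
    "other": 13,
}

for span, rate in observed_pct.items():
    budget_used = rate / SLO_ERROR_RATE_PCT * 100
    print(f"{span}: availability {availability_pct(rate):.5f}%, "
          f"within SLO: {within_slo(rate)}, error budget used: {budget_used:.1f}%")

total = sum(breakdown.values())
for reason, count in breakdown.items():
    print(f"{count:4d} ({count / total:.1%}) {reason}")
```

Expressing the error rate as a share of the error budget makes it easier to judge priority: both timespans stay under 100% of the budget.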
In T255410#6494416 <https://phabricator.wikimedia.org/T255410#6494416>, @toan wrote: > @akosiaris I did some tinkering in the kibana ui and came up with this (hopefully) shareable link <https://logstash.wikimedia.org/app/kibana#/discover?_g=(refreshInterval:(display:Off,pause:!f,value:0),time:(from:now-12h,mode:quick,to:now))&_a=(columns:!(_source),filters:!(('$state':(store:appState),meta:(alias:!n,disabled:!f,index:'logstash-*',key:host,negate:!t,type:phrase,value:gerrit1001),query:(match:(host:(query:gerrit1001,type:phrase,('$state':(store:appState),meta:(alias:!n,disabled:!f,index:'logstash-*',key:host,negate:!t,type:phrase,value:grafana1002),query:(match:(host:(query:grafana1002,type:phrase,('$state':(store:appState),meta:(alias:!n,disabled:!f,index:'logstash-*',key:host,negate:!t,type:phrase,value:gerrit2001),que
[Wikidata-bugs] [Maniphest] T263764: Termbox service: unusual errors that could be from envoy
akosiaris edited projects, added serviceops-radar; removed serviceops. akosiaris added a comment. Envoy is being documented at https://wikitech.wikimedia.org/wiki/Envoy#Envoy_at_WMF. It is being used by termbox to talk to mediawiki (it's a component of a service mesh). The idea is to have low cost persistent TLS connections, with retries and telemetry. For more insights, aside from the doc link above, the following grafana dashboard is useful: https://grafana.wikimedia.org/d/b1jttnFMz/envoy-telemetry-k8s?orgId=1&var-datasource=thanos&var-site=codfw&var-prometheus=k8s&var-app=termbox&var-destination=mwapi-async&from=now-7d&to=now It is expected and absolutely normal that occasionally connections will be terminated and reestablished by envoy, as the network is not infallible. Some will be "masked" by envoy's retry logic, at the cost of extra latency of course. Using the dashboard above can help track down some of the errors. Logs from envoy for termbox are also in logstash, just remove the severity filter and they'll appear. Parsing them can be done using https://www.envoyproxy.io/docs/envoy/latest/configuration/observability/access_log/usage A couple of notes though. - Those log entries aren't parsed into a json object unfortunately - envoy uses HTTP2 terminology for some stuff internally, even if HTTP1.1 is used. E.g. you will see `%REQ(:AUTHORITY)%`. That is the authority HTTP2 header (https://tools.ietf.org/html/rfc7540#section-8.1.2). That's equivalent to the Host HTTP/1.1 header - The response flags are usually telling. e.g. `UF: Upstream connection failure in addition to 503 response code.` or `URX: The request was rejected because the upstream retry limit (HTTP) or maximum connect attempts (TCP) was reached.` and so on Hopefully the above helps shed a bit of light. Finally, as far as the `Should we be taking any action about these?` question goes, my answer would be to use the service's SLO as a guide. 
As pointed out in T255410 <https://phabricator.wikimedia.org/T255410>, it doesn't seem worth investigating those further right now. TASK DETAIL https://phabricator.wikimedia.org/T263764 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: akosiaris Cc: toan, Michael, JMeybohm, Tarrow, akosiaris, Aklapper, wkandek, Akuckartz, darthmon_wmde, Nandana, jijiki, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Lydia_Pintscher, Mbch331, Dzahn ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] T264710: Host static sites on kubernetes
akosiaris added a comment. A couple of requirements from my side, regardless of where those sites are deployed and the technology used: - Support for structured logging to stdout to allow debugging issues via our ELK stack should be a requirement. - Support for exporting metrics via prometheus or statsd should be a requirement. This should allow debugging issues, establishing SLIs and SLOs and allowing to come up with a level of support and ownership of powered services. Failing that, it will be impossible to come up with a level of support and will make those static sites a best effort. TASK DETAIL https://phabricator.wikimedia.org/T264710 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: akosiaris Cc: akosiaris, Tarrow, Aklapper, WMDE-leszek, Addshore, Akuckartz, darthmon_wmde, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331 ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] T255410: Termbox SSR connection terminated very often
akosiaris added a comment. In T255410#6543118 <https://phabricator.wikimedia.org/T255410#6543118>, @Michael wrote: > @akosiaris Thank you a lot for your detailed response. I did look into those errors a tiny bit more to properly document them as can be now seen on wikitech <https://wikitech.wikimedia.org/wiki/WMDE/Wikidata/SSR_Service#Availability_objectives_and_accepted_operational_errors>. > > In the course of that I looked at the last days and noticed some discrepancies to the numbers you provided above. All the following data is for the 7 days between 2020-10-07 00:00:00 and 2020-10-13 23:59:59. I think you just exposed some weird behavior/bug in prometheus's `increase()` function regarding counter resets. I've added a panel to the graph showcasing it. If you manually subtract the valleys from the peaks for the 3 distinct timeframes depicted there you get almost the same errors as logstash. It's `62-0 + 99 - 0 + 484 - 440= 170`. It's probably that last (first timewise) timeframe that throws prometheus off. Given that, per the docs [1], "It is syntactic sugar for rate(v) multiplied by the number of seconds under the specified time range window, and should be used primarily for human readability.", there is probably something funny going on over the large timeframe. The rate() is also depicted in the panel and it is gradually dropping as well, but it's quite a bit higher in the first timeframe. Couple of notes though to clarify a few things. > - the Grafana SLO panel <https://grafana.wikimedia.org/d/JcMStTFGz/termbox-slo-panel?orgId=1&from=160202880&to=1602633599000> shows **277** errors. This is from the PoV of termbox itself. It counts the HTTP 500s termbox knows it emitted. 
> - the kubernetes logstash for Termbox SSR <https://logstash-next.wikimedia.org/goto/dbf13405ed13e217d271c2ce1f694ae7> has **171** errors in that time frame > - 120 timeout errors, 51 envoy 503 errors > - I excluded some 19 errors about "startup finished", that are probably the ones you mentioned with "not worth looking into" Same PoV but on a log level. > I was surprised by that, but noticed that there were also a similar amount of network errors between MediaWiki and the Termbox SSR app in that timeframe: > > - the MediaWiki (PHP) logstash <https://logstash.wikimedia.org/goto/995becc306bb3da55de9e321631c40d0> has **104** errors of Termbox being unreachable That's actually from the PoV of mediawiki. If you put this logstash dashboard and the termbox one side-by-side there's considerable overlap as events are depicted in both. > It would make sense to me if the SLO covered those network problems as well, as they defacto impact the availability of the service to MediaWiki. Also, taking those errors together, we can account for 275 of the 277 errors shown in the Grafana SLO panel. > > Is the understanding layed out above correct? I think it's wrong to sum the 2 logstash dashboards (in fact, it's just coincidence that the numbers added up to something close to 277 as that was a made up number from prometheus). They are of a different nature and thus wrong to sum as you will be double counting events. 
[1] https://prometheus.io/docs/prometheus/latest/querying/functions/#increase TASK DETAIL https://phabricator.wikimedia.org/T255410 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Michael, akosiaris Cc: toan, Lucas_Werkmeister_WMDE, Sakretsu, akosiaris, JMeybohm, WMDE-leszek, Pablo-WMDE, Tarrow, Jakob_WMDE, Addshore, Aklapper, Michael, wkandek, Akuckartz, Iflorez, darthmon_wmde, alaa_wmde, Nandana, jijiki, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Wikidata-bugs, aude, Lydia_Pintscher, Mbch331, Dzahn ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] T255410: Termbox SSR connection terminated very often
akosiaris added a comment. In T255410#6550492 <https://phabricator.wikimedia.org/T255410#6550492>, @Michael wrote: > > That seems very strange. I would have expected the //error rate// to be calculated by `(number of errors / number of total requests)` for the given timeframe. How does it actually work? Something like `(number of milliseconds with error/number of total milliseconds in timeframe)`? You can say that again :-). The main formula is what you described. In prometheus terms, it's sum(increase(service_runner_request_duration_seconds_count{service="termbox", prometheus="k8s", uri="termbox", status="500"}[$__range])) / sum(increase(service_runner_request_duration_seconds_count{service="termbox", prometheus="k8s", uri="termbox", status=~"200|500"}[$__range])) and that's what the left panel in that dashboard has. The issue isn't with the division, it's rather with the increase() function (the right hand side panel is just the numerator of the above equation), so it's sum( increase( service_runner_request_duration_seconds_count{service="termbox", prometheus="k8s", uri="termbox", status="500"}[$__range] ) ) The `sum()` is to sum across all the instances of termbox in that timeframe, the `increase()` is to calculate the change in that quantity from start to end of the timeframe. Normally it works, but in this case, it has failed. My guess as to what has happened is that due to 2 deployments (you can use the main termbox dashboard to spot them) termbox pods were destroyed and new ones started. So metrics changed and the internal counter reset detection of rate() could not function. If you target a week without deploys, you aren't going to witness that. If you want to know more, there is more info about counters and how they work in prometheus at https://www.robustperception.io/how-does-a-prometheus-counter-work It also means we'll have to figure out how to better calculate the SLO across large timeframes. 
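To make the counter-reset problem more concrete, here is a minimal sketch of the "subtract the valleys from the peaks" idea, applied per series and then summed. It is a simplified model, not Prometheus's actual `increase()` implementation (which, among other things, also extrapolates to the window boundaries); the sample values and pod names are hypothetical.

```python
def increase_with_resets(samples):
    """Sum the growth of one counter series, treating any drop as a reset to zero."""
    total = 0
    for prev, cur in zip(samples, samples[1:]):
        # A counter never decreases on its own, so a lower value means the
        # process restarted and counted up from 0 to `cur` again.
        total += cur - prev if cur >= prev else cur
    return total

def total_increase(series_by_pod):
    """Equivalent of summing the per-series increase across all instances."""
    return sum(increase_with_resets(s) for s in series_by_pod.values())

# One series that restarts in the middle of the window (hypothetical samples).
restarted = [10, 15, 20, 3, 8]
print(restarted[-1] - restarted[0])     # naive end-minus-start: -2
print(increase_with_resets(restarted))  # reset-aware: 18

# A deployment replaces pods, so old and new counters are distinct series.
pods = {"termbox-old": [100, 120], "termbox-new": [0, 30]}
print(total_increase(pods))             # 50
```

The per-series handling is the important part: once pods are replaced, each pod's counter has to be followed on its own before the results are added up.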
TASK DETAIL https://phabricator.wikimedia.org/T255410 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Michael, akosiaris Cc: toan, Lucas_Werkmeister_WMDE, Sakretsu, akosiaris, JMeybohm, WMDE-leszek, Pablo-WMDE, Tarrow, Jakob_WMDE, Addshore, Aklapper, Michael, wkandek, Akuckartz, Iflorez, darthmon_wmde, alaa_wmde, Nandana, jijiki, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Wikidata-bugs, aude, Lydia_Pintscher, Mbch331, Dzahn ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] T264710: Host static sites on kubernetes
akosiaris added a comment. For what it's worth, the idea that Daniel explains above would solve the issue for now without the need to move to kubernetes, satisfying several of the requirements without requiring significant effort. The following from the task description are satisfied: [X] Teams that manage / own the sites should be able to update the content of the site [X] The hosting location can be pointed to from sub paths of query.wikidata.org (and similar flexible locations). For WDQS this could be done in the WDQS nginx server config [X] Does CDN cache purging need to be considered at all (setting the correct `Cache-control` HTTP header in the apache config would solve this). The following aren't, but were marked by yours truly as SHOULD, not MUST, to begin with. In the interest of moving forward and providing a solution I think it's ok. [ ] Support for structured logging to stdout to allow debugging issues via our ELK stack should be a requirement. [ ] Support for exporting metrics via prometheus or statsd should be a requirement. TASK DETAIL https://phabricator.wikimedia.org/T264710 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: akosiaris Cc: JMeybohm, CDanis, Dzahn, Gehel, dcausse, Joe, akosiaris, Tarrow, Aklapper, WMDE-leszek, Addshore, wkandek, CBogen, Akuckartz, darthmon_wmde, Nandana, Namenlos314, jijiki, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, Mahir256, QZanden, EBjune, merbst, LawExplorer, Salgo60, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Lydia_Pintscher, Mbch331 ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] T265504: Create Blubberfile in WDQS repo
akosiaris added a comment. In T265504#6578280 <https://phabricator.wikimedia.org/T265504#6578280>, @Zbyszko wrote: > @akosiaris Can we base a blubber enabled project on a 3rd party docker image, provided on docker hub? I was wondering if we have to replicate original dockerfile here (I'd rather base of their image to reduce future maintenance). No, we don't want to base anything that's running in production on 3rd party images due to a variety of issues with them, ranging from security issues to supply chain attacks and integration with our auditing toolset. That's a decision taken long ago, but you can refer to https://www.mediawiki.org/wiki/Wikimedia_Technical_Talks#Episode_6:_A_Deployment_Pipeline_Overview for an overview on the why and hows. TASK DETAIL https://phabricator.wikimedia.org/T265504 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Mstyles, akosiaris Cc: Zbyszko, akosiaris, Ottomata, dcausse, Aklapper, Gehel, Mstyles, Alter-paule, Beast1978, CBogen, Un1tY, Akuckartz, Hook696, Kent7301, joker88john, CucyNoiD, Nandana, Namenlos314, Gaboe420, Giuliamocci, Cpaulf30, Lahi, Gq86, Af420, Bsandipan, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, Lewizho99, Maathavan, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331 ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] T265504: Create Blubberfile in WDQS repo
akosiaris added a comment. In T265504#6580645 <https://phabricator.wikimedia.org/T265504#6580645>, @Zbyszko wrote:
> @akosiaris I see, makes sense. I still would like to solve the issue with replicating the original dockerfile - can we deploy Flink images to our registry - even if we'd need to fork Flink docker repo?
Could you elaborate on that a bit? I am not sure I have understood the question. Specifically, what do you mean by "deploy Flink images to our registry"? Whose Flink images? Ours? Sure. 3rd party ones? No, for the aforementioned reasons. By the way, I'd strongly suggest NOT trying to replicate the original Dockerfile. We've consciously and on purpose built Blubber for our infrastructure, to spare everyone from having to deal with Dockerfiles, as they are very easy to get wrong in a multitude of ways and end up creating insecure, misbehaving or non-optimized images.
TASK DETAIL https://phabricator.wikimedia.org/T265504
[Wikidata-bugs] [Maniphest] T265504: Create Blubberfile in WDQS repo
akosiaris added a comment. In T265504#6580948 <https://phabricator.wikimedia.org/T265504#6580948>, @Zbyszko wrote:
>> Could you elaborate on that a bit?
>
> Sure, here goes: We are using Apache Flink[1] as a platform for our event processing we do to feed Wikidata Query Service. We've want to move to Flink deployment to Kubernetes, hence this ticket. Apache Flink provides it's own docker image[2] which, in other circumstances, we would build upon. What @Mstyles is doing now is basically replaying work original Flink contributors did for their docker image - which, according to our current knowledge is what we must do.
> The actual docker file (with additional entry script) is here [3] - it would be great if we wouldn't need to make sure that we covered everything that is handled here with each Flink update.
Yeah, that's not needed. What I proposed in the meeting back then (and I may have failed to communicate it clearly) is to use it as an inspiration to solve issues, but not to try and replicate it. That would be a waste of time and resources and just wouldn't work, as it is written with a different mindset, e.g. the use of gosu doesn't make sense in our environment, we don't use EXPOSE, user/group creation is handled by blubber, the docker-entrypoint.sh tries to write to what should be an immutable image, etc.
> I hope that clears it up, I'm terrible at explaining things via text. If you need more context, we could connect over Meet.
Actually that helped a lot. Thanks for taking the time to explain it, I hope my answer helps as well.
> [1] https://flink.apache.org/
> [2] https://ci.apache.org/projects/flink/flink-docs-stable/ops/deployment/docker.html
> [3] https://github.com/apache/flink-docker/tree/master/1.11/scala_2.11-java8-debian
TASK DETAIL https://phabricator.wikimedia.org/T265504
[Wikidata-bugs] [Maniphest] T264710: Host static sites on kubernetes
akosiaris closed this task as "Invalid". akosiaris added a comment. In T264710#6586070 <https://phabricator.wikimedia.org/T264710#6586070>, @Addshore wrote:
> Sounds like a fine solution from our side for now.
> I'll let #serviceops <https://phabricator.wikimedia.org/tag/serviceops/> do with this ticket as they wish (keep it or close it) and I'll get on to creating some tickets for:
Cool, thanks. I'll close it as invalid for now, but we can always reopen in the future to resume this conversation.
TASK DETAIL https://phabricator.wikimedia.org/T264710
[Wikidata-bugs] [Maniphest] T265504: Create Blubberfile in WDQS repo
akosiaris added a comment. In T265504#6598430 <https://phabricator.wikimedia.org/T265504#6598430>, @Mstyles wrote:
> @akosiaris when you get some time, can you please take another look at https://gerrit.wikimedia.org/r/c/wikidata/query/rdf/+/635074
Yes, I will. This hasn't fallen through the cracks, it's just RL catching up with me.
TASK DETAIL https://phabricator.wikimedia.org/T265504
[Wikidata-bugs] [Maniphest] T265504: Create Blubberfile in WDQS repo
akosiaris added a comment. In T265504#6615197 <https://phabricator.wikimedia.org/T265504#6615197>, @Mstyles wrote:
> @akosiaris I started using the new Java images that you uploaded. I wasn't able to install gpg in the build process. There are some conflicts. We can skip gpg verification of the Flink tar, but I don't think that's a good idea. I will continue to do some debugging.
>
> Error message:
>
> The following packages have unmet dependencies:
> gpg : Depends: gpgconf (= 2.2.12-1+deb10u1~bpo9+1) but it is not going to be installed
> Depends: libassuan0 (>= 2.5.0) but 2.4.3-2 is to be installed
> Depends: libgpg-error0 (>= 1.35) but 1.26-2 is to be installed
> E: Unable to correct problems, you have held broken packages.
Yup, I've left comments on the change about this already. TL;DR, the package name is `gnupg`. I did manage to get a container to build correctly with the proposed changes, although, due to the downloading of flink, be prepared for a slow process.
TASK DETAIL https://phabricator.wikimedia.org/T265504
[Wikidata-bugs] [Maniphest] T265512: Set up Pipeline Configuration in WDQS repo
akosiaris added subscribers: jeena, dduvall. akosiaris added a comment. In T265512#6637980 <https://phabricator.wikimedia.org/T265512#6637980>, @Mstyles wrote:
> @akosiaris it was unclear to me whether we need the promote section in the pipeline config. I'm referring to this: https://wikitech.wikimedia.org/wiki/PipelineLib/Reference#Promote and I saw it in a couple of configs here: https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/services/mathoid/+/refs/heads/master/.pipeline/config.yaml#34.
Promote just creates a gerrit change for review after the image has been built. Essentially, it automates away the step of pushing the version bump to gerrit. It's up to you whether you want it or not. You can always start without it, of course.
> Additionally, just so I'm clear, we don't need to do the Jenkins configuration unless we want this to run on every commit (we do not want that). I'm referring to what I saw in the docs: https://wikitech.wikimedia.org/wiki/PipelineLib/Guides/How_to_configure_CI_for_your_project. We just want to be able to rebuild the image whenever we have a new release of the service (on average, once a week)
Hm, that's a first, I am not sure we fully support it. We definitely support triggered building by pushing a new tag, but I am not sure we support skipping CI on every commit. @Jeena, @dduvall, input please.
TASK DETAIL https://phabricator.wikimedia.org/T265512
[Wikidata-bugs] [Maniphest] T265512: Set up Pipeline Configuration in WDQS repo
akosiaris added a comment. In T265512#6648623 <https://phabricator.wikimedia.org/T265512#6648623>, @Gehel wrote:
>
> My (limited) understanding of the usual pipelines is that we expect a 1-to-1 mapping between project source code, packaged artifact and deployment.
That is true for usual pipelines, but since we have already run into this with the deployment pipeline, work has been done to not require that 1-to-1 mapping. An example of the fact that we don't have a 1-to-1 mapping between packaged artifact and deployment is eventgate, which has 4 different installations[1]. Also, depending on the support from the framework used, there isn't necessarily a 1-to-1 mapping between repo and packaged artifact. It is true that 1 repo needs to be the umbrella one for which configuration is set up, but otherwise that repo can fetch in dependencies. Finally, there has been work on the possibility to create more than 1 docker container (the artifact part) from a single repo. However, the team behind our testbed (ORES) does not have the capacity to test that out, and the ORES migration to kubernetes has been declined (for reasons mostly unrelated to the pipeline).
> This assumption is broken in our case, where the project source code generates multiple packaged artifacts, and each artifact generates potentially multiple independent deployments.
I think not, at least for the latter, as the eventgate example shows. Depending on what exactly "multiple packaged artifacts" means, that might work fine as well.
[1] https://gerrit.wikimedia.org/r/plugins/gitiles/operations/deployment-charts/+/refs/heads/master/helmfile.d/services/
TASK DETAIL https://phabricator.wikimedia.org/T265512
[Wikidata-bugs] [Maniphest] T264006: Deploy Flink to kubernetes (k8s)
akosiaris added a comment. After the helm chart is merged and published (both should happen automatically on a +2, I've +1ed already), the final 2 items for deployment are:
[ ] Create k8s tokens, namespaces. That's for SRE ServiceOps, example is at 17f18754661a9091ac7f621dda0d4dadb7b3a9f5 <https://phabricator.wikimedia.org/rOPUP17f18754661a9091ac7f621dda0d4dadb7b3a9f5>, e7036a5756eebfc104a0ca4a413d7fb0eb28932f <https://phabricator.wikimedia.org/rLPRIe7036a5756eebfc104a0ca4a413d7fb0eb28932f> and e8e5385765cf92a918ffa7021d09f49011efdc52 <https://phabricator.wikimedia.org/rDEPLOYCHARTSe8e5385765cf92a918ffa7021d09f49011efdc52>
[ ] Create a helmfile.d structure. Example for that at b5b4c450dbb0bc4276ad9a3fc4926b74897824a2 <https://phabricator.wikimedia.org/rDEPLOYCHARTSb5b4c450dbb0bc4276ad9a3fc4926b74897824a2>. That part is probably best not done by ServiceOps but rather have them as reviewers.
TASK DETAIL https://phabricator.wikimedia.org/T264006
[Wikidata-bugs] [Maniphest] [Unblock] T98030: Do whatever is necessary to hook up production Wikidata Query Service to HDFS and other log collection systems
akosiaris closed blocking task T109357: Grant SMalyshev access to stat1002 to query hive as "Resolved".
TASK DETAIL https://phabricator.wikimedia.org/T98030
[Wikidata-bugs] [Maniphest] [Unblock] T98030: Do whatever is necessary to hook up production Wikidata Query Service to HDFS and other log collection systems
akosiaris closed blocking task T110217: Need access for smalyshev to hive queries on stat1002 as "Resolved".
TASK DETAIL https://phabricator.wikimedia.org/T98030
[Wikidata-bugs] [Maniphest] [Commented On] T138627: Enable WDQS admins to enable/disable updater service
akosiaris added a comment. I support that as well. The code in wdqs::updater should probably be amended anyway to use base::service_unit at some point, at which point the question of whether puppet should also manage a service resource will probably be posed. We would at least be ready from that point of view.
TASK DETAIL https://phabricator.wikimedia.org/T138627
[Wikidata-bugs] [Maniphest] [Commented On] T163922: Create a URL rewrite to handle the /data/ path for canonical URLs for machine readable page content
akosiaris added a comment. There are other comments from yours truly in the last review, namely maintaining the status quo of configuring the redirect, aside from the 303 vs 301 part, on which I can be convinced with a good enough argument, but I haven't yet seen a reply.
TASK DETAIL https://phabricator.wikimedia.org/T163922
[Wikidata-bugs] [Maniphest] [Commented On] T163922: Create a URL rewrite to handle the /data/ path for canonical URLs for machine readable page content
akosiaris added a comment. In T163922#3639831, @Lucas_Werkmeister_WMDE wrote:
> If the TTL isn’t too long (I saw a cap of 1 day in the puppet config, is that correct?), then normal expiry is probably enough.
It doesn't work like that. The time that a response can be cached is determined by the headers the response sets. http://book.varnish-software.com/3.0/HTTP.html is a pretty interesting read if you've never read it before. It's also a rabbithole (albeit not a very big one ;-). In the absence of these (like in this case, where only the Age header was set) it's not easy to deduce when the page is going to be removed from all existing caches (some of which we don't really control, like the browser cache).
Anyway, I've purged the caches in order to resolve this faster instead of waiting it out. For the interested, the commands were (in that sequence):
    varnishadm ban "obj.status == 301 && req.http.host ~ commons.wikimedia.org"
    varnishadm -n frontend ban "obj.status == 301 && req.http.host ~ commons.wikimedia.org"
Do remember to force refresh to test it, as your browser probably has the result cached as well.
TASK DETAIL https://phabricator.wikimedia.org/T163922
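To illustrate the point above that cacheability is driven by response headers, here is a minimal sketch in Python. It is not how Varnish or any Wikimedia cache is implemented; the function names are hypothetical, only `Cache-Control` and `Age` are considered (`Expires` and heuristic freshness are deliberately ignored), and header names are assumed to be given in exactly this capitalisation.

```python
from typing import Mapping, Optional


def freshness_lifetime(headers: Mapping[str, str]) -> Optional[int]:
    """Return the explicit freshness lifetime in seconds, or None when the
    response carries no explicit Cache-Control lifetime (hypothetical helper)."""
    directives = {}
    for part in headers.get("Cache-Control", "").split(","):
        name, _, value = part.strip().partition("=")
        if name:
            directives[name.lower()] = value.strip('"')

    # Simplification: treat both directives as "do not reuse without checking".
    if "no-store" in directives or "no-cache" in directives:
        return 0
    # A shared cache prefers s-maxage over max-age.
    for key in ("s-maxage", "max-age"):
        if key in directives:
            try:
                return max(int(directives[key]), 0)
            except ValueError:
                return 0
    return None


def remaining_ttl(headers: Mapping[str, str]) -> Optional[int]:
    """Seconds the response may still be served from cache, or None if unknown."""
    lifetime = freshness_lifetime(headers)
    if lifetime is None:
        return None
    age = int(headers.get("Age", "0"))
    return max(lifetime - age, 0)
```

With only an `Age` header present, as in the case described above, `remaining_ttl` returns `None`: the headers alone do not say when the cached redirect would expire, which is why an explicit ban was the faster route.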
[Wikidata-bugs] [Maniphest] [Unblock] T146207: publish lag and response time for wdqs codfw to graphite
akosiaris closed subtask T146474: Add firewall exception to get to wdqs*.codfw.wmnet: from analytics cluster as "Resolved".
TASK DETAIL https://phabricator.wikimedia.org/T146207
[Wikidata-bugs] [Maniphest] [Closed] T146474: Add firewall exception to get to wdqs*.codfw.wmnet:8888 from analytics cluster
akosiaris closed this task as "Resolved". akosiaris claimed this task. akosiaris added a comment. This seems to have fallen through the cracks. I've had a quick look and a talk with @Gehel and opened the access. Resolving, feel free to reopen.
TASK DETAIL https://phabricator.wikimedia.org/T146474
[Wikidata-bugs] [Maniphest] T280485: Additional capacity on the k8s Flink cluster for WCQS updater
akosiaris added a comment. Hi. Given the 13 core/15GB RAM requirement, I can verify that we have that capacity lying around free[1], so we expect no problems there. Is T280485#7275149 <https://phabricator.wikimedia.org/T280485#7275149> related to blazegraph and not flink? I am not sure what 13B triplets vs 2.8T triples means storage wise and in which context.
[1] https://w.wiki/4Q7J
TASK DETAIL https://phabricator.wikimedia.org/T280485
[Wikidata-bugs] [Maniphest] T280485: Additional capacity on the k8s Flink cluster for WCQS updater
akosiaris added a comment. In T280485#7509094 <https://phabricator.wikimedia.org/T280485#7509094>, @Gehel wrote:
> In T280485#7506072 <https://phabricator.wikimedia.org/T280485#7506072>, @akosiaris wrote:
>
>> Is T280485#7275149 <https://phabricator.wikimedia.org/T280485#7275149> related to blazegraph and not flink ? I am not sure what 13B triplets vs 2.8T triples means storage wise and in which context.
>
> The number of triples in Blazegraph is roughly linearly correlated with the local storage requirement on the Flink side.
OK, so we are talking about an increase that's on the order of 200 times. However, I am still not clear on where exactly that change would show up. Swift? The fs (I guess /tmp mostly) of the container of the jobmanager or the taskmanager? And do we expect the increase to actually be on the order mentioned above? Or are the internal flink databases more or less efficient than that crude calculation?
TASK DETAIL https://phabricator.wikimedia.org/T280485
[Wikidata-bugs] [Maniphest] T280485: Additional capacity on the k8s Flink cluster for WCQS updater
akosiaris added a comment. In T280485#7510193 <https://phabricator.wikimedia.org/T280485#7510193>, @Gehel wrote:
>>> In T280485#7506072 <https://phabricator.wikimedia.org/T280485#7506072>, @akosiaris wrote:
>>>
>>>> Is T280485#7275149 <https://phabricator.wikimedia.org/T280485#7275149> related to blazegraph and not flink ? I am not sure what 13B triplets vs 2.8T triples means storage wise and in which context.
>
> Oh, I now see the confusion! Wrong units (typo) in the initial message. The current Flink updater takes data from Wikidata, which has ~13B triples.
😆😆. OK, thanks for clearing that up. That 200 times increase had me worried.
> The new Flink updater will add support for getting data from Commons, which has ~2.8B triples. So the new updater will add ~20% more resource consumption (assuming a linear cost).
OK, that's nothing then.
> This will mean:
>
> - additional storage on Swift (I assume this is trivial given the size of Swift and can be ignored)
> - additional CPU / RAM usage on k8s
> - additional local storage (/tmp) on the containers
We have enough of all 3, no worries there.
> It isn't super clear to me if our strategy is to increase the size of the current Flink cluster, or have a new cluster dedicated to the Commons updater (to be decided later today).
Cool. Let us know what you decide. On our side, it probably isn't much more than 1 more deployment on the k8s cluster. That being said, and assuming my memory is up to date with how flink works in session cluster mode, I'd expect it to be able to handle this internally without needing another k8s deployment. It's also fine to increase the number of worker pods if that's something that would make things easier for you.
> Duplicate the existing cluster would provide additional isolation between the 2 workflows. This is also the worst case scenario in terms of resource needed.
> The additional estimated resources are:
>
> - manager: 1 more pod at 1.6G, cpu: 500m
> - workers: 3 pods at 2.1G ram, cpu: 1000m
Even in this case, we have that capacity.
TASK DETAIL https://phabricator.wikimedia.org/T280485
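The back-of-the-envelope numbers in this exchange can be checked with a few lines of Python. This is only a sketch of the arithmetic under the thread's own assumption of a linear cost per triple; the constant and function names are hypothetical, and the triple counts are the approximate figures quoted above.

```python
# Approximate triple counts quoted in the thread.
WIKIDATA_TRIPLES = 13e9        # ~13B, current Flink updater input
COMMONS_TRIPLES = 2.8e9        # ~2.8B, the corrected figure for Commons
COMMONS_TRIPLES_TYPO = 2.8e12  # 2.8T, the figure as first (mis)typed


def relative_increase(current: float, added: float) -> float:
    """Ratio of the added load to the current load, assuming linear cost."""
    return added / current


# The typo reads as a roughly 200-fold increase...
print(f"{relative_increase(WIKIDATA_TRIPLES, COMMONS_TRIPLES_TYPO):.0f}x")
# ...while the corrected figure is an increase of roughly a fifth.
print(f"{relative_increase(WIKIDATA_TRIPLES, COMMONS_TRIPLES):.1%}")
```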
[Wikidata-bugs] [Maniphest] T280485: Additional capacity on the k8s Flink cluster for WCQS updater
akosiaris added a comment. In T280485#7510247 <https://phabricator.wikimedia.org/T280485#7510247>, @dcausse wrote:
> small precision:
> If we reuse the same cluster (same k8s namespace):
>
> - it's 3 more pods at 2.1G ram, cpu: 1000m each
>
> If we use a separate cluster (new k8s namespace):
>
> - add a pod at 1.6G, cpu: 500m to the 3 pods mentioned above
The difference is small enough to make resource allocation irrelevant for this. I'd suggest you don't try to optimize for CPU/RAM when deciding.
TASK DETAIL https://phabricator.wikimedia.org/T280485
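For reference, the two options compared above add up as follows. This is a small illustrative sketch, not anything from the deployment charts: the `PodSpec` type and helper are hypothetical, and "G" is taken to mean gigabytes of RAM per pod.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class PodSpec:
    """Hypothetical description of a group of identical pods."""
    count: int
    memory_gb: float      # assumption: "G" in the thread means GB of RAM
    cpu_millicores: int


def total(pods: Iterable[PodSpec]) -> Tuple[float, int]:
    """Return (total memory in GB, total CPU in millicores)."""
    pods = list(pods)
    memory = sum(p.count * p.memory_gb for p in pods)
    cpu = sum(p.count * p.cpu_millicores for p in pods)
    return round(memory, 1), cpu


WORKERS = PodSpec(count=3, memory_gb=2.1, cpu_millicores=1000)
MANAGER = PodSpec(count=1, memory_gb=1.6, cpu_millicores=500)

same_cluster = total([WORKERS])               # reuse the existing namespace
separate_cluster = total([WORKERS, MANAGER])  # new namespace, extra manager pod

print(same_cluster, separate_cluster)
```

The gap between the two options is a single manager pod (1.6G, 500m), which is the "small enough" difference referred to in the reply.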
[Wikidata-bugs] [Maniphest] T280485: Additional capacity on the k8s Flink cluster for WCQS updater
akosiaris added a comment. In T280485#7510249 <https://phabricator.wikimedia.org/T280485#7510249>, @Zbyszko wrote:
> Sorry all for the confusion my typo caused, different name for that magnitude in my native language is confusing me my whole life :).
No worries, that's fine. You actually just triggered my encyclopedic inner beast which sent me down a small translation rabbithole and finally into https://pl.wikipedia.org/wiki/Przedrostek_SI and I must say `Thanks. Today I learned :)` Btw, the equivalent in Greek is Διεθνές_σύστημα_μονάδων#Προθέματα_μονάδων <https://el.wikipedia.org/wiki/%CE%94%CE%B9%CE%B5%CE%B8%CE%BD%CE%AD%CF%82_%CF%83%CF%8D%CF%83%CF%84%CE%B7%CE%BC%CE%B1_%CE%BC%CE%BF%CE%BD%CE%AC%CE%B4%CF%89%CE%BD#%CE%A0%CF%81%CE%BF%CE%B8%CE%AD%CE%BC%CE%B1%CF%84%CE%B1_%CE%BC%CE%BF%CE%BD%CE%AC%CE%B4%CF%89%CE%BD> and yes a similar pattern exists in `Tera`, `Peta` and `Exa` which all are -1 of what the sounds would implicitly suggest to a Greek person. So, I feel you.
TASK DETAIL https://phabricator.wikimedia.org/T280485
[Wikidata-bugs] [Maniphest] T301471: New Service Request SchemaTree
akosiaris added a comment. In T301471#7837692 <https://phabricator.wikimedia.org/T301471#7837692>, @Addshore wrote:
> So a summary comment as promised!
>
> In T301471#7799393 <https://phabricator.wikimedia.org/T301471#7799393>, @Michaelcochez wrote:
>
>> @QChris I noticed the addition of the .gitreview file on gerrit. Is this file needed? If so, we would merge it into our github repository, so we can keep the active development there and synchronize with gerrit.
>
> Sounds like this would work to me, and would be fine from the WMDE side of things.
> @akosiaris would syncing the code from Github to Gerrit via a Gerrit change whenever it would need to be updated be acceptable on your side?
Yes, that would be ok, some other services use that approach too.
> To summarize current state:
>
> - Repo was created, code is there https://gerrit.wikimedia.org/g/wikidata/propertysuggester/RecommenderServer/+/refs/heads/main but right now development still happens on github
> - Blubber configuration exists in the repo, but who needs to setup the CI for this @akosiaris ? Is this on your team? or something we can do (we would need docs / guidence)
Release Engineering would be the team for that. You can always send a gerrit change for integration/config (it should be pretty similar to 4bd7dc4ad5fb1 <https://phabricator.wikimedia.org/rCICF4bd7dc4ad5fb1d1c99a4bed049050e4bd26398ca>) and they can review/merge/deploy.
> - bullseye distribution update was done as suggested above, does @Joe or @akosiaris want to review this PR? https://github.com/martaannaj/RecommenderServer/pull/22/files
Done
> - 'deployment-charts repository' was also mentioned, Again is there anything we can do on our side, or more something for service ops to handle?
This is almost entirely self-serve. Docs are at: https://wikitech.wikimedia.org/wiki/Deployment_pipeline/Migration/Tutorial#Creating_a_Helm_Chart Once you have the chart from the above step, upload it to gerrit for review and we'll work together there.
> The other big topic is still how to include the index for the service.
> Context, this index file is currently ~1.5GB and would need to be regenerated every now and again:
> I see a couple of options, and I'm sure @akosiaris and team would have options.
>
> - Build 1 image with just the code and have another repo and another blubber config etc that then adds an index file in to a new image
> - Where can this image file be stored and retrieved? is this a bit big to just shove in a gerrit repo?
That's a large file. It's a bad idea to stick it in a docker registry; it will make deployments slower, more resource consuming and brittle. We are already hitting various limits (at multiple places) with large files and I can tell you that you don't want to be debugging them. Plus, conceptually, the place for a dataset that is used by a codebase isn't right next to the codebase. To attack the problem from another angle: What is in that dataset? What format does it have, and why? Is it something we can alter? Would it make sense if it was stored in a datastore (e.g. SQL, Cassandra, etc) and queried instead? It's also a bad idea to stick it in a git repo (any git repo). In fact, gerrit won't even allow you to push this. It has (and for good reasons) a `maxObjectSizeLimit` of 100MB.
TASK DETAIL https://phabricator.wikimedia.org/T301471
[Wikidata-bugs] [Maniphest] T238751: Only generate maxlag from pooled query service servers.
akosiaris added a comment.

In T238751#7851690 <https://phabricator.wikimedia.org/T238751#7851690>, @Addshore wrote:

> @Joe (Also pinging @akosiaris as I know joe is out right now).
> It seems like the ideal solution of T239392: Applications and scripts need to be able to understand the pooled status of servers in our load balancers. <https://phabricator.wikimedia.org/T239392> might not happen for some time.
> Would it be possible to resolve this for now with https://gerrit.wikimedia.org/r/c/operations/puppet/+/553097 which I believe would have been "fine" TM for the last 2.5 years and decreased humans touching things and also decreased the number of issues users end up seeing around delayed / broken but depooled wdqs hosts?

That patch is already out of date. lvs2003 is no longer around, which shows exactly why the approach of hardcoding an LVS server in the patch would be problematic.

TASK DETAIL https://phabricator.wikimedia.org/T238751
[Wikidata-bugs] [Maniphest] T301471: New Service Request SchemaTree
akosiaris added a comment.

In T301471#7840496 <https://phabricator.wikimedia.org/T301471#7840496>, @Michaelcochez wrote:

> I merged the pull request on github now.
>
> I do not have rights to push to the gerrit repository, it might just be my limited knowledge of how gerrit works.

I've added you to the gerrit `wikidata-propertysuggester-RecommenderServer` group, you should have access now.

> I will look into the helm chart/CI setup soon.
>
>> questions around the index file:
>
> This file is the serialization of the in-memory tree structure used for recommendation. The file is a compressed (gzipped) binary file. For serialization we use https://pkg.go.dev/encoding/gob . Given this, and the fact that changes in the tree structure can have a 'rippling effect', it is not possible (or at least extremely hard) to alter the file. This tree is a specifically crafted type of index, serving its data from an external database would be impossible/detrimental for performance as it would require //a lot// of roundtrips.

Despite the allure, shipping around serialized memory objects has many drawbacks as an approach. The most obvious are the security ones, and most languages indeed put wording in their respective frameworks to point that out. https://github.com/golang/go/issues/20221 has some hints. Python's pickle more or less points out the same. Really big hacks that have exfiltrated tons of private data have happened because of "serialization" vulnerabilities (e.g. the Equifax hack relied on an Apache Struts serialization vulnerability - https://nakedsecurity.sophos.com/2017/09/06/apache-struts-serialisation-vulnerability-what-you-need-to-know/).

There are more drawbacks of course. For example, how do you do versioning of the dataset? It needs to always match the definitions of the Golang struct that it contains. Even simple changes in field names can cause unintended behavior, e.g. changing a field name means that data for it will silently be dropped when deserializing an older dataset and loading it into memory. Thus the dataset needs to be strongly coupled with the application (that is, they need to be deployed in tandem), which is a bad pattern due to the size constraints I've explained above, not to mention the fact that gerrit currently won't even allow you to upload the file.

> The index file is loaded into memory once when the process starts. It could be loaded from 'anywhere' and does not even have to reside on disk necessarily.

That's the thing. It can't be loaded from 'anywhere', because of the security issues and because of the strong coupling it has with the application itself.

A final question, regarding the external database roundtrips note. Almost all datastores (either RDBMS ones or NoSQL ones) have the ability to batch results, obviating the need for multiple roundtrips. As a result, many ORMs (e.g. Hibernate, Django, SQLAlchemy, Gorm) also support this (naming the functionality with various terms, but it's there). In fact, we've seen this before, and in most cases rewriting the queries to fetch hundreds or thousands of entities in 1 go instead of issuing hundreds or thousands of queries was trivial.

There are also other well known, proven and safer ways of shipping around serialized structured data, e.g. protocol buffers (aka protobufs[1]). Have any of the above been evaluated?

[1] https://developers.google.com/protocol-buffers

TASK DETAIL https://phabricator.wikimedia.org/T301471
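The "renamed field is silently dropped" concern from the comment above can be illustrated with a small sketch. This is not Go and does not use `encoding/gob`; it is a Python simulation of name-matched decoding, which is how gob is documented to pair transmitted fields with receiver fields as far as I understand it. The `decode` helper, the `NodeV1`/`NodeV2` types and the `support`/`count` field names are all hypothetical and are not taken from the RecommenderServer code.

```python
from dataclasses import dataclass, fields


def decode(cls, payload):
    """Hypothetical name-matched decoder.

    Copies only the keys that the target type declares and silently
    ignores everything else. Fields missing from the payload keep
    their default (zero) value.
    """
    known = {f.name for f in fields(cls)}
    return cls(**{k: v for k, v in payload.items() if k in known})


@dataclass
class NodeV1:
    key: str = ""
    support: int = 0


@dataclass
class NodeV2:
    key: str = ""
    count: int = 0  # renamed from "support" in a later code version


# A dataset written by the V1 code, then read by both code versions.
old_dataset = {"key": "P31", "support": 1234}

print(decode(NodeV1, old_dataset))  # support survives
print(decode(NodeV2, old_dataset))  # count stays 0, and no error is raised
```

The second call does not fail, which is the point: nothing tells the operator that the old dataset and the new code no longer agree. One common mitigation is to store an explicit schema version next to the data and refuse to load on mismatch.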
[Wikidata-bugs] [Maniphest] T330906: HTTP URIs do not resolve from NL and DE?
akosiaris added a comment.

curl also implements HSTS. See https://curl.se/docs/hsts.html, but it is indeed primarily a mechanism to protect users of browsers.

@Ennomeijers you are right about this being a fundamental question. I think nobody expected HTTP to become disfavored anytime soon when that URI scheme was proposed, but then, things happened and here we are today, trying to only serve the absolutely necessary requests that @Vgutierrez mentioned above for maintaining compatibility while suggesting to everyone to migrate to HTTPS.

I expect that this will cause more issues down the road, probably many of them inadvertent, like this one. But it's quite conceivable that HTTP will gradually be less and less supported in the future.

TASK DETAIL https://phabricator.wikimedia.org/T330906
[Wikidata-bugs] [Maniphest] T301471: New Service Request SchemaTree
akosiaris added a comment.

Any updates on this one? Per previous comment we were waiting on a merge, has this been done?

TASK DETAIL https://phabricator.wikimedia.org/T301471
[Wikidata-bugs] [Maniphest] T301471: New Service Request SchemaTree
akosiaris added a comment.

In T301471#8806115 <https://phabricator.wikimedia.org/T301471#8806115>, @Michaelcochez wrote:

> The testing code is now implemented, and we found two small issues with it. These have now been resolved and the code is simplified further.
>
> Given this ticket: https://phabricator.wikimedia.org/T332953 I am uncertain whether it makes sense to merge things in now, or whether to wait for that to happen. @akosiaris what is your suggestion?

I agree with @Dzahn. Don't couple the 2 things; let T332953 <https://phabricator.wikimedia.org/T332953> happen on its own timeline.

TASK DETAIL https://phabricator.wikimedia.org/T301471
[Wikidata-bugs] [Maniphest] T341054: Wikibase DispatchChanges job potentially broken
akosiaris added a comment.

Sorry about that. For what it's worth, we are approaching this piecemeal and this is the first instance. There are more changeprop related metrics that are wrongly defined as summaries rather than histograms; we will ping you before changing the next few ones.

TASK DETAIL https://phabricator.wikimedia.org/T341054
[Wikidata-bugs] [Maniphest] T341054: Wikibase DispatchChanges job potentially broken
akosiaris added a comment.

I think I have fixed the graphs now to be correct. They will definitely be more correct than previously, when they were doing statistically wrong things (aggregating aggregates).

TASK DETAIL https://phabricator.wikimedia.org/T341054
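To make "aggregating aggregates" concrete, here is a minimal sketch with made-up numbers. It assumes two hypothetical service instances with very different traffic volumes. Averaging their per-instance medians gives a figure that matches neither instance nor the pooled traffic, while per-bucket histogram counts can simply be summed before estimating a quantile. The bucket bounds and sample values are invented for illustration and do not come from the changeprop dashboards.

```python
import math


def percentile(values, q):
    """Nearest-rank percentile of a list of numbers, q in (0, 1]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def bucket_counts(values, bounds):
    """Count values per bucket; a value lands in the first bucket whose
    upper bound it does not exceed. Values above the last bound are
    ignored in this sketch (all sample values fit)."""
    counts = [0] * len(bounds)
    for value in values:
        for i, upper in enumerate(bounds):
            if value <= upper:
                counts[i] += 1
                break
    return counts


def quantile_upper_bound(counts, bounds, q):
    """Upper bound of the bucket that contains the q-quantile."""
    target = q * sum(counts)
    running = 0
    for count, upper in zip(counts, bounds):
        running += count
        if running >= target:
            return upper
    return bounds[-1]


busy = [10] * 1000  # hypothetical instance: 1000 requests at 10 ms
idle = [500] * 10   # hypothetical instance: 10 requests at 500 ms

# Wrong: averaging two already-aggregated values.
avg_of_p50 = (percentile(busy, 0.5) + percentile(idle, 0.5)) / 2

# Reference: the median over all requests.
true_p50 = percentile(busy + idle, 0.5)

# Histograms: sum the per-instance bucket counts, then estimate.
BOUNDS = [50, 100, 1000]  # bucket upper bounds in ms (assumed)
merged = [a + b for a, b in zip(bucket_counts(busy, BOUNDS),
                                bucket_counts(idle, BOUNDS))]

print(avg_of_p50)                                 # 255.0
print(true_p50)                                   # 10
print(quantile_upper_bound(merged, BOUNDS, 0.5))  # 50
```

The histogram estimate is coarse (it only says the median is at most 50 ms), but it is consistent with the pooled data, whereas the averaged median of 255 ms is off by more than an order of magnitude.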
[Wikidata-bugs] [Maniphest] T341054: Wikibase DispatchChanges job potentially broken
akosiaris added a comment.

Fixed the alert too. Took me a bit to figure out how to find it, thanks for posting the link in the task.

TASK DETAIL https://phabricator.wikimedia.org/T341054
[Wikidata-bugs] [Maniphest] T301471: New Service Request SchemaTree
akosiaris added a comment.

In T301471#7964314 <https://phabricator.wikimedia.org/T301471#7964314>, @Michaelcochez wrote:

> An update on the current status, mainly regarding the index file:
>
> First, I made a mistake in my response above. The size of the file is a lot smaller than what I wrote above. The binary version is currently around 75mb (and not 1.5gb).
>
> Progress:
>
> - We implemented serialization using protocol buffers. Initial experiments seem promising. Store and load times appear to be only slightly slower compared to the native format. The on-disk size grew from 75mb to 250mb. However, using compression (bzip2) the protocol buffer version can be squeezed into 51mb, so that seems to go well.
> - While rewriting the serialization code, I noticed that it was hard to maintain that in a separate project. Hence, I integrated a minimal version of the index creation into the codebase.
> - The previous index creation was using the rdf dump as its datasource. The new version uses the json dump. That has several benefits, mainly with regards to needed preprocessing steps (or rather avoiding them).
> - The go version has been bumped from 1.17 to 1.18
>
> These changes are not merged into main yet. Development is ongoing in https://github.com/martaannaj/RecommenderServer/tree/protobuffer_serialization
> The following still needs to be done before merging.
>
> - test coverage for the new serialization format
> - checking whether gokart can be used with the latest go somehow. It does not support the new generics capabilities of go. Most functionality is covered by the other checking tools used.

Thanks for this update. Couple of comments:

- Nice to see that the protocol buffer version can be squeezed down to 51MB. That size probably isn't gonna cut it for a large scale deployment of this, but it is going to be fine for an A/B test, which is what this is for. If it ends up staying around and needing to be scaled up, we can revisit this one.
- A 51MB object can be pushed to gerrit, so that blocker is lifted. Do take care not to commit that file too often, as that would make the repo huge and would cause image builds to stall.

So, when this is merged and deemed ready, I think we can proceed with this.

TASK DETAIL https://phabricator.wikimedia.org/T301471
[Wikidata-bugs] [Maniphest] T301471: New Service Request SchemaTree
akosiaris added a comment.

In T301471#7964353 <https://phabricator.wikimedia.org/T301471#7964353>, @Michaelcochez wrote:

> Regarding the option of using a batch of queries to an external database; the issue is that what we are creating is a specialized index specifically for what we need. What we perform is a tree traversal where at each node a new decision is made. To do something like this with a database, one would have to fire a series of queries where each of the queries is dependent on all the results of the previous ones.

Yes. The point is that, depending on a couple of things, like what type of tree you are using (binary search tree, AVL tree, B-tree, red-black tree etc.) and the number of nodes in the tree, and given that these things tend to have a complexity of `O(log n)`, it quite possibly could be fine to bulk fetch the entirety of a branch, from the starting point of the branch down to the leaf nodes, in 1 query. Depending on the value of `n`, that might mean that after issuing something like 2-5 requests, the tree would have been trimmed down enough to fetch the entirety of what remains in a single query and just finish up the remaining work locally. This can be especially useful e.g. for recommendations, where one probably wants multiple leaf nodes and not just 1.

TASK DETAIL https://phabricator.wikimedia.org/T301471
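As a rough sketch of the "bulk fetch a whole branch in 1 query" idea from the comment above, the following uses SQLite's recursive common table expressions via Python's standard `sqlite3` module. The `nodes` table, its columns and the sample rows are hypothetical; the real SchemaTree index is an in-memory Go structure, and whether it maps cleanly onto an adjacency-list table like this one is an open assumption.

```python
import sqlite3

# Hypothetical adjacency-list layout: each node points at its parent.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nodes (id INTEGER PRIMARY KEY, parent_id INTEGER, label TEXT)"
)
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?, ?)",
    [
        (1, None, "root"),
        (2, 1, "a"),
        (3, 1, "b"),
        (4, 2, "a1"),
        (5, 2, "a2"),
        (6, 3, "b1"),
    ],
)


def fetch_branch(conn, root_id):
    """Fetch the node `root_id` and all of its descendants in one round trip."""
    return conn.execute(
        """
        WITH RECURSIVE branch(id, parent_id, label) AS (
            SELECT id, parent_id, label FROM nodes WHERE id = ?
            UNION ALL
            SELECT n.id, n.parent_id, n.label
            FROM nodes AS n
            JOIN branch AS b ON n.parent_id = b.id
        )
        SELECT id, parent_id, label FROM branch ORDER BY id
        """,
        (root_id,),
    ).fetchall()


# One query brings back the whole branch under node 2 ...
branch = fetch_branch(conn, 2)

# ... and the rest of the traversal happens locally, without further queries.
parents = {parent_id for _, parent_id, _ in branch}
leaves = [label for node_id, _, label in branch if node_id not in parents]

print(branch)  # [(2, 1, 'a'), (4, 2, 'a1'), (5, 2, 'a2')]
print(leaves)  # ['a1', 'a2']
```

In this sketch the decision of which branch to descend into still has to be made by the caller. The saving is that, once a branch is small enough, a single query returns all of its leaf candidates instead of one query per node.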
[Wikidata-bugs] [Maniphest] T355685: Migrate Termbox SSR from Node 16 to 18
akosiaris added a comment.

The deeper reason behind most of this mess is probably the uniqueness of the `test` release. There is no other environment where we currently have a `test` release, and thus some of the assumptions made elsewhere to provide functionality don't apply to it. Service mesh support as well as the DNS records are such exceptions, and the difference in configuration to reflect the above is a consequence.

My gut feeling, probably supported by some stuff in this task, says that the high level end-result is wasted effort every time some actions need to be taken that (even tangentially) affect termbox. Either in T334064 <https://phabricator.wikimedia.org/T334064> or in this task, special consideration needed/needs to happen to accommodate the `test` release. Some of these thoughts were also captured (or at least alluded to) in T226814 <https://phabricator.wikimedia.org/T226814> when the test release was introduced, albeit not so clearly stated (and the situation has changed considerably since 2019).

My high level suggestion would be to re-evaluate if the `test` helm release actually serves a useful purpose (I know it serves `test.wikidata.org` but it apparently gets updated very infrequently. All termbox releases have been at the same version for 10 months now, so can't we just have `test.wikidata.org` use the main one?). If not, let's just stop having it. If yes, we might need to kick the can down the road a bit more until we decide we need to somehow support this type of helm release, because we currently have no other use cases and thus no current plans to support such uses.

In T355685#9481732 <https://phabricator.wikimedia.org/T355685#9481732>, @Lucas_Werkmeister_WMDE wrote:

> So maybe there’s a reason why (IIUC) `values-test.yaml` directly connects to `mw-api-int-ro.discovery.wmnet:4446`, while `values.yaml` talks to `localhost:6500` which according to `.fixtures.yaml` is a proxy to `mw-api-int.discovery.wmnet:4446`.

Fixtures are test/CI data; they aren't used outside of that scope. For the same reason, they are often dummy data and might or might not reflect some actual situation (in this case they do reflect reality, but that's more happenstance than anything else).

TASK DETAIL https://phabricator.wikimedia.org/T355685
[Wikidata-bugs] [Maniphest] T355685: Migrate Termbox SSR from Node 16 to 18
akosiaris added a comment.

In T355685#9484621 <https://phabricator.wikimedia.org/T355685#9484621>, @Lucas_Werkmeister_WMDE wrote:

> In T355685#9484091 <https://phabricator.wikimedia.org/T355685#9484091>, @akosiaris wrote:
>
>> My high level suggestion would be to re-evaluate if the `test` helm release actually serves a useful purpose (I know it serves `test.wikidata.org` but it apparently gets updated very infrequently. All termbox releases have been at the same version for 10 months now, so can't we just have `test.wikidata.org` use the main one?).
>
> IMHO it’s useful to be able to test a new Termbox version on Test Wikidata before deploying it to Wikidata – but as we’ve seen in this task, the current setup doesn’t support that perfectly, because there are too many differences between in the `test` release.

Agreed on the last part. On the first part, it depends on what a failure of Termbox would mean for your end users and whether it indeed makes sense to have 1 more safety net (in addition to staging). It's a product decision as you say.

If it would help them make that decision, the dashboard for the `/termbox` API endpoint is at https://grafana-rw.wikimedia.org/d/wJRbI7FGk/termbox?orgId=1&var-dc=thanos&var-site=All&var-service=termbox&var-prometheus=k8s&var-container_name=All&from=now-6M&to=now&viewPanel=12&editPanel=12

A quick reading over the last 12 months shows a multimodal distribution. It's split in 2 main sections: 1 that's before June 2023 and after mid-November 2023, and the in-between (Northern Summer+Autumn, let's call it). Traffic during the in-between period apparently tripled and then subsided again. I have no idea if this is a seasonal effect or a result of some code changes. In any case, the amount of rps implies a small number of concurrent users globally, so there's an argument to be made that it might be OK to not have a `testing` ground.

> Would it be possible to have just one helm release, but have Test Wikidata use the `staging` cluster while Wikidata uses the `eqiad` and `codfw` clusters?

Meaning merging the functionality of `test` into the functionality of the `staging` release? It certainly is possible, although that would mean overloading the functionality of the `staging` release. We do have a big open question of what the `staging` releases mean to deployers after all and whether they indeed find them useful. I am a bit ambivalent about that approach, but it certainly is possible, and if product people deem it useful to have a test release, we can go down that path.

> Otherwise, I think it wouldn’t be the end of the world if we just lost this ability, and always had to deploy to Test Wikidata and Wikidata together; the impact is that mobile users without JavaScript lose access to terms until the deployer notices the problem and rolls back to the old version, which should be accceptable IMHO. (Though I’d want to check that with Product if we decide to go this way.)

Thanks for this input, I appreciate it.

TASK DETAIL https://phabricator.wikimedia.org/T355685
[Wikidata-bugs] [Maniphest] T355685: Migrate Termbox SSR from Node 16 to 18
akosiaris added a comment.

In T355685#9490230 <https://phabricator.wikimedia.org/T355685#9490230>, @Lucas_Werkmeister_WMDE wrote:

> In T355685#9490204 <https://phabricator.wikimedia.org/T355685#9490204>, @akosiaris wrote:
>
>>> Would it be possible to have just one helm release, but have Test Wikidata use the `staging` cluster while Wikidata uses the `eqiad` and `codfw` clusters?
>>
>> Meaning merging the functionality of `test` in the functionality of the `staging` release ? It certainly is possible, although that would mean overloading the functionality of the `staging` release. We do have an open big question of what the `staging` releases mean to deployers after all and whether they indeed find them useful. I am a bit ambivalent about that approach, but it certainly is possible and if product people deemed it is useful to have a test release, we can go down that path.
>
> Yeah, that’s certainly an open question for me – I don’t know what the current functionality of the `staging` release is.

It was always envisioned as a safety net. One could deploy there before deploying to production in order to catch errors that standard CI/CD failed to catch. That being said, I think that it's natural for all these kinds of environments to obtain more roles than the intended ones, and I don't think we have made a very consistent effort to communicate what the vision was.

> FWIW, when I’ve deployed new Termbox versions in the past, I never tested the deployment “directly” (though I assume it would be possible – `curl` some internal URL?) – I only ever tested it through MediaWiki, by looking at new items on Wikidata or Test Wikidata and checking whether they had a server-side rendered termbox or not. But IIUC, this only allows testing two of the possible release+cluster combinations: Wikidata targets the production release on the eqiad/codfw clusters, I assume, while Test Wikidata targets some other combination (I don’t know which one).

There is a loop here.

- `wikidata.org` uses the `production` release of each helmfile environment (eqiad/codfw to match our DCs). The `production` releases use `wikidata.org` in their own turn (that's the loop I was pointing out).
- The `staging` release isn't being used by anything. It is using wikidata.org production as well (you can tell by the fact that the only thing it overrides from the main values.yaml file is the number of replicas).
- `test.wikidata.org` uses the `test` release. The `test` release uses `test.wikidata.org` in turn (same loop as above).

As for a curl request example, here it is:

    deploy2002:~$ curl https://staging.svc.eqiad.wmnet:4004/_info
    {"name":"wikibase-termbox","version":"0.1.0"}

Your test release is accessible as:

    deploy1002:~$ curl http://staging.svc.eqiad.wmnet:3031/_info
    {"name":"wikibase-termbox","version":"0.1.0"}

Note the difference in ports and HTTPS vs HTTP. The `test` release, being the unique thing that it is, doesn't have TLS support as it doesn't use the service mesh.

> (I also just noticed that `helmfile.yaml` lists //three// releases: production, staging, and test. ~~I don’t think I was aware of the test one before, to be honest.~~ Edit: Nonsense, that’s the one I updated in this Gerrit change <https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/992387>. But there’s definitely //something// about the releases and clusters/environments that I didn’t realize, because I previously thought of it as 2×3 combinations, when it seems to be more 3×3. Maybe the thing I missed is that the name “staging” is used both for a release and for an environment/cluster?)

It's 4 releases in total: 1 `production` release per helmfile environment (or cluster/DC/data center[1]) and 2 releases, named `staging` and `test`, that both reside in a kubernetes cluster named `staging`. The corresponding environment in helmfile is also named `staging`.

Some of the above can possibly be treated as implementation details. There is nothing forcing us to have the staging and test releases in a different cluster/environment; they could also reside in the eqiad/codfw ones (addressing them would be a little bit different, but not by much). We just chose to go that way for some historical reasons.

[1] You can use those terms kinda interchangeably for this specific discussion we have right now. That might not always be the case, but it sure is right now.

TASK DETAIL https://phabricator.wikimedia.org/T355685
[Wikidata-bugs] [Maniphest] T355685: Migrate Termbox SSR from Node 16 to 18
akosiaris added a comment.

In T355685#9490871 <https://phabricator.wikimedia.org/T355685#9490871>, @Lucas_Werkmeister_WMDE wrote:

> Thanks a lot – I’ve added some of that information at https://wikitech.wikimedia.org/wiki/WMDE/Wikidata/SSR_Service#Deployment where it will hopefully be helpful in future.
>
> With your explanation, I think the `test` and `staging` releases are each somewhat useful (though I wouldn’t mind if you want to remove one of them either). Additionally, it sounds like it would be useful to make the `test` release in particular less special; I guess ideally, `values-test.yaml` would override the `config.public.WIKIBASE_REPO` (`test.wikidata.org` instead of `www.wikidata.org`) and the `main_app.version` (so we can bump this version before the `production` one), but almost nothing else? But to me that seems like a separate task. What do you think?

Definitely a different task. I am also not at all sure right now that the test release can easily be folded in like that; we'll have to see if the service mesh is able to support >1 release being exposed like that.

> Meanwhile, we should still fix the `localhost` issue of the `production` release. My understanding is that changing `localhost` to `127.0.0.1` might work, but T355686 <https://phabricator.wikimedia.org/T355686> has been proposed as an alternative solution that might be more sustainable; do you have any preference which one we should go for?

T355686 <https://phabricator.wikimedia.org/T355686> is the preferable approach here. It solves the problem more generically by having envoy bind dual stack, and avoids having every single application hardcode localhost to 127.0.0.1.

> (I was also wondering why the `HEALTHCHECK_QUERY` in `values.yaml`, which looks correct to me, didn’t prevent the broken deployment – but as far as I can tell, it’s not actually connected to any Kubernetes liveness/readiness/startup probes like I had assumed. It ends up in some OpenAPI spec `x-amples` (`curl 'https://staging.svc.eqiad.wmnet:4004/?spec'`) and that’s apparently all.)

That thing is used by https://gerrit.wikimedia.org/r/plugins/gitiles/operations/software/service-checker/+/refs/heads/master which runs in our monitoring infrastructure. It utilizes the `x-amples` stanza to construct and issue queries to the service as part of monitoring, effectively mimicking a simple "user", at least as far as the `x-amples` stanzas for every endpoint instruct it to.

TASK DETAIL https://phabricator.wikimedia.org/T355685
[Wikidata-bugs] [Maniphest] T355685: Migrate Termbox SSR from Node 16 to 18
akosiaris added a comment. In T355685#9491033 <https://phabricator.wikimedia.org/T355685#9491033>, @Lucas_Werkmeister_WMDE wrote: > In T355685#9490969 <https://phabricator.wikimedia.org/T355685#9490969>, @akosiaris wrote: > >> Definitely different task. I am also not at all sure right now that the test release can easily be folded in like that, we 'll have to see if the service mesh is able to support >1 release being exposed like that. > > Created T355955: Simplify Termbox SSR test release <https://phabricator.wikimedia.org/T355955>. > >>> Meanwhile, we should still fix the `localhost` issue of the `production` release. My understanding is that changing `localhost` to `127.0.0.1` might work, but T355686 <https://phabricator.wikimedia.org/T355686> has been proposed as an alternative solution that might be more sustainable; do you have any preference which one we should go for? >> >> T355686 <https://phabricator.wikimedia.org/T355686> is the preferable approach here, solving the problem more generically by having envoy dual stack binding and avoiding having every single application hardcoding localhost to 127.0.0.1. > > Alright, then let’s see how that task develops. I’ve set myself a calendar reminder to come back to this task in ~two weeks, because I don’t think we should have a known broken version tagged as `latest` indefinitely – if the general solution doesn’t happen soon, we should either hard-code `127.0.0.1` after all (we can always revert it later) or revert the Node 18 upgrade for now. (But that’s not meant to hurry or pressure T355686 <https://phabricator.wikimedia.org/T355686> at all, I just want to make sure we don’t forget about the Wikidata part :)) Cool, thanks for the patience. >>> (I was also wondering why the `HEALTHCHECK_QUERY` in `values.yaml`, which looks correct to me, didn’t prevent the broken deployment – but as far as I can tell, it’s not actually connected to any Kubernetes liveness/readiness/startup probes like I had assumed. 
It ends up in some OpenAPI spec `x-amples` (`curl 'https://staging.svc.eqiad.wmnet:4004/?spec'`) and that’s apparently all.) >> >> That thing is used by https://gerrit.wikimedia.org/r/plugins/gitiles/operations/software/service-checker/+/refs/heads/master which runs in our monitoring infrastructure. It utilizes the `x-amples` stanza to construct and issue queries to the service as part of monitoring, effectively mimicking a simple "user", at least as far as the x-amples stanzas for every endpoint instruct it to. > > I see, thanks. So the broken termbox would’ve shown up in monitoring sooner or later even without me testing it, but it didn’t automatically hold back the new version.
Yes. If you do want to call the tool manually, you can do so via something like:
deploy1002:~$ service-checker-swagger -t 60 termbox.svc.eqiad.wmnet https://termbox.discovery.wmnet:4004
All endpoints are healthy
Mess with the arguments a bit and you can test out all 4 releases with this. Note that in our infra we only test against the 2 `production` releases.
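To illustrate the idea of turning `x-amples` stanzas into monitoring queries, here is a minimal sketch. It is not service-checker's actual implementation: the function name `plan_checks`, the simplified spec layout (`request.params`, `request.query`, `response.status`), the `SAMPLE_SPEC` contents and the `example.invalid` host are all assumptions made for illustration. It only builds the list of requests a checker could issue and does not contact any service.

```python
from urllib.parse import urlencode


def plan_checks(spec, base_url):
    """Turn every x-amples entry of an OpenAPI-style spec into a planned check.

    Returns a list of (title, METHOD, url, expected_status) tuples.
    The spec layout handled here is a simplified assumption.
    """
    checks = []
    for path, methods in spec.get("paths", {}).items():
        for method, operation in methods.items():
            if not isinstance(operation, dict):
                continue  # skip path-level keys that are not operations
            for example in operation.get("x-amples", []):
                request = example.get("request", {})
                # Fill {placeholders} in the path template from request params.
                url_path = path
                for name, value in request.get("params", {}).items():
                    url_path = url_path.replace("{" + name + "}", str(value))
                query = urlencode(request.get("query", {}))
                url = base_url.rstrip("/") + url_path
                if query:
                    url += "?" + query
                expected = example.get("response", {}).get("status", 200)
                title = example.get("title", path)
                checks.append((title, method.upper(), url, expected))
    return checks


# Hypothetical spec fragment; the endpoint and query parameters are made up.
SAMPLE_SPEC = {
    "paths": {
        "/termbox": {
            "get": {
                "x-amples": [
                    {
                        "title": "get rendered termbox",
                        "request": {"query": {"entity": "Q1", "language": "de"}},
                        "response": {"status": 200},
                    }
                ]
            }
        }
    }
}

if __name__ == "__main__":
    for check in plan_checks(SAMPLE_SPEC, "https://example.invalid:4004/"):
        print(check)
```

A real checker would then issue each planned request and compare the response against the expected status (and possibly headers and body), which is what makes it behave like a simple "user" of the service.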
[Wikidata-bugs] [Maniphest] T355685: Migrate Termbox SSR from Node 16 to 18
akosiaris added a comment. In T355685#9491036 <https://phabricator.wikimedia.org/T355685#9491036>, @akosiaris wrote: > In T355685#9491033 <https://phabricator.wikimedia.org/T355685#9491033>, @Lucas_Werkmeister_WMDE wrote: > >> In T355685#9490969 <https://phabricator.wikimedia.org/T355685#9490969>, @akosiaris wrote: >> >>> Definitely different task. I am also not at all sure right now that the test release can easily be folded in like that, we 'll have to see if the service mesh is able to support >1 release being exposed like that. >> >> Created T355955: Simplify Termbox SSR test release <https://phabricator.wikimedia.org/T355955>. >> >>>> Meanwhile, we should still fix the `localhost` issue of the `production` release. My understanding is that changing `localhost` to `127.0.0.1` might work, but T355686 <https://phabricator.wikimedia.org/T355686> has been proposed as an alternative solution that might be more sustainable; do you have any preference which one we should go for? >>> >>> T355686 <https://phabricator.wikimedia.org/T355686> is the preferable approach here, solving the problem more generically by having envoy dual stack binding and avoiding having every single application hardcoding localhost to 127.0.0.1. >> >> Alright, then let’s see how that task develops. I’ve set myself a calendar reminder to come back to this task in ~two weeks, because I don’t think we should have a known broken version tagged as `latest` indefinitely – if the general solution doesn’t happen soon, we should either hard-code `127.0.0.1` after all (we can always revert it later) or revert the Node 18 upgrade for now. (But that’s not meant to hurry or pressure T355686 <https://phabricator.wikimedia.org/T355686> at all, I just want to make sure we don’t forget about the Wikidata part :)) > > Cool, thanks for the patience. Patches are up for review! 
[Wikidata-bugs] [Maniphest] T355685: Migrate Termbox SSR from Node 16 to 18
akosiaris added a comment. Patches have been deployed, simple curl tests as well as `service-checker-swagger` checks have passed. I double checked the diff, envoy is listening now on both IPv6 and IPv4. I think you are unblocked on this and can proceed with the migration.
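As a rough illustration of why hardcoding `127.0.0.1` and dual stack binding both address the `localhost` issue, the sketch below lists the address families a resolver offers for a host name, in order. The helper name `address_families` is hypothetical, and the premise that `localhost` may resolve to the IPv6 loopback first is an assumption about the client environment, not something confirmed in this thread.

```python
import socket


def address_families(host, port=4004):
    """Return the address families the resolver offers for host, in order."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    names = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}
    return [names.get(family, str(family)) for family, *_ in infos]


if __name__ == "__main__":
    # A literal IPv4 address can only ever yield IPv4.
    print("127.0.0.1:", address_families("127.0.0.1"))
    # What "localhost" yields, and in which order, depends on the system.
    print("localhost:", address_families("localhost"))
```

If a client tries the first family offered for `localhost` and that happens to be IPv6, a proxy listening only on IPv4 is unreachable. A listener bound on both families works regardless of the order, without every application having to hardcode an address.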
[Wikidata-bugs] [Maniphest] T273097: Create Flink Base Image
akosiaris added a comment. Given the very well put `The downside of having a base image owned by SRE means that we are reliant on them to merge any Flink version updates.` statement in the task description, I'd suggest that we don't create a base image but rather fetch the jars and put them in a repository Search controls, e.g. archiva.wikimedia.org. That would avoid unnecessary friction between the teams. All of that assumes T273097 <https://phabricator.wikimedia.org/T273097> is rejected as an approach. TASK DETAIL https://phabricator.wikimedia.org/T273097 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: akosiaris Cc: Aklapper, Gehel, akosiaris, Mstyles, MPhamWMF, CBogen, Akuckartz, Nandana, Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331 ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] T273097: Create Flink Base Image
akosiaris added a comment. Hmm, I see your point. In that case, if there isn't a good place that Search controls to upload the version of flink that they want, the most prudent way out is the debian package described in T266495 <https://phabricator.wikimedia.org/T266495>
[Wikidata-bugs] [Maniphest] T273097: Create Flink Base Image
akosiaris added a comment. In T273097#6801429 <https://phabricator.wikimedia.org/T273097#6801429>, @RKemper wrote:
> @akosiaris Is your concern with the idea of using a `flink` base image solution mainly just centered around the inefficiency/inconvenience of needing SRE to merge any flink version upgrades?
I wouldn't call it merely an inconvenience. I would call it a source of potential friction between teams; see below for an elaboration.
> Since we have an embedded SRE on search (me) and to a lesser extent Guillaume, I think it wouldn't be too much of a problem.
So, say a bus factor of 1.5 or so? Not ideal, but workable. That being said, it all depends on the urgency, see below.
> In general having our dependencies managed by a docker image will make it easier for us to be explicit about what version we're using, and it seems like the default docker-y way of doing things. Is there a technical reason why a base image might not be a good idea?
Unfortunately, even being explicit on the level of docker images won't make it easier. The reason is that the source of the content (the flink tar.gz file) will still not be under Search's control (correct me if I am wrong, I might have misunderstood something) but instead on the flink project's servers. This means that the creation of the base image will succeed only for a short period of time after bumping the version of flink, since the flink project, per my understanding, removes old versions from their servers. However, that is not the only time we build images. We semi-regularly have to rebuild the entire tree of docker images for some reason, the most usual ones being security updates in the base images, some misconfiguration or some newer features being added to our image building toolkit. What that means is that on the next shellshock/heartbleed/younameit the rebuild breaks. Then SRE comes knocking on Search's door, asking to bump the version of flink in that container because we need to rebuild it.
Now, you will obviously point out that Search will probably be relying on some version of the container, so they can keep using the old one and upgrade on their own timeline. However, during the timeframe before the upgrade happens:
- SRE has an unbuildable image which it does not control and knows little about. So it does what comes naturally, which is complain.
- Search is running an image that does not have the security upgrades.
- SRE is pushing for having images without known security vulnerabilities.
- Search is forced to alter plans and reprioritize upgrading the flink version because SRE is complaining.
So, a source of friction between teams. All of this can be solved by just moving the source of the problem (the flink .tar.gz) somewhere that Search controls so they can guarantee that a) it's the version Search wants and b) fetching it during the building phase will not fail. But if that happens, most of this discussion becomes moot, and the layer that the docker image would provide is just a potential speedup during building (depending on whether the layer is present or not on the build host), which we can do, but for the right reasons.
Interestingly, the debian package approach discussed in T266495 <https://phabricator.wikimedia.org/T266495> does have the 2 attributes outlined above, which is why it's preferable to me. It still relies on an SRE to upgrade it, of course, which is a minus, but it shares this minus with this proposal. Ideally we wouldn't even need that and any member of Search would be able to update it.
[Wikidata-bugs] [Maniphest] T273097: Create Flink Base Image
akosiaris added a comment. In T273097#6805355 <https://phabricator.wikimedia.org/T273097#6805355>, @Mstyles wrote: > @akosiaris I just learned that there are archive links that have all of the Flink packages. I'm proposing that we close both this ticket and https://phabricator.wikimedia.org/T266495 and just use the Flink archive links where we won't have to worry about the packages no longer being available. Oh, that's nice. +1 on my side.
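The two properties asked for earlier in this thread (the build gets exactly the version Search wants, and fetching it does not fail later) can be approximated by pinning a version and a checksum against a stable archive location. The sketch below is only an illustration: the helper names `archive_url` and `verify_sha512` are hypothetical, and the URL layout under archive.apache.org is an assumption, since the thread only mentions "archive links" without spelling them out.

```python
import hashlib


def archive_url(version, scala="2.12"):
    """Build a download URL for a pinned Flink release.

    Assumption: releases stay available under this archive layout.
    """
    return (
        "https://archive.apache.org/dist/flink/"
        f"flink-{version}/flink-{version}-bin-scala_{scala}.tgz"
    )


def verify_sha512(data, expected_hex):
    """Raise ValueError unless data matches the pinned SHA-512 checksum."""
    actual = hashlib.sha512(data).hexdigest()
    if actual != expected_hex.strip().lower():
        raise ValueError(f"checksum mismatch: expected {expected_hex}, got {actual}")


if __name__ == "__main__":
    print(archive_url("1.12.1"))
    payload = b"stand-in for the downloaded tarball"
    pinned = hashlib.sha512(payload).hexdigest()
    verify_sha512(payload, pinned)  # passes silently when the bytes match
```

With the version and checksum pinned in the image definition, a rebuild months later either fetches the same bytes or fails loudly, instead of silently depending on whatever upstream still hosts.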
[Wikidata-bugs] [Maniphest] T199219: WDQS should use internal endpoint to communicate to Wikidata
akosiaris added a comment. For what it's worth, we now have the services proxy (envoy based) with persistent connections and doing TLS on its own, so any costs from switching to TLS connections to the internal LVS services will be largely mitigated. In fact, if anything I expect the latencies from that part of the equation to decrease since it won't have to go through a proxy and the edge caches. The question of whether bypassing the edge caches will hugely increase the load on mediawiki still stands, but there have been many changes on the mediawiki caching infrastructure too (e.g. we now have onhost memcached) so that might very well be largely mitigated as well. I think we ought to revisit this indeed. Having the updater go through an extra 4 layers of the infrastructure (outgoing proxy + 3 layers of edge caches), one of which (the outgoing proxy) is in NO WAY deemed critical enough to have High Availability, helps with neither easy debugging nor ease of operations during maintenance/emergencies. TASK DETAIL https://phabricator.wikimedia.org/T199219 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: akosiaris Cc: Ladsgroup, akosiaris, BBlack, Aklapper, Smalyshev, Gehel, MPhamWMF, CBogen, Akuckartz, Nandana, Namenlos314, Jony, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, Vali.matei, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331 ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] T199219: WDQS should use internal endpoint to communicate to Wikidata
akosiaris added a comment. In T199219#6855641 <https://phabricator.wikimedia.org/T199219#6855641>, @Gehel wrote: > In terms of implementation in our new updater, the comment from @BBlack is the starting point: > > In T199219#4416896 <https://phabricator.wikimedia.org/T199219#4416896>, @BBlack wrote: > >> For this very particular case, the simplest way would be to do your language/platform/library's equivalent of: >> >> curl -H 'Host: www.wikidata.org' 'https://appservers-ro.discovery.wmnet/wiki/Special:EntityData/Q2408871.ttl?nocache=1530836328152&flavor=dump' >> >> That is, use the internal service endpoint hostname in the URI for TLS connection purposes, but then explicitly set the request `Host` header to `www.wikidata.org` for use at the HTTP level.
Indeed. But with a minor correction: instead of `appservers-ro`, please use `api-ro` in order to hit the API cluster, as the appserver cluster is meant to be the //end-user browser serving cluster//.
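For reference, a Python equivalent of the quoted curl call might look like the sketch below. The function name `build_entity_request` is hypothetical, the `api-ro.discovery.wmnet` host name is inferred from the `api-ro` correction above, and the `nocache` parameter from the curl example is left out for brevity. The request is only constructed here, not sent.

```python
import urllib.request


def build_entity_request(entity_id,
                         internal_host="api-ro.discovery.wmnet",
                         public_host="www.wikidata.org"):
    """Build a request that connects to the internal endpoint but asks for
    the public site at the HTTP level via an explicit Host header."""
    url = (
        f"https://{internal_host}/wiki/Special:EntityData/"
        f"{entity_id}.ttl?flavor=dump"
    )
    # The URL host is used for the TCP/TLS connection; the Host header
    # tells the backend which site the request is for.
    return urllib.request.Request(url, headers={"Host": public_host})


if __name__ == "__main__":
    req = build_entity_request("Q2408871")
    print(req.full_url)
    print(req.get_header("Host"))
```

Passing the request to `urllib.request.urlopen` would connect to the internal host while sending `Host: www.wikidata.org`, since urllib only fills in a default `Host` header when none has been set.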
[Wikidata-bugs] [Maniphest] T276550: Missing alerts for Termbox staging and test services
akosiaris closed this task as "Invalid". akosiaris added a comment. I am inclined to close this as `Invalid`. Regardless of the amount of time staging was broken for, staging is not meant to have alerts on purpose. That being said, feel free to reopen. TASK DETAIL https://phabricator.wikimedia.org/T276550 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: akosiaris Cc: akosiaris, WMDE-leszek, JMeybohm, Tarrow, Addshore, Aklapper, Jakob_WMDE, maantietaja, wkandek, Akuckartz, darthmon_wmde, Nandana, jijiki, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, MSGJ, abian, Wikidata-bugs, aude, Lydia_Pintscher, Mbch331, Dzahn ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs