[Wikidata-bugs] [Maniphest] [Unassigned] T105845: Page components / content widgets
GWicke removed GWicke as the assignee of this task. TASK DETAIL https://phabricator.wikimedia.org/T105845 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: leila, Reasno, SBisson, MZMcBride, Mholloway, RandomDSdevel, jmadler, Bianjiang, LikeLifer, MGChecker, -jem-, Daniel_Mietchen, StudiesWorld, Kelson, Jonas, daniel, Jhernandez, MrStradivarius, JanZerebecki, Quiddity, mobrovac, ssastry, Tgr, Ltrlg, Inez, cscott, TrevorParscal, Jdlrobson, GWicke, GoranSMilovanovic, QZanden, Luke081515, Jrf, Wikidata-bugs, aude, Gryllida, jayvdb, fbstj, RobLa-WMF, santhosh, Arlolra, Jdforrester-WMF, Mbch331, Rxy, Jay8g, bd808, Krenair, Legoktm ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Unassigned] T99088: [RFC] Evolving our content platform: Content adaptability, structured data and caching
GWicke removed GWicke as the assignee of this task. TASK DETAIL https://phabricator.wikimedia.org/T99088
[Wikidata-bugs] [Maniphest] [Commented On] T175316: Very large jobs posted by Wikidata
GWicke added a comment. Looks like adding the JSON_UNESCAPED_UNICODE flag to the json_encode call should do it: http://php.net/manual/en/function.json-encode.php TASK DETAIL https://phabricator.wikimedia.org/T175316
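The size effect of leaving Unicode unescaped can be illustrated outside PHP. The following is only a sketch in Python, where `json.dumps(..., ensure_ascii=False)` plays roughly the role that JSON_UNESCAPED_UNICODE plays for PHP's json_encode; the `params` payload is invented for illustration and is not taken from an actual Wikidata job.

```python
import json

# Hypothetical job parameter with non-ASCII characters (not real job data).
params = {"title": "Größe_und_Maße_日本語"}

# Default behaviour: every non-ASCII character becomes a \uXXXX escape.
escaped = json.dumps(params)
# Analogue of JSON_UNESCAPED_UNICODE: characters are emitted as-is.
unescaped = json.dumps(params, ensure_ascii=False)

# Compare the serialized sizes in bytes, as they would be stored or sent.
escaped_bytes = len(escaped.encode("utf-8"))
unescaped_bytes = len(unescaped.encode("utf-8"))

print("escaped:  ", escaped_bytes, "bytes")
print("unescaped:", unescaped_bytes, "bytes")
```

Both strings decode to the same data; only the encoded size differs, which matters most for payloads dominated by non-Latin page titles.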
[Wikidata-bugs] [Maniphest] [Updated] T175316: Very large jobs posted by Wikidata
GWicke added a comment. Raised priority, as this a) is blocking the migration to the Kafka job queue backend (T157088), and b) is likely already causing performance and possibly reliability issues in the current job queue. TASK DETAIL https://phabricator.wikimedia.org/T175316
[Wikidata-bugs] [Maniphest] [Raised Priority] T175316: Very large jobs posted by Wikidata
GWicke raised the priority of this task from "Normal" to "High". TASK DETAIL https://phabricator.wikimedia.org/T175316
[Wikidata-bugs] [Maniphest] [Edited] T175316: Very large jobs posted by Wikidata
GWicke updated the task description. CHANGES TO TASK DESCRIPTION ... "2653965": [6, "Meyers_b9_s0043.jpg"], (..a few million further page entries...) }, ... TASK DETAIL https://phabricator.wikimedia.org/T175316
[Wikidata-bugs] [Maniphest] [Commented On] T175316: Very large jobs posted by Wikidata
GWicke added a comment. @Pchelolo, based on our previous conversation about this, I am assuming that the bulk of the task is a very large list of pages. Is this correct? TASK DETAIL https://phabricator.wikimedia.org/T175316
[Wikidata-bugs] [Maniphest] [Commented On] T173710: Job queue is increasing non-stop
GWicke added a comment. I updated https://gerrit.wikimedia.org/r/#/c/295027/ to apply on current master. This removes CDN purges from HTMLCacheUpdate, and only performs them after RefreshLinks, and only if nothing else caused a re-render since. With this patch applied, we should be able to reduce the throttling for HTMLCacheUpdate jobs without endangering the CDN infrastructure with bursts of purges. TASK DETAIL https://phabricator.wikimedia.org/T173710
[Wikidata-bugs] [Maniphest] [Commented On] T173710: Job queue is increasing non-stop
GWicke added a comment. HTMLCacheUpdate root job timestamp distribution, jobs executed within the last 15 hours:

   jobs  root job date
   1233  20170407
   8237  20170408
     18  20170423
     18  20170426
     20  20170429
     50  20170430
     18  20170502
     18  20170504
     20  20170509
     10  20170512
     18  20170513
     16  20170523
     22  20170528
     10  20170529
     40  20170606
     20  20170617
     18  20170622
     21  20170625
     16  20170627
     10  20170628
     10  20170630
     36  20170701
     20  20170705
     28  20170708
     18  20170712
     10  20170715
     16  20170717
     18  20170724
     42  20170725
     20  20170726
     20  20170728
     17  20170729
     34  20170803
     46  20170804
     30  20170805
     50  20170807
     54  20170808
    260  20170809
    137  20170810
     16  20170811
     17  20170812
     84  20170813
     36  20170814
     10  20170815
     72  20170816
    445  20170817
     82  20170818
     67  20170819
  21452  20170820
   1825  20170821
     81  20170822
    176  20170823
   4810  20170824
   9773  20170825
  21842  20170826
 218770  20170827
   8087  20170828
 183142  20170829
3805398  20170830

TASK DETAIL https://phabricator.wikimedia.org/T173710
[Wikidata-bugs] [Maniphest] [Updated] T173710: Job queue is increasing non-stop
GWicke added a comment. A possible contribution to the backlog building could be the infinite retry / immortal job problem described in T73853. Looking for old htmlCacheUpdate root jobs from April still executing over four months later (!) via

    grep htmlCacheUpdate runJobs.log | grep -c 'rootJobTimestamp=201704'

in mwlog1001:/srv/mw-log yields 9208 executions, just today. Interestingly, jobs from May, June, and July are much less common (hundreds). Considering that HTMLCacheUpdateJob basically only updates touched timestamps in the DB, and then quickly fires off CDN purges, seeing anything but zero ancient jobs might mean that T73853 is not actually resolved yet.

To actually establish whether this significantly contributes to the current backlog, we would need to look at the distribution of rootJobTimestamp values for htmlCacheUpdates from July, especially for the period since the backlog growth really started around the 8th.

The general HTMLCacheUpdate / purge volume problem was previously discussed in T124418. At the time, I posted https://gerrit.wikimedia.org/r/#/c/295027/, which would move the vast majority of CDN purges to RefreshLinksJob, which would make purges less bursty. I think we could dust this off quite easily.

Finally, since EventBus and Kafka were brought up, let me clarify: No jobs are executed via EventBus / Kafka so far. We did start writing copies of job specs to EventBus on August 2nd (phase0), and then enabled this for phase1 on the 16th. The timing does not align with the backlog rise, so it seems unlikely that the double production significantly contributes to this issue.

TASK DETAIL https://phabricator.wikimedia.org/T173710
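The per-day distribution of rootJobTimestamp values mentioned above could be computed along the following lines. This is a minimal sketch, not the command actually used: the only detail taken from the comment is that log lines contain the job type and a `rootJobTimestamp=YYYYMMDD...` field; the function name `root_job_days` and the sample log lines are hypothetical.

```python
import re
from collections import Counter

# Captures the YYYYMMDD prefix of the root job timestamp.
ROOT_TS = re.compile(r"rootJobTimestamp=(\d{8})")


def root_job_days(lines, job_type="htmlCacheUpdate"):
    """Count executions of one job type per root-job day."""
    counts = Counter()
    for line in lines:
        if job_type not in line:
            continue  # same effect as the first grep in the pipeline
        match = ROOT_TS.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Invented log lines; the real runJobs.log layout may differ.
sample_log = [
    "2017-08-30 10:00:01 htmlCacheUpdate Template:A rootJobTimestamp=20170407093000 status=ok",
    "2017-08-30 10:00:02 htmlCacheUpdate Template:B rootJobTimestamp=20170407120000 status=ok",
    "2017-08-30 10:00:03 refreshLinks Template:C rootJobTimestamp=20170830095900 status=ok",
    "2017-08-30 10:00:04 htmlCacheUpdate Template:D rootJobTimestamp=20170830095959 status=ok",
]

counts = root_job_days(sample_log)
for day in sorted(counts):
    print(f"{counts[day]:>8} {day}")
```

Reading from a real file would only require passing an open file handle instead of `sample_log`, since the function iterates over lines.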
[Wikidata-bugs] [Maniphest] [Updated] T173710: Job queue is increasing non-stop
GWicke added a project: Services (watching). TASK DETAIL https://phabricator.wikimedia.org/T173710
[Wikidata-bugs] [Maniphest] [Commented On] T172832: Investigate use-cases for delayed job executions
GWicke added a comment.

In T172832#3540031, @Mattflaschen-WMF wrote:
> There are three considerations relevant to Echo:
> Delayed notifications (T156808: Back-end infrastructure for timed notifications in Echo)
> 1a. Article reminder notifications (T2582: Remind me of this article in X days)
> 1b. User group expiry notifications (T153817: Notify users when their user group membership is about to expire, or has expired)
> "Batching & rate limiting for Echo notifications".
> So that leaves #2. I think this is a reference to ProcessEchoEmailBatch. We currently use a OS cron job for that. If there's another use case you're thinking of, please let us know. (I also checked the existing Echo jobs, and they do not use getReleaseTimestamp/jobReleaseTimestamp)

@Mattflaschen-WMF, thank you for the background on Echo! Regarding use case #2, can this be fairly summarized as "wait for a limited time before sending out notifications, in order to reduce the volume with batching"?

TASK DETAIL https://phabricator.wikimedia.org/T172832
[Wikidata-bugs] [Maniphest] [Commented On] T172832: Investigate use-cases for delayed job executions
GWicke added a comment. Some use cases from today's eventbus sync discussion:

- Batching & rate limiting for Echo notifications
- CirrusSearch for rate limiting / batching
- Wikidata for rate limiting, partly work-around for lack of dependency tracking
- Delayed HTCP purging (by ~10s) to account for replication lag

TASK DETAIL https://phabricator.wikimedia.org/T172832
[Wikidata-bugs] [Maniphest] [Updated] T170860: Need ability to block specific query sources for WDQS
GWicke added a comment. T167906: Make API usage limits easier to understand, implement, and more adaptive to varying request costs / concurrency limiting might help with this problem, especially if you can reduce the allowed concurrency to a fairly small value. TASK DETAIL https://phabricator.wikimedia.org/T170860
[Wikidata-bugs] [Maniphest] [Updated] T167787: [Spike 2hr] Investigate ability for page previews in wikidata to appear in user's preferred language
GWicke added a comment.

In T167787#3364625, @phuedx wrote:
> In T167787#3364314, @GWicke wrote:
>> We discussed this in the Reading / Services sync meeting. One question that came up in the discussion is whether including all languages in the response would be feasible from a performance perspective. The advantage of this direction would be no cache fragmentation, the downside a larger response.
> Which is the more expensive of the two? Would you like us to spend time figuring out the p50, p75, p95 number of descriptions and labels per item to get a better understanding of the expected size of the response?

Ultimately, I think the median / p99 compressed sizes of responses with a single vs. all languages would be helpful. It's really more about getting an idea of the ballpark we are talking about -- is this < 16k, or are we talking about >100 kb?

>> If we determine that returning all languages does not make sense, then we could consider using the accept-language header as the general language selection mechanism, in line with T122942: RFC: Support language variants in the REST API.
> Reading through that RFC, it feels like this is the accepted solution. Would it really make sense to special-case this endpoint? (I guess this depends on your answer to the first question).

The caveat is that we discussed Accept-Language in the context of supporting language variants in the REST API. There is also {T114662: RFC: Per-language URLs for multilingual wiki pages}, which is focused on Wikidata, but does not consider APIs so far.

TASK DETAIL https://phabricator.wikimedia.org/T167787
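The single-language versus all-languages size comparison asked for above could be estimated roughly as follows. This is a sketch under stated assumptions: the response shape, the sample labels and descriptions, and the helper name `compressed_size` are all made up for illustration, and gzip stands in for whatever compression the CDN actually applies.

```python
import gzip
import json


def compressed_size(payload):
    """Size in bytes of the gzip-compressed JSON encoding of payload."""
    raw = json.dumps(payload, ensure_ascii=False, sort_keys=True).encode("utf-8")
    return len(gzip.compress(raw))


# Hypothetical per-language labels and descriptions for one item.
all_languages = {
    "en": {"label": "Douglas Adams", "description": "English writer and humorist"},
    "de": {"label": "Douglas Adams", "description": "britischer Schriftsteller"},
    "fr": {"label": "Douglas Adams", "description": "écrivain et humoriste anglais"},
    "es": {"label": "Douglas Adams", "description": "escritor y humorista británico"},
    "ja": {"label": "ダグラス・アダムズ", "description": "イギリスの作家"},
}
single_language = {"en": all_languages["en"]}

single_bytes = compressed_size(single_language)
all_bytes = compressed_size(all_languages)
print("single language:", single_bytes, "bytes compressed")
print("all languages:  ", all_bytes, "bytes compressed")
```

Running the same measurement over a sample of real items would give the median / p99 figures; the toy data here says nothing about the actual ballpark.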
[Wikidata-bugs] [Maniphest] [Updated] T167787: [Spike 2hr] Investigate ability for page previews in wikidata to appear in user's preferred language
GWicke added a comment. We discussed this in the Reading / Services sync meeting. One question that came up in the discussion is whether including all languages in the response would be feasible from a performance perspective. The advantage of this direction would be no cache fragmentation, the downside a larger response. If we determine that returning all languages does not make sense, then we could consider using the accept-language header as the general language selection mechanism, in line with T122942: RFC: Support language variants in the REST API. TASK DETAIL https://phabricator.wikimedia.org/T167787
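For the second option, language selection via the accept-language header, a server-side choice could look roughly like this. It is a simplified sketch, not the mechanism proposed in T122942: the function name `pick_language` and the fallback default are assumptions, and wildcard (`*`) entries are not handled.

```python
def pick_language(accept_language, available, default="en"):
    """Return the best available language for an Accept-Language value."""
    candidates = []
    for position, part in enumerate(accept_language.split(",")):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0  # a missing q-value means full preference
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality <= 0:
            continue  # q=0 marks a language as not acceptable
        # Sort by descending quality, then by position in the header.
        candidates.append((-quality, position, tag.strip().lower()))

    for _, _, tag in sorted(candidates):
        if tag in available:
            return tag
        primary = tag.split("-")[0]  # fall back from "de-ch" to "de"
        if primary in available:
            return primary
    return default


print(pick_language("de-CH;q=0.8, fr;q=0.9, en;q=0.5", {"en", "de"}))
```

Because the response then varies by header, caches would need to key on the selected language, which is the cache fragmentation trade-off described in the comment.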
[Wikidata-bugs] [Maniphest] [Commented On] T157014: CONSULTATION/PLAN: Managing Complex State and GUI on MediaWiki (e.g. for Wikidata/Wikibase UI)
GWicke added a comment.

In T157014#3168034, @daniel wrote:
>> AFAIK we do have infrastructure for running node servers with proper caching, as part of the rest services layer. If you create endpoints that return structured data you could consume those from the PHP backend or the JS client.
> We have node based rendering infrastructure, but as far as I know, we cannot use them to directly serve page views. Because a) all the skin/chrome stuff is done in PHP and b) we currently only use node when editing, and don't have the server capacity to handle several orders of magnitude more hits that would be needed for serving page views.

We aren't using node for UI rendering at scale, but at the API level the REST API is seeing 15 minute averages of more than 6k requests per second most days. This is comparable to the action API: https://grafana-admin.wikimedia.org/dashboard/db/api-summary?orgId=1&from=now-30d&to=now
The vast majority of this traffic is driven by read views and data retrieval. Cache hit rate is already ~95%, and rising with traffic levels.

> So, we can use node.js to render stuff, but we still need to loop it through PHP to add chrome.

This depends on the roll-out strategy. The standard chrome elements are actually not *that* hard to replace. Long tail special-case interfaces rendered in the content area like special pages etc will take longer, or might not make sense to re-do at all. See https://phabricator.wikimedia.org/T114596.

> We also need caching for this on the same level as the parser cache. That could be done on the HTTP level between PHP and node. Does this exist?

We touched on this in the mail thread. Most API responses and server-side UI renders for anonymous users would be cached in the CDN / Varnish.

TASK DETAIL https://phabricator.wikimedia.org/T157014
[Wikidata-bugs] [Maniphest] [Commented On] T161527: Canonical data URIs and URLs for machine readable page content
GWicke added a comment. I think the URL most users would consider canonical is /wiki/{title}. Wouldn't this already provide a reasonable URL for the concept of the page? TASK DETAIL https://phabricator.wikimedia.org/T161527
[Wikidata-bugs] [Maniphest] [Commented On] T161527: Canonical data URIs and URLs for machine readable page content
GWicke added a comment.

In T161527#3135095, @Smalyshev wrote:
> I think we have several concepts there that needs to be refined. Canonical object URI - this is the URI that uniquely identifies an object in Wikimedia world, and, by extension, in the whole world of linked data.

I think there is a spectrum here between reserving the object domain only for abstract concepts & treating most things as representations, and making some of those representations first-class objects, related to the underlying concept via an is-representation-of edge. For addressable resources in the REST sense, I think that usability concerns should play a prominent role in finding the right balance between URLs and headers. For example, I don't think that using a single URL for all resources related to the concept of "Barack Obama" would make a lot of sense to API users.

TASK DETAIL https://phabricator.wikimedia.org/T161527
[Wikidata-bugs] [Maniphest] [Updated] T161527: Canonical data URIs and URLs for machine readable page content
GWicke added a project: Services (watching). TASK DETAIL https://phabricator.wikimedia.org/T161527
[Wikidata-bugs] [Maniphest] [Commented On] T161527: Canonical data URIs and URLs for machine readable page content
GWicke added a comment.

> However, URIs by nature should not include interface version information, because they identify the resource independently of representation.

The REST API versioning policy explicitly describes how representation concerns are handled through content negotiation, and not by incrementing major API versions. Changes in major API versions are expected to be extremely rare. The major API version is basically an insurance policy for the case that we'd want to introduce a fundamentally different URL layout, without breaking existing users and resource references.

> What URI structure would you propose?

Within the REST API, data associated with pages is typically exposed using the /api/rest_v1/page/{type}/{title}{/revision} pattern. Examples: HTML, summary, data-parsoid, PDF.

TASK DETAIL https://phabricator.wikimedia.org/T161527
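The /api/rest_v1/page/{type}/{title}{/revision} pattern can be expanded mechanically. The sketch below is illustrative only: the helper name `page_content_url` is hypothetical, the domain and revision number are example values, and percent-encoding the whole title (including slashes) is an assumption about how titles are passed as a single path segment.

```python
from urllib.parse import quote


def page_content_url(domain, content_type, title, revision=None):
    """Expand /api/rest_v1/page/{type}/{title}{/revision} for one page."""
    # Encode the title as one path segment, so "/" in titles is escaped too.
    path = f"/api/rest_v1/page/{content_type}/{quote(title, safe='')}"
    if revision is not None:
        path += f"/{revision}"  # optional {/revision} suffix
    return f"https://{domain}{path}"


# Latest revision of a page, then a specific (made-up) revision.
print(page_content_url("en.wikipedia.org", "html", "Barack_Obama"))
print(page_content_url("en.wikipedia.org", "summary", "Barack_Obama", 123456))
```

Which representation of the content is returned would then be governed by content negotiation, as the versioning policy describes, and not by the URL itself.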
[Wikidata-bugs] [Maniphest] [Commented On] T161527: Canonical data URIs and URLs for machine readable page content
GWicke added a comment. The description of the requirements seems to fit the REST API:

- API versioning & content negotiation.
- REST URL structure.
- Integration with CDN layer.
- Machine and user readable API specs & documentation.

The REST URL hierarchy makes it quite easy to route specific end points directly to specialized backends, while still presenting a consistent & well-documented API to end users. In other words, exposing functionality through the REST API does not imply the use of RESTBase where that does not make sense.

TASK DETAIL https://phabricator.wikimedia.org/T161527
[Wikidata-bugs] [Maniphest] [Created] T149114: Reconsider wikidata support in the REST API
GWicke created this task. GWicke added projects: Services (next), Wikidata, RESTBase-API, Mobile-Content-Service. Herald added a subscriber: Aklapper.

TASK DESCRIPTION
Basically none of the current REST API end points in https://www.wikidata.org/api/rest_v1/ are actually useful currently.

- The HTML merely wraps a large JSON blob.
- Nobody is using VisualEditor to edit this JSON blob.
- None of the derived content types like mobile sections or the page summary appear obviously useful.

For the most part, Wikidata information like item descriptions is consumed through individual projects' REST APIs (such as the summary end point), rather than the Wikidata REST API itself. We also currently process all updates for these end points the same way as for any other wiki. This means a lot of unnecessary work and storage use.

So, let's revisit how we support Wikidata in the REST API:

- If we find no actual use cases for the current Wikidata REST API, then I am proposing to remove it altogether, and drop the associated storage. This saves resources, and avoids exposing an API that isn't useful or meaningfully maintained. We know that there are currently no uses for mobile sections or page summaries.
- Consider which use cases the REST API could actually help with for Wikidata, and set up a custom project config, similar to what we already do for wiktionaries.

TASK DETAIL https://phabricator.wikimedia.org/T149114
[Wikidata-bugs] [Maniphest] [Triaged] T149114: Reconsider wikidata support in the REST API
GWicke triaged this task as "Normal" priority. TASK DETAIL https://phabricator.wikimedia.org/T149114
[Wikidata-bugs] [Maniphest] [Lowered Priority] T102476: RFC: Requirements for change propagation
GWicke lowered the priority of this task from "High" to "Low". GWicke edited projects, added Services (watching); removed Services. TASK DETAIL https://phabricator.wikimedia.org/T102476
[Wikidata-bugs] [Maniphest] [Updated] T88633: [EPIC] Image-positioning service for storing and retrieving image focal points
GWicke edited projects, added Services (watching); removed Services (later). TASK DETAIL https://phabricator.wikimedia.org/T88633
[Wikidata-bugs] [Maniphest] [Updated] T88633: [EPIC] Image-positioning service for storing and retrieving image focal points
GWicke edited projects, added Services (backlog); removed Services. TASK DETAIL https://phabricator.wikimedia.org/T88633
[Wikidata-bugs] [Maniphest] [Edited] T102476: RFC: Requirements for change propagation
GWicke edited the task description. EDIT DETAILS ...
- **Reliable RCStream**: @Ottomata has been looking into leveraging Kafka events in RCStream. This can potentially let clients catch up after being disconnected. See {T130651}
TASK DETAIL https://phabricator.wikimedia.org/T102476
[Wikidata-bugs] [Maniphest] [Updated] T38881: Wiktionary needs usable API
GWicke added a comment. There is now an experimental API endpoint for Wiktionary definitions at https://en.wiktionary.org/api/rest_v1/?doc#!/Page_content/get_page_definition_term This API is used by the Android app to provide definitions for words using Wiktionary data, but it is currently only available for the English Wiktionary. T138709 discusses ways to expand coverage to other languages by adding standard markup to consistently identify specific components of the definitions. Please chime in there. TASK DETAIL https://phabricator.wikimedia.org/T38881 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: GWicke, Alkamid, TheDaveRoss, dg711, intracer, Aklapper, Hippietrail, Ricordisamoa, jberkel, Liuxinyu970226, Wikidata-bugs, GPHemsley, Amire80, siebrand, mxn, Glaisher, Qgil, MZMcBride, Yurik, Bawolff, Lydia_Pintscher, Sethakill, Luke081515, jayvdb, Darkdadaah, Anomie, Krenair, Legoktm ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Updated] T38881: Wiktionary needs usable API
GWicke added a subtask: T138709: Use microformats on Wiktionary to improve term parsing. TASK DETAIL https://phabricator.wikimedia.org/T38881 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: Alkamid, TheDaveRoss, dg711, intracer, Aklapper, Hippietrail, Ricordisamoa, jberkel, Liuxinyu970226, Wikidata-bugs, GPHemsley, Amire80, siebrand, mxn, Glaisher, Qgil, MZMcBride, Yurik, Bawolff, Lydia_Pintscher, Sethakill, Luke081515, jayvdb, Darkdadaah, Anomie, Krenair, Legoktm ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T102476: RFC: Requirements for change propagation
GWicke added a comment. I moved the current status / next steps section back here, so that only the general background section is now on MediaWiki.org. Status & next steps is more volatile & links directly to ongoing work, so benefits from being in this task. TASK DETAIL https://phabricator.wikimedia.org/T102476 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: RobLa-WMF, GWicke Cc: Scott_WUaS, Ottomata, Smalyshev, ArielGlenn, hoo, Addshore, RobLa-WMF, StudiesWorld, intracer, JanZerebecki, brion, Ltrlg, Anomie, Milimetric, mark, BBlack, aaron, daniel, Eevans, mobrovac, GWicke, D3r1ck01, Izno, Luke081515, Hardikj, Wikidata-bugs, aude, jayvdb, fbstj, Mbch331, Jay8g, bd808, Legoktm ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Edited] T102476: RFC: Requirements for change propagation
GWicke edited the task description. TASK DETAIL https://phabricator.wikimedia.org/T102476 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: RobLa-WMF, GWicke Cc: Scott_WUaS, Ottomata, Smalyshev, ArielGlenn, hoo, Addshore, RobLa-WMF, StudiesWorld, intracer, JanZerebecki, brion, Ltrlg, Anomie, Milimetric, mark, BBlack, aaron, daniel, Eevans, mobrovac, GWicke, D3r1ck01, Izno, Luke081515, Hardikj, Wikidata-bugs, aude, jayvdb, fbstj, Mbch331, Jay8g, bd808, Legoktm ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Edited] T102476: RFC: Requirements for change propagation
GWicke edited the task description. TASK DETAIL https://phabricator.wikimedia.org/T102476 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: RobLa-WMF, GWicke Cc: Ottomata, Smalyshev, ArielGlenn, hoo, Addshore, RobLa-WMF, StudiesWorld, intracer, JanZerebecki, brion, Ltrlg, Anomie, Milimetric, mark, BBlack, aaron, daniel, Eevans, mobrovac, GWicke, D3r1ck01, Izno, Luke081515, Hardikj, Wikidata-bugs, aude, jayvdb, fbstj, Mbch331, Jay8g, bd808, Legoktm ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T102476: RFC: Requirements for change propagation
GWicke added a comment. @RobLa-WMF, I updated the task summary with a status summary & a sketch of next steps & open questions. TASK DETAIL https://phabricator.wikimedia.org/T102476 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: RobLa-WMF, GWicke Cc: Ottomata, Smalyshev, ArielGlenn, hoo, Addshore, RobLa-WMF, StudiesWorld, intracer, JanZerebecki, brion, Ltrlg, Anomie, Milimetric, mark, BBlack, aaron, daniel, Eevans, mobrovac, GWicke, D3r1ck01, Izno, Luke081515, Hardikj, Wikidata-bugs, aude, jayvdb, fbstj, Mbch331, Jay8g, bd808, Legoktm ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Edited] T102476: RFC: Requirements for change propagation
GWicke added subscribers: Smalyshev, Ottomata. GWicke edited the task description. TASK DETAIL https://phabricator.wikimedia.org/T102476 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: RobLa-WMF, GWicke Cc: Ottomata, Smalyshev, ArielGlenn, hoo, Addshore, RobLa-WMF, StudiesWorld, intracer, JanZerebecki, brion, Ltrlg, Anomie, Milimetric, mark, BBlack, aaron, daniel, Eevans, mobrovac, GWicke, D3r1ck01, Izno, Luke081515, Hardikj, Wikidata-bugs, aude, jayvdb, fbstj, Mbch331, Jay8g, bd808, Legoktm ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions
GWicke added a comment. > As I understand it, restbase is a front-end caching proxy store, exposed to the public internet. For most use cases (including HTML), it is actually *storing*, and not just caching. It is the equivalent of ExternalStore and most of the text table, including revision deletions. Longer term, we are looking into replacing ExternalStore for wikitext. TASK DETAIL https://phabricator.wikimedia.org/T107595 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: daniel, GWicke Cc: Glaisher, JJMC89, RobLa-WMF, Yurik, ArielGlenn, APerson, TomT0m, Krenair, intracer, Tgr, Tobi_WMDE_SW, Addshore, Lydia_Pintscher, cscott, PleaseStand, awight, Ricordisamoa, GWicke, MarkTraceur, waldyrious, Legoktm, Aklapper, Jdforrester-WMF, Ltrlg, brion, Spage, MZMcBride, daniel, D3r1ck01, Izno, Luke081515, Wikidata-bugs, aude, jayvdb, fbstj, Mbch331, Jay8g, bd808 ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions
GWicke added a comment. The use case for providing metadata is so that we can use stores like RESTBase, which already provide an API keyed on title, revision & render ID. It also already deals with the complexities you mention. Basically, if we don't have a way to provide this key information to the backend store, then we can't access all the multi-content revision data that's already out there through this interface. TASK DETAIL https://phabricator.wikimedia.org/T107595 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: daniel, GWicke Cc: Glaisher, JJMC89, RobLa-WMF, Yurik, ArielGlenn, APerson, TomT0m, Krenair, intracer, Tgr, Tobi_WMDE_SW, Addshore, Lydia_Pintscher, cscott, PleaseStand, awight, Ricordisamoa, GWicke, MarkTraceur, waldyrious, Legoktm, Aklapper, Jdforrester-WMF, Ltrlg, brion, Spage, MZMcBride, daniel, D3r1ck01, Izno, Luke081515, Wikidata-bugs, aude, jayvdb, fbstj, Mbch331, Jay8g, bd808 ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions
GWicke added a comment. > In any case, the PageUpdater / WikiPage code needs to trigger notifications (produce events). I don't care what mechanism it used for that. Or rather: I'm very happy if we get a generalized mechanism. We'll have to agree on some kind of schema for revisions, slots, and blobs, but that should be easy enough. Makes sense. Thanks for the clarification! >> In addition to title and revision (which I assume remains an integer), we'd need an optional v1 UUID parameter to retrieve specific renders, in both the request & response interfaces. > I have thought about this, too. My solution is to encode this in the slot name. So you could have an html.canonical (sub)slot, and a html.29e68f78-8765-49f8-86d5-dfc438d459fe, or html.en, or whatever. Hmmm, this sounds like a rather ugly hack. I thought the 'slot' is identifying the kind of content, and is not some general-purpose string that is used to append otherwise missing parameters, and differs with each render. >> How would a dumb blob store figure out which content belongs to the same page (and is thus similar), if all it has is the content & some metadata, but not the page id, title, revision & render UUID? This is the same design issue that plagues ExternalStore, and something we addressed in RESTBase. With large-window compression algorithms like brotli, we are getting down to 2-3% of the input HTML size (see https://phabricator.wikimedia.org/T122028). Without this locality information, you are likely to use an order of magnitude more storage as you are foregoing efficient delta compression. > > This is a good point. Once again, we want our abstraction to be a bit leaky, to allow for optimizations. I would argue that it is a case of finding an abstraction at the right level. A simple blob store is a very low-level abstraction, and severely limits the backend's abilities to optimize storage, distribution & consistency. It also limits the backend's usefulness as an API in its own right. 
Instead, I think we should clearly define the API for each slot to provide / consume - page id, - page title, - revision id, and - a UUID / hash / etag. This makes sure that backends can continue to implement higher-level functionality & important optimizations. This should be part of the API, and not a case of a "leak". That said, backends *can* choose to ignore all of this (but the UUID / hash). > I haven't thought this through yet, but my inclination is that we could associate a metadata array (k/v set) with the blob, which could include things like a hash and the page title. A BlobStore would be free to use this or not, to store it or not, and to make it retrievable or not. A minimum set of metadata (like the versioned content-type) should always be provided. It would be nice to model this in a way that's compatible with normal HTTP headers, as stored & returned by services like RESTBase. TASK DETAIL https://phabricator.wikimedia.org/T107595 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: daniel, GWicke Cc: Glaisher, JJMC89, RobLa-WMF, Yurik, ArielGlenn, APerson, TomT0m, Krenair, intracer, Tgr, Tobi_WMDE_SW, Addshore, Lydia_Pintscher, cscott, PleaseStand, awight, Ricordisamoa, GWicke, MarkTraceur, waldyrious, Legoktm, Aklapper, Jdforrester-WMF, Ltrlg, brion, Spage, MZMcBride, daniel, D3r1ck01, Izno, Luke081515, Wikidata-bugs, aude, jayvdb, fbstj, Mbch331, Jay8g, bd808 ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
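Editorial sketch for the comment above (T107595): one way the per-slot key (page id, page title, revision id, and a UUID / hash / etag) could be modelled and mapped onto HTTP-style headers. The class name `SlotKey`, the function `to_headers`, and the `x-*` header names are assumptions made for illustration; they are not taken from RESTBase or MediaWiki.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class SlotKey:
    """Hypothetical key a slot API could hand to a backend store."""
    page_id: int
    title: str
    revision: int
    tid: Optional[str] = None  # render UUID / hash / etag, if known


def to_headers(key: SlotKey, content_type: str) -> Dict[str, str]:
    """Express the key as HTTP-style metadata (header names are assumed)."""
    headers = {
        "content-type": content_type,
        "x-page-id": str(key.page_id),
        "x-page-title": key.title,
        "x-revision": str(key.revision),
    }
    if key.tid is not None:
        # Combine revision and render id into a single etag value.
        headers["etag"] = f'"{key.revision}/{key.tid}"'
    return headers
```

A backend that does not care about locality could ignore everything except the etag, which matches the "backends can choose to ignore all of this" remark in the comment.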
[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions
GWicke added a comment. > Blobs would typically be shared by different revisions of the same page. This happens every time one primary slot is edited, but another is not changed. E.g. the free wikitext description of a file is edited, but the structured data isn't (or vice versa). Or the quality assessment data of an article is updated, but the article text isn't edited. In both cases, one of the blobs would be re-used by the new revision. I think this will actually be more common than editing all primary streams at once. Makes sense, some of these fields won't change between revisions. Depending on the constraints, it might still make sense to store unchanged content & rely on compression to encode it efficiently. This is likely what we'll continue to do in RESTBase, as this makes sure that access by revision continues to perform predictably. In any case, as long as you ask the backend for content for a specific title / page id, revision & UUID, backends are free to use whatever performs best. TASK DETAIL https://phabricator.wikimedia.org/T107595 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: daniel, GWicke Cc: Glaisher, JJMC89, RobLa-WMF, Yurik, ArielGlenn, APerson, TomT0m, Krenair, intracer, Tgr, Tobi_WMDE_SW, Addshore, Lydia_Pintscher, cscott, PleaseStand, awight, Ricordisamoa, GWicke, MarkTraceur, waldyrious, Legoktm, Aklapper, Jdforrester-WMF, Ltrlg, brion, Spage, MZMcBride, daniel, D3r1ck01, Izno, Luke081515, Wikidata-bugs, aude, jayvdb, fbstj, Mbch331, Jay8g, bd808 ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
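Editorial sketch for the comment above: a small experiment showing why storing unchanged content and relying on compression can be cheap when similar revisions are compressed together. It uses `zlib` from the Python standard library as a stand-in for the large-window brotli compression mentioned in the thread, and the synthetic revisions are invented for illustration; the sizes it produces say nothing about real RESTBase storage.

```python
import zlib
from typing import List


def make_revisions(count: int) -> List[bytes]:
    """Build synthetic revisions that differ in a single paragraph each."""
    base = [
        "<p>Paragraph %d of a fairly static article body.</p>" % i
        for i in range(200)
    ]
    revisions = []
    for rev in range(count):
        paragraphs = list(base)
        paragraphs[rev] = "<p>Paragraph %d was edited in revision %d.</p>" % (rev, rev)
        revisions.append("\n".join(paragraphs).encode("utf-8"))
    return revisions


def separate_size(revisions: List[bytes]) -> int:
    """Total size when every revision is compressed on its own."""
    return sum(len(zlib.compress(rev, 9)) for rev in revisions)


def grouped_size(revisions: List[bytes]) -> int:
    """Size when revisions of the same page are compressed as one stream."""
    return len(zlib.compress(b"".join(revisions), 9))
```

Comparing `separate_size(make_revisions(10))` with `grouped_size(make_revisions(10))` gives a rough feel for how much a store gains from knowing which blobs belong to the same page.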
[Wikidata-bugs] [Maniphest] [Updated] T107595: [RFC] Multi-Content Revisions
GWicke added a comment. > Where do I propose another mechanism for change propagation? The PageUpdater would do exactly what Revision does now: schedule DataUpdates. EventBus & the change propagation service are moving away from scheduling "jobs", and towards an event processing approach based on Kafka. In this model, subscribers react to change events associated with resources. Event production & processing / consumption is decoupled and decentralized. PageUpdater (and RevisionUpdater) as proposed seem to be moving in the opposite direction, towards more jobs & away from event processing. > The blob store is (potentially) content-addressable, so the same blob may be used for different revisions of different pages. Blob sharing would complicate your storage significantly, as you'd either have to forgo deleting content forever (very expensive for something like HTML renders), or incur significant complexity of implementing an atomic reference counting scheme. For textual content, I am pretty certain that sharing is rare, and the complexity would overall be a loss in performance and reliability. > Even for blobs that have an incremental ID (e.g. using the current text table storage mechanism), the same blob would frequently be used for multiple blobs of the same page. How would a dumb blob store figure out which content belongs to the same page (and is thus similar), if all it has is the content & some metadata, but not the page id, title, revision & render UUID? This is the same design issue that plagues ExternalStore, and something we addressed in RESTBase. With large-window compression algorithms like brotli, we are getting down to 2-3% of the input HTML size (see https://phabricator.wikimedia.org/T122028). Without this locality information, you are likely to use an order of magnitude more storage as you are foregoing efficient delta compression. I am generally trying to work out how RevisionContentLookup would work for use cases like fetching HTML from RESTBase. 
Some notes / questions: - In addition to title and revision (which I assume remains an integer), we'll need an optional v1 UUID parameter to retrieve specific renders, in both the request & response interfaces. - Will getTouched() return the UUID timestamp of a specific render (last-modified, essentially), or is this about page_touched? Also, should we expose UUIDs to make sure that we have a unique ID with a high-resolution timestamp? - For content from RESTBase, read restrictions are always enforced as part of the API request. No information about the applied restrictions is returned. In this context, getReadRestrictions() would basically always return the empty set. TASK DETAIL https://phabricator.wikimedia.org/T107595 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: daniel, GWicke Cc: Glaisher, JJMC89, RobLa-WMF, Yurik, ArielGlenn, APerson, TomT0m, Krenair, intracer, Tgr, Tobi_WMDE_SW, Addshore, Lydia_Pintscher, cscott, PleaseStand, awight, Ricordisamoa, GWicke, MarkTraceur, waldyrious, Legoktm, Aklapper, Jdforrester-WMF, Ltrlg, brion, Spage, MZMcBride, daniel, D3r1ck01, Izno, Luke081515, Wikidata-bugs, aude, jayvdb, fbstj, Mbch331, Jay8g, bd808 ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
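Editorial sketch for the notes above: a v1 (time-based) UUID carries a high-resolution timestamp, which is what makes it usable as a last-modified value for a specific render. The helper name `render_time` is an assumption for illustration; the `uuid` calls are from the Python standard library.

```python
import uuid
from datetime import datetime, timedelta, timezone

# v1 UUID timestamps count from the start of the Gregorian calendar.
UUID_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def render_time(tid: str) -> datetime:
    """Return the creation time embedded in a v1 UUID string."""
    parsed = uuid.UUID(tid)
    if parsed.version != 1:
        raise ValueError("not a v1 (time-based) UUID")
    # parsed.time is the number of 100-nanosecond intervals since UUID_EPOCH.
    return UUID_EPOCH + timedelta(microseconds=parsed.time // 10)
```

For example, `render_time(str(uuid.uuid1()))` returns a UTC datetime close to the current time, while a random (v4) UUID is rejected because it carries no timestamp.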
[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions
GWicke added a comment. Some notes: - PageUpdater aims to provide similar functionality as the change propagation service (using EventBus) & the job queue. Could you clarify why we need another mechanism for change propagation? - The blob store does not provide any locality information (title or page id, revision, render id / time-uuid), which means that it is incompatible with existing storage systems like RESTBase. Since locality information is critical for consistency and decent compression, I would suggest always providing at least these keys. TASK DETAIL https://phabricator.wikimedia.org/T107595 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: daniel, GWicke Cc: RobLa-WMF, Yurik, ArielGlenn, APerson, TomT0m, Krenair, intracer, Tgr, Tobi_WMDE_SW, Addshore, Lydia_Pintscher, cscott, PleaseStand, awight, Ricordisamoa, GWicke, MarkTraceur, waldyrious, Legoktm, Aklapper, Jdforrester-WMF, Ltrlg, brion, Spage, MZMcBride, daniel, D3r1ck01, Izno, Wikidata-bugs, aude, jayvdb, fbstj, Mbch331, Jay8g, bd808 ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Up For Grabs] T102476: RFC: Requirements for change propagation
GWicke placed this task up for grabs. TASK DETAIL https://phabricator.wikimedia.org/T102476 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: ArielGlenn, hoo, Addshore, RobLa-WMF, StudiesWorld, intracer, JanZerebecki, brion, Ltrlg, Anomie, Milimetric, mark, BBlack, aaron, daniel, Eevans, mobrovac, GWicke, D3r1ck01, Izno, Hardikj, Wikidata-bugs, aude, jayvdb, fbstj, Mbch331, Jay8g, bd808, Legoktm ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T126730: [RFC] Caching for results of wikidata Sparql queries
GWicke added a comment. > some of my concerns about the effect of including a graph (with a slowish > query) on page save timing, e.g. when just fixing a typo on a wikipedia page. To get sensible hit rates for relatively rare events like edits, we would need to cache results for a long time, on the order of weeks. Would this be acceptable without automatic purging? TASK DETAIL https://phabricator.wikimedia.org/T126730 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: Milimetric, Gehel, BBlack, GWicke, Bene, Ricordisamoa, daniel, Lydia_Pintscher, Smalyshev, Jonas, Christopher, Yurik, hoo, Aklapper, aude, debt, Izno, Luke081515, jkroll, Wikidata-bugs, Jdouglas, Deskana, Manybubbles, Mbch331, Jay8g, Ltrlg ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T126730: [RFC] Caching for results of wikidata Sparql queries
GWicke added a comment. > Since running the query each time graph is displayed is too expensive, we > want some intermediate caching store that would store the results, possibly > for the time defined in the query. Is the graph extension actually re-requesting the data on each view, or would this only happen on parser cache miss / edit? I'm still not sure how effective query service caching can be in this context: > In particular, I wonder if there are a small number of queries that get a lot > of hits, and if those queries can be cached for long enough to result in > worthwhile hit rates. TASK DETAIL https://phabricator.wikimedia.org/T126730 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: Milimetric, Gehel, BBlack, GWicke, Bene, Ricordisamoa, daniel, Lydia_Pintscher, Smalyshev, Jonas, Christopher, Yurik, hoo, Aklapper, aude, debt, Izno, Luke081515, jkroll, Wikidata-bugs, Jdouglas, Deskana, Manybubbles, Mbch331, Jay8g, Ltrlg ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T126730: [RFC] Caching for results of wikidatasparql queries for Graphs
GWicke added a comment. @smalyshev, I would need more context to usefully comment on this. In particular, I wonder if there are a small number of queries that get a lot of hits, and if those queries can be cached for long enough to result in worthwhile hit rates. When discussing a use case like graphs, there are also a lot more caching layers and change propagation systems to consider. TASK DETAIL https://phabricator.wikimedia.org/T126730 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: GWicke, Bene, Ricordisamoa, daniel, Lydia_Pintscher, Smalyshev, Jonas, Christopher, Yurik, hoo, Aklapper, aude, debt, Gehel, Izno, Luke081515, jkroll, Wikidata-bugs, Jdouglas, Deskana, Manybubbles, Mbch331, Jay8g, Ltrlg ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Edited] T102476: RFC: Requirements for change propagation
GWicke edited the task description. TASK DETAIL https://phabricator.wikimedia.org/T102476 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: ArielGlenn, hoo, Addshore, RobLa-WMF, StudiesWorld, intracer, Qgil, JanZerebecki, brion, Ltrlg, Anomie, Milimetric, mark, BBlack, aaron, daniel, Eevans, mobrovac, GWicke, Izno, Hardikj, Wikidata-bugs, aude, jayvdb, Mbch331, Jay8g, bd808, Legoktm ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Updated] T102476: RFC: Requirements for change propagation
GWicke removed a project: Wikimedia-Developer-Summit-2016. TASK DETAIL https://phabricator.wikimedia.org/T102476 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: ArielGlenn, hoo, Addshore, RobLa-WMF, StudiesWorld, intracer, Qgil, JanZerebecki, brion, Ltrlg, Anomie, Milimetric, mark, BBlack, aaron, daniel, Eevans, mobrovac, GWicke, Izno, Hardikj, Wikidata-bugs, aude, jayvdb, Mbch331, Jay8g, bd808, Legoktm ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Unblock] T102476: RFC: Requirements for change propagation
GWicke closed blocking task T84923: Reliable publish / subscribe event bus as "Resolved". TASK DETAIL https://phabricator.wikimedia.org/T102476 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: Addshore, RobLa-WMF, StudiesWorld, intracer, Qgil, JanZerebecki, brion, Ltrlg, Anomie, Milimetric, mark, BBlack, aaron, daniel, Eevans, mobrovac, GWicke, Hardikj, Wikidata-bugs, aude, jayvdb, Mbch331, Jay8g, bd808, Krenair, Legoktm ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Closed] T84923: Reliable publish / subscribe event bus
GWicke closed this task as "Resolved". GWicke claimed this task. GWicke added a comment. A basic event bus is now available in production, and is being populated with edit events from MediaWiki. Consumption is directly from Kafka at this point. This means that the core proposal of this task is implemented. I'm closing this task to reflect this. TASK DETAIL https://phabricator.wikimedia.org/T84923 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: Aklapper, Matanya, Ottomata, mmodell, Eevans, chasemp, brion, Krenair, Halfak, JanZerebecki, bd808, MZMcBride, mobrovac, GWicke, aaron, daniel, Hardikj, yuvipanda, debt, JAllemandou, jkroll, Smalyshev, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, Mbch331, jeremyb ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T114474: More flexible and modernized Recent Changes code
GWicke added a comment. @aude: Which questions would you like to resolve at the summit? Do you think this topic could also be reasonably discussed in a regular RFC meeting? TASK DETAIL https://phabricator.wikimedia.org/T114474 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: aude, GWicke Cc: RobLa-WMF, Izno, GWicke, Jdforrester-WMF, Krenair, Qgil, hoo, Addshore, daniel, aude, Aklapper, Wikidata-bugs, Mbch331, Jay8g ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T102476: RFC: Requirements for change propagation
GWicke added a comment. @janzerebecki: The current intention is to keep change propagation relatively simple and efficient. Many services can be implemented with very small relative per-request overheads, and services with high per-request overheads can consider applying opportunistic batching transparently to all requests, for example using a batching proxy. TASK DETAIL https://phabricator.wikimedia.org/T102476 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: Addshore, RobLa-WMF, StudiesWorld, intracer, Qgil, JanZerebecki, brion, Ltrlg, Anomie, Milimetric, mark, BBlack, aaron, daniel, Eevans, mobrovac, GWicke, Hardikj, Wikidata-bugs, aude, jayvdb, Mbch331, Jay8g, bd808, Krenair, Legoktm ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
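Editorial sketch for the comment above: a minimal, synchronous illustration of opportunistic batching in front of a backend with high per-request overhead. The class `BatchingProxy` and its parameters are hypothetical and not part of the change propagation service; a real proxy would also flush on a timer and route responses back to the individual callers.

```python
from typing import Callable, List


class BatchingProxy:
    """Collect individual requests and forward them to a backend in batches."""

    def __init__(self, backend: Callable[[List[str]], None], max_batch: int = 3):
        self._backend = backend
        self._max_batch = max_batch
        self._pending: List[str] = []

    def submit(self, request: str) -> None:
        """Queue one request; forward the batch once it is full."""
        self._pending.append(request)
        if len(self._pending) >= self._max_batch:
            self.flush()

    def flush(self) -> None:
        """Forward whatever is pending, if anything."""
        if self._pending:
            batch, self._pending = self._pending, []
            self._backend(batch)
```

Callers keep issuing single requests through `submit`, so the batching stays transparent to them, which is the property the comment asks for.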
[Wikidata-bugs] [Maniphest] [Edited] T102476: RFC: Requirements for change propagation
GWicke edited the task description. TASK DETAIL https://phabricator.wikimedia.org/T102476 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: Addshore, RobLa-WMF, StudiesWorld, intracer, Qgil, JanZerebecki, brion, Ltrlg, Anomie, Milimetric, mark, BBlack, aaron, daniel, Eevans, mobrovac, GWicke, Hardikj, Wikidata-bugs, aude, jayvdb, Mbch331, Jay8g, bd808, Krenair, Legoktm ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T114019: Dumps 2.0 for realz (planning/architecture session)
GWicke added a comment. @ArielGlenn: To me it seems that the discussion so far lacks a shared agreement on what the most pressing problems with dumps are. This makes it difficult to evaluate candidate solutions and their trade-offs relative to the top priorities. With the right preparation, a discussion at the dev summit could perhaps help to establish a shared agreement on the top problems to solve. It would be helpful if a candidate list could be worked out before the summit, so that it can inform the discussion. TASK DETAIL https://phabricator.wikimedia.org/T114019 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: ArielGlenn, GWicke Cc: RobLa-WMF, GWicke, TTO, zhuyifei1999, StudiesWorld, gnosygnu, LA2, Ladsgroup, intracer, Lokal_Profil, Halfak, Legoktm, Qgil, JanZerebecki, brion, daniel, Hydriz, MZMcBride, hoo, ezachte, wpmirrordev, Nemo_bis, Aklapper, ArielGlenn, Wikidata-bugs, aude, Mbch331, Jay8g, Krenair ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Changed Subscribers] T105638: RFC: Streamlining Composer usage
GWicke added a subscriber: mobrovac. TASK DETAIL https://phabricator.wikimedia.org/T105638 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: JanZerebecki, GWicke Cc: mobrovac, GWicke, Addshore, Qgil, Spage, greg, tstarling, aude, hoo, daniel, zeljkofilipin, thcipriani, mmodell, bd808, csteipp, Legoktm, Krinkle, hashar, JanZerebecki, Aklapper, Lynhg, Wikidata-bugs, Malyacko, Mbch331, Jay8g, Ltrlg, Krenair ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T105638: RFC: Streamlining Composer usage
GWicke added a subscriber: GWicke. GWicke added a comment. Here is an idea for a workflow-based solution that would work for nodejs as well: 1. Each code project has a corresponding deploy repository. For nodejs, current practice is to have the code as a submodule of the deploy repository (in src/). For MediaWiki, current practice is to have a deploy / dependency repository inside the code repository (vendor/). It might be worth investigating if inverting the relationship could be an option for MediaWiki as well, as this avoids deploy updates polluting the code repository history. 2. CI automatically updates the deploy repository for each test run by running composer / npm, and commits the result to git on successful test completion. The deploy repository hash is recorded in the test results. 3. For a deploy, one of the CI-prepared deploy repository commits is reviewed and merged. The diff clearly shows changes in dependencies. A potential issue here is making sure that the submodule patch is actually merged, but it seems that this could be solved with a hook. This workflow is very close to what we are currently doing for node services, except that step 2) is currently performed manually, using a docker script. TASK DETAIL https://phabricator.wikimedia.org/T105638 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: JanZerebecki, GWicke Cc: GWicke, Addshore, Qgil, Spage, greg, tstarling, aude, hoo, daniel, zeljkofilipin, thcipriani, mmodell, bd808, csteipp, Legoktm, Krinkle, hashar, JanZerebecki, Aklapper, Lynhg, Wikidata-bugs, Malyacko, Mbch331, Jay8g, Ltrlg, Krenair ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Updated] T114474: More flexible and modernized Recent Changes code
GWicke added a subscriber: GWicke. GWicke added a comment. There is some related high-level discussion about recent changes and page history as event streams in https://phabricator.wikimedia.org/T107595. One idea is to layer event streams, which would potentially let us integrate related events like edits to corresponding Wikidata items. TASK DETAIL https://phabricator.wikimedia.org/T114474 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: GWicke, Jdforrester-WMF, Krenair, Qgil, hoo, Addshore, daniel, aude, Aklapper, Wikidata-bugs, Mbch331, Jay8g ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T114443: EventBus MVP
GWicke added a comment. @faidon: Until very recently (last days), there wasn't actually any REST proxy with schema validation in the EventLogging repository. @ottomata now has a patch implementing such a service <https://gerrit.wikimedia.org/r/#/c/235671/24/server/bin/eventlogging-service>, and @mobrovac has left comments on it today. So, it looks like we'll have the option of choosing between two new services implementing the same API. I don't see having two implementations of a simple service as a bad thing. As mentioned, we might want to use a single node process exposing parsoid, restbase & eventbus for small (third party) installs, but might as well use the new EventLogging service in production. There are still loose ends to be tied in the API and event schema definitions, and I think that should be our focus. The implementation deserves attention too, but it's easy to swap & a few hundred lines each. Replacing all of EventLogging is pretty much out of scope for EventBus. The focus is on queuing and event validation, and not on other EventLogging features like Varnish log decoding, analytics databases etc. If desired, we could fairly easily add HTTP event production in EL, which would write to EventBus instead of directly to Kafka. However, I personally think it's fine to let trusted producers write directly to Kafka, especially for internal applications. The current EL instance is producing to a separate (analytics) Kafka cluster in any case, so there is no potential for conflicts with non-analytics use cases. 
TASK DETAIL https://phabricator.wikimedia.org/T114443 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata, GWicke Cc: Milimetric, RobLa-WMF, brion, intracer, Smalyshev, mark, MZMcBride, Krinkle, EBernhardson, bd808, Joe, dr0ptp4kt, madhuvishy, Nuria, ori, faidon, aaron, GWicke, mobrovac, Eevans, Ottomata, Matanya, Aklapper, JAllemandou, jkroll, Hardikj, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, daniel, Mbch331, Jay8g, Ltrlg, jeremyb, Legoktm ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T114443: EventBus MVP
GWicke added a comment. @ottomata: In my recollection of the discussion & the log you linked to, the question of which REST producer proxy to use was left open. Our priority is to get basic events into Kafka before the end of this month, so that we can start building on top of this for change propagation. We have a simple node service <https://github.com/wikimedia/restevent> that does what we need & integrates with our node infrastructure, but if you have something based on EventLogging soon then we can consider using that too. TASK DETAIL https://phabricator.wikimedia.org/T114443 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata, GWicke Cc: RobLa-WMF, brion, intracer, Smalyshev, mark, MZMcBride, Krinkle, EBernhardson, bd808, Joe, dr0ptp4kt, madhuvishy, Nuria, ori, faidon, aaron, GWicke, mobrovac, Eevans, Ottomata, Matanya, Aklapper, JAllemandou, jkroll, Hardikj, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, daniel, Jay8g, Ltrlg, jeremyb, Legoktm ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation
GWicke added a comment. @ottomata: If you fill in the defaults at consumption time, then you have a choice of how you want to treat old events. You can either fill in the defaults from the latest schema (probably what you want in most cases), or choose to explicitly distinguish fields that were not yet defined at the time the event was produced. TASK DETAIL https://phabricator.wikimedia.org/T116247 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: mobrovac, GWicke Cc: intracer, EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
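The two consumption-time options described in this comment could look roughly like the sketch below. It is only an illustration under stated assumptions: `applyDefaults`, the `markMissing` option, and the `is_minor` field are hypothetical names, and the schema is assumed to be a plain JSON-schema-like object that carries a `default` per property.

```javascript
// Hypothetical helper: fill in schema defaults when an event is consumed.
// Assumes `schema.properties[name].default` holds the default for a field.
function applyDefaults(event, schema, { markMissing = false } = {}) {
  const filled = { ...event };   // never mutate the stored event
  const missing = [];            // fields absent when the event was produced
  for (const [name, def] of Object.entries(schema.properties || {})) {
    if (Object.prototype.hasOwnProperty.call(filled, name)) continue;
    missing.push(name);
    // Option 1: take the default from the latest schema.
    // Option 2 (markMissing): leave the field out and only report it.
    if (!markMissing && 'default' in def) filled[name] = def.default;
  }
  return { event: filled, missing };
}

// Made-up schema and event, for illustration only.
const latestSchema = {
  properties: {
    title: { type: 'string' },
    is_minor: { type: 'boolean', default: false }
  }
};
const oldEvent = { title: 'San_Francisco' };

console.log(applyDefaults(oldEvent, latestSchema));
console.log(applyDefaults(oldEvent, latestSchema, { markMissing: true }));
```

Returning the list of missing fields alongside the event keeps both behaviours available to the consumer without changing what is stored in the topic.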
[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation
GWicke added a comment. @ottomata, you are basically making the case for filling in the defaults at consumption time. TASK DETAIL https://phabricator.wikimedia.org/T116247 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: mobrovac, GWicke Cc: intracer, EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation
GWicke added a comment. @ottomata, they will be filled in somewhere, but I think we haven't necessarily decided on filling them in at production time. To me it seems that filling in either at production or consumption time will work, as long as defaults don't change. It sounds like you have a concern in that area, though. Could you elaborate? TASK DETAIL https://phabricator.wikimedia.org/T116247 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: mobrovac, GWicke Cc: intracer, EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation
GWicke added a comment. @ottomata: Based on our backwards-compatibility rules, the latest schema will be a superset of previous schemas. This means that you will be able to understand both old and new data in a given topic using the //latest// schema. TASK DETAIL https://phabricator.wikimedia.org/T116247 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: mobrovac, GWicke Cc: intracer, EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation
GWicke added a comment. @ottomata, I think understanding the semantics of an event primarily requires knowledge of the topic. The topic in turn provides access to the schema, which describes the structure of the events. It is likely that we'll have multiple topics record similarly-structured events, which means that they might share the same schema, but describe different semantic events in each topic. For example, a basic timing event can be emitted for clicks of button A or button B, each tracked in a separate topic. I could be convinced to include the topic name / URL in each event. One use case this could potentially help is streaming events from multiple topics. We could also handle this with a framing format, but this might force us to parse JSON on the consumer side, which wouldn't be great for performance. Either way, given the topic name you should have no trouble accessing the schema. We can expose schemas for each topic URL in the REST API (ex: `/{topic}?schema.json`), which you could then store along with the event data in hadoop. Embedding an explicit schema url of the form described above might be a bit redundant, considering the simplicity of the construction. TASK DETAIL https://phabricator.wikimedia.org/T116247 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: mobrovac, GWicke Cc: intracer, EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
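A small sketch of the topic-to-schema relationship described in this comment, assuming a per-topic configuration map. The topic names, the `timing` schema name, and both helper functions are hypothetical; only the `/{topic}?schema.json` URL form comes from the comment.

```javascript
// Hypothetical topic configuration: two topics with different semantics
// (clicks on button A vs. button B) that share one schema.
const topics = new Map([
  ['button-a-click', { schema: 'timing' }],
  ['button-b-click', { schema: 'timing' }]
]);

// Construct the schema URL from the topic name, following the
// `/{topic}?schema.json` form suggested in the comment.
function schemaUrlForTopic(topic) {
  if (!topics.has(topic)) throw new Error(`Unknown topic: ${topic}`);
  return `/${topic}?schema.json`;
}

// Pair an event with its topic when storing it elsewhere (for example in
// Hadoop), so the schema can be looked up again later.
function annotate(topic, event) {
  return { topic, schema_url: schemaUrlForTopic(topic), event };
}

console.log(annotate('button-a-click', { duration_ms: 120 }));
```

Because the URL is derived purely from the topic name, storing the topic alone would be enough to find the schema again, which is the redundancy argument made in the comment.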
[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation
GWicke added a comment. > I've been thinking about it too. Ideally, we could leave these fields out of > schema defs, simply reference them. But, that seems not to be in correlation > with storing them in a git repo. What I see as a possible solution is to put > these common fields into a separate file and let the producer proxy in front > on kafka stick it into each schema def. The validator lets us register schemas corresponding to urls <https://github.com/epoberezkin/ajv#addschemaarrayobjectobject-schema--string-key>, which will then be used when those are referenced via $ref. We could also use a nested object to remove some redundancy in naming: { event: { id: '..v1 uuid..', ts: '2015-...', subject: '/some/uri' }, // Event specific data } TASK DETAIL https://phabricator.wikimedia.org/T116247 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: mobrovac, GWicke Cc: EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
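One possible reading of the quoted suggestion (keep the common fields in a separate file and let the producer proxy add them to each schema definition) is sketched below, using the nested `event` object from this comment. The helper `withCommonFields`, the sample schemas, and the choice to mark `id`, `ts` and `subject` as required are assumptions. The ajv `$ref` registration mentioned in the comment is a different approach and is not shown here.

```javascript
// Common framing fields, kept in one place (for example a separate file).
const commonFields = {
  type: 'object',
  properties: {
    id: { type: 'string' },       // v1 UUID
    ts: { type: 'string' },       // ISO 8601 timestamp
    subject: { type: 'string' }   // URI of the resource concerned
  },
  required: ['id', 'ts', 'subject']   // assumption, not stated in the thread
};

// Hypothetical helper: return a copy of a topic schema with the common
// fields attached under a nested `event` property.
function withCommonFields(schemaDef, common) {
  return {
    ...schemaDef,
    properties: { ...schemaDef.properties, event: common },
    required: [...new Set([...(schemaDef.required || []), 'event'])]
  };
}

// Made-up event-specific schema.
const editSchema = {
  type: 'object',
  properties: { title: { type: 'string' } },
  required: ['title']
};

console.log(JSON.stringify(withCommonFields(editSchema, commonFields), null, 2));
```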
[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation
GWicke added a comment. In https://phabricator.wikimedia.org/T116247#1754698, @Ottomata wrote: > > If we have a use case for emitting two secondary events *to the same topic* > > that were both triggered by the same primary event (user click / request > > id), then we can generate a new ID for at least one of those events, and > > record the parent event id in a separate field (ex: par_id). This way, we > > can get the right deduplication semantics for each of those events. > > > ? What's the point of the request_id then? I thought we wanted X-Request-Id > so that we can easily tie together events generated by the same http request. > > Why not just have `request_id` and `uuid` as separate fields that always > exist? Sure, optionally having a separate request ID (in addition to the event ID) sounds good to me as well. We should always require / auto-generate the event ID (and use it for event deduplication, derived event timestamp etc), while the reqid can be added to events that are indeed request-triggered. > > IMHO, the timestamps of the event ID and explicit timestamp (ts or dt) > > should always match. This makes it a lot simpler to automatically derive dt > > from id in the producer REST proxy. Other event-specific times (like the > > save time as recorded by MediaWiki) should imho go into the event body. > > > Why? I agree, that specific schemas can define additional timestamps, but > what is the harm in having a standard one that is set and used semantically > by the producer? What if I wanted to explicitly feed a topic with events > dated in the past, perhaps for backfilling or recovery reasons? That's exactly what the event ID and dt should support well. MW edit timestamps are low resolution, and in a custom format, which imho makes them less than ideal for general event ids / timestamps.
TASK DETAIL https://phabricator.wikimedia.org/T116247 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: mobrovac, GWicke Cc: EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation
GWicke added a comment. > I'm not so sure actually that these will always be redundant. I think the > request ID should be persisted to track the same event throughout the system. > Imagine a user clicks on something which produces an event in the queue and > that event triggers another one to be enqueued. Then, both of them should > have the same request id, but different time stamps, shouldn't they? IMHO, the timestamps of the event ID and explicit timestamp (`ts` or `dt`) should always match. This makes it a lot simpler to automatically derive `dt` from `id` in the producer REST proxy. If we have a use case for emitting two secondary events *to the same topic* that were both triggered by the same primary event (user click / request id), then we can generate a new ID for at least one of those events, and record the parent event id in a separate field (ex: `par_id`). This way, we can get the right deduplication semantics for each of those events. TASK DETAIL https://phabricator.wikimedia.org/T116247 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
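The `par_id` idea from this comment can be sketched as follows. The helper names and the short placeholder IDs are hypothetical; real events would carry v1 UUIDs, and deduplication is assumed here to mean "first occurrence of an `id` wins" within one topic.

```javascript
// Hypothetical helper: create a secondary event that gets its own id and
// records the event that triggered it in `par_id`.
function deriveEvent(parent, id, body) {
  return { ...body, id, par_id: parent.id };
}

// Per-topic deduplication keyed on the event id: first occurrence wins.
function dedupe(events) {
  const seen = new Set();
  return events.filter((e) => {
    if (seen.has(e.id)) return false;
    seen.add(e.id);
    return true;
  });
}

// Placeholder ids for readability; real ids would be v1 UUIDs.
const primary = { id: 'id-1', type: 'click' };
const first = deriveEvent(primary, 'id-2', { type: 'render' });
const second = deriveEvent(primary, 'id-3', { type: 'purge' });

// `first` is delivered twice; both secondary events survive exactly once.
const topicLog = dedupe([first, second, first]);
console.log(topicLog.map((e) => `${e.id} <- ${e.par_id}`));
```

If both secondary events reused the primary event's ID instead, the second one would be dropped as a duplicate, which is the problem the separate `par_id` field avoids.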
[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation
GWicke added a comment. > If we adopt a convention of always storing schema name and/or revision in the > schemas themselves, then we can do like EventLogging does and infer and > validate the schema based on this value. This would especially be helpful in > associating a message with an Avro Schema when serializing into binary. The topic configuration will take precedence, so we wouldn't use client-supplied values for these fields, and would basically just write a part of the topic configuration into each event. We also decided that we will only evolve schemas in backwards-compatible ways. In practice, this means that we'll only add fields, and the latest schema will be able to validate both new and old data in each topic. @ottomata, what value do you see in recording the schema configured for a topic at enqueue time in each event? TASK DETAIL https://phabricator.wikimedia.org/T116247 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
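The "only add fields" rule could be checked mechanically along the lines below. This is a simplified sketch, not the rule set actually agreed in the thread: `isBackwardsCompatible` is a hypothetical name, schemas are assumed to be flat JSON-schema-like objects, and a new field is assumed to be acceptable only if it is not added to `required`, since old events would lack it.

```javascript
// Hypothetical check: `next` may only add optional fields to `prev`.
function isBackwardsCompatible(prev, next) {
  const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);
  const prevProps = prev.properties || {};
  const nextProps = next.properties || {};

  // Every existing field must still exist with the same type.
  for (const [name, def] of Object.entries(prevProps)) {
    if (!has(nextProps, name) || nextProps[name].type !== def.type) return false;
  }

  // Newly required fields would reject old events that lack them.
  const prevRequired = new Set(prev.required || []);
  return (next.required || []).every((name) => prevRequired.has(name));
}

// Made-up schema versions, for illustration only.
const v1 = { properties: { title: { type: 'string' } }, required: ['title'] };
const v2 = {
  properties: { title: { type: 'string' }, revision: { type: 'integer' } },
  required: ['title']
};

console.log(isBackwardsCompatible(v1, v2));
console.log(isBackwardsCompatible(v2, v1));
```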
[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation
GWicke added a comment. I went ahead and updated the task description with the current framing / per-event schema. I renamed the `reqid` to just `id`, and added a `ts` field containing the same timestamp in ISO 8601 format. TASK DETAIL https://phabricator.wikimedia.org/T116247 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Edited] T116247: Define edit related events for change propagation
GWicke edited the task description. TASK DETAIL https://phabricator.wikimedia.org/T116247 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation
GWicke added a comment. > Right, but how would you do this in say, Hive? Or in bash? Timestamp logic > should be easy and immediate. Yeah, Hive really seems to be lacking built-in support for UUIDs. There seems to be UDF code to deal with them, but it's definitely not as convenient as it could be. I'm fine with including the timestamp corresponding to the timeuuid to help Hive. The overhead is fairly small, and we can automate adding the timestamp even if only the UUID was supplied. TASK DETAIL https://phabricator.wikimedia.org/T116247 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation
GWicke added a comment. @JanZerebecki: Suppression information would indeed be needed for public access to older events. One option would be to key this on the event's UUID. We could also consider superseding the message using Kafka's deduplication (compaction) based on the same UUID. TASK DETAIL https://phabricator.wikimedia.org/T116247 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation
GWicke added a comment. @ottomata, UUIDs are described in https://en.wikipedia.org/wiki/Universally_unique_identifier. An example for a v1 UUID is `b54adc00-67f9-11d9-9669-0800200c9a66`. There are libraries to extract the high-resolution timestamp for most environments. Regarding a separate timestamp in the framing information: Which time would this correspond to? The next version of Kafka is likely going to track enqueue time itself & support efficient retrieval by timestamp <http://www.confluent.io/blog/log-compaction-highlights-in-the-kafka-and-stream-processing-community-october-2015>, and enqueue time is something that should be handled in Kafka in any case. Other timestamps have event-specific semantics, like for example the MediaWiki save time, which is why I think it makes most sense to not include them in the framing information. All events should however have a unique identifier and timestamp that ties together all events triggered by the same original trigger, and can be used for per-topic de-duplication / idempotency. This is what the UUID in reqid would provide. TASK DETAIL https://phabricator.wikimedia.org/T116247 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
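For readers without a UUID library at hand, extracting the timestamp from a v1 UUID takes only a few lines. The sketch below follows the v1 field layout from RFC 4122 and uses the example UUID from this comment. The function names `v1UuidToDate` and `withTimestamp` and the `ts` / `id` field names on the sample event are illustrative, not an agreed API.

```javascript
// 100-nanosecond intervals between the UUID epoch (1582-10-15) and the
// Unix epoch (1970-01-01).
const UUID_EPOCH_OFFSET = 122192928000000000n;

// Extract the embedded timestamp of a v1 UUID as a Date (millisecond
// precision; the UUID itself carries 100 ns resolution).
function v1UuidToDate(uuid) {
  const parts = uuid.toLowerCase().split('-');
  if (parts.length !== 5 || parts[2][0] !== '1') {
    throw new Error(`Not a v1 UUID: ${uuid}`);
  }
  // 60-bit timestamp = time_hi (12 bits) + time_mid (16) + time_low (32).
  const ticks = BigInt('0x' + parts[2].slice(1) + parts[1] + parts[0]);
  const unixMs = (ticks - UUID_EPOCH_OFFSET) / 10000n;
  return new Date(Number(unixMs));
}

// Hypothetical producer-side helper: add an ISO 8601 `ts` derived from
// `id` when the client supplied only the UUID.
function withTimestamp(event) {
  if (event.ts) return event;
  return { ...event, ts: v1UuidToDate(event.id).toISOString() };
}

console.log(withTimestamp({ id: 'b54adc00-67f9-11d9-9669-0800200c9a66' }));
```

`BigInt` is used because the 60-bit tick count does not fit into a JavaScript number without losing precision.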
[Wikidata-bugs] [Maniphest] [Changed Subscribers] T116247: Define edit related events for change propagation
GWicke added a subscriber: EBernhardson. TASK DETAIL https://phabricator.wikimedia.org/T116247 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation
GWicke added a comment. Some notes from the meeting: 1. Framing, for all events - **uri**: string; path or url. Example: /en.wikipedia.org/v1/page/title/San_Francisco - **reqid**: v1 UUID <https://en.wikipedia.org/wiki/Universally_unique_identifier#Version_1_.28MAC_address_.26_date-time.29>; corresponding to the `x-request-id` header, or another primary event identifier. V1 UUIDs contain a high-resolution timestamp. - domain: en.wikipedia.org, fr.wiktionary.org,...; No mobile variants. Edit events --- - title: string - pageid - revision: integer - savetime: iso 8601 - Other metadata, like the user etc. - Generally, no overly sensitive information (like client IPs for authenticated edits) in primary events. - Can be included in expanded message in separate topic, or stored separately based on reqid. TASK DETAIL https://phabricator.wikimedia.org/T116247 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, 01tonythomas, jeremyb ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
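As an illustration of the meeting notes above, an edit event with the framing fields might look like the object below. Only the field names and the example `uri` come from the notes; `pageid`, `revision` and `savetime` hold placeholder values, the `reqid` reuses the example v1 UUID quoted elsewhere in this thread, and the check assumes that all three framing fields are required, which the notes do not state.

```javascript
// Framing fields listed in the meeting notes (assumed required here).
const FRAMING_FIELDS = ['uri', 'reqid', 'domain'];

// Hypothetical helper: list framing fields that are absent or empty.
function missingFramingFields(event) {
  return FRAMING_FIELDS.filter(
    (field) => typeof event[field] !== 'string' || event[field] === ''
  );
}

const editEvent = {
  // Framing
  uri: '/en.wikipedia.org/v1/page/title/San_Francisco',
  reqid: 'b54adc00-67f9-11d9-9669-0800200c9a66',
  domain: 'en.wikipedia.org',
  // Edit-specific data (placeholder values)
  title: 'San_Francisco',
  pageid: 1,
  revision: 1,
  savetime: '2015-10-22T18:00:00Z'
};

console.log(missingFramingFields(editEvent));
console.log(missingFramingFields({ title: 'San_Francisco' }));
```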
[Wikidata-bugs] [Maniphest] [Edited] T84923: Reliable publish / subscribe event bus
GWicke edited the task description. TASK DETAIL https://phabricator.wikimedia.org/T84923 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: Aklapper, Matanya, Mattflaschen, Ottomata, mmodell, Eevans, chasemp, brion, Krenair, Halfak, JanZerebecki, bd808, MZMcBride, mobrovac, GWicke, aaron, daniel, Hardikj, yuvipanda, JAllemandou, jkroll, Smalyshev, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Updated] T84923: Reliable publish / subscribe event bus
GWicke added a blocked task: T102476: RFC: Requirements for change propagation. TASK DETAIL https://phabricator.wikimedia.org/T84923 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: Aklapper, Matanya, Mattflaschen, Ottomata, mmodell, Eevans, chasemp, brion, Krenair, Halfak, JanZerebecki, bd808, MZMcBride, mobrovac, GWicke, aaron, daniel, Hardikj, yuvipanda, JAllemandou, jkroll, Smalyshev, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Updated] T116247: Define edit related events for change propagation
GWicke added a blocked task: T102476: RFC: Requirements for change propagation. TASK DETAIL https://phabricator.wikimedia.org/T116247 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: yuvipanda, Hardikj, daniel, aaron, GWicke, mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, jkroll, Smalyshev, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Edited] T116247: Define edit related events for change propagation
GWicke edited the task description. TASK DETAIL https://phabricator.wikimedia.org/T116247 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: yuvipanda, Hardikj, daniel, aaron, GWicke, mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, jkroll, Smalyshev, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Edited] T116247: Define edit related events for change propagation
GWicke edited the task description. TASK DETAIL https://phabricator.wikimedia.org/T116247 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: yuvipanda, Hardikj, daniel, aaron, GWicke, mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, jkroll, Smalyshev, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Created] T116247: Define edit related events for change propagation
GWicke created this task. GWicke added subscribers: Aklapper, Matanya, Mattflaschen, Ottomata, mmodell, Eevans, chasemp, brion, Krenair, Halfak, JanZerebecki, bd808, MZMcBride, mobrovac, GWicke, aaron, daniel, Hardikj, yuvipanda. GWicke added projects: operations, EventBus, Discovery, Epic, Analytics, Wikidata, MediaWiki-General-or-Unknown, Services, Service-Architecture, Wikidata-Query-Service. TASK DESCRIPTION Our (#services) primary focus this quarter is on enabling change propagation for edit-related events. We already track such events in [a custom extension](https://github.com/wikimedia/mediawiki-extensions-RestBaseUpdateJobs/blob/master/RestbaseUpdate.hooks.php), which then creates custom jobs, which in turn perform HTTP requests to RESTBase. Instead, we would like to cover this functionality with more general-purpose events using the event bus: - article creation - article deletion - article undeletion - article edit - article rename - revision deletion / suppression - file upload ## Other use cases - Change propagation between content types - edit triggers Parsoid re-parse, which triggers mobile app service & metadata updates - Wikidata changes - use cases: invalidate pages using specific wikidata items; keeping the #wikidata-query-service up to date - Analytics: https://meta.wikimedia.org/wiki/Research:MediaWiki_events:_a_generalized_public_event_datasource ## Considerations - naming of articles / resources vs. topics vs. subscriptions: Generally use URLs / paths as discussed in T102476 (section "Addressing of components")?
TASK DETAIL https://phabricator.wikimedia.org/T116247 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: yuvipanda, Hardikj, daniel, aaron, GWicke, mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, jkroll, Smalyshev, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Raised Priority] T116247: Define edit related events for change propagation
GWicke raised the priority of this task from "Normal" to "High". GWicke set Security to None. TASK DETAIL https://phabricator.wikimedia.org/T116247 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: yuvipanda, Hardikj, daniel, aaron, GWicke, mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, jkroll, Smalyshev, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T114443: EventBus MVP
GWicke added a comment. We are having a hangout meeting tomorrow (Thursday, 22nd) between 11am and 12pm SF time. Please let us know if you'd like to join. TASK DETAIL https://phabricator.wikimedia.org/T114443 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata, GWicke Cc: mark, MZMcBride, Krinkle, EBernhardson, bd808, Joe, dr0ptp4kt, madhuvishy, Nuria, ori, faidon, aaron, GWicke, mobrovac, Eevans, Ottomata, Matanya, Aklapper, JAllemandou, jkroll, Smalyshev, Hardikj, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, daniel, RobLa-WMF, Jay8g, Ltrlg, jeremyb, Legoktm ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T114443: EventBus MVP
GWicke added a comment. A PR adding remote schema support to the nodejs frontend is now available at https://github.com/wikimedia/restevent/pull/1. This means that we can now choose to use local or remote schemas per-topic in the configuration. TASK DETAIL https://phabricator.wikimedia.org/T114443 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata, GWicke Cc: mark, MZMcBride, Krinkle, EBernhardson, bd808, Joe, dr0ptp4kt, madhuvishy, Nuria, ori, faidon, aaron, GWicke, mobrovac, Eevans, Ottomata, Matanya, Aklapper, JAllemandou, jkroll, Smalyshev, Hardikj, Wikidata-bugs, Jdouglas, RobH, aude, Deskana, Manybubbles, daniel, JanZerebecki, RobLa-WMF, Jay8g, Ltrlg, fgiunchedi, Dzahn, jeremyb, Legoktm, chasemp, Krenair ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T114443: EventBus MVP
GWicke added a comment. > For starters, it means that we have alternatives for environments where Kafka > is overkill (small third-party installations, dev environments, mw-vagrant, > etc). Using, for example, sqlite instead of Kafka is already something > supported. As far as I can see, there is no support for using any database as a queue / log in a way that would give us a light-weight alternative to Kafka. There is generally no support for streaming from a database in EventLogging, and separate tables are created whenever a schema is changed. We'll have to implement this either way. We do have fairly nice async table abstractions for sqlite and cassandra that we could reuse for this in node. Both already implement retention policies. Python has sqlalchemy, which is a pretty nice way to interface with dbs. Retention policies would have to be implemented manually. TASK DETAIL https://phabricator.wikimedia.org/T114443 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata, GWicke Cc: mark, MZMcBride, Krinkle, EBernhardson, bd808, Joe, dr0ptp4kt, madhuvishy, Nuria, ori, faidon, aaron, GWicke, mobrovac, Eevans, Ottomata, Matanya, Aklapper, JAllemandou, jkroll, Smalyshev, Hardikj, Wikidata-bugs, Jdouglas, RobH, aude, Deskana, Manybubbles, daniel, JanZerebecki, RobLa-WMF, Jay8g, fgiunchedi, Dzahn, jeremyb, Legoktm, chasemp, Krenair ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T114443: EventBus MVP
GWicke added a comment.

In https://phabricator.wikimedia.org/T114443#1731399, @ori wrote:
> In https://phabricator.wikimedia.org/T114443#1731284, @GWicke wrote:
>
> > See https://phabricator.wikimedia.org/T88459#1604768. tl;dr: It's not
> > necessarily clear that saving very little code (see above) for EL schema
> > fetching outweighs the cost of additional hardware.
>
> Could you explain how you arrived at the figure of 50k requests per second,
> which you project for this service?

This is @ottomata's projection for analytics use cases. For core events, throughput should be less of a concern, as rates will likely be in the low hundreds of messages per second.

TASK DETAIL https://phabricator.wikimedia.org/T114443 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata, GWicke Cc: mark, MZMcBride, Krinkle, EBernhardson, bd808, Joe, dr0ptp4kt, madhuvishy, Nuria, ori, faidon, aaron, GWicke, mobrovac, Eevans, Ottomata, Matanya, Aklapper, JAllemandou, jkroll, Smalyshev, Hardikj, Wikidata-bugs, Jdouglas, RobH, aude, Deskana, Manybubbles, daniel, JanZerebecki, RobLa-WMF, Jay8g, fgiunchedi, Dzahn, jeremyb, Legoktm, chasemp, Krenair ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T114443: EventBus MVP
GWicke added a comment.

In https://phabricator.wikimedia.org/T114443#1730753, @Eevans wrote:
> 1. Already leverages a (really slick) JSON schema registry
> <https://meta.wikimedia.org/wiki/Category:Schemas_%28active%29?status=active>

Optionally fetching schemas from a URL really isn't that hard. Example code:

    if (/^https?:\/\//.test(schema)) {
        return preq.get(schema);
    } else {
        return readFromFile(schema);
    }

This lets us support files for core events, and fetching schemas from meta for EL. Schema validation is a call to a library.

> 1. Provides a pluggable, composable, architecture with support for a wide
> range of readers/writers

How would this be an advantage for the EventBus portion? Many third-party users will actually only want a minimal event bus, and EL doesn't seem to help with this from what I have seen.

> - schema registry availability

There are more concerns here than just availability (although that's important, too). Third-party users won't necessarily want to give their service access to the internet in order to fetch schemas. We need to provide a way to retrieve a full set of core schemas, and a git repository is an easy way to achieve this. We also need proper code review and versioning for core schemas, and wikis don't really support code review.

We could consider storing pointers to schemas (URLs) instead of the actual schemas in git, but this adds complexity without much apparent benefit:

Workflow with schemas in git:
1. create a patch with a schema change
2. code review

Workflow with pointers to schemas (URLs) in git:
1. save a new schema on meta; note revision id
2. create a patch with a schema URL change
3. code review

> For performance, it needs to be Good Enough(tm), where Good Enough should be
> something we can quantify based on factors like latency, throughput, and
> capacity costs that aren't prohibitively expensive when weighed against other
> factors (e.g. engineering effort).

See https://phabricator.wikimedia.org/T88459#1604768. tl;dr: It's not necessarily clear that saving very little code (see above) for EL schema fetching outweighs the cost of additional hardware.

TASK DETAIL https://phabricator.wikimedia.org/T114443 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata, GWicke Cc: mark, MZMcBride, Krinkle, EBernhardson, bd808, Joe, dr0ptp4kt, madhuvishy, Nuria, ori, faidon, aaron, GWicke, mobrovac, Eevans, Ottomata, Matanya, Aklapper, JAllemandou, jkroll, Smalyshev, Hardikj, Wikidata-bugs, Jdouglas, RobH, aude, Deskana, Manybubbles, daniel, JanZerebecki, RobLa-WMF, Jay8g, fgiunchedi, Dzahn, jeremyb, Legoktm, chasemp, Krenair ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
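The local-or-remote schema lookup from the message above can be expanded into a small self-contained sketch. This is an illustration under stated assumptions, not the code from the restevent PR: `fetchRemote` is a hypothetical injected function (standing in for an HTTP client call such as `preq.get`), and the topic names, file path and URL in `topicSchemas` are made up.

```javascript
const fs = require('fs');

// Decide whether a schema reference is a remote URL or a local file path.
function isRemoteSchema(schemaRef) {
    return /^https?:\/\//.test(schemaRef);
}

// Load a schema by reference and return a Promise for the parsed schema.
// `fetchRemote` is a hypothetical, injected function that takes a URL and
// returns a Promise; injecting it keeps this sketch free of network code.
function loadSchema(schemaRef, fetchRemote) {
    if (isRemoteSchema(schemaRef)) {
        return fetchRemote(schemaRef);
    }
    return fs.promises.readFile(schemaRef, 'utf8').then(JSON.parse);
}

// Hypothetical per-topic configuration: core events use schemas checked
// into git, while an EL-style topic points at a remote schema URL.
const topicSchemas = {
    'core.edit': './schemas/edit.json',
    'el.example': 'https://meta.example.org/schemas/Example',
};
```

With a configuration like `topicSchemas`, each topic's schema would be resolved through `loadSchema(topicSchemas[topic], fetchRemote)`, so a deployment without internet access only needs entries that point at local files.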
[Wikidata-bugs] [Maniphest] [Changed Project Column] T114443: EventBus MVP
GWicke moved this task to Ready for RFC meeting on the MediaWiki-RfCs workboard. TASK DETAIL https://phabricator.wikimedia.org/T114443 WORKBOARD https://phabricator.wikimedia.org/project/board/52/ EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: EBernhardson, bd808, Joe, dr0ptp4kt, madhuvishy, Nuria, ori, faidon, aaron, GWicke, mobrovac, Eevans, Ottomata, Matanya, Aklapper, JAllemandou, jkroll, Smalyshev, Hardikj, Wikidata-bugs, Jdouglas, RobH, aude, Deskana, Manybubbles, daniel, mark, JanZerebecki, RobLa-WMF, Jay8g, fgiunchedi, Dzahn, jeremyb, Legoktm, chasemp, Krenair ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Updated] T114443: EventBus MVP
GWicke added a project: MediaWiki-RfCs. TASK DETAIL https://phabricator.wikimedia.org/T114443 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: EBernhardson, bd808, Joe, dr0ptp4kt, madhuvishy, Nuria, ori, faidon, aaron, GWicke, mobrovac, Eevans, Ottomata, Matanya, Aklapper, JAllemandou, jkroll, Smalyshev, Hardikj, Wikidata-bugs, Jdouglas, RobH, aude, Deskana, Manybubbles, daniel, mark, JanZerebecki, RobLa-WMF, Jay8g, fgiunchedi, Dzahn, jeremyb, Legoktm, chasemp, Krenair ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T114443: EventBus MVP
GWicke added a comment.

I guess we have slightly different ideas about what a message bus should be:

1. a way to get blobs from a to b, and
2. a way to expose a stream of events in a defined format that can be consumed easily by a range of clients.

The use cases I care about require 2). Applying my interpretation of the Robustness Principle <https://en.wikipedia.org/wiki/Robustness_principle> to that use case means thoroughly checking / coercing things on the way in, and keeping promises on the way out.

I also agree that it is possible to implement 2) by writing directly to Kafka, provided that *each* producer

- emits only events satisfying the expected (current) schema,
- never writes to queues it shouldn't write to (access control), and
- is fully aware of internal optimizations such as binary encodings and compression specific to the event queue implementation and topic.

Add to this requirements like emitting per-topic metrics, and I think it becomes clear why limiting the number of implementations is desirable.

I also think that we should look at actual data before making assumptions about latency. For example, simple Kafka clients establish a new TCP connection per write, and might even fetch metadata for each connection. The simple REST service (120 lines) processes 1100+ req/s with a mean enqueue latency of around 10ms, with both Kafka and the service running on a single-core labs instance. At production load, low single-digit ms should be typical. There will be use cases where sub-ms latency or extremely high volume is needed & REST is not a good fit, but let's base decisions around that on actual data.

Regarding the monolog backend, my understanding based on https://phabricator.wikimedia.org/T108618 and conversations is that this is primarily aiming to ship events to hadoop for later analysis. As such, its message format is geared towards that use case, and no effort has been made to generalize events and their representation for general use.

That said, we *could* consider using the monolog integration for emitting more general events from MediaWiki, but would then also need to implement support for alternative backends, and ensure that schemas agree.

TASK DETAIL https://phabricator.wikimedia.org/T114443 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: EBernhardson, bd808, Joe, dr0ptp4kt, madhuvishy, Nuria, ori, faidon, aaron, GWicke, mobrovac, Halfak, Eevans, Ottomata, Matanya, Aklapper, JAllemandou, jkroll, Smalyshev, Hardikj, Wikidata-bugs, Jdouglas, RobH, aude, Deskana, Manybubbles, mark, JanZerebecki, RobLa-WMF, fgiunchedi, Dzahn, jeremyb, chasemp, Krenair ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
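The "check things on the way in" idea from the message above, covering the schema and access-control requirements, might be sketched as follows. Everything here is hypothetical: the `topics` table, its field names and the `checkEvent` function are illustrative only, and the hand-rolled type check stands in for a real JSON schema validation library.

```javascript
// Hypothetical per-topic configuration: required fields with their
// expected JavaScript types, plus the producers allowed to write.
const topics = {
    'core.edit': {
        required: { title: 'string', rev_id: 'number' },
        allowedProducers: ['mediawiki'],
    },
};

// Check an incoming event before it is enqueued. Returns a list of
// problems; an empty list means the event can be accepted.
function checkEvent(topicName, producer, event) {
    const topic = topics[topicName];
    if (!topic) {
        return [`unknown topic: ${topicName}`];
    }
    if (event === null || typeof event !== 'object') {
        return ['event must be an object'];
    }
    const problems = [];
    // Access control: only listed producers may write to this topic.
    if (!topic.allowedProducers.includes(producer)) {
        problems.push(`producer ${producer} may not write to ${topicName}`);
    }
    // Minimal stand-in for schema validation: required fields and types.
    for (const [field, type] of Object.entries(topic.required)) {
        if (typeof event[field] !== type) {
            problems.push(`field ${field} must be of type ${type}`);
        }
    }
    return problems;
}
```

Keeping these checks in one service rather than in every producer is the point made above: the schema and access rules are then enforced in a single implementation.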
[Wikidata-bugs] [Maniphest] [Commented On] T114443: EventBus MVP
GWicke added a comment. @ori, I changed the text to clarify which of those are potential, and which are concrete plans for this quarter. Please follow the provided links if things are still unclear. TASK DETAIL https://phabricator.wikimedia.org/T114443 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: EBernhardson, bd808, Joe, dr0ptp4kt, madhuvishy, Nuria, ori, faidon, aaron, GWicke, mobrovac, Halfak, Eevans, Ottomata, Matanya, Aklapper, JAllemandou, jkroll, Smalyshev, Hardikj, Wikidata-bugs, Jdouglas, RobH, aude, Deskana, Manybubbles, mark, JanZerebecki, RobLa-WMF, fgiunchedi, Dzahn, jeremyb, chasemp, Krenair ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Edited] T114443: EventBus MVP
GWicke edited the task description. TASK DETAIL https://phabricator.wikimedia.org/T114443 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: EBernhardson, bd808, Joe, dr0ptp4kt, madhuvishy, Nuria, ori, faidon, aaron, GWicke, mobrovac, Halfak, Eevans, Ottomata, Matanya, Aklapper, JAllemandou, jkroll, Smalyshev, Hardikj, Wikidata-bugs, Jdouglas, RobH, aude, Deskana, Manybubbles, mark, JanZerebecki, RobLa-WMF, fgiunchedi, Dzahn, jeremyb, chasemp, Krenair ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Edited] T114443: EventBus MVP
GWicke edited the task description. TASK DETAIL https://phabricator.wikimedia.org/T114443 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: EBernhardson, bd808, Joe, dr0ptp4kt, madhuvishy, Nuria, ori, faidon, aaron, GWicke, mobrovac, Halfak, Eevans, Ottomata, Matanya, Aklapper, JAllemandou, jkroll, Smalyshev, Hardikj, Wikidata-bugs, Jdouglas, RobH, aude, Deskana, Manybubbles, mark, JanZerebecki, RobLa-WMF, fgiunchedi, Dzahn, jeremyb, chasemp, Krenair ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T114443: EventBus MVP
GWicke added a comment. @Nuria, see the task description, heading "Initial use cases". TASK DETAIL https://phabricator.wikimedia.org/T114443 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: EBernhardson, bd808, Joe, dr0ptp4kt, madhuvishy, Nuria, ori, faidon, aaron, GWicke, mobrovac, Halfak, Eevans, Ottomata, Matanya, Aklapper, JAllemandou, jkroll, Smalyshev, Hardikj, Wikidata-bugs, Jdouglas, RobH, aude, Deskana, Manybubbles, mark, JanZerebecki, RobLa-WMF, fgiunchedi, Dzahn, jeremyb, chasemp, Krenair ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T114443: EventBus MVP
GWicke added a comment. @ottomata, yes. One of the motivations for having a REST interface is having,... an interface. TASK DETAIL https://phabricator.wikimedia.org/T114443 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: EBernhardson, bd808, Joe, dr0ptp4kt, madhuvishy, Nuria, ori, faidon, aaron, GWicke, mobrovac, Halfak, Eevans, Ottomata, Matanya, Aklapper, JAllemandou, jkroll, Smalyshev, Hardikj, Wikidata-bugs, Jdouglas, RobH, aude, Deskana, Manybubbles, mark, JanZerebecki, RobLa-WMF, fgiunchedi, Dzahn, jeremyb, chasemp, Krenair ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T114443: EventBus MVP
GWicke added a comment. @ottomata, main reason would be the ability to work with $simple_queue, $binary_kafka, $amazon_queue and so on without changes in MW code. This isn't so theoretical. We'll want a lighter-weight queue for testing, developers and third party users rather soon. TASK DETAIL https://phabricator.wikimedia.org/T114443 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: EBernhardson, bd808, Joe, dr0ptp4kt, madhuvishy, Nuria, ori, faidon, aaron, GWicke, mobrovac, Halfak, Eevans, Ottomata, Matanya, Aklapper, JAllemandou, jkroll, Smalyshev, Hardikj, Wikidata-bugs, Jdouglas, RobH, aude, Deskana, Manybubbles, mark, JanZerebecki, RobLa-WMF, fgiunchedi, Dzahn, jeremyb, chasemp, Krenair ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
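One way to picture the backend independence mentioned in the message above is a thin producer interface in front of interchangeable queue backends. The sketch below is an assumption-laden illustration: `MemoryQueueBackend`, `EventProducer`, `createProducer` and the `backend` configuration key are all hypothetical names, and only an in-memory backend is shown; Kafka or hosted-queue backends would be registered the same way.

```javascript
// Hypothetical in-memory backend, e.g. for tests and development setups.
class MemoryQueueBackend {
    constructor() {
        this.topics = new Map();
    }

    // Store the event under its topic and resolve once it is "enqueued".
    enqueue(topic, event) {
        if (!this.topics.has(topic)) {
            this.topics.set(topic, []);
        }
        this.topics.get(topic).push(event);
        return Promise.resolve();
    }
}

// Producers only talk to this class; the backend behind it is chosen by
// configuration, so producer code does not change when the queue does.
class EventProducer {
    constructor(backend) {
        this.backend = backend;
    }

    emit(topic, event) {
        return this.backend.enqueue(topic, event);
    }
}

// Hypothetical registry mapping a configuration value to a backend factory.
const backends = new Map([
    ['memory', () => new MemoryQueueBackend()],
]);

function createProducer(config) {
    const factory = backends.get(config.backend);
    if (!factory) {
        throw new Error(`unknown backend: ${config.backend}`);
    }
    return new EventProducer(factory());
}
```

A caller would use `createProducer({ backend: 'memory' }).emit('core.edit', event)`; switching to another queue would then be a configuration change rather than a code change in the producer.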
[Wikidata-bugs] [Maniphest] [Edited] T114443: EventBus MVP
GWicke edited the task description. TASK DETAIL https://phabricator.wikimedia.org/T114443 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: dr0ptp4kt, madhuvishy, Nuria, ori, faidon, aaron, GWicke, mobrovac, Halfak, Eevans, Ottomata, Matanya, Aklapper, JAllemandou, jkroll, Smalyshev, Hardikj, Wikidata-bugs, Jdouglas, RobH, aude, Deskana, Manybubbles, mark, JanZerebecki, RobLa-WMF, bd808, fgiunchedi, Dzahn, jeremyb, chasemp, Krenair ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T114443: EventBus MVP
GWicke added a comment. I have now integrated some of those changes into the description. TASK DETAIL https://phabricator.wikimedia.org/T114443 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: dr0ptp4kt, madhuvishy, Nuria, ori, faidon, aaron, GWicke, mobrovac, Halfak, Eevans, Ottomata, Matanya, Aklapper, JAllemandou, jkroll, Smalyshev, Hardikj, Wikidata-bugs, Jdouglas, RobH, aude, Deskana, Manybubbles, mark, JanZerebecki, RobLa-WMF, bd808, fgiunchedi, Dzahn, jeremyb, chasemp, Krenair ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T114443: EventBus MVP
GWicke added a comment.

@ottomata, I would perhaps leave out the section "The MVP might also include". Much of it isn't so minimally viable, IMHO.

Re use cases, the following two have been driving the original idea and discussion:

1. Provide edit-related events (ex: edit, creation, deletion, revision deletion, rename). Consumers: RESTBase / change propagation service, potential purge service, potentially RCStream, analytics.
2. EventLogging: Decode, validate and enqueue JSON events for EL.

TASK DETAIL https://phabricator.wikimedia.org/T114443 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: dr0ptp4kt, madhuvishy, Nuria, ori, faidon, aaron, GWicke, mobrovac, Halfak, Eevans, Ottomata, Matanya, Aklapper, JAllemandou, jkroll, Smalyshev, Hardikj, Wikidata-bugs, Jdouglas, RobH, aude, Deskana, Manybubbles, mark, JanZerebecki, RobLa-WMF, bd808, fgiunchedi, Dzahn, jeremyb, chasemp, Krenair ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T107595: RFC: Multi-Content Revisions
GWicke added a comment. @daniel, your revised version seems to focus even more on implementing storage systems, change propagation etc., rather than on defining a data access interface for MediaWiki that can be backed by services. Could you clarify how you see this relating to ongoing efforts with similar goals and use cases, like a) RESTBase offering a lot of the storage & API functionality (beyond blob storage), and b) the event bus, dependency tracking and change propagation work in https://phabricator.wikimedia.org/T102476 and friends? TASK DETAIL https://phabricator.wikimedia.org/T107595 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: daniel, GWicke Cc: Lydia_Pintscher, cscott, PleaseStand, awight, Ricordisamoa, GWicke, MarkTraceur, waldyrious, Legoktm, Aklapper, Jdforrester-WMF, Ltrlg, brion, Spage, MZMcBride, daniel, Wikidata-bugs, aude, Jay8g, bd808 ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Edited] T84923: Reliable publish / subscribe event bus
GWicke edited the task description. TASK DETAIL https://phabricator.wikimedia.org/T84923 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GWicke Cc: Aklapper, Matanya, Mattflaschen, Ottomata, mmodell, Eevans, chasemp, brion, Krenair, Halfak, JanZerebecki, bd808, MZMcBride, mobrovac, GWicke, aaron, daniel, Hardikj, yuvipanda, JAllemandou, jkroll, Smalyshev, Wikidata-bugs, Jdouglas, RobH, aude, Deskana, Manybubbles, mark, RobLa-WMF, faidon, fgiunchedi, Dzahn, jeremyb, Malyacko ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs