[Wikidata-bugs] [Maniphest] [Unassigned] T105845: Page components / content widgets

2017-10-11 Thread GWicke
GWicke removed GWicke as the assignee of this task.
TASK DETAIL: https://phabricator.wikimedia.org/T105845
EMAIL PREFERENCES: https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: GWicke
Cc: leila, Reasno, SBisson, MZMcBride, Mholloway, RandomDSdevel, jmadler, Bianjiang, LikeLifer, MGChecker, -jem-, Daniel_Mietchen, StudiesWorld, Kelson, Jonas, daniel, Jhernandez, MrStradivarius, JanZerebecki, Quiddity, mobrovac, ssastry, Tgr, Ltrlg, Inez, cscott, TrevorParscal, Jdlrobson, GWicke, GoranSMilovanovic, QZanden, Luke081515, Jrf, Wikidata-bugs, aude, Gryllida, jayvdb, fbstj, RobLa-WMF, santhosh, Arlolra, Jdforrester-WMF, Mbch331, Rxy, Jay8g, bd808, Krenair, Legoktm
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Unassigned] T99088: [RFC] Evolving our content platform: Content adaptability, structured data and caching

2017-10-11 Thread GWicke
GWicke removed GWicke as the assignee of this task.
TASK DETAIL: https://phabricator.wikimedia.org/T99088
EMAIL PREFERENCES: https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: GWicke
Cc: jmadler, Bianjiang, gpaumier, LikeLifer, Mholloway, MZMcBride, RobLa-WMF, StudiesWorld, Qgil, Tgr, JanZerebecki, RobLa, cscott, Dbrant, Smalyshev, greg, bearND, JKatzWMF, Gilles, Ltrlg, Fhocutt, Jhernandez, Joe, BBlack, Tnegrin, mark, faidon, Tfinc, TrevorParscal, damons, Anomie, bd808, dr0ptp4kt, daniel, BGerstle-WMF, tstarling, ssastry, mobrovac, Catrope, ori, brion, GWicke, GoranSMilovanovic, QZanden, merbst, Luke081515, Wikidata-bugs, aude, fbstj, Mbch331, Jay8g, Krenair, Legoktm


[Wikidata-bugs] [Maniphest] [Commented On] T175316: Very large jobs posted by Wikidata

2017-09-14 Thread GWicke
GWicke added a comment.
Looks like adding the JSON_UNESCAPED_UNICODE flag should do it: http://php.net/manual/en/function.json-encode.php
TASK DETAIL: https://phabricator.wikimedia.org/T175316
EMAIL PREFERENCES: https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: Pchelolo, GWicke
Cc: mobrovac, Stashbot, Legoktm, hoo, Addshore, aude, gerritbot, Ladsgroup, daniel, GWicke, Aklapper, Pchelolo, Lordiis, GoranSMilovanovic, Adik2382, Th3d3v1ls, Ramalepe, Liugev6, QZanden, Lewizho99, Maathavan, Izno, Eevans, JAllemandou, Hardikj, Wikidata-bugs, Mbch331, jeremyb
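For readers unfamiliar with the flag suggested in the comment above: by default PHP's json_encode() writes non-ASCII characters as \uXXXX escapes, and JSON_UNESCAPED_UNICODE emits them literally instead. The sketch below uses Python's json.dumps, whose ensure_ascii switch behaves analogously, to illustrate the size effect. The payload is made up for illustration and is not taken from an actual Wikidata job.

```python
import json

# Hypothetical job parameter: a string made of non-ASCII (Cyrillic) characters.
payload = {"title": "Москва" * 1000}

# Default behaviour, comparable to json_encode() without flags:
# every non-ASCII character becomes a six-byte \uXXXX escape.
escaped = json.dumps(payload)

# Comparable to json_encode(..., JSON_UNESCAPED_UNICODE):
# characters are written literally and encoded as UTF-8 on the wire.
unescaped = json.dumps(payload, ensure_ascii=False)

escaped_bytes = len(escaped.encode("utf-8"))
unescaped_bytes = len(unescaped.encode("utf-8"))
print(f"escaped: {escaped_bytes} bytes, unescaped: {unescaped_bytes} bytes")
```

Both encodings decode to the same value, so the choice only affects the serialized size, not the data.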


[Wikidata-bugs] [Maniphest] [Updated] T175316: Very large jobs posted by Wikidata

2017-09-13 Thread GWicke
GWicke added a comment.
Raised priority, as this a) is blocking the migration to the Kafka job queue backend (T157088), and b) is likely already causing performance and possibly reliability issues in the current job queue.
TASK DETAIL: https://phabricator.wikimedia.org/T175316
EMAIL PREFERENCES: https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: Pchelolo, GWicke
Cc: Ladsgroup, daniel, GWicke, Aklapper, Pchelolo, GoranSMilovanovic, QZanden, Izno, Eevans, JAllemandou, mobrovac, Hardikj, Wikidata-bugs, aude, Mbch331, jeremyb


[Wikidata-bugs] [Maniphest] [Raised Priority] T175316: Very large jobs posted by Wikidata

2017-09-13 Thread GWicke
GWicke raised the priority of this task from "Normal" to "High".
TASK DETAIL: https://phabricator.wikimedia.org/T175316
EMAIL PREFERENCES: https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: Pchelolo, GWicke
Cc: Ladsgroup, daniel, GWicke, Aklapper, Pchelolo, GoranSMilovanovic, QZanden, Izno, Eevans, JAllemandou, mobrovac, Hardikj, Wikidata-bugs, aude, Mbch331, jeremyb


[Wikidata-bugs] [Maniphest] [Edited] T175316: Very large jobs posted by Wikidata

2017-09-08 Thread GWicke
GWicke updated the task description.
CHANGES TO TASK DESCRIPTION
...
"2653965": [6, "Meyers_b9_s0043.jpg"],

(..a few million further page entries...)
},
...
TASK DETAIL: https://phabricator.wikimedia.org/T175316
EMAIL PREFERENCES: https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: Pchelolo, GWicke
Cc: daniel, GWicke, Aklapper, Pchelolo, GoranSMilovanovic, QZanden, Izno, Eevans, JAllemandou, mobrovac, Hardikj, Wikidata-bugs, aude, Mbch331, jeremyb


[Wikidata-bugs] [Maniphest] [Commented On] T175316: Very large jobs posted by Wikidata

2017-09-08 Thread GWicke
GWicke added a comment.
@Pchelolo, based on our previous conversation about this, I am assuming that the bulk of the task is a very large list of pages. Is this correct?
TASK DETAIL: https://phabricator.wikimedia.org/T175316
EMAIL PREFERENCES: https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: Pchelolo, GWicke
Cc: daniel, GWicke, Aklapper, Pchelolo, GoranSMilovanovic, QZanden, Izno, Eevans, JAllemandou, mobrovac, Hardikj, Wikidata-bugs, aude, Mbch331, jeremyb


[Wikidata-bugs] [Maniphest] [Commented On] T173710: Job queue is increasing non-stop

2017-08-31 Thread GWicke
GWicke added a comment.
I updated https://gerrit.wikimedia.org/r/#/c/295027/ to apply on current master. This removes CDN purges from HTMLCacheUpdate, and only performs them after RefreshLinks, and only if nothing else caused a re-render since.

With this patch applied, we should be able to reduce the throttling for HTMLCacheUpdate jobs without endangering the CDN infrastructure with bursts of purges.
TASK DETAIL: https://phabricator.wikimedia.org/T173710
EMAIL PREFERENCES: https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: GWicke
Cc: Legoktm, ema, Joe, GWicke, Nemo_bis, Andreasmperu, BBlack, Peachey88, Liuxinyu970226, daniel, Stashbot, Agabi10, Daniel_Mietchen, Harej, XXN, Pasleim, Bugreporter, Sjoerddebruin, Magnus, Mr.Ibrahem, Emijrp, gerritbot, EBernhardson, Esc3300, jcrespo, WMDE-leszek, Jdforrester-WMF, Krinkle, aaron, fgiunchedi, Aklapper, Ladsgroup, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, EBjune, Vali.matei, Avner, Zppix, debt, Gehel, FloNight, Izno, Eevans, mobrovac, Hardikj, Wikidata-bugs, aude, jayvdb, faidon, Mbch331, Jay8g, jeremyb


[Wikidata-bugs] [Maniphest] [Commented On] T173710: Job queue is increasing non-stop

2017-08-30 Thread GWicke
GWicke added a comment.
HTMLCacheUpdate root job timestamp distribution, jobs executed within the last 15 hours (execution count, then root job date as YYYYMMDD):

   1233 20170407
   8237 20170408
     18 20170423
     18 20170426
     20 20170429
     50 20170430
     18 20170502
     18 20170504
     20 20170509
     10 20170512
     18 20170513
     16 20170523
     22 20170528
     10 20170529
     40 20170606
     20 20170617
     18 20170622
     21 20170625
     16 20170627
     10 20170628
     10 20170630
     36 20170701
     20 20170705
     28 20170708
     18 20170712
     10 20170715
     16 20170717
     18 20170724
     42 20170725
     20 20170726
     20 20170728
     17 20170729
     34 20170803
     46 20170804
     30 20170805
     50 20170807
     54 20170808
    260 20170809
    137 20170810
     16 20170811
     17 20170812
     84 20170813
     36 20170814
     10 20170815
     72 20170816
    445 20170817
     82 20170818
     67 20170819
  21452 20170820
   1825 20170821
     81 20170822
    176 20170823
   4810 20170824
   9773 20170825
  21842 20170826
 218770 20170827
   8087 20170828
 183142 20170829
3805398 20170830

TASK DETAIL: https://phabricator.wikimedia.org/T173710
EMAIL PREFERENCES: https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: GWicke
Cc: GWicke, Nemo_bis, Andreasmperu, BBlack, Peachey88, Liuxinyu970226, daniel, Stashbot, Agabi10, Daniel_Mietchen, Harej, XXN, Pasleim, Bugreporter, Sjoerddebruin, Magnus, Mr.Ibrahem, Emijrp, gerritbot, EBernhardson, Esc3300, jcrespo, WMDE-leszek, Jdforrester-WMF, Krinkle, aaron, fgiunchedi, Aklapper, Ladsgroup, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, EBjune, Vali.matei, Avner, Zppix, debt, Gehel, FloNight, Izno, Eevans, mobrovac, Hardikj, Wikidata-bugs, aude, jayvdb, faidon, Mbch331, Jay8g, jeremyb


[Wikidata-bugs] [Maniphest] [Updated] T173710: Job queue is increasing non-stop

2017-08-30 Thread GWicke
GWicke added a comment.
A possible contribution to the backlog could be the infinite retry / immortal job problem described in T73853. I looked for old htmlCacheUpdate root jobs from April that are still executing over four months later (!) by running grep htmlCacheUpdate runJobs.log | grep -c 'rootJobTimestamp=201704' in mwlog1001:/srv/mw-log, which yields 9208 executions just today. Interestingly, jobs from May, June, and July are much less common (hundreds). Considering that HTMLCacheUpdateJob basically only updates touched timestamps in the DB and then quickly fires off CDN purges, seeing anything but zero ancient jobs might mean that T73853 is not actually resolved yet. To establish whether this significantly contributes to the current backlog, we would need to look at the distribution of rootJobTimestamp values for htmlCacheUpdates from July, especially for the period since the backlog growth really started around the 8th.
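The per-day distribution of rootJobTimestamp values mentioned above could be computed roughly as follows. This is a minimal sketch that generalizes the grep pipeline from a single month to all days; it assumes only that each log line contains the job type and a rootJobTimestamp=YYYYMMDDHHMMSS field. The sample lines and the function name are invented for illustration, and the real runJobs.log layout may differ.

```python
import re
from collections import Counter

# Captures the day part (YYYYMMDD) of a rootJobTimestamp=YYYYMMDDHHMMSS field.
ROOT_JOB_DAY = re.compile(r"rootJobTimestamp=(\d{8})")


def root_job_day_counts(lines, job_type="htmlCacheUpdate"):
    """Count executions of one job type, grouped by root job day."""
    counts = Counter()
    for line in lines:
        if job_type not in line:
            continue  # same filter as the first grep in the pipeline
        match = ROOT_JOB_DAY.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Invented sample lines, not real log output.
sample_log = [
    "htmlCacheUpdate Template:Foo rootJobTimestamp=20170407101500 status=ok",
    "htmlCacheUpdate Template:Bar rootJobTimestamp=20170407235959 status=ok",
    "refreshLinks Template:Foo rootJobTimestamp=20170407101500 status=ok",
    "htmlCacheUpdate Template:Baz rootJobTimestamp=20170830120000 status=ok",
]

for day, count in sorted(root_job_day_counts(sample_log).items()):
    print(f"{count:7d} {day}")
```

To run this against a real log, pass an open file object instead of sample_log; the function reads it line by line, so the file does not have to fit in memory.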

The general HTMLCacheUpdate / purge volume problem was previously discussed in T124418. At the time, I posted https://gerrit.wikimedia.org/r/#/c/295027/, which would move the vast majority of CDN purges to RefreshLinksJob and thereby make purges less bursty. I think we could dust this off quite easily.

Finally, since EventBus and Kafka were brought up, let me clarify: No jobs are executed via EventBus / Kafka so far. We did start writing copies of job specs to EventBus on August 2nd (phase0), and then enabled this for phase1 on the 16th. The timing does not align with the backlog rise, so it seems unlikely that the double production significantly contributes to this issue.
TASK DETAIL: https://phabricator.wikimedia.org/T173710
EMAIL PREFERENCES: https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: GWicke
Cc: GWicke, Nemo_bis, Andreasmperu, BBlack, Peachey88, Liuxinyu970226, daniel, Stashbot, Agabi10, Daniel_Mietchen, Harej, XXN, Pasleim, Bugreporter, Sjoerddebruin, Magnus, Mr.Ibrahem, Emijrp, gerritbot, EBernhardson, Esc3300, jcrespo, WMDE-leszek, Jdforrester-WMF, Krinkle, aaron, fgiunchedi, Aklapper, Ladsgroup, Lordiis, GoranSMilovanovic, Adik2382, Th3d3v1ls, Hfbn0, Ramalepe, Liugev6, QZanden, EBjune, Vali.matei, Avner, Lewizho99, Zppix, Maathavan, debt, Gehel, FloNight, Izno, Eevans, mobrovac, Hardikj, Wikidata-bugs, aude, jayvdb, faidon, Mbch331, Jay8g, jeremyb


[Wikidata-bugs] [Maniphest] [Updated] T173710: Job queue is increasing non-stop

2017-08-30 Thread GWicke
GWicke added a project: Services (watching).
TASK DETAIL: https://phabricator.wikimedia.org/T173710
EMAIL PREFERENCES: https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: GWicke
Cc: Nemo_bis, Andreasmperu, BBlack, Peachey88, Liuxinyu970226, daniel, Stashbot, Agabi10, Daniel_Mietchen, Harej, XXN, Pasleim, Bugreporter, Sjoerddebruin, Magnus, Mr.Ibrahem, Emijrp, gerritbot, EBernhardson, Esc3300, jcrespo, WMDE-leszek, Jdforrester-WMF, Krinkle, aaron, fgiunchedi, Aklapper, Ladsgroup, Lordiis, GoranSMilovanovic, Adik2382, Th3d3v1ls, Hfbn0, Ramalepe, Liugev6, QZanden, EBjune, Vali.matei, Avner, Lewizho99, Zppix, Maathavan, debt, Gehel, FloNight, Izno, Eevans, mobrovac, Hardikj, Wikidata-bugs, aude, jayvdb, faidon, Mbch331, Jay8g, jeremyb


[Wikidata-bugs] [Maniphest] [Commented On] T172832: Investigate use-cases for delayed job executions

2017-08-28 Thread GWicke
GWicke added a comment.

In T172832#3540031, @Mattflaschen-WMF wrote:
There are three considerations relevant to Echo:


1. Delayed notifications (T156808: Back-end infrastructure for timed notifications in Echo)
   1a. Article reminder notifications (T2582: Remind me of this article in X days)
   1b. User group expiry notifications (T153817: Notify users when their user group membership is about to expire, or has expired)
2. "Batching & rate limiting for Echo notifications".

So that leaves #2. I think this is a reference to ProcessEchoEmailBatch. We currently use an OS cron job for that. If there's another use case you're thinking of, please let us know.

(I also checked the existing Echo jobs, and they do not use getReleaseTimestamp/jobReleaseTimestamp)

@Mattflaschen-WMF, thank you for the background on Echo! Regarding use case #2, can this be fairly summarized as "wait for a limited time before sending out notifications, in order to reduce the volume with batching"?
TASK DETAIL: https://phabricator.wikimedia.org/T172832
EMAIL PREFERENCES: https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: Pchelolo, GWicke
Cc: MaxSem, MarkTraceur, ggellerman, dr0ptp4kt, TrevorParscal, kaldari, phuedx, Fjalapeno, Mattflaschen-WMF, daniel, PokestarFan, hoo, Aklapper, Joe, elukey, Ottomata, Nuria, mobrovac, GWicke, Pchelolo, GoranSMilovanovic, QZanden, Izno, Eevans, JAllemandou, Hardikj, Wikidata-bugs, aude, Mbch331, Jay8g, jeremyb


[Wikidata-bugs] [Maniphest] [Commented On] T172832: Investigate use-cases for delayed job executions

2017-08-16 Thread GWicke
GWicke added a comment.
Some use cases from today's eventbus sync discussion:


Batching & rate limiting for Echo notifications
CirrusSearch for rate limiting / batching
Wikidata for rate limiting, partly work-around for lack of dependency tracking
Delayed HTCP purging (by ~10s) to account for replication lag
TASK DETAIL: https://phabricator.wikimedia.org/T172832
EMAIL PREFERENCES: https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: Pchelolo, GWicke
Cc: Mattflaschen-WMF, daniel, PokestarFan, hoo, Aklapper, Joe, elukey, Ottomata, Nuria, mobrovac, GWicke, Pchelolo, GoranSMilovanovic, QZanden, Izno, Eevans, JAllemandou, Hardikj, Wikidata-bugs, aude, Mbch331, Jay8g, jeremyb


[Wikidata-bugs] [Maniphest] [Updated] T170860: Need ability to block specific query sources for WDQS

2017-07-17 Thread GWicke
GWicke added a comment.
The approach in T167906 (Make API usage limits easier to understand, implement, and more adaptive to varying request costs / concurrency limiting) might help with this problem, especially if you can reduce the allowed concurrency to a fairly small value.
TASK DETAIL: https://phabricator.wikimedia.org/T170860
EMAIL PREFERENCES: https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: GWicke
Cc: GWicke, gerritbot, Esc3300, Gymel, EBernhardson, Lydia_Pintscher, Sjoerddebruin, debt, BBlack, Gehel, Aklapper, Smalyshev, Lordiis, GoranSMilovanovic, Adik2382, Th3d3v1ls, Hfbn0, Ramalepe, Liugev6, QZanden, EBjune, merbst, Avner, Lewizho99, Zppix, Maathavan, Jonas, FloNight, Xmlizer, Izno, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, faidon, Mbch331, Jay8g, fgiunchedi


[Wikidata-bugs] [Maniphest] [Updated] T167787: [Spike 2hr] Investigate ability for page previews in wikidata to appear in user's preferred language

2017-06-20 Thread GWicke
GWicke added a comment.

In T167787#3364625, @phuedx wrote:

In T167787#3364314, @GWicke wrote:
We discussed this in the Reading / Services sync meeting. One question that came up in the discussion is whether including all languages in the response would be feasible from a performance perspective. The advantage of this direction would be no cache fragmentation, the downside a larger response.


Which is the more expensive of the two? Would you like us to spend time figuring out the p50, p75, p95 number of descriptions and labels per item to get a better understanding of the expected size of the response?


Ultimately, I think the median / p99 compressed sizes of responses with a single vs. all languages would be helpful. It's really more about getting an idea of the ballpark we are talking about -- is this < 16 kB, or are we talking about > 100 kB?
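A measurement along these lines could be sketched as follows. This is only an illustration: the responses are made-up placeholder data rather than real Wikidata labels or descriptions, gzip stands in for whatever compression is actually used in production, and the helper names are hypothetical.

```python
import gzip
import json
import statistics


def compressed_size(response) -> int:
    """Size in bytes of the gzip-compressed JSON encoding of a response."""
    body = json.dumps(response, ensure_ascii=False).encode("utf-8")
    return len(gzip.compress(body))


def percentile(values, p: int):
    """Nearest-rank percentile for an integer p between 1 and 100."""
    ordered = sorted(values)
    rank = max(1, (p * len(ordered) + 99) // 100)  # integer ceil(p * n / 100)
    return ordered[rank - 1]


def size_summary(responses):
    """Median and p99 of the compressed sizes of a list of responses."""
    sizes = [compressed_size(r) for r in responses]
    return {"median": statistics.median(sizes), "p99": percentile(sizes, 99)}


# Made-up data: 50 items, each with a description in 20 placeholder languages.
languages = [f"lang{i}" for i in range(20)]
all_languages = [
    {lang: f"description of item {item} in {lang}" for lang in languages}
    for item in range(50)
]
# The same items restricted to a single language.
single_language = [{"lang0": r["lang0"]} for r in all_languages]

print("all languages:  ", size_summary(all_languages))
print("single language:", size_summary(single_language))
```

Real numbers would have to come from a sample of actual items; the placeholder strings here compress far better than natural-language text in many scripts would.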



If we determine that returning all languages does not make sense, then we could consider using the accept-language header as the general language selection mechanism, in line with T122942: RFC: Support language variants in the REST API.

Reading through that RFC, it feels like this is the accepted solution. Would it really make sense to special-case this endpoint? (I guess this depends on your answer to the first question).

The caveat is that we discussed Accept-Language in the context of supporting language variants in the REST API. There is also T114662: RFC: Per-language URLs for multilingual wiki pages, which is focused on Wikidata, but does not consider APIs so far.
TASK DETAIL: https://phabricator.wikimedia.org/T167787
EMAIL PREFERENCES: https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: GWicke
Cc: Jdlrobson, GWicke, mobrovac, Pchelolo, bmansurov, Jhernandez, Nirzar, Tbayer, Aklapper, Lydia_Pintscher, phuedx, ovasileva, GoranSMilovanovic, QZanden, Winter, Izno, Wikidata-bugs, aude, Ricordisamoa, Se4598, Mbch331


[Wikidata-bugs] [Maniphest] [Updated] T167787: [Spike 2hr] Investigate ability for page previews in wikidata to appear in user's preferred language

2017-06-20 Thread GWicke
GWicke added a comment.
We discussed this in the Reading / Services sync meeting. One question that came up in the discussion is whether including all languages in the response would be feasible from a performance perspective. The advantage of this direction would be no cache fragmentation, the downside a larger response.

If we determine that returning all languages does not make sense, then we could consider using the Accept-Language header as the general language selection mechanism, in line with T122942: RFC: Support language variants in the REST API.
TASK DETAIL: https://phabricator.wikimedia.org/T167787
EMAIL PREFERENCES: https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: GWicke
Cc: Jdlrobson, GWicke, mobrovac, Pchelolo, bmansurov, Jhernandez, Nirzar, Tbayer, Aklapper, Lydia_Pintscher, phuedx, ovasileva, GoranSMilovanovic, QZanden, Winter, Izno, Wikidata-bugs, aude, Ricordisamoa, Se4598, Mbch331
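To make the Accept-Language option from the comment above concrete, here is a minimal sketch of header-based language selection. It is a simplified illustration, not the mechanism proposed in T122942: the function names are hypothetical, wildcards are not handled, and the fallback to the primary subtag and to a default language are assumptions.

```python
def parse_accept_language(header: str):
    """Return language tags from an Accept-Language header, best first."""
    prefs = []
    for part in header.split(","):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0  # a missing q-value means full preference
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat malformed q-values as "not acceptable"
        prefs.append((tag.strip().lower(), quality))
    # sorted() is stable, so header order is kept among equal q-values.
    ranked = sorted(prefs, key=lambda pref: -pref[1])
    return [tag for tag, quality in ranked if quality > 0]


def pick_language(header: str, available, default="en"):
    """Pick the best available language, falling back to the primary subtag."""
    for tag in parse_accept_language(header):
        if tag in available:
            return tag
        primary = tag.split("-")[0]
        if primary in available:
            return primary
    return default


print(pick_language("de-CH, fr;q=0.8, en;q=0.5", {"en", "fr", "de"}))
```

With this approach, a cache in front of the API has to store one response variant per selected language, which is the cache fragmentation that returning all languages would avoid.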


[Wikidata-bugs] [Maniphest] [Commented On] T157014: CONSULTATION/PLAN: Managing Complex State and GUI on MediaWiki (e.g. for Wikidata/Wikibase UI)

2017-04-10 Thread GWicke
GWicke added a comment.

In T157014#3168034, @daniel wrote:
AFAIK we do have infrastructure for running node servers with proper caching, as part of the rest services layer.
If you create endpoints that return structured data you could consume those from the PHP backend or the JS client.

We have node based rendering infrastructure, but as far as I know, we cannot use it to directly serve page views, because a) all the skin/chrome stuff is done in PHP and b) we currently only use node when editing, and don't have the server capacity to handle the several orders of magnitude more hits that would be needed for serving page views.


We aren't using node for UI rendering at scale, but at the API level the REST API is seeing 15 minute averages of more than 6k requests per second most days. This is comparable to the action API: https://grafana-admin.wikimedia.org/dashboard/db/api-summary?orgId=1&from=now-30d&to=now

The vast majority of this traffic is driven by read views and data retrieval. Cache hit rate is already ~95%, and rising with traffic levels.

So, we can use node.js to render stuff, but we still need to loop it through PHP to add chrome.

This depends on the roll-out strategy. The standard chrome elements are actually not *that* hard to replace. Long tail special-case interfaces rendered in the content area, like special pages, will take longer, or might not make sense to re-do at all. See https://phabricator.wikimedia.org/T114596.

We also need caching for this on the same level as the parser cache. That could be done on the HTTP level between PHP and node. Does this exist?

We touched on this in the mail thread. Most API responses and server-side UI renders for anonymous users would be cached in the CDN / Varnish.
TASK DETAIL: https://phabricator.wikimedia.org/T157014
EMAIL PREFERENCES: https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: GWicke
Cc: GWicke, Aleksey_WMDE, daniel, Niedzielski, Magnus, Milimetric, Prtksxna, Fjalapeno, phuedx, Jdlrobson, siddharth11, Capt_Swing, TheDJ, Jdforrester-WMF, SBisson, WMDE-leszek, Volker_E, Krinkle, gabriel-wmde, Jonas, thiemowmde, Lydia_Pintscher, Jan_Dittrich, Jhernandez, Jdrewniak, Aklapper, QZanden, Salgo60, SamanthaNguyen, JGirault, Izno, Wikidata-bugs, aude, Mbch331, Jay8g


[Wikidata-bugs] [Maniphest] [Commented On] T161527: Canonical data URIs and URLs for machine readable page content

2017-03-29 Thread GWicke
GWicke added a comment.
I think the URL most users would consider canonical is /wiki/{title}. Wouldn't this already provide a reasonable URL for the concept of the page?
TASK DETAIL: https://phabricator.wikimedia.org/T161527
EMAIL PREFERENCES: https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: GWicke
Cc: Rybesh, Dzahn, GWicke, tstarling, Aklapper, Jonas, Smalyshev, mkroetzsch, Lydia_Pintscher, daniel, QZanden, D3r1ck01, Izno, suriyaa, Eevans, mobrovac, Hardikj, Wikidata-bugs, aude, jayvdb, Southparkfan, fbstj, RobLa-WMF, santhosh, Mbch331, Jay8g, Ltrlg, Glaisher, bd808, Krenair, Legoktm


[Wikidata-bugs] [Maniphest] [Commented On] T161527: Canonical data URIs and URLs for machine readable page content

2017-03-29 Thread GWicke
GWicke added a comment.

In T161527#3135095, @Smalyshev wrote:
I think we have several concepts there that need to be refined.


Canonical object URI - this is the URI that uniquely identifies an object in Wikimedia world, and, by extension, in the whole world of linked data.



I think there is a spectrum here between reserving the object domain only for abstract concepts & treating most things as representations, and making some of those representations first-class objects, related to the underlying concept via an is-representation-of edge. For addressable resources in the REST sense, I think that usability concerns should play a prominent role in finding the right balance between URLs and headers. For example, I don't think that using a single URL for all resources related to the concept of "Barack Obama" would make a lot of sense to API users.
TASK DETAIL: https://phabricator.wikimedia.org/T161527
EMAIL PREFERENCES: https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: GWicke
Cc: Rybesh, Dzahn, GWicke, tstarling, Aklapper, Jonas, Smalyshev, mkroetzsch, Lydia_Pintscher, daniel, QZanden, D3r1ck01, Izno, suriyaa, Eevans, mobrovac, Hardikj, Wikidata-bugs, aude, jayvdb, Southparkfan, fbstj, RobLa-WMF, santhosh, Mbch331, Jay8g, Ltrlg, Glaisher, bd808, Krenair, Legoktm


[Wikidata-bugs] [Maniphest] [Updated] T161527: Canonical data URIs and URLs for machine readable page content

2017-03-27 Thread GWicke
GWicke added a project: Services (watching).
TASK DETAIL: https://phabricator.wikimedia.org/T161527
EMAIL PREFERENCES: https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: GWicke
Cc: Dzahn, GWicke, tstarling, Aklapper, Jonas, Smalyshev, mkroetzsch, Lydia_Pintscher, daniel, QZanden, D3r1ck01, Izno, suriyaa, Eevans, mobrovac, Hardikj, Wikidata-bugs, aude, jayvdb, Southparkfan, fbstj, RobLa-WMF, santhosh, Mbch331, Jay8g, Ltrlg, Glaisher, bd808, Krenair, Legoktm


[Wikidata-bugs] [Maniphest] [Commented On] T161527: Canonical data URIs and URLs for machine readable page content

2017-03-27 Thread GWicke
GWicke added a comment.
However, URIs by nature should not include interface version information, because they identify the resource independently of representation.

The REST API versioning policy explicitly describes how representation concerns are handled through content negotiation, and not by incrementing major API versions. Changes in major API versions are expected to be extremely rare. The major API version is basically an insurance policy for the case that we'd want to introduce a fundamentally different URL layout, without breaking existing users and resource references.

What URI structure would you propose?

Within the REST API, data associated with pages is typically exposed using the /api/rest_v1/page/{type}/{title}{/revision} pattern. Examples: HTML, summary, data-parsoid, PDF.
TASK DETAIL: https://phabricator.wikimedia.org/T161527
EMAIL PREFERENCES: https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: GWicke
Cc: Dzahn, GWicke, tstarling, Aklapper, Jonas, Smalyshev, mkroetzsch, Lydia_Pintscher, daniel, QZanden, D3r1ck01, Izno, suriyaa, Wikidata-bugs, aude, jayvdb, Southparkfan, fbstj, RobLa-WMF, santhosh, Mbch331, Jay8g, Ltrlg, Glaisher, bd808, Krenair, Legoktm
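The /api/rest_v1/page/{type}/{title}{/revision} pattern named in the comment above can be illustrated with a small URL builder. This is a sketch under stated assumptions: the function name is hypothetical, and replacing spaces with underscores and percent-encoding the title as a single path segment are assumptions about title handling, not something the comment specifies.

```python
from urllib.parse import quote


def page_content_url(domain, content_type, title, revision=None):
    """Build a URL following /api/rest_v1/page/{type}/{title}{/revision}."""
    # Assumption: titles use underscores for spaces and are encoded as one
    # path segment, so a "/" inside a title becomes %2F.
    encoded_title = quote(title.replace(" ", "_"), safe="")
    url = f"https://{domain}/api/rest_v1/page/{content_type}/{encoded_title}"
    if revision is not None:
        url += f"/{revision}"  # optional {/revision} part of the pattern
    return url


print(page_content_url("en.wikipedia.org", "html", "Barack Obama"))
print(page_content_url("en.wikipedia.org", "summary", "Barack Obama"))
print(page_content_url("en.wikipedia.org", "html", "Barack Obama", revision=12345))
```

The revision number in the last call is an arbitrary placeholder. The point of the pattern is that each content type of a page gets its own URL, with the revision as an optional trailing segment.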


[Wikidata-bugs] [Maniphest] [Commented On] T161527: Canonical data URIs and URLs for machine readable page content

2017-03-27 Thread GWicke
GWicke added a comment.
The description of the requirements seems to fit the REST API:


API versioning & content negotiation.
REST URL structure.
Integration with CDN layer.
Machine and user readable API specs & documentation.


The REST URL hierarchy makes it quite easy to route specific end points directly to specialized backends, while still presenting a consistent & well-documented API to end users. In other words, exposing functionality through the REST API does not imply the use of RESTBase where that does not make sense.
TASK DETAIL: https://phabricator.wikimedia.org/T161527
EMAIL PREFERENCES: https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: GWicke
Cc: Dzahn, GWicke, tstarling, Aklapper, Jonas, Smalyshev, mkroetzsch, Lydia_Pintscher, daniel, QZanden, D3r1ck01, Izno, suriyaa, Wikidata-bugs, aude, jayvdb, Southparkfan, fbstj, RobLa-WMF, santhosh, Mbch331, Jay8g, Ltrlg, Glaisher, bd808, Krenair, Legoktm


[Wikidata-bugs] [Maniphest] [Created] T149114: Reconsider wikidata support in the REST API

2016-10-25 Thread GWicke
GWicke created this task.
GWicke added projects: Services (next), Wikidata, RESTBase-API, Mobile-Content-Service.
Herald added a subscriber: Aklapper.
TASK DESCRIPTION
Basically none of the current REST API end points in https://www.wikidata.org/api/rest_v1/ are actually useful at the moment. HTML contains an element wrapping a large JSON blob. Nobody is using VisualEditor to edit this JSON blob. None of the derived content types like mobile sections or the page summary appear obviously useful. For the most part, Wikidata information like item descriptions is consumed through individual projects' REST APIs (such as the summary end point), rather than the Wikidata REST API itself.

We also currently process all updates for these end points the same way as for any other wiki. This means a lot of unnecessary work and storage use.

So, let's revisit how we support Wikidata in the REST API:

- If we find no actual use cases for the current Wikidata REST API, then I am proposing to remove it altogether, and drop the associated storage. This saves resources, and avoids exposing an API that isn't useful or meaningfully maintained.
  - We know that there are currently no uses for mobile sections or page summaries.
- Consider which use cases the REST API could actually help with for Wikidata, and set up a custom project config, similar to what we already do for wiktionaries.
TASK DETAIL: https://phabricator.wikimedia.org/T149114
EMAIL PREFERENCES: https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: GWicke
Cc: mobrovac, bearND, Mholloway, Pchelolo, Eevans, daniel, Aklapper, GWicke, D3r1ck01, Izno, Hardikj, Wikidata-bugs, aude, AuFCL, Mbch331


[Wikidata-bugs] [Maniphest] [Triaged] T149114: Reconsider wikidata support in the REST API

2016-10-25 Thread GWicke
GWicke triaged this task as "Normal" priority.
TASK DETAIL: https://phabricator.wikimedia.org/T149114
EMAIL PREFERENCES: https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: GWicke
Cc: mobrovac, bearND, Mholloway, Pchelolo, Eevans, daniel, Aklapper, GWicke, D3r1ck01, Izno, Hardikj, Wikidata-bugs, aude, AuFCL, Mbch331


[Wikidata-bugs] [Maniphest] [Lowered Priority] T102476: RFC: Requirements for change propagation

2016-10-12 Thread GWicke
GWicke lowered the priority of this task from "High" to "Low".
GWicke edited projects, added Services (watching); removed Services.
TASK DETAIL: https://phabricator.wikimedia.org/T102476
EMAIL PREFERENCES: https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: GWicke
Cc: Mholloway, Liuxinyu970226, Scott_WUaS, Ottomata, Smalyshev, ArielGlenn, hoo, Addshore, RobLa-WMF, StudiesWorld, intracer, JanZerebecki, brion, Ltrlg, Anomie, Milimetric, mark, BBlack, aaron, daniel, Eevans, mobrovac, GWicke, D3r1ck01, Izno, Luke081515, Hardikj, Wikidata-bugs, aude, jayvdb, fbstj, Mbch331, Jay8g, bd808, Legoktm


[Wikidata-bugs] [Maniphest] [Updated] T88633: [EPIC] Image-positioning service for storing and retrieving image focal points

2016-10-12 Thread GWicke
GWicke edited projects, added Services (watching); removed Services (later).
TASK DETAIL: https://phabricator.wikimedia.org/T88633
EMAIL PREFERENCES: https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: GWicke
Cc: MZMcBride, Jdlrobson, Ricordisamoa, Spage, bearND, kaldari, KHammerstein, Mhurd, GWicke, Jdouglas, mobrovac, Jdforrester-WMF, phuedx, Deskana, Fjalapeno, BGerstle-WMF, brion, Dbrant, Maryana, Aklapper, NHarateh_WMF, Winter, D3r1ck01, Izno, Eevans, Hardikj, Wikidata-bugs, aude, Lydia_Pintscher, Mbch331


[Wikidata-bugs] [Maniphest] [Updated] T88633: [EPIC] Image-positioning service for storing and retrieving image focal points

2016-10-12 Thread GWicke
GWicke edited projects, added Services (backlog); removed Services.
TASK DETAIL
  https://phabricator.wikimedia.org/T88633

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: MZMcBride, Jdlrobson, Ricordisamoa, Spage, bearND, kaldari, KHammerstein,
Mhurd, GWicke, Jdouglas, mobrovac, Jdforrester-WMF, phuedx, Deskana, Fjalapeno,
BGerstle-WMF, brion, Dbrant, Maryana, Aklapper, NHarateh_WMF, Winter, D3r1ck01,
Izno, Eevans, Hardikj, Wikidata-bugs, aude, Lydia_Pintscher, Mbch331


[Wikidata-bugs] [Maniphest] [Edited] T102476: RFC: Requirements for change propagation

2016-10-05 Thread GWicke
GWicke edited the task description. (Show Details)
EDIT DETAILS
...
- **Reliable RCStream**: @Ottomata has been looking into leveraging Kafka events in RCStream. This can potentially let clients catch up after being disconnected. See {T130651}

TASK DETAIL
  https://phabricator.wikimedia.org/T102476

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: Mholloway, Liuxinyu970226, Scott_WUaS, Ottomata, Smalyshev, ArielGlenn, hoo,
Addshore, RobLa-WMF, StudiesWorld, intracer, JanZerebecki, brion, Ltrlg, Anomie,
Milimetric, mark, BBlack, aaron, daniel, Eevans, mobrovac, GWicke, D3r1ck01,
Izno, Luke081515, Hardikj, Wikidata-bugs, aude, jayvdb, fbstj, Mbch331, Jay8g,
bd808, Legoktm


[Wikidata-bugs] [Maniphest] [Updated] T38881: Wiktionary needs usable API

2016-07-22 Thread GWicke
GWicke added a comment.
There is now an experimental API endpoint for Wiktionary definitions at https://en.wiktionary.org/api/rest_v1/?doc#!/Page_content/get_page_definition_term
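
A minimal sketch of how a client might consume this endpoint. It assumes the term is passed as the last path segment under /page/definition/ (inferred from the documentation link, not confirmed here) and that the response maps language codes to entries with partOfSpeech and definitions fields. The sample payload is hand-written for illustration, not captured API output, and the helper names are hypothetical.

```python
import json
from urllib.parse import quote

# Assumed base path, inferred from the documentation link above.
BASE = "https://en.wiktionary.org/api/rest_v1/page/definition/"


def definition_url(term: str) -> str:
    """Build the request URL, percent-encoding the term as one path segment."""
    return BASE + quote(term, safe="")


def glosses(payload: dict, lang: str = "en") -> list[str]:
    """Flatten the (assumed) response shape into 'part of speech: definition' strings."""
    out = []
    for entry in payload.get(lang, []):
        for d in entry.get("definitions", []):
            out.append(f"{entry.get('partOfSpeech', '?')}: {d.get('definition', '')}")
    return out


# Hand-written sample in the assumed shape; not real API output.
SAMPLE = json.loads("""
{"en": [{"partOfSpeech": "Noun",
         "definitions": [{"definition": "A domesticated feline."}]}]}
""")

print(definition_url("cat"))
print(glosses(SAMPLE))
```

Keeping URL construction and response flattening separate means the parsing can be exercised without network access.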

This API is used by the Android app to provide definitions for words using Wiktionary data, but it is currently only available for the English Wiktionary. T138709 discusses ways to expand coverage to other languages by adding standard markup to consistently identify specific components of the definitions. Please chime in there.

TASK DETAIL
  https://phabricator.wikimedia.org/T38881

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: GWicke, Alkamid, TheDaveRoss, dg711, intracer, Aklapper, Hippietrail,
Ricordisamoa, jberkel, Liuxinyu970226, Wikidata-bugs, GPHemsley, Amire80,
siebrand, mxn, Glaisher, Qgil, MZMcBride, Yurik, Bawolff, Lydia_Pintscher,
Sethakill, Luke081515, jayvdb, Darkdadaah, Anomie, Krenair, Legoktm


[Wikidata-bugs] [Maniphest] [Updated] T38881: Wiktionary needs usable API

2016-07-22 Thread GWicke
GWicke added a subtask: T138709: Use microformats on Wiktionary to improve term parsing.
TASK DETAIL
  https://phabricator.wikimedia.org/T38881

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: Alkamid, TheDaveRoss, dg711, intracer, Aklapper, Hippietrail, Ricordisamoa,
jberkel, Liuxinyu970226, Wikidata-bugs, GPHemsley, Amire80, siebrand, mxn,
Glaisher, Qgil, MZMcBride, Yurik, Bawolff, Lydia_Pintscher, Sethakill,
Luke081515, jayvdb, Darkdadaah, Anomie, Krenair, Legoktm


[Wikidata-bugs] [Maniphest] [Commented On] T102476: RFC: Requirements for change propagation

2016-05-18 Thread GWicke
GWicke added a comment.


  I moved the current status / next steps section back here, so that only the 
general background section is now on MediaWiki.org. The status & next steps 
section is more volatile & links directly to ongoing work, so it benefits from 
being in this task.

TASK DETAIL
  https://phabricator.wikimedia.org/T102476

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: RobLa-WMF, GWicke
Cc: Scott_WUaS, Ottomata, Smalyshev, ArielGlenn, hoo, Addshore, RobLa-WMF, 
StudiesWorld, intracer, JanZerebecki, brion, Ltrlg, Anomie, Milimetric, mark, 
BBlack, aaron, daniel, Eevans, mobrovac, GWicke, D3r1ck01, Izno, Luke081515, 
Hardikj, Wikidata-bugs, aude, jayvdb, fbstj, Mbch331, Jay8g, bd808, Legoktm





[Wikidata-bugs] [Maniphest] [Edited] T102476: RFC: Requirements for change propagation

2016-05-18 Thread GWicke
GWicke edited the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T102476

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: RobLa-WMF, GWicke
Cc: Scott_WUaS, Ottomata, Smalyshev, ArielGlenn, hoo, Addshore, RobLa-WMF, 
StudiesWorld, intracer, JanZerebecki, brion, Ltrlg, Anomie, Milimetric, mark, 
BBlack, aaron, daniel, Eevans, mobrovac, GWicke, D3r1ck01, Izno, Luke081515, 
Hardikj, Wikidata-bugs, aude, jayvdb, fbstj, Mbch331, Jay8g, bd808, Legoktm





[Wikidata-bugs] [Maniphest] [Edited] T102476: RFC: Requirements for change propagation

2016-05-13 Thread GWicke
GWicke edited the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T102476

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: RobLa-WMF, GWicke
Cc: Ottomata, Smalyshev, ArielGlenn, hoo, Addshore, RobLa-WMF, StudiesWorld, 
intracer, JanZerebecki, brion, Ltrlg, Anomie, Milimetric, mark, BBlack, aaron, 
daniel, Eevans, mobrovac, GWicke, D3r1ck01, Izno, Luke081515, Hardikj, 
Wikidata-bugs, aude, jayvdb, fbstj, Mbch331, Jay8g, bd808, Legoktm





[Wikidata-bugs] [Maniphest] [Commented On] T102476: RFC: Requirements for change propagation

2016-05-13 Thread GWicke
GWicke added a comment.


  @RobLa-WMF, I updated the task summary with a status summary & a sketch of 
next steps & open questions.

TASK DETAIL
  https://phabricator.wikimedia.org/T102476

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: RobLa-WMF, GWicke
Cc: Ottomata, Smalyshev, ArielGlenn, hoo, Addshore, RobLa-WMF, StudiesWorld, 
intracer, JanZerebecki, brion, Ltrlg, Anomie, Milimetric, mark, BBlack, aaron, 
daniel, Eevans, mobrovac, GWicke, D3r1ck01, Izno, Luke081515, Hardikj, 
Wikidata-bugs, aude, jayvdb, fbstj, Mbch331, Jay8g, bd808, Legoktm





[Wikidata-bugs] [Maniphest] [Edited] T102476: RFC: Requirements for change propagation

2016-05-13 Thread GWicke
GWicke added subscribers: Smalyshev, Ottomata.
GWicke edited the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T102476

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: RobLa-WMF, GWicke
Cc: Ottomata, Smalyshev, ArielGlenn, hoo, Addshore, RobLa-WMF, StudiesWorld, 
intracer, JanZerebecki, brion, Ltrlg, Anomie, Milimetric, mark, BBlack, aaron, 
daniel, Eevans, mobrovac, GWicke, D3r1ck01, Izno, Luke081515, Hardikj, 
Wikidata-bugs, aude, jayvdb, fbstj, Mbch331, Jay8g, bd808, Legoktm





[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-05-13 Thread GWicke
GWicke added a comment.


  > As I understand it, restbase is a front-end caching proxy store, exposed to 
the public internet.
  
  For most use cases (including HTML), it is actually *storing*, and not just 
caching. It is the equivalent of ExternalStore and most of the text table, 
including revision deletions. Longer term, we are looking into replacing 
ExternalStore for wikitext.

TASK DETAIL
  https://phabricator.wikimedia.org/T107595

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: daniel, GWicke
Cc: Glaisher, JJMC89, RobLa-WMF, Yurik, ArielGlenn, APerson, TomT0m, Krenair, 
intracer, Tgr, Tobi_WMDE_SW, Addshore, Lydia_Pintscher, cscott, PleaseStand, 
awight, Ricordisamoa, GWicke, MarkTraceur, waldyrious, Legoktm, Aklapper, 
Jdforrester-WMF, Ltrlg, brion, Spage, MZMcBride, daniel, D3r1ck01, Izno, 
Luke081515, Wikidata-bugs, aude, jayvdb, fbstj, Mbch331, Jay8g, bd808





[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-05-04 Thread GWicke
GWicke added a comment.


  The use case for providing metadata is so that we can use stores like 
RESTBase, which already provide an API keyed on title, revision & render ID. It 
also already deals with the complexities you mention.
  
  Basically, if we don't have a way to provide this key information to the 
backend store, then we can't access all the multi-content revision data that's 
already out there through this interface.

TASK DETAIL
  https://phabricator.wikimedia.org/T107595

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: daniel, GWicke
Cc: Glaisher, JJMC89, RobLa-WMF, Yurik, ArielGlenn, APerson, TomT0m, Krenair, 
intracer, Tgr, Tobi_WMDE_SW, Addshore, Lydia_Pintscher, cscott, PleaseStand, 
awight, Ricordisamoa, GWicke, MarkTraceur, waldyrious, Legoktm, Aklapper, 
Jdforrester-WMF, Ltrlg, brion, Spage, MZMcBride, daniel, D3r1ck01, Izno, 
Luke081515, Wikidata-bugs, aude, jayvdb, fbstj, Mbch331, Jay8g, bd808





[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-05-04 Thread GWicke
GWicke added a comment.


  > In any case, the PageUpdater / WikiPage code needs to trigger notifications 
(produce events). I don't care what mechanism it used for that. Or rather: I'm 
very happy if we get a generalized mechanism. We'll have to agree on some kind 
of schema for revisions, slots, and blobs, but that should be easy enough.
  
  Makes sense. Thanks for the clarification!
  
  >> In addition to title and revision (which I assume remains an integer), 
we'd need an optional v1 UUID parameter to retrieve specific renders, in both 
the request & response interfaces.
  
  
  
  > I have thought about this, too. My solution is to encode this in the slot 
name. So you could have an html.canonical (sub)slot, and a 
html.29e68f78-8765-49f8-86d5-dfc438d459fe, or html.en, or whatever.
  
  Hmmm, this sounds like a rather ugly hack. I thought the 'slot' identifies 
the kind of content, rather than being a general-purpose string that carries 
otherwise missing parameters and differs with each render.
  
  >> How would a dumb blob store figure out which content belongs to the same 
page (and is thus similar), if all it has is the content & some metadata, but 
not the page id, title, revision & render UUID? This is the same design issue 
that plagues ExternalStore, and something we addressed in RESTBase. With 
large-window compression algorithms like brotli, we are getting down to 2-3% of 
the input HTML size (see https://phabricator.wikimedia.org/T122028). Without 
this locality information, you are likely to use an order of magnitude more 
storage as you are forgoing efficient delta compression.
  > 
  > This is a good point. Once again, we want our abstraction to be a bit 
leaky, to allow for optimizations.
  
  I would argue that it is a case of finding an abstraction at the right level. 
A simple blob store is a very low-level abstraction, and severely limits the 
backend's abilities to optimize storage, distribution & consistency. It also 
limits the backend's usefulness as an API in its own right.
  
  Instead, I think we should clearly define the API for each slot to provide / 
consume
  
  - page id,
  - page title,
  - revision id, and
  - a UUID / hash / etag.
  
  This makes sure that backends can continue to implement higher-level 
functionality & important optimizations. This should be part of the API, and 
not a case of a "leak". That said, backends *can* choose to ignore all of this 
(except the UUID / hash).
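
A minimal sketch of what such a per-slot key could look like, assuming the four fields listed above. The names SlotKey, DictBackend and tid are hypothetical and not part of any existing MediaWiki or RESTBase interface.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SlotKey:
    """Hypothetical key a slot API could hand to a storage backend."""
    page_id: int
    title: str
    revision: int
    tid: str  # UUID / hash / etag identifying one specific render


class DictBackend:
    """Toy backend that ignores everything except the UUID / hash."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, key: SlotKey, content: bytes) -> None:
        self._blobs[key.tid] = content

    def get(self, key: SlotKey) -> Optional[bytes]:
        return self._blobs.get(key.tid)


key = SlotKey(page_id=1, title="Example", revision=42,
              tid="29e68f78-8765-49f8-86d5-dfc438d459fe")
backend = DictBackend()
backend.put(key, b"<html>...</html>")
print(backend.get(key))
```

A smarter backend could use page_id and revision from the same key to co-locate related content, without any change to callers.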
  
  > I haven't thought this through yet, but my inclination is that we could 
associate a metadata array (k/v set) with the blob, which could include things 
like a hash and the page title. A BlobStore would be free to use this or not, 
to store it or not, and to make it retrievable or not.
  
  A minimum set of metadata (like the versioned content-type) should always be 
provided. It would be nice to model this in a way that's compatible with normal 
HTTP headers, as stored & returned by services like RESTBase.

TASK DETAIL
  https://phabricator.wikimedia.org/T107595

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: daniel, GWicke
Cc: Glaisher, JJMC89, RobLa-WMF, Yurik, ArielGlenn, APerson, TomT0m, Krenair, 
intracer, Tgr, Tobi_WMDE_SW, Addshore, Lydia_Pintscher, cscott, PleaseStand, 
awight, Ricordisamoa, GWicke, MarkTraceur, waldyrious, Legoktm, Aklapper, 
Jdforrester-WMF, Ltrlg, brion, Spage, MZMcBride, daniel, D3r1ck01, Izno, 
Luke081515, Wikidata-bugs, aude, jayvdb, fbstj, Mbch331, Jay8g, bd808





[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-05-04 Thread GWicke
GWicke added a comment.


  > Blobs would typically be shared by different revisions of the same page. 
This happens every time one primary slot is edited, but another is not changed. 
E.g. the free wikitext description of a file is edited, but the structured data 
isn't (or vice versa). Or the quality assessment data of an article is updated, 
but the article text isn't edited. In both cases, one of the blobs would be 
re-used by the new revision. I think this will actually be more common than 
editing all primary streams at once.
  
  Makes sense, some of these fields won't change between revisions. Depending 
on the constraints, it might still make sense to store unchanged content & rely 
on compression to encode it efficiently. This is likely what we'll continue to 
do in RESTBase, as this makes sure that access by revision continues to perform 
predictably.
  
  In any case, as long as you ask the backend for content for a specific title 
/ page id, revision & UUID, backends are free to use whatever performs best.

TASK DETAIL
  https://phabricator.wikimedia.org/T107595

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: daniel, GWicke
Cc: Glaisher, JJMC89, RobLa-WMF, Yurik, ArielGlenn, APerson, TomT0m, Krenair, 
intracer, Tgr, Tobi_WMDE_SW, Addshore, Lydia_Pintscher, cscott, PleaseStand, 
awight, Ricordisamoa, GWicke, MarkTraceur, waldyrious, Legoktm, Aklapper, 
Jdforrester-WMF, Ltrlg, brion, Spage, MZMcBride, daniel, D3r1ck01, Izno, 
Luke081515, Wikidata-bugs, aude, jayvdb, fbstj, Mbch331, Jay8g, bd808





[Wikidata-bugs] [Maniphest] [Updated] T107595: [RFC] Multi-Content Revisions

2016-05-04 Thread GWicke
GWicke added a comment.


  > Where do I propose another mechanism for change propagation? The 
PageUpdater would do exactly what Revision does now: schedule DataUpdates.
  
  EventBus & the change propagation service are moving away from scheduling 
"jobs", and towards an event processing approach based on Kafka. In this model, 
subscribers react to change events associated with resources. Event production 
& processing / consumption is decoupled and decentralized.
  
  PageUpdater (and RevisionUpdater) as proposed seem to be moving in the 
opposite direction, towards more jobs & away from event processing.
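
To illustrate the difference, here is a minimal in-memory sketch of the event processing model. It is a stand-in for illustration only, not Kafka or the actual EventBus API, and the topic and handler names are hypothetical. The producer only appends an event; it does not know which subscribers exist or what they do.

```python
from collections import defaultdict
from typing import Callable

Event = dict
Handler = Callable[[Event], None]


class InMemoryBus:
    """Toy publish / subscribe bus; an in-process stand-in for a real event log."""

    def __init__(self) -> None:
        self._log: dict[str, list[Event]] = defaultdict(list)
        self._subscribers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: Event) -> None:
        # The producer records the change; consumers react independently.
        self._log[topic].append(event)
        for handler in self._subscribers[topic]:
            handler(event)


bus = InMemoryBus()
purged, rerendered = [], []
bus.subscribe("revision-create", lambda e: purged.append(e["title"]))
bus.subscribe("revision-create", lambda e: rerendered.append((e["title"], e["rev"])))
bus.publish("revision-create", {"title": "Example", "rev": 42})
print(purged, rerendered)
```

Adding another consumer is a single subscribe call and needs no change on the producing side, which is the decoupling described above.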
  
  > The blob store is (potentially) content-addressable, so the same blob may be 
used for different revisions of different pages.
  
  Blob sharing would complicate your storage significantly, as you'd either 
have to forgo deleting content forever (very expensive for something like HTML 
renders), or incur significant complexity of implementing an atomic reference 
counting scheme. For textual content, I am pretty certain that sharing is rare, 
and the complexity would overall be a loss in performance and reliability.
  
  > Even for blobs that have an incremental ID (e.g. using the current text 
table storage mechanism), the same blob would frequently be used for multiple 
revisions of the same page.
  
  How would a dumb blob store figure out which content belongs to the same page 
(and is thus similar), if all it has is the content & some metadata, but not 
the page id, title, revision & render UUID? This is the same design issue that 
plagues ExternalStore, and something we addressed in RESTBase. With 
large-window compression algorithms like brotli, we are getting down to 2-3% of 
the input HTML size (see https://phabricator.wikimedia.org/T122028). Without 
this locality information, you are likely to use an order of magnitude more 
storage as you are forgoing efficient delta compression.
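
The effect of locality can be shown with a small experiment. This is a sketch under stated assumptions: it uses zlib from the Python standard library as a stand-in for brotli, and synthetic pages kept well below zlib's 32 KiB window so that the previous revision is still in range. The generated data and the function name are made up for illustration and say nothing about real ratios.

```python
import random
import string
import zlib


def make_revisions(count: int = 20, words_per_page: int = 600, seed: int = 42) -> list[bytes]:
    """Generate synthetic revisions of one page, each with one small edit."""
    rng = random.Random(seed)
    vocab = ["".join(rng.choice(string.ascii_lowercase) for _ in range(rng.randint(3, 9)))
             for _ in range(500)]
    tokens = [rng.choice(vocab) for _ in range(words_per_page)]
    revisions = []
    for i in range(count):
        tokens[rng.randrange(len(tokens))] = f"edit{i}"  # one changed word per revision
        revisions.append(("<p>" + " ".join(tokens) + "</p>").encode())
    return revisions


revisions = make_revisions()
# Dumb blob store: every revision is compressed on its own.
separate = sum(len(zlib.compress(r, 9)) for r in revisions)
# Store with locality: revisions of the same page share one compression window.
grouped = len(zlib.compress(b"".join(revisions), 9))
print(f"separate: {separate} bytes, grouped: {grouped} bytes")
```

When revisions are compressed together, each one is mostly encoded as back-references to its predecessor, which is only possible if the store knows which blobs belong to the same page.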
  
  I am generally trying to work out how RevisionContentLookup would work for 
use cases like fetching HTML from RESTBase. Some notes / questions:
  
  - In addition to title and revision (which I assume remains an integer), 
we'll need an optional v1 UUID parameter to retrieve specific renders, in both 
the request & response interfaces.
  - Will getTouched() return the UUID timestamp of a specific render 
(last-modified, essentially), or is this about page_touched? Also, should we 
expose UUIDs to make sure that we have a unique ID with a high-resolution 
timestamp?
  - For content from RESTBase, read restrictions are always enforced as part of 
the API request. No information about the applied restrictions is returned. In 
this context, getReadRestrictions() would basically always return the empty set.
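
On the UUID question: a v1 UUID embeds a timestamp with 100 ns resolution, so the render time can be recovered from the ID itself. A minimal sketch using the Python standard library; the function name render_time is hypothetical.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Number of 100 ns intervals between the UUID epoch (1582-10-15) and the Unix epoch.
UUID_EPOCH_OFFSET = 0x01B21DD213814000
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def render_time(tid: uuid.UUID) -> datetime:
    """Return the timestamp embedded in a version 1 UUID as a UTC datetime."""
    if tid.version != 1:
        raise ValueError("only version 1 UUIDs carry a timestamp")
    # datetime has microsecond resolution, so the last decimal digit is dropped.
    return UNIX_EPOCH + timedelta(microseconds=(tid.time - UUID_EPOCH_OFFSET) // 10)


print(render_time(uuid.uuid1()))
```

A getTouched()-style accessor could derive its value this way, while the full UUID stays available as a unique ID.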

TASK DETAIL
  https://phabricator.wikimedia.org/T107595

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: daniel, GWicke
Cc: Glaisher, JJMC89, RobLa-WMF, Yurik, ArielGlenn, APerson, TomT0m, Krenair, 
intracer, Tgr, Tobi_WMDE_SW, Addshore, Lydia_Pintscher, cscott, PleaseStand, 
awight, Ricordisamoa, GWicke, MarkTraceur, waldyrious, Legoktm, Aklapper, 
Jdforrester-WMF, Ltrlg, brion, Spage, MZMcBride, daniel, D3r1ck01, Izno, 
Luke081515, Wikidata-bugs, aude, jayvdb, fbstj, Mbch331, Jay8g, bd808





[Wikidata-bugs] [Maniphest] [Commented On] T107595: [RFC] Multi-Content Revisions

2016-04-28 Thread GWicke
GWicke added a comment.


  Some notes:
  
  - PageUpdater aims to provide functionality similar to that of the change 
propagation service (using EventBus) & the job queue. Could you clarify why we 
need another mechanism for change propagation?
  - The blob store does not provide any locality information (title or page id, 
revision, render id / time-uuid), which means that it is incompatible with 
existing storage systems like RESTBase. Since locality information is critical 
for consistency and decent compression, I would suggest always providing at 
least these keys.

TASK DETAIL
  https://phabricator.wikimedia.org/T107595

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: daniel, GWicke
Cc: RobLa-WMF, Yurik, ArielGlenn, APerson, TomT0m, Krenair, intracer, Tgr, 
Tobi_WMDE_SW, Addshore, Lydia_Pintscher, cscott, PleaseStand, awight, 
Ricordisamoa, GWicke, MarkTraceur, waldyrious, Legoktm, Aklapper, 
Jdforrester-WMF, Ltrlg, brion, Spage, MZMcBride, daniel, D3r1ck01, Izno, 
Wikidata-bugs, aude, jayvdb, fbstj, Mbch331, Jay8g, bd808





[Wikidata-bugs] [Maniphest] [Up For Grabs] T102476: RFC: Requirements for change propagation

2016-03-23 Thread GWicke
GWicke placed this task up for grabs.

TASK DETAIL
  https://phabricator.wikimedia.org/T102476

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: ArielGlenn, hoo, Addshore, RobLa-WMF, StudiesWorld, intracer, JanZerebecki, 
brion, Ltrlg, Anomie, Milimetric, mark, BBlack, aaron, daniel, Eevans, 
mobrovac, GWicke, D3r1ck01, Izno, Hardikj, Wikidata-bugs, aude, jayvdb, fbstj, 
Mbch331, Jay8g, bd808, Legoktm





[Wikidata-bugs] [Maniphest] [Commented On] T126730: [RFC] Caching for results of wikidata Sparql queries

2016-02-17 Thread GWicke
GWicke added a comment.

> some of my concerns about the effect of including a graph (with a slowish 
> query) on page save timing, e.g. when just fixing a typo on a wikipedia page.

To get sensible hit rates for relatively rare events like edits, we would need 
to cache results for a long time, on the order of weeks. Would this be 
acceptable without automatic purging?
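
A small simulation of why the TTL needs to be long relative to the request interval. This is a sketch with made-up numbers: one query requested every 3 days is an assumption chosen for illustration, and TTLCache and simulate are hypothetical names, not part of any existing service.

```python
from typing import Callable


class TTLCache:
    """Toy TTL cache with an explicit clock, tracking hits and misses."""

    def __init__(self, ttl_days: float) -> None:
        self.ttl = ttl_days
        self.hits = 0
        self.misses = 0
        self._store: dict[str, tuple[float, object]] = {}

    def get(self, key: str, now: float, compute: Callable[[], object]) -> object:
        entry = self._store.get(key)
        if entry is not None and now < entry[0]:  # still fresh
            self.hits += 1
            return entry[1]
        self.misses += 1
        value = compute()  # stands in for running the expensive query
        self._store[key] = (now + self.ttl, value)
        return value

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def simulate(ttl_days: float, interval_days: float = 3, requests: int = 20) -> float:
    """Hit rate for one query requested at a fixed interval."""
    cache = TTLCache(ttl_days)
    for i in range(requests):
        cache.get("query", now=i * interval_days, compute=lambda: "result")
    return cache.hit_rate


print(simulate(ttl_days=1), simulate(ttl_days=21))
```

With a TTL shorter than the request interval every request misses; only a TTL of several intervals yields a useful hit rate, at the cost of serving results that are that much older unless they are purged.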


TASK DETAIL
  https://phabricator.wikimedia.org/T126730

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: Milimetric, Gehel, BBlack, GWicke, Bene, Ricordisamoa, daniel, 
Lydia_Pintscher, Smalyshev, Jonas, Christopher, Yurik, hoo, Aklapper, aude, 
debt, Izno, Luke081515, jkroll, Wikidata-bugs, Jdouglas, Deskana, Manybubbles, 
Mbch331, Jay8g, Ltrlg





[Wikidata-bugs] [Maniphest] [Commented On] T126730: [RFC] Caching for results of wikidata Sparql queries

2016-02-17 Thread GWicke
GWicke added a comment.

> Since running the query each time graph is displayed is too expensive, we 
> want some intermediate caching store that would store the results, possibly 
> for the time defined in the query.

Is the graph extension actually re-requesting the data on each view, or would 
this only happen on parser cache miss / edit?

I'm still not sure how effective query service caching can be in this context:

> In particular, I wonder if there are a small number of queries that get a lot 
> of hits, and if those queries can be cached for long enough to result in 
> worthwhile hit rates.


TASK DETAIL
  https://phabricator.wikimedia.org/T126730

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: Milimetric, Gehel, BBlack, GWicke, Bene, Ricordisamoa, daniel, 
Lydia_Pintscher, Smalyshev, Jonas, Christopher, Yurik, hoo, Aklapper, aude, 
debt, Izno, Luke081515, jkroll, Wikidata-bugs, Jdouglas, Deskana, Manybubbles, 
Mbch331, Jay8g, Ltrlg





[Wikidata-bugs] [Maniphest] [Commented On] T126730: [RFC] Caching for results of wikidata sparql queries for Graphs

2016-02-15 Thread GWicke
GWicke added a comment.

@smalyshev, I would need more context to usefully comment on this.

In particular, I wonder if there are a small number of queries that get a lot 
of hits, and if those queries can be cached for long enough to result in 
worthwhile hit rates.

When discussing a use case like graphs, there are also a lot more caching 
layers and change propagation systems to consider.


TASK DETAIL
  https://phabricator.wikimedia.org/T126730

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: GWicke, Bene, Ricordisamoa, daniel, Lydia_Pintscher, Smalyshev, Jonas, 
Christopher, Yurik, hoo, Aklapper, aude, debt, Gehel, Izno, Luke081515, jkroll, 
Wikidata-bugs, Jdouglas, Deskana, Manybubbles, Mbch331, Jay8g, Ltrlg





[Wikidata-bugs] [Maniphest] [Edited] T102476: RFC: Requirements for change propagation

2016-02-11 Thread GWicke
GWicke edited the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T102476

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: ArielGlenn, hoo, Addshore, RobLa-WMF, StudiesWorld, intracer, Qgil, 
JanZerebecki, brion, Ltrlg, Anomie, Milimetric, mark, BBlack, aaron, daniel, 
Eevans, mobrovac, GWicke, Izno, Hardikj, Wikidata-bugs, aude, jayvdb, Mbch331, 
Jay8g, bd808, Legoktm





[Wikidata-bugs] [Maniphest] [Updated] T102476: RFC: Requirements for change propagation

2016-02-11 Thread GWicke
GWicke removed a project: Wikimedia-Developer-Summit-2016.

TASK DETAIL
  https://phabricator.wikimedia.org/T102476

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: ArielGlenn, hoo, Addshore, RobLa-WMF, StudiesWorld, intracer, Qgil, 
JanZerebecki, brion, Ltrlg, Anomie, Milimetric, mark, BBlack, aaron, daniel, 
Eevans, mobrovac, GWicke, Izno, Hardikj, Wikidata-bugs, aude, jayvdb, Mbch331, 
Jay8g, bd808, Legoktm





[Wikidata-bugs] [Maniphest] [Unblock] T102476: RFC: Requirements for change propagation

2016-01-22 Thread GWicke
GWicke closed blocking task T84923: Reliable publish / subscribe event bus as 
"Resolved".

TASK DETAIL
  https://phabricator.wikimedia.org/T102476

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: Addshore, RobLa-WMF, StudiesWorld, intracer, Qgil, JanZerebecki, brion, 
Ltrlg, Anomie, Milimetric, mark, BBlack, aaron, daniel, Eevans, mobrovac, 
GWicke, Hardikj, Wikidata-bugs, aude, jayvdb, Mbch331, Jay8g, bd808, Krenair, 
Legoktm





[Wikidata-bugs] [Maniphest] [Closed] T84923: Reliable publish / subscribe event bus

2016-01-22 Thread GWicke
GWicke closed this task as "Resolved".
GWicke claimed this task.
GWicke added a comment.

A basic event bus is now available in production, and is being populated with 
edit events from MediaWiki. Consumption is directly from Kafka at this point.

This means that the core proposal of this task is implemented. I'm closing this 
task to reflect this.


TASK DETAIL
  https://phabricator.wikimedia.org/T84923

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: Aklapper, Matanya, Ottomata, mmodell, Eevans, chasemp, brion, Krenair, 
Halfak, JanZerebecki, bd808, MZMcBride, mobrovac, GWicke, aaron, daniel, 
Hardikj, yuvipanda, debt, JAllemandou, jkroll, Smalyshev, Wikidata-bugs, 
Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, Mbch331, jeremyb





[Wikidata-bugs] [Maniphest] [Commented On] T114474: More flexible and modernized Recent Changes code

2015-12-28 Thread GWicke
GWicke added a comment.

@aude: Which questions would you like to resolve at the summit? Do you think 
this topic could also be reasonably discussed in a regular RFC meeting?


TASK DETAIL
  https://phabricator.wikimedia.org/T114474

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: aude, GWicke
Cc: RobLa-WMF, Izno, GWicke, Jdforrester-WMF, Krenair, Qgil, hoo, Addshore, 
daniel, aude, Aklapper, Wikidata-bugs, Mbch331, Jay8g





[Wikidata-bugs] [Maniphest] [Commented On] T102476: RFC: Requirements for change propagation

2015-12-28 Thread GWicke
GWicke added a comment.

@janzerebecki: The current intention is to keep change propagation relatively 
simple and efficient. Many services can be implemented with very small relative 
per-request overheads, and services with high per-request overheads can 
consider applying opportunistic batching transparently to all requests, for 
example using a batching proxy.
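
A minimal sketch of what such a batching proxy could look like, assuming a backend that accepts a list of requests in one call. The class name and the size-based flush policy are hypothetical; a real proxy would likely also flush on a timer so that a lone request is not held back indefinitely.

```python
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


class BatchingProxy(Generic[T]):
    """Collects individual requests and forwards them to the backend in batches."""

    def __init__(self, backend: Callable[[list[T]], None], max_batch: int = 3) -> None:
        self._backend = backend
        self._max_batch = max_batch
        self._pending: list[T] = []

    def submit(self, request: T) -> None:
        # Callers still send single requests; batching is transparent to them.
        self._pending.append(request)
        if len(self._pending) >= self._max_batch:
            self.flush()

    def flush(self) -> None:
        if self._pending:
            batch, self._pending = self._pending, []
            self._backend(batch)


calls: list[list[int]] = []
proxy = BatchingProxy(calls.append, max_batch=3)
for i in range(7):
    proxy.submit(i)
proxy.flush()  # send the remainder
print(calls)
```

The change propagation side keeps issuing one request per event, and only the service with high per-request overhead pays for the extra component.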


TASK DETAIL
  https://phabricator.wikimedia.org/T102476

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: Addshore, RobLa-WMF, StudiesWorld, intracer, Qgil, JanZerebecki, brion, 
Ltrlg, Anomie, Milimetric, mark, BBlack, aaron, daniel, Eevans, mobrovac, 
GWicke, Hardikj, Wikidata-bugs, aude, jayvdb, Mbch331, Jay8g, bd808, Krenair, 
Legoktm





[Wikidata-bugs] [Maniphest] [Edited] T102476: RFC: Requirements for change propagation

2015-12-28 Thread GWicke
GWicke edited the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T102476

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: Addshore, RobLa-WMF, StudiesWorld, intracer, Qgil, JanZerebecki, brion, 
Ltrlg, Anomie, Milimetric, mark, BBlack, aaron, daniel, Eevans, mobrovac, 
GWicke, Hardikj, Wikidata-bugs, aude, jayvdb, Mbch331, Jay8g, bd808, Krenair, 
Legoktm





[Wikidata-bugs] [Maniphest] [Commented On] T114019: Dumps 2.0 for realz (planning/architecture session)

2015-12-22 Thread GWicke
GWicke added a comment.

@ArielGlenn: To me it seems that the discussion so far lacks a shared agreement 
on what the most pressing problems with dumps are. This makes it difficult to 
evaluate candidate solutions and their trade-offs relative to the top 
priorities.

With the right preparation, a discussion at the dev summit could perhaps help 
to establish a shared agreement on the top problems to solve. It would be 
helpful if a candidate list could be worked out before the summit, so that it 
can inform the discussion.


TASK DETAIL
  https://phabricator.wikimedia.org/T114019

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn, GWicke
Cc: RobLa-WMF, GWicke, TTO, zhuyifei1999, StudiesWorld, gnosygnu, LA2, 
Ladsgroup, intracer, Lokal_Profil, Halfak, Legoktm, Qgil, JanZerebecki, brion, 
daniel, Hydriz, MZMcBride, hoo, ezachte, wpmirrordev, Nemo_bis, Aklapper, 
ArielGlenn, Wikidata-bugs, aude, Mbch331, Jay8g, Krenair





[Wikidata-bugs] [Maniphest] [Changed Subscribers] T105638: RFC: Streamlining Composer usage

2015-11-06 Thread GWicke
GWicke added a subscriber: mobrovac.

TASK DETAIL
  https://phabricator.wikimedia.org/T105638

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: JanZerebecki, GWicke
Cc: mobrovac, GWicke, Addshore, Qgil, Spage, greg, tstarling, aude, hoo, 
daniel, zeljkofilipin, thcipriani, mmodell, bd808, csteipp, Legoktm, Krinkle, 
hashar, JanZerebecki, Aklapper, Lynhg, Wikidata-bugs, Malyacko, Mbch331, Jay8g, 
Ltrlg, Krenair





[Wikidata-bugs] [Maniphest] [Commented On] T105638: RFC: Streamlining Composer usage

2015-11-06 Thread GWicke
GWicke added a subscriber: GWicke.
GWicke added a comment.

Here is an idea for a workflow-based solution that would work for nodejs as 
well:

1. Each code project has a corresponding deploy repository. For nodejs, current 
practice is to have the code as a submodule of the deploy repository (in src/). 
For MediaWiki, current practice is to have a deploy / dependency repository 
inside the code repository (vendor/). It might be worth investigating if 
inverting the relationship could be an option for MediaWiki as well, as this 
avoids deploy updates polluting the code repository history.
2. CI automatically updates the deploy repository for each test run by running 
composer / npm, and commits the result to git on successful test completion. 
The deploy repository hash is recorded in the test results.
3. For a deploy, one of the CI-prepared deploy repository commits is reviewed 
and merged. The diff clearly shows changes in dependencies. A potential issue 
here is making sure that the submodule patch is actually merged, but it seems 
that this could be solved with a hook.

This workflow is very close to what we are currently doing for node services, 
except that step 2) is currently performed manually, using a docker script.


TASK DETAIL
  https://phabricator.wikimedia.org/T105638

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: JanZerebecki, GWicke
Cc: GWicke, Addshore, Qgil, Spage, greg, tstarling, aude, hoo, daniel, 
zeljkofilipin, thcipriani, mmodell, bd808, csteipp, Legoktm, Krinkle, hashar, 
JanZerebecki, Aklapper, Lynhg, Wikidata-bugs, Malyacko, Mbch331, Jay8g, Ltrlg, 
Krenair





[Wikidata-bugs] [Maniphest] [Updated] T114474: More flexible and modernized Recent Changes code

2015-11-05 Thread GWicke
GWicke added a subscriber: GWicke.
GWicke added a comment.

There is some related high-level discussion about recent changes and page 
history as event streams in https://phabricator.wikimedia.org/T107595. One idea 
is to layer event streams, which would potentially let us integrate related 
events like edits to corresponding Wikidata items.
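
The layering idea can be sketched as a merge of time-ordered streams. This is 
only an illustration: the stream contents, the `src` field and the 
`layer_streams` helper are hypothetical, and it assumes every event carries an 
ISO 8601 `ts` in the same format and time zone, so that string order matches 
time order.

```python
import heapq


def layer_streams(*streams):
    """Lazily merge already time-ordered event streams into one stream by `ts`."""
    return heapq.merge(*streams, key=lambda event: event["ts"])


# Hypothetical streams: edits to a page and edits to its Wikidata item.
page_edits = [
    {"ts": "2015-11-05T10:00:00Z", "src": "page"},
    {"ts": "2015-11-05T10:05:00Z", "src": "page"},
]
item_edits = [
    {"ts": "2015-11-05T10:02:00Z", "src": "wikidata"},
]

for event in layer_streams(page_edits, item_edits):
    print(event["ts"], event["src"])
```

Because the merge is lazy, it would also work on unbounded streams, as long as 
each input stream is itself ordered by time.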


TASK DETAIL
  https://phabricator.wikimedia.org/T114474

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: GWicke, Jdforrester-WMF, Krenair, Qgil, hoo, Addshore, daniel, aude, 
Aklapper, Wikidata-bugs, Mbch331, Jay8g





[Wikidata-bugs] [Maniphest] [Commented On] T114443: EventBus MVP

2015-11-04 Thread GWicke
GWicke added a comment.

@faidon: Until very recently (the last few days), there wasn't actually any REST 
proxy with schema validation in the EventLogging repository. @ottomata now has a 
patch implementing such a service 
<https://gerrit.wikimedia.org/r/#/c/235671/24/server/bin/eventlogging-service>, 
and @mobrovac has left comments on it today. So, it looks like we'll have the 
option of choosing between two new services implementing the same API. I don't 
see having two implementations of a simple service as a bad thing. As 
mentioned, we might want to use a single node process exposing parsoid, 
restbase & eventbus for small (third party) installs, but might as well use the 
new EventLogging service in production.

There are still loose ends to be tied in the API and event schema definitions, 
and I think that should be our focus. The implementations deserve attention 
too, but they are easy to swap and only a few hundred lines each.

Replacing all of EventLogging is pretty much out of scope for EventBus. The 
focus is on queuing and event validation, and not on other EventLogging 
features like Varnish log decoding, analytics databases etc. If desired, we 
could fairly easily add HTTP event production in EL, which would write to 
EventBus instead of directly to Kafka. However, I personally think it's fine to 
let trusted producers write directly to Kafka, especially for internal 
applications. The current EL instance is producing to a separate (analytics) 
Kafka cluster in any case, so there is no potential for conflicts with 
non-analytics use cases.


TASK DETAIL
  https://phabricator.wikimedia.org/T114443

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Ottomata, GWicke
Cc: Milimetric, RobLa-WMF, brion, intracer, Smalyshev, mark, MZMcBride, 
Krinkle, EBernhardson, bd808, Joe, dr0ptp4kt, madhuvishy, Nuria, ori, faidon, 
aaron, GWicke, mobrovac, Eevans, Ottomata, Matanya, Aklapper, JAllemandou, 
jkroll, Hardikj, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, daniel, 
Mbch331, Jay8g, Ltrlg, jeremyb, Legoktm





[Wikidata-bugs] [Maniphest] [Commented On] T114443: EventBus MVP

2015-11-02 Thread GWicke
GWicke added a comment.

@ottomata: In my recollection of the discussion & the log you linked to, the 
question of which REST producer proxy to use was left open. Our priority is to 
get basic events into Kafka before the end of this month, so that we can start 
building on top of this for change propagation. We have a simple node service 
<https://github.com/wikimedia/restevent> that does what we need & integrates 
with our node infrastructure, but if you have something based on EventLogging 
soon then we can consider using that too.


TASK DETAIL
  https://phabricator.wikimedia.org/T114443

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Ottomata, GWicke
Cc: RobLa-WMF, brion, intracer, Smalyshev, mark, MZMcBride, Krinkle, 
EBernhardson, bd808, Joe, dr0ptp4kt, madhuvishy, Nuria, ori, faidon, aaron, 
GWicke, mobrovac, Eevans, Ottomata, Matanya, Aklapper, JAllemandou, jkroll, 
Hardikj, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, daniel, Jay8g, 
Ltrlg, jeremyb, Legoktm





[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation

2015-10-29 Thread GWicke
GWicke added a comment.

@ottomata: If you fill in the defaults at consumption time, then you have a 
choice of how you want to treat old events. You can either fill in the defaults 
from the latest schema (probably what you want in most cases), or choose to 
explicitly distinguish fields that were not yet defined at the time the event 
was produced.
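
A minimal sketch of the two options, assuming JSON-schema-like dicts with a 
`default` per property. The schema versions, the `minor` field, the `MISSING` 
sentinel and the `fill_defaults` helper are all hypothetical and only serve to 
illustrate the choice.

```python
from copy import deepcopy

# Hypothetical schema versions; field names are illustrative only.
SCHEMA_V1 = {"properties": {"title": {"type": "string"}}}
SCHEMA_V2 = {
    "properties": {
        "title": {"type": "string"},
        "minor": {"type": "boolean", "default": False},
    }
}

MISSING = object()  # sentinel: field did not exist when the event was produced


def fill_defaults(event, latest_schema, produced_schema=None):
    """Return a copy of `event` with absent fields filled in at consumption time.

    With only `latest_schema`, absent fields get the latest default.
    With `produced_schema` as well, fields unknown to that older schema are
    marked MISSING instead of being silently defaulted.
    """
    result = deepcopy(event)
    for name, spec in latest_schema["properties"].items():
        if name in result or "default" not in spec:
            continue
        known_then = (
            produced_schema is None or name in produced_schema["properties"]
        )
        result[name] = spec["default"] if known_then else MISSING
    return result


old_event = {"title": "San_Francisco"}
print(fill_defaults(old_event, SCHEMA_V2))
print(fill_defaults(old_event, SCHEMA_V2, SCHEMA_V1)["minor"] is MISSING)
```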


TASK DETAIL
  https://phabricator.wikimedia.org/T116247

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: mobrovac, GWicke
Cc: intracer, EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, 
GWicke, mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, 
chasemp, Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, 
JAllemandou, jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, 
RobLa-WMF, jeremyb





[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation

2015-10-29 Thread GWicke
GWicke added a comment.

@ottomata, you are basically making the case for filling in the defaults at 
consumption time.


TASK DETAIL
  https://phabricator.wikimedia.org/T116247

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: mobrovac, GWicke
Cc: intracer, EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, 
GWicke, mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, 
chasemp, Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, 
JAllemandou, jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, 
RobLa-WMF, jeremyb





[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation

2015-10-29 Thread GWicke
GWicke added a comment.

@ottomata, they will be filled in somewhere, but I think we haven't necessarily 
decided on filling them in at production time. To me it seems that filling in 
either at production or consumption time will work, as long as defaults don't 
change. It sounds like you have a concern in that area, though. Could you 
elaborate?


TASK DETAIL
  https://phabricator.wikimedia.org/T116247

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: mobrovac, GWicke
Cc: intracer, EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, 
GWicke, mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, 
chasemp, Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, 
JAllemandou, jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, 
RobLa-WMF, jeremyb





[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation

2015-10-29 Thread GWicke
GWicke added a comment.

@ottomata: Based on our backwards-compatibility rules, the latest schema will 
be a superset of previous schemas. This means that you will be able to 
understand both old and new data in a given topic using the //latest// schema.
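
To illustrate the superset rule, here is a rough sketch of a compatibility 
check. It assumes flat JSON-schema-like dicts with `properties` and `required` 
only; the schemas, field names and both helpers are hypothetical, and the type 
check is a stand-in for a real validator.

```python
TYPES = {"string": str, "integer": int, "boolean": bool}


def is_backward_compatible(old_schema, new_schema):
    """True if `new_schema` only adds optional fields to `old_schema`."""
    old_props = old_schema.get("properties", {})
    new_props = new_schema.get("properties", {})
    # Every old field must survive with the same declared type.
    for name, spec in old_props.items():
        if name not in new_props:
            return False
        if new_props[name].get("type") != spec.get("type"):
            return False
    # Newly required fields would reject old events, so none may be added.
    old_required = set(old_schema.get("required", []))
    new_required = set(new_schema.get("required", []))
    return new_required <= old_required


def conforms(event, schema):
    """Very small validator: required fields present, known fields well-typed."""
    props = schema.get("properties", {})
    if any(name not in event for name in schema.get("required", [])):
        return False
    return all(
        isinstance(value, TYPES[props[name]["type"]])
        for name, value in event.items()
        if name in props
    )


V1 = {
    "properties": {"title": {"type": "string"}, "rev": {"type": "integer"}},
    "required": ["title", "rev"],
}
# V2 adds an optional field only, so it stays a superset of V1.
V2 = {
    "properties": {**V1["properties"], "comment": {"type": "string"}},
    "required": ["title", "rev"],
}

old_event = {"title": "San_Francisco", "rev": 1}
new_event = {"title": "San_Francisco", "rev": 2, "comment": "typo"}
print(is_backward_compatible(V1, V2))
print(conforms(old_event, V2), conforms(new_event, V2))
```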


TASK DETAIL
  https://phabricator.wikimedia.org/T116247

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: mobrovac, GWicke
Cc: intracer, EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, 
GWicke, mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, 
chasemp, Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, 
JAllemandou, jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, 
RobLa-WMF, jeremyb





[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation

2015-10-28 Thread GWicke
GWicke added a comment.

@ottomata, I think understanding the semantics of an event primarily requires 
knowledge of the topic. The topic in turn provides access to the schema, which 
describes the structure of the events. It is likely that we'll have multiple 
topics record similarly-structured events, which means that they might share 
the same schema, but describe different semantic events in each topic. For 
example, a basic timing event can be emitted for clicks of button A or button 
B, each tracked in a separate topic.

I could be convinced to include the topic name / URL in each event. One use 
case this could potentially help with is streaming events from multiple topics. We 
could also handle this with a framing format, but this might force us to parse 
JSON on the consumer side, which wouldn't be great for performance.

Either way, given the topic name you should have no trouble accessing the 
schema. We can expose schemas for each topic URL in the REST API (ex: 
`/{topic}?schema.json`), which you could then store along with the event data 
in hadoop. Embedding an explicit schema url of the form described above might 
be a bit redundant, considering the simplicity of the construction.
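
As a sketch of how simple that construction is: the topic names, the 
`TOPIC_SCHEMAS` mapping and the base URL below are made up for illustration, 
and only the `/{topic}?schema.json` pattern comes from the comment above.

```python
from urllib.parse import quote

# Hypothetical topics: two button-click topics sharing one "timing" schema.
TOPIC_SCHEMAS = {
    "timing.button_a": "timing",
    "timing.button_b": "timing",
}


def schema_url(topic, base=""):
    """Build the schema URL for a topic using the `/{topic}?schema.json` pattern."""
    return f"{base}/{quote(topic, safe='')}?schema.json"


for topic, schema_name in TOPIC_SCHEMAS.items():
    print(topic, schema_name, schema_url(topic))
```

Both topics resolve to the same schema, while the topic name itself carries 
the semantic difference between the two kinds of click.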


TASK DETAIL
  https://phabricator.wikimedia.org/T116247

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: mobrovac, GWicke
Cc: intracer, EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, 
GWicke, mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, 
chasemp, Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, 
JAllemandou, jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, 
RobLa-WMF, jeremyb





[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation

2015-10-27 Thread GWicke
GWicke added a comment.

> I've been thinking about it too. Ideally, we could leave these fields out of 
> schema defs, simply reference them. But, that seems not to be in correlation 
> with storing them in a git repo. What I see as a possible solution is to put 
> these common fields into a separate file and let the producer proxy in front 
> of Kafka stick it into each schema def.


The validator lets us register schemas corresponding to urls 
<https://github.com/epoberezkin/ajv#addschemaarrayobjectobject-schema--string-key>,
 which will then be used when those are referenced via $ref.

We could also use a nested object to remove some redundancy in naming:

  {
    event: {
      id: '..v1 uuid..',
      ts: '2015-...',
      subject: '/some/uri'
    },
    // Event specific data
  }
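
The `$ref` approach can be sketched roughly as follows. This is not ajv's API 
but a simplified Python stand-in for registering a shared schema under a URL 
and expanding references to it. The registry URL, the `title` field and the 
`resolve_refs` helper are hypothetical.

```python
# Registry of shared schema fragments keyed by URL (all names hypothetical).
REGISTRY = {
    "/schemas/common/event": {
        "type": "object",
        "properties": {
            "id": {"type": "string"},
            "ts": {"type": "string"},
            "subject": {"type": "string"},
        },
    }
}


def resolve_refs(schema, registry):
    """Return a copy of `schema` with each {"$ref": url} replaced by its target."""
    if isinstance(schema, dict):
        if "$ref" in schema:
            return resolve_refs(registry[schema["$ref"]], registry)
        return {key: resolve_refs(value, registry) for key, value in schema.items()}
    if isinstance(schema, list):
        return [resolve_refs(item, registry) for item in schema]
    return schema


# An event-specific schema that only references the common `event` object.
EDIT_SCHEMA = {
    "type": "object",
    "properties": {
        "event": {"$ref": "/schemas/common/event"},
        "title": {"type": "string"},
    },
}

resolved = resolve_refs(EDIT_SCHEMA, REGISTRY)
print(sorted(resolved["properties"]["event"]["properties"]))
```

With this layout the common fields live in one place, and each event schema 
in the git repo only carries the reference.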


TASK DETAIL
  https://phabricator.wikimedia.org/T116247

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: mobrovac, GWicke
Cc: EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, 
mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, 
Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, 
jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb





[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation

2015-10-26 Thread GWicke
GWicke added a comment.

In https://phabricator.wikimedia.org/T116247#1754698, @Ottomata wrote:

> > If we have a use case for emitting two secondary events *to the same topic* 
> > that were both triggered by the same primary event (user click / request 
> > id), then we can generate a new ID for at least one of those events, and 
> > record the parent event id in a separate field (ex: par_id). This way, we 
> > can get the right deduplication semantics for each of those events.
>
>
> ?  What's the point of the request_id then?  I thought we wanted X-Request-Id 
> so that we can easily tie together events generated by the same http request.
>
> Why not just have `request_id` and `uuid` as separate fields that always 
> exist?


Sure, optionally having a separate request ID (in addition to the event ID) 
sounds good to me as well. We should always require / auto-generate the event ID 
(and use it for event deduplication, derived event timestamp etc), while the 
reqid can be added to events that are indeed request-triggered.

> > IMHO, the timestamps of the event ID and explicit timestamp (ts or dt) 
> > should always match. This makes it a lot simpler to automatically derive dt 
> > from id in the producer REST proxy. Other event-specific times (like the 
> > save time as recorded by MediaWiki) should imho go into the event body.
>
>
> Why?  I agree, that specific schemas can define additional timestamps, but 
> what is the harm in having a standard one that is set and used semantically 
> by the producer?  What if I wanted to explicitly feed a topic with events 
> dated in the past, perhaps for backfilling or recovery reasons?


That's exactly what the event ID and dt should support well. MW edit timestamps 
are low resolution, and in a custom format, which imho makes them less than 
ideal for general event ids / timestamps.


TASK DETAIL
  https://phabricator.wikimedia.org/T116247

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: mobrovac, GWicke
Cc: EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, 
mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, 
Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, 
jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb





[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation

2015-10-26 Thread GWicke
GWicke added a comment.

> I'm not so sure actually that these will always be redundant. I think the 
> request ID should be persisted to track the same event throughout the system. 
> Imagine a user clicks on something which produces an event in the queue and 
> that event triggers another one to be enqueued. Then, both of them should 
> have the same request id, but different time stamps, shouldn't they?


IMHO, the timestamps of the event ID and explicit timestamp (`ts` or `dt`) 
should always match. This makes it a lot simpler to automatically derive `dt` 
from `id` in the producer REST proxy.

If we have a use case for emitting two secondary events *to the same topic* 
that were both triggered by the same primary event (user click / request id), 
then we can generate a new ID for at least one of those events, and record the 
parent event id in a separate field (ex: `par_id`). This way, we can get the 
right deduplication semantics for each of those events.
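
A rough sketch of the `par_id` idea, assuming events are plain dicts and that 
deduplication within a topic is keyed on `id`. The `derive_event` and 
`deduplicate` helpers and the `type` / `target` fields are hypothetical.

```python
import uuid


def derive_event(parent, body):
    """Create a secondary event with its own v1 UUID and a `par_id` link."""
    return {"id": str(uuid.uuid1()), "par_id": parent["id"], **body}


def deduplicate(events):
    """Keep the first event seen for each `id`, preserving order."""
    seen = set()
    unique = []
    for event in events:
        if event["id"] not in seen:
            seen.add(event["id"])
            unique.append(event)
    return unique


primary = {"id": str(uuid.uuid1()), "type": "click"}
first = derive_event(primary, {"type": "update", "target": "a"})
second = derive_event(primary, {"type": "update", "target": "b"})

# A redelivered copy of `first` is dropped; `second` survives because it has
# its own id, even though both share the same parent.
topic = deduplicate([first, second, dict(first)])
print(len(topic), first["par_id"] == second["par_id"])
```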


TASK DETAIL
  https://phabricator.wikimedia.org/T116247

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, 
mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, 
Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, 
jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb





[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation

2015-10-26 Thread GWicke
GWicke added a comment.

> If we adopt a convention of always storing schema name and/or revision in the 
> schemas themselves, then we can do like EventLogging does and infer and 
> validate the schema based on this value. This would especially be helpful in 
> associating a message with an Avro Schema when serializing into binary.


The topic configuration will take precedence, so we wouldn't use 
client-supplied values for these fields, and would basically just write a part 
of the topic configuration into each event. We also decided that we will only 
evolve schemas in backwards-compatible ways. In practice, this means that we'll 
only add fields, and the latest schema will be able to validate both new and 
old data in each topic.
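
As an illustration of that precedence rule, here is a minimal sketch. The 
topic name, the configuration layout and the `schema` / `revision` field names 
are assumptions, not an agreed format.

```python
# Hypothetical topic configuration; keys and values are illustrative only.
TOPIC_CONFIG = {
    "mediawiki.edit": {"schema": "edit", "revision": 3},
}


def stamp_schema(topic, event):
    """Copy schema info from the topic configuration into the event.

    Client-supplied `schema` / `revision` values are overwritten, so the
    topic configuration always wins.
    """
    config = TOPIC_CONFIG[topic]
    return {**event, "schema": config["schema"], "revision": config["revision"]}


incoming = {"schema": "client-supplied", "title": "San_Francisco"}
print(stamp_schema("mediawiki.edit", incoming))
```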

@ottomata, what value do you see in recording the schema configured for a 
topic at enqueue time in each event?


TASK DETAIL
  https://phabricator.wikimedia.org/T116247

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, 
mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, 
Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, 
jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb





[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation

2015-10-23 Thread GWicke
GWicke added a comment.

I went ahead and updated the task description with the current framing / 
per-event schema. I renamed the `reqid` to just `id`, and added a `ts` field 
containing the same timestamp in ISO 8601 format.


TASK DETAIL
  https://phabricator.wikimedia.org/T116247

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, 
mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, 
Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, 
jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb





[Wikidata-bugs] [Maniphest] [Edited] T116247: Define edit related events for change propagation

2015-10-23 Thread GWicke
GWicke edited the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T116247

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, 
mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, 
Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, 
jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb





[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation

2015-10-23 Thread GWicke
GWicke added a comment.

> Right, but how would you do this in say, Hive? Or in bash? Timestamp logic 
> should be easy and immediate.


Yeah, Hive really seems to be lacking built-in support for UUIDs. There seems 
to be UDF code to deal with them, but it's definitely not as convenient as it 
could be. I'm fine with including the timestamp corresponding to the timeuuid 
to help Hive. The overhead is fairly small, and we can automate adding the 
timestamp even if only the UUID was supplied.
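
Automating this could look roughly like the following sketch, which assumes 
events are dicts with a v1 UUID in `id` and an optional ISO 8601 `ts`. The 
helper names are hypothetical; only the standard `uuid` and `datetime` modules 
are used.

```python
import uuid
from datetime import datetime, timedelta, timezone

# v1 UUID timestamps count 100 ns intervals since the Gregorian reform date.
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def uuid1_datetime(value):
    """Extract the embedded UTC time from a v1 UUID string."""
    parsed = uuid.UUID(value)
    if parsed.version != 1:
        raise ValueError("not a v1 (time-based) UUID")
    return GREGORIAN_EPOCH + timedelta(microseconds=parsed.time // 10)


def with_timestamp(event):
    """Return a copy of `event` with `ts` derived from `id` when `ts` is absent."""
    if "ts" in event:
        return dict(event)
    return {**event, "ts": uuid1_datetime(event["id"]).isoformat()}


print(with_timestamp({"id": str(uuid.uuid1())}))
```

Consumers such as Hive could then filter on the plain `ts` column without 
needing any UUID support.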


TASK DETAIL
  https://phabricator.wikimedia.org/T116247

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, 
mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, 
Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, 
jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb





[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation

2015-10-23 Thread GWicke
GWicke added a comment.

@JanZerebecki: Suppression information would indeed be needed for public access 
to older events. One option would be to key this on the event's UUID. We could 
also consider superseding the message using Kafka's deduplication (compaction) 
based on the same UUID.
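
The superseding idea can be illustrated with a small simulation of key-based 
compaction. This is not Kafka code: the real compaction runs in the background 
and keeps tombstones around for a while. The `compact` helper and the message 
contents are hypothetical.

```python
def compact(log):
    """Simulate key-based log compaction: keep only the latest value per key.

    A value of None acts as a tombstone and removes the key entirely.
    """
    latest = {}
    for key, value in log:
        latest.pop(key, None)  # re-insert so order reflects the latest write
        latest[key] = value
    return [(key, value) for key, value in latest.items() if value is not None]


# Messages keyed by event UUID (shortened here); the third one supersedes
# the first with a suppressed version of the same event.
log = [
    ("uuid-1", {"title": "A", "user": "Example"}),
    ("uuid-2", {"title": "B", "user": "Other"}),
    ("uuid-1", {"title": "A", "suppressed": True}),
]
print(compact(log))
```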


TASK DETAIL
  https://phabricator.wikimedia.org/T116247

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, 
mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, 
Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, 
jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb





[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation

2015-10-23 Thread GWicke
GWicke added a comment.

@ottomata, UUIDs are described in 
https://en.wikipedia.org/wiki/Universally_unique_identifier. An example for a 
v1 UUID is `b54adc00-67f9-11d9-9669-0800200c9a66`. There are libraries to 
extract the high-resolution timestamp for most environments.
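
As an example of such an extraction, this sketch decodes the UUID above with 
Python's standard `uuid` module. The `uuid1_to_datetime` helper is 
hypothetical; `UUID.time` and `UUID.version` are standard attributes.

```python
import uuid
from datetime import datetime, timedelta, timezone


def uuid1_to_datetime(value):
    """Convert the 60-bit timestamp of a v1 UUID to a UTC datetime."""
    # UUID.time counts 100 ns intervals since 1582-10-15 00:00 UTC.
    epoch = datetime(1582, 10, 15, tzinfo=timezone.utc)
    return epoch + timedelta(microseconds=value.time // 10)


example = uuid.UUID("b54adc00-67f9-11d9-9669-0800200c9a66")
print(example.version)
print(uuid1_to_datetime(example).isoformat())
```

The conversion truncates to microseconds, since that is the resolution of 
Python's `datetime`; the UUID itself carries 100 ns resolution.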

Regarding a separate timestamp in the framing information: Which time would 
this correspond to? The next version of Kafka is likely going to track 
enqueue time itself & support efficient retrieval by timestamp 
<http://www.confluent.io/blog/log-compaction-highlights-in-the-kafka-and-stream-processing-community-october-2015>,
 and enqueue time is something that should be handled in Kafka in any case. 
Other timestamps have event-specific semantics, like for example the MediaWiki 
save time, which is why I think it makes most sense to not include them in the 
framing information. All events should however have a unique identifier and 
timestamp that ties together all events triggered by the same original trigger, 
and can be used for per-topic de-duplication / idempotency. This is what the 
UUID in reqid would provide.


TASK DETAIL
  https://phabricator.wikimedia.org/T116247

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, 
mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, 
Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, 
jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb





[Wikidata-bugs] [Maniphest] [Changed Subscribers] T116247: Define edit related events for change propagation

2015-10-23 Thread GWicke
GWicke added a subscriber: EBernhardson.

TASK DETAIL
  https://phabricator.wikimedia.org/T116247

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, 
mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, 
Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, 
jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb





[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation

2015-10-22 Thread GWicke
GWicke added a comment.

Some notes from the meeting:

  1. Framing, for all events
    - **uri**: string; path or URL. Example: 
      /en.wikipedia.org/v1/page/title/San_Francisco
    - **reqid**: v1 UUID 
      <https://en.wikipedia.org/wiki/Universally_unique_identifier#Version_1_.28MAC_address_.26_date-time.29>;
      corresponding to the `x-request-id` header, or another primary event 
      identifier. V1 UUIDs contain a high-resolution timestamp.
    - domain: en.wikipedia.org, fr.wiktionary.org, ...; no mobile variants.

Edit events
---

- title: string
- pageid
- revision: integer
- savetime: iso 8601
- Other metadata, like the user etc.
- Generally, no overly sensitive information (like client IPs for authenticated 
  edits) in primary events.
  - Can be included in expanded message in separate topic, or stored separately 
    based on reqid.
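
Put together, the notes above suggest a structure along these lines. This is 
only a sketch: the class names, the nesting of the framing under its own key, 
the integer type for `pageid` and the `meta` dict are assumptions, and the 
sample values are made up.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class Framing:
    """Fields shared by all events."""
    uri: str     # path or URL of the affected resource
    reqid: str   # v1 UUID, e.g. taken from the x-request-id header
    domain: str  # canonical domain, no mobile variants


@dataclass
class EditEvent:
    """Edit-specific fields; sensitive data such as client IPs is left out."""
    framing: Framing
    title: str
    pageid: int    # type assumed, not stated in the notes
    revision: int
    savetime: str  # ISO 8601
    meta: dict = field(default_factory=dict)  # other metadata, like the user


event = EditEvent(
    framing=Framing(
        uri="/en.wikipedia.org/v1/page/title/San_Francisco",
        reqid="b54adc00-67f9-11d9-9669-0800200c9a66",
        domain="en.wikipedia.org",
    ),
    title="San_Francisco",
    pageid=1,
    revision=2,
    savetime="2015-10-22T00:00:00Z",
)
print(asdict(event))
```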


TASK DETAIL
  https://phabricator.wikimedia.org/T116247

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, mobrovac, MZMcBride, 
bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, Eevans, mmodell, 
Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, jkroll, Wikidata-bugs, 
Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, 01tonythomas, jeremyb





[Wikidata-bugs] [Maniphest] [Edited] T84923: Reliable publish / subscribe event bus

2015-10-21 Thread GWicke
GWicke edited the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T84923

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: Aklapper, Matanya, Mattflaschen, Ottomata, mmodell, Eevans, chasemp, brion, 
Krenair, Halfak, JanZerebecki, bd808, MZMcBride, mobrovac, GWicke, aaron, 
daniel, Hardikj, yuvipanda, JAllemandou, jkroll, Smalyshev, Wikidata-bugs, 
Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb





[Wikidata-bugs] [Maniphest] [Updated] T84923: Reliable publish / subscribe event bus

2015-10-21 Thread GWicke
GWicke added a blocked task: T102476: RFC: Requirements for change propagation.

TASK DETAIL
  https://phabricator.wikimedia.org/T84923

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: Aklapper, Matanya, Mattflaschen, Ottomata, mmodell, Eevans, chasemp, brion, 
Krenair, Halfak, JanZerebecki, bd808, MZMcBride, mobrovac, GWicke, aaron, 
daniel, Hardikj, yuvipanda, JAllemandou, jkroll, Smalyshev, Wikidata-bugs, 
Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb





[Wikidata-bugs] [Maniphest] [Updated] T116247: Define edit related events for change propagation

2015-10-21 Thread GWicke
GWicke added a blocked task: T102476: RFC: Requirements for change propagation.

TASK DETAIL
  https://phabricator.wikimedia.org/T116247

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: yuvipanda, Hardikj, daniel, aaron, GWicke, mobrovac, MZMcBride, bd808, 
JanZerebecki, Halfak, Krenair, brion, chasemp, Eevans, mmodell, Ottomata, 
Mattflaschen, Matanya, Aklapper, JAllemandou, jkroll, Smalyshev, Wikidata-bugs, 
Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb





[Wikidata-bugs] [Maniphest] [Edited] T116247: Define edit related events for change propagation

2015-10-21 Thread GWicke
GWicke edited the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T116247

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: yuvipanda, Hardikj, daniel, aaron, GWicke, mobrovac, MZMcBride, bd808, 
JanZerebecki, Halfak, Krenair, brion, chasemp, Eevans, mmodell, Ottomata, 
Mattflaschen, Matanya, Aklapper, JAllemandou, jkroll, Smalyshev, Wikidata-bugs, 
Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb





[Wikidata-bugs] [Maniphest] [Edited] T116247: Define edit related events for change propagation

2015-10-21 Thread GWicke
GWicke edited the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T116247

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: yuvipanda, Hardikj, daniel, aaron, GWicke, mobrovac, MZMcBride, bd808, 
JanZerebecki, Halfak, Krenair, brion, chasemp, Eevans, mmodell, Ottomata, 
Mattflaschen, Matanya, Aklapper, JAllemandou, jkroll, Smalyshev, Wikidata-bugs, 
Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb





[Wikidata-bugs] [Maniphest] [Created] T116247: Define edit related events for change propagation

2015-10-21 Thread GWicke
GWicke created this task.
GWicke added subscribers: Aklapper, Matanya, Mattflaschen, Ottomata, mmodell, 
Eevans, chasemp, brion, Krenair, Halfak, JanZerebecki, bd808, MZMcBride, 
mobrovac, GWicke, aaron, daniel, Hardikj, yuvipanda.
GWicke added projects: operations, EventBus, Discovery, Epic, Analytics, 
Wikidata, MediaWiki-General-or-Unknown, Services, Service-Architecture, 
Wikidata-Query-Service.

TASK DESCRIPTION
  Our (#services) primary focus this quarter is on enabling change propagation 
for edit-related events. We already track such events in [a custom 
extension](https://github.com/wikimedia/mediawiki-extensions-RestBaseUpdateJobs/blob/master/RestbaseUpdate.hooks.php),
 which then creates custom jobs, which in turn perform HTTP requests to 
RESTBase. Instead, we would like to cover this functionality with more 
general-purpose events using the event bus:
  
  - article creation
  - article deletion
  - article undeletion
  - article edit
  - article rename
  - revision deletion / suppression
  - file upload
  
  ## Other use cases
  
  - Change propagation between content types
    - edit triggers Parsoid re-parse, which triggers mobile app service & 
      metadata updates
  - Wikidata changes
    - use cases: invalidate pages using specific wikidata items; keeping the 
      #wikidata-query-service up to date
  - Analytics: 
    https://meta.wikimedia.org/wiki/Research:MediaWiki_events:_a_generalized_public_event_datasource
  
  ## Considerations
  
  - naming of articles / resources vs. topics vs. subscriptions: Generally use 
URLs / paths as discussed in T102476 (section "Addressing of components")?
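
As a rough illustration of the events listed above, the following sketch builds a generic edit-related event that is addressed by resource URL. The type strings, the envelope fields (`type`, `uri`, `timestamp`) and the `makeEvent` helper are assumptions made for this example only; no schema has been agreed on in this task.

```javascript
// Hypothetical event types, one per bullet in the list above.
const EVENT_TYPES = [
    'article_create', 'article_delete', 'article_undelete', 'article_edit',
    'article_rename', 'revision_delete', 'file_upload'
];

// Build a generic event addressed by resource URL (assumed envelope format).
function makeEvent(type, uri, timestamp) {
    if (!EVENT_TYPES.includes(type)) {
        throw new Error('Unknown event type: ' + type);
    }
    return {
        type: type,
        uri: uri,
        timestamp: timestamp || new Date().toISOString()
    };
}

const event = makeEvent('article_edit',
    'https://en.wikipedia.org/wiki/Foobar', '2015-10-21T00:00:00Z');
console.log(JSON.stringify(event));
```

Addressing events by URL would let consumers subscribe by path prefix, but whether that is the right addressing scheme is exactly the open question raised under "Considerations".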

TASK DETAIL
  https://phabricator.wikimedia.org/T116247

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: yuvipanda, Hardikj, daniel, aaron, GWicke, mobrovac, MZMcBride, bd808, 
JanZerebecki, Halfak, Krenair, brion, chasemp, Eevans, mmodell, Ottomata, 
Mattflaschen, Matanya, Aklapper, JAllemandou, jkroll, Smalyshev, Wikidata-bugs, 
Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb



[Wikidata-bugs] [Maniphest] [Raised Priority] T116247: Define edit related events for change propagation

2015-10-21 Thread GWicke
GWicke raised the priority of this task from "Normal" to "High".
GWicke set Security to None.

TASK DETAIL
  https://phabricator.wikimedia.org/T116247

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: yuvipanda, Hardikj, daniel, aaron, GWicke, mobrovac, MZMcBride, bd808, 
JanZerebecki, Halfak, Krenair, brion, chasemp, Eevans, mmodell, Ottomata, 
Mattflaschen, Matanya, Aklapper, JAllemandou, jkroll, Smalyshev, Wikidata-bugs, 
Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb



[Wikidata-bugs] [Maniphest] [Commented On] T114443: EventBus MVP

2015-10-21 Thread GWicke
GWicke added a comment.

We are having a hangout meeting tomorrow (Thursday, 22nd) between 11am and 12pm SF 
time. Please let us know if you'd like to join.


TASK DETAIL
  https://phabricator.wikimedia.org/T114443

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Ottomata, GWicke
Cc: mark, MZMcBride, Krinkle, EBernhardson, bd808, Joe, dr0ptp4kt, madhuvishy, 
Nuria, ori, faidon, aaron, GWicke, mobrovac, Eevans, Ottomata, Matanya, 
Aklapper, JAllemandou, jkroll, Smalyshev, Hardikj, Wikidata-bugs, Jdouglas, 
aude, Deskana, Manybubbles, daniel, RobLa-WMF, Jay8g, Ltrlg, jeremyb, Legoktm



[Wikidata-bugs] [Maniphest] [Commented On] T114443: EventBus MVP

2015-10-19 Thread GWicke
GWicke added a comment.

A PR adding remote schema support to the nodejs frontend is now available at 
https://github.com/wikimedia/restevent/pull/1. This means that we can now 
choose to use local or remote schemas per-topic in the configuration.
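
To make the per-topic choice concrete, here is a sketch of what such a configuration could look like. The topic names, the file path, the example.org URL and the `schemaSourceFor` helper are all hypothetical; the actual configuration format is defined in the PR, not here.

```javascript
// Hypothetical per-topic configuration: each topic names its schema source,
// either a local file path or a remote URL.
const topicConfig = {
    'core.page_edit': { schema: './schemas/page_edit.json' },
    'el.SomeSchema': { schema: 'https://schemas.example.org/SomeSchema.json' }
};

// A schema is treated as remote if it looks like an http(s) URL.
function isRemoteSchema(schema) {
    return /^https?:\/\//.test(schema);
}

// Report where the schema for a topic would be loaded from.
function schemaSourceFor(topic) {
    const conf = topicConfig[topic];
    if (!conf) {
        throw new Error('No configuration for topic: ' + topic);
    }
    return {
        kind: isRemoteSchema(conf.schema) ? 'remote' : 'local',
        location: conf.schema
    };
}

console.log(schemaSourceFor('core.page_edit'));
console.log(schemaSourceFor('el.SomeSchema'));
```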


TASK DETAIL
  https://phabricator.wikimedia.org/T114443

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Ottomata, GWicke
Cc: mark, MZMcBride, Krinkle, EBernhardson, bd808, Joe, dr0ptp4kt, madhuvishy, 
Nuria, ori, faidon, aaron, GWicke, mobrovac, Eevans, Ottomata, Matanya, 
Aklapper, JAllemandou, jkroll, Smalyshev, Hardikj, Wikidata-bugs, Jdouglas, 
RobH, aude, Deskana, Manybubbles, daniel, JanZerebecki, RobLa-WMF, Jay8g, 
Ltrlg, fgiunchedi, Dzahn, jeremyb, Legoktm, chasemp, Krenair



[Wikidata-bugs] [Maniphest] [Commented On] T114443: EventBus MVP

2015-10-16 Thread GWicke
GWicke added a comment.

> For starters, it means that we have alternatives for environments where Kafka 
> is overkill (small third-party installations, dev environments, mw-vagrant, 
> etc). Using, for example, sqlite instead of Kafka is already something 
> supported.


As far as I can see, there is no support for using any database as a queue / 
log in a way that would give us a light-weight alternative to Kafka. There is 
generally no support for streaming from a database in EventLogging, and 
separate tables are created whenever a schema is changed.

We'll have to implement this either way. We do have fairly nice async table 
abstractions for sqlite and cassandra that we could reuse for this in node. 
Both already implement retention policies. Python has sqlalchemy, which is a 
pretty nice way to interface with dbs. Retention policies would have to be 
implemented manually.
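
For illustration, the two properties a database-backed alternative would need (streaming reads from an offset, and a retention policy) can be sketched with an in-memory stand-in. This is not the sqlite or cassandra table abstraction mentioned above; the `EventLog` class and its method names are made up for the example.

```javascript
// In-memory stand-in for a database-backed event log (not sqlite itself).
class EventLog {
    constructor(maxAgeMs) {
        this.maxAgeMs = maxAgeMs;
        this.entries = [];      // kept in offset order
        this.nextOffset = 0;
    }

    // Append an event and return the offset it was stored at.
    append(event, now) {
        const offset = this.nextOffset++;
        this.entries.push({ offset: offset, time: now, event: event });
        return offset;
    }

    // Streaming read: everything at or after a consumer's offset.
    readFrom(offset) {
        return this.entries.filter((e) => e.offset >= offset);
    }

    // Retention policy: drop entries older than maxAgeMs.
    expire(now) {
        this.entries = this.entries.filter((e) => now - e.time <= this.maxAgeMs);
    }
}

const log = new EventLog(1000);
log.append({ type: 'edit' }, 0);
log.append({ type: 'delete' }, 900);
log.expire(1500);   // the first entry is 1500ms old by now and is dropped
console.log(log.readFrom(0));
```

Offsets keep increasing after expiry, so a consumer that remembers its last offset can resume without re-reading old events.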


TASK DETAIL
  https://phabricator.wikimedia.org/T114443

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Ottomata, GWicke
Cc: mark, MZMcBride, Krinkle, EBernhardson, bd808, Joe, dr0ptp4kt, madhuvishy, 
Nuria, ori, faidon, aaron, GWicke, mobrovac, Eevans, Ottomata, Matanya, 
Aklapper, JAllemandou, jkroll, Smalyshev, Hardikj, Wikidata-bugs, Jdouglas, 
RobH, aude, Deskana, Manybubbles, daniel, JanZerebecki, RobLa-WMF, Jay8g, 
fgiunchedi, Dzahn, jeremyb, Legoktm, chasemp, Krenair



[Wikidata-bugs] [Maniphest] [Commented On] T114443: EventBus MVP

2015-10-16 Thread GWicke
GWicke added a comment.

In https://phabricator.wikimedia.org/T114443#1731399, @ori wrote:

> In https://phabricator.wikimedia.org/T114443#1731284, @GWicke wrote:
>
> > See https://phabricator.wikimedia.org/T88459#1604768. tl;dr: It's not 
> > necessarily clear that saving very little code (see above) for EL schema 
> > fetching outweighs the cost of additional hardware.
>
>
> Could you explain how you arrived at the figure of 50k requests per second, 
> which you project for this service?


This is @ottomata's projection for analytics use cases. For core events, 
throughput should be of a lesser concern as rates will likely be in the low 
hundreds of messages per second.


TASK DETAIL
  https://phabricator.wikimedia.org/T114443

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Ottomata, GWicke
Cc: mark, MZMcBride, Krinkle, EBernhardson, bd808, Joe, dr0ptp4kt, madhuvishy, 
Nuria, ori, faidon, aaron, GWicke, mobrovac, Eevans, Ottomata, Matanya, 
Aklapper, JAllemandou, jkroll, Smalyshev, Hardikj, Wikidata-bugs, Jdouglas, 
RobH, aude, Deskana, Manybubbles, daniel, JanZerebecki, RobLa-WMF, Jay8g, 
fgiunchedi, Dzahn, jeremyb, Legoktm, chasemp, Krenair



[Wikidata-bugs] [Maniphest] [Commented On] T114443: EventBus MVP

2015-10-16 Thread GWicke
GWicke added a comment.

In https://phabricator.wikimedia.org/T114443#1730753, @Eevans wrote:

> 1. Already leverages a (really slick) JSON schema registry 
> <https://meta.wikimedia.org/wiki/Category:Schemas_%28active%29?status=active>


Optionally fetching schemas from a URL isn't that hard really. Example code:

  if (/^https?:\/\//.test(schema)) {
      // Remote schema: fetch it over HTTP.
      return preq.get(schema);
  } else {
      // Local schema: read it from a file.
      return readFromFile(schema);
  }

This lets us support files for core events, and fetching schemas from meta for 
EL. Schema validation is a call to a library.
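
To show roughly what the validation step amounts to once a schema is loaded, here is a deliberately tiny stand-in for a real JSON schema library. It only checks `required` and the primitive `type` of top-level properties (string, number, boolean); the `validate` function and the sample schema are illustrative assumptions, and a production service would call an actual validator instead.

```javascript
// Minimal stand-in for a JSON schema validator: checks only "required"
// properties and the primitive "type" of top-level properties.
function validate(schema, event) {
    const errors = [];
    (schema.required || []).forEach((name) => {
        if (!(name in event)) {
            errors.push('missing property: ' + name);
        }
    });
    Object.keys(schema.properties || {}).forEach((name) => {
        const expected = schema.properties[name].type;
        if (name in event && expected && typeof event[name] !== expected) {
            errors.push('wrong type for ' + name + ': expected ' + expected);
        }
    });
    return errors;
}

// Hypothetical schema for an edit event.
const schema = {
    required: ['title', 'rev'],
    properties: { title: { type: 'string' }, rev: { type: 'number' } }
};

console.log(validate(schema, { title: 'Foobar', rev: 42 }));
console.log(validate(schema, { title: 'Foobar' }));
```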

> 1. Provides a pluggable, composable, architecture with support for a wide 
> range of readers/writers


How would this be an advantage for the EventBus portion? Many third-party users 
will actually only want a minimal event bus, and EL doesn't seem to help with 
this from what I have seen.

> - schema registry availability


There are more concerns here than just availability (although that's important, 
too).

Third party users won't necessarily want to give their service access to the 
internet in order to fetch schemas. We need to provide a way to retrieve a full 
set of core schemas, and a git repository is an easy way to achieve this.

We also need proper code review and versioning for core schemas, and wikis 
don't really support code review. We could consider storing pointers to schemas 
(URLs) instead of the actual schemas in git, but this adds complexity without 
much apparent benefit:

Workflow with schemas in git:

1. create a patch with a schema change
2. code review

Workflow with pointers to schemas (URLs) in git:

1. save a new schema on meta; note revision id
2. create a patch with a schema URL change
3. code review



> For performance, it needs to be Good Enough(tm), where Good Enough should be 
> something we can quantify based on factors like latency, throughput, and 
> capacity costs that aren't prohibitively expensive when weighed against other 
> factors (e.g. engineering effort).


See https://phabricator.wikimedia.org/T88459#1604768. tl;dr: It's not 
necessarily clear that saving very little code (see above) for EL schema 
fetching outweighs the cost of additional hardware.


TASK DETAIL
  https://phabricator.wikimedia.org/T114443

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Ottomata, GWicke
Cc: mark, MZMcBride, Krinkle, EBernhardson, bd808, Joe, dr0ptp4kt, madhuvishy, 
Nuria, ori, faidon, aaron, GWicke, mobrovac, Eevans, Ottomata, Matanya, 
Aklapper, JAllemandou, jkroll, Smalyshev, Hardikj, Wikidata-bugs, Jdouglas, 
RobH, aude, Deskana, Manybubbles, daniel, JanZerebecki, RobLa-WMF, Jay8g, 
fgiunchedi, Dzahn, jeremyb, Legoktm, chasemp, Krenair



[Wikidata-bugs] [Maniphest] [Changed Project Column] T114443: EventBus MVP

2015-10-07 Thread GWicke
GWicke moved this task to Ready for RFC meeting on the MediaWiki-RfCs workboard.

TASK DETAIL
  https://phabricator.wikimedia.org/T114443

WORKBOARD
  https://phabricator.wikimedia.org/project/board/52/

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: EBernhardson, bd808, Joe, dr0ptp4kt, madhuvishy, Nuria, ori, faidon, aaron, 
GWicke, mobrovac, Eevans, Ottomata, Matanya, Aklapper, JAllemandou, jkroll, 
Smalyshev, Hardikj, Wikidata-bugs, Jdouglas, RobH, aude, Deskana, Manybubbles, 
daniel, mark, JanZerebecki, RobLa-WMF, Jay8g, fgiunchedi, Dzahn, jeremyb, 
Legoktm, chasemp, Krenair



[Wikidata-bugs] [Maniphest] [Updated] T114443: EventBus MVP

2015-10-07 Thread GWicke
GWicke added a project: MediaWiki-RfCs.

TASK DETAIL
  https://phabricator.wikimedia.org/T114443

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: EBernhardson, bd808, Joe, dr0ptp4kt, madhuvishy, Nuria, ori, faidon, aaron, 
GWicke, mobrovac, Eevans, Ottomata, Matanya, Aklapper, JAllemandou, jkroll, 
Smalyshev, Hardikj, Wikidata-bugs, Jdouglas, RobH, aude, Deskana, Manybubbles, 
daniel, mark, JanZerebecki, RobLa-WMF, Jay8g, fgiunchedi, Dzahn, jeremyb, 
Legoktm, chasemp, Krenair



[Wikidata-bugs] [Maniphest] [Commented On] T114443: EventBus MVP

2015-10-05 Thread GWicke
GWicke added a comment.

I guess we have slightly different ideas about what a message bus should be:

1. a way to get blobs from a to b, and
2. a way to expose a stream of events in a defined format that can be consumed 
easily by a range of clients.

The use cases I care about require 2). Applying my interpretation of the 
Robustness Principle <https://en.wikipedia.org/wiki/Robustness_principle> to 
that use case means thoroughly checking / coercing things on the way in, and 
keeping promises on the way out.

I also agree that it is possible to implement 2) by writing directly to Kafka, 
provided that *each* producer

- emits only events satisfying the expected (current) schema,
- never writes to queues it shouldn't write to (access control), and
- is fully aware of internal optimizations such as binary encodings and 
compression specific to the event queue implementation and topic.

Add to this requirements like emitting per-topic metrics, and I think it 
becomes clear why limiting the number of implementations is desirable.
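
The checks listed above are what a single front end could centralize. The sketch below is only meant to show the shape of that logic: the topic table, the producer names, the required-field check standing in for schema validation, and the in-memory `queue` are all assumptions for the example.

```javascript
// Hypothetical topic table: allowed producers and required fields per topic.
const topics = {
    'page_edit': { producers: ['mediawiki'], required: ['title', 'rev'] }
};
const metrics = {};   // per-topic counters
const queue = [];     // stand-in for the real queue backend

function count(topic, outcome) {
    metrics[topic] = metrics[topic] || { accepted: 0, rejected: 0 };
    metrics[topic][outcome]++;
}

function enqueue(producer, topic, event) {
    const conf = topics[topic];
    // Access control: only configured producers may write to a topic.
    if (!conf || !conf.producers.includes(producer)) {
        count(topic, 'rejected');
        return { ok: false, reason: 'access denied' };
    }
    // Schema check (reduced to required fields for this sketch).
    const missing = conf.required.filter((name) => !(name in event));
    if (missing.length) {
        count(topic, 'rejected');
        return { ok: false, reason: 'invalid event, missing: ' + missing.join(', ') };
    }
    // Backend-specific encoding / compression would happen here.
    queue.push({ topic: topic, event: event });
    count(topic, 'accepted');
    return { ok: true };
}
```

With direct writes to Kafka, each producer implementation would have to repeat all three steps plus the metrics.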

I also think that we should look at actual data before making assumptions about 
latency. For example, simple Kafka clients establish a new TCP connection per 
write, and might even fetch metadata for each connection. The simple REST 
service (120 lines) processes 1100+ req/s with a mean enqueue latency of around 
10ms, with both Kafka and the service running on a single-core labs instance. 
At production load, low single-digit ms should be typical. There will be use 
cases where sub-ms latency or extremely high volume is needed & REST is not a 
good fit, but let's base decisions around that on actual data.
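
One way to read the benchmark figures above, assuming both were measured in the same steady-state run, is through Little's law (mean requests in flight = arrival rate × mean latency). The helper name below is made up for the example.

```javascript
// Little's law: mean requests in flight = arrival rate * mean latency.
function meanInFlight(requestsPerSecond, meanLatencyMs) {
    return requestsPerSecond * meanLatencyMs / 1000;
}

// 1100 req/s at a mean enqueue latency of 10ms.
console.log(meanInFlight(1100, 10));  // 11
```

Under that assumption roughly 11 requests were in flight on average, which suggests the 10ms figure was measured under concurrent load rather than as the latency of a single isolated request.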

Regarding the monolog backend, my understanding based on 
https://phabricator.wikimedia.org/T108618 and conversations is that this is 
primarily aiming to ship events to hadoop for later analysis. As such, its 
message format is geared towards that use case, and no effort has been made to 
generalize events and their representation for general use. That said, we 
*could* consider using the monolog integration for emitting more general events 
from MediaWiki, but would then also need to implement support for alternative 
backends, and ensure that schemas agree.


TASK DETAIL
  https://phabricator.wikimedia.org/T114443

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: EBernhardson, bd808, Joe, dr0ptp4kt, madhuvishy, Nuria, ori, faidon, aaron, 
GWicke, mobrovac, Halfak, Eevans, Ottomata, Matanya, Aklapper, JAllemandou, 
jkroll, Smalyshev, Hardikj, Wikidata-bugs, Jdouglas, RobH, aude, Deskana, 
Manybubbles, mark, JanZerebecki, RobLa-WMF, fgiunchedi, Dzahn, jeremyb, 
chasemp, Krenair



[Wikidata-bugs] [Maniphest] [Commented On] T114443: EventBus MVP

2015-10-04 Thread GWicke
GWicke added a comment.

@ori, I changed the text to clarify which of those are potential, and which are 
concrete plans for this quarter. Please follow the provided links if things are 
still unclear.


TASK DETAIL
  https://phabricator.wikimedia.org/T114443

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: EBernhardson, bd808, Joe, dr0ptp4kt, madhuvishy, Nuria, ori, faidon, aaron, 
GWicke, mobrovac, Halfak, Eevans, Ottomata, Matanya, Aklapper, JAllemandou, 
jkroll, Smalyshev, Hardikj, Wikidata-bugs, Jdouglas, RobH, aude, Deskana, 
Manybubbles, mark, JanZerebecki, RobLa-WMF, fgiunchedi, Dzahn, jeremyb, 
chasemp, Krenair



[Wikidata-bugs] [Maniphest] [Edited] T114443: EventBus MVP

2015-10-04 Thread GWicke
GWicke edited the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T114443

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: EBernhardson, bd808, Joe, dr0ptp4kt, madhuvishy, Nuria, ori, faidon, aaron, 
GWicke, mobrovac, Halfak, Eevans, Ottomata, Matanya, Aklapper, JAllemandou, 
jkroll, Smalyshev, Hardikj, Wikidata-bugs, Jdouglas, RobH, aude, Deskana, 
Manybubbles, mark, JanZerebecki, RobLa-WMF, fgiunchedi, Dzahn, jeremyb, 
chasemp, Krenair



[Wikidata-bugs] [Maniphest] [Commented On] T114443: EventBus MVP

2015-10-04 Thread GWicke
GWicke added a comment.

@Nuria, see the task description, heading "Initial use cases".


TASK DETAIL
  https://phabricator.wikimedia.org/T114443

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: EBernhardson, bd808, Joe, dr0ptp4kt, madhuvishy, Nuria, ori, faidon, aaron, 
GWicke, mobrovac, Halfak, Eevans, Ottomata, Matanya, Aklapper, JAllemandou, 
jkroll, Smalyshev, Hardikj, Wikidata-bugs, Jdouglas, RobH, aude, Deskana, 
Manybubbles, mark, JanZerebecki, RobLa-WMF, fgiunchedi, Dzahn, jeremyb, 
chasemp, Krenair



[Wikidata-bugs] [Maniphest] [Commented On] T114443: EventBus MVP

2015-10-02 Thread GWicke
GWicke added a comment.

@ottomata, yes. One of the motivations for having a REST interface is 
having,... an interface.


TASK DETAIL
  https://phabricator.wikimedia.org/T114443

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: EBernhardson, bd808, Joe, dr0ptp4kt, madhuvishy, Nuria, ori, faidon, aaron, 
GWicke, mobrovac, Halfak, Eevans, Ottomata, Matanya, Aklapper, JAllemandou, 
jkroll, Smalyshev, Hardikj, Wikidata-bugs, Jdouglas, RobH, aude, Deskana, 
Manybubbles, mark, JanZerebecki, RobLa-WMF, fgiunchedi, Dzahn, jeremyb, 
chasemp, Krenair



[Wikidata-bugs] [Maniphest] [Commented On] T114443: EventBus MVP

2015-10-02 Thread GWicke
GWicke added a comment.

@ottomata, main reason would be the ability to work with $simple_queue, 
$binary_kafka, $amazon_queue and so on without changes in MW code. This isn't 
so theoretical. We'll want a lighter-weight queue for testing, developers and 
third party users rather soon.
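
A sketch of what this decoupling could look like: producer code talks to one small interface, and the queue behind it is chosen by configuration. The class names and the `enqueue(topic, event)` method are hypothetical, and both backends here are in-memory stand-ins rather than real Kafka or cloud queue clients.

```javascript
// Hypothetical backend interface: anything with an enqueue(topic, event) method.
class MemoryQueue {
    constructor() { this.messages = []; }
    enqueue(topic, event) { this.messages.push({ topic: topic, event: event }); }
}

// Stand-in for a backend that has its own wire encoding.
class JsonLinesQueue {
    constructor() { this.lines = []; }
    enqueue(topic, event) {
        this.lines.push(JSON.stringify({ topic: topic, event: event }));
    }
}

// Producer code depends only on this function, not on a concrete backend.
function makeEmitter(backend) {
    return (topic, event) => backend.enqueue(topic, event);
}

const backend = new MemoryQueue();   // could be swapped for JsonLinesQueue
const emit = makeEmitter(backend);
emit('page_edit', { title: 'Foobar' });
console.log(backend.messages.length);  // 1
```

Swapping `MemoryQueue` for another backend changes one line of setup and nothing in the code that calls `emit`.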


TASK DETAIL
  https://phabricator.wikimedia.org/T114443

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: EBernhardson, bd808, Joe, dr0ptp4kt, madhuvishy, Nuria, ori, faidon, aaron, 
GWicke, mobrovac, Halfak, Eevans, Ottomata, Matanya, Aklapper, JAllemandou, 
jkroll, Smalyshev, Hardikj, Wikidata-bugs, Jdouglas, RobH, aude, Deskana, 
Manybubbles, mark, JanZerebecki, RobLa-WMF, fgiunchedi, Dzahn, jeremyb, 
chasemp, Krenair



[Wikidata-bugs] [Maniphest] [Edited] T114443: EventBus MVP

2015-10-01 Thread GWicke
GWicke edited the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T114443

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: dr0ptp4kt, madhuvishy, Nuria, ori, faidon, aaron, GWicke, mobrovac, Halfak, 
Eevans, Ottomata, Matanya, Aklapper, JAllemandou, jkroll, Smalyshev, Hardikj, 
Wikidata-bugs, Jdouglas, RobH, aude, Deskana, Manybubbles, mark, JanZerebecki, 
RobLa-WMF, bd808, fgiunchedi, Dzahn, jeremyb, chasemp, Krenair



[Wikidata-bugs] [Maniphest] [Commented On] T114443: EventBus MVP

2015-10-01 Thread GWicke
GWicke added a comment.

I have now integrated some of those changes into the description.


TASK DETAIL
  https://phabricator.wikimedia.org/T114443

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: dr0ptp4kt, madhuvishy, Nuria, ori, faidon, aaron, GWicke, mobrovac, Halfak, 
Eevans, Ottomata, Matanya, Aklapper, JAllemandou, jkroll, Smalyshev, Hardikj, 
Wikidata-bugs, Jdouglas, RobH, aude, Deskana, Manybubbles, mark, JanZerebecki, 
RobLa-WMF, bd808, fgiunchedi, Dzahn, jeremyb, chasemp, Krenair



[Wikidata-bugs] [Maniphest] [Commented On] T114443: EventBus MVP

2015-10-01 Thread GWicke
GWicke added a comment.

@ottomata, I would perhaps leave out the section "The MVP might also include". 
Much of it isn't so minimally viable, IMHO.

Re use cases, the following two have been driving the original idea and 
discussion:

1. Provide edit related events (ex: edit, creation, deletion, revision 
deletion, rename). Consumers: RESTBase / change propagation service, potential 
purge service, potentially RCStream, analytics.
2. EventLogging: Decode, validate and enqueue JSON events for EL.


TASK DETAIL
  https://phabricator.wikimedia.org/T114443

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: dr0ptp4kt, madhuvishy, Nuria, ori, faidon, aaron, GWicke, mobrovac, Halfak, 
Eevans, Ottomata, Matanya, Aklapper, JAllemandou, jkroll, Smalyshev, Hardikj, 
Wikidata-bugs, Jdouglas, RobH, aude, Deskana, Manybubbles, mark, JanZerebecki, 
RobLa-WMF, bd808, fgiunchedi, Dzahn, jeremyb, chasemp, Krenair



[Wikidata-bugs] [Maniphest] [Commented On] T107595: RFC: Multi-Content Revisions

2015-09-25 Thread GWicke
GWicke added a comment.

@daniel, your revised version seems to focus even more on implementing storage 
systems, change propagation etc, rather than defining a data access interface 
for MediaWiki, which can be backed by services.

Could you clarify how you see this relating to ongoing efforts with similar goals 
and use cases like

a) RESTBase offering a lot of the storage & API functionality (beyond blob 
storage), and
b) the event bus, dependency tracking and change propagation work in 
https://phabricator.wikimedia.org/T102476 and friends?


TASK DETAIL
  https://phabricator.wikimedia.org/T107595

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: daniel, GWicke
Cc: Lydia_Pintscher, cscott, PleaseStand, awight, Ricordisamoa, GWicke, 
MarkTraceur, waldyrious, Legoktm, Aklapper, Jdforrester-WMF, Ltrlg, brion, 
Spage, MZMcBride, daniel, Wikidata-bugs, aude, Jay8g, bd808



[Wikidata-bugs] [Maniphest] [Edited] T84923: Reliable publish / subscribe event bus

2015-08-31 Thread GWicke
GWicke edited the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T84923

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GWicke
Cc: Aklapper, Matanya, Mattflaschen, Ottomata, mmodell, Eevans, chasemp, brion, 
Krenair, Halfak, JanZerebecki, bd808, MZMcBride, mobrovac, GWicke, aaron, 
daniel, Hardikj, yuvipanda, JAllemandou, jkroll, Smalyshev, Wikidata-bugs, 
Jdouglas, RobH, aude, Deskana, Manybubbles, mark, RobLa-WMF, faidon, 
fgiunchedi, Dzahn, jeremyb, Malyacko


