[Wikidata-bugs] [Maniphest] [Commented On] T247058: Deployment strategy and hardware requirement for new Flink based WDQS updater

2020-03-12 Thread Gehel
Gehel added a comment. And we have a first version of a design document. This is still work in progress, feel free to comment! TASK DETAIL https://phabricator.wikimedia.org/T247058

[Wikidata-bugs] [Maniphest] [Commented On] T247058: Deployment strategy and hardware requirement for new Flink based WDQS updater

2020-03-10 Thread Ottomata
Ottomata added a comment. While not a Google Doc, the parent ticket's description describes it pretty well: T244590: EPIC: Rework the WDQS updater as an event driven application

[Wikidata-bugs] [Maniphest] [Commented On] T247058: Deployment strategy and hardware requirement for new Flink based WDQS updater

2020-03-09 Thread Nuria
Nuria added a comment. I think it will be very helpful to have a design document for this service so we are all on the same page about what the Flink install would do (as there are other projects currently evaluating Flink as well). Can we get a Google Doc that goes over the design proposed

[Wikidata-bugs] [Maniphest] [Commented On] T247058: Deployment strategy and hardware requirement for new Flink based WDQS updater

2020-03-06 Thread Pchelolo
Pchelolo added a comment. Yeah, @Gehel's analysis is correct - change-prop is pretty simple and doesn't support any of the advanced features pointed out in the task. We have never really needed any of those, we mostly rely on the fact that systems updated by change-prop are idempotent and don't

[Wikidata-bugs] [Maniphest] [Commented On] T247058: Deployment strategy and hardware requirement for new Flink based WDQS updater

2020-03-06 Thread Ottomata
Ottomata added a comment. A nice feature of Flink is its support for both batch and stream processing. Ideally, we'd be able to build lambda architectures reusing most of the core data logic for streams and historical batch backfilling.
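To illustrate the reuse idea in the comment above, here is a minimal sketch in plain Python (not Flink code). The names `core_logic`, `run_batch` and `run_stream`, and the event fields `entity` and `rev`, are hypothetical and chosen only for illustration; the point is that one side-effect-free function can serve both a historical backfill and a live stream.

```python
from typing import Dict, Iterable, Iterator, List


def core_logic(event: Dict) -> Dict:
    """Shared transformation (hypothetical): keep only the fields we care about."""
    return {"entity": event["entity"], "rev": event["rev"]}


def run_batch(events: List[Dict]) -> List[Dict]:
    """Batch path: process a finite, already collected history in one go."""
    return [core_logic(e) for e in events]


def run_stream(events: Iterable[Dict]) -> Iterator[Dict]:
    """Stream path: process events one at a time as they arrive."""
    for e in events:
        yield core_logic(e)


# Made-up sample events, used by both paths.
history = [
    {"entity": "Q1", "rev": 10, "user": "a"},
    {"entity": "Q2", "rev": 5, "user": "b"},
]
```

Because both paths call the same function, `run_batch(history)` and `list(run_stream(iter(history)))` produce identical results.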

[Wikidata-bugs] [Maniphest] [Commented On] T247058: Deployment strategy and hardware requirement for new Flink based WDQS updater

2020-03-06 Thread Ottomata
Ottomata added a comment. Ping also @Pchelolo for comments on ^

[Wikidata-bugs] [Maniphest] [Commented On] T247058: Deployment strategy and hardware requirement for new Flink based WDQS updater

2020-03-06 Thread dcausse
dcausse added a comment. @Joe the main reason for me is that we need to do stateful computation over multiple event streams:
- we want to union multiple event streams
- we want to reorder events
- we want to keep a state (last seen rev per entity)
ChangeProp is a "mostly"
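The three requirements listed above can be sketched in plain Python as follows. This is only a toy illustration over finite lists, not the updater's implementation and not Flink code; the `Event` fields, the stream contents, and the function names are all assumptions made for the example. A real streaming system would reorder within a bounded window rather than sorting everything.

```python
from dataclasses import dataclass
from itertools import chain
from typing import Dict, Iterable, Iterator, List


@dataclass(frozen=True)
class Event:
    entity: str  # hypothetical entity ID
    rev: int     # revision ID
    ts: int      # hypothetical event timestamp


def union_and_reorder(*streams: Iterable[Event]) -> List[Event]:
    """Union several streams, then reorder the events by event time."""
    return sorted(chain(*streams), key=lambda e: (e.ts, e.rev))


def drop_stale(events: Iterable[Event], last_seen: Dict[str, int]) -> Iterator[Event]:
    """Keep per-entity state (last seen rev) and skip events that are not newer."""
    for e in events:
        if e.rev <= last_seen.get(e.entity, -1):
            continue  # duplicate or older revision: nothing to do
        last_seen[e.entity] = e.rev
        yield e


def process(*streams: Iterable[Event]) -> List[Event]:
    state: Dict[str, int] = {}
    return list(drop_stale(union_and_reorder(*streams), state))


# Made-up input: stream_a is out of order, stream_b repeats rev 10 of Q1.
stream_a = [Event("Q1", 11, 120), Event("Q1", 10, 100)]
stream_b = [Event("Q2", 5, 110), Event("Q1", 10, 105)]
result = process(stream_a, stream_b)
```

With this input, `result` holds Q1 rev 10, Q2 rev 5 and Q1 rev 11, in that order; the repeated Q1 rev 10 is dropped because the per-entity state already records it.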

[Wikidata-bugs] [Maniphest] [Commented On] T247058: Deployment strategy and hardware requirement for new Flink based WDQS updater

2020-03-06 Thread Joe
Joe added a comment. I would like to read an assessment of why our current event processing platform, change-propagation, is not suited for this purpose, and why we need to introduce new software. I suppose this has been done at some point in another task; if so, a quick link would suffice :)