Gehel created this task.
Gehel added projects: Wikidata-Query-Service, Wikidata, Operations.
Restricted Application added a subscriber: Aklapper.
TASK DESCRIPTION

Our evaluation and proof of concept around Flink are moving forward. We need to start thinking about a deployment strategy. There are still a lot of unknowns; this is the start of the discussion, not the plan yet.

**Context / problem space**

See T244590 <https://phabricator.wikimedia.org/T244590> for the larger context. The new WDQS update strategy is an event-driven application. This is a stream processing application that needs facilities for reordering events, handling late events, and managing state and check-pointing. Flink <https://flink.apache.org/> provides for all those needs, was already envisioned as part of the Event Platform <https://wikitech.wikimedia.org/wiki/Event_Platform>, and is also being looked at by CPT for similar use cases.

**Requirements**

In our use case, Flink requires:

- compute resources (CPU / RAM)
  - no numbers yet on how many resources we need, but the expectation is that the requirements will be similar to what we need for the current updater, which shares resources with Blazegraph
- some local storage for state (which can be considered transient)
  - our current estimate is that local state will be < 1 GB, but this needs to be refined
- shared storage for check-pointing
  - the current strategy is to use HDFS, but other backends can be supported (NFS, Cassandra, ...)

**Dependencies**

- initial state is expected to be loaded from HDFS on our Hadoop cluster
- Kafka (-main or -jumbo) to consume various event streams
- Wikidata to enrich events with actual content
- Kafka to produce the TTL stream
- some system (TBD) for check-point storage

**Strategies**

Since we don't have experience with Flink yet, the longer-term use cases are still undefined, and addressing the updater issues for WDQS is time sensitive, it might make sense to start with a short-term intermediate solution and evolve it into a longer-term solution.

- k8s: Flink itself has no persistent state, so it might be a candidate for k8s. Kubernetes native support from Flink still seems to be experimental, but a standalone deployment seems viable
- dedicated Flink cluster on new hardware (just for the WDQS use case)
- shared Flink cluster on new hardware (shared cluster for WDQS and CPT use cases + additional future use cases)
- dedicated Flink cluster co-located on existing WDQS hardware

TASK DETAIL
https://phabricator.wikimedia.org/T247058

EMAIL PREFERENCES
https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Gehel
Cc: Aklapper, dcausse, Zbyszko, Gehel, darthmon_wmde, Legado_Shulgin, Nandana, Davinaclare77, Qtn1293, Techguru.pc, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, EBjune, merbst, LawExplorer, Zppix, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, Wong128hk, jkroll, Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, faidon, Mbch331, Rxy, Jay8g, fgiunchedi
_______________________________________________
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs