EBernhardson added a comment.
Talked with @dcausse about this on IRC today; based on the data we are now seeing [1], the estimates in here are too low. We are thinking that a 16G Ganeti instance will not be sufficient for the growth we are seeing.

I'm fairly suspicious of our "minimal" sizing of 32G RAM, considering we have a 4-year refresh cycle. I wouldn't be surprised if we are talking 1B+ triples (half of 60M media files tagged with ~20 triples each) by 2024. My intuition is that 64G is the minimum we should be considering, but I think the price of memory should be taken into account. If we could get 256G and not worry about it for the cost of a few $k, we would earn that back and more: we'd avoid spending time and focus, several years from now, on hacks to fit into something smaller than necessary.

The beta service is being stood up in WMCS currently; we have access to bare-metal machines installed in WMCS specifically for wdqs (reported as 132G RAM, 3.2T disk, 32 cores, though 132G RAM seems suspicious). I'm not sure how long the query service is intended to stay in WMCS, but this instance should have enough runway to get us to next fiscal year, when we can request machines.

[1] https://analytics.wikimedia.org/published/notebooks/computer-aided-tagging/CAT-usage-report.html

TASK DETAIL
https://phabricator.wikimedia.org/T254232
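For reference, the sizing arithmetic above can be sanity-checked with a quick back-of-envelope sketch. The inputs (60M media files, half tagged, ~20 triples each) come from the comment; note they yield 0.6B triples on their own, so the "1B+" figure presumably includes headroom for other growth. The numbers here are estimates, not measurements:

```python
# Back-of-envelope triple-count estimate for the CAT/wdqs graph.
# Inputs are the rough figures from the comment above; tagged_fraction
# is an assumption ("half of 60M media files"), not observed data.
media_files = 60_000_000      # ~60M media files on Commons
tagged_fraction = 0.5         # assume half get tagged by 2024
triples_per_file = 20         # ~20 triples per tagged file

triples = media_files * tagged_fraction * triples_per_file
print(f"estimated triples: {triples / 1e9:.1f}B")  # → estimated triples: 0.6B
```

The gap between 0.6B from these inputs and the 1B+ working estimate is the safety margin a 4-year refresh cycle would want anyway.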