[Wikidata-bugs] [Maniphest] [Commented On] T114904: Migrate wb_items_per_site to using prefixed entity IDs instead of numeric IDs
Ghuron added a comment. Look at the query that is used to get missing articles for "List of articles every Wikipedia should have" https://quarry.wmflabs.org/query/26700 There are 2 joins: - For iwlinks/wb_items_per_site join I have to do either CONCAT ('Q', ips_item_id) or TRIM('Q' FROM iwl_title) - For wb_items_per_site/page join I have to do REPLACE(ips_site_page, ' ', '_') or vice versa and potentially take care of non-default namespace Isn't it ironic that wikidatawiki_p.wb_items_per_site has 2 columns that is potentially joinable with wikipedia tables and **both of them** are require format modification? TASK DETAIL https://phabricator.wikimedia.org/T114904 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ghuron Cc: Ghuron, Addshore, WMDE-leszek, Ladsgroup, Multichill, Sjoerddebruin, Lydia_Pintscher, Pasleim, Ricordisamoa, hoo, daniel, Aklapper, alaa_wmde, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, rosalieper, Wikidata-bugs, aude, Mbch331 ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T114904: Migrate wb_items_per_site to using prefixed entity IDs instead of numeric IDs
Ladsgroup added a comment. The query @Multichill is running is complex on its own and even if we resolve this task, it'll be still a very slow query, what I would recommend is to make a temporary table in labs for exactly this tool (which happens all the time, for example Magnus does it too) and query directly from there.TASK DETAILhttps://phabricator.wikimedia.org/T114904EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: LadsgroupCc: WMDE-leszek, Ladsgroup, Multichill, Sjoerddebruin, Lydia_Pintscher, Pasleim, Ricordisamoa, hoo, daniel, Aklapper, Lahi, GoranSMilovanovic, QZanden, Marostegui, Minhnv-2809, Luke081515, Wikidata-bugs, aude, Mbch331, Jay8g, Krenair___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T114904: Migrate wb_items_per_site to using prefixed entity IDs instead of numeric IDs
daniel added a comment. We should also consider a solution that will only put the full ids on labs. We don't actually need it in production. Could be a separate table, kept up to day by a trigger.TASK DETAILhttps://phabricator.wikimedia.org/T114904EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: danielCc: WMDE-leszek, Ladsgroup, Multichill, Sjoerddebruin, Lydia_Pintscher, Pasleim, Ricordisamoa, hoo, daniel, Aklapper, Lahi, GoranSMilovanovic, QZanden, Marostegui, Minhnv-2809, Luke081515, Wikidata-bugs, aude, Mbch331, Jay8g, Krenair___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T114904: Migrate wb_items_per_site to using prefixed entity IDs instead of numeric IDs
hoo added a comment. Giving the size of the table, changing this shouldn't be overly horrible. It's a fair bit of migration work… but I assume doing this for maintenance queries and consistency is worth it.TASK DETAILhttps://phabricator.wikimedia.org/T114904EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: hooCc: WMDE-leszek, Ladsgroup, Multichill, Sjoerddebruin, Lydia_Pintscher, Pasleim, Ricordisamoa, hoo, daniel, Aklapper, Lahi, GoranSMilovanovic, QZanden, Marostegui, Minhnv-2809, Luke081515, Wikidata-bugs, aude, Mbch331, Jay8g, Krenair___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T114904: Migrate wb_items_per_site to using prefixed entity IDs instead of numeric IDs
daniel added a comment. @Multichill wb_items_per_site is *always* items. So to get the full item ID, just use concat('Q', ips_item_id). That's not very nice, but viable I think. We could even make a virtual column ips_full_entity_id on the labs view... I'm dropping the prio back to normal, since the issue seems easy enough to resolve. Please let me know if there are big performance issues with using concat('Q', ips_item_id) in joins, though.TASK DETAILhttps://phabricator.wikimedia.org/T114904EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: danielCc: WMDE-leszek, Ladsgroup, Multichill, Sjoerddebruin, Lydia_Pintscher, Pasleim, Ricordisamoa, hoo, daniel, Aklapper, Lahi, GoranSMilovanovic, QZanden, Marostegui, Minhnv-2809, Luke081515, Wikidata-bugs, aude, Mbch331, Jay8g, Krenair___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs