Jheald added a comment.
The numbers in the overlap data table for P1367 (Art UK artist ID) are way off as well: only one tenth of the VIAF and ULAN overlaps are reported, and only one twentieth of the RKD Artist ID overlaps.

Compare https://en.wikipedia.org/wiki/Wikipedia:GLAM/Your_paintings#Stats for Listeria tables with accurate counts. You can look back through the week-by-week updates there if you are interested in what the numbers were at a particular point in the past.

Does this go some way towards explaining why the "Similarity Map" view seems so very wrong? I presume you're using something like a Jaccard similarity to score which identifiers should appear closest together.

It seems rather surprising that identifiers for people are not systematically clustered away from identifiers for places. Instead, both seem more or less equally spread across the whole surface. This might indicate that either (i) the data is very incomplete (as seems to be the case); or (ii) some tweaks to the similarity function are required.

TASK DETAIL
  https://phabricator.wikimedia.org/T204440

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GoranSMilovanovic, Jheald
Cc: Jheald, agray, Envlh, Lea_Lacroix_WMDE, VIGNERON, Pintoch, Daniel_Mietchen, connorshea, Moebeus, Multichill, Hjfocs, RazShuty, GoranSMilovanovic, Aklapper, Lydia_Pintscher, alaa_wmde, Nandana, Lahi, Gq86, QZanden, LawExplorer, _jensen, rosalieper, Wikidata-bugs, aude, Mbch331
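
For reference, a minimal sketch of how an undercounted overlap would depress a Jaccard-style score, which is the mechanism the comment above speculates about. Whether the dashboard actually uses Jaccard similarity is a presumption, not something confirmed here. The function name `jaccard_from_counts` and all the counts are hypothetical, chosen only to mirror the "one tenth" ratio mentioned in the comment.

```python
def jaccard_from_counts(n_a: int, n_b: int, n_overlap: int) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| computed from set sizes.

    n_a, n_b   -- number of items carrying identifier A / identifier B
    n_overlap  -- number of items carrying both identifiers
    """
    union = n_a + n_b - n_overlap
    if union == 0:
        return 0.0  # convention for two empty sets
    return n_overlap / union


# Hypothetical counts (NOT real Wikidata figures):
# 100 items with identifier A, 200 with identifier B, 50 with both.
full = jaccard_from_counts(100, 200, 50)

# Same identifiers, but only one tenth of the overlap is recorded.
undercounted = jaccard_from_counts(100, 200, 5)

print(f"full overlap:         {full:.4f}")
print(f"undercounted overlap: {undercounted:.4f}")
```

With these made-up numbers, recording only one tenth of the overlap cuts the score by slightly more than a factor of ten, because the missing overlap also inflates the union. If many identifier pairs are undercounted by different factors, their relative positions on a similarity map could shift considerably.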