Jheald added a comment.

  The numbers in the overlap data table for P1367 (Art UK Artist ID) are way 
off as well -- only one tenth of the VIAF and ULAN overlaps are correctly 
reported, and only one twentieth of the RKD Artist ID overlaps.
  
  Compare https://en.wikipedia.org/wiki/Wikipedia:GLAM/Your_paintings#Stats for 
Listeria tables with accurate counts. If you are interested in what the numbers 
were at a particular point in the past, you can look back through the 
week-by-week updates there.
  
  Does this go some way towards explaining why the "Similarity Map" view seems 
so very wrong?  I presume you're using something like a Jaccard similarity to 
score which identifiers should appear most closely together.  It seems rather 
surprising that identifiers for people are not systematically clustered away 
from identifiers for places -- instead both seem more or less equally spread 
across the whole surface.  This might indicate that either (i) the overlap data 
is very incomplete (as seems to be the case); or (ii) some tweaks to the 
similarity function are required.
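
  To make the concern concrete, here is a minimal sketch of a Jaccard score 
computed from overlap counts. It is only an illustration of the measure I am 
presuming, not the dashboard's actual code: the function names and all the 
counts below are made up for the example.

```python
def jaccard(items_a, items_b):
    """Jaccard similarity of two collections: |A & B| / |A | B|."""
    a, b = set(items_a), set(items_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here: two empty sets score 0
    return len(a & b) / len(union)


def jaccard_from_counts(n_a, n_b, n_overlap):
    """Same score from counts alone: overlap / (n_a + n_b - overlap)."""
    union = n_a + n_b - n_overlap
    return n_overlap / union if union else 0.0


# Hypothetical item sets (the Q-ids are invented for illustration).
art_uk = {"Q1", "Q2", "Q3", "Q4"}        # items with P1367
viaf = {"Q1", "Q2", "Q3", "Q5", "Q6"}    # items with a VIAF ID
geonames = {"Q7", "Q8"}                  # items with a place identifier

people_score = jaccard(art_uk, viaf)      # 3 shared / 6 in union
places_score = jaccard(art_uk, geonames)  # nothing shared

# Effect of under-reporting: invented counts, with the overlap reported
# at one tenth of its true value.
true_score = jaccard_from_counts(1000, 2000, 500)
reported_score = jaccard_from_counts(1000, 2000, 50)
```

  With a score of this kind, an overlap reported at a tenth of its true size 
pulls the similarity down by roughly the same factor, which by itself could be 
enough to stop person identifiers from clustering together.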

TASK DETAIL
  https://phabricator.wikimedia.org/T204440

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GoranSMilovanovic, Jheald
Cc: Jheald, agray, Envlh, Lea_Lacroix_WMDE, VIGNERON, Pintoch, Daniel_Mietchen, 
connorshea, Moebeus, Multichill, Hjfocs, RazShuty, GoranSMilovanovic, Aklapper, 
Lydia_Pintscher, alaa_wmde, Nandana, Lahi, Gq86, QZanden, LawExplorer, _jensen, 
rosalieper, Wikidata-bugs, aude, Mbch331
_______________________________________________
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
