GoranSMilovanovic added a comment.
@Jheald

> Does this go any way to explain why the "Similarity Map" view seems so very wrong?

Not necessarily. The map uses coordinates from the 2D t-SNE dimensionality reduction, which attempts to preserve local similarity structure, and there are many, many constraints in this dataset that the algorithm needs to fit. However, let's take a look at the map once the data are re-engineered.

> I presume you're using something like a Jaccard similarity to score which identifiers ...

Of course I am using Jaccard: the dataset consists of binary vector representations of the identifiers.

TASK DETAIL
https://phabricator.wikimedia.org/T204440

EMAIL PREFERENCES
https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GoranSMilovanovic
Cc: Jheald, agray, Envlh, Lea_Lacroix_WMDE, VIGNERON, Pintoch, Daniel_Mietchen, connorshea, Moebeus, Multichill, Hjfocs, RazShuty, GoranSMilovanovic, Aklapper, Lydia_Pintscher, alaa_wmde, Nandana, Lahi, Gq86, QZanden, LawExplorer, _jensen, rosalieper, Wikidata-bugs, aude, Mbch331
_______________________________________________
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs