The comparison tool on https://tools.wmflabs.org/copyvios/ can look for repeated phrases.
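
For illustration only, the repeated-phrase idea can be sketched in a few lines of Python. This is not how the copyvios tool is implemented; the function names, the choice of word trigrams (n=3), and the Jaccard score are all assumptions made for the sketch, and the inputs are assumed to be plain article text that has already been fetched.

```python
import re


def word_ngrams(text, n=3):
    """Return the set of lowercase word n-grams found in text."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_phrases(text_a, text_b, n=3):
    """Return the word n-grams that occur in both texts."""
    return word_ngrams(text_a, n) & word_ngrams(text_b, n)


def jaccard_similarity(text_a, text_b, n=3):
    """Score overlap as |shared n-grams| / |all n-grams|, in [0, 1]."""
    a, b = word_ngrams(text_a, n), word_ngrams(text_b, n)
    if not a and not b:
        return 0.0  # neither text is long enough to form an n-gram
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Hypothetical stand-ins for two article texts.
    article_a = "the quick brown fox jumps over the lazy dog"
    article_b = "a quick brown fox sleeps near the lazy dog"
    print(shared_phrases(article_a, article_b))
    print(jaccard_similarity(article_a, article_b))
```

A larger n makes matches stricter (longer verbatim phrases), while a smaller n behaves more like a bag-of-words overlap.
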
You might be able to tweak that a bit.

On Sat, 4 May 2019 at 12:48, Haifeng Zhang <haife...@andrew.cmu.edu> wrote:

> Dear folks,
>
> Is there a way to compute content similarity between two Wikipedia
> articles?
>
> For example, I can think of representing each article as a vector of
> likelihoods over possible topics.
>
> But, I wonder there are other work people have already explored in the
> past.
>
> Thanks,
>
> Haifeng
> _______________________________________________
> Wiki-research-l mailing list
> Wiki-research-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l