Levenshtein distance (http://en.wikipedia.org/wiki/Levenshtein_distance), perhaps?
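
In case it helps, here is a rough sketch of how that could be used for both questions. It rests on assumptions that are mine, not from the original post: the dissimilarity of a whole set is taken to be the smallest pairwise edit distance within it, and the M strings are picked with a greedy "farthest point" heuristic, which is not guaranteed to find the optimum. The names levenshtein and pick_dissimilar are just illustrative.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) using two DP rows."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or free match
            ))
        prev = curr
    return prev[-1]


def pick_dissimilar(strings, m):
    """Greedy heuristic (an assumption, not an exact method):
    repeatedly add the string whose nearest already-chosen
    string is as far away as possible."""
    n = len(strings)
    if not 0 < m <= n:
        raise ValueError("need 0 < m <= number of strings")

    # Full pairwise distance table: O(N^2) edit-distance computations.
    dist = [[levenshtein(s, t) for t in strings] for s in strings]

    # Seed with a string that takes part in the largest pairwise distance.
    first = max(range(n), key=lambda i: max(dist[i]))
    chosen = [first]

    # nearest[i] = distance from string i to its closest chosen string.
    nearest = dist[first][:]
    while len(chosen) < m:
        candidates = (i for i in range(n) if i not in chosen)
        nxt = max(candidates, key=lambda i: nearest[i])
        chosen.append(nxt)
        nearest = [min(nearest[i], dist[nxt][i]) for i in range(n)]

    return [strings[i] for i in chosen]
```

Something like pick_dissimilar(list_of_strings, M) would then return M of the N strings. Since all strings have the same length L, the Hamming distance (number of differing positions) would be a cheaper drop-in for levenshtein if insertions and deletions do not matter for your application.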

On Apr 10, 10:23 am, kunzmilan <kunzmi...@atlas.cz> wrote:
> On 10 Apr, 07:45, Jing <jingai...@gmail.com> wrote:
>
> > Hey all,
> >
> > My problem is as follows:
> >
> > 1) I have N strings, and all strings have the same length L.
> > 2) To simplify, each string is composed of English letters.
> >
> > How do I select a set of M (M < N) strings that are most dissimilar?
> >
> > My questions are as follows:
> >
> > 1) For any two strings, we can calculate the "edit distance" as the
> > dissimilarity. If there are more than 2, say M, strings, how do we
> > characterize the dissimilarity of the whole set?
> >
> > 2) If the first question is well addressed, how do we select the best M
> > strings so that the dissimilarity metric is maximized?
> >
> > Many thanks! I appreciate any of your comments!
>
> There are several solutions to your problem:
> 1) Simple counting.
> 2) Both strings can be compared as oriented indexed multigraphs
> a
> b
> q
> a
> 3) Distances between symbols (mixing distances) can be determined;
> they should correspond to a negative binomial distribution.
> kunzmilan