Levenshtein distance (http://en.wikipedia.org/wiki/Levenshtein_distance) perhaps?
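
To make that concrete, here is a rough Python sketch, not a definitive answer. It assumes two things the original question leaves open: that the dissimilarity of a whole set is scored by its smallest pairwise edit distance, and that a greedy farthest-point heuristic is good enough for picking the M strings (it is not guaranteed to find the optimum). The function names are made up for illustration.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance via the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def min_pairwise_distance(strings):
    """Assumed set-level score: the smallest distance over all pairs."""
    return min(levenshtein(a, b) for a, b in combinations(strings, 2))


def greedy_select(strings, m):
    """Greedy heuristic (an assumption, not an exact method)."""
    if m < 2:
        return list(strings[:m])
    # Seed with the two strings that are farthest apart.
    first, second = max(
        combinations(range(len(strings)), 2),
        key=lambda p: levenshtein(strings[p[0]], strings[p[1]]),
    )
    chosen = [first, second]
    # Repeatedly add the string whose nearest chosen neighbour is farthest.
    while len(chosen) < m:
        remaining = [i for i in range(len(strings)) if i not in chosen]
        nxt = max(
            remaining,
            key=lambda i: min(levenshtein(strings[i], strings[c]) for c in chosen),
        )
        chosen.append(nxt)
    return [strings[i] for i in chosen]
```

For example, greedy_select(["aaaa", "aaab", "bbbb", "aabb"], 3) seeds with "aaaa" and "bbbb" and then adds "aabb". Since all strings have the same length L, Hamming distance would be a cheaper drop-in for levenshtein if insertions and deletions do not matter.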

On Apr 10, 10:23 am, kunzmilan <kunzmi...@atlas.cz> wrote:
> On 10 Apr, 07:45, Jing <jingai...@gmail.com> wrote:
>
> > Hey all,
>
> > My problem is as follows:
>
> > 1) I have N strings, and all strings have the same length L.
>
> > 2) For simplicity, each string is composed of English letters.
>
> > How can I select a set of M (M < N) strings that are most dissimilar?
>
> > My questions are as follows:
>
> > 1) For any two strings, we can use the "edit distance" as the
> > dissimilarity. If there are more than two strings, say M, how should
> > the dissimilarity of the whole set be characterized?
>
> > 2) If the first question is well addressed, how do I select the best M
> > strings so that the dissimilarity metric is maximized?
>
> > Many thanks! I appreciate any comments!
>
> There are several approaches to your problem:
> 1) Simple counting.
> 2) Both strings can be compared as oriented indexed multigraphs
> a > b
> q > a
> 3) Distances between symbols (mixing distances) can be determined;
> they should correspond to a negative binomial distribution.
> kunzmilan