On 10 Apr, 07:45, Jing <jingai...@gmail.com> wrote:
> Hey all,
>
> My problem is as follows:
>
> 1) I have N strings, and all of them have the same length L.
>
> 2) For simplicity, each string is composed of English letters.
>
> How can I select a set of M (M < N) strings that are most dissimilar?
>
> My questions are as follows:
>
> 1) For any two strings, we can calculate the "edit distance" as the
> dissimilarity. If there are more than 2, say M, strings, how do we
> characterize the dissimilarity of the whole set?
>
> 2) If the first question is well addressed, how do we select the best M
> strings so that the dissimilarity metric is maximized?
>
> Many thanks! I appreciate any comments!
There are several ways to approach your problem:
1) Simple counting.
2) Both strings can be compared as oriented indexed multigraphs, e.g.
a > b
q > a
3) Distances between symbols (mixing distances) can be determined;
they should correspond to a negative binomial distribution.
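
As a rough illustration of the counting approach applied to the original
question, here is a minimal Python sketch. Everything in it is an assumption
rather than something stated in this thread: the pairwise distance is the
Hamming distance (a count of differing positions, which is enough when all
strings have the same length L), the dissimilarity of a whole set is taken to
be its smallest pairwise distance, and the selection is a greedy
farthest-point heuristic that is not guaranteed to find the best M strings.
The names hamming, set_dissimilarity and greedy_select are made up for this
example.

```python
from itertools import combinations


def hamming(a, b):
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


def set_dissimilarity(strings, dist=hamming):
    """Assumed set-level score: the smallest pairwise distance in the set."""
    return min(dist(a, b) for a, b in combinations(strings, 2))


def greedy_select(strings, m, dist=hamming):
    """Pick m strings with a greedy farthest-point heuristic (not optimal)."""
    n = len(strings)
    if not 2 <= m <= n:
        raise ValueError("need 2 <= m <= len(strings)")
    # Seed with the most distant pair of strings.
    i, j = max(
        combinations(range(n), 2),
        key=lambda p: dist(strings[p[0]], strings[p[1]]),
    )
    chosen = [i, j]
    # nearest[k] is the distance from string k to its closest chosen string.
    nearest = [min(dist(s, strings[i]), dist(s, strings[j])) for s in strings]
    while len(chosen) < m:
        # Add the unchosen string that is farthest from everything chosen.
        k = max(
            (k for k in range(n) if k not in chosen),
            key=lambda k: nearest[k],
        )
        chosen.append(k)
        nearest = [min(d, dist(s, strings[k])) for d, s in zip(nearest, strings)]
    return [strings[k] for k in chosen]


if __name__ == "__main__":
    words = ["aaaa", "aaab", "bbbb", "abab", "aabb"]
    picked = greedy_select(words, 3)
    print(picked, set_dissimilarity(picked))
```

The dist parameter can be swapped for a true edit distance if insertions and
deletions matter, and set_dissimilarity could just as well be the sum or the
average of the pairwise distances; which score is appropriate depends on what
"most dissimilar" should mean for the application.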
kunzmilan

