On Tuesday 29 January 2008 12:11, Tolkin, Steve wrote: > Suppose I have 2 or more lists that are (conceptually) sublists of the > same underlying list. > I want to reconstruct the underlying list. In other words the order of > the elements agrees in all the lists, but there is no sort condition. > > Example: > List 1: dog, cat, mouse > List 2: dog, shark, mouse, elephant > > There are 2 possible outputs, and I do not care which one I get.
Sounds like you are trying to solve an alignment problem. It's a common thing to do in NLP (especially in speech recognition) and I think also when working with gene sequences. If you are interested in surveying what's available it might be worth to search for "align" or "alignment" on CPAN, and to look for BioPerl stuff. As an quick suggestion, I have used Algorithm::Diff in the past for similar problems. It uses the longest common subsequence algorithm, which provides a minimal difference solution to the alignment problem. The sdiff() function from that module should get you within a hair's breadth of the solution (you'd have only to extract the second or third element of each array ref that's returned). Another typical algorithm for sequence comparison is the minimum edit distance, for which there are a couple of modules on CPAN: Text::Levenshtein, and Text::WagnerFischer. But those modules only give you the distance, not the alignment... Bernardo _______________________________________________ Boston-pm mailing list Boston-pm@mail.pm.org http://mail.pm.org/mailman/listinfo/boston-pm