From: Philipp Traeder <[EMAIL PROTECTED]>
> I'm facing a roughly similar problem at the moment, and I was planning
> on using String::Compare or something like it for comparing strings
> char by char. Taking a first glance at the code, it doesn't look too
> hard to modify it in a way that it returns not only the similarity
> between two strings, but also a string with special characters at
> those places that are different - something that I could call like:
>
> my $a = 'abcdXef';
> my $b = 'abcdYef';
> my ($similarity, $regexp) = compare_strings($a, $b);
>
> and would return the similarity as percentile of matching chars (6/7
> in this case) and a regex that looks like
> abcd.ef
There is an infinite number of different regexps that match both
your strings. Which of them do you want? The minimal would be
/^abcd[XY]ef$/, the maximal //.
If you say you want to generate regexps that have the literals in all
positions where all the specified strings match and dots in the other
positions, that's easy, but it only helps if the variable part of the
messages is always the same length.
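The "literals where the strings agree, dots elsewhere" approach can be sketched roughly as follows. This is only an illustration: compare_strings is the hypothetical name from the quoted message, not a function that String::Compare provides, and returning an empty list for strings of different lengths is an assumption made here to show where the approach stops working.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of String::Compare): compares two strings
# position by position. Returns (similarity, pattern), or an empty list
# when the lengths differ, because the positions then no longer line up.
sub compare_strings {
    my ($first, $second) = @_;
    return if length($first) != length($second);
    return (1, '^$') if $first eq '';      # two empty strings: nothing differs

    my ($same, $pattern) = (0, '');
    for my $i (0 .. length($first) - 1) {
        my $char = substr($first, $i, 1);
        if ($char eq substr($second, $i, 1)) {
            $same++;
            $pattern .= quotemeta($char);  # keep the literal, escaped
        }
        else {
            $pattern .= '.';               # any single character but newline
        }
    }
    return ($same / length($first), "^$pattern\$");
}

my ($similarity, $regexp) = compare_strings('abcdXef', 'abcdYef');
printf "%.2f %s\n", $similarity, $regexp;  # 6 of 7 positions agree
```

Note that quotemeta also escapes spaces and punctuation, so the generated pattern for a whole log message will look noisier than a hand-written one.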
What about the example you gave in another email:
    Cannot connect to the primary server
    Cannot connect to the secondary server
What do you expect to get out of this?
What about the other one:
    unable to delete user 1234567
    unable to delete user 1897584
Do you really want
    /unable to delete user 1...5../
? And what if the IDs are not the same length?
Keep in mind that if you give a person a few examples of strings that
you want to match and ask him/her to write a regexp for you, he/she has
much more information than just these strings. He/she knows what each
part of the string means. If he/she sees 2004/11/27 somewhere in the
strings, he/she knows it's a date and can write the regexp so that it
only matches valid dates. If one message contains something like "user
138767" and another "user 134795", he/she knows you most probably need
the regexp to contain something like "user \d+", and so on.
The computer has no chance of knowing all this.
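For comparison, here is roughly what a person might write for the two example messages once he/she knows what the fields mean. Both patterns and their variable names are illustrative assumptions; nothing in the sample strings themselves says that the ID is "one or more digits" or that the server is either primary or secondary.

```perl
use strict;
use warnings;

# Patterns written from knowledge of what the fields mean, not derived
# from the sample strings. Names and patterns are illustrative only.
my $delete_failed  = qr/^unable to delete user (\d+)$/;
my $connect_failed = qr/^Cannot connect to the (primary|secondary) server$/;

for my $message (
    'unable to delete user 1234567',
    'unable to delete user 42',          # a shorter ID still matches
    'Cannot connect to the secondary server',
) {
    if ($message =~ $delete_failed) {
        print "delete failed for user $1\n";
    }
    elsif ($message =~ $connect_failed) {
        print "no connection to the $1 server\n";
    }
}
```

The \d+ handles IDs of any length, which the dots-per-position pattern cannot do.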
Jenda
===== [EMAIL PROTECTED] === http://Jenda.Krynicky.cz =====
When it comes to wine, women and song, wizards are allowed
to get drunk and croon as much as they like.
-- Terry Pratchett in Sourcery