On Sat, 19 Jan 2002 23:26:16 -0500, Michel Lambert wrote:
>Here's something which generates a regex that matches most shrinkable
>strings.
>
>print join"|",map{"$_.*[^$_-z]"}('a'..'z')
>which prints out:
>a.*[^a-z]|b.*[^b-z]|c.*[^c-z]|d.*[^d-z]|e.*[^e-z]|f.*[^f-z]|g.*[^g-z]|h.*[^h
>-z]|i.*[^i-z]|j.*[^j-z]|k.*[^k-z]|l.*[^l-z]|m.*[^m-z]|n.*[^n-z]|o.*[^o-z]|p.
>*[^p-z]|q.*[^q-z]|r.*[^r-z]|s.*[^s-z]|t.*[^t-z]|u.*[^u-z]|v.*[^v-z]|w.*[^w-z
>]|x.*[^x-z]|y.*[^y-z]|z.*[^z-z]
That is most definitely not right.
Urm... are we to suppose that these strings-under-test may only contain
lower case letters? Even then: what you are trying to test is see if all
characters in this string are in sorted order. It's true: if a string is
in sorted character order, then it is *not* shrinkable, but that is not
required; and your test is faulty (you should have used negative
lookahead). For example,
foogmm
Even though "m" lt "o", this string is not shrinkable. Huh... if the
string starts with the smalles character, then it is not shrinkable.
Alright?
Here's what I think that basically covers it: the string is shrinkable:
* If the string contains a character that is smaller than the first one
fooduu
->
duufoo
* if there is a repeated prefix, for example the string starts with
"foo" and that "foo" appears somewhere in the string again, and the
first character following the repeated prefix is smaller than the first
character following that prefix, for example (prefix = "foo")
fooyzzzfoox
->
fooxfooyzzz
Actually, with length prefix == 0, these two cases collide;
* if there is a (optionally repeated, 1+) prefix that is also a suffix,
and the first dissimilar string following that repeated prefix is larger
than the prefix itself, for example
hohohohqzzzho
->
hohohohohqzzz
Let me know if forgot some case.
Turning this into a regex isn't pretty. Testing for the repeated prefix
is alright, but doing the "lt" test without assertions is nothing but a
nightmare.
--
Bart.