On Sat, 5 Nov 2011, Dinara Vakhitova wrote:
> I need to find the words in a corpus whose letters are in alphabetical
> order ("almost", "my", etc.).
> I started with matching two consecutive letters in a word that are in
> alphabetical order, and tried to use this expression: ([a-z])[\1-z], but
> it doesn't work: it matches any sequence of two letters. I can't figure out
> why... Evidently I can't refer to a group like this, can I? But how, in
> that case, can I achieve what I need?
First, I agree with the others that this is a lousy task for regular
expressions. It's not the tool I would use. But I do think it's doable,
provided you're not required to do the check with a single regular
expression.
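As for why your expression matches any two letters: as far as I know,
inside a character class Python's re module does not treat \1 as a
backreference. It reads it as an octal escape for the character with
code 1, so [\1-z] is the range from chr(1) to "z", which covers every
lower-case letter. A quick check (the test strings are just my own
examples):

```python
import re

pattern = re.compile(r"([a-z])[\1-z]")

# "ba" is not in alphabetical order, yet it matches, because [\1-z]
# is the range chr(1) through "z", not "the captured letter through z".
out_of_order = pattern.fullmatch("ba") is not None

# A character far below "a", such as "!", also falls inside that range.
punctuation = pattern.fullmatch("a!") is not None

print(out_of_order, punctuation)
```

So the group reference never takes part in the match, which is why a
single static character class can't express "at least as large as the
previous letter".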
For simplicity's sake, I'll construe the problem as determining whether a
given string consists entirely of lower-case alphabetic characters,
arranged in alphabetical order.
What I would do is set a variable to the lowest permissible character,
i.e., "a", and another to the highest permissible character, i.e., "z"
(actually, you could just use a constant for the highest, but I like the
symmetry).
Then construct a regex to see if a character is within the
lowest-permissible to highest-permissible range.
Now, iterate through the string, processing one character at a time. On
each iteration:
- Test whether your character matches the regexp; if not, your answer is
"false". On pass one, this means it's not lower-case alphabetic; on
subsequent passes, it means either that, or that it's not in sorted
order.
- If it passes, update your lowest permissible character with the
character you just processed.
- Regenerate your regexp using the updated lowest permissible character.
- Iterate.
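The steps above can be sketched roughly as follows. This is only one way
to write it; the function name letters_in_order and its parameters are
names I made up for illustration, and it assumes a Python recent enough
to have re.fullmatch.

```python
import re

def letters_in_order(word, lowest="a", highest="z"):
    """Return True if every character of word is a lower-case letter
    and the letters never decrease from left to right."""
    for ch in word:
        # Rebuild the character class from the current lowest
        # permissible character, e.g. "[a-z]", then "[l-z]", ...
        pattern = re.compile("[%s-%s]" % (lowest, highest))
        if not pattern.fullmatch(ch):
            # Not a lower-case letter, or smaller than its predecessor.
            return False
        # The character just processed becomes the new lower bound.
        lowest = ch
    return True

print(letters_in_order("almost"))  # True
print(letters_in_order("python"))  # False
```

Note two consequences of the approach as described: a repeated letter
(as in "all") passes, because the lower bound is inclusive, and an empty
string passes too, since no character ever fails the test. Add explicit
checks if you want either case rejected.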
I assumed lower-case alphabetic input for simplicity, but you could adapt
this basic approach to mixed case (e.g., by first making an all-lower-case
copy of the string) or to other complications.
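For instance, a mixed-case variant might look like the sketch below. The
function name in_order_ignoring_case and the sample sentence are my own
inventions, and splitting on whitespace is just a stand-in for however
you actually tokenize your corpus.

```python
import re

def in_order_ignoring_case(word):
    """Check alphabetical order on an all-lower-case copy of word."""
    lowest = "a"
    for ch in word.lower():
        # Same test as before, with "z" as a constant upper bound.
        if not re.fullmatch("[%s-z]" % lowest, ch):
            return False
        lowest = ch
    return True

# Hypothetical sample text standing in for a real corpus.
corpus = "Almost every Boy enjoys my chilly Python talk"
matches = [w for w in corpus.split() if in_order_ignoring_case(w)]
print(matches)  # ['Almost', 'Boy', 'my', 'chilly']
```

The original spelling of each word is kept in the result; only the copy
used for the comparison is lower-cased.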
I don't think there's a problem with asking for help with homework on this
list; but you should identify it as homework, so the responders know not
to just give you a solution to your homework, but instead provide you with
hints to help you solve it.