[EMAIL PROTECTED] wrote:
> Hello,
>
> I cannot figure out a way to find a regular expression that would
> match one and only one of these two strings:
>
> s1 = ' how are you'
> s2 = ' hello world  how are you'
>
> All I could come up with was:
> patt = re.compile('^[ ]*([A-Za-z]+)[ ]+([A-Za-z]+)$')
>
> Which of course does not work. I cannot express the fact: sentence
> have 0 or 1 whitespace, separation of group have two or more
> whitespaces.
>
> Any suggestion ? Thanks a bunch !
> Mathieu

1. A "word" is one or more non-whitespace characters -- subpattern is
   \S+
2. A "sentence" is one or more words separated by a single whitespace
   character, IOW a word followed by zero or more occurrences of
   whitespace+word -- so a sentence will be matched by \S+(\s\S+)*
3. Leading and trailing runs of whitespace should be ignored -- use \s*
4. You will need to detect the case of 0 sentences (all whitespace)
   separately -- I trust you don't need to be told how to do that :-)
5. Don't try to match two or more sentences; match one sentence, and
   anything that fails must be 0 or 2+ sentences.

So:

|>>> s1 = ' how are you'
|>>> s2 = ' hello world  how are you'
|>>> pat = r"^\s*\S+(\s\S+)*\s*$"
|>>> import re
|>>> re.match(pat, s1)
|<_sre.SRE_Match object at 0x00AED9E0>
|>>> re.match(pat, s2)
|>>> re.match(pat, ' ')
|>>> re.match(pat, ' a  b ')
|>>> re.match(pat, ' a b ')
|<_sre.SRE_Match object at 0x00AED8E0>
|>>> re.match(pat, ' ab ')
|<_sre.SRE_Match object at 0x00AED920>
|>>> re.match(pat, ' a ')
|<_sre.SRE_Match object at 0x00AED9E0>
|>>> re.match(pat, 'a')
|<_sre.SRE_Match object at 0x00AED8E0>
|>>>

HTH,
John
-- 
http://mail.python.org/mailman/listinfo/python-list
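
For anyone who wants steps 4 and 5 spelled out, here is a minimal sketch that wraps the pattern from the reply in a small helper. The function name classify and its three return labels are made up for illustration, and the exact spacing in the sample strings is an assumption: all that matters is that groups are separated by a run of two or more whitespace characters.

```python
import re

# Pattern from the reply: optional leading whitespace, one word, then zero
# or more (single whitespace character + word), optional trailing whitespace.
ONE_SENTENCE = re.compile(r"^\s*\S+(\s\S+)*\s*$")


def classify(text):
    """Return 'none', 'one', or 'many' (hypothetical labels for illustration)."""
    if not text.strip():
        # Step 4: empty or whitespace-only input holds no sentence at all.
        return "none"
    if ONE_SENTENCE.match(text):
        return "one"
    # Step 5: non-blank input that fails the pattern must hold 2+ sentences.
    return "many"


# Sample strings; the exact spacing is an assumption.
s1 = " how are you"
s2 = " hello world   how are you"

print(classify(s1))     # one
print(classify(s2))     # many
print(classify("   "))  # none
```

Note that \s also matches tabs and newlines, so a single tab between two words counts as a single separator here. If only literal spaces should count, swapping \s for a space in the pattern would be one way to do it.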