On Apr 17, 4:49 pm, Jesse Aldridge <jessealdri...@gmail.com> wrote:
> import re
>
> s1 = "I am an american"
>
> s2 = "I am american an "
>
> for s in [s1, s2]:
>     print re.findall(" (am|an) ", s)
>
> # Results:
> # ['am']
> # ['am', 'an']
>
> -------
>
> I want the results to be the same for each string.  What am I doing
> wrong?

Does it help if you expand your RE to its full expression, with '_'s
where the blanks go:

    "_am_" or "_an_"

Now look for these in "I_am_an_american".  After the first "_am_" is
processed, findall picks up at the leading 'a' of 'an', and there is no
leading blank, so no match.  If you search through "I_am_american_an_",
both "am" and "an" have surrounding spaces, so both match.

Instead of using explicit spaces, try using '\b', which matches at a
word boundary:

>>> import re
>>> re.findall(r"\b(am|an)\b", "I am an american")
['am', 'an']
>>> re.findall(r"\b(am|an)\b", "I am american an")
['am', 'an']

-- Paul


Your find pattern includes (and consumes) a leading AND trailing space
around each word.  In the first string "I am an american", there is a
leading and trailing space around "am", but the trailing space for "am"
is the leading space for "an", so " an "
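
The consumed-space behaviour described in the replies can be checked side by side. The following is only a sketch, written for Python 3 (so print is a function, unlike in the original post). The names `spaces`, `boundary` and `lookahead` are made up for illustration, and the lookahead pattern is an extra alternative that nobody in the thread proposed.

```python
import re

s1 = "I am an american"
s2 = "I am american an "

# Original pattern: both literal spaces are part of the match, so each
# match consumes them.  In s1 the space after "am" is used up and is no
# longer available as the leading space of " an ".
spaces = re.compile(r" (am|an) ")
print(spaces.findall(s1))     # ['am']
print(spaces.findall(s2))     # ['am', 'an']

# \b is a zero-width assertion: it matches at a word boundary without
# consuming any character, so adjacent words can both match.
boundary = re.compile(r"\b(am|an)\b")
print(boundary.findall(s1))   # ['am', 'an']
print(boundary.findall(s2))   # ['am', 'an']

# Alternative (an assumption, not from the thread): still require a
# literal space on each side, but only peek at the trailing one with a
# lookahead so it stays available for the next match.
lookahead = re.compile(r" (am|an)(?= )")
print(lookahead.findall(s1))  # ['am', 'an']
print(lookahead.findall(s2))  # ['am', 'an']
```

The two fixes are not equivalent. '\b' also matches at the very start or end of the string and next to punctuation, while the lookahead version still insists on an actual space on both sides of the word.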