You just need a one-character addition to your regex:

    regex = re.compile(r'<organisatie.*?</organisatie>', re.S)
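
A small sketch comparing the original pattern with the corrected one on the string from the original post. It assumes Python 3 syntax (the original post uses the Python 2 print statement), and the names greedy, lazy, greedy_matches and lazy_matches are only illustrative:

```python
import re

# Sample text from the original post: two <organisatie> records.
s = ("\n<organisatie>\n<Profiel_Id>28996</Profiel_Id>\n</organisatie>"
     "\n<organisatie>\n<Profiel_Id>28997</Profiel_Id>\n</organisatie>")

# Original pattern and the corrected one; re.S lets "." match newlines too.
greedy = re.compile(r'<organisatie.*</organisatie>', re.S)
lazy = re.compile(r'<organisatie.*?</organisatie>', re.S)

greedy_matches = greedy.findall(s)
lazy_matches = lazy.findall(s)

print(len(greedy_matches))  # 1: a single match spanning both records
print(len(lazy_matches))    # 2: one match per record
for m in lazy_matches:
    print(repr(m))
```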

Note that there is now a question mark (?) after the .* in the pattern. By default, regular expression quantifiers are "greedy" and will grab as much text as possible when making a match. Because re.S lets the dot match newlines as well, your original expression was grabbing everything between the first opening tag and the last closing tag. The question mark says, don't be greedy: each match stops at the first closing tag it can, and you get the behaviour you need.

This is covered in the documentation for the re module:
http://docs.python.org/lib/module-re.html

Jason

On Sep 17, 9:00 am, duikboot <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I am trying to extract a list of strings from a text. I am looking it
> for hours now, googling didn't help either.
> Could you please help me?
>
> >>>s = """
> >>>\n<organisatie>\n<Profiel_Id>28996</Profiel_Id>\n</organisatie>\n<organisatie>\n<Profiel_Id>28997</Profiel_Id>\n</organisatie>"""
> >>> regex = re.compile(r'<organisatie.*</organisatie>', re.S)
> >>> L = regex.findall(s)
> >>> print L
>
> ['organisatie>\n<Profiel_Id>28996</Profiel_Id>\n</organisatie>
> \n<organisatie>\n<Profiel_Id>28997</Profiel_Id>\n</organisatie']
>
> I expected:
> [('organisatie>\n<Profiel_Id>28996</Profiel_Id>\n</organisatie>
> \n<organisatie>), (<organisatie>\n<Profiel_Id>28997</Profiel_Id>\n</
> organisatie')]
>
> I must be missing something very obvious.
>
> Greetings Arjen

--
http://mail.python.org/mailman/listinfo/python-list