Runsun Pan helped me out with the following:

You can also try the following very primitive solution that I sometimes use to extract simple information in a quick and dirty way:

def extract(text,s1,s2):
    ''' Extract strings wrapped between s1 and s2.

    >>> t="""this is a <span>test</span> for <span>extract()</span> that <span>does multiple extract</span> """
    >>> extract(t,'<span>','</span>')
    ['test', 'extract()', 'does multiple extract']

    '''
    beg = [1,0][text.startswith(s1)]
    tmp = text.split(s1)[beg:]
    end = [len(tmp), len(tmp)+1][ text.endswith(s2)]
    return [ x.split(s2)[0] for x in tmp if len(x.split(s2))>1][:end]

This will help out a *lot*! Thank you. This is a better bet than the parser in this particular implementation because the data I need is not encapsulated in tags! Field names are within <b></b> tags, followed by plain text data and ended with a <br> tag. This was my main problem with a parser, but your extract function solves it beautifully!

I'm posting back to the NG just in case it is of value to anyone else.

Could you/anyone explain the 4 lines of code to me though? A crash course in Python shorthand? What does it mean when you use two sets of brackets, as in:

beg = [1,0][text.startswith(s1)]

Thanks for the help!

-d-
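
For anyone following along, here is a rough sketch of how extract() might be applied to the layout described above (field names inside <b></b>, plain text data running up to a <br>). The sample string, the variable names and the dict at the end are made up for illustration and are not my real data. The function body is copied from the post with only the spacing tidied.

```python
def extract(text, s1, s2):
    """Return the substrings of text wrapped between s1 and s2 (as posted)."""
    # A bool is used as a list index here: False picks item 0, True picks item 1.
    beg = [1, 0][text.startswith(s1)]
    tmp = text.split(s1)[beg:]
    end = [len(tmp), len(tmp) + 1][text.endswith(s2)]
    return [x.split(s2)[0] for x in tmp if len(x.split(s2)) > 1][:end]


# Hypothetical sample in the "<b>field</b> data<br>" layout.
sample = "<b>Name:</b> Widget<br><b>Price:</b> 9.99<br>"

# Field names sit between <b> and </b>.
names = extract(sample, "<b>", "</b>")

# The data is not inside a tag pair, so use the closing </b> and the
# following <br> as the two delimiters, then trim the leading space.
values = [v.strip() for v in extract(sample, "</b>", "<br>")]

# Pair them up (an assumption about how the result would be used).
record = dict(zip(names, values))
print(record)
```

The second call is what makes this handier than a tag-based parser for this case: s1 and s2 do not have to be a matching pair of tags, any two delimiter strings will do.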