> Instead of writing a new 'for, if' loop to filter the repetitive tags
> from the list, is there something that I can add in the re itself to
> match the pattern only once?
Hi Intercodes,

As far as I know, no: regular expressions don't have the capacity to
remember which tags they've matched in previous matching runs. You may
find "sets" useful for filtering out duplicates:

    http://www.python.org/doc/lib/module-sets.html

For example:

######
>>> from sets import Set as set
>>> set([3, 1, 4, 1, 5, 9, 2, 6])
Set([1, 2, 3, 4, 5, 6, 9])
######

So you should not need to write much extra logic to do the duplicate
filtering: sets do this for you already.

As an aside, I want to second Alan's recommendation to avoid doing so
much string concatenation: it's actually very expensive to do the
following:

#########################
ans = ''
while 1:
    data = file1.readline()
    if data == "":
        break
    ans = ans + data
#########################

From a technical standpoint, this has "quadratic" complexity in terms
of the work the computer is doing. It's related to the mathematical
fact that

    1 + 2 + 3 + 4 + ... + n = n(n+1)/2

    http://en.wikipedia.org/wiki/Triangle_number

The string concatenation that the code above does is analogous because
it rebuilds larger and larger strings in a similar fashion. Alan's
approach, using file1.read(), sucks up the string in one shot, so it's
much less computationally expensive. If you do end up having to
concatenate a lot of strings together, collect them in a list and use
the string join() method instead. We can talk about this in more
detail if you'd like.

_______________________________________________
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
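As a follow-up sketch of the set idea above: the tag pattern and the
input string here are made up for illustration (on modern Pythons the
built-in set() replaces the old sets module), but the shape is the same
as the original question — pull tags out with re, then let a set
collapse the duplicates.

```python
import re

# Hypothetical input and pattern, for illustration only.
html = "<b>one</b> <i>two</i> <b>three</b> <i>four</i>"

# findall returns every opening tag name, duplicates included.
tags = re.findall(r"<(\w+)>", html)

# A set discards the repeats automatically -- no for/if loop needed.
unique_tags = set(tags)

print(sorted(unique_tags))
```

Note that a set does not remember the original order; if order matters,
a little extra bookkeeping is still required.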
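And a small, file-free sketch of the concatenation point (the list of
lines is invented so the snippet runs anywhere): building a string with
repeated + copies the ever-growing result each time, while join builds
it once at the end.

```python
# Hypothetical data standing in for file1.readline() results.
lines = ["first line\n", "second line\n", "third line\n"]

# Quadratic approach: each concatenation copies everything so far.
ans = ""
for line in lines:
    ans = ans + line

# Linear approach: accumulate pieces in a list, join them once.
joined = "".join(lines)

print(ans == joined)
```

Both produce the same string; the difference only shows up as the
number of pieces grows large.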