* Johny (10 Feb 2007 05:29:23 -0800)
> I need to find all the same words in a text .
> What would be the best idea to do that?
> I used string.find but it does not work properly for the words.
> Let suppose I want to find a number 324 in the text
>
> '45 324 45324'
>
> there is only one occurrence of 324 word but string.find() finds 2
> occurrences ( in 45324 too)
>
> Must I use regex?
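For reference, the behaviour described in the question can be reproduced in a few lines. This is only a sketch: the variable names are illustrative, and str.count is used here in place of repeated string.find() calls to count the substring hits. It also shows two ways to match whole words only, one with a word-boundary regex and one without any regex.

```python
import re

text = '45 324 45324'

# Substring search: also matches the tail of '45324'.
substring_hits = text.count('324')

# Whole-word search: \b only matches at a word boundary,
# so the '324' inside '45324' is skipped.
word_hits = len(re.findall(r'\b324\b', text))

# No regex: split on whitespace and compare whole tokens.
token_hits = text.split().count('324')

print(substring_hits, word_hits, token_hits)
```

So a regex is one option, not a requirement: splitting the text into words first avoids the substring problem as well.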
There are two approaches: one is the "solve once and forget" approach, where you code around this particular problem. Mario showed you one solution for this. The other approach is to realise that your problem is a specific case of two general problems: partitioning a sequence by a separator and partitioning a sequence into equivalence classes. The bonus of this approach is that a /lot/ of problems can be solved with either one of these utils or a combination of them.

    >>> a = '45 324 45324'
    >>> quotient_set(part(a, [' ', ' '], 'sep'), ident)
    {'324': ['324'], '45': ['45'], '45324': ['45324']}

The latter approach is much more flexible. Just imagine your problem changes to a string that's separated by newlines (instead of spaces) and you want to find words that start with the same character (instead of being identical).

Thorsten
-- 
http://mail.python.org/mailman/listinfo/python-list
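The post does not include the source of part, quotient_set or ident, so the following is only a guess at what such utilities might look like, not Thorsten's actual code. The signatures are assumptions: this part takes just a sequence and a list of separator characters, and the third argument ('sep') from the session above is left out because its meaning is not given in the post.

```python
from collections import defaultdict


def part(seq, separators):
    """Split a string at any of the separator characters (hypothetical helper).

    Empty pieces between consecutive separators are dropped.
    """
    pieces, current = [], []
    for ch in seq:
        if ch in separators:
            if current:
                pieces.append(''.join(current))
                current = []
        else:
            current.append(ch)
    if current:
        pieces.append(''.join(current))
    return pieces


def quotient_set(items, key):
    """Group items into equivalence classes: equal key, same class."""
    classes = defaultdict(list)
    for item in items:
        classes[key(item)].append(item)
    return dict(classes)


def ident(x):
    """Identity function: two items are equivalent only if they are equal."""
    return x


# The original problem: space-separated, grouped by identity.
a = '45 324 45324'
same_word = quotient_set(part(a, [' ']), ident)

# The changed problem: newline-separated, grouped by first character.
b = 'apple\navocado\nbanana'
same_initial = quotient_set(part(b, ['\n']), lambda w: w[0])

print(same_word)
print(same_initial)
```

The number of occurrences of a word is then the length of its class, e.g. len(same_word['324']). Switching from the first problem to the second only changes the separator list and the key function, which is the flexibility the reply is pointing at.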