First, just a little rant :-)
It doesn't help to randomly change some lines or introduce new concepts you don't understand yet and then hope to get the right result. The chances that this will be successful are very small.
You should try to understand some basic concepts first and build on them.
From your postings over the last few weeks, and especially from today's, I have the impression that you still don't understand how fundamental programming concepts work: for-loops, the differences between data types (strings, lists, sets, ...). Honestly, have you already read any programming tutorial? (You'll find a big list at http://wiki.python.org/moin/BeginnersGuide/NonProgrammers ) At the moment it looks like you are just copying code snippets from different places and then hopelessly trying to modify them to suit your needs. IMHO the problems you want to solve are a little too big for you right now.

Nevertheless, here are some comments:

> Based on former advice, I made a correction/modification on the below code.
>
> 1] the set and subgroup does not work, here I wish to put all the
> subgroup in a big set, the set like

That's a good idea, but you don't use the set correctly.

> subgroups=[]
> subgroup=[]
> def LongestCommonSubstring(S1, S2):

I think it's better to move "subgroups" and "subgroup" into the function. (I've noticed that in most of your scripts you are using a lot of global variables. IMHO that's not the best programming style. Do you know what "global/local variables", "namespace", "scope" mean?)
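To illustrate the difference, here is a small sketch. The names "seen", "collect_global" and "collect_local" are made up for this example and are not from your script. The first function changes a module-level list, so results from earlier calls leak into later ones; the second creates its container locally and returns it.

```python
seen = []  # global: shared by every call of collect_global()


def collect_global(item):
    # Appends to the module-level list, so state survives between calls.
    seen.append(item)
    return seen


def collect_local(items):
    # "found" is local: it is created fresh on every call.
    found = set()
    for item in items:
        found.add(item)
    return found


collect_global("abc")
collect_global("bcd")
print(seen)                    # ['abc', 'bcd']

print(collect_local(["abc"]))  # {'abc'}
print(collect_local(["bcd"]))  # {'bcd'}
```

With the local version you can call the function as often as you like without having to reset anything in between.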

You are defining "subgroups" as an empty list, but later you want to use it as a set. Thus, you should define it as an empty set:

subgroups = set()

You are also defining "subgroup" as an empty list, but later you assign a slice of "S1" to it. Since "S1" is a string, the slice is also a string. Therefore:

subgroup = ""
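A quick check in the interpreter shows the types involved. The value of "S1" below is just a made-up sample string:

```python
S1 = "abcdefg"          # sample value, only for illustration

subgroups = set()       # empty set; note that {} would give an empty dict
subgroup = ""           # empty string, the same type as a slice of S1

subgroup = S1[1:4]      # slicing a string gives a string again
print(subgroup)         # bcd
print(type(subgroup))   # <class 'str'>
print(type(subgroups))  # <class 'set'>
print(type({}))         # <class 'dict'>
```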

>      M = [[0]*(1+len(S2)) for i in xrange(1+len(S1))]

Peter already told you why "xrange" doesn't work in Python 3. But instead of using an alias like

xrange = range

IMHO it's better to change it in the code directly, i.e. to replace every "xrange" with "range".
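For example, the matrix initialisation and the loop header could look like this in Python 3. "S1" and "S2" are short sample strings that I made up:

```python
S1 = "abc"  # sample values, only for illustration
S2 = "ab"

# One row per character of S1 (plus one), one column per character of S2 (plus one).
M = [[0] * (1 + len(S2)) for _ in range(1 + len(S1))]
print(M)  # [[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]]

for x in range(1, 1 + len(S1)):
    print(x)  # prints 1, 2, 3 on separate lines
```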

>      longest, x_longest = 0, 0
>      for x in xrange(1,1+len(S1)):
>          for y in xrange(1,1+len(S2)):
>              if S1[x-1] == S2[y-1]:
>                  M[x][y] = M[x-1][y-1]+1
>                  if M[x][y] > longest:
>                      longest = M[x][y]
>                      x_longest = x
>                  if longest >= 3:
>                      subgroup=S1[x_longest-longest:x_longest]
>                      subgroups=set([subgroup])

Here, in the first iteration, you overwrite your original empty list "subgroups" with a set built from a one-element list, so the set contains the string "subgroup" as its only element. Do you really understand this line? In every following iteration you overwrite this one-element set with another one-element set (containing the next "subgroup"). If you want to add an element to an existing set instead of replacing the whole set, you have to use its "add()" method:

subgroups.add(subgroup)

This will add the string "subgroup" as a new element to the set "subgroups".
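The difference is easy to see with a few made-up sample strings (the list "candidates" is only for illustration):

```python
candidates = ["abc", "bcd", "abc"]  # sample substrings, not real data

# Rebinding the name: each iteration throws away the previous set.
overwritten = []
for subgroup in candidates:
    overwritten = set([subgroup])

# Adding to one set: earlier elements are kept, duplicates are ignored.
collected = set()
for subgroup in candidates:
    collected.add(subgroup)

print(overwritten)        # {'abc'}
print(sorted(collected))  # ['abc', 'bcd']
```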

>                      print(subgroups)
>              else:
>                      M[x][y] = 0
>
>      return S1[x_longest-longest:x_longest]

Here you probably want to return the set "subgroups":

return subgroups
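Putting all the comments together, the function could look roughly like the sketch below. I kept your function name and your logic and only applied the changes mentioned above, so take it as a starting point and not as a finished solution. The two sample strings at the end are made up.

```python
def LongestCommonSubstring(S1, S2):
    # Both containers are now local to the function.
    subgroups = set()
    subgroup = ""
    M = [[0] * (1 + len(S2)) for _ in range(1 + len(S1))]
    longest, x_longest = 0, 0
    for x in range(1, 1 + len(S1)):
        for y in range(1, 1 + len(S2)):
            if S1[x - 1] == S2[y - 1]:
                M[x][y] = M[x - 1][y - 1] + 1
                if M[x][y] > longest:
                    longest = M[x][y]
                    x_longest = x
                if longest >= 3:
                    # Slice of a string, so "subgroup" is a string too.
                    subgroup = S1[x_longest - longest:x_longest]
                    # add() extends the existing set instead of replacing it.
                    subgroups.add(subgroup)
            else:
                M[x][y] = 0
    return subgroups


result = LongestCommonSubstring("abcdxyz", "xyzabcd")
print(sorted(result))  # ['abc', 'abcd']
```

Note that with this logic the set only collects the longest common substring found so far (here first "abc", then "abcd"), not every common substring of length 3 or more. If you want the latter, the condition has to look at M[x][y] instead of "longest".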


> 2] I still have trouble in reading files, mainly about not read "" etc.

The problem is that your data files contain just one big one-line string. AFAIK you have produced these data files yourself, haven't you? In that case it would be better to change the way you save the data (be it a well-formatted string or a list or something else) instead of trying to fix it here (in this script).
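I don't know what your data really looks like, so the following is only a guess at one possible format: one string per line. The file name "sequences.txt" and the sample strings are made up, and I write into a temporary directory so that the example doesn't touch your real files.

```python
import os
import tempfile

sequences = ["abcdxyz", "xyzabcd"]  # made-up sample data

# Hypothetical file name inside a throwaway directory.
path = os.path.join(tempfile.mkdtemp(), "sequences.txt")

# Save: one string per line, no quotes or brackets around it.
with open(path, "w") as f:
    for seq in sequences:
        f.write(seq + "\n")

# Load: strip the newline and skip empty lines.
with open(path) as f:
    loaded = [line.strip() for line in f if line.strip()]

print(loaded)  # ['abcdxyz', 'xyzabcd']
```

A file written like this can be read back without any special handling of quotes.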

Bye, Andreas