On May 20, 6:53 pm, norseman <norse...@hughes.net> wrote:
> bearophileh...@lycos.com wrote:
> > yadin:
> >> How can I build up a program that tells me that this sequence
> >> 1000028706
> >> 1000028707
> >> 1000028708
> >> is repeated somewhere in the column, and how can i know where?
> >
> > Can such patterns nest? That is, can you have a repeated pattern made
> > of an already seen pattern plus something else?
> > If you don't want a complex program, then you may need to specify the
> > problem better.
> >
> > You may want something like LZ77 or related (LZ78, etc):
> > http://en.wikipedia.org/wiki/LZ77
> > This may have a bug:
> > http://code.activestate.com/recipes/117226/
> >
> > Bye,
> > bearophile
>
> ============================================
> index on column
> Ndx1 is set to index #1
> Ndx2 is set to index #2
> test Ndx1 against Ndx2
> if equal write line number and column content to a file
>   (that's two things on one line:  15 1000028706
>                                   283 1000028706 )
> Ndx1 is set to Ndx2
> Ndx2 is set to index #next
> loop to test writing out each duplicate set
>
> Then use the outfile and index on line number
>
> In similar manner, check if line current and next line line numbers are
> sequential. If so scan forward to match column content of lower line
> number and check first matched column's line number and next for
> sequential. Print them out if so
>
> everything in outfile has 1 or more duplicates
>
>   4  aa   |--
>   5  bb   |--  |   thus 4/5 match 100/101
>   6  cc   |    |
>   .       |    |
> 100  aa   |    |--
> 101  bb   |--
> 102  ddd
> 103  cc   there is a duplicate but not a sequence
> 200  ff
>
> mark duplicate sequences as tested and proceed on through
> seq1 may have more than one other seq in file.
> the progress is from start to finish without looking back
> thus each step forward has fewer lines to test.
> marking already knowns eliminates redundant sequence testing.
>
> By subsetting on pass1 the expensive testing is greatly reduced.
> If you know your subset data won't exceed memory then the "outfile"
> can be held in memory to speed things up considerably.
>
> Today is: 20090520
> no code
>
> Steve

This is the program I wrote, but it is not working.

I have a list of valves and another list of pressures. I am asked to
find which valves use all of a given set of wanted ("best") pressures.
With the data below, the program is supposed to return all the valves
that use pressures 1 "and" 2 "and" 3. It returns A, A2, A35... The
correct answer is supposed to be A and A2. If I were asked for
pressures 56 and 78, the correct answer is supposed to be valves G and
G2.

Valves = ['A','A','A','G', 'G', 'G',
          'C','A2','A2','A2','F','G2','G2','G2','A35','A345','A4'] ##valve names
pressures = [1,2,3,4235,56,78,12, 1, 2, 3, 445, 45,56,78,1, 23,7] ## valve pressures
result = []
bestpress = [1,2,3] ##wanted base pressures
print bestpress,'len bestpress is' , len(bestpress)
print len(Valves)
for j in range(len(Valves)):
    #for i in range(len(bestpress)):
    #for j in range(len(Valves)):
    for i in range(len(bestpress)-2):
        if pressures [j]== bestpress[i] and bestpress [i+1] ==pressures [j+1] and bestpress [i+2]==pressures [j+2]:
            result.append(Valves[j])
            #i = i+1
            #j = j+1
            # print i, j, bestpress[i]
print "common PSVs are", result

--
http://mail.python.org/mailman/listinfo/python-list
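
One possible way to approach the valve question, offered as a sketch rather than a definitive fix: the loop above compares exactly three consecutive entries, and `range(len(bestpress)-2)` is empty when only two pressures are wanted, so a query like 56 and 78 can never match. The version below instead collects the pressures seen for each valve name and checks whether the wanted pressures are a subset of them. It assumes that the order of the pressures does not matter and that equal valve names refer to the same valve. The function name `valves_using_all` is made up for illustration, and the code is written for Python 3.

```python
from collections import defaultdict


def valves_using_all(valves, pressures, wanted):
    """Return valve names whose recorded pressures include every wanted one.

    Hypothetical helper: assumes valves[k] and pressures[k] belong together.
    """
    # Collect the set of pressures seen for each valve name,
    # remembering the order in which valve names first appear.
    seen = defaultdict(set)
    order = []
    for name, pressure in zip(valves, pressures):
        if name not in seen:
            order.append(name)
        seen[name].add(pressure)

    wanted = set(wanted)
    # A valve qualifies when the wanted pressures are a subset of its own.
    return [name for name in order if wanted <= seen[name]]


Valves = ['A', 'A', 'A', 'G', 'G', 'G', 'C', 'A2', 'A2', 'A2',
          'F', 'G2', 'G2', 'G2', 'A35', 'A345', 'A4']
pressures = [1, 2, 3, 4235, 56, 78, 12, 1, 2, 3,
             445, 45, 56, 78, 1, 23, 7]

print(valves_using_all(Valves, pressures, [1, 2, 3]))  # ['A', 'A2']
print(valves_using_all(Valves, pressures, [56, 78]))   # ['G', 'G2']
```

Because the check is a set comparison, the same function handles any number of wanted pressures. If the pressures must appear consecutively and in order, a sliding-window comparison would be needed instead.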