On 6 May 2013 19:39, Linsey Raaijmakers <lm.raaijmak...@gmail.com> wrote: > I have a file like this: > action start end > 50 5321 5321 > 7 5323 5347 > 12 5339 5351 > 45 5373 5373 > 45 5420 5420 > 25 5425 5425 [snip]
your code below suggests that your file also has an "apex" column. If that's correct what do you what is it and what do you want to do with it? [snip] > I have a script now that identifies overlap between two actions (see bottom > page), but how can I change this so that it outputs all possible combinations? I looked at that code and I don't think it does what you describe. Are you sure that it works? > My desired output would be: > > action times observed apex > 50 5 5321, 5451, 5533, 5634, 5740 > 50,45 1 5533;5533 > 7 4 5347, 5689, 5688, 5845 > 7,25 2 5347;5425, 5689;5682 > 7,25,26 1 5689;5682;5690 > > CODE: > > from collections import Counter > f = open('and.txt','r'); > > action_list = [] > onset_list = [] > apex_list = [] > offset_list = [] > action_counter = 0 > combination_list = [] > > > for line in f: > fields = line.split("\t") > for col in fields: > action = fields[0] > onset = fields[1] > apex = fields[2] > offset = fields[3] The above code suggests that the file has four columns. Also since you're not actually using the loop variable "col" you can just delete the "for" line and unindent the rest. In fact the easiest way to do all of this is just: action, onset, apex, offset = line.split() > > action_list.append(action) > onset_list.append(onset) > apex_list.append(apex) > offset_list.append(offset) > > action_cnvrt = map(int, action_list) > c = Counter(action_cnvrt) > > filtered = list(set(action_list)) There's no need to convert this back to a list if you're just going to iterate over it again with map. > filtered_cnvrt = map(int, filtered) > > for a in filtered_cnvrt: > action_count = str(a)+"\t"+str(c[a]) > print action_count The above counts the number of times each event occurs which is one part of your desired output. > > for i in range (0,len(onset_list)): > combination_list.append(action_list[i]) > for j in range(0,len(apex_list)): > if i != j: > if onset_list[j]>= onset_list[i] and apex_list[j] <= apex_list[i]: > print > action_list[j]+","+action_list[i]+'\t'+onset_list[j]+'\t'+apex_list[j]+'\t'+onset_list[i]+'\t'+apex_list[i] What is combination_list for? It should just end up containing the same thing as action_list if I understand correctly. It's generally better in Python to loop directly over things rather than using indices so, instead of something like: for i in range(len(onset_list)): print offset_list[i] - onset_list[i] you should do something like for offset, onset in zip(offset_list, onset_list): print offset - onset and your code will be a lot clearer. The algorithm you are using is to loop over all events and then to loop over all other events comparing all pairs of events. This will not scale very well if you want to look at a large file or to compare simultaneous occurrences of more than two events. It looks as if your input data is ordered by the onset column. Is that the case? If so then you can use an algorithm that just loops once over all the events. The way the algorithm works is that you store which events are currently active and loop through the events keeping track of the start time of the most recently added event adding and removing events as they start and stop. In pseudocode: now = start of first event active = [first event] for next_starting in events (not including first): next_ending = event that ends soonest from active while start of next_starting > end of next_ending: report active events from now to end of next_ending now = end of next_ending remove next_ending from active report active events from now until start of next_starting now = start of next_starting add next_starting to active And some more code to deal with what happens when you get to the end of the list of events... The report steps probably mean adding to a Counter or dict to remember that the currently active events were active during each particular time window. Oscar -- http://mail.python.org/mailman/listinfo/python-list