Dear group, I am having trouble finding a method to solve a problem where I want to walk through lineages of terms and group them from right to left.
A snippet of the data is below. The terms are in a tab-delimited file:

    a    b    c    d    car
    a    b    c    f    truck
    a    b    c    d    van
    a    b    c    d    SUV
    a    b    c    f    18-wheeler
    a    b    j    k    boat
    a    b    j    a    submarine
    a    b    d    a    B-747
    a    b    j    c    cargo-ship
    a    b    j    p    passenger-cruise ship
    a    b    a    a    bicycle
    a    b    a    b    motorcycle

My goal is to enrich members that have an identical lineage but different leaves. For example:

    a b c d - car, van, SUV

I have only three terms on this path and I am not happy with so few. I wish to have more. Then:

    a b c - car, van, truck, SUV and 18-wheeler (automobiles that travel on road)

I am happy with this grouping, and I enriched more items by walking up the lineage (a-b-c). I want to try to enrich in this way for all 21K lines of lineages.

My question: is there a way to automate this?

My idea for doing this: since this is a tab-delimited file, I want to read a line with, say, 5 tab-separated columns and search for other lines with the same item in column 4 (because the leaf items in column 5 could be unique). If I find a hit, I then check whether columns 3 and 2 are also identical, and if so create a list. This approach is recursive and consumes a lot of time and resources, but I cannot think of an easier solution. Would you please suggest a nice and simple method to solve this problem?

For people who are into bioinformatics (I know Danny Yoo is a bioinformatician): the question is about GO terms. I parsed the OBO file and laid out the term lineages that constitute the OBO DAG structure. I want to group the terms this way so that I can do an enrichment analysis for a set of terms that I am interested in.

Thank you in advance.

cheers
Srini
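The grouping idea described above can probably be done in a single pass, without searching the file again for every line. What follows is only a minimal sketch, assuming every line holds the lineage columns followed by exactly one leaf column, all tab-separated. The names group_by_prefix and enrich, and the min_size threshold, are hypothetical and chosen just for illustration.

```python
from collections import defaultdict


def group_by_prefix(lines):
    """Map every lineage prefix (as a tuple) to the leaves found beneath it."""
    groups = defaultdict(list)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        *lineage, leaf = fields
        # Walk from the full lineage back toward the root (right to left),
        # recording the leaf under each shorter prefix.
        for depth in range(len(lineage), 0, -1):
            groups[tuple(lineage[:depth])].append(leaf)
    return groups


def enrich(groups, lineage, min_size):
    """Return the deepest prefix of `lineage` holding at least `min_size` leaves.

    `min_size` is an assumed cut-off for "enough" terms; returns None if no
    prefix is large enough.
    """
    for depth in range(len(lineage), 0, -1):
        prefix = tuple(lineage[:depth])
        leaves = groups.get(prefix, [])
        if len(leaves) >= min_size:
            return prefix, leaves
    return None


# The sample rows from the message, written with explicit tab characters.
sample = [
    "a\tb\tc\td\tcar",
    "a\tb\tc\tf\ttruck",
    "a\tb\tc\td\tvan",
    "a\tb\tc\td\tSUV",
    "a\tb\tc\tf\t18-wheeler",
    "a\tb\tj\tk\tboat",
    "a\tb\tj\ta\tsubmarine",
    "a\tb\td\ta\tB-747",
    "a\tb\tj\tc\tcargo-ship",
    "a\tb\tj\tp\tpassenger-cruise ship",
    "a\tb\ta\ta\tbicycle",
    "a\tb\ta\tb\tmotorcycle",
]

groups = group_by_prefix(sample)
print(groups[("a", "b", "c", "d")])  # ['car', 'van', 'SUV']
print(groups[("a", "b", "c")])       # ['car', 'truck', 'van', 'SUV', '18-wheeler']
print(enrich(groups, ["a", "b", "c", "d"], min_size=4))
```

For the real file, an open file object could be passed instead of the sample list, since the function only iterates over lines. Each line is touched once per lineage column, so 21K lines with a handful of columns should be cheap, though the dictionary does keep every prefix in memory.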