Nathan Harmston wrote:
> Hi,
>
> It seems that by just going through the problem writing out a better
> explanation for the reply I have figured out a solution and the
> problem isnt as difficult as I thought it would be.

Often happens.

>
> What is a wontok?

It's Melanesian Pidgin (from the English "one talk") meaning a person
who speaks the same language as you, a member of your clan, ... the
context being that [at least in Papua New Guinea] there are relatively
many languages each with relatively not many speakers :-)

>
> Thanks
>
> Nathan
>
> PS --> the start of my reply:
>
> class Interval(object):
>     _id = "gene1"
>     _start = 50
>     _end = 200
>     _strand = 1
>
> class Sequence(object):
>     _sequence = "atgtcgtgagagagagttgtgag................."
>
> > Only vaguely. You use several terms which appear to be from your trade
> > jargon
>
> Sequence is a string made from a restricted alphabet (A,T,G,C...).
> Sequences can be aligned:
> 1 ATGCTGCAT
> 2 TAGCTGTTA
> -------
> 2 5

I'm sure they can be, but appearances can be deceptive when you mix
tabs and spaces -- or whatever caused the above 4 lines to be not
vertically aligned but staggered diagonally like a flight of ducks
heading equatorwards for winter. Sometimes a line of code (e.g.
str1[2:6] == str2[2:6]) is worth a thousand pictures :-)

>
> I m trying to represent this as a graph Interval(id=1, start=2, end=6,
> strand=1) ---edge------Interval(id=2, start=2, end=6, strand=1)
>
> The problem is I was planning on storing the sequences in a dictionary
> {id:Seq}, however each dictionary would represent a different source
> of sequences. File1, File2....... (
> STORE THE SOURCES AS A DICT

Mapping what keys to what values?

> AND HAVE SOURCE IN INTERVAL ASWELL

So you had a data modelling problem. These are often better solved as a
separate step before you think about implementation details like
dictionaries.

Good luck with your project.

Cheers,
John
--
http://mail.python.org/mailman/listinfo/python-list
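
For anyone reading this thread later, here is a minimal sketch of the data model it seems to arrive at: sources stored as a dict, and each interval carrying its source. It is only one possible reading. The thread gives the id/start/end/strand fields and the two example sequences; the `source` field name, the `sources` layout (source name -> {sequence id -> sequence string}), and the `subsequence` helper are assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical record: id/start/end/strand come from the thread,
# "source" is the extra field the thread proposes adding.
Interval = namedtuple("Interval", "source id start end strand")

# Assumed mapping: source name -> {sequence id -> sequence string}.
sources = {
    "File1": {1: "ATGCTGCAT"},
    "File2": {2: "TAGCTGTTA"},
}

def subsequence(interval, sources):
    """Return the slice of the stored sequence that the interval covers."""
    seq = sources[interval.source][interval.id]
    return seq[interval.start:interval.end]

a = Interval(source="File1", id=1, start=2, end=6, strand=1)
b = Interval(source="File2", id=2, start=2, end=6, strand=1)

# The alignment from the thread, stated as code instead of a picture:
# equivalent to str1[2:6] == str2[2:6].
aligned = subsequence(a, sources) == subsequence(b, sources)
```

Keying the outer dict by source keeps ids from different files from colliding, and an edge in the graph can then simply be a pair of Interval records.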