On 03.06.2014 18:24, jarod...@libero.it wrote:
HI there!!!
I have a file like this:
file.txt
programs sample gene
program1 sample1 TP53
program1 sample1 TP53
program1 sample2 PRNP
program1 sample2 ATF3
program2 sample1 TP53
program2 sample1 PRNP
program2 sample2 TRIM32
program2 sample2 TLK1
program2 sample2 KIT
with open("prova.csv") as p:
    for i in p:
        lines = i.rstrip("\n").split("\t")
        print(lines)
['programs ', 'sample', 'gene', 'values']
['program1', 'sample1', 'TP53', '2']
['program1', 'sample1', 'TP53', '3']
['program1', 'sample2', 'PRNP', '4']
['program1', 'sample2', 'ATF3', '3']
['program2', 'sample1', 'TP53', '2']
['program2', 'sample1', 'PRNP', '5']
['program2', 'sample2', 'TRIM32', '4']
['program2', 'sample2', 'TLK1', '4']
Be exact; do not provide approximate information if you are looking for
adequate answers!
Your file did not look like the one you showed, there was an additional
'values' column in it.
What do you want to do with it?
I want to create a dictionary with set data with the names of the genes:
example:
dic = {}
dic['program1-sample1] = set(TP53)
dic['program1-sample2] = set(TP53,PRNP,ATF3)
Again, this is nothing you ever really tried in a Python shell, since
it would raise errors for several reasons. Just try it yourself!
I would not build dictionary keys by concatenating the 'programs' and
'sample' strings - rather use a tuple of the two (any immutable object
works as a dict key), e.g.:
dic[('program1', 'sample1')] = {'TP53'}
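A short sketch of why a tuple key is convenient: the two parts stay separate fields, so they can be unpacked again later without splitting a string. The second entry and the loop variable names are only illustrative, and the print() call assumes Python 3.

```python
# Tuple keys keep 'programs' and 'sample' as separate fields.
dic = {}
dic[('program1', 'sample1')] = {'TP53'}
dic[('program1', 'sample2')] = {'PRNP', 'ATF3'}

# The key can be unpacked directly when looping over the dict.
for (program, sample), genes in sorted(dic.items()):
    print(program, sample, sorted(genes))

# Any equal tuple finds the same entry.
key = ('program1', 'sample2')
print('ATF3' in dic[key])
```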
Essentially, what you need to do is:
- instead of printing each individual list you've parsed from the input
file, use the first two elements as a tuple for the dict key, then add
the third element (the gene) to the set stored under that key (use
set.add() for that purpose).
- the tricky part is what to do with keys that are encountered for the
first time and, thus, don't have a set associated with them yet.
Here, dict.setdefault() will help you
(https://docs.python.org/2.7/library/stdtypes.html?highlight=setdefault#dict.setdefault).
hint: your_dict.setdefault(your_key, set()).add(the_gene) will work
whether or not the key has been encountered before.
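Putting the two steps together, a minimal sketch might look like the following. It assumes Python 3 and a tab-separated file with a header line and the gene in the third column, as in the printed lists above. The rows are embedded through io.StringIO so the snippet runs without prova.csv, and build_gene_sets is a hypothetical helper name, not something from the original post.

```python
import io

# Stand-in for open("prova.csv"): tab-separated rows, header first.
SAMPLE = (
    "programs\tsample\tgene\tvalues\n"
    "program1\tsample1\tTP53\t2\n"
    "program1\tsample1\tTP53\t3\n"
    "program1\tsample2\tPRNP\t4\n"
    "program1\tsample2\tATF3\t3\n"
    "program2\tsample1\tTP53\t2\n"
    "program2\tsample1\tPRNP\t5\n"
    "program2\tsample2\tTRIM32\t4\n"
    "program2\tsample2\tTLK1\t4\n"
)


def build_gene_sets(handle):
    """Hypothetical helper: map (program, sample) tuples to sets of genes."""
    gene_sets = {}
    next(handle, None)  # skip the header line, if there is one
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # ignore blank or malformed lines
        program, sample, gene = fields[:3]
        # setdefault returns the existing set, or stores and returns a new one
        gene_sets.setdefault((program, sample), set()).add(gene)
    return gene_sets


dic = build_gene_sets(io.StringIO(SAMPLE))
for key in sorted(dic):
    print(key, sorted(dic[key]))
```

With a real file, the call would be build_gene_sets(p) inside the with open("prova.csv") as p: block. collections.defaultdict(set) is a common alternative to dict.setdefault() here.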
So if I have a dictionary like that, I can compare two sets. I will compare the
capability of the programs based on the genes they show.
I have no idea what you are trying to do, so I can't tell you whether
the data structure will be good for it.
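If the aim turns out to be something like "which genes do two programs report in common for the same sample", the built-in set operators may already be enough. A sketch under that assumption only, for Python 3, with the two sets written out by hand from the sample1 rows of the listing:

```python
# Gene sets for sample1, copied by hand from the listing above.
dic = {
    ('program1', 'sample1'): {'TP53'},
    ('program2', 'sample1'): {'TP53', 'PRNP'},
}

a = dic[('program1', 'sample1')]
b = dic[('program2', 'sample1')]

common = a & b      # genes reported by both programs
only_in_b = b - a   # genes reported by program2 only
either = a | b      # genes reported by at least one program

print(sorted(common), sorted(only_in_b), sorted(either))
```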
Wolfgang
_______________________________________________
Tutor maillist - Tutor@python.org