jarod_v6--- via Tutor wrote:
> Dear All!
> I have this element:
>
> In [445]: pt = line.split("\t")[9]
>
> In [446]: pt
> Out[446]: 'gene_id "ENSG0223972"; gene_version "5"; transcript_id
> "ENST0456328"; transcript_version "2"; exon_number "1"; gene_name
> "DDX11L1"; gene_source "havana"; gene_biotype
> "transcribed_unprocessed_pseudogene"; transcript_name "DDX11L1-002";
> transcript_source "havana"; transcript_biotype "processed_transcript";
> exon_id "ENSE2234944"; exon_version "1"; tag "basic";
> transcript_support_level "1";\n'
>
>
> and I want to create a dictionary like this
>
> gene_id = "ENSG0223972"; ...
>
>
> I found this way to create a dictionary of dictionaries on Stack Overflow
> (http://stackoverflow.com/questions/8550912/python-dictionary-of-dictionaries)
> # This is our sample data
> data = [("Milter", "Miller", 4), ("Milter", "Miler", 4), ("Milter",
> "Malter", 2)]
>
> # dictionary we want for the result
> dictionary = {}
>
> # loop that makes it work
> for realName, falseName, position in data:
>     dictionary.setdefault(realName, {})[falseName] = position
>
> I want to create a dictionary using setdefault, but I am having difficulty
> transforming pt into a list of tuples.
>
> data = pt.split(";")
> for i in data:
>     l = i.split()
>     print l[0]
>
> IndexError: list index out of range
>
> In [457]: for i in data:
>     l = i.split()
>     print l
>
> ['gene_id', '"ENSG0223972"']
> ['gene_version', '"5"']
> ['transcript_id', '"ENST0456328"']
> ['transcript_version', '"2"']
> ['exon_number', '"1"']
> ['gene_name', '"DDX11L1"']
> ['gene_source', '"havana"']
> ['gene_biotype', '"transcribed_unprocessed_pseudogene"']
> ['transcript_name', '"DDX11L1-002"']
> ['transcript_source', '"havana"']
> ['transcript_biotype', '"processed_transcript"']
> ['exon_id', '"ENSE2234944"']
> ['exon_version', '"1"']
> ['tag', '"basic"']
> ['transcript_support_level', '"1"']
> []
>
>
> So how can I do that in a more elegant way?
> Thanks so much!
I don't see why you would need dict.setdefault(); you already have the
necessary pieces:

    data = pt.split(";")
    pairs = (item.split() for item in data)
    mydict = {item[0]: item[1].strip('"') for item in pairs if len(item) == 2}

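
To try the lines above on their own, here is a minimal self-contained sketch.
The string pt is a shortened copy of the attribute column from the question
(only four of the fields), not the full line from the file.

```python
# Shortened sample of the attribute string from the question.
pt = ('gene_id "ENSG0223972"; gene_version "5"; '
      'gene_name "DDX11L1"; tag "basic";\n')

data = pt.split(";")
pairs = (item.split() for item in data)
# The len(item) == 2 test skips the empty list that the trailing ";\n"
# produces, which is what raised the IndexError in the original loop.
mydict = {item[0]: item[1].strip('"') for item in pairs if len(item) == 2}

print(mydict["gene_id"])    # ENSG0223972
print(mydict["gene_name"])  # DDX11L1
```
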
You can protect against whitespace in the quoted strings with
item.split(None, 1) instead of item.split(). If ";" is allowed in the quoted
strings you have to work a little harder.
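
As a rough sketch of both points: the first part shows item.split(None, 1)
keeping a value that contains a space in one piece, and the second part uses a
regular expression so that a ";" inside the quotes does not end the field.
The name parse_attributes, the pattern, and the sample strings are made up
for illustration. The pattern assumes that every value is wrapped in double
quotes and contains no escaped double quotes.

```python
import re

# Part 1: split only at the first run of whitespace.
item = ' gene_name "DDX11L1 like"'        # made-up value containing a space
key, value = item.split(None, 1)
print(key, value.strip('"'))              # gene_name DDX11L1 like

# Part 2: match key "value"; pairs directly instead of splitting on ";".
# Assumption: values are always double-quoted, with no escaped quotes inside.
ATTRIBUTE = re.compile(r'(\S+)\s+"([^"]*)"\s*;')

def parse_attributes(text):
    """Return a dict mapping each key to its unquoted value."""
    return dict(ATTRIBUTE.findall(text))

sample = 'gene_id "ENSG0223972"; note "first; second part"; tag "basic";\n'
attrs = parse_attributes(sample)
print(attrs["note"])                      # first; second part
```

If a key can occur more than once in a line, dict() keeps only the last
occurrence, so a list of pairs may be the better result in that case.
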
___
Tutor maillist - Tutor@python.org