On 15/12/16 01:56, renjith madhavan wrote:
I have a dataset in the below format.

id      A       B       C       D       E
100     1       0       0       0       0
101     0       1       1       0       0
102     1       0       0       0       0
103     0       0       0       1       1

I would like to convert this into below:
100, A
101, B C
102, A
103, D E

How do I do this? I tried numpy argsort, but I am new to Python and finding
this challenging.
I would appreciate any help with this.


Numpy or pandas? Neither: this is a straightforward bit of text manipulation that you can do without importing anything. I wouldn't bother considering either unless your dataset is massive and speed is a real issue.

with open("data.txt") as datafile:
    # First line needs handling separately
    line = next(datafile)
    columns = line.split()[1:]
    # Now iterate through the rest
    for line in datafile:
        # Split once: data[0] is the id, data[1:] are the 0/1 flags
        data = line.split()
        results = []
        for col, val in zip(columns, data[1:]):
            if val == "1":
                results.append(col)
        print("{0}, {1}".format(data[0], " ".join(results)))

Obviously there's no defensive coding for blank lines or unexpected data in there, and if you want to use the results later on you probably want to stash them in a dictionary, but that will do the job.
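
For what it's worth, the dictionary variant with a little blank-line handling might look something like the sketch below. The helper name parse_flags and the inline sample data are made up for illustration, and it assumes whitespace-separated fields with a single header row; it still does nothing clever about malformed values.

```python
def parse_flags(lines):
    """Map each id to the list of column names whose value is "1".

    parse_flags is a hypothetical helper name, not part of the original post.
    """
    rows = iter(lines)
    # Skip any leading blank lines, then treat the first real line as the header
    header = None
    for line in rows:
        if line.strip():
            header = line.split()
            break
    if header is None:
        # No header found (empty input), so there is nothing to report
        return {}
    columns = header[1:]
    results = {}
    for line in rows:
        data = line.split()
        if not data:
            # Blank line: ignore it
            continue
        results[data[0]] = [col for col, val in zip(columns, data[1:])
                            if val == "1"]
    return results


# Sample data copied from the question, with a blank line thrown in
sample = """\
id      A       B       C       D       E
100     1       0       0       0       0
101     0       1       1       0       0

102     1       0       0       0       0
103     0       0       0       1       1
"""

flags = parse_flags(sample.splitlines())
for key, cols in flags.items():
    print("{0}, {1}".format(key, " ".join(cols)))
```

To read from a file instead, pass the open file object straight in, e.g. parse_flags(datafile) inside the with block. The ids are kept as strings; convert with int() if you need numbers.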

--
Rhodri James *-* Kynesim Ltd
--
https://mail.python.org/mailman/listinfo/python-list
