On Jun 27, 2007, at 10:24 AM, Mike Hansen wrote:

>
>
>> -----Original Message-----
>> From: [EMAIL PROTECTED]
>> [mailto:[EMAIL PROTECTED] On Behalf Of Gardner, Dean
>> Sent: Wednesday, June 27, 2007 3:59 AM
>> To: tutor@python.org
>> Subject: [Tutor] Regular Expression help
>>
>> Hi
>>
>> I have a text file that I would like to split up so that I
>> can use it in Excel to filter a certain field. However as it
>> is a flat text file I need to do some processing on it so
>> that Excel can correctly import it.
>>
>> File Example:
>> tag desc VR VM
>> (0012,0042) Clinical Trial Subject Reading ID LO 1
>> (0012,0050) Clinical Trial Time Point ID LO 1
>> (0012,0051) Clinical Trial Time Point Description ST 1
>> (0012,0060) Clinical Trial Coordinating Center Name LO 1
>> (0018,0010) Contrast/Bolus Agent LO 1
>> (0018,0012) Contrast/Bolus Agent Sequence SQ 1
>> (0018,0014) Contrast/Bolus Administration Route Sequence SQ 1
>> (0018,0015) Body Part Examined CS 1
>>
>> What I essentially want is to use python to process this file
>> to give me
>>
>>
>> (0012,0042); Clinical Trial Subject Reading ID; LO; 1
>> (0012,0050); Clinical Trial Time Point ID; LO; 1
>> (0012,0051); Clinical Trial Time Point Description; ST; 1
>> (0012,0060); Clinical Trial Coordinating Center Name; LO; 1
>> (0018,0010); Contrast/Bolus Agent; LO; 1
>> (0018,0012); Contrast/Bolus Agent Sequence; SQ ;1
>> (0018,0014); Contrast/Bolus Administration Route Sequence; SQ; 1
>> (0018,0015); Body Part Examined; CS; 1
>>
>> so that I can import to excel using a delimiter.
>>
>> This file is extremely long and all I essentially want to do
>> is to break it into it 'fields'
>>
>> Now I suspect that regular expressions are the way to go but
>> I have only basic experience of using these and I have no
>> idea what I should be doing.
>>
>> Can anyone help.
>>
>> Thanks
>>
>
> Hmmmm... You might be able to do this without the need for regular
> expressions. You can split the row on spaces which will give you a
> list.
> Then you can reconstruct the row inserting your delimiter as needed and
> joining the rest with spaces again.
>
> In [63]: row = "(0012,0042) Clinical Trial Subject Reading ID LO 1"
>
> In [64]: row_items = row.split(' ')
>
> In [65]: row_items
> Out[65]: ['(0012,0042)', 'Clinical', 'Trial', 'Subject', 'Reading',
> 'ID', 'LO', '1']
>
> In [66]: tag = row_items.pop(0)
>
> In [67]: tag
> Out[67]: '(0012,0042)'
>
> In [68]: vm = row_items.pop()
>
> In [69]: vm
> Out[69]: '1'
>
> In [70]: vr = row_items.pop()
>
> In [71]: vr
> Out[71]: 'LO'
>
> In [72]: desc = ' '.join(row_items)
>
> In [73]: new_row = "%s; %s; %s; %s" %(tag, desc, vr, vm, )
>
> In [74]: new_row
> Out[74]: '(0012,0042); Clinical Trial Subject Reading ID; LO; 1'
>
> Someone might think of a better way with them thar fancy lambdas and
> list comprehensions thingys, but I think this will work.
>
>

I sent this to Dean this morning:

Dean,

I would do something like this (if your pattern is always the same):

foo = ['(0012,0042) Clinical Trial Subject Reading ID LO 1 ',
       '(0012,0050) Clinical Trial Time Point ID LO 1 ',
       '(0012,0051) Clinical Trial Time Point Description ST 1 ',
       '(0012,0060) Clinical Trial Coordinating Center Name LO 1 ',
       '(0018,0010) Contrast/Bolus Agent LO 1 ',
       '(0018,0012) Contrast/Bolus Agent Sequence SQ 1 ',
       '(0018,0014) Contrast/Bolus Administration Route Sequence SQ 1 ',
       '(0018,0015) Body Part Examined CS 1',]

import csv

# newline='' lets the csv module control line endings; the with block
# closes the file so every row is flushed to disk.
with open('/Users/reed/tmp/foo.csv', 'w', newline='') as f:
    writer = csv.writer(f, delimiter=';')
    for lin in foo:
        lin = lin.split()
        # First token is the tag, the last two are VR and VM,
        # and everything in between is the description.
        row = (lin[0], ' '.join(lin[1:-2]), lin[-2], lin[-1])
        writer.writerow(row)

more foo.csv
(0012,0042);Clinical Trial Subject Reading ID;LO;1
(0012,0050);Clinical Trial Time Point ID;LO;1
(0012,0051);Clinical Trial Time Point Description;ST;1
(0012,0060);Clinical Trial Coordinating Center Name;LO;1
(0018,0010);Contrast/Bolus Agent;LO;1
(0018,0012);Contrast/Bolus Agent Sequence;SQ;1
(0018,0014);Contrast/Bolus Administration Route Sequence;SQ;1
(0018,0015);Body Part Examined;CS;1

HTH,
~reed
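
Since the original question asked about regular expressions, here is a rough sketch of how the same split could be done with one. It is not from the thread: the names ROW_RE, split_row and convert are made up for illustration, and it assumes every useful line is a tag, a free-text description, a VR and a VM separated by whitespace. Lines that do not fit (blank lines, for example) are skipped rather than raising an IndexError.

```python
import csv
import io
import re

# Assumed layout: tag, free-text description, VR, VM, separated by whitespace.
# The lazy (.+?) grows until exactly two whitespace-free tokens remain.
ROW_RE = re.compile(r'^\s*(\S+)\s+(.+?)\s+(\S+)\s+(\S+)\s*$')


def split_row(line):
    """Return (tag, desc, vr, vm), or None if the line does not fit the layout."""
    match = ROW_RE.match(line)
    return match.groups() if match else None


def convert(lines, out):
    """Write every line that fits the layout to `out` as a ';'-delimited row."""
    writer = csv.writer(out, delimiter=';', lineterminator='\n')
    for line in lines:
        row = split_row(line)
        if row is not None:  # skip blank or malformed lines
            writer.writerow(row)


sample = [
    '(0012,0042) Clinical Trial Subject Reading ID LO 1 ',
    '',
    '(0018,0010) Contrast/Bolus Agent LO 1',
]
buffer = io.StringIO()
convert(sample, buffer)
print(buffer.getvalue(), end='')
```

For a real file, convert() should accept an open input file as `lines` and an output file opened with newline='' as `out`, since it only iterates over lines and writes rows. The plain split() version above is probably easier to read; the regex mainly helps if you want to reject lines that do not match the expected shape.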