Re: [Tutor] manipulating data to delete blank spaces
On 28/05/12 23:15, Brendan Dornan wrote:
> Hi, I’d like to eliminate the no data fields in this XML file, since
> Tableau, the graphing software doesn’t allow this. What would be the
> easiest way to approach this? I’m a complete neophyte, having gone
> through the first 15 chapters of the “Think Like a Computer Scientist.”
> Long ways to go and appreciate any help. Cheers.

Ummm, what XML file? Maybe it got stripped in transit, or maybe you forgot to attach it. Or, if it's big, can you put it on a pastebin?

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
[Tutor] manipulating data to delete blank spaces
Hi, I'd like to eliminate the no data fields in this XML file, since Tableau, the graphing software doesn't allow this. What would be the easiest way to approach this? I'm a complete neophyte, having gone through the first 15 chapters of the "Think Like a Computer Scientist." Long ways to go and appreciate any help. Cheers. ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] manipulating data
If you run this code

    f = open('test1.mlc')
    for line in f:
        print line.split()

you will see that about halfway through the file there is an empty list. I assume there was nothing on that line, in which case there is no [0] value, so you need to put in a try/except IndexError block like this:

    f = open('test1.mlc')
    fields = {}
    for line in f:
        try:
            words = line.split()
            firstword = words[0]
        except IndexError:
            continue
        if firstword == 'Field':
            field = int(words[-1])
        elif firstword == 'Leaf':
            fields[field] = words[-1]

You will notice I made a few other changes. I assigned line.split() to a variable, so I don't have to call split() every time I want to check for a different word. The try/except block just fixes the problem you encountered. I also took out the last block, the else block, because it is not necessary and will in fact cause what you would consider an error in the program. Calling f.next() does advance to the next line, but that is exactly what the for loop is for. The natural conclusion of the for block is "finish the elif test and its block, then execute the code after it." Since there is no code after it indented to that level, the loop automatically advances 'line' (line = f.next()).

HTH,
JS

----- Original Message -----
From: "Bryan Fodness" <[EMAIL PROTECTED]>
To: "Alan Gauld" <[EMAIL PROTECTED]>
Cc:
Sent: Monday, November 12, 2007 2:43 PM
Subject: Re: [Tutor] manipulating data

> I try this,
>
> f = open('TEST1.MLC')
>
> fields = {}
>
> for line in f:
>     if line.split()[0] == 'Field':
>         field = int(line.split()[-1])
>     elif line.split()[0] == 'Leaf':
>         fields[field] = line.split()[-1]
>     else:
>         line = f.next()
>
> and get,
>
> Traceback (most recent call last):
>   File "", line 1, in
>     line.split()[0]
> IndexError: list index out of range
>
> I have attached my data file.
Re: [Tutor] manipulating data
Bryan Fodness wrote:
> I try this,
>
> f = open('TEST1.MLC')
>
> fields = {}
>
> for line in f:
>     if line.split()[0] == 'Field':
>         field = int(line.split()[-1])
>     elif line.split()[0] == 'Leaf':
>         fields[field] = line.split()[-1]
>     else:
>         line = f.next()
>
> and get,
>
> Traceback (most recent call last):
>   File "", line 1, in
>     line.split()[0]
> IndexError: list index out of range

For blank lines, line.split() is [], so there is no line.split()[0]. Try skipping blank lines before you split.

Kent
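Kent's suggestion can be sketched in a few lines. This is only an illustration, written in Python 3 syntax (the thread itself uses Python 2); the sample text and the helper name nonblank_fields are made up for the example and are not from the posted file.

```python
import io

def nonblank_fields(lines):
    """Yield the whitespace-split words of every non-blank line."""
    for line in lines:
        words = line.split()
        if not words:       # blank or whitespace-only line: split() gave []
            continue
        yield words

# Hypothetical sample standing in for the attached data file.
sample = io.StringIO("Field = 10\n\nLeaf 1A = 0.00\n   \nLeaf 2A = 5.00\n")
rows = list(nonblank_fields(sample))
print(rows[0][0])   # first word of the first non-blank line
```

Because the empty list is tested before any indexing, words[0] is never evaluated for a blank line, so no IndexError can be raised there.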
Re: [Tutor] manipulating data
On Monday November 12, 2007, Bryan Fodness wrote:
> I try this,
>
> f = open('TEST1.MLC')
>
> fields = {}
>
> for line in f:
>     if line.split()[0] == 'Field':
>         field = int(line.split()[-1])
>     elif line.split()[0] == 'Leaf':
>         fields[field] = line.split()[-1]
>     else:
>         line = f.next()
>
> and get,
>
> Traceback (most recent call last):
>   File "", line 1, in
>     line.split()[0]
> IndexError: list index out of range

Bryan,

There are some blank lines in your file. When those lines are reached, line.split() returns an empty list, and therefore line.split()[0] is an IndexError.

One way to rewrite this is as follows (untested):

    for line in f:
        pieces = line.split()
        if pieces:                       # non-empty line
            if pieces[0] == 'Field':
                field = int(pieces[-1])
            elif pieces[0] == 'Leaf':
                fields[field] = pieces[-1]
            else:
                line = f.next()          # I've left this here, but not sure
                                         # why you have it. The for loop
                                         # already advances from line to line

Note as well that it is better to perform the split once per line (rather than recomputing it as you do in your original code).

With regard,
Michael
Re: [Tutor] manipulating data
I try this, f = open('TEST1.MLC') fields = {} for line in f: if line.split()[0] == 'Field': field = int(line.split()[-1]) elif line.split()[0] == 'Leaf': fields[field] = line.split()[-1] else: line = f.next() and get, Traceback (most recent call last): File "", line 1, in line.split()[0] IndexError: list index out of range I have attached my data file. File Rev = G Treatment = Dynamic Dose Last Name = Fodness First Name = Bryan Patient ID = 0001 Number of Fields = 4 Number of Leaves = 120 Tolerance = 0.50 Field = 10 Index = 0. Carriage Group = 1 Operator = Collimator = 0.0 Leaf 1A = 0.00 Leaf 2A = 0.00 Leaf 3A = 0.00 Leaf 4A = 0.00 Leaf 5A = 0.00 Leaf 6A = 0.00 Leaf 7A = 0.00 Leaf 8A = 0.00 Leaf 9A = 0.00 Leaf 10A = 0.00 Leaf 11A = 0.00 Leaf 12A = 0.00 Leaf 13A = 0.00 Leaf 14A = 0.00 Leaf 15A = 0.00 Leaf 16A = 0.00 Leaf 17A = 0.00 Leaf 18A = 0.00 Leaf 19A = 0.00 Leaf 20A = 0.00 Leaf 21A = 5.00 Leaf 22A = 5.00 Leaf 23A = 5.00 Leaf 24A = 5.00 Leaf 25A = 5.00 Leaf 26A = 5.00 Leaf 27A = 5.00 Leaf 28A = 5.00 Leaf 29A = 5.00 Leaf 30A = 5.00 Leaf 31A = 5.00 Leaf 32A = 5.00 Leaf 33A = 5.00 Leaf 34A = 5.00 Leaf 35A = 5.00 Leaf 36A = 5.00 Leaf 37A = 5.00 Leaf 38A = 5.00 Leaf 39A = 5.00 Leaf 40A = 5.00 Leaf 41A = 0.00 Leaf 42A = 0.00 Leaf 43A = 0.00 Leaf 44A = 0.00 Leaf 45A = 0.00 Leaf 46A = 0.00 Leaf 47A = 0.00 Leaf 48A = 0.00 Leaf 49A = 0.00 Leaf 50A = 0.00 Leaf 51A = 0.00 Leaf 52A = 0.00 Leaf 53A = 0.00 Leaf 54A = 0.00 Leaf 55A = 0.00 Leaf 56A = 0.00 Leaf 57A = 0.00 Leaf 58A = 0.00 Leaf 59A = 0.00 Leaf 60A = 0.00 Leaf 1B = 0.00 Leaf 2B = 0.00 Leaf 3B = 0.00 Leaf 4B = 0.00 Leaf 5B = 0.00 Leaf 6B = 0.00 Leaf 7B = 0.00 Leaf 8B = 0.00 Leaf 9B = 0.00 Leaf 10B = 0.00 Leaf 11B = 0.00 Leaf 12B = 0.00 Leaf 13B = 0.00 Leaf 14B = 0.00 Leaf 15B = 0.00 Leaf 16B = 0.00 Leaf 17B = 0.00 Leaf 18B = 0.00 Leaf 19B = 0.00 Leaf 20B = 0.00 Leaf 21B = 5.00 Leaf 22B = 5.00 Leaf 23B = 5.00 Leaf 24B = 5.00 Leaf 25B = 5.00 Leaf 26B = 5.00 Leaf 27B = 5.00 Leaf 28B = 5.00 Leaf 29B = 5.00 Leaf 30B = 
5.00 Leaf 31B = 5.00 Leaf 32B = 5.00 Leaf 33B = 5.00 Leaf 34B = 5.00 Leaf 35B = 5.00 Leaf 36B = 5.00 Leaf 37B = 5.00 Leaf 38B = 5.00 Leaf 39B = 5.00 Leaf 40B = 5.00 Leaf 41B = 0.00 Leaf 42B = 0.00 Leaf 43B = 0.00 Leaf 44B = 0.00 Leaf 45B = 0.00 Leaf 46B = 0.00 Leaf 47B = 0.00 Leaf 48B = 0.00 Leaf 49B = 0.00 Leaf 50B = 0.00 Leaf 51B = 0.00 Leaf 52B = 0.00 Leaf 53B = 0.00 Leaf 54B = 0.00 Leaf 55B = 0.00 Leaf 56B = 0.00 Leaf 57B = 0.00 Leaf 58B = 0.00 Leaf 59B = 0.00 Leaf 60B = 0.00 Note = 0 Shape = 4 500 500 500 -500 -500 -500 -500 500 Magnification = 1.00 Field = 8 Index = 0.4000 Carriage Group = 1 Operator = Collimator = 0.0 Leaf 1A = 0.00 Leaf 2A = 0.00 Leaf 3A = 0.00 Leaf 4A = 0.00 Leaf 5A = 0.00 Leaf 6A = 0.00 Leaf 7A = 0.00 Leaf 8A = 0.00 Leaf 9A = 0.00 Leaf 10A = 0.00 Leaf 11A = 0.00 Leaf 12A = 0.00 Leaf 13A = 0.00 Leaf 14A = 0.00 Leaf 15A = 0.00 Leaf 16A = 0.00 Leaf 17A = 0.00 Leaf 18A = 0.00 Leaf 19A = 0.00 Leaf 20A = 0.00 Leaf 21A = 0.00 Leaf 22A = 0.00 Leaf 23A = 4.00 Leaf 24A = 4.00 Leaf 25A = 4.00 Leaf 26A = 4.00 Leaf 27A = 4.00 Leaf 28A = 4.00 Leaf 29A = 4.00 Leaf 30A = 4.00 Leaf 31A = 4.00 Leaf 32A = 4.00 Leaf 33A = 4.00 Leaf 34A = 4.00 Leaf 35A = 4.00 Leaf 36A = 4.00 Leaf 37A = 4.00 Leaf 38A = 4.00 Leaf 39A = 0.00 Leaf 40A = 0.00 Leaf 41A = 0.00 Leaf 42A = 0.00 Leaf 43A = 0.00 Leaf 44A = 0.00 Leaf 45A = 0.00 Leaf 46A = 0.00 Leaf 47A = 0.00 Leaf 48A = 0.00 Leaf 49A = 0.00 Leaf 50A = 0.00 Leaf 51A = 0.00 Leaf 52A = 0.00 Leaf 53A = 0.00 Leaf 54A = 0.00 Leaf 55A = 0.00 Leaf 56A = 0.00 Leaf 57A = 0.00 Leaf 58A = 0.00 Leaf 59A = 0.00 Leaf 60A = 0.00 Leaf 1B = 0.00 Leaf 2B = 0.00 Leaf 3B = 0.00 Leaf 4B = 0.00 Leaf 5B = 0.00 Leaf 6B = 0.00 Leaf 7B = 0.00 Leaf 8B = 0.00 Leaf 9B = 0.00 Leaf 10B = 0.00 Leaf 11B = 0.00 Leaf 12B = 0.00 Leaf 13B = 0.00 Leaf 14B = 0.00 Leaf 15B = 0.00 Leaf 16B = 0.00 Leaf 17B = 0.00 Leaf 18B = 0.00 Leaf 19B = 0.00 Leaf 20B = 0.00 Leaf 21B = 0.00 Leaf 22B = 0.00 Leaf 23B = 4.00 Leaf 24B = 4.00 Leaf 25B = 4.00 Leaf 26B = 4.00 Leaf 27B 
= 4.00 Leaf 28B = 4.00 Leaf 29B = 4.00 Leaf 30B = 4.00 Leaf 31B = 4.00 Leaf 32B = 4.00 Leaf 33B = 4.00 Leaf 34B = 4.00 Leaf 35B = 4.00 Leaf 36B = 4.00 Leaf 37B = 4.00 Leaf 38B = 4.00 Leaf 39B = 0.00 Leaf 40B = 0.00 Leaf 41B = 0.00 Leaf 42B = 0.00 Leaf 43B = 0.00 Leaf 44B = 0.00 Leaf 45B = 0.00 Leaf 46B = 0.00 Leaf 47B = 0.00 Leaf 48B = 0.00 Leaf 49B = 0.00 Leaf 50B = 0.00 Leaf 51B = 0.00 Leaf 52B = 0.00 Leaf 53B = 0.00 Leaf 54B
Re: [Tutor] manipulating data
"Bryan Fodness" <[EMAIL PROTECTED]> wrote:

> f = open('TEST1.MLC')
> fields = {}
> for line in f:
>     the_line = line.split()
>     if the_line:
>         if the_line[0] == 'Field':
>             field = int(the_line[-1])
>         elif the_line[0] == 'Leaf':
>             fields[field] = the_line[-1]
>
> which, sort of works, but it overwrites each value.

You need to create an empty list when you define field, and you need to append to that list. See the pseudo code I sent last time...

>> So we need to create an empty list entry where we
>> define field and then append here, so my pseudo
>> code now becomes:
>>
>> f = open('foo.dat')
>> for line in f:
>>     if field == None and 'Field' in line:
>>         field = int(line.split()[-1])
>>         fields[field] = []
>>     elif 'Leaf' in line:
>>         fields[field].append(line.split()[-1])
>>     else: f.next()

Alan G.
Re: [Tutor] manipulating data
I have tried,

    f = open('TEST1.MLC')
    fields = {}
    for line in f:
        the_line = line.split()
        if the_line:
            if the_line[0] == 'Field':
                field = int(the_line[-1])
            elif the_line[0] == 'Leaf':
                fields[field] = the_line[-1]

which, sort of works, but it overwrites each value.

On Nov 12, 2007 6:55 PM, Alan Gauld <[EMAIL PROTECTED]> wrote:
> The lesson here is not to try to do two things at once...
>
>> file.next()
>> TypeError: descriptor 'next' of 'file' object needs an argument
>
> OK, my algorithm was meant to be pseudo code, so file was
> not intended to be taken literally; it's just a marker for an open
> file object.
>
>> And, it is true that I am trying to build a list and not overwrite
>> the value.
>
> OK, that adds a bit more tweaking...
>
>>> Personally I'd use a flag to detect when field had
>>> been found and set - ie set field to None and then
>>> test for that changing, then test for Leaf as you do.
>
> That was before I went back to testing my own project
>
>>> So I think your algorithm should be
>>>
>>> for line in file:
>>>     if 'Field' in line:
>>>         field = int(line.split()[-1])
>
> and this was after - with no flag anywhere in sight! Oops.
>
> I intended the if test to include a check for field == None...
>
>     if field == None and 'Field' in line:
>
>>>     elif 'Leaf' in line:
>>>         fields[field] = line.split()[-1]
>>>     else: file.next()
>>>
>>> But I think there's another problem in that you are
>>> then overwriting the value of Leaf when I think you
>>> are trying to build a list?
>
> So we need to create an empty list entry where we
> define field and then append here, so my pseudo
> code now becomes:
>
>     f = open('foo.dat')
>     for line in f:
>         if field == None and 'Field' in line:
>             field = int(line.split()[-1])
>             fields[field] = []
>         elif 'Leaf' in line:
>             fields[field].append(line.split()[-1])
>         else: f.next()
>
> still untested I'm afraid, so it still may not work.
>
> HTH,
>
> Alan G.
Re: [Tutor] manipulating data
The lesson here is not to try to do two things at once...

> file.next()
> TypeError: descriptor 'next' of 'file' object needs an argument

OK, my algorithm was meant to be pseudo code, so file was not intended to be taken literally; it's just a marker for an open file object.

> And, it is true that I am trying to build a list and not overwrite
> the value.

OK, that adds a bit more tweaking...

>> Personally I'd use a flag to detect when field had
>> been found and set - ie set field to None and then
>> test for that changing, then test for Leaf as you do.

That was before I went back to testing my own project

>> So I think your algorithm should be
>>
>> for line in file:
>>     if 'Field' in line:
>>         field = int(line.split()[-1])

and this was after - with no flag anywhere in sight! Oops.

I intended the if test to include a check for field == None...

    if field == None and 'Field' in line:

>>     elif 'Leaf' in line:
>>         fields[field] = line.split()[-1]
>>     else: file.next()
>>
>> But I think there's another problem in that you are
>> then overwriting the value of Leaf when I think you
>> are trying to build a list?

So we need to create an empty list entry where we define field and then append here, so my pseudo code now becomes:

    f = open('foo.dat')
    for line in f:
        if field == None and 'Field' in line:
            field = int(line.split()[-1])
            fields[field] = []
        elif 'Leaf' in line:
            fields[field].append(line.split()[-1])
        else: f.next()

still untested I'm afraid, so it still may not work.

HTH,

Alan G.
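For readers who want something runnable, the pseudo code above might be fleshed out as follows. This is a sketch under assumptions, not Alan's tested code: it uses Python 3, compares the first word of each line instead of testing 'Field' in line (the posted header line "Number of Fields = 4" also contains 'Field'), converts leaf values to float, and reads from an in-memory sample. The function name leaves_by_field and the sample text are made up for the example.

```python
import io

def leaves_by_field(lines):
    """Map each field number to the list of its leaf values, in file order."""
    fields = {}
    field = None                      # no 'Field = n' line seen yet
    for line in lines:
        words = line.split()
        if not words:                 # skip blank lines
            continue
        if words[0] == 'Field':
            field = int(words[-1])
            fields[field] = []        # start a fresh list for this field
        elif words[0] == 'Leaf' and field is not None:
            fields[field].append(float(words[-1]))
    return fields

# Hypothetical excerpt in the layout of the posted file.
sample = io.StringIO(
    "Number of Fields = 2\n"
    "Field = 10\n"
    "\n"
    "Leaf 1A = 0.00\n"
    "Leaf 2A = 5.00\n"
    "Field = 8\n"
    "Leaf 1A = 4.00\n"
)
result = leaves_by_field(sample)
print(result)
```

No explicit call to next() is needed in an else branch, since the for loop already advances to the following line on its own.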
Re: [Tutor] manipulating data
Using the algorithm below, I get: Traceback (most recent call last): File "C:\Users\bryan\Documents\Yennes Medical Physics\mlcShape\findvalue.py", line 49, in file.next() TypeError: descriptor 'next' of 'file' object needs an argument And, it is true that I am trying to build a list and not overwrite the value. On Nov 12, 2007 5:22 PM, ALAN GAULD <[EMAIL PROTECTED]> wrote: > Brian, > > > if line.split()[0] == 'Field': > >field = int(line.split()[-1]) > > > > IndexError: list index out of range > > > You have blank lines in the file, when you try to call split > on an empty string you get an empty list so trying to > index any element will result in an Index error. > > That's why I suggested using exceptions, testing for > every possible error condition could take a long time > and be error prone. Unfortunately I guessed the wrong > error code and didn't realise you had some dross to > wade through first... so its a wee bit more complex. > > Personally I'd use a flag to detect when field had > been found and set - ie set field to None and then > test for that changing, then test for Leaf as you do. > > So I think your algorithm should be > > for line in file >if 'Field' in line: > field = int(line.split()[-1]) >elif 'Leaf' in line: > fields[field] = line.split()[-1] >else: file.next() > > But I think there's another problem in that you are > then overwriting the value of Leaf when I think you > are trying to build a list? I'm not 100% sure what > you are aiming for but hopefully its some help! > > Alan G. > > > > ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] manipulating data
Alan Gauld wrote:
> "Bryan Fodness" <[EMAIL PROTECTED]> wrote in
>
>> fields = {}
>> for line in open('data.txt') :
>>     if line : ## You shouldn't need this.
>>         if line.split()[0] == 'field' :
>>             field = int(line.split()[-1])
>>         else :
>>             fields[field] = tuple(line.split())
>>
>> fields[field] = tuple(line.split())
>> NameError: name 'field' is not defined
>
> As you should expect since you only define field inside
> the if branch so if you go down the else route first then
> field will not exist.
>
> I'm jumping into this rather late but I'd have thought
> something like this (untested code) might work:
>
> fields = {}
> for line in open('data.txt'):
>     try:
>         name,value = line.split()
>         fields[name] = int(value)
>     except AttributeError: pass # catches blank lines
>
> Or if you can control the data format the ConfigParser module
> might be a better solution.
>
> HTH,

Nice use of try block! I should point out, though, that the OP's file has this structure:

> My data is in a file with a format, where there may be multiple fields
>
> field = 1
>
> 1a 0

So you have a line where you get the field key and then a line where you get the values for that field (that's how I interpreted it). In that case the error comes from the fact that there is no "field = n" line before a value pair (if that happened later in the file, the code I submitted wouldn't catch the error). Or the OP's "field = 1" line was not part of the file, and then you have two choices: the "field" in the example submitted is "1" and the data is (a, 0), and either there may be multiple lines with the same field, or there will be one data tuple for each "field" value.

That would be (yes, I used Alan's code, nicer than mine):

Case 1) multiple tuples per field value

    fields = {}
    for line in open('data.txt'):
        try:
            name, value = line.split()
            fields.setdefault(name[:-1], []).append((name[-1], int(value)))
        except ValueError:
            pass    # skips blank lines and lines that are not a name/value pair

Case 2) one tuple per field value

    fields = {}
    for line in open('data.txt'):
        try:
            name, value = line.split()
            fields[name[:-1]] = (name[-1], int(value))
        except ValueError:
            pass    # skips blank lines and lines that are not a name/value pair

I don't have the OP's original post at hand so maybe he should clarify his file's structure.
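The setdefault idiom in case 1 can be tried on a small in-memory sample. The sketch below is in Python 3 and assumes the "1a 0" line layout from the original post; the function name group_by_row and the sample data are invented for illustration. Unpacking a line that does not split into exactly two words raises ValueError, which is what the except clause relies on.

```python
import io

def group_by_row(lines):
    """Collect (column, value) pairs under their row label: '1a 0' -> {'1': [('a', 0)]}."""
    fields = {}
    for line in lines:
        try:
            name, value = line.split()
        except ValueError:            # blank line, or not exactly two words
            continue
        # setdefault returns the existing list, or stores and returns a new one
        fields.setdefault(name[:-1], []).append((name[-1], int(value)))
    return fields

# Hypothetical sample in the layout of the original post.
sample = io.StringIO("field = 1\n\n1a 0\n2a 5\n1b 0\n2b 5\n")
grouped = group_by_row(sample)
print(grouped)
```

Note that the "field = 1" line splits into three words, so it is skipped along with the blank line. Whether that is acceptable depends on which of the two interpretations of the file is the right one.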
Re: [Tutor] manipulating data
Bryan,

> if line.split()[0] == 'Field':
>     field = int(line.split()[-1])
>
> IndexError: list index out of range

You have blank lines in the file. When you call split on an empty string you get an empty list, so trying to index any element will result in an IndexError.

That's why I suggested using exceptions: testing for every possible error condition could take a long time and be error prone. Unfortunately I guessed the wrong error code and didn't realise you had some dross to wade through first... so it's a wee bit more complex.

Personally I'd use a flag to detect when field had been found and set - ie set field to None and then test for that changing, then test for Leaf as you do.

So I think your algorithm should be

    for line in file:
        if 'Field' in line:
            field = int(line.split()[-1])
        elif 'Leaf' in line:
            fields[field] = line.split()[-1]
        else: file.next()

But I think there's another problem in that you are then overwriting the value of Leaf when I think you are trying to build a list? I'm not 100% sure what you are aiming for but hopefully it's some help!

Alan G.
Re: [Tutor] manipulating data
"Bryan Fodness" <[EMAIL PROTECTED]> wrote in > fields = {} > for line in open('data.txt') : >if line : ## You shouldn't need this. > if line.split()[0] == 'field' : > field = int(line.split()[-1]) > else : > fields[field] = tuple(line.split()) > > fields[field] = tuple(line.split()) > NameError: name 'field' is not defined As you should expect since you only define field inside the if branch so if you go down the else route first then field will not exist. I'm jumping into this rather late but I'd have thought something like this (untested code) might work: fields = {} for line in open('data.txt'): try: name,value = line.split() fields[name] = int(value) except AttributeError: pass # catches blank lines Or if you can control the data format the ConfigParser module might be a better solution. HTH, -- Alan Gauld Author of the Learn to Program web site http://www.freenetpages.co.uk/hp/alan.gauld On Nov 8, 2007 7:34 AM, Ricardo Aráoz <[EMAIL PROTECTED]> wrote: > > Kent Johnson wrote: > > Bryan Fodness wrote: > >> I would like to have my data in a format so that I can create a > >> contour plot. > >> > >> My data is in a file with a format, where there may be multiple > >> fields > >> > >> field = 1 > >> > >> 1a 0 > > > > If your data is really this regular, it is pretty easy to parse. A > > useful technique is to access a file's next method directly. 
> > Something > > like this (not tested!): > > > > f = open('data.txt') > > fields = {} # build a dict of fields > > try: > >while True: > > # Get the field line > > line = f.next() > > field = int(line.split()[-1]) # last part of the line as an > > int > > > > f.next() # skip blank line > > > > data = {} # for each field, map (row, col) to value > > for i in range(20): # read 20 data lines > >line = f.next() > >ix, value = f.split() > >row = int(ix[:-1]) > >col = ix[-1] > >data[row, col] = int(value) > > > > fields[field] = data > > > > f.next() > > except StopIteration: > >pass > > > > Or maybe just (untested) : > > fields = {} # build a dict of fields > for line in open('data.txt') : >if line :# skip blank lines >if line.split()[0] == 'field' : >field = int(line.split()[-1]) >else : >fields[field] = tuple(line.split()) > > ___ > Tutor maillist - Tutor@python.org > http://mail.python.org/mailman/listinfo/tutor > ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] manipulating data
Using, fields = {} for line in open('data.txt') : if line : if line.split()[0] == 'field' : field = int(line.split()[-1]) else : fields[field] = tuple(line.split()) I get, fields[field] = tuple(line.split()) NameError: name 'field' is not defined On Nov 8, 2007 7:34 AM, Ricardo Aráoz <[EMAIL PROTECTED]> wrote: > > Kent Johnson wrote: > > Bryan Fodness wrote: > >> I would like to have my data in a format so that I can create a contour > >> plot. > >> > >> My data is in a file with a format, where there may be multiple fields > >> > >> field = 1 > >> > >> 1a 0 > > > > If your data is really this regular, it is pretty easy to parse. A > > useful technique is to access a file's next method directly. Something > > like this (not tested!): > > > > f = open('data.txt') > > fields = {} # build a dict of fields > > try: > >while True: > > # Get the field line > > line = f.next() > > field = int(line.split()[-1]) # last part of the line as an int > > > > f.next() # skip blank line > > > > data = {} # for each field, map (row, col) to value > > for i in range(20): # read 20 data lines > >line = f.next() > >ix, value = f.split() > >row = int(ix[:-1]) > >col = ix[-1] > >data[row, col] = int(value) > > > > fields[field] = data > > > > f.next() > > except StopIteration: > >pass > > > > Or maybe just (untested) : > > fields = {} # build a dict of fields > for line in open('data.txt') : >if line :# skip blank lines >if line.split()[0] == 'field' : >field = int(line.split()[-1]) >else : >fields[field] = tuple(line.split()) > > ___ > Tutor maillist - Tutor@python.org > http://mail.python.org/mailman/listinfo/tutor > ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] manipulating data
Kent Johnson wrote:
> Bryan Fodness wrote:
>> I would like to have my data in a format so that I can create a contour plot.
>>
>> My data is in a file with a format, where there may be multiple fields
>>
>> field = 1
>>
>> 1a 0
>
> If your data is really this regular, it is pretty easy to parse. A
> useful technique is to access a file's next method directly. Something
> like this (not tested!):
>
> f = open('data.txt')
> fields = {} # build a dict of fields
> try:
>     while True:
>         # Get the field line
>         line = f.next()
>         field = int(line.split()[-1]) # last part of the line as an int
>
>         f.next() # skip blank line
>
>         data = {} # for each field, map (row, col) to value
>         for i in range(20): # read 20 data lines
>             line = f.next()
>             ix, value = line.split()
>             row = int(ix[:-1])
>             col = ix[-1]
>             data[row, col] = int(value)
>
>         fields[field] = data
>
>         f.next()
> except StopIteration:
>     pass

Or maybe just (untested):

    fields = {} # build a dict of fields
    for line in open('data.txt'):
        if line.strip(): # skip blank lines
            if line.split()[0] == 'field':
                field = int(line.split()[-1])
            else:
                fields[field] = tuple(line.split())
Re: [Tutor] manipulating data
I also have some information at the beginning of the file and between each field. Is there a way to get the info at the beginning and tell it, once it sees Leaf 1A, to read the values for the next 120 and then repeat until there are no more Fields?

File Rev = G
Treatment = Dynamic Dose
Last Name = Fodness
First Name = Bryan
Patient ID = 0001
Number of Fields = 4
Number of Leaves = 120
Tolerance = 0.50
Field = 10
Index = 0.
Carriage Group = 1
Operator =
Collimator = 0.0
Leaf 1A = 0.00
Leaf 2A = 0.00
Leaf 3A = 0.00
Leaf 4A = 0.00
...
Leaf 57B = 0.00
Leaf 58B = 0.00
Leaf 59B = 0.00
Leaf 60B = 0.00
Note = 0
Shape = 4 500 500 500 -500 -500 -500 -500 500
Magnification = 1.00
Field = 8
Index = 0.4000
Carriage Group = 1
Operator =
Collimator = 0.0
Leaf 1A = 0.00
Leaf 2A = 0.00
Leaf 3A = 0.00
Leaf 4A = 0.00
...
Leaf 57B = 0.00
Leaf 58B = 0.00
Leaf 59B = 0.00
Leaf 60B = 0.00
Note = 0
Shape = 4 400 400 400 -400 -400 -400 -400 400
Magnification = 1.00

I would like to have a data structure that I can use in one of the graphing utilities (matplotlib?). I probably want to populate an array with values, but I have not figured out how I want to do that yet.

On Nov 7, 2007 8:52 AM, Kent Johnson <[EMAIL PROTECTED]> wrote:
> Bryan Fodness wrote:
>> I would like to have my data in a format so that I can create a contour
>> plot.
>>
>> My data is in a file with a format, where there may be multiple fields
>>
>> field = 1
>>
>> 1a 0
>
> If your data is really this regular, it is pretty easy to parse. A
> useful technique is to access a file's next method directly. Something
> like this (not tested!):
>
> f = open('data.txt')
> fields = {} # build a dict of fields
> try:
>     while True:
>         # Get the field line
>         line = f.next()
>         field = int(line.split()[-1]) # last part of the line as an int
>
>         f.next() # skip blank line
>
>         data = {} # for each field, map (row, col) to value
>         for i in range(20): # read 20 data lines
>             line = f.next()
>             ix, value = line.split()
>             row = int(ix[:-1])
>             col = ix[-1]
>             data[row, col] = int(value)
>
>         fields[field] = data
>
>         f.next()
> except StopIteration:
>     pass
>
> This builds a dict whose keys are field numbers and values are
> themselves dicts mapping (row, col) pairs to a value.
>
>> where,
>>
>> a b a b ab
>> 10 00|00 00|00 00|00
>> 9 00|00 00|00 00|00
>> 8 01|10 00|00 00|00
>> 7 01|10 00|00 00|00
>> 6 01|10 00|00 000111|111000
>> 5 01|10 00|00 000111|111000
>> 4 01|10 00|00 00|00
>> 3 01|10 00|00 00|00
>> 2 00|00 00|00 00|00
>> 1 00|00 00|00 00|00
>
> I guess this is the intended output? Do you want to actually create a
> printed table like this, or some kind of data structure that represents
> the table, or what?
>
> Kent
Re: [Tutor] manipulating data
Bryan Fodness wrote:
> I also have some information at the beginning of the file and between
> each field. Is there a way to get the info at the beginning and tell
> it once it sees Leaf 1A to read the values for the next 120 and then
> repeat until there are no more Fields.

This should be a pretty simple modification to the technique I showed you using f.next(). Just add another loop to process the header fields. If you have variable-length sections then you may have to 'prefetch' the next line, something like this:

    try:
        line = f.next()
        while True:
            if 'Leaf 1A' in line:
                break
            # process header line
            line = f.next()

        # 'line' already contains the next line
        while True:
            # process body line
            line = f.next()
    except StopIteration:
        pass

Kent
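The 'prefetch' idea can be sketched in runnable form as below. This is an illustration only, in Python 3 (where f.next() is spelled next(f)), and it handles a single header-plus-leaves section rather than the repeating layout of the full file. The function name split_header_and_leaves, the choice to store header values as strings, and the sample text are assumptions made for the example.

```python
import io

def split_header_and_leaves(lines):
    """Return (header, leaves): 'key = value' lines before 'Leaf 1A', then the Leaf values."""
    it = iter(lines)
    header, leaves = {}, []
    line = next(it, None)             # prefetch the first line
    # Header loop: stop as soon as the prefetched line is the first leaf.
    while line is not None and not line.startswith('Leaf 1A'):
        key, sep, value = line.partition('=')
        if sep:                       # ignore lines without '='
            header[key.strip()] = value.strip()
        line = next(it, None)
    # 'line' already holds the first body line (or None at end of input).
    while line is not None and line.startswith('Leaf'):
        leaves.append(float(line.split()[-1]))
        line = next(it, None)
    return header, leaves

# Hypothetical excerpt in the layout of the posted file.
sample = io.StringIO(
    "File Rev = G\n"
    "Tolerance = 0.50\n"
    "Field = 10\n"
    "Leaf 1A = 0.00\n"
    "Leaf 2A = 5.00\n"
    "Note = 0\n"
)
header, leaves = split_header_and_leaves(sample)
print(header)
print(leaves)
```

Passing a default to next() avoids the try/except StopIteration wrapper, at the cost of checking for None in each loop condition.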
Re: [Tutor] manipulating data
Bryan Fodness wrote:
> I would like to have my data in a format so that I can create a contour plot.
>
> My data is in a file with a format, where there may be multiple fields
>
> field = 1
>
> 1a 0

If your data is really this regular, it is pretty easy to parse. A useful technique is to access a file's next method directly. Something like this (not tested!):

    f = open('data.txt')
    fields = {} # build a dict of fields
    try:
        while True:
            # Get the field line
            line = f.next()
            field = int(line.split()[-1]) # last part of the line as an int

            f.next() # skip blank line

            data = {} # for each field, map (row, col) to value
            for i in range(20): # read 20 data lines
                line = f.next()
                ix, value = line.split()
                row = int(ix[:-1])
                col = ix[-1]
                data[row, col] = int(value)

            fields[field] = data

            f.next()
    except StopIteration:
        pass

This builds a dict whose keys are field numbers and values are themselves dicts mapping (row, col) pairs to a value.

> where,
>
> a b a b ab
> 10 00|00 00|00 00|00
> 9 00|00 00|00 00|00
> 8 01|10 00|00 00|00
> 7 01|10 00|00 00|00
> 6 01|10 00|00 000111|111000
> 5 01|10 00|00 000111|111000
> 4 01|10 00|00 00|00
> 3 01|10 00|00 00|00
> 2 00|00 00|00 00|00
> 1 00|00 00|00 00|00

I guess this is the intended output? Do you want to actually create a printed table like this, or some kind of data structure that represents the table, or what?

Kent
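A Python 3 rendering of the same approach, where the file's next method is reached through the built-in next(), might look like the sketch below. It assumes exactly the regular layout described in the message (a field line, a blank line, a fixed number of data lines, a blank line). The function name read_fields, the rows_per_field parameter and the sample data are made up so the example can run without a data file.

```python
import io

def read_fields(f, rows_per_field):
    """Parse 'field = n', a blank line, then rows_per_field lines like '3a 5'."""
    fields = {}
    try:
        while True:
            line = next(f)                       # the 'field = n' line
            field = int(line.split()[-1])
            next(f)                              # skip the blank line
            data = {}                            # maps (row, col) to value
            for _ in range(rows_per_field):
                ix, value = next(f).split()      # e.g. '3a' and '5'
                data[int(ix[:-1]), ix[-1]] = int(value)
            fields[field] = data
            next(f)                              # skip the blank line after the block
    except StopIteration:                        # raised by next() at end of input
        pass
    return fields

# Hypothetical two-field sample with four data lines per field.
sample = io.StringIO(
    "field = 1\n\n1a 0\n2a 5\n1b 0\n2b 5\n\n"
    "field = 2\n\n1a 0\n2a 4\n1b 0\n2b 4\n"
)
parsed = read_fields(sample, rows_per_field=4)
print(sorted(parsed))
```

The design leans entirely on the file being regular: any extra or missing line shifts every later read, which is why the later messages in the thread move to testing each line's first word instead.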
[Tutor] manipulating data
I would like to have my data in a format so that I can create a contour plot. My data is in a file with a format, where there may be multiple fields field = 1 1a 0 2a 0 3a 5 4a 5 5a 5 6a 5 7a 5 8a 5 9a 0 10a 0 1b 0 2b 0 3b 5 4b 5 5b 5 6b 5 7b 5 8b 5 9b 0 10b 0 field = 2 1a 0 2a 0 3a 0 4a 4 5a 4 6a 4 7a 4 8a 0 9a 0 10a 0 1b 0 2b 0 3b 0 4b 4 5b 4 6b 4 7b 4 8b 0 9b 0 10b 0 field = 3 1a 0 2a 0 3a 0 4a 0 5a 3 6a 3 7a 0 8a 0 9a 0 10a 0 1b 0 2b 0 3b 0 4b 0 5b 3 6b 3 7b 0 8b 0 9b 0 10b 0 where, a b a b ab 10 00|00 00|00 00|00 9 00|00 00|00 00|00 8 01|10 00|00 00|00 7 01|10 00|00 00|00 6 01|10 00|00 000111|111000 5 01|10 00|00 000111|111000 4 01|10 00|00 00|00 3 01|10 00|00 00|00 2 00|00 00|00 00|00 1 00|00 00|00 00|00 I could possibly have many of these that I will add together and normalize to one. Also, there are 60 a and b blocks, the middle 40 are 0.5 times the width of the outer 20. I thought about filling an array, but there is not a one to one symmetry. I cannot seem to get my head around this. Can anybody help me get started? ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
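One way to cope with the unequal block widths is to compute the edge position of every block once and plot against those coordinates, instead of forcing the values into an evenly spaced array. The sketch below is only a guess at the geometry: it assumes the "outer 20" are 10 full-width blocks at each end with the 40 half-width blocks between them, and it uses a hypothetical width unit of 1.0.

```python
from itertools import accumulate

# Assumed layout: 10 full-width blocks, 40 half-width blocks, 10 full-width blocks.
OUTER_WIDTH = 1.0                      # hypothetical unit; substitute the real width
widths = [OUTER_WIDTH] * 10 + [0.5 * OUTER_WIDTH] * 40 + [OUTER_WIDTH] * 10

# edges[i] and edges[i + 1] bound block i + 1 along the stacking axis.
edges = [0.0] + list(accumulate(widths))
centers = [(lo + hi) / 2 for lo, hi in zip(edges, edges[1:])]

print(len(widths), edges[-1])
```

With edges or centers in hand, each block's value can be paired with a real coordinate, so the lack of one-to-one symmetry between blocks and array cells stops being a problem.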