Re: [Tutor] text processing lines variable content
On 07/02/2019 18:06, Peter Otten wrote:
> Sorry, I don't understand the question.

After a quick look, it is not unlike what you propose, but I have to investigate further. The lengths of the chunks are known or can be found (sketchy):

    order  = [%i,%q,%r,%w,%p,%P,%o,%m,%g,%E,%s,%e,%F,%a,%A,%f,%t,%l,%n,%v,%c,%C]
    length = [ 1, 3, 1, 1,%w,%w,%w, 1, 1, 1, 1,%s, 1,%s,%a,%s,%s,%s,%s, 1, 3, 3]

From there, calculate the slices per line:

    slices = {"%i": (0, 1), "%q": (1, 4), "%r": (4, 5)}  # etc.

Modify all functions to accept and deal with the slice tuple; then the action loop gets very simple:

    for points, line in enumerate(open("vorodat.txt.vol", 'r'), 1):
        line = line.strip()
        line = line.split(" ")
        slices = calculate_slices(line)
        for action in order:
            function[action](content[action], slices[action])

Thanks for your time and insight, I'll try a few different ways.

ingo
___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
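The slice pre-calculation described above could look roughly like the following. This is only a sketch under assumptions: the three-field layout, the `lengths` mapping and the signature of `calculate_slices` are made up for illustration and are not taken from the gist. A width is either a fixed integer or the code of an earlier field whose value gives the width.

```python
def calculate_slices(parts, order, lengths):
    """Return {code: (start, stop)} for one split line.

    lengths[code] is either an int (fixed width) or the code of an
    earlier field whose integer value gives the width.
    """
    slices = {}
    counts = {}   # integer values seen so far, keyed by format code
    start = 0
    for code in order:
        spec = lengths[code]
        width = spec if isinstance(spec, int) else counts[spec]
        stop = start + width
        slices[code] = (start, stop)
        if width == 1:
            # A single token may be a count that sizes a later field.
            try:
                counts[code] = int(parts[start])
            except ValueError:
                pass  # not an integer (a float or a vector), so not a count
        start = stop
    return slices


# Hypothetical layout: label, vertex count, then that many vertices.
order = ["%i", "%w", "%p"]
lengths = {"%i": 1, "%w": 1, "%p": "%w"}
parts = "0 2 (1,2,3) (4,5,6)".split()
slices = calculate_slices(parts, order, lengths)
vertices = parts[slice(*slices["%p"])]
```

Because the widths of some fields depend on values earlier in the same line, the slices have to be recomputed for every line; only the fixed-width part could be calculated once.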
Re: [Tutor] text processing lines variable content
ingo janssen wrote:

> On 07/02/2019 11:08, Peter Otten wrote:
>> replace the sequence of tests with dictionary lookups
>
> updated the gist a few times, now I could pre calculate the slices to be
> taken per line, but will there be much gain compared to the chopping from
> the left side of the list?

Sorry, I don't understand the question.

Looking at your code

>     if action == "%i":
>         lbl = function[action](content[action])

you really do not need the function[action] lookup here because you know that the result will always be f_number. Likewise you could bind content["%i"] to a name, labels, say, and then write

    if action == "%i":
        lbl = f_number(labels)

which I find much more readable. A lookup table only makes sense if it provides all necessary information. I tried to apply the idea to one of your gist versions:

    import re
    from functools import partial

    def set_lbl(items):
        global lbl
        lbl = f_number(items)

    def set_w(items):
        global v
        v = f_number(items)

    def set_f(items):
        global f
        f = f_number(items)

    def set_mx(items):
        global mx
        mx = mx_value_array(items, f)

    function = {
        "%i" : set_lbl,
        "%w" : set_w,
        "%s" : set_f,
        "%a" : set_mx,

        "%q" : f_vector,
        "%r" : f_value,
        "%p" : lambda items: f_vector_array(items, v),
        "%P" : lambda items: f_vector_array(items, v),
        "%o" : lambda items: f_value_array(items, v),
        "%m" : f_value,
        "%g" : f_number,
        "%E" : f_value,
        "%e" : lambda items: f_value_array(items, f),
        "%F" : f_value,
        "%A" : lambda items: f_value_array(items, mx + 1),
        "%f" : lambda items: f_value_array(items, f),
        "%t" : lambda items: f_nested_value_array(items, f),
        "%l" : lambda items: f_vector_array(items, f),
        "%n" : lambda items: f_value_array(items, f),
        "%v" : f_value,
        "%c" : f_vector,
        "%C" : f_vector
    }

    order = "%i %q %r %w %p %P %o %m %g %E %s %e %F %a %A %f %t %l %n %v %c %C"
    order = re.findall("%[a-z]", order, re.M|re.I)
    content = {}
    actions = []
    for i in order:
        items = content[i] = []
        # bind the output list now, call without arguments later
        actions.append(partial(function[i], items))

    for points, line in enumerate(open("vorodat.txt.vol", 'r'), 1):
        line = line.strip()
        line = line.split(" ")
        for action in actions:
            action()

However, while the loop is rather clean now, the rest of the code is sprinkled with implicit arguments and thus much worse than what you have.
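One way to keep a clean loop without the implicit arguments is to pass the per-line state explicitly. The following is only a sketch, not the gist's code: `LineState`, `take`, the `do_...` handlers and the sample line are hypothetical names and data invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LineState:
    """Per-line values that later fields depend on (hypothetical container)."""
    vertices: int = 0
    output: dict = field(default_factory=dict)


def take(parts, n):
    """Split off the first n items; return (taken, rest)."""
    return parts[:n], parts[n:]


def do_label(parts, state):
    (label,), rest = take(parts, 1)
    state.output["%i"] = int(label)
    return rest


def do_count(parts, state):
    (count,), rest = take(parts, 1)
    state.vertices = int(count)
    return rest


def do_vertices(parts, state):
    # The width comes from the state object, not from a module-level global.
    chunk, rest = take(parts, state.vertices)
    state.output["%p"] = chunk
    return rest


HANDLERS = {"%i": do_label, "%w": do_count, "%p": do_vertices}


def process(line, order):
    """Apply the handlers in the given order and return the filled state."""
    state = LineState()
    parts = line.split()
    for code in order:
        parts = HANDLERS[code](parts, state)
    return state


state = process("7 2 (1,2,3) (4,5,6)", ["%i", "%w", "%p"])
```

Every handler has the same signature, so the lookup table carries all the information the loop needs, and the dependency of `%p` on `%w` is visible in the code instead of hidden in a global.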
Re: [Tutor] text processing lines variable content
On 07/02/2019 11:08, Peter Otten wrote:
> replace the sequence of tests with dictionary lookups

I updated the gist a few times. Now I could precalculate the slices to be taken per line, but will there be much gain compared to chopping from the left side of the list?

ingo
Re: [Tutor] text processing lines variable content
On 07/02/2019 11:08, Peter Otten wrote:
> Personally I would avoid the NameError and start with empty lists. If you
> manage to wrap all branches into functions with the same signature you can
> replace the sequence of tests with dictionary lookups.

Just before I saw your post I put my current code up here:
https://gist.github.com/ingoogni/e99c561f23777e59a5aa6b4ef5fe37c8

I will study yours,

ingo
Re: [Tutor] text processing lines variable content
On 07/02/2019 10:40, Alan Gauld via Tutor wrote:
> Just saves a little typing is all.

Sensei, be lazy, I will study.

The current state of the code is at
https://gist.github.com/ingoogni/e99c561f23777e59a5aa6b4ef5fe37c8

ingo
Re: [Tutor] text processing lines variable content
ingo janssen wrote:

> On 07/02/2019 09:29, Peter Otten wrote:
>> Where will you get the order from?
>
> Peter,
>
> the order comes from the command line.

Then my one-function-per-format approach won't work.

> I intend to call the python
> program with the same command line options as the Voro++ program. Make
> the python program call the Voro++ and process its output.
>
> one command line option contains a string for formatting the output.
> That is what I use for order.
>
> #all output formatting options
> order = "%i %q %r %w %p %P %o %m %g %E %s %e %F %a %A %f %t %l %n %v %c %C"
> order = re.findall("%[a-z]",order,re.M|re.I)
> for i, line in enumerate(open("vorodat.vol",'r')):
>     points = i
>     line = line.strip()
>     line = line.split(" ")
>     for action in order:
>         if action == "%i":
>             try:
>                 lbl = f_label(label)
>             except NameError as e:
>                 lbl = f_number(label)
>                 label=[lbl]

Personally I would avoid the NameError and start with empty lists. If you manage to wrap all branches into functions with the same signature you can replace the sequence of tests with dictionary lookups. Here's a sketch:

    # the f_...() functions take a parsed line and return a value and the
    # as yet unused rest of the parsed line

    labels = []
    points = []
    ...

    def add_label(parts):
        label, rest = f_label(parts)
        labels.append(label)
        return rest

    def add_point(parts):
        point, rest = f_vector(parts)
        points.append(point)
        return rest

    def set_width(parts):
        global width
        width, rest = f_width(parts)
        return rest

    lookup_actions = {
        "%i": add_label,
        "%q": add_point,
        "%w": set_width,
        ...
    }

    actions = [lookup_actions[action] for action in order]

    with open("vorodat.vol") as instream:
        for num_points, line in enumerate(instream, 1):  # as per Mark's advice
            width = None  # dummy value to provoke error when width
                          # is not explicitly set
            parts = line.split()
            for action in actions:
                parts = action(parts)

>         elif action == "%q":
>             try:
>                 f_vector(point)
>             except NameError as e:
>                 point = [f_vector(point)]
>         elif action == "%r":
>             try:
>                 f_value(radius)
>             except NameError as e:
>                 radius=[f_value(radius)]
> etc.
>
> order is important as %w tells me how long %p, %P and %o will be. This
> varies per line.
>
> I'll look into what you wrote,
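The advice to start with empty lists rather than catching NameError can be illustrated with `collections.defaultdict`. This is a minimal sketch: the two `f_...` functions here are simplified stand-ins written for this example (they return a value plus the unused rest, as in the sketch above), not the functions from the gist.

```python
from collections import defaultdict


# Hypothetical stand-ins for the thread's f_...() helpers: each takes the
# remaining tokens and returns (value, rest).
def f_number(parts):
    return int(parts[0]), parts[1:]


def f_vector(parts):
    x, y, z = parts[:3]
    return f"<{x},{y},{z}>", parts[3:]


content = defaultdict(list)   # every key starts out as an empty list

parsers = {"%i": f_number, "%q": f_vector}
order = ["%i", "%q"]

lines = ["0 1.0 2.0 3.0", "1 4.0 5.0 6.0"]
for line in lines:
    parts = line.split()
    for code in order:
        value, parts = parsers[code](parts)
        content[code].append(value)   # no try/except NameError needed
```

Since every output list exists from the first access, the first line of the file needs no special treatment.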
Re: [Tutor] text processing lines variable content
On 07/02/2019 08:58, ingo janssen wrote:

>             try:
>                 lbl = f_label(label)
>             except NameError as e:
>                 lbl = f_number(label)
>                 label=[lbl]

Just a minor point, but since you aren't doing anything with the error you don't need the 'as e' bit at the end of each except line. It just saves a little typing is all.

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos
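To illustrate the point about 'as e', here is a small sketch with two hypothetical functions (they are not from the original code): the first ignores the exception object, so no name is bound; the second binds it because it uses it.

```python
def parse_count(text):
    """Return text as an int, or None when it is not a whole number."""
    try:
        return int(text)
    except ValueError:          # nothing is done with the exception object
        return None


def parse_count_verbose(text):
    """Same as parse_count, but report why the conversion failed."""
    try:
        return int(text)
    except ValueError as err:   # bind it only when it is actually used
        print(f"skipping {text!r}: {err}")
        return None
```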
Re: [Tutor] text processing lines variable content
On 07/02/2019 09:58, ingo janssen wrote:
> On 07/02/2019 09:29, Peter Otten wrote:
>> Where will you get the order from?

Argh, that should have been:

    #all output formatting options
    order = "%i %q %r %w %p %P %o %m %g %E %s %e %F %a %A %f %t %l %n %v %c %C"
    order = re.findall("%[a-z]",order,re.M|re.I)
    for i, line in enumerate(open("vorodat.vol",'r')):
        points = i
        line = line.strip()
        line = line.split(" ")
        for action in order:
            if action == "%i":
                try:
                    lbl = f_label(label)
                except NameError as e:
                    label=[]
                    lbl = f_number(label)
            elif action == "%q":
                try:
                    f_vector(point)
                except NameError as e:
                    point = []
                    f_vector(point)
            elif action == "%r":
                try:
                    f_value(radius)
                except NameError as e:
                    radius = []
                    f_value(radius)
Re: [Tutor] text processing lines variable content
On 07/02/2019 09:29, Peter Otten wrote:
> Where will you get the order from?

Peter,

the order comes from the command line. I intend to call the Python program with the same command line options as the Voro++ program, and have the Python program call Voro++ and process its output.

One command line option contains a string for formatting the output. That is what I use for order.

    #all output formatting options
    order = "%i %q %r %w %p %P %o %m %g %E %s %e %F %a %A %f %t %l %n %v %c %C"
    order = re.findall("%[a-z]",order,re.M|re.I)
    for i, line in enumerate(open("vorodat.vol",'r')):
        points = i
        line = line.strip()
        line = line.split(" ")
        for action in order:
            if action == "%i":
                try:
                    lbl = f_label(label)
                except NameError as e:
                    lbl = f_number(label)
                    label=[lbl]
            elif action == "%q":
                try:
                    f_vector(point)
                except NameError as e:
                    point = [f_vector(point)]
            elif action == "%r":
                try:
                    f_value(radius)
                except NameError as e:
                    radius=[f_value(radius)]
            # etc.

Order is important as %w tells me how long %p, %P and %o will be. This varies per line.

I'll look into what you wrote,

thanks,

ingo
Re: [Tutor] text processing lines variable content
ingo janssen wrote:

> depending on how the input file is created data packet a can be in an
> other position for every line.
> figured out how to do it though
>
> order=[a,b,e,d...]
> for i in lines:
>     i=i.split(" ")
>     for j in order:
>         if j == a:
>             # use function for processing data chunk a
>         elif j == b:
>             # use proper function for processing data type b
>         ...

Where will you get the order from? If you plan to specify it manually, e. g.

    lookup_steps = {
        "foo": [a, b, c, ...],
        "bar": [a, a, f, ...],
    }
    fileformat = sys.argv[1]
    steps = lookup_steps[fileformat]
    ...
    for line in lines:
        for step in steps:
            if step == a:
                ...
            elif step == b:
                ...

then I recommend storing one function per file format instead:

    def process_foo(line):
        ...  # process one line in foo format

    def process_bar(line):
        ...

    lineprocessors = {
        "foo": process_foo,
        "bar": process_bar,
    }
    fileformat = sys.argv[1]
    process = lineprocessors[fileformat]
    ...
    for line in lines:
        process(line)

That way you deal with Python functions instead of a self-invented minilanguage.
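A runnable version of the one-function-per-format idea might look as follows. It is a sketch only: the two layouts ("foo" with the label first, "bar" with the radius first) and the helper `process_lines` are invented for illustration, and the format name is passed as an argument here instead of being read from sys.argv.

```python
def process_foo(line):
    """Hypothetical 'foo' layout: label followed by a radius."""
    label, radius = line.split()
    return {"label": int(label), "radius": float(radius)}


def process_bar(line):
    """Hypothetical 'bar' layout: radius first, then label."""
    radius, label = line.split()
    return {"label": int(label), "radius": float(radius)}


LINE_PROCESSORS = {"foo": process_foo, "bar": process_bar}


def process_lines(fileformat, lines):
    """Look the processor up once, then apply it to every line."""
    try:
        process = LINE_PROCESSORS[fileformat]
    except KeyError:
        raise ValueError(f"unknown file format: {fileformat!r}") from None
    return [process(line) for line in lines]


records = process_lines("bar", ["0.5 0", "0.7 1"])
```

The lookup happens once per file, so the per-line loop contains no tests at all.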
Re: [Tutor] text processing lines variable content
On 06/02/2019 21:45, Mark Lawrence wrote:
> So what, you still don't need to chop the front from the list, just process the data.

just slice

>>>> I'd like to adapt the order in that the functions are applied, but how?
>>> I suspect that you're trying to over complicate things, what's wrong with a simple if/elif chain, a switch based on a dict or similar?
>> You mean create a list with the order=[a,b,e,d...]
> Again I've no idea what you're saying here.

Depending on how the input file is created, data packet a can be in another position for every line. I figured out how to do it though:

    order=[a,b,e,d...]
    for i in lines:
        i=i.split(" ")
        for j in order:
            if j == a:
                # use function for processing data chunk a
            elif j == b:
                # use proper function for processing data type b
            ...

>> I don't know beforehand how many lines I have.
> Now you tell us :-(

sorry

>> then loop order=[a,b,e,d...] for each line
> What has a loop order got to do with using a dict?

The order of the data chunks varies per file.

> Why bother, just have a list of lists and index on the position, or are we talking at cross purposes?

Sorry for the amount of text below, I hope it clarifies one line of space delimited input data: 0 1094.82 0.1 582.419 0.5 14 (0.200231,1.13714,-8.35338) (-10.2097,1.13714,-4.05001) (-10.2097,-14.3466,-4.05001) (-2.4419,-39.895,9.65513) (-0.382375,-100.1,7.27361) (0.200231,-100.1,-8.35338) (-2.43137,1.58294,9.64296) (-10.1818,1.514,-4.00085) (-2.4419,1.51399,9.65513) (3.73705,-100.1,2.51013) (0.220825,1.58294,-8.29013) (-6.42082,-100.1,-5.61629) (-10.1626,1.58294,-3.9977) (3.73705,1.58294,2.51013) (1095.02,1.23714,574.066) (1084.61,1.23714,578.369) (1084.61,-14.2466,578.369) (1092.38,-39.795,592.074) (1094.44,-100,589.693) (1095.02,-100,574.066) (1092.39,1.68294,592.062) (1084.64,1.614,578.418) (1092.38,1.61399,592.074) (1098.56,-100,584.929) (1095.04,1.68294,574.129) (1088.4,-100,576.803) (1084.66,1.68294,578.421) (1098.56,1.68294,584.929) 3 3 3 3 3 3 3 3 3 3 3 3 3 3 10092.8 21 550.726 9 23.4034 221.001 102.986 190.388 219.178 39.1211 226.154 47.7032 31.5186 4765.01 5 5 5 4 6 4 5 4 4 0 0 0 0 4 4 1 5.07336 964.581 451.085 1100.75 865.736 81.7357 1161.69 133.262 1.10745 (1,0,10,12,7) (1,2,11,5,0) (1,7,8,3,2) (2,3,4,11) (3,8,6,13,9,4) (4,9,5,11) (5,9,13,10,0) (6,12,10,13) (6,8,7,12) (-0.377877,0.147157,-0.914086) (-0.382036,2.8913e-18,-0.924147) (-0.869981,0,0.493086) (-0.904528,-0.0477043,0.423738) (0.75639,-5.72053e-15,0.654121) (-0,-1,0) (0.950875,4.0561e-18,-0.309575) (-5.99268e-17,1,-1.44963e-16) (-0.849681,0.21476,0.481581) 9205 9105 3062 9946 5786 -3 1483 100 3262 11680.5 -2.00777 -44.9048 -0.428504 1092.81 -44.8048 581.99 this one line as it is in the output file. 
For a file with 10 lines the outer arrays will be 10 items long: #declare Labels = array[0]{0} #declare Points = array[0]{<1094.82,0.1,582.419>} #declare Radii = array[0]{0.5} #declare NumVertices = array[0]{14} #declare RelVertices = array[0]{ //label: 0 array[14]{ <0.200231,1.13714,-8.35338>,<-10.2097,1.13714,-4.05001>,<-10.2097,-14.3466,-4.05001>,<-2.4419,-39.895,9.65513>,<-0.382375,-100.1,7.27361>,<0.200231,-100.1,-8.35338>,<-2.43137,1.58294,9.64296>,<-10.1818,1.514,-4.00085>,<-2.4419,1.51399,9.65513>,<3.73705,-100.1,2.51013>,<0.220825,1.58294,-8.29013>,<-6.42082,-100.1,-5.61629>,<-10.1626,1.58294,-3.9977>,<3.73705,1.58294,2.51013> } } #declare GlobalVertices = array[0]{ //label: 0 array[14]{ <1095.02,1.23714,574.066>,<1084.61,1.23714,578.369>,<1084.61,-14.2466,578.369>,<1092.38,-39.795,592.074>,<1094.44,-100,589.693>,<1095.02,-100,574.066>,<1092.39,1.68294,592.062>,<1084.64,1.614,578.418>,<1092.38,1.61399,592.074>,<1098.56,-100,584.929>,<1095.04,1.68294,574.129>,<1088.4,-100,576.803>,<1084.66,1.68294,578.421>,<1098.56,1.68294,584.929> } } #declare MaxRadius = array[0]{10092.8} #declare NumEdges = array[0]{21} #declare EdgeDistance = array[0]{550.726} #declare NumFaces = array[0]{9} #declare FacePerimeter = array[0]{ //label: 0 array[9]{23.4034,221.001,102.986,190.388,219.178,39.1211,226.154,47.7032,31.5186} } #declare SurfaceArea = array[0]{4765.01} #declare FacesOrders = array[0]{ //label: 0 array[9]{5,5,5,4,6,4,5,4,4} } #declare FreqFaces = array[0]{ //label: 0 array[7]{0,0,0,0,4,4,1} } #declare FaceArea = array[0]{ //label: 0 array[9]{5.07336,964.581,451.085,1100.75,865.736,81.7357,1161.69,133.262,1.10745} } #declare FaceVerticesIndex = array[0]{ //label: 0 array[9]{ array[5]{1,0,10,12,7}, array[5]{1,2,11,5,0}, array[5]{1,7,8,3,2}, array[4]{2,3,4,11}, array[6]{3,8,6,13,9,4}, array[4]{4,9,5,11}, array[5]{5,9,13,10,0}, array[4]{6,12,10,13}, array[4]{6,8,7,12}, } } #declare FaceNormal = array[0]{ //label: 0 array[9]{ 
<-0.377877,0.147157,-0.914086>,<-0.382036,2.8913e-18,-0.924147>,<-0.869981,0,0.493086>,<-0.904528,-0.0477043,0.423738>,<0.75639,-5.72053e-15,0.654121>,<-0,-1,0>,<0.950875,4.0561e-18,-0.30957
Re: [Tutor] text processing lines variable content
On 06/02/2019 18:51, ingo janssen wrote:
> On 06/02/2019 19:07, Mark Lawrence wrote:
>> That's going to be a lot of work slicing and dicing the input lists. Perhaps a chunked recipe like this https://more-itertools.readthedocs.io/en/stable/api.html#more_itertools.chunked would be better.
>
> The length of the text chunks varies from a single character to a list of ~30 3D vectors.

So what, you still don't need to chop the front from the list, just process the data.

>>> I'd like to adapt the order in that the functions are applied, but how?
>> I suspect that you're trying to over complicate things, what's wrong with a simple if/elif chain, a switch based on a dict or similar?
>
> You mean create a list with the order=[a,b,e,d...]
>
>     if a in order:
>         f_vector_array(a, 3)
>     elif b in order:
>         f_value(max_radius)
>
> that would run the proper function, but not in the right order?

Again I've no idea what you're saying here.

>>> for i, line in enumerate(open("vorodat.vol",'r')):
>>>     points = i+1
>> enumerate takes a start argument so you shouldn't need the above line.
>
> points is needed later on in the program and I don't know beforehand how many lines I have.

Now you tell us :-(

>>> I thought about putting the functions in a dict and then create a list with the proper order, but can't get it to work.
>> Please show us your code and exactly why it didn't work.
>
>     def f_vector_array(outlist, length):
>         rv = pop_left_slice(line, length)
>         rv = [f'<{i[1:-1]}>' for i in rv]  #i format is: '(1.234,2.345,3.456)'
>         rv = ",".join(rv)
>         outlist.append(f" //label: {lbl}\n array[{length}]"+"{\n "+rv+"\n }\n")
>
>     functions={
>         'a':f_number(num_vertex),
>         'b':f_vector_array(rel_vertex,v)
>     }
>
> where rel_vertex is the list to move the processed data to and v the amount of text to chop off the front of the line. v is not known when defining the dictionary. v comes from another function, v=f_number(num_vertex), that also should live in the dict.

You don't need to specify the parameters in the dict, just give the function name.

> then loop order=[a,b,e,d...] for each line

What has a loop order got to do with using a dict?

>> I'm not absolutely sure what you're saying here, but would something like the SortedList from http://www.grantjenks.com/docs/sortedcontainers/ help?
>
> Maybe this explains it better, assume the split input lines:
>
>     line1=[a,b,c,d,e,f,...]
>     line2=[a,b,c,d,e,f,...]
>     line3=[a,b,c,d,e,f,...]
>     ...
>     line10=...
>
> all data on position a should go to list a
>
>     a=[a1,a2,a3,...a_n]
>     b=[b1,b2,b3,...b_n]
>     c=[c1,c2,c3,...c_n]
>     etc.
>
> this is what for example the function f_vector_array(a, 3) does.

Why bother, just have a list of lists and index on the position, or are we talking at cross purposes?

> All these lists have to be written to a single file, each list contains 10 items. Instead of keeping it all in memory I could write a1 to a temp file A instead of putting it in a list first and b1 to a temp file B etc. in the next loop a2 to file A, b2 to file B etc. When all lines are processed combine the files A,B,C ... to a single file. Or is there a more practical way? Speed is not important.

What is your definition of "combine the files A,B,C ... to a single file"?

> ingo

--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence
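The remark about giving only the function name in the dict can be sketched like this. The two helpers are simplified, hypothetical versions (they take the remaining tokens and return a value plus the rest) and are not the functions from the original program.

```python
def f_number(parts):
    """Hypothetical helper: consume one token, return (int, rest)."""
    return int(parts[0]), parts[1:]


def f_vector_array(parts, length):
    """Hypothetical helper: consume `length` tokens, return (list, rest)."""
    return parts[:length], parts[length:]


# Writing f_number(parts) inside the dict literal would call the function
# while the dict is being built and store its return value; it would also
# need v before v exists:
#     functions = {'a': f_number(parts), 'b': f_vector_array(parts, v)}

# Store the function objects themselves; supply arguments at call time.
functions = {"a": f_number, "b": f_vector_array}

parts = "2 (1,2,3) (4,5,6) 9".split()
v, parts = functions["a"](parts)             # v becomes known here
vertices, parts = functions["b"](parts, v)   # and is passed in here
```

Parentheses after a name call the function immediately; the bare name is just a reference that can be stored and called later.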
Re: [Tutor] text processing lines variable content
On 06/02/2019 19:07, Mark Lawrence wrote:
> That's going to be a lot of work slicing and dicing the input lists. Perhaps a chunked recipe like this https://more-itertools.readthedocs.io/en/stable/api.html#more_itertools.chunked would be better.

The length of the text chunks varies from a single character to a list of ~30 3D vectors.

>> I'd like to adapt the order in that the functions are applied, but how?
> I suspect that you're trying to over complicate things, what's wrong with a simple if/elif chain, a switch based on a dict or similar?

You mean create a list with the order=[a,b,e,d...]

    if a in order:
        f_vector_array(a, 3)
    elif b in order:
        f_value(max_radius)

that would run the proper function, but not in the right order?

>> for i, line in enumerate(open("vorodat.vol",'r')):
>>     points = i+1
> enumerate takes a start argument so you shouldn't need the above line.

points is needed later on in the program and I don't know beforehand how many lines I have.

>> I thought about putting the functions in a dict and then create a list with the proper order, but can't get it to work.
> Please show us your code and exactly why it didn't work.

    def f_vector_array(outlist, length):
        rv = pop_left_slice(line, length)
        rv = [f'<{i[1:-1]}>' for i in rv]  #i format is: '(1.234,2.345,3.456)'
        rv = ",".join(rv)
        outlist.append(f" //label: {lbl}\n array[{length}]"+"{\n "+rv+"\n }\n")

    functions={
        'a':f_number(num_vertex),
        'b':f_vector_array(rel_vertex,v)
    }

where rel_vertex is the list to move the processed data to and v the amount of text to chop off the front of the line. v is not known when defining the dictionary. v comes from another function, v=f_number(num_vertex), that also should live in the dict.

then loop order=[a,b,e,d...] for each line

> I'm not absolutely sure what you're saying here, but would something like the SortedList from http://www.grantjenks.com/docs/sortedcontainers/ help?

Maybe this explains it better, assume the split input lines:

    line1=[a,b,c,d,e,f,...]
    line2=[a,b,c,d,e,f,...]
    line3=[a,b,c,d,e,f,...]
    ...
    line10=...

all data on position a should go to list a

    a=[a1,a2,a3,...a_n]
    b=[b1,b2,b3,...b_n]
    c=[c1,c2,c3,...c_n]
    etc.

this is what for example the function f_vector_array(a, 3) does.

All these lists have to be written to a single file, each list contains 10 items. Instead of keeping it all in memory I could write a1 to a temp file A instead of putting it in a list first and b1 to a temp file B etc. In the next loop a2 goes to file A, b2 to file B etc. When all lines are processed, combine the files A,B,C ... to a single file. Or is there a more practical way? Speed is not important.

ingo
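The temp-file idea could be sketched with the standard `tempfile` and `shutil` modules. Everything specific here is an assumption made for illustration: the function name `write_columns`, the `// name` comment written before each block, and the reading of "combine" as plain concatenation in the order A, B, C.

```python
import os
import shutil
import tempfile


def write_columns(rows, names, out_path):
    """Stream each column to its own temp file, then concatenate them."""
    temps = {name: tempfile.TemporaryFile("w+") for name in names}
    try:
        # One pass over the input: item k of every row goes to temp file k.
        for row in rows:
            for name, value in zip(names, row):
                temps[name].write(f"{value}\n")
        # Combine: copy the temp files one after another into the output.
        with open(out_path, "w") as out:
            for name in names:
                out.write(f"// {name}\n")
                temps[name].seek(0)
                shutil.copyfileobj(temps[name], out)
    finally:
        for handle in temps.values():
            handle.close()   # TemporaryFile objects are deleted on close


out_path = os.path.join(tempfile.mkdtemp(), "combined.inc")
write_columns([("a1", "b1"), ("a2", "b2")], ["A", "B"], out_path)
```

Only one row is held in memory at a time, at the cost of one open file per column; with at most 22 columns that stays well within normal limits on open files.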
Re: [Tutor] text processing lines variable content
On 06/02/2019 16:33, ingo janssen wrote:
> For parsing the output of the Voro++ program and writing the data to a POV-Ray include file I created a bunch of functions.
>
>     def pop_left_slice(inputlist, length):
>         outputlist = inputlist[0:length]
>         del inputlist[:length]
>         return outputlist

That's going to be a lot of work slicing and dicing the input lists. Perhaps a chunked recipe like this https://more-itertools.readthedocs.io/en/stable/api.html#more_itertools.chunked would be better.

> this is used by every function to chop off the required part of the input line. Two examples of the functions that process a chopped-off slice of the line and append the data to the appropriate list.
>
>     def f_vector(outlist):
>         x,y,z = pop_left_slice(line,3)
>         outlist.append(f"<{x},{y},{z}>,")
>
>     def f_vector_array(outlist, length):
>         rv = pop_left_slice(line, length)
>         rv = [f'<{i[1:-1]}>' for i in rv]  #i format is: '(1.234,2.345,3.456)'
>         rv = ",".join(rv)
>         outlist.append(f" //label: {lbl}\n array[{length}]"+"{\n "+rv+"\n }\n")
>
> Every line can contain up to 21 data chunks. Within one file each line contains the same amount of chunks, but it varies between files. The types of chunks vary and their position varies. I know beforehand how a line in a file is constructed.
>
> I'd like to adapt the order in that the functions are applied, but how?

I suspect that you're trying to over complicate things, what's wrong with a simple if/elif chain, a switch based on a dict or similar?

> for i, line in enumerate(open("vorodat.vol",'r')):
>     points = i+1

enumerate takes a start argument so you shouldn't need the above line.

>     line = line.strip()
>     line = line.split(" ")
>     lbl = f_label(label)
>     f_vector(point)

Presumably the above is points?

>     f_value(radius)
>     v=f_number(num_vertex)
>     f_vector_array(rel_vertex,v)
>     f_vector_array(glob_vertex,v)
>     f_value_array(vertex_orders,v)
>     f_value(max_radius)
>     e=f_number(num_edge)
>     f_value(edge_dist)
>     ...etc
>
> I thought about putting the functions in a dict and then create a list with the proper order, but can't get it to work.

Please show us your code and exactly why it didn't work.

> A second question, all this works for small files with hundreds of lines, but some have 10. Then I can get at max 22 lists with 10 items. Not fun. I tried writing the data to a file "out of sequence", not fun either. What would be the way to do this?
>
> I thought about writing each data chunk to a proper temporary file instead of putting it in a list first. This would require at max 22 temp files and then a merge of the files into one.

I'm not absolutely sure what you're saying here, but would something like the SortedList from http://www.grantjenks.com/docs/sortedcontainers/ help?

> TIA,
>
> ingo

--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence
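Two of the suggestions above can be sketched with the standard library alone. The first part uses `itertools.islice` on an iterator to take variable-length pieces without deleting from the front of a list; this is a swapped-in alternative to the `more_itertools.chunked` recipe mentioned above (which yields fixed-size chunks), and the helper `take` and the shortened sample line are made up for illustration. The second part shows the `start` argument of `enumerate`.

```python
from itertools import islice


def take(tokens, n):
    """Return the next n items from an iterator as a list."""
    return list(islice(tokens, n))


line = "0 1094.82 0.1 582.419 0.5 2 (1,2,3) (4,5,6)"
tokens = iter(line.split(" "))   # the iterator remembers its position

label, = take(tokens, 1)
x, y, z = take(tokens, 3)
radius, = take(tokens, 1)
count = int(take(tokens, 1)[0])
vertices = take(tokens, count)   # width taken from the preceding count

# enumerate with a start value: after the loop, points is the line count.
points = 0                       # stays 0 if there are no lines at all
for points, text in enumerate(["first", "second", "third"], 1):
    pass
```

The iterator is consumed as the fields are read, so each `take` continues where the previous one stopped and the original list is never modified.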