On Thu, Mar 12, 2009 at 9:43 AM, Dinesh B Vadhia <dineshbvad...@hotmail.com> wrote: > Have a large number (> 1bn) of integer co-ordinates (i, j). The i are > ordered and the j unordered. > > I want to create (j, i) with j ordered and i unordered ie. > > from: > > ... > 6940, 22886 > 6940, 38277 > 6940, 43788 > ... > > to: > ... > 38277, 567 > 38277, 90023 > 38277, 6940 > ... > > I've tried the dictionary route and it works perfectly for small set of > co-ordinate pairs but not for large sets as it hits memory capacity.
I'm not sure what the dictionary route is. Holding the data in lists might be more memory efficient but you still need enough memory to hold one or more copies of the list. Otherwise a file-based merge sort. I would do something like this: - read or otherwise acquire as many coordinates as you can comfortably reorder and sort - reorder and sort them - write the sorted list to a file, one coordinate pair per line - repeat until all coordinates have been written, using a new file for each group of coordinates - Make an iterator (use a generator function) that will open one of your files and yield individual coordinates. - use this recipe: http://code.activestate.com/recipes/491285/ to sort the coordinates, processing the result however you like (writing to a file?) Kent _______________________________________________ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor