[EMAIL PROTECTED] writes:
> I have a Unicode text file with 1.6 billion lines (~2GB) that I'd like
> to sort based on first two characters.
>
> I'd greatly appreciate if someone can post sample code that can help
> me do this.

Use the unix sort command:

    sort inputfile -o outputfile

I think there is a cygwin port.

> Also, any ideas on approximately how long is the sort process going to
> take (XP, Dual Core 2.0GHz w/2GB RAM).

Eh, unix sort would probably take a while, somewhere between 15
minutes and an hour.  If you only have to do it once it's not worth
writing special purpose code.  If you have to do it a lot, get some
more ram for that box, suck the file into memory and do a radix sort.
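
For the in-memory route, here is a rough sketch of what a radix sort
on the first two characters might look like in Python.  It is only an
illustration: the function names, the file paths and the utf-8
encoding are my assumptions, not something from the original question,
and Python strings carry enough overhead that a 2GB file will need
considerably more than 2GB of RAM this way.

```python
from collections import defaultdict


def radix_sort_by_prefix(lines, width=2):
    """Stable LSD radix sort of lines keyed on their first `width` characters."""
    # Walk the key positions from last to first.  Each pass is a stable
    # bucket distribution, so the first character ends up most significant.
    for pos in range(width - 1, -1, -1):
        buckets = defaultdict(list)
        for line in lines:
            # A line too short to have this position yields "", which
            # sorts before any real character.
            buckets[line[pos:pos + 1]].append(line)
        lines = [line for key in sorted(buckets) for line in buckets[key]]
    return lines


def sort_file(in_path, out_path, encoding="utf-8"):
    """Hypothetical helper: read everything, sort on the prefix, write it back.

    The encoding is an assumption; use whatever the file really is.
    """
    with open(in_path, encoding=encoding) as src:
        lines = src.readlines()
    with open(out_path, "w", encoding=encoding) as dst:
        dst.writelines(radix_sort_by_prefix(lines))
```

Lines that share the same first two characters keep their original
relative order, which is the same result you would get from
sorted(lines, key=lambda s: s[:2]).  Note that this keys on the first
two characters only, whereas the plain sort command above compares
whole lines.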