RE: How to write fast into a file in python?
Oh well! Just got a flashback from the old times at the 8-bit assembly line. Dirty deeds done dirt cheap! lol Date: Sun, 19 May 2013 16:44:55 +0100 From: pyt...@mrabarnett.plus.com To: python-list@python.org Subject: Re: How to write fast into a file in python? On 19/05/2013 04:53, Carlos Nepomuceno wrote: Date: Sat, 18 May 2013 22:41:32 -0400 From: da...@davea.name To: python-list@python.org Subject: Re: How to write fast into a file in python? On 05/18/2013 01:00 PM, Carlos Nepomuceno wrote: Python really writes '\n\r' on Windows. Just check the files. That's backwards. '\r\n' on Windows, IF you omit the b in the mode when creating the file. Indeed! My mistake just made me find out that Acorn used that inversion on Acorn MOS. According to this[1] (at page 449) the OSNEWL routine outputs '\n\r'. What the hell those guys were thinking??? :p Doing it that way saved a few bytes. Code was something like this: FFE3 .OSASCI CMP #0D FFE5 BNE OSWRCH FFE7 .OSNEWL LDA #0A FFE9 JSR OSWRCH FFEC LDA #0D FFEE .OSWRCH ... This means that the contents of the accumulator would always be preserved by a call to OSASCI. OSNEWL This call issues an LF CR (line feed, carriage return) to the currently selected output stream. The routine is entered at FFE7. [1] http://regregex.bbcmicro.net/BPlusUserGuide-1.07.pdf -- http://mail.python.org/mailman/listinfo/python-list -- http://mail.python.org/mailman/listinfo/python-list
Re: How to write fast into a file in python?
On Sun, May 19, 2013 at 3:31 PM, Carlos Nepomuceno carlosnepomuc...@outlook.com wrote: Thanks Dan! I've never used CPython or PyPy. Will try them later. CPython is the classic interpreter, written in C. It's the one you'll get from the obvious download links on python.org. ChrisA -- http://mail.python.org/mailman/listinfo/python-list
RE: How to write fast into a file in python?
ooops! I meant to say Cython. nevermind... Date: Sun, 19 May 2013 19:21:54 +1000 Subject: Re: How to write fast into a file in python? From: ros...@gmail.com To: python-list@python.org On Sun, May 19, 2013 at 3:31 PM, Carlos Nepomuceno carlosnepomuc...@outlook.com wrote: Thanks Dan! I've never used CPython or PyPy. Will try them later. CPython is the classic interpreter, written in C. It's the one you'll get from the obvious download links on python.org. ChrisA -- http://mail.python.org/mailman/listinfo/python-list -- http://mail.python.org/mailman/listinfo/python-list
Re: How to write fast into a file in python?
On 19/05/2013 04:53, Carlos Nepomuceno wrote: Date: Sat, 18 May 2013 22:41:32 -0400 From: da...@davea.name To: python-list@python.org Subject: Re: How to write fast into a file in python? On 05/18/2013 01:00 PM, Carlos Nepomuceno wrote: Python really writes '\n\r' on Windows. Just check the files. That's backwards. '\r\n' on Windows, IF you omit the b in the mode when creating the file. Indeed! My mistake just made me find out that Acorn used that inversion on Acorn MOS. According to this[1] (at page 449) the OSNEWL routine outputs '\n\r'. What the hell those guys were thinking??? :p Doing it that way saved a few bytes. Code was something like this: FFE3.OSASCI CMP #0D FFE5BNE OSWRCH FFE7.OSNEWL LDA #0A FFE9JSR OSWRCH FFECLDA #0D FFEE.OSWRCH ... This means that the contents of the accumulator would always be preserved by a call to OSASCI. OSNEWL This call issues an LF CR (line feed, carriage return) to the currently selected output stream. The routine is entered at FFE7. [1] http://regregex.bbcmicro.net/BPlusUserGuide-1.07.pdf -- http://mail.python.org/mailman/listinfo/python-list
Re: How to write fast into a file in python?
Carlos Nepomuceno carlosnepomuc...@outlook.com wrote: Python really writes '\n\r' on Windows. Just check the files. It actually writes \r\n, but it's not Python that's doing it. It's the C runtime library. And, of course, you can eliminate all of that by opening the file in binary mode open(name,'wb'). -- Tim Roberts, t...@probo.com Providenza Boekelheide, Inc. -- http://mail.python.org/mailman/listinfo/python-list
RE: How to write fast into a file in python?
On 17 May 2013 19:38, Carlos Nepomuceno carlosnepomuc...@outlook.com wrote: Think the following update will make the code more portable: x += len(line)+len(os.linesep)-1 Not sure if it's the fastest way to achieve that. :/ Putting len(os.linesep)'s value into a local variable will make accessing it quite a bit faster. But why would you want to do that? You mentioned \n translating to two lines, but this won't happen. Windows will not mess with what you write to your file. It's just that traditionally windows and windows programs use \r\n instead of just \n. I think it was for compatibility with os/2 or macintosh (I don't remember which), which used \r for newlines. You don't have to follow this convention. If you open a \n-separated file with *any* text editor other than notepad, your newlines will be okay. -- http://mail.python.org/mailman/listinfo/python-list
Re: How to write fast into a file in python?
Steven D'Aprano於 2013年5月18日星期六UTC+8下午12時01分13秒寫道: On Fri, 17 May 2013 21:18:15 +0300, Carlos Nepomuceno wrote: I thought there would be a call to format method by '%d\n' % i. It seems the % operator is a lot faster than format. I just stopped using it because I read it was going to be deprecated. :( Why replace such a great and fast operator by a slow method? I mean, why format is been preferred over %? That is one of the most annoying, pernicious myths about Python, probably second only to the GIL makes Python slow (it actually makes it fast). The print function in python is designed to print any printable object with a valid string representation. The format part of the print function has to construct a printable string according to the format string and the variables passed in on the fly. If the acctual string to be printed in the format processing is obtained in the high level OOP way , then it is definitely slow due to the high level overheads and generality requirements. -- http://mail.python.org/mailman/listinfo/python-list
Re: How to write fast into a file in python?
On Sat, May 18, 2013 at 5:49 PM, Fábio Santos fabiosantos...@gmail.com wrote: Putting len(os.linesep)'s value into a local variable will make accessing it quite a bit faster. But why would you want to do that? You mentioned \n translating to two lines, but this won't happen. Windows will not mess with what you write to your file. It's just that traditionally windows and windows programs use \r\n instead of just \n. I think it was for compatibility with os/2 or macintosh (I don't remember which), which used \r for newlines. You don't have to follow this convention. If you open a \n-separated file with *any* text editor other than notepad, your newlines will be okay. Into two characters, not two lines, but yes. A file opened in text mode on Windows will have its lines terminated with two characters. (And it's old Macs that used to use \r. OS/2 follows the DOS convention of \r\n, but again, many apps these days are happy with Unix newlines there too.) ChrisA -- http://mail.python.org/mailman/listinfo/python-list
RE: How to write fast into a file in python?
Python really writes '\n\r' on Windows. Just check the files. Internal representations only keep '\n' for simplicity, but if you wanna keep track of the file length you have to take that into account. ;) Date: Sat, 18 May 2013 08:49:55 +0100 Subject: RE: How to write fast into a file in python? From: fabiosantos...@gmail.com To: carlosnepomuc...@outlook.com CC: python-list@python.org On 17 May 2013 19:38, Carlos Nepomuceno carlosnepomuc...@outlook.commailto:carlosnepomuc...@outlook.com wrote: Think the following update will make the code more portable: x += len(line)+len(os.linesep)-1 Not sure if it's the fastest way to achieve that. :/ Putting len(os.linesep)'s value into a local variable will make accessing it quite a bit faster. But why would you want to do that? You mentioned \n translating to two lines, but this won't happen. Windows will not mess with what you write to your file. It's just that traditionally windows and windows programs use \r\n instead of just \n. I think it was for compatibility with os/2 or macintosh (I don't remember which), which used \r for newlines. You don't have to follow this convention. If you open a \n-separated file with *any* text editor other than notepad, your newlines will be okay. -- http://mail.python.org/mailman/listinfo/python-list
Re: How to write fast into a file in python?
With CPython 2.7.3: ./t time taken to write a file of size 52428800 is 15.86 seconds time taken to write a file of size 52428800 is 7.91 seconds time taken to write a file of size 52428800 is 9.64 seconds With pypy-1.9: ./t time taken to write a file of size 52428800 is 3.708232 seconds time taken to write a file of size 52428800 is 4.868304 seconds time taken to write a file of size 52428800 is 1.93612 seconds Here's the code: #!/usr/local/pypy-1.9/bin/pypy #!/usr/bin/python import sys import time import StringIO sys.path.insert(0, '/usr/local/lib') import bufsock def create_file_numbers_old(filename, size): start = time.clock() value = 0 with open(filename, w) as f: while f.tell() size: f.write({0}\n.format(value)) value += 1 end = time.clock() print time taken to write a file of size, size, is , (end -start), seconds \n def create_file_numbers_bufsock(filename, intended_size): start = time.clock() value = 0 with open(filename, w) as f: bs = bufsock.bufsock(f) actual_size = 0 while actual_size intended_size: string = {0}\n.format(value) actual_size += len(string) + 1 bs.write(string) value += 1 bs.flush() end = time.clock() print time taken to write a file of size, intended_size, is , (end -start), seconds \n def create_file_numbers_file_like(filename, intended_size): start = time.clock() value = 0 with open(filename, w) as f: file_like = StringIO.StringIO() actual_size = 0 while actual_size intended_size: string = {0}\n.format(value) actual_size += len(string) + 1 file_like.write(string) value += 1 file_like.seek(0) f.write(file_like.read()) end = time.clock() print time taken to write a file of size, intended_size, is , (end -start), seconds \n create_file_numbers_old('output.txt', 50 * 2**20) create_file_numbers_bufsock('output2.txt', 50 * 2**20) create_file_numbers_file_like('output3.txt', 50 * 2**20) On Thu, May 16, 2013 at 9:35 PM, lokeshkopp...@gmail.com wrote: On Friday, May 17, 2013 8:50:26 AM UTC+5:30, lokesh...@gmail.com wrote: I need to write numbers into a file upto 50mb and it should be fast can any one help me how to do that? i had written the following code.. --- def create_file_numbers_old(filename, size): start = time.clock() value = 0 with open(filename, w) as f: while f.tell() size: f.write({0}\n.format(value)) value += 1 end = time.clock() print time taken to write a file of size, size, is , (end -start), seconds \n -- it takes about 20sec i need 5 to 10 times less than that. size = 50mb -- http://mail.python.org/mailman/listinfo/python-list -- http://mail.python.org/mailman/listinfo/python-list
Re: How to write fast into a file in python?
In article mailman.1813.1368904489.3114.python-l...@python.org, Dennis Lee Bieber wlfr...@ix.netcom.com wrote: tOn Sat, 18 May 2013 08:49:55 +0100, Fábio Santos fabiosantos...@gmail.com declaimed the following in gmane.comp.python.general: You mentioned \n translating to two lines, but this won't happen. Windows will not mess with what you write to your file. It's just that traditionally windows and windows programs use \r\n instead of just \n. I think it was for compatibility with os/2 or macintosh (I don't remember which), which used \r for newlines. Neither... It goes back to Teletype machines where one sent a carriage return to move the printhead back to the left, then sent a line feed to advance the paper (while the head was still moving left), and in some cases also provided a rub-out character (a do-nothing) to add an additional character time delay. The delay was important. It took more than one character time for the print head to get back to the left margin. If you kept sending printable characters while the print head was still flying back, they would get printed in the middle of the line (perhaps blurred a little). There was also a dashpot which cushioned the head assembly when it reached the left margin. Depending on how well adjusted things were this might take another character time or two to fully settle down. You can still see the remnants of this in modern Unix systems: $ stty -a speed 9600 baud; rows 40; columns 136; line = 0; intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D; eol = M-^?; eol2 = M-^?; swtch = undef; start = ^Q; stop = ^S; susp = ^Z; rprnt = ^R; werase = ^W; lnext = ^V; flush = ^O; min = 1; time = 0; -parenb -parodd cs8 -hupcl -cstopb cread -clocal -crtscts -ignbrk -brkint -ignpar -parmrk -inpck -istrip -inlcr -igncr icrnl ixon -ixoff -iuclc ixany imaxbel -iutf8 opost -olcuc -ocrnl onlcr -onocr -onlret -ofill -ofdel nl0 cr0 tab0 bs0 vt0 ff0 isig icanon iexten echo echoe -echok -echonl -noflsh -xcase -tostop -echoprt echoctl echoke The nl0 and cr0 mean it's configured to insert 0 delay after newlines and carriage returns. Whether setting a non-zero delay actually does anything useful anymore is an open question, but the drivers still accept the settings. -- http://mail.python.org/mailman/listinfo/python-list
Re: How to write fast into a file in python?
On 18 May 2013 20:19, Dennis Lee Bieber wlfr...@ix.netcom.com wrote: tOn Sat, 18 May 2013 08:49:55 +0100, Fábio Santos fabiosantos...@gmail.com declaimed the following in gmane.comp.python.general: You mentioned \n translating to two lines, but this won't happen. Windows will not mess with what you write to your file. It's just that traditionally windows and windows programs use \r\n instead of just \n. I think it was for compatibility with os/2 or macintosh (I don't remember which), which used \r for newlines. Neither... It goes back to Teletype machines where one sent a carriage return to move the printhead back to the left, then sent a line feed to advance the paper (while the head was still moving left), and in some cases also provided a rub-out character (a do-nothing) to add an additional character time delay. TRS-80 Mod 1-4 used cr for new line, I believe Apple used lf for new line... And both lost the ability to move down the page without also resetting the carriage to the left. In a world where both crlf is used, one could draw a vertical line of | by just spacing across the first line, printing |, then repeat lfbkspc| until done. To do the same with conventional lf is new line/return one has to transmit all those spaces for each line... At 300baud, that took time On Sat, May 18, 2013 at 6:00 PM, Carlos Nepomuceno carlosnepomuc...@outlook.com wrote: Python really writes '\n\r' on Windows. Just check the files. Internal representations only keep '\n' for simplicity, but if you wanna keep track of the file length you have to take that into account. ;) On Sat, May 18, 2013 at 3:29 PM, Chris Angelico ros...@gmail.com wrote: Into two characters, not two lines, but yes. A file opened in text mode on Windows will have its lines terminated with two characters. (And it's old Macs that used to use \r. OS/2 follows the DOS convention of \r\n, but again, many apps these days are happy with Unix newlines there too.) ChrisA Thanks for your corrections and explanations. I stand corrected and have learned something. -- http://mail.python.org/mailman/listinfo/python-list
Re: How to write fast into a file in python?
On Sat, 18 May 2013 15:14:31 -0400, Dennis Lee Bieber wrote: tOn Sat, 18 May 2013 08:49:55 +0100, Fábio Santos fabiosantos...@gmail.com declaimed the following in gmane.comp.python.general: You mentioned \n translating to two lines, but this won't happen. Windows will not mess with what you write to your file. Windows certainly may mess with what you write to your file, if it is opened in Text mode instead of Binary mode. In text mode, Windows will: - interpret Ctrl-Z characters as End Of File when reading; - convert \r\n to \n when reading; - convert \n to \r\n when writing. http://msdn.microsoft.com/en-us/library/z5hh6ee9(v=vs.80).aspx It's just that traditionally windows and windows programs use \r\n instead of just \n. I think it was for compatibility with os/2 or macintosh (I don't remember which), which used \r for newlines. Neither... It goes back to Teletype machines where one sent a carriage return to move the printhead back to the left, then sent a line feed to advance the paper (while the head was still moving left), and in some cases also provided a rub-out character (a do-nothing) to add an additional character time delay. Yes, if you go back far enough, you get to teletype machines. But Windows inherits its text-mode behaviour from DOS, which inherits it from CP/M. TRS-80 Mod 1-4 used cr for new line, I believe Apple used lf for new line... I can't comment about TRS, but classic Apple Macs (up to System 9) used carriage return \r as the line separator. -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: How to write fast into a file in python?
On 05/18/2013 01:00 PM, Carlos Nepomuceno wrote: Python really writes '\n\r' on Windows. Just check the files. That's backwards. '\r\n' on Windows, IF you omit the b in the mode when creating the file. -- DaveA -- http://mail.python.org/mailman/listinfo/python-list
RE: How to write fast into a file in python?
Date: Sat, 18 May 2013 22:41:32 -0400 From: da...@davea.name To: python-list@python.org Subject: Re: How to write fast into a file in python? On 05/18/2013 01:00 PM, Carlos Nepomuceno wrote: Python really writes '\n\r' on Windows. Just check the files. That's backwards. '\r\n' on Windows, IF you omit the b in the mode when creating the file. Indeed! My mistake just made me find out that Acorn used that inversion on Acorn MOS. According to this[1] (at page 449) the OSNEWL routine outputs '\n\r'. What the hell those guys were thinking??? :p OSNEWL This call issues an LF CR (line feed, carriage return) to the currently selected output stream. The routine is entered at FFE7. [1] http://regregex.bbcmicro.net/BPlusUserGuide-1.07.pdf -- http://mail.python.org/mailman/listinfo/python-list
RE: How to write fast into a file in python?
Thanks Dan! I've never used CPython or PyPy. Will try them later. I think the main difference between your create_file_numbers_file_like() and the fastwrite5.py I sent earlier is that I've used cStringIO instead of StringIO. It took 12s less using cStringIO. My numbers are much greater, but I've used Python 2.7.5 instead: C:\src\Pythonpython create_file_numbers.py time taken to write a file of size 52428800 is 39.1199457743 seconds time taken to write a file of size 52428800 is 14.8704800436 seconds time taken to write a file of size 52428800 is 23.0011990985 seconds I've downloaded bufsock.py and python2x3.py. The later one was hard to remove the source code from the web page. Can I use them on my projects? I'm not used to the UCI license[1]. What's the difference to the GPL? [1] http://stromberg.dnsalias.org/~dstromberg/UCI-license.html Date: Sat, 18 May 2013 12:38:30 -0700 Subject: Re: How to write fast into a file in python? From: drsali...@gmail.com To: lokeshkopp...@gmail.com CC: python-list@python.org With CPython 2.7.3: ./t time taken to write a file of size 52428800 is 15.86 seconds time taken to write a file of size 52428800 is 7.91 seconds time taken to write a file of size 52428800 is 9.64 seconds With pypy-1.9: ./t time taken to write a file of size 52428800 is 3.708232 seconds time taken to write a file of size 52428800 is 4.868304 seconds time taken to write a file of size 52428800 is 1.93612 seconds Here's the code: #!/usr/local/pypy-1.9/bin/pypy #!/usr/bin/python import sys import time import StringIO sys.path.insert(0, '/usr/local/lib') import bufsock def create_file_numbers_old(filename, size): start = time.clock() value = 0 with open(filename, w) as f: while f.tell() size: f.write({0}\n.format(value)) value += 1 end = time.clock() print time taken to write a file of size, size, is , (end -start), seconds \n def create_file_numbers_bufsock(filename, intended_size): start = time.clock() value = 0 with open(filename, w) as f: bs = bufsock.bufsock(f) actual_size = 0 while actual_size intended_size: string = {0}\n.format(value) actual_size += len(string) + 1 bs.write(string) value += 1 bs.flush() end = time.clock() print time taken to write a file of size, intended_size, is , (end -start), seconds \n def create_file_numbers_file_like(filename, intended_size): start = time.clock() value = 0 with open(filename, w) as f: file_like = StringIO.StringIO() actual_size = 0 while actual_size intended_size: string = {0}\n.format(value) actual_size += len(string) + 1 file_like.write(string) value += 1 file_like.seek(0) f.write(file_like.read()) end = time.clock() print time taken to write a file of size, intended_size, is , (end -start), seconds \n create_file_numbers_old('output.txt', 50 * 2**20) create_file_numbers_bufsock('output2.txt', 50 * 2**20) create_file_numbers_file_like('output3.txt', 50 * 2**20) On Thu, May 16, 2013 at 9:35 PM, lokeshkopp...@gmail.commailto:lokeshkopp...@gmail.com wrote: On Friday, May 17, 2013 8:50:26 AM UTC+5:30, lokesh...@gmail.commailto:lokesh...@gmail.com wrote: I need to write numbers into a file upto 50mb and it should be fast can any one help me how to do that? i had written the following code.. --- def create_file_numbers_old(filename, size): start = time.clock() value = 0 with open(filename, w) as f: while f.tell() size: f.write({0}\n.format(value)) value += 1 end = time.clock() print time taken to write a file of size, size, is , (end -start), seconds \n -- it takes about 20sec i need 5 to 10 times less than that. size = 50mb -- http://mail.python.org/mailman/listinfo/python-list -- http://mail.python.org/mailman/listinfo/python-list -- http://mail.python.org/mailman/listinfo/python-list
RE: How to write fast into a file in python?
BTW, I've downloaded from the following places: http://stromberg.dnsalias.org/svn/bufsock/trunk/bufsock.py http://stromberg.dnsalias.org/~dstromberg/backshift/documentation/html/python2x3-pysrc.html Are those the latest versions? From: carlosnepomuc...@outlook.com To: python-list@python.org Subject: RE: How to write fast into a file in python? Date: Sun, 19 May 2013 08:31:08 +0300 CC: lokeshkopp...@gmail.com Thanks Dan! I've never used CPython or PyPy. Will try them later. I think the main difference between your create_file_numbers_file_like() and the fastwrite5.py I sent earlier is that I've used cStringIO instead of StringIO. It took 12s less using cStringIO. My numbers are much greater, but I've used Python 2.7.5 instead: C:\src\Pythonpython create_file_numbers.py time taken to write a file of size 52428800 is 39.1199457743 seconds time taken to write a file of size 52428800 is 14.8704800436 seconds time taken to write a file of size 52428800 is 23.0011990985 seconds I've downloaded bufsock.py and python2x3.py. The later one was hard to remove the source code from the web page. Can I use them on my projects? I'm not used to the UCI license[1]. What's the difference to the GPL? [1] http://stromberg.dnsalias.org/~dstromberg/UCI-license.html Date: Sat, 18 May 2013 12:38:30 -0700 Subject: Re: How to write fast into a file in python? From: drsali...@gmail.com To: lokeshkopp...@gmail.com CC: python-list@python.org With CPython 2.7.3: ./t time taken to write a file of size 52428800 is 15.86 seconds time taken to write a file of size 52428800 is 7.91 seconds time taken to write a file of size 52428800 is 9.64 seconds With pypy-1.9: ./t time taken to write a file of size 52428800 is 3.708232 seconds time taken to write a file of size 52428800 is 4.868304 seconds time taken to write a file of size 52428800 is 1.93612 seconds Here's the code: #!/usr/local/pypy-1.9/bin/pypy #!/usr/bin/python import sys import time import StringIO sys.path.insert(0, '/usr/local/lib') import bufsock def create_file_numbers_old(filename, size): start = time.clock() value = 0 with open(filename, w) as f: while f.tell() size: f.write({0}\n.format(value)) value += 1 end = time.clock() print time taken to write a file of size, size, is , (end -start), seconds \n def create_file_numbers_bufsock(filename, intended_size): start = time.clock() value = 0 with open(filename, w) as f: bs = bufsock.bufsock(f) actual_size = 0 while actual_size intended_size: string = {0}\n.format(value) actual_size += len(string) + 1 bs.write(string) value += 1 bs.flush() end = time.clock() print time taken to write a file of size, intended_size, is , (end -start), seconds \n def create_file_numbers_file_like(filename, intended_size): start = time.clock() value = 0 with open(filename, w) as f: file_like = StringIO.StringIO() actual_size = 0 while actual_size intended_size: string = {0}\n.format(value) actual_size += len(string) + 1 file_like.write(string) value += 1 file_like.seek(0) f.write(file_like.read()) end = time.clock() print time taken to write a file of size, intended_size, is , (end -start), seconds \n create_file_numbers_old('output.txt', 50 * 2**20) create_file_numbers_bufsock('output2.txt', 50 * 2**20) create_file_numbers_file_like('output3.txt', 50 * 2**20) On Thu, May 16, 2013 at 9:35 PM, lokeshkopp...@gmail.commailto:lokeshkopp...@gmail.com wrote: On Friday, May 17, 2013 8:50:26 AM UTC+5:30, lokesh...@gmail.commailto:lokesh...@gmail.com wrote: I need to write numbers into a file upto 50mb and it should be fast can any one help me how to do that? i had written the following code.. --- def create_file_numbers_old(filename, size): start = time.clock() value = 0 with open(filename, w) as f: while f.tell() size: f.write({0}\n.format(value)) value += 1 end = time.clock() print time taken to write a file of size, size, is , (end -start), seconds \n -- it takes about 20sec i need 5 to 10 times less than that. size = 50mb -- http://mail.python.org/mailman/listinfo/python-list -- http://mail.python.org/mailman/listinfo/python-list -- http://mail.python.org/mailman/listinfo/python-list -- http://mail.python.org/mailman/listinfo/python-list
Re: How to write fast into a file in python?
On 05/17/2013 12:35 AM, lokeshkopp...@gmail.com wrote: On Friday, May 17, 2013 8:50:26 AM UTC+5:30, lokesh...@gmail.com wrote: I need to write numbers into a file upto 50mb and it should be fast can any one help me how to do that? i had written the following code.. SNIP value = 0 with open(filename, w) as f: while f.tell() size: f.write({0}\n.format(value)) SNIP more double-spaced nonsense from googlegroups If you must use googlegroups, at least read this http://wiki.python.org/moin/GoogleGroupsPython. it takes about 20sec i need 5 to 10 times less than that. size = 50mb Most of the time is spent figuring out whether the file has reached its limit size. If you want Python to go fast, just specify the data. On my Linux system, it takes 11 seconds to write the first 633 values, which is just under 50mb. If I write the obvious loop, writing that many values takes .25 seconds. -- DaveA -- http://mail.python.org/mailman/listinfo/python-list
RE: How to write fast into a file in python?
I've got the following results on my desktop PC (Win7/Python2.7.5): C:\src\Pythonpython -m timeit -cvn3 -r3 execfile('fastwrite2.py') raw times: 123 126 125 3 loops, best of 3: 41 sec per loop C:\src\Pythonpython -m timeit -cvn3 -r3 execfile('fastwrite5.py') raw times: 34 34.3 34 3 loops, best of 3: 11.3 sec per loop C:\src\Pythonpython -m timeit -cvn3 -r3 execfile('fastwrite6.py') raw times: 0.4 0.447 0.391 3 loops, best of 3: 130 msec per loop If you can just copy a preexisting file it will surely increase the speed to the levels you need, but doing the cStringIO operations can reduce the time in 72%. Strangely I just realised that the time it takes to complete such scripts is the same no matter what hard drive I choose to run them. The results are the same for an SSD (main drive) and a HDD. I think it's very strange to take 11.3s to write 50MB (4.4MB/s) sequentially on a SSD which is capable of 140MB/s. Is that a Python problem? Why does it take the same time on the HDD? ### fastwrite2.py ### this is your code size = 50*1024*1024 value = 0 filename = 'fastwrite2.dat' with open(filename, w) as f: while f.tell() size: f.write({0}\n.format(value)) value += 1 f.close() ### fastwrite5.py ### import cStringIO size = 50*1024*1024 value = 0 filename = 'fastwrite5.dat' x = 0 b = cStringIO.StringIO() while x size: line = '{0}\n'.format(value) b.write(line) value += 1 x += len(line)+1 f = open(filename, 'w') f.write(b.getvalue()) f.close() b.close() ### fastwrite6.py ### import shutil src = 'fastwrite.dat' dst = 'fastwrite6.dat' shutil.copyfile(src, dst) Date: Fri, 17 May 2013 07:58:43 -0400 From: da...@davea.name To: python-list@python.org Subject: Re: How to write fast into a file in python? On 05/17/2013 12:35 AM, lokeshkopp...@gmail.com wrote: On Friday, May 17, 2013 8:50:26 AM UTC+5:30, lokesh...@gmail.com wrote: I need to write numbers into a file upto 50mb and it should be fast can any one help me how to do that? i had written the following code.. SNIP value = 0 with open(filename, w) as f: while f.tell() size: f.write({0}\n.format(value)) SNIP more double-spaced nonsense from googlegroups If you must use googlegroups, at least read this http://wiki.python.org/moin/GoogleGroupsPython. it takes about 20sec i need 5 to 10 times less than that. size = 50mb Most of the time is spent figuring out whether the file has reached its limit size. If you want Python to go fast, just specify the data. On my Linux system, it takes 11 seconds to write the first 633 values, which is just under 50mb. If I write the obvious loop, writing that many values takes .25 seconds. -- DaveA -- http://mail.python.org/mailman/listinfo/python-list -- http://mail.python.org/mailman/listinfo/python-list
Re: How to write fast into a file in python?
On Fri, 17 May 2013 18:20:33 +0300, Carlos Nepomuceno wrote: I've got the following results on my desktop PC (Win7/Python2.7.5): C:\src\Pythonpython -m timeit -cvn3 -r3 execfile('fastwrite2.py') raw times: 123 126 125 3 loops, best of 3: 41 sec per loop Your times here are increased significantly by using execfile. Using execfile means that instead of compiling the code once, then executing many times, it gets compiled over and over and over and over again. In my experience, using exec, execfile or eval makes your code ten or twenty times slower: [steve@ando ~]$ python -m timeit 'x = 100; y = x/3' 100 loops, best of 3: 0.175 usec per loop [steve@ando ~]$ python -m timeit 'exec(x = 100; y = x/3)' 1 loops, best of 3: 37.8 usec per loop Strangely I just realised that the time it takes to complete such scripts is the same no matter what hard drive I choose to run them. The results are the same for an SSD (main drive) and a HDD. There's nothing strange here. The time you measure is dominated by three things, in reducing order of importance: * the poor choice of execfile dominates the time taken; * followed by choice of algorithm; * followed by the time it actually takes to write to the disk, which is probably insignificant compared to the other two, regardless of whether you are using a HDD or SSD. Until you optimize the code, optimizing the media is a waste of time. I think it's very strange to take 11.3s to write 50MB (4.4MB/s) sequentially on a SSD which is capable of 140MB/s. It doesn't. It takes 11.3 seconds to open a file, read it into memory, parse it, compile it into byte-code, and only then execute it. My prediction is that the call to f.write() and f.close() probably take a fraction of a second, and nearly all of the rest of the time is taken by other calculations. -- Steven -- http://mail.python.org/mailman/listinfo/python-list
RE: How to write fast into a file in python?
Thank you Steve! You are totally right! It takes about 0.2s for the f.write() to return. Certainly because it writes to the system file cache (~250MB/s). Using a little bit different approach I've got: C:\src\Pythonpython -m timeit -cvn3 -r3 -sfrom fastwrite5r import run run() raw times: 24 25.1 24.4 3 loops, best of 3: 8 sec per loop This time it took 8s to complete from previous 11.3s. Does those 3.3s are the time to open, read, parse, compile steps you told me? If so, the execute step is really taking 8s, right? Why does it take so long to build the string to be written? Can it get faster? Thanks in advance! ### fastwrite5r.py ### def run(): import cStringIO size = 50*1024*1024 value = 0 filename = 'fastwrite5.dat' x = 0 b = cStringIO.StringIO() while x size: line = '{0}\n'.format(value) b.write(line) value += 1 x += len(line)+1 f = open(filename, 'w') f.write(b.getvalue()) f.close() b.close() if __name__ == '__main__': run() From: steve+comp.lang.pyt...@pearwood.info Subject: Re: How to write fast into a file in python? Date: Fri, 17 May 2013 16:42:55 + To: python-list@python.org On Fri, 17 May 2013 18:20:33 +0300, Carlos Nepomuceno wrote: I've got the following results on my desktop PC (Win7/Python2.7.5): C:\src\Pythonpython -m timeit -cvn3 -r3 execfile('fastwrite2.py') raw times: 123 126 125 3 loops, best of 3: 41 sec per loop Your times here are increased significantly by using execfile. Using execfile means that instead of compiling the code once, then executing many times, it gets compiled over and over and over and over again. In my experience, using exec, execfile or eval makes your code ten or twenty times slower: [steve@ando ~]$ python -m timeit 'x = 100; y = x/3' 100 loops, best of 3: 0.175 usec per loop [steve@ando ~]$ python -m timeit 'exec(x = 100; y = x/3)' 1 loops, best of 3: 37.8 usec per loop Strangely I just realised that the time it takes to complete such scripts is the same no matter what hard drive I choose to run them. The results are the same for an SSD (main drive) and a HDD. There's nothing strange here. The time you measure is dominated by three things, in reducing order of importance: * the poor choice of execfile dominates the time taken; * followed by choice of algorithm; * followed by the time it actually takes to write to the disk, which is probably insignificant compared to the other two, regardless of whether you are using a HDD or SSD. Until you optimize the code, optimizing the media is a waste of time. I think it's very strange to take 11.3s to write 50MB (4.4MB/s) sequentially on a SSD which is capable of 140MB/s. It doesn't. It takes 11.3 seconds to open a file, read it into memory, parse it, compile it into byte-code, and only then execute it. My prediction is that the call to f.write() and f.close() probably take a fraction of a second, and nearly all of the rest of the time is taken by other calculations. -- Steven -- http://mail.python.org/mailman/listinfo/python-list -- http://mail.python.org/mailman/listinfo/python-list
Re: How to write fast into a file in python?
On Fri, 17 May 2013 18:20:33 +0300, Carlos Nepomuceno wrote: ### fastwrite5.py ### import cStringIO size = 50*1024*1024 value = 0 filename = 'fastwrite5.dat' x = 0 b = cStringIO.StringIO() while x size: line = '{0}\n'.format(value) b.write(line) value += 1 x += len(line)+1 Oh, I forgot to mention: you have a bug in this function. You're already including the newline in the len(line), so there is no need to add one. The result is that you only generate 44MB instead of 50MB. f = open(filename, 'w') f.write(b.getvalue()) f.close() b.close() Here are the results of profiling the above on my computer. Including the overhead of the profiler, it takes just over 50 seconds to run your file on my computer. [steve@ando ~]$ python -m cProfile fastwrite5.py 17846645 function calls in 53.575 seconds Ordered by: standard name ncalls tottime percall cumtime percall filename:lineno(function) 1 30.561 30.561 53.575 53.575 fastwrite5.py:1(module) 10.0000.0000.0000.000 {cStringIO.StringIO} 59488795.5820.0005.5820.000 {len} 10.0040.0040.0040.004 {method 'close' of 'cStringIO.StringO' objects} 10.0000.0000.0000.000 {method 'close' of 'file' objects} 10.0000.0000.0000.000 {method 'disable' of '_lsprof.Profiler' objects} 59488799.9790.0009.9790.000 {method 'format' of 'str' objects} 10.1030.1030.1030.103 {method 'getvalue' of 'cStringIO.StringO' objects} 59488797.1350.0007.1350.000 {method 'write' of 'cStringIO.StringO' objects} 10.2110.2110.2110.211 {method 'write' of 'file' objects} 10.0000.0000.0000.000 {open} As you can see, the time is dominated by repeatedly calling len(), str.format() and StringIO.write() methods. Actually writing the data to the file is quite a small percentage of the cumulative time. So, here's another version, this time using a pre-calculated limit. I cheated and just copied the result from the fastwrite5 output :-) # fasterwrite.py filename = 'fasterwrite.dat' with open(filename, 'w') as f: for i in xrange(5948879): # Actually only 44MB, not 50MB. f.write('%d\n' % i) And the profile results are about twice as fast as fastwrite5 above, with only 8 seconds in total writing to my HDD. [steve@ando ~]$ python -m cProfile fasterwrite.py 5948882 function calls in 28.840 seconds Ordered by: standard name ncalls tottime percall cumtime percall filename:lineno(function) 1 20.592 20.592 28.840 28.840 fasterwrite.py:1(module) 10.0000.0000.0000.000 {method 'disable' of '_lsprof.Profiler' objects} 59488798.2290.0008.2290.000 {method 'write' of 'file' objects} 10.0190.0190.0190.019 {open} Without the overhead of the profiler, it is a little faster: [steve@ando ~]$ time python fasterwrite.py real0m16.187s user0m13.553s sys 0m0.508s Although it is still slower than the heavily optimized dd command, but not unreasonably slow for a high-level language: [steve@ando ~]$ time dd if=fasterwrite.dat of=copy.dat 90781+1 records in 90781+1 records out 46479922 bytes (46 MB) copied, 0.737009 seconds, 63.1 MB/s real0m0.786s user0m0.071s sys 0m0.595s -- Steven -- http://mail.python.org/mailman/listinfo/python-list
RE: How to write fast into a file in python?
You've hit the bullseye! ;) Thanks a lot!!! Oh, I forgot to mention: you have a bug in this function. You're already including the newline in the len(line), so there is no need to add one. The result is that you only generate 44MB instead of 50MB. That's because I'm running on Windows. What's the fastest way to check if '\n' translates to 2 bytes on file? Here are the results of profiling the above on my computer. Including the overhead of the profiler, it takes just over 50 seconds to run your file on my computer. [steve@ando ~]$ python -m cProfile fastwrite5.py 17846645 function calls in 53.575 seconds Didn't know the cProfile module.Thanks a lot! Ordered by: standard name ncalls tottime percall cumtime percall filename:lineno(function) 1 30.561 30.561 53.575 53.575 fastwrite5.py:1(module) 1 0.000 0.000 0.000 0.000 {cStringIO.StringIO} 5948879 5.582 0.000 5.582 0.000 {len} 1 0.004 0.004 0.004 0.004 {method 'close' of 'cStringIO.StringO' objects} 1 0.000 0.000 0.000 0.000 {method 'close' of 'file' objects} 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects} 5948879 9.979 0.000 9.979 0.000 {method 'format' of 'str' objects} 1 0.103 0.103 0.103 0.103 {method 'getvalue' of 'cStringIO.StringO' objects} 5948879 7.135 0.000 7.135 0.000 {method 'write' of 'cStringIO.StringO' objects} 1 0.211 0.211 0.211 0.211 {method 'write' of 'file' objects} 1 0.000 0.000 0.000 0.000 {open} As you can see, the time is dominated by repeatedly calling len(), str.format() and StringIO.write() methods. Actually writing the data to the file is quite a small percentage of the cumulative time. So, here's another version, this time using a pre-calculated limit. I cheated and just copied the result from the fastwrite5 output :-) # fasterwrite.py filename = 'fasterwrite.dat' with open(filename, 'w') as f: for i in xrange(5948879): # Actually only 44MB, not 50MB. f.write('%d\n' % i) I had the same idea but kept the original method because I didn't want to waste time creating a function for calculating the actual number of iterations needed to deliver 50MB of data. ;) And the profile results are about twice as fast as fastwrite5 above, with only 8 seconds in total writing to my HDD. [steve@ando ~]$ python -m cProfile fasterwrite.py 5948882 function calls in 28.840 seconds Ordered by: standard name ncalls tottime percall cumtime percall filename:lineno(function) 1 20.592 20.592 28.840 28.840 fasterwrite.py:1(module) 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects} 5948879 8.229 0.000 8.229 0.000 {method 'write' of 'file' objects} 1 0.019 0.019 0.019 0.019 {open} I thought there would be a call to format method by '%d\n' % i. It seems the % operator is a lot faster than format. I just stopped using it because I read it was going to be deprecated. :( Why replace such a great and fast operator by a slow method? I mean, why format is been preferred over %? Without the overhead of the profiler, it is a little faster: [steve@ando ~]$ time python fasterwrite.py real 0m16.187s user 0m13.553s sys 0m0.508s Although it is still slower than the heavily optimized dd command, but not unreasonably slow for a high-level language: [steve@ando ~]$ time dd if=fasterwrite.dat of=copy.dat 90781+1 records in 90781+1 records out 46479922 bytes (46 MB) copied, 0.737009 seconds, 63.1 MB/s real 0m0.786s user 0m0.071s sys 0m0.595s -- Steven -- http://mail.python.org/mailman/listinfo/python-list -- http://mail.python.org/mailman/listinfo/python-list
RE: How to write fast into a file in python?
Think the following update will make the code more portable: x += len(line)+len(os.linesep)-1 Not sure if it's the fastest way to achieve that. :/ On Fri, 17 May 2013 18:20:33 +0300, Carlos Nepomuceno wrote: ### fastwrite5.py ### import cStringIO size = 50*1024*1024 value = 0 filename = 'fastwrite5.dat' x = 0 b = cStringIO.StringIO() while x size: line = '{0}\n'.format(value) b.write(line) value += 1 x += len(line)+1 Oh, I forgot to mention: you have a bug in this function. You're already including the newline in the len(line), so there is no need to add one. The result is that you only generate 44MB instead of 50MB. -- http://mail.python.org/mailman/listinfo/python-list
Re: How to write fast into a file in python?
On Fri, 17 May 2013 21:18:15 +0300, Carlos Nepomuceno wrote: I thought there would be a call to format method by '%d\n' % i. It seems the % operator is a lot faster than format. I just stopped using it because I read it was going to be deprecated. :( Why replace such a great and fast operator by a slow method? I mean, why format is been preferred over %? That is one of the most annoying, pernicious myths about Python, probably second only to the GIL makes Python slow (it actually makes it fast). String formatting with % is not deprecated. It will not be deprecated, at least not until Python 4000. The string format() method has a few advantages: it is more powerful, consistent and flexible, but it is significantly slower. Probably the biggest disadvantage to % formatting, and probably the main reason why it is discouraged, is that it treats tuples specially. Consider if x is an arbitrary object, and you call %s % x: py %s % 23 # works '23' py %s % [23, 42] # works '[23, 42]' and so on for *almost* any object. But if x is a tuple, strange things happen: py %s % (23,) # tuple with one item looks like an int '23' py %s % (23, 42) # tuple with two items fails Traceback (most recent call last): File stdin, line 1, in module TypeError: not all arguments converted during string formatting So when dealing with arbitrary objects that you cannot predict what they are, it is better to use format. -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: How to write fast into a file in python?
On Sat, May 18, 2013 at 2:01 PM, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: Consider if x is an arbitrary object, and you call %s % x: py %s % 23 # works '23' py %s % [23, 42] # works '[23, 42]' and so on for *almost* any object. But if x is a tuple, strange things happen Which can be guarded against by wrapping it up in a tuple. All you're seeing is that the shortcut notation for a single parameter can't handle tuples. def show(x): return %s % (x,) show(23) '23' show((23,)) '(23,)' show([23,42]) '[23, 42]' One of the biggest differences between %-formatting and str.format is that one is an operator and the other a method. The operator is always going to be faster, but the method can give more flexibility (not that I've ever needed or wanted to override anything). def show_format(x): return {}.format(x) # Same thing using str.format dis.dis(show) 2 0 LOAD_CONST 1 ('%s') 3 LOAD_FAST0 (x) 6 BUILD_TUPLE 1 9 BINARY_MODULO 10 RETURN_VALUE dis.dis(show_format) 2 0 LOAD_CONST 1 ('{}') 3 LOAD_ATTR0 (format) 6 LOAD_FAST0 (x) 9 CALL_FUNCTION1 (1 positional, 0 keyword pair) 12 RETURN_VALUE Attribute lookup and function call versus binary operator. Potentially a lot of flexibility, versus basically hard-coded functionality. But has anyone ever actually made use of it? str.format does have some cleaner features, like naming of parameters: {foo} vs {bar}.format(foo=1,bar=2) '1 vs 2' %(foo)s vs %(bar)s%{'foo':1,'bar':2} '1 vs 2' Extremely handy when you're working with hugely complex format strings, and the syntax feels a bit clunky in % (also, it's not portable to other languages, which is one of %-formatting's biggest features). Not a huge deal, but if you're doing a lot with that, it might be a deciding vote. ChrisA -- http://mail.python.org/mailman/listinfo/python-list
How to write fast into a file in python?
I need to write numbers into a file upto 50mb and it should be fast can any one help me how to do that? i had written the following code.. --- def create_file_numbers_old(filename, size): start = time.clock() value = 0 with open(filename, w) as f: while f.tell() size: f.write({0}\n.format(value)) value += 1 end = time.clock() print time taken to write a file of size, size, is , (end -start), seconds \n -- it takes about 20sec i need 5 to 10 times less than that. -- http://mail.python.org/mailman/listinfo/python-list
Re: How to write fast into a file in python?
On Thu, 16 May 2013 20:20:26 -0700, lokeshkoppaka wrote: I need to write numbers into a file upto 50mb and it should be fast can any one help me how to do that? i had written the following code.. -- def create_file_numbers_old(filename, size): start = time.clock() value = 0 with open(filename, w) as f: while f.tell() size: f.write({0}\n.format(value)) value += 1 end = time.clock() print time taken to write a file of size, size, is , (end -start), seconds \n it takes about 20sec i need 5 to 10 times less than that. 20 seconds to write how many numbers? If you are doing create_file_numbers_old(filename, 5) then 20 seconds is really slow. But if you are doing: create_file_numbers_old(filename, 50) then 20 seconds is amazingly fast. Try this instead, it may be a little faster: def create_file_numbers_old(filename, size): count = value = 0 with open(filename, 'w') as f: while count size: s = '%d\n' % value f.write(s) count += len(s) value += 1 If this is still too slow, you can try three other tactics: 1) pre-calculate the largest integer that will fit in `size` bytes, then use a for-loop instead of a while loop: maxn = calculation(...) with open(filename, 'w') as f: for i in xrange(maxn): f.write('%d\n' % i) 2) Write an extension module in C that writes to the file. 3) Get a faster hard drive, and avoid writing over a network. -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: How to write fast into a file in python?
On Friday, May 17, 2013 8:50:26 AM UTC+5:30, lokesh...@gmail.com wrote: I need to write numbers into a file upto 50mb and it should be fast can any one help me how to do that? i had written the following code.. --- def create_file_numbers_old(filename, size): start = time.clock() value = 0 with open(filename, w) as f: while f.tell() size: f.write({0}\n.format(value)) value += 1 end = time.clock() print time taken to write a file of size, size, is , (end -start), seconds \n -- it takes about 20sec i need 5 to 10 times less than that. size = 50mb -- http://mail.python.org/mailman/listinfo/python-list