Re: are there some special about '\x1a' symbol
Steve Holden st...@holdenweb.com writes:
> Unknown wrote:
>> On 2009-01-12, John Machin sjmac...@lexicon.net wrote:
>>> I didn't think your question was stupid. Stupid was (a) CP/M recording file size as number of 128-byte sectors, forcing the use of an in-band EOF marker for text files (b) MS continuing to regard Ctrl-Z as an EOF decades after people stopped writing Ctrl-Z at the end of text files.
>> I believe that feature was inherited by CP/M from DEC OSes (RSX-11 or RSTS-11). AFAICT, all of CP/M's file I/O API (including the FCB) was lifted almost directly from DEC's PDP-11 stuff, which probably copied it from PDP-8 stuff. Perhaps in the early 60's somebody at DEC had a reason. The really interesting thing is that we're still suffering because of it 40+ years later.
> I suspect this is probably a leftover from some paper tape data formats, when it was easier to detect the end of a file with a sentinel byte than it was to detect run-off as end of file. It could easily date back to the PDP-8.

I think it was a reasonable way for CP/M to work. It's a nice simple interface for reading and writing files: you always read and write from/to a fixed 128-byte buffer.

Allowing files to be arbitrary-length byte sequences would have made the system calls more complicated, and it would also have needed another byte in the on-disk file control block (so 7.3 filenames rather than 8.3, or some other compromise).

For CP/M programs, it's hard to see what the gain would have been; it's easy to design a binary file format so that it doesn't matter whether or not there's junk on the end, and CP/M didn't have a tradition of storing data in 'plain text' files (for good reasons of disk space).

It certainly is a shame that we didn't leave all this behind when MS-DOS 2 appeared, though.

-M-
--
http://mail.python.org/mailman/listinfo/python-list
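A minimal sketch of the record scheme discussed above, assuming a text file is padded with Ctrl-Z up to the next 128-byte boundary and read back by stopping at the first Ctrl-Z. The helper names `pad_to_records` and `strip_record_padding` are made up for illustration; this models the idea only, not CP/M's actual system calls.

```python
RECORD_SIZE = 128   # file length was tracked in whole 128-byte records
CTRL_Z = b"\x1a"    # in-band end-of-text marker


def pad_to_records(text: bytes) -> bytes:
    """Pad text with Ctrl-Z so its length is a whole number of records."""
    remainder = len(text) % RECORD_SIZE
    if remainder == 0:
        return text  # already record-aligned, so no marker is appended here
    return text + CTRL_Z * (RECORD_SIZE - remainder)


def strip_record_padding(stored: bytes) -> bytes:
    """Return the bytes before the first Ctrl-Z, or everything if none."""
    end = stored.find(CTRL_Z)
    return stored if end == -1 else stored[:end]


stored = pad_to_records(b"hello, world\r\n")
print(len(stored))                   # 128
print(strip_record_padding(stored))  # b'hello, world\r\n'
```

The same sketch also shows why the marker is harmful for binary data: any payload byte equal to 0x1a would be mistaken for the end of the file.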
Re: [OT] Re: are there some special about '\x1a' symbol
On Tue, 13 Jan 2009 22:04:33 -0200, Terry Reedy tjre...@udel.edu wrote:
> Gabriel Genellina wrote:
>> On Mon, 12 Jan 2009 12:00:16 -0200, John Machin sjmac...@lexicon.net wrote:
>>> I didn't think your question was stupid. Stupid was (a) CP/M recording file size as number of 128-byte sectors, forcing the use of an in-band EOF marker for text files (b) MS continuing to regard Ctrl-Z as an EOF decades after people stopped writing Ctrl-Z at the end of text files.
>> This is called backwards compatibility and it's a good thing :)
> But it does not have to be the default or only behavior to be available.

Sure. And it isn't - there are many flags to open and fopen to choose from...

The C89 standard (C89 being the language CPython is written in) guarantees *only* that printable characters, tab, and newline are preserved in a text file; everything else may or may not appear when it is read again. Even whitespace at the end of a line may be dropped. Binary files are more predictable...

Delphi recognizes the EOF marker when reading a text file only inside the file's last 128-byte block -- this mimics the original CP/M behavior rather closely. I thought the MSC runtime did the same, but no, the EOF marker is recognized anywhere. And Python inherits that (at least in 2.6 -- I've not tested with 3.0)

--
Gabriel Genellina
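The two policies described above can be compared with a small model. This is only a sketch of the behaviour as described in the message, not code from the MSC runtime or from Delphi; the function names and the 128-byte `BLOCK` constant are assumptions for illustration.

```python
CTRL_Z = b"\x1a"
BLOCK = 128  # block size assumed from the CP/M discussion


def read_text_anywhere(data: bytes) -> bytes:
    """Stop at the first Ctrl-Z wherever it occurs."""
    end = data.find(CTRL_Z)
    return data if end == -1 else data[:end]


def read_text_last_block(data: bytes) -> bytes:
    """Honour Ctrl-Z only when it lies inside the final 128-byte block."""
    last_block_start = (max(len(data) - 1, 0) // BLOCK) * BLOCK
    end = data.find(CTRL_Z, last_block_start)
    return data if end == -1 else data[:end]


short_data = b"before\x1aafter"          # fits in one block
long_data = short_data + b"x" * 200      # Ctrl-Z is far from the last block

print(read_text_anywhere(long_data))                  # b'before'
print(read_text_last_block(long_data) == long_data)   # True
print(read_text_last_block(short_data))               # b'before'
```

Under the "anywhere" policy a stray 0x1a in the middle of a long file silently truncates the read, which matches the symptom reported at the start of the thread.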
Re: are there some special about '\x1a' symbol
Steve Holden st...@holdenweb.com wrote:
> Unknown wrote:
>> On 2009-01-12, John Machin sjmac...@lexicon.net wrote:
>> I believe that feature was inherited by CP/M from DEC OSes (RSX-11 or RSTS-11). AFAICT, all of CP/M's file I/O API (including the FCB) was lifted almost directly from DEC's PDP-11 stuff, which probably copied it from PDP-8 stuff. Perhaps in the early 60's somebody at DEC had a reason. The really interesting thing is that we're still suffering because of it 40+ years later.
> I suspect this is probably a leftover from some paper tape data formats, when it was easier to detect the end of a file with a sentinel byte than it was to detect run-off as end of file. It could easily date back to the PDP-8.

We can count ourselves fortunate that the ASCII chars for file separator, group separator, record separator and unit separator did not catch on in a big way in file formatting.

(remembering the Pick OS running on Reality...)

- Hendrik
Re: are there some special about '\x1a' symbol
On 2009-01-14, Steve Holden st...@holdenweb.com wrote:
> Unknown wrote:
>> On 2009-01-12, John Machin sjmac...@lexicon.net wrote:
>>> I didn't think your question was stupid. Stupid was (a) CP/M recording file size as number of 128-byte sectors, forcing the use of an in-band EOF marker for text files (b) MS continuing to regard Ctrl-Z as an EOF decades after people stopped writing Ctrl-Z at the end of text files.
>> I believe that feature was inherited by CP/M from DEC OSes (RSX-11 or RSTS-11). AFAICT, all of CP/M's file I/O API (including the FCB) was lifted almost directly from DEC's PDP-11 stuff, which probably copied it from PDP-8 stuff. Perhaps in the early 60's somebody at DEC had a reason. The really interesting thing is that we're still suffering because of it 40+ years later.
> I suspect this is probably a leftover from some paper tape data formats, when it was easier to detect the end of a file with a sentinel byte than it was to detect run-off as end of file. It could easily date back to the PDP-8.

You're probably right. That's why the delete character is all 1's (all holes). It's easy to punch more holes -- un-punching them is pretty arduous.

--
Grant
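The "all holes" point can be checked with a little bit arithmetic. A small sketch, assuming 7-bit ASCII where a punched hole is a 1 bit; `overpunch` is a made-up name for punching every remaining hole over an existing character.

```python
DEL = 0x7F  # 0b1111111: all seven holes punched


def overpunch(code: int) -> int:
    """Punch every remaining hole in a 7-bit character (holes can only be added)."""
    return code | DEL


print(bin(DEL))                                       # 0b1111111
print(all(overpunch(c) == DEL for c in range(128)))   # True
```

Whatever character was on the tape, overpunching turns it into DEL, so a reader that skips DEL effectively erases it.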
Re: [OT] Re: are there some special about '\x1a' symbol
On 2009-01-14, Gabriel Genellina gagsl-...@yahoo.com.ar wrote:
> On Tue, 13 Jan 2009 22:04:33 -0200, Terry Reedy tjre...@udel.edu wrote:
>> Gabriel Genellina wrote:
>>> On Mon, 12 Jan 2009 12:00:16 -0200, John Machin sjmac...@lexicon.net wrote:
>>>> I didn't think your question was stupid. Stupid was (a) CP/M recording file size as number of 128-byte sectors, forcing the use of an in-band EOF marker for text files (b) MS continuing to regard Ctrl-Z as an EOF decades after people stopped writing Ctrl-Z at the end of text files.
>>> This is called backwards compatibility and it's a good thing :)
>> But it does not have to be the default or only behavior to be available.
> Sure. And it isn't

It _is_ the default behavior on some systems. The default file mode when you open a file in C or in Python is text mode. In text mode, Windows interprets a ctrl-Z as EOF, doesn't it?

--
Grant
Re: are there some special about '\x1a' symbol
Steve Holden wrote:
> Unknown wrote:
>> On 2009-01-12, John Machin sjmac...@lexicon.net wrote:
>>> I didn't think your question was stupid. Stupid was (a) CP/M recording file size as number of 128-byte sectors, forcing the use of an in-band EOF marker for text files (b) MS continuing to regard Ctrl-Z as an EOF decades after people stopped writing Ctrl-Z at the end of text files.
>> I believe that feature was inherited by CP/M from DEC OSes (RSX-11 or RSTS-11). AFAICT, all of CP/M's file I/O API (including the FCB) was lifted almost directly from DEC's PDP-11 stuff, which probably copied it from PDP-8 stuff. Perhaps in the early 60's somebody at DEC had a reason. The really interesting thing is that we're still suffering because of it 40+ years later.
> I suspect this is probably a leftover from some paper tape data formats, when it was easier to detect the end of a file with a sentinel byte than it was to detect run-off as end of file. It could easily date back to the PDP-8.

Perhaps, although in ASCII it's the SUB symbol: "A control character that is used in the place of a character that is recognized to be invalid or in error or that cannot be represented on a given device." [Wikipedia]. There were other codes defined for End-of-Text and File-Separator. Unless the protocol were one of DEC's own.

The fact that it's Ctrl-last-letter-of-the-alphabet makes me suspect that it was picked in a pretty informal way.

Mel.
Re: are there some special about '\x1a' symbol
Raps cane on floor.

It's probably an end-of-file sentinel because 'Z' is the last letter of the alphabet. I suspect it comes from MIT. Unix, developed at a telephone company, uses \x04, which was, in fact, the ASCII in-band end-of-transmission code and would disconnect a teletype.

Does this qualify me for the dinosaur award?

R Fritz

On 2009-01-14 07:15:33 -0800, Mel mwil...@the-wire.com said:
> Steve Holden wrote:
>> Unknown wrote:
>>> On 2009-01-12, John Machin sjmac...@lexicon.net wrote:
>>>> I didn't think your question was stupid. Stupid was (a) CP/M recording file size as number of 128-byte sectors, forcing the use of an in-band EOF marker for text files (b) MS continuing to regard Ctrl-Z as an EOF decades after people stopped writing Ctrl-Z at the end of text files.
>>> I believe that feature was inherited by CP/M from DEC OSes (RSX-11 or RSTS-11). AFAICT, all of CP/M's file I/O API (including the FCB) was lifted almost directly from DEC's PDP-11 stuff, which probably copied it from PDP-8 stuff. Perhaps in the early 60's somebody at DEC had a reason. The really interesting thing is that we're still suffering because of it 40+ years later.
>> I suspect this is probably a leftover from some paper tape data formats, when it was easier to detect the end of a file with a sentinel byte than it was to detect run-off as end of file. It could easily date back to the PDP-8.
> Perhaps, although in ASCII it's the SUB symbol: "A control character that is used in the place of a character that is recognized to be invalid or in error or that cannot be represented on a given device." [Wikipedia]. There were other codes defined for End-of-Text and File-Separator. Unless the protocol were one of DEC's own.
> The fact that it's Ctrl-last-letter-of-the-alphabet makes me suspect that it was picked in a pretty informal way.
> Mel.
Re: are there some special about '\x1a' symbol
On 12 Jan, 16:00, John Machin sjmac...@lexicon.net wrote:
> On Jan 13, 12:45 am, sim.sim maksim.kasi...@gmail.com wrote:
>> On 10 Jan, 23:40, John Machin sjmac...@lexicon.net wrote:
>>> On Jan 11, 2:45 am, sim.sim maksim.kasi...@gmail.com wrote:
>>>> Hi all!
>>>> I had touch with some different python behavior: I was tried to write into a file a string with the '\x1a' symbol, and for FreeBSD system, it gives expected result:
>>>> >>> open("test", "w").write('before\x1aafter')
>>>> >>> open('test').read()
>>>> 'before\x1aafter'
>>>> but for my WinXP box, it gives some strange:
>>>> >>> open("test", "w").write('before\x1aafter')
>>>> >>> open('test').read()
>>>> 'before'
>>>> Here I can write all symbols, but not read. I've tested it with python 2.6, 2.5 and 2.2 and WinXP SP2.
>>>> Why is it so and is it possible to fix it?
>>> You've already got two good answers, but this might add a little more explanation:
>>> You will be aware that in Windows Command Prompt, to exit the interactive mode of Python (among others), you need to type Ctrl-Z ...
>>> | C:\junk>python
>>> | Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)] on win32
>>> | Type "help", "copyright", "credits" or "license" for more information.
>>> | >>> problem = '\x1a'
>>> | >>> ord(problem)
>>> | 26
>>> | >>> # What is the 26th letter of the English/ASCII alphabet?
>>> | ...
>>> | >>> ^Z
>>> |
>>> | C:\junk>
>>> HTH,
>>> John
>> Hi John,
>> I agree - those two answers are really good. Thanks to Mel and Marc. I'm sorry if my stupid question annoyed you.
> I didn't think your question was stupid. Stupid was (a) CP/M recording file size as number of 128-byte sectors, forcing the use of an in-band EOF marker for text files (b) MS continuing to regard Ctrl-Z as an EOF decades after people stopped writing Ctrl-Z at the end of text files.
> And I wasn't annoyed either ... I was merely adding the information that Ctrl-Z and '\x1a' were the same thing; many people don't make the connection.
> Cheers,
> John

Ah John, thank you for your explanations! My first impression was that your comments did not relate to my question, but I've found new things where I used to think there was nothing.

Now I'm curious: what reasons are there to use open(.., 'r') instead of open(.., 'rb')? Using open(.., 'r') can be confusing; are there scenarios where open(.., 'rb') would confuse us?

--
Maksim
Re: are there some special about '\x1a' symbol
On Jan 13, 10:12 pm, sim.sim maksim.kasi...@gmail.com wrote:
> Ah John, thank you for your explanations! My first impression was that your comments did not relate to my question, but I've found new things where I used to think there was nothing.
> Now I'm curious: what reasons are there to use open(.., 'r') instead of open(.., 'rb')? Using open(.., 'r') can be confusing; are there scenarios where open(.., 'rb') would confuse us?

Some general rules: if you regard a file as text, open it with "rt" -- the "t" is redundant but gives you and anyone else who reads your code the assurance that you've actually thought about it. Otherwise you regard the file as binary, and open it with "rb".

The distinction was always important on Windows (because of the special handling of newlines and '\x1a') but largely unimportant on *x boxes. With Python 3.0, it is important for all users to specify the mode that they really need: 'b' files read and write bytes objects whereas 't' files read and write str objects, have the newline etc changes, and need an encoding to decode the raw bytes into str (Unicode) objects -- and you can't use bytes objects directly with a 't' file nor str objects directly with a 'b' file.

HTH,
John
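The Python 3 rules above can be tried out with a short sketch. It assumes Python 3 and a throwaway file in a temporary directory; the file name `demo.txt` and the variable names are only for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Text mode: write and read str; the encoding maps str to raw bytes and back.
# newline="\n" keeps the bytes on disk identical on every platform.
with open(path, "wt", encoding="utf-8", newline="\n") as f:
    f.write("caf\u00e9\n")

with open(path, "rt", encoding="utf-8") as f:
    as_text = f.read()

# Binary mode: the same file comes back as an undecoded bytes object.
with open(path, "rb") as f:
    as_bytes = f.read()

print(ascii(as_text))   # 'caf\xe9\n'
print(as_bytes)         # b'caf\xc3\xa9\n'

# Handing a str to a binary-mode file is rejected with TypeError.
try:
    with open(path, "wb") as f:
        f.write("text")
except TypeError as exc:
    mixed_error = type(exc).__name__

print(mixed_error)      # TypeError
```

Spelling out "rt" or "rb", plus an explicit encoding for text, makes the intent visible to the next reader, which is the point of the advice above.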
Re: [OT] Re: are there some special about '\x1a' symbol
Gabriel Genellina wrote:
> On Mon, 12 Jan 2009 12:00:16 -0200, John Machin sjmac...@lexicon.net wrote:
>> I didn't think your question was stupid. Stupid was (a) CP/M recording file size as number of 128-byte sectors, forcing the use of an in-band EOF marker for text files (b) MS continuing to regard Ctrl-Z as an EOF decades after people stopped writing Ctrl-Z at the end of text files.
> This is called backwards compatibility and it's a good thing :)

But it does not have to be the default or only behavior to be available.

> Consider the Atucha II nuclear plant, started in 1980, based on a design from 1965, and still unfinished. People require access to the complete design, plans, specifications, CAD drawings... decades after they were initially written.
> I actually do use (and maintain! -- ugh!) some DOS programs. Some people would have a hard time if they could not read their old data with new programs.
> Even Python has a print statement decades after nobody uses a teletype terminal anymore...
Re: are there some special about '\x1a' symbol
On 2009-01-12, John Machin sjmac...@lexicon.net wrote:
> I didn't think your question was stupid. Stupid was (a) CP/M recording file size as number of 128-byte sectors, forcing the use of an in-band EOF marker for text files (b) MS continuing to regard Ctrl-Z as an EOF decades after people stopped writing Ctrl-Z at the end of text files.

I believe that feature was inherited by CP/M from DEC OSes (RSX-11 or RSTS-11). AFAICT, all of CP/M's file I/O API (including the FCB) was lifted almost directly from DEC's PDP-11 stuff, which probably copied it from PDP-8 stuff.

Perhaps in the early 60's somebody at DEC had a reason. The really interesting thing is that we're still suffering because of it 40+ years later.

--
Grant Edwards          grante          Yow! I want to read my new
                         at            poem about pork brains and
                      visi.com         outer space ...
Re: are there some special about '\x1a' symbol
Unknown wrote:
> On 2009-01-12, John Machin sjmac...@lexicon.net wrote:
>> I didn't think your question was stupid. Stupid was (a) CP/M recording file size as number of 128-byte sectors, forcing the use of an in-band EOF marker for text files (b) MS continuing to regard Ctrl-Z as an EOF decades after people stopped writing Ctrl-Z at the end of text files.
> I believe that feature was inherited by CP/M from DEC OSes (RSX-11 or RSTS-11). AFAICT, all of CP/M's file I/O API (including the FCB) was lifted almost directly from DEC's PDP-11 stuff, which probably copied it from PDP-8 stuff. Perhaps in the early 60's somebody at DEC had a reason. The really interesting thing is that we're still suffering because of it 40+ years later.

I suspect this is probably a leftover from some paper tape data formats, when it was easier to detect the end of a file with a sentinel byte than it was to detect run-off as end of file. It could easily date back to the PDP-8.

regards
 Steve
--
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC              http://www.holdenweb.com/
Re: are there some special about '\x1a' symbol
On 10 Jan, 23:40, John Machin sjmac...@lexicon.net wrote:
> On Jan 11, 2:45 am, sim.sim maksim.kasi...@gmail.com wrote:
>> Hi all!
>> I had touch with some different python behavior: I was tried to write into a file a string with the '\x1a' symbol, and for FreeBSD system, it gives expected result:
>> >>> open("test", "w").write('before\x1aafter')
>> >>> open('test').read()
>> 'before\x1aafter'
>> but for my WinXP box, it gives some strange:
>> >>> open("test", "w").write('before\x1aafter')
>> >>> open('test').read()
>> 'before'
>> Here I can write all symbols, but not read. I've tested it with python 2.6, 2.5 and 2.2 and WinXP SP2.
>> Why is it so and is it possible to fix it?
> You've already got two good answers, but this might add a little more explanation:
> You will be aware that in Windows Command Prompt, to exit the interactive mode of Python (among others), you need to type Ctrl-Z ...
> | C:\junk>python
> | Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)] on win32
> | Type "help", "copyright", "credits" or "license" for more information.
> | >>> problem = '\x1a'
> | >>> ord(problem)
> | 26
> | >>> # What is the 26th letter of the English/ASCII alphabet?
> | ...
> | >>> ^Z
> |
> | C:\junk>
> HTH,
> John

Hi John,

I agree - those two answers are really good. Thanks to Mel and Marc. I'm sorry if my stupid question annoyed you.

--
Maksim
Re: are there some special about '\x1a' symbol
On Jan 13, 12:45 am, sim.sim maksim.kasi...@gmail.com wrote:
> On 10 Jan, 23:40, John Machin sjmac...@lexicon.net wrote:
>> On Jan 11, 2:45 am, sim.sim maksim.kasi...@gmail.com wrote:
>>> Hi all!
>>> I had touch with some different python behavior: I was tried to write into a file a string with the '\x1a' symbol, and for FreeBSD system, it gives expected result:
>>> >>> open("test", "w").write('before\x1aafter')
>>> >>> open('test').read()
>>> 'before\x1aafter'
>>> but for my WinXP box, it gives some strange:
>>> >>> open("test", "w").write('before\x1aafter')
>>> >>> open('test').read()
>>> 'before'
>>> Here I can write all symbols, but not read. I've tested it with python 2.6, 2.5 and 2.2 and WinXP SP2.
>>> Why is it so and is it possible to fix it?
>> You've already got two good answers, but this might add a little more explanation:
>> You will be aware that in Windows Command Prompt, to exit the interactive mode of Python (among others), you need to type Ctrl-Z ...
>> | C:\junk>python
>> | Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)] on win32
>> | Type "help", "copyright", "credits" or "license" for more information.
>> | >>> problem = '\x1a'
>> | >>> ord(problem)
>> | 26
>> | >>> # What is the 26th letter of the English/ASCII alphabet?
>> | ...
>> | >>> ^Z
>> |
>> | C:\junk>
>> HTH,
>> John
> Hi John,
> I agree - those two answers are really good. Thanks to Mel and Marc. I'm sorry if my stupid question annoyed you.

I didn't think your question was stupid. Stupid was (a) CP/M recording file size as number of 128-byte sectors, forcing the use of an in-band EOF marker for text files (b) MS continuing to regard Ctrl-Z as an EOF decades after people stopped writing Ctrl-Z at the end of text files.

And I wasn't annoyed either ... I was merely adding the information that Ctrl-Z and '\x1a' were the same thing; many people don't make the connection.

Cheers,
John
[OT] Re: are there some special about '\x1a' symbol
On Mon, 12 Jan 2009 12:00:16 -0200, John Machin sjmac...@lexicon.net wrote:
> I didn't think your question was stupid. Stupid was (a) CP/M recording file size as number of 128-byte sectors, forcing the use of an in-band EOF marker for text files (b) MS continuing to regard Ctrl-Z as an EOF decades after people stopped writing Ctrl-Z at the end of text files.

This is called backwards compatibility and it's a good thing :)

Consider the Atucha II nuclear plant, started in 1980, based on a design from 1965, and still unfinished. People require access to the complete design, plans, specifications, CAD drawings... decades after they were initially written.

I actually do use (and maintain! -- ugh!) some DOS programs. Some people would have a hard time if they could not read their old data with new programs.

Even Python has a print statement decades after nobody uses a teletype terminal anymore...

--
Gabriel Genellina
Re: are there some special about '\x1a' symbol
On Sat, 10 Jan 2009 07:45:53 -0800, sim.sim wrote:
> I had touch with some different python behavior: I was tried to write into a file a string with the '\x1a' symbol, and for FreeBSD system, it gives expected result:
> >>> open("test", "w").write('before\x1aafter')
> >>> open('test').read()
> 'before\x1aafter'
> but for my WinXP box, it gives some strange:
> >>> open("test", "w").write('before\x1aafter')
> >>> open('test').read()
> 'before'
> Here I can write all symbols, but not read. I've tested it with python 2.6, 2.5 and 2.2 and WinXP SP2.
> Why is it so and is it possible to fix it?

\x1a is treated as an end-of-text character in text files by Windows. So if you want all the data, unaltered, open the file in binary mode ('rb' and 'wb').

Ciao,
 Marc 'BlackJack' Rintsch
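The binary-mode advice above can be sketched as a round trip. This assumes Python 3 syntax (a bytes literal for the payload) and a scratch file in a temporary directory; the path is a placeholder, not the poster's file.

```python
import os
import tempfile

payload = b"before\x1aafter"
path = os.path.join(tempfile.mkdtemp(), "test")

with open(path, "wb") as f:   # 'wb': bytes are written exactly as given
    f.write(payload)

with open(path, "rb") as f:   # 'rb': no EOF-marker or newline handling
    data = f.read()

print(data == payload)   # True
print(len(data))         # 12
```

Because neither call uses text mode, the result does not depend on how the platform's C runtime treats 0x1a.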
Re: are there some special about '\x1a' symbol
sim.sim wrote:
> Hi all!
> I had touch with some different python behavior: I was tried to write into a file a string with the '\x1a' symbol, and for FreeBSD system, it gives expected result:
> >>> open("test", "w").write('before\x1aafter')
> >>> open('test').read()
> 'before\x1aafter'
> but for my WinXP box, it gives some strange:
> >>> open("test", "w").write('before\x1aafter')
> >>> open('test').read()
> 'before'
> Here I can write all symbols, but not read. I've tested it with python 2.6, 2.5 and 2.2 and WinXP SP2.
> Why is it so and is it possible to fix it?

'\x1a' is the End-of-file mark that Windows inherited from MS-DOS and CP/M. The underlying Windows libraries honour it for files opened in text mode.

    open('test', 'rb').read()

will read the whole file.

Mel.
Re: are there some special about '\x1a' symbol
On Jan 11, 2:45 am, sim.sim maksim.kasi...@gmail.com wrote:
> Hi all!
> I had touch with some different python behavior: I was tried to write into a file a string with the '\x1a' symbol, and for FreeBSD system, it gives expected result:
> >>> open("test", "w").write('before\x1aafter')
> >>> open('test').read()
> 'before\x1aafter'
> but for my WinXP box, it gives some strange:
> >>> open("test", "w").write('before\x1aafter')
> >>> open('test').read()
> 'before'
> Here I can write all symbols, but not read. I've tested it with python 2.6, 2.5 and 2.2 and WinXP SP2.
> Why is it so and is it possible to fix it?

You've already got two good answers, but this might add a little more explanation:

You will be aware that in Windows Command Prompt, to exit the interactive mode of Python (among others), you need to type Ctrl-Z ...

| C:\junk>python
| Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)] on win32
| Type "help", "copyright", "credits" or "license" for more information.
| >>> problem = '\x1a'
| >>> ord(problem)
| 26
| >>> # What is the 26th letter of the English/ASCII alphabet?
| ...
| >>> ^Z
|
| C:\junk>

HTH,
John
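The link between Ctrl-Z and '\x1a' follows from how ASCII control codes line up with letters: Ctrl plus a letter keeps only the low five bits of the letter's code. A small sketch, where `ctrl` is a made-up helper name:

```python
def ctrl(letter: str) -> str:
    """Return the control character typed as Ctrl-<letter> (ASCII letters only)."""
    return chr(ord(letter.upper()) & 0x1F)


print(repr(ctrl("Z")))   # '\x1a'
print(ord(ctrl("Z")))    # 26
print(repr(ctrl("D")))   # '\x04'
```

So Ctrl-Z is code 26 because Z is the 26th letter, and Ctrl-D, the Unix terminal's end-of-input key, is code 4.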