Re: [Tutor] checking if data files are good, readable, and exist

2008-07-22 Thread Alan Gauld
"bob gailer" <[EMAIL PROTECTED]> wrote 


But that isn't. Python just reads the data and interprets it
as text if you specify a text file - the default - or as raw data
if you use rb.


But it DOES handle line-ends in an OS independent manner. 


OK, I'll grant you that small piece of data manipulation. :-)

(Although it could be argued that even that is just translating 
two bytes into one character in the set.)


Alan G.


___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] checking if data files are good, readable, and exist

2008-07-22 Thread bob gailer

Alan Gauld wrote:

But that isn't. Python just reads the data and interprets it
as text if you specify a text file - the default - or as raw data
if you use rb.


But it DOES handle line-ends in an OS independent manner. Windows uses 
CR-LF as a line end, whereas Unix, Linux and Mac OS X use just LF (classic 
Mac OS used just CR).


Python presents line-ends uniformly as \n when you open the file in text 
mode.


Witness: (on Windows)

>>> f = open('c:/foo.txt', 'r')
>>> f.read()
'this line contains as\nbut this has as\n'
>>> f = open('c:/foo.txt', 'rb')
>>> f.read()
'this line contains as\r\nbut this has as\r\n'
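
For readers who want to reproduce this without a c:/foo.txt to hand, here is a minimal sketch. It assumes Python 3 (the session above is Python 2), where text mode translates '\r\n' to '\n' on every platform by default, and it uses a throwaway temporary file rather than any file from the thread.

```python
import os
import tempfile

# Write two lines with Windows-style CR-LF endings, byte for byte.
raw = b"first line\r\nsecond line\r\n"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(raw)
    path = tmp.name

try:
    # Text mode (default newline handling): '\r\n' is presented as '\n'.
    with open(path, "r", encoding="ascii") as f:
        as_text = f.read()
    # Binary mode: the bytes come back exactly as stored.
    with open(path, "rb") as f:
        as_bytes = f.read()
finally:
    os.remove(path)

print(repr(as_text))
print(repr(as_bytes))
```

The only difference between the two reads is the line-end translation; the rest of the data is identical.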

[snip]

--
Bob Gailer
919-636-4239 Chapel Hill, NC



Re: [Tutor] checking if data files are good, readable, and exist

2008-07-22 Thread Alan Gauld

"W W" <[EMAIL PROTECTED]> wrote


Am I wrong in thinking that /all/ files are stored as binary?


No, that's quite right.


python opens them, it automagically opens them in a
more readable format,


But that isn't. Python just reads the data and interprets it
as text if you specify a text file - the default - or as raw data
if you use rb.

Python doesn't alter the data in any way; it simply assumes
that it's text and interprets the bytes according to the current
character set. Thus it reads the value 65 and interprets it as 'A'
(assuming ASCII) in text mode or just as the bit pattern
01000001 in binary mode. The application must then interpret the
bits in whatever way it considers appropriate - as an integer,
a bitmask, part of a graphic image etc.
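
The point about one byte having several readings can be sketched in a few lines. This assumes Python 3 and ASCII, and the variable names are purely illustrative.

```python
# One stored byte with the value 65.
stored = bytes([65])

# Read as ASCII text, it is the letter 'A'.
as_char = stored.decode("ascii")

# Read as raw data, it is just eight bits.
as_bits = format(stored[0], "08b")

# Read as an unsigned integer, it is the number 65.
as_int = int.from_bytes(stored, "big")

print(as_char, as_bits, as_int)
```

Nothing in the byte itself says which of these readings is the intended one.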

The important point is that there is no distinction between
binary data and text data in the file itself; it's just how it is
interpreted that distinguishes them. (This is not completely
true on some OSes, where text files always have an EOF marker,
but that is itself just a binary value!)

None of which helps the OP other than to highlight the difficulty
of determining if a file is binary or not. We can sometimes
tell if a file is not text - if it uses ASCII - by looking at the
range of byte values, but that's sloowww... but we can never be
sure that a file is non-text. (We can also check for common
file headers such as PostScript, GIF, MP3, JPEG, MIDI, etc.,
but even they can be misleading if they just coincidentally
look valid.)
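
Both heuristics can be sketched roughly as follows, assuming Python 3. The function names, the set of allowed control characters and the short table of signatures are illustrative choices, not a complete or authoritative list, and as noted above either check can be fooled.

```python
# Control characters that routinely appear in plain text: tab, LF, CR.
TEXT_CONTROL = {9, 10, 13}

def looks_like_ascii_text(data: bytes) -> bool:
    """True if every byte is printable ASCII or a common text control char."""
    return all(32 <= b <= 126 or b in TEXT_CONTROL for b in data)

# A few well-known leading signatures (illustrative subset only).
MAGIC = {
    b"%!PS": "PostScript",
    b"GIF87a": "GIF",
    b"GIF89a": "GIF",
    b"\xff\xd8\xff": "JPEG",
    b"MThd": "MIDI",
}

def sniff_header(data: bytes):
    """Return a format name if data starts with a known signature, else None."""
    for magic, name in MAGIC.items():
        if data.startswith(magic):
            return name
    return None

print(looks_like_ascii_text(b"hello\r\n"))   # plain ASCII text
print(sniff_header(b"GIF89a\x01\x00"))       # starts with a GIF signature
print(sniff_header(b"MThd is just a word"))  # text that merely looks like MIDI
```

The last call shows the "coincidentally look valid" problem: ordinary text that happens to start with the right four letters is reported as MIDI.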

HTH,

Alan G. 





Re: [Tutor] checking if data files are good, readable, and exist

2008-07-22 Thread W W
On Tue, Jul 22, 2008 at 10:40 AM, Bryan Fodness <[EMAIL PROTECTED]>
wrote:

> I would like to check to see if the data files are good, readable, and
> exist.  I have checked to see if they exist, but there is a possibility that
> the data file might be binary, and I would like to have a sys.exit for that
> as well.
>
> if not os.path.isfile(A_data) or not os.path.isfile(B_data)\
>    or not os.path.isfile(C_data) or not os.path.isfile(D_data):
>     sys.exit(14)
>

Am I wrong in thinking that /all/ files are stored as binary? And then when
python opens them, it automagically opens them in a more readable format,
unless you open them in binary with "rb" or similar command?

-Wayne

-- 
To be considered stupid and to be told so is more painful than being called
gluttonous, mendacious, violent, lascivious, lazy, cowardly: every weakness,
every vice, has found its defenders, its rhetoric, its ennoblement and
exaltation, but stupidity hasn't. - Primo Levi


Re: [Tutor] checking if data files are good, readable, and exist

2008-07-22 Thread Tim Golden

Bryan Fodness wrote:
I would like to check to see if the data files are good, readable, and 
exist.  I have checked to see if they exist, but there is a possibility 
that the data file might be binary, and I would like to have a sys.exit 
for that as well.


You're going to be asked a lot of questions along the lines
of "What do you mean by binary?". I'm going to assume you
mean: has things other than ordinary letters, numbers
and punctuation in it. In today's internationalised and
Unicoded world that's a highly dodgy assumption, but I'm 
going to go with it.


To compound the crudeness of my approach, I'm going to
assume that anything > 126 is "binary" (thus dodging
the more complicated issue of the 0-31 control chars).


def is_binary (filename):
 return any (ord (c) > 126 for c in open (filename).read ())

print is_binary ("file1.txt")



Obviously, if you know any of the files is going to be
massive, you'll want to do something a bit smarter than
this.

TJG


[Tutor] checking if data files are good, readable, and exist

2008-07-22 Thread Bryan Fodness
I would like to check to see if the data files are good, readable, and
exist.  I have checked to see if they exist, but there is a possibility that
the data file might be binary, and I would like to have a sys.exit for that
as well.

if not os.path.isfile(A_data) or not os.path.isfile(B_data)\
   or not os.path.isfile(C_data) or not os.path.isfile(D_data):
    sys.exit(14)
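
A sketch of how the existence test might be extended to cover "readable" as well, assuming Python 3. The helper name check_files is hypothetical; os.access with os.R_OK only reports what the OS permissions allow at that moment, so actually opening the file can still fail.

```python
import os
import sys

def check_files(paths):
    """Return the first path that is missing or unreadable, else None."""
    for path in paths:
        if not os.path.isfile(path) or not os.access(path, os.R_OK):
            return path
    return None

# Intended use, with the four names from the post:
#
#     bad = check_files([A_data, B_data, C_data, D_data])
#     if bad is not None:
#         sys.exit(14)
```

Looping over a list keeps the condition short and makes it easy to report which file caused the exit.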




-- 
"The game of science can accurately be described as a never-ending insult to
human intelligence." - João Magueijo