On Apr 2, 8:39 am, John Machin <sjmac...@lexicon.net> wrote:
> On Apr 1, 4:59 pm, Dennis Lee Bieber <wlfr...@ix.netcom.com> wrote:
>
> > On Tue, 31 Mar 2009 14:26:08 -0700 (PDT), ritu
> > <ritu_bhandar...@yahoo.com> declaimed the following in
> > gmane.comp.python.general:
>
> > > if ( ( -B $filename ||
> > >        $filename =~ /\.pdf$/ ) &&
> > >      -s $filename > 0 ) {
> > >     return(1);
> > > }
>
> >     According to my old copy of the Camel, -B only reads the "first
> > block" of the file. If the block contains a <NUL>, or if ~30% of the
> > block contains bytes >127 or from some (undefined) set of control
> > characters (that is, I expect it does not count <LF>, <CR>, <TAB>, <VT>,
> > <FF>, maybe some others)... So...
>
> Not sure whether this is meant to be rough pseudocode or an April 1
> "jeu d'esprit" or ...
>
> > def isbin(fid):
> >     fin = open(fid, "r")
>
> (1) mode = "rb" might be better
>
> >     block = fin.read(1024) #what is the size of a "block" these days
> >     binary = "\0" in block
> >     if not binary:
> >         mrkrs = [b for b in block
> >                     if b > 127
>
> (2) [assuming Python 2.x]
> b is a str object; change 127 to "\x3f"

Gah ... it must be gamma rays from outer space! Trying again:
change 127 to "\x7f" (and actually "\x7e" would be a better choice)

> >                         or b in [ "\r", "\n", "\t"
> > ] ]     #add needed
>
> (3) surely you mean "b not in"

take 2:
surely you mean ... or b < "\x20" and b not in "\r\n\t"
and at that stage the idea of making a set of chars before entering the
loop has some attraction :-)
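
For what it's worth, here is a rough sketch of how the corrections above might fit together. It is not Perl's actual -B algorithm, just the heuristic as described in this thread. Assumptions on my part: Python 3 (so iterating over the block yields ints rather than one-character strs), a 1024-byte "block", a 30% threshold, only "\r\n\t" exempted among the control characters, and an empty file counted as not binary. The names TEXT_BYTES, blocksize and threshold are made up for illustration.

```python
# Assumption: "text" bytes are printable ASCII (0x20..0x7e) plus CR, LF, TAB.
# Built once, before the loop, as suggested in the thread.
TEXT_BYTES = frozenset(range(0x20, 0x7F)) | frozenset(b"\r\n\t")


def isbin(path, blocksize=1024, threshold=0.30):
    """Guess whether a file is binary by inspecting its first block."""
    # Binary mode, so no newline translation or decoding gets in the way.
    with open(path, "rb") as fin:
        block = fin.read(blocksize)

    if not block:
        # Assumption: an empty file is treated as "not binary" here.
        return False

    # A NUL byte anywhere in the block settles it immediately.
    if b"\0" in block:
        return True

    # Count bytes above 0x7e, or below 0x20 and not in "\r\n\t".
    suspicious = sum(1 for b in block if b not in TEXT_BYTES)
    return suspicious / len(block) > threshold
```

Something like isbin("report.pdf") would then stand in for the -B part of the original Perl test; the /\.pdf$/ and -s checks would still need to be done separately (e.g. with str.endswith and os.path.getsize).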