Adriaan Renting wrote:
> def StripNoPrint(self, S):
>     from string import printable
>     return "".join([ ch for ch in S if ch in printable ])
>
>
> Adriaan Renting        | Email: [EMAIL PROTECTED]
> ASTRON                 | Phone: +31 521 595 217
> P.O. Box 2             | GSM:   +31 6 24 25 17 28
> NL-7990 AA Dwingeloo   | FAX:   +31 521 597 332
> The Netherlands        | Web: http://www.astron.nl/~renting/
>
>>>>"MKoool" <[EMAIL PROTECTED]> 07/16/05 2:33 AM >>>
>
> I have a file with binary and ascii characters in it. I massage the
> data and convert it to a more readable format, however it still comes
> up with some binary characters mixed in. I'd like to write something
> to just replace all non-printable characters with '' (I want to delete
> non-printable characters).
>
> I am having trouble figuring out an easy python way to do this... is
> the easiest way to just write some regular expression that does
> something like replace [^\p] with ''?
>
> Or is it better to go through every character and do ord(character),
> check the ascii values?
>
> What's the easiest way to do something like this?
>
> thanks
>

I'd consider using the string's translate() method for this. Provide it
with two arguments: the first should be a string of the 256 characters
with ordinals 0 to 255 (because you won't be changing any characters,
you need a translate table that effects the null transformation), and
the second argument should be a string containing all the characters
you want to remove.

So

 >>> tt = "".join([chr(i) for i in range(256)])

generates the null translate table quite easily. Then

 >>> import string
 >>> ds = tt.translate(tt, string.printable)

sets ds to be all the non-printable characters (according to the string
module, anyway). Now you should be able to remove the non-printable
characters from s by writing

    s = s.translate(tt, ds)

regards
 Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC             http://www.holdenweb.com/
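
For anyone reading this on Python 3: the two-argument translate() call above is the Python 2 str form. As far as I know, in Python 3 that form exists only on bytes, where passing None as the table means "delete only, remap nothing". The sketch below is one possible adaptation, not code from the thread; the function names strip_nonprintable_bytes and strip_nonprintable_text are made up for illustration, and "printable" still means whatever string.printable says (ASCII only).

```python
import string

# Assumption: Python 3. The helper names below are hypothetical.

def strip_nonprintable_bytes(data: bytes) -> bytes:
    """Delete every byte that is not in string.printable."""
    printable = string.printable.encode("ascii")
    # Start from all 256 byte values and delete the printable ones,
    # leaving the set of bytes we want to remove (the "ds" of the post).
    nonprintable = bytes(range(256)).translate(None, printable)
    # table=None is the null transformation; second argument is deleted.
    return data.translate(None, nonprintable)


def strip_nonprintable_text(text: str) -> str:
    """Same idea for str, using a membership test per character."""
    keep = set(string.printable)
    return "".join(ch for ch in text if ch in keep)


if __name__ == "__main__":
    print(strip_nonprintable_bytes(b"ab\x00c\xff\n"))
    print(strip_nonprintable_text("h\x00i\x7f!"))
```

Note that string.printable includes whitespace such as tab, newline, vertical tab and form feed, so those survive; if you want them gone too, build the keep set from string.digits + string.ascii_letters + string.punctuation + " " instead.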