I looked for "VAV" in the files in the "encodings" directory
(/usr/lib/python2.4/encodings/*.py on my machine). The following
character encodings seem to include Hebrew characters:

    cp1255
    cp424
    cp856
    cp862
    iso8859-8

A file containing Hebrew text might be in any one of these encodings,
or in any Unicode-based encoding.
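One way to narrow the list down for a particular file is to try decoding its bytes with each candidate and keep the encodings that yield actual Hebrew letters. The following is only a rough sketch under stated assumptions: the helper name `plausible_encodings`, the `CANDIDATES` list, and the sample bytes are made up for illustration, and it is written for a current Python rather than 2.4. It is a heuristic, not real encoding detection.

```python
# Hypothetical helper: which candidate encodings turn these bytes
# into text that contains at least one Hebrew letter?
CANDIDATES = ["cp1255", "cp424", "cp856", "cp862", "iso8859-8", "utf-8"]

def plausible_encodings(data, candidates=CANDIDATES):
    hits = []
    for name in candidates:
        try:
            text = data.decode(name)
        except UnicodeDecodeError:
            # The bytes are not valid in this encoding at all.
            continue
        # Hebrew letters ALEF..TAV occupy U+05D0..U+05EA.
        if any(u"\u05d0" <= ch <= u"\u05ea" for ch in text):
            hits.append(name)
    return hits

# Sample data (an assumption for the demo): two VAVs encoded as cp1255.
sample = u"\u05d5\u05d5".encode("cp1255")
print(plausible_encodings(sample))
```

More than one encoding can come back as plausible, because cp1255 and iso8859-8 place the Hebrew letters at the same byte values. A successful decode therefore only rules encodings out; it does not prove which one the file was written in.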
To open an encoded file for reading, use

    f = codecs.open(file, 'r', encoding='...')

Now, calls like f.readline() will return unicode strings. Here's an
example, using a UTF-8 file I have lying around:

    >>> import codecs
    >>> f = codecs.open("/users/jepler/txt/UTF-8-demo.txt", "r", "utf-8")
    >>> for i in range(5): print repr(f.readline())
    ...
    u'UTF-8 encoded sample plain-text file\n'
    u'\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\u203e\n'
    u'\n'
    u'Markus Kuhn [\u02c8ma\u02b3k\u028as ku\u02d0n] <[EMAIL PROTECTED]> \u2014 1999-08-20\n'
    u'\n'

Jeff