On 28/06/2006 9:44 AM, Mike Currie wrote:
>
> What I am doing is converting data for processing that will be tab (for
> columns) and newline (for row) delimited. Some of the data contains tabs
> and newlines so, I have to convert them to something else so the file
> integrity is good.
>
> Not my idea, I've been left with the implementation however.
>

Do you *need* UTF-8? Or is that only there to hide away the \x88 and \x83? Apart from tab and linefeed, what (if any) other characters are there in the data that are not printable ASCII characters?

In any case, if you have 8-bit string data, the CSV file format would appear to meet the requirement: it preserves your data by "quoting" delimiters and newlines that appear in the actual data. The Python csv module has been included in every Python distribution since 2.3.

Cheers,
John
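
P.S. A minimal sketch of the csv suggestion, in case it helps. It assumes Python 3 syntax and made-up sample rows (nothing here comes from your actual data), and it keeps tab as the delimiter only because that is what you described. The point is that a field containing a tab or a newline gets quoted on output and comes back unchanged on input:

```python
import csv
import io

# Hypothetical sample rows; the second data field holds a tab and a newline.
rows = [
    ["id", "comment"],
    ["1", "has a\ttab and a\nnewline"],
]

# Write tab-delimited output into an in-memory buffer. With the default
# minimal quoting, the writer quotes any field containing the delimiter,
# the quote character, or a line-terminator character.
buf = io.StringIO(newline="")
writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
writer.writerows(rows)
text = buf.getvalue()

# Read it back with the same delimiter; the embedded tab and newline
# survive the round trip.
reader = csv.reader(io.StringIO(text, newline=""), delimiter="\t")
round_tripped = list(reader)
```

Note that this only works if whatever consumes the file also understands CSV quoting; a consumer that splits blindly on tab and newline would still be confused by the quoted field.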