On 2019-03-26 19:55, Adam Funk wrote:
Hi,

I have a Python 3 (using 3.6.7) program that reads a TSV file, does
some churning with the data, and writes a TSV file out.

#v+
print('reading', options.input_file)
with open(options.input_file, 'r', encoding='utf-8-sig') as f:
    for line in f:
        # strip the trailing newline so it does not end up in the last cell
        row = line.rstrip('\n').split('\t')
        # DO STUFF WITH THE CELLS IN THE ROW

# ...

print('writing', options.output_file)
with open(options.output_file, 'w', encoding='utf-8') as f:
    # MAKE THE HEADER list of str
    f.write('\t'.join(header) + '\n')

    for doc_id in sorted(all_ids):
        # CREATE A ROW list of str FOR EACH DOCUMENT ID
        f.write('\t'.join(row) + '\n')
#v-

I noticed that the file command on the output returns "UTF-8 Unicode
text, with very long lines, with LF, NEL line terminators".

I'd never come across NEL terminators until now, and I've never
(AFAIK) created a file with them before.  Any idea why this is
happening?

(I tried changing the input encoding from 'utf-8-sig' to 'utf-8' but
got the same results with the output.)

Does the input contain any NEL (U+0085) characters? Do the strings that
you write out contain them?
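
One way to check is to scan the lines for U+0085 directly. This is only a
rough sketch: find_nel and the sample rows are made up for illustration
and are not part of the original program.

```python
NEL = '\x85'  # U+0085 NEXT LINE

def find_nel(lines):
    """Return (line_number, column) pairs for every NEL character found."""
    hits = []
    for line_no, line in enumerate(lines, start=1):
        for col, ch in enumerate(line):
            if ch == NEL:
                hits.append((line_no, col))
    return hits

# Made-up sample rows; the second one has a NEL inside a cell.
sample = ['id\ttext\n', 'doc1\tfoo\x85bar\n', 'doc2\tbaz\n']
print(find_nel(sample))  # prints [(2, 8)]
```

The same function should work on an open text file, e.g.
find_nel(open(options.input_file, encoding='utf-8-sig')), because
iterating a text-mode file splits on '\n', '\r' and '\r\n' only, so any
NEL stays inside the line it belongs to. If the input has hits, the NELs
are most likely being carried through from the data to the output.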
--
https://mail.python.org/mailman/listinfo/python-list
