János Juhász wrote:
> Dear All,
>
> I would like to convert my DOS txt file into pdf with reportlab.
> The file can be seen correctly in Central European (DOS) encoding in
> Explorer.
>
> My winxp uses cp852 as default codepage.
>
> When I open the txt file in notepad and set OEM/DOS script for terminal
> fonts, it shows the file correctly.
>
> I tried to convert the file with the next way:
>
> from reportlab.platypus import *
> from reportlab.lib.styles import getSampleStyleSheet
> from reportlab.rl_config import defaultPageSize
> PAGE_HEIGHT=defaultPageSize[1]
>
> styles = getSampleStyleSheet()
>
> def MakePdfInvoice(InvoiceNum, page):
>     style = styles["Normal"]
>     PdfInv = [Spacer(0,0)]
>     PdfInv.append(Preformatted(page, styles['Normal']))
>     doc = SimpleDocTemplate(InvoiceNum)
>     doc.build(PdfInv)
>
> if __name__ == '__main__':
>     content = open('invoice01_0707.txt').readlines()
>     page = ''.join(content[:92])
>     page = unicode(page, 'Latin-1')

Why latin-1? Your file is in cp852, so try
    page = unicode(page, 'cp852')

>     MakePdfInvoice('test.pdf', page)
>
> But it made funny chars somewhere.
>
> I tried it so either
>
> if __name__ == '__main__':
>     content = open('invoice01_0707.txt').readlines()
>     page = ''.join(content[:92])
>     page = page.encode('cp852')

Use decode() here, not encode().
    decode() goes towards Unicode
    encode() goes away from Unicode

As a mnemonic I think of Unicode as pure unencoded data. (This is *not*
accurate, it is a memory aid!) Then it's easy to remember that decode()
removes encoding == convert to Unicode, encode() adds encoding ==
convert from Unicode.

>     MakePdfInvoice('test.pdf', page)
>
> But it raised exception:
>     debugger.run(codeObject, __main__.__dict__, start_stepping=0)
>   File
> "C:\Python24\Lib\site-packages\pythonwin\pywin\debugger\__init__.py", line
> 60, in run
>     _GetCurrentDebugger().run(cmd, globals,locals, start_stepping)
>   File
> "C:\Python24\Lib\site-packages\pythonwin\pywin\debugger\debugger.py", line
> 631, in run
>     exec cmd in globals, locals
>   File "D:\devel\reportlab\MakePdfInvoice.py", line 18, in ?
>     page = page.encode('cp852')
>   File "c:\Python24\lib\encodings\cp852.py", line 18, in encode
>     return codecs.charmap_encode(input,errors,encoding_map)
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xb5 in position 112:
> ordinal not in range(128)

When you call encode() on a string (instead of a unicode object), the
string is first decoded to Unicode using the ascii codec. That fails as
soon as the string contains a non-ASCII byte, such as the 0xb5 in your
file, which is why an encode() call ends in a UnicodeDecodeError.

Kent

_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
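
A minimal sketch of the decode()/encode() direction described above. It is written for Python 3, where byte strings (bytes) and Unicode text (str) are separate types; the thread itself uses Python 2.4, where unicode(page, 'cp852') plays the role of decode(). The sample bytes are made up for illustration, and only the 0xb5 byte is taken from the traceback.

```python
# Hypothetical sample: raw bytes as they might appear in a cp852 (DOS
# Central European) text file. Only the 0xb5 byte comes from the traceback.
raw = b"SZ\xb5MLA"

# decode(): bytes -> Unicode text, using the codec the file was written in.
text = raw.decode("cp852")

# Decoding with the wrong codec raises no error; it silently maps the
# bytes to different characters, which is where "funny chars" come from.
wrong = raw.decode("latin-1")

# encode(): Unicode text -> bytes. Round-tripping gives back the input.
round_trip = text.encode("cp852")

# Calling encode() on a Python 2 byte string first performed an implicit
# ASCII decode, equivalent to the call below, and failed on byte 0xb5.
try:
    raw.decode("ascii")
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = exc

print(repr(text), repr(wrong), round_trip == raw)
print(ascii_error)
```

In Python 3 the decoding can also happen while reading the file, by passing encoding='cp852' to open().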