expandtabs acts unexpectedly
Python 2.6.2 (release26-maint, Apr 19 2009, 01:56:41)
[GCC 4.3.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> ' test\ttest'.expandtabs(4)
' test   test'
>>> 'test \ttest'.expandtabs(4)
'test    test'

1st example: I expected 4 spaces between the two 'test's, but 3 were returned.
2nd example: I expected 5 spaces between the two 'test's, but 4 were returned.

Is this a bug or something else? Please advise.

--
http://mail.python.org/mailman/listinfo/python-list
Re: expandtabs acts unexpectedly
On Aug 19, 4:16 pm, Peter Brett pe...@peter-b.co.uk wrote:
> digisat...@gmail.com digisat...@gmail.com writes:
>
> > Python 2.6.2 (release26-maint, Apr 19 2009, 01:56:41)
> > [GCC 4.3.3] on linux2
> > Type "help", "copyright", "credits" or "license" for more information.
> > >>> ' test\ttest'.expandtabs(4)
> > ' test   test'
> > >>> 'test \ttest'.expandtabs(4)
> > 'test    test'
> >
> > 1st example: I expected 4 spaces between the two 'test's, but 3 were returned.
> > 2nd example: I expected 5 spaces between the two 'test's, but 4 were returned.
> >
> > Is this a bug or something else? Please advise.
>
> Consider where the 4-space tabstops are relative to those strings:
>
>  test   test
> test    test
> ^   ^   ^
>
> So no, it's not a bug.
>
> If you just want to replace the tab characters by spaces, use:
>
> >>> " test\ttest".replace("\t", "    ")
> ' test    test'
> >>> "test \ttest".replace("\t", "    ")
> 'test     test'
>
> HTH,
>
> Peter
>
> --
> Peter Brett pe...@peter-b.co.uk
> Remote Sensing Research Group
> Surrey Space Centre

You corrected my understanding of tab stops. Great explanation. Thank you so much.
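For anyone who finds this thread later, the tab-stop arithmetic Peter describes can be sketched roughly as below. This is only an illustration written for Python 3; `expand_tabs` is a hypothetical helper, not how CPython implements `str.expandtabs`. The point is that a tab pads up to the next column that is a multiple of the tab size, so the number of spaces depends on where the tab sits in the line.

```python
def expand_tabs(text, tabsize=4):
    """Illustrative re-implementation of str.expandtabs (hypothetical helper)."""
    out = []
    col = 0  # current column within the current line
    for ch in text:
        if ch == "\t":
            # Pad up to the next tab stop; a non-positive tabsize just drops the tab.
            pad = tabsize - (col % tabsize) if tabsize > 0 else 0
            out.append(" " * pad)
            col += pad
        elif ch in "\r\n":
            # A line break resets the column counter.
            out.append(ch)
            col = 0
        else:
            out.append(ch)
            col += 1
    return "".join(out)


for s in (" test\ttest", "test \ttest"):
    # Both columns should agree: the sketch next to the built-in method.
    print(repr(expand_tabs(s, 4)), repr(s.expandtabs(4)))
```

In both examples from the thread the tab is at column 5, so it is padded with 3 spaces to reach column 8. That gives 3 spaces between the words in the first string and 1 + 3 = 4 in the second.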
encoding problem
The snippet below raises a UnicodeDecodeError.

#!/usr/bin/env python
#--*-- coding: utf-8 --*--
s = 'äöü'
u = unicode(s)

It seems that the system uses the default encoding, ASCII, to decode the UTF-8 encoded string literal, and that causes the error. My question is why the Python interpreter uses the default encoding instead of utf-8, which I explicitly declared in the source.
Re: encoding problem
On Dec 19, 9:34 pm, Marc 'BlackJack' Rintsch bj_...@gmx.net wrote:
> On Fri, 19 Dec 2008 04:05:12 -0800, digisat...@gmail.com wrote:
>
> > The snippet below raises a UnicodeDecodeError.
> >
> > #!/usr/bin/env python
> > #--*-- coding: utf-8 --*--
> > s = 'äöü'
> > u = unicode(s)
> >
> > It seems that the system uses the default encoding, ASCII, to decode
> > the UTF-8 encoded string literal, and that causes the error. My
> > question is why the Python interpreter uses the default encoding
> > instead of utf-8, which I explicitly declared in the source.
>
> Because the declaration is only for decoding unicode literals in that
> very source file.
>
> Ciao,
> Marc 'BlackJack' Rintsch

Thanks for the answer. I believe the declaration is not only for unicode literals; it applies to all literals in the source, and even to comments. We can try running a source file that has no encoding declaration and contains a single comment line with non-ASCII characters. That raises a SyntaxError and points me to the PEP 263 URL. I read PEP 263 and quote it below:

  Python's tokenizer/compiler combo will need to be updated to work as follows:

  1. read the file
  2. decode it into Unicode assuming a fixed per-file encoding
  3. convert it into a UTF-8 byte string
  4. tokenize the UTF-8 content
  5. compile it, creating Unicode objects from the given Unicode data and creating string objects from the Unicode literal data by first reencoding the UTF-8 data into 8-bit string data using the given file encoding

The internal process described above indicates that step 2 uses the declared encoding to decode all literals in the source, while step 5 involves re-encoding with the declared encoding. That is the reason why we have to explicitly declare an encoding as long as we have non-ASCII characters in the source.

Bruno explained perfectly why we need to specify an encoding when decoding a byte string. Thank you very much.
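To make the point about byte strings concrete, here is a small sketch. It is written for Python 3, where the byte string has to be created explicitly; in the Python 2 snippet from the original post, `s` already holds these UTF-8 bytes, and the corresponding fix would be `unicode(s, 'utf-8')` or `s.decode('utf-8')`. The helper `decode_or_error` is hypothetical and exists only for this illustration.

```python
# The bytes a UTF-8 encoded source file stores for the literal 'äöü'.
raw = "äöü".encode("utf-8")


def decode_or_error(data, encoding):
    """Try to decode data; return the text, or a short error description."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return "UnicodeDecodeError: " + exc.reason


# Decoding with ASCII, the Python 2 default encoding, fails on the first byte.
ascii_result = decode_or_error(raw, "ascii")
# Naming the encoding the bytes were written in succeeds.
utf8_result = decode_or_error(raw, "utf-8")

print(len(raw))
print(ascii_result)
print(utf8_result)
```

The source encoding declaration only tells the compiler how to read the file. It does not change the default encoding used when a byte string is later converted with `unicode(s)` at run time, so the encoding has to be passed explicitly.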