[issue34979] Python throws “SyntaxError: Non-UTF-8 code start with \xe8...” when parse source file

2019-11-16 Thread susaki
susaki added the comment: Duplicate of #14811. -- resolution: -> duplicate stage: patch review -> resolved status: open -> closed

[issue34979] Python throws “SyntaxError: Non-UTF-8 code start with \xe8...” when parse source file

2019-11-15 Thread susaki
susaki added the comment: I think this issue is a duplicate of #14811, so I will close it. The key point of this issue is that the size of `tok->buf` is fixed and equal to `BUFSIZ` (defined in stdio.h; its value depends on the OS). A line of code will be truncated if its size exceeds `
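
The truncation described in this comment can be illustrated with a small simulation. This is only a sketch and not the tokenizer's actual code: `BUF_SIZE` is a hypothetical stand-in for `BUFSIZ` (8192 is an assumed value; the real one is platform-dependent), and the helper names are invented for illustration. It shows that cutting a long UTF-8 line into fixed-size pieces can split a multi-byte character, so a piece may no longer be valid UTF-8 on its own.

```python
# Hypothetical stand-in for the C constant BUFSIZ; the real value depends on the OS.
BUF_SIZE = 8192


def read_in_chunks(data: bytes, size: int) -> list:
    """Split raw bytes into fixed-size pieces, as a fixed-size buffer would."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def first_undecodable_chunk(data: bytes, size: int):
    """Return the index of the first piece that is not valid UTF-8 on its own, or None."""
    for index, chunk in enumerate(read_in_chunks(data, size)):
        try:
            chunk.decode("utf-8")
        except UnicodeDecodeError:
            return index
    return None


# One long source line whose string literal contains 3-byte UTF-8 characters.
line = ("s = '" + "测试" * 3000 + "'\n").encode("utf-8")

print("line length in bytes:", len(line))
print("first undecodable chunk:", first_undecodable_chunk(line, BUF_SIZE))
```

The whole line decodes cleanly; only the fixed-size pieces fail, which is consistent with the suggestion elsewhere in this thread that the buffer should be resizable.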

[issue34979] Python throws “SyntaxError: Non-UTF-8 code start with \xe8...” when parse source file

2019-11-15 Thread Ezio Melotti
Change by Ezio Melotti: -- nosy: +ezio.melotti

[issue34979] Python throws “SyntaxError: Non-UTF-8 code start with \xe8...” when parse source file

2019-11-15 Thread Terry J. Reedy
Terry J. Reedy added the comment: On Windows, with 3.7, 3.8.0, and master, neither the demo.py statement here nor the examples in #38755 raise an error. I tried 'python -m module', running from IDLE editor, and interactive IDLE and REPL. Even the following worked. >>> s = (b'\xe2\x96\x91'*1

[issue34979] Python throws “SyntaxError: Non-UTF-8 code start with \xe8...” when parse source file

2018-10-17 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: This is part of the more general issue25643. I'll try to revive that issue. -- assignee: -> serhiy.storchaka nosy: +serhiy.storchaka

[issue34979] Python throws “SyntaxError: Non-UTF-8 code start with \xe8...” when parse source file

2018-10-17 Thread susaki
Change by susaki: -- keywords: +patch pull_requests: +9276 stage: -> patch review

[issue34979] Python throws “SyntaxError: Non-UTF-8 code start with \xe8...” when parse source file

2018-10-14 Thread Lu jaymin
Lu jaymin added the comment: Thanks for your suggestions. I will make a PR on GitHub. The buffer is resizable now; please see cpython/Parser/tokenizer.c#L1043 for details.

[issue34979] Python throws “SyntaxError: Non-UTF-8 code start with \xe8...” when parse source file

2018-10-14 Thread Karthikeyan Singaravelan
Karthikeyan Singaravelan added the comment: Thanks for the confirmation. I think the expected solution is to use a buffer that can be resized. CPython accepts GitHub PRs so if you have time then I would suggest raising a PR against the linked issue since a lot of people have subscribed there

[issue34979] Python throws “SyntaxError: Non-UTF-8 code start with \xe8...” when parse source file

2018-10-14 Thread Lu jaymin
Lu jaymin added the comment: I think these two issues are the same, and the following is a patch I wrote; I hope it helps. ``` diff --git a/Parser/tokenizer.c b/Parser/tokenizer.c index 1af27bf..ba6fb3a 100644 --- a/Parser/tokenizer.c +++ b/Parser/tokenizer.c @@ -617,32 +617,2

[issue34979] Python throws “SyntaxError: Non-UTF-8 code start with \xe8...” when parse source file

2018-10-14 Thread Karthikeyan Singaravelan
Karthikeyan Singaravelan added the comment: Got it. Thanks for the details and patience. I tested with fewer characters and it seems to work fine, so using the encoding at the top is not a good way to test the original issue, as you have mentioned. Then I searched around and found issu

[issue34979] Python throws “SyntaxError: Non-UTF-8 code start with \xe8...” when parse source file

2018-10-14 Thread Lu jaymin
Lu jaymin added the comment: If you declare the encoding at the top of the file, then everything is fine, because in this case Python will use `io.open` to open the file and use `stream.readline` to read one line of code, please see function `fp_setreadl` in `cpython/Parser/tokenizer.c` for deta
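
As a rough illustration of why the `io.open` / `readline` path mentioned in this comment is not affected by line length, the sketch below writes a file with an encoding declaration and one very long line, then reads it back line by line in text mode. It is not the code in `fp_setreadl`; the file name and the line length are arbitrary choices made for the example.

```python
import io
import os
import tempfile

# A source line far longer than a typical fixed-size C buffer.
long_line = "s = '" + "测试" * 3000 + "'\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.py")  # arbitrary file name for this sketch

    # Write an encoding declaration followed by the long line.
    with io.open(path, "w", encoding="utf-8") as f:
        f.write("# -*- coding: utf-8 -*-\n")
        f.write(long_line)

    # Text-mode readline returns one complete, already-decoded line per call.
    with io.open(path, "r", encoding="utf-8") as stream:
        first = stream.readline()
        second = stream.readline()

print("declaration line:", first.strip())
print("long line read back intact:", second == long_line)
```

Because the stream decodes and buffers internally, the caller never sees a line cut in the middle of a multi-byte character.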

[issue34979] Python throws “SyntaxError: Non-UTF-8 code start with \xe8...” when parse source file

2018-10-13 Thread Karthikeyan Singaravelan
Karthikeyan Singaravelan added the comment: Thanks for the report. Is this a case of encoding not being declared at the top of the file or am I missing something? ➜ cpython git:(master) cat ../backups/bpo34979.py s = '测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试

[issue34979] Python throws “SyntaxError: Non-UTF-8 code start with \xe8...” when parse source file

2018-10-13 Thread Xiang Zhang
Change by Xiang Zhang: -- nosy: +xiang.zhang

[issue34979] Python throws “SyntaxError: Non-UTF-8 code start with \xe8...” when parse source file

2018-10-13 Thread Lu jaymin
New submission from Lu jaymin : ``` # demo.py s = '测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测试测