Change by Ma Lin :
--
pull_requests: +25427
pull_request: https://github.com/python/cpython/pull/26846
___
Python tracker
<https://bugs.python.org/issue44
Ma Lin added the comment:
If you update python/cpython-source-deps, I can submit a simple PR to
python/cpython.
I want to submit a PR to python/cpython-source-deps, but I think it’s better
for a credible person to do this.
--
nosy: +malin
Ma Lin added the comment:
> I suppose it is a very old code
I also found some old code that may have a performance loss.
memoryview.cast() was added in Python 3.3.
This code doesn't use memoryview.cast(), which brings extra memory overhead
when the amount of data is very larg
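A minimal sketch of the difference, assuming the data lives in an `array.array` (the variable names are illustrative and not taken from the code under discussion): `tobytes()` copies the whole buffer, while `memoryview.cast("B")` exposes the same memory as unsigned bytes without copying it.

```python
import array

# Hypothetical data: 1000 unsigned integers in a contiguous buffer.
values = array.array("I", range(1000))

# tobytes() allocates a second buffer and copies every item into it.
copied = values.tobytes()

# cast("B") reinterprets the existing buffer as unsigned bytes; no copy is made.
view = memoryview(values).cast("B")

print(len(copied), len(view), view.format)
```

For very large inputs, the copy made by `tobytes()` is where the extra memory goes.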
New submission from Ma Lin :
The documentation for os.fsync() says:
Availability: Unix, Windows.
https://docs.python.org/3.11/library/os.html#os.fsync
But it seems that macOS supports fsync.
(I'm not a macOS user)
--
assignee: docs@python
components: Documentation, macOS
messages: 3
Ma Lin added the comment:
Unix includes macOS.
Very sorry, close as invalid.
--
stage: -> resolved
status: open -> closed
___
Python tracker
<https://bugs.python.org/i
Ma Lin added the comment:
This article briefly introduces the inlining decisions in MSVC.
https://devblogs.microsoft.com/cppblog/inlining-decisions-in-visual-studio/
--
nosy: +malin
___
Python tracker
<https://bugs.python.org/issue45
Ma Lin added the comment:
MSVC 2019 has a /Ob3 option:
https://docs.microsoft.com/en-us/cpp/build/reference/ob-inline-function-expansion
From the experience of another project, I conjecture that /Ob3 increases the
"global budget" mentioned in the blog.
I used /Ob3 for the 3.10 b
Ma Lin added the comment:
> In my case, pgo got stuck on linking with the object.h.
Me too. Since commit 28d28e0 (the first commit to slow down the PGO build), if
the `__forceinline` attribute is added to the _Py_DECREF() function in object.h,
the PGO build hangs (>50 minutes).
So PR 28427 may no
Ma Lin added the comment:
Like the OP's benchmark, if the inline functions in object.h are converted to
macros, the 3.10 branch is 1.03x faster, but still 1.07x slower than 28d28e0~1.
@vstinner could you prepare such a PR as a candidate fix?
There seem to be two ways to solve it in short-te
Ma Lin added the comment:
PR28475:
64-bit build is 1.03x slower than 28d28e0~1
32-bit build is 1.04x slower than 28d28e0~1
28d28e0~1 is the last good commit.
--
___
Python tracker
<https://bugs.python.org/issue45
Ma Lin added the comment:
I think this is a bug in MSVC2019, not really a regression in CPython. So
changing the CPython code is just a workaround; maybe the right direction is
to prompt MSVC to fix the bug, otherwise there will be more trouble when 3.11
is released a year later.
Seeing
Ma Lin added the comment:
Today I tested with msvc2022-preview; the `__forceinline` attribute does not
hang the build.
64-bit PGO builds:
28d28e0~1,vc2022 : baseline
28d28e0~1+F,vc2022 : 1.02x slower <1>
28d28e0,vc2022 : 1.03x slower <2>
28d28e0+F,vc2022 :
Change by Ma Lin :
--
pull_requests: +27721
pull_request: https://github.com/python/cpython/pull/29468
___
Python tracker
<https://bugs.python.org/issue44
Ma Lin added the comment:
Serhiy Storchaka:
Sorry, I found that the `zipfile` module also has this bug; it is fixed in PR29468.
This bug was first reported and fixed by GitHub user `marcoffee`, so I list him
as a co-author. His work:
https://github.com/animalize/pyzstd/issues/4
The second commit fixe
Change by Ma Lin :
--
pull_requests: +27830
pull_request: https://github.com/python/cpython/pull/29587
___
Python tracker
<https://bugs.python.org/issue41
Change by Ma Lin :
--
pull_requests: +27831
pull_request: https://github.com/python/cpython/pull/29588
___
Python tracker
<https://bugs.python.org/issue41
Ma Lin added the comment:
Sorry, I found an omission.
The previous PRs fixed the bug in these methods:
zlib.Compress.compress()
zlib.Decompress.decompress()
This method also has the bug, fixed in PR29587 (main/3.10) and PR29588 (3.9-):
zlib.Decompress.flush()
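For readers unfamiliar with these three methods, a small sketch of how they fit together. It only illustrates the API; it does not reproduce the bug, and the payload and the `max_length` value of 64 are arbitrary choices for the example.

```python
import zlib

payload = b"example data " * 1000

# Streaming compression: compress() may buffer, flush() emits the rest.
compressor = zlib.compressobj()
compressed = compressor.compress(payload) + compressor.flush()

# Streaming decompression with a cap on the output of the first call.
decompressor = zlib.decompressobj()
first = decompressor.decompress(compressed, 64)

# flush() processes the pending input and returns the remaining output.
rest = decompressor.flush()

print(len(first), len(rest), first + rest == payload)
```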
Attached file
Ma Lin added the comment:
There are 5 link errors when building the PGO build.
Command: build --pgo
--
nosy: +malin
___
Python tracker
<https://bugs.python.org/issue45
Ma Lin added the comment:
They are LNK1268 errors:
LINK : fatal error LNK1268: inconsistent option 'pdbthreads:5' specified with
/USEPROFILE but not with /GENPROFILE [e:\dev\cpython\PCbuild\_queue.vcxproj]
LINK : fatal error LNK1268: inconsistent option 'pdbthreads:1&
Ma Lin added the comment:
Thanks for the review!
--
___
Python tracker
<https://bugs.python.org/issue41735>
___
___
Python-bugs-list mailing list
Unsubscribe:
Ma Lin added the comment:
Since 243b6c3b8fd3144450c477d99f01e31e7c3ebc0f (21-08-19), this bug can't be
reproduced.
In `pysqlite_do_all_statements()`, 243b6c3 resets statements like this:
sqlite3_stmt *stmt = NULL;
while ((stmt = sqlite3_next_stmt(self->db, stmt))) {
Ma Lin added the comment:
This issue is not resolved; it is only masked by a problematic behavior.
Maybe this issue will be solved in issue44092, I'll study that issue later.
--
___
Python tracker
<https://bugs.python.org/is
Ma Lin added the comment:
I think this change is fine.
Erlend E. Aasland's explanation is very clear.
There is only one situation in which a problem may occur: code written with SQLite
3.8.7.2+ (2014-11-18) and run on 3.7.15 (2012-12-12) ~ 3.8.7.1-, but this
situation may be diff
Ma Lin added the comment:
> How realistic is this scenario? If you compile with, for example 3.14.0 or
> newer, you'd link with sqlite3_trace_v2, not sqlite3_trace, so the loader
> would prevent you from running with anything pre 3.14. AFAIK, we've never
> had such pr
Ma Lin added the comment:
Imagine someone writes code with Python 3.11 and SQLite 3.8.7.2+, and then
deploys it to Python 3.11 and SQLite 3.8.7.1-; an error may occur. However, this
situation is unlikely to happen.
> Can you provide a reproducer? We've run this change through
Ma Lin added the comment:
If the special rollback handling is removed, the behavior of
Connection.rollback() and 'ON CONFLICT ROLLBACK' clause will be consistent.
See attached file on_conflict_rollback.py.
--
Added file: https://bugs.python.org/file50481/on_conflict_r
Ma Lin added the comment:
Is it possible to scan stdlib to find similar bugs?
--
nosy: +Ma Lin
___
Python tracker
<https://bugs.python.org/issue39033>
___
___
Ma Lin added the comment:
I also planned to review this commit at some point; I feel a bit uneasy
about it.
If an optimization needs to be fine-tuned and may introduce pitfalls for
future code maintenance, IMHO it is best to avoid this kind of
optimization.
--
nosy
Ma Lin added the comment:
The Windows build encountered a similar problem, see issue32394.
The solution is to check the runtime system version when importing the socket
module and, on an older system, delete the constants. [1]
issue32394 has a small script (winsdk_watchdog.py) to help find such
Ma Lin added the comment:
It seems that people usually use the socket module like this, I think it's safe
to respect this habit:
if hasattr(socket, "FLAG_NAME"):
do_something
With PR19402, your program will have a problem on older system versions, not
on
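A small sketch of the feature-detection habit described above. `FLAG_NAME` is the placeholder from the comment, not a real constant, and `get_optional_flag` is a hypothetical helper introduced only for illustration.

```python
import socket


def get_optional_flag(name, default=None):
    """Return socket.<name> if this build and system expose it, else default."""
    return getattr(socket, name, default)


# AF_INET is available everywhere, so hasattr() reports True for it.
has_af_inet = hasattr(socket, "AF_INET")

# A constant that is absent (here: the placeholder name) falls back to the default.
missing = get_optional_flag("FLAG_NAME")

print(has_af_inet, missing)
```

Code written this way keeps working when a constant is removed at import time on an older system.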
Ma Lin added the comment:
On Windows 10 with Python 3.7, I get the same message as in the reply above.
With Python 3.8, it works well.
--
nosy: +Ma Lin
___
Python tracker
<https://bugs.python.org/issue40
Ma Lin added the comment:
I did a git bisect, this commit fixed the bug:
https://github.com/python/cpython/commit/ac22f6aa989f18c33c12615af1c66c73cf75d5e7
--
___
Python tracker
<https://bugs.python.org/issue40
Ma Lin added the comment:
This issue can be closed.
'0x' 2
'd26935a5ee4cd542e8a3a7e74fb7a99855975b59' 40
'\n' 1
2+40+1 = 43
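The same count can be checked mechanically; this is just the arithmetic above written as Python, with variable names chosen for the example.

```python
prefix = "0x"
digest = "d26935a5ee4cd542e8a3a7e74fb7a99855975b59"
newline = "\n"

# Lengths of the three pieces, then their sum.
lengths = (len(prefix), len(digest), len(newline))
total = sum(lengths)

print(lengths, total)
```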
--
nosy: +malin
___
Python tracker
<h
New submission from Ma Lin :
The code above already covers this check:
    if (Py_SIZE(v) == newsize) {
        /* return early if newsize equals to v->ob_size */
        return 0;
    }
    if (Py_SIZE(v) == 0) {
-       if (newsize == 0) {
-           return 0;
-       }
Change by Ma Lin :
--
keywords: +patch
pull_requests: +23149
stage: -> patch review
pull_request: https://github.com/python/cpython/pull/24330
___
Python tracker
<https://bugs.python.org/issu
Ma Lin added the comment:
Found a new issue, can be combined with this issue.
--
stage: patch review -> resolved
status: open -> closed
___
Python tracker
<https://bugs.python.org/i
New submission from Ma Lin :
PyBytes_FromStringAndSize() uses a global cache for 1-byte bytes:
https://github.com/python/cpython/blob/v3.10.0a4/Objects/bytesobject.c#L147
if (size == 1 && str != NULL) {
struct _Py_bytes_state *state = get_bytes_state();
op
Change by Ma Lin :
--
nosy: +erlendaasland
___
Python tracker
<https://bugs.python.org/issue33376>
___
New submission from Ma Lin :
PyErr_Fetch(&t, &v, &tb);
if (v == NULL || !PyErr_GivenExceptionMatches(v, PyExc_BlockingIOError)) {
                                              ↑ this should be t
https://github.com/python/cpython/blob/v3.10.0a5/Modules/_io/buffe
Ma Lin added the comment:
I am trying to write a test-case.
--
___
Python tracker
<https://bugs.python.org/issue43305>
___
Ma Lin added the comment:
Close as invalid.
They have the same effect:
PyErr_GivenExceptionMatches(v, PyExc_BlockingIOError)
PyErr_GivenExceptionMatches(t, PyExc_BlockingIOError)
--
resolution: -> wont fix
stage: -> resolved
status: open -&g
Ma Lin added the comment:
Is there any hope of merging this into the 3.9 branch?
--
___
Python tracker
<https://bugs.python.org/issue35859>
___
New submission from Ma Lin :
The Windows build uses xz-5.2.2, which was released on 2015-09-29.
xz-5.2.5 was released recently; maybe we can update this library.
When preparing cpython-source-deps, don't forget to copy
`xz-5.2.5\windows\vs2019\config.h` to `xz-5.2.5\windows\` f
Change by Ma Lin :
--
keywords: +patch
pull_requests: +19847
stage: -> patch review
pull_request: https://github.com/python/cpython/pull/20622
___
Python tracker
<https://bugs.python.org/issu
Ma Lin added the comment:
Good catch.
You can submit a PR to fix this. If you start from zero and do it slowly, it
will take about a week or two.
--
components: +Windows -Build
nosy: +Ma Lin, paul.moore, steve.dower, tim.golden, zach.ware
Ma Lin added the comment:
I suggest not to close this issue, this is an opportunity to investigate
whether Python3 has this problem as well.
--
nosy: +Ma Lin
___
Python tracker
<https://bugs.python.org/issue29
Ma Lin added the comment:
It is very reasonable for group names to be `str`. Essentially a group name is
just a name; it has nothing to do with `bytes`.
Other names in Python are also `str`, such as codec names and hashlib names.
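A short illustration of that point: even when the pattern and the matched text are `bytes`, the group names reported by the `re` module are `str`. The pattern and input below are made up for the example.

```python
import re

# A bytes pattern with a named group, matched against bytes input.
match = re.match(rb"(?P<word>\w+)", b"hello")

# The captured value is bytes, but the group name itself is str.
group_names = list(match.groupdict())
captured = match.group("word")

print(group_names, captured)
```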
--
nosy: +Ma Lin
___
Python tracker
Ma Lin added the comment:
> a non-ascii group name will raise an error in bytes, even if encoded
Looks like this is a language limitation:
>>> b'é'
File "<stdin>", line 1
SyntaxError: bytes can only contain ASCII literal characters.
No prob
Ma Lin added the comment:
`latin1` is the character set covering Unicode code points \u0000 to \u00ff,
and its characters are mapped directly from/to bytes.
So b'\xe9' is mapped to \u00e9, which is `é`.
Of course, characters with Unicode code point greater than 0xff are impossible
to
Ma Lin added the comment:
In this case, you can only use 'latin1', which directly maps one character
(\u0000-\u00FF) to/from one byte.
If 'utf-8' is used, one character may map to multiple bytes, such as 'Δ' ->
b'\xce\x94'
'\x94
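The one-to-one mapping of 'latin1' and the multi-byte behavior of 'utf-8' can be checked directly. This sketch uses only the characters mentioned in the comments above; the variable names are illustrative.

```python
# latin1 maps each code point in U+0000..U+00FF to exactly one byte.
latin1_bytes = "é".encode("latin1")

# utf-8 may need several bytes for one character.
utf8_delta = "Δ".encode("utf-8")

# Every possible byte value survives a latin1 decode/encode round trip.
all_bytes = bytes(range(256))
round_trip_ok = all_bytes.decode("latin1").encode("latin1") == all_bytes

# Characters above U+00FF cannot be encoded with latin1 at all.
try:
    "Δ".encode("latin1")
    delta_fits_latin1 = True
except UnicodeEncodeError:
    delta_fits_latin1 = False

print(latin1_bytes, utf8_delta, round_trip_ok, delta_fits_latin1)
```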
Ma Lin added the comment:
It seems you are not yet familiar with how encodings work.
Naturally, `bytes` cannot contain a character whose Unicode code point is greater
than \u00ff. So you can only use the "latin1" encoding, which maps from character to
byte (or the reverse) directly.
&
Ma Lin added the comment:
> this limitation to the latin-1 subset is not compatible with the
> documentation, which says that valid Python identifiers are valid group names.
Not all latin-1 characters are valid identifiers, for example:
>>> '\x94'.en
Ma Lin added the comment:
Please look at these:
>>> orig_name = "Ř"
>>> orig_ch = orig_name.encode("cp1250") # Because why not?
>>> orig_ch
b'\xd8'
>>> name = list(re.match(b"(?P<" + orig_ch +
Ma Lin added the comment:
Why do you always want to use a "utf-8" encoded identifier as the group name in
a `bytes` pattern?
The direction is: a group name is written in a `bytes` pattern and is converted to
`str`.
Not this direction: `str` group name -(utf8)-> `bytes` pattern -> `
Ma Lin added the comment:
Do I need to write a detailed review guide? I suppose that after reading it
from beginning to end, it will be easy to understand PR 12427, no need to read
anything else.
Or plan to replace the sre module with the regex module in a future version
Change by Ma Lin :
--
components: +Library (Lib) -Extension Modules
nosy: +malin
___
Python tracker
<https://bugs.python.org/issue41210>
___
Ma Lin added the comment:
The docs[1] said:
Compression filters:
FILTER_LZMA1 (for use with FORMAT_ALONE)
FILTER_LZMA2 (for use with FORMAT_XZ and FORMAT_RAW)
But your code uses a combination of `FILTER_LZMA1` and `FORMAT_RAW`, is this ok?
[1] https
Ma Lin added the comment:
There was a similar issue (issue21872).
When decompressing lzma.FORMAT_ALONE data that doesn't have the end
marker (but has the correct "Uncompressed Size" in the .lzma header), sometimes
the last one to dozens bytes can't be outpu
New submission from Ma Lin :
lzma/bz2 modules are using the same buffer growth algorithm: [1][2]
newsize = size + (size >> 3) + 6;
lzma/bz2 modules' default output buffer is 8192 bytes [3][4], so the growth
step is below.
For many cases, maybe the buffer is resized too
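To get a feel for how often this formula resizes the buffer, here is a small sketch that replays it in Python. The 1 MiB target and the helper name `growth_sizes` are assumptions made for the example; only the formula and the 8192-byte starting size come from the message.

```python
def growth_sizes(initial_size, target_size):
    """List the buffer sizes produced by newsize = size + (size >> 3) + 6."""
    sizes = [initial_size]
    while sizes[-1] < target_size:
        size = sizes[-1]
        sizes.append(size + (size >> 3) + 6)
    return sizes


# Start from the default 8192-byte buffer and grow until 1 MiB is reached.
sizes = growth_sizes(8192, 1024 * 1024)
resize_count = len(sizes) - 1

print(sizes[:4], "...", resize_count, "resizes")
```

Each step adds only about 12.5%, so reaching large outputs takes many resizes.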
Ma Lin added the comment:
Maybe the zlib module can also use the same algorithm.
zlib module's initial buffer size is 16KB [1], each time the size doubles [2].
[1] zlib module's initial buffer size:
https://github.com/python/cpython/blob/v3.9.0b4/Modules/zlibmodule.c#L32
[2] z
Ma Lin added the comment:
It is better to raise a warning when a problematic combination is used.
But IMO either "raising a warning" or "adding more description to the doc" depends
too much on the implementation details of liblzma.
--
__
Ma Lin added the comment:
> Add zstd support in tarfile
This requires the stdlib to contain a Zstandard module.
You can ask in the Idea forum:
https://discuss.python.org/c/ideas
--
nosy: +malin
___
Python tracker
<https://bugs.pyth
New submission from Ma Lin :
CJK encode/decode functions only have three error-handler fast-paths:
replace
ignore
strict
See the code: [1][2]
If other built-in error handlers are used, the error-handler object has to be
fetched and called with a Unicode exception argument. See the code
Ma Lin added the comment:
s.encode('gbk', 'xmlcharrefreplace')
Maybe so
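A runnable version of that statement, as a sketch. The sample string is an assumption: it pairs a character that GBK can encode with an emoji that it cannot, so the effect of the error handler is visible.

```python
# "中" exists in GBK; the emoji does not.
s = "中😀"

# Unencodable characters become numeric character references.
encoded = s.encode("gbk", "xmlcharrefreplace")

# With the default "strict" handler the same call raises instead.
try:
    s.encode("gbk")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

print(encoded, strict_failed)
```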
Ma Lin added the comment:
> But how many new Python web application use CJK codec instead of UTF-8?
A CJK character usually takes 2 bytes in CJK encodings, but 3 bytes in
UTF-8.
I tested a Chinese book:
in GBK: 853,025 bytes
in UTF-8: 1,267,523 bytes
For CJK content, UTF-8
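The 2-byte versus 3-byte difference is easy to reproduce on a small scale. The sample text below is an assumption for illustration and is unrelated to the book measured above.

```python
# 200 CJK characters: two characters repeated 100 times.
text = "中文" * 100

gbk_size = len(text.encode("gbk"))
utf8_size = len(text.encode("utf-8"))

print(gbk_size, utf8_size)
```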
Ma Lin added the comment:
I'm working on a patch.
lzma decompressing speed increases:
baseline: 0.275722 sec
patched: 0.140405 sec
(Uncompressed data size 52.57 MB)
The new algorithm looks like this:
#define INITIAL_BUFFER_SIZE (16*1024)
static inline Py_ssize_t
get_ne
New submission from Ma Lin :
BufferedReader's constructor has a `buffer_size` parameter, it's the size of
this buffer:
When reading data from BufferedReader object, a larger
amount of data may be requested from the underlying raw
stream, and kept in an inter
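A sketch that makes the quoted behavior observable. `RecordingRaw` is a hypothetical raw stream written for this example; it records how many bytes the buffered layer asks for on each call, and the 4096-byte `buffer_size` is an arbitrary choice.

```python
import io


class RecordingRaw(io.RawIOBase):
    """Hypothetical raw stream that records the size of every readinto() request."""

    def __init__(self, data):
        super().__init__()
        self._data = io.BytesIO(data)
        self.requested = []

    def readable(self):
        return True

    def readinto(self, buffer):
        # len(buffer) is how much the caller is willing to receive.
        self.requested.append(len(buffer))
        return self._data.readinto(buffer)


raw = RecordingRaw(b"a" * 10000)
reader = io.BufferedReader(raw, buffer_size=4096)

# Ask the buffered layer for one byte and see what it asked the raw stream for.
first = reader.read(1)

print(first, raw.requested)
```

The single-byte read is served from the internal buffer, which is filled with a larger request to the raw stream.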
Change by Ma Lin :
--
keywords: +patch
pull_requests: +20842
stage: -> patch review
pull_request: https://github.com/python/cpython/pull/21698
___
Python tracker
<https://bugs.python.org/issu
Ma Lin added the comment:
At least fix this bug:
the error-handler object is not cached, it needs to be
looked up from a dict every time, which is very inefficient.
The code:
https://github.com/python/cpython/blob/v3.9.0b4/Modules/cjkcodecs/multibytecodec.c#L81-L98
I will submit a
Ma Lin added the comment:
Some underlying streams have a fast path for .readall(),
so I am closing this issue.
--
stage: patch review -> resolved
status: open -> closed
___
Python tracker
<https://bugs.python.org/i
Ma Lin added the comment:
I'm working on issue41265.
If nothing gets in the way, I would also like to write a zstd module for stdlib before
the end of the year, but I dare not promise this.
If anyone wants to work on this issue, very gra
New submission from Ma Lin :
🔵 bz2/lzma module's current growth algorithm
bz2/lzma module's initial output buffer size is 8KB [1][2], and they are using
this output buffer growth algorithm [3][4]:
newsize = size + (size >> 3) + 6
[1] https://github.com/python/cpyth
Change by Ma Lin :
Added file: https://bugs.python.org/file49364/0to2GB_step30MB.png
___
Python tracker
<https://bugs.python.org/issue41486>
___
Change by Ma Lin :
Added file: https://bugs.python.org/file49366/0to20MB_step64KB.png
___
Python tracker
<https://bugs.python.org/issue41486>
___
Change by Ma Lin :
Added file: https://bugs.python.org/file49365/0to200MB_step2MB.png
___
Python tracker
<https://bugs.python.org/issue41486>
___
Change by Ma Lin :
Added file: https://bugs.python.org/file49367/benchmark.py
___
Python tracker
<https://bugs.python.org/issue41486>
___
Change by Ma Lin :
Added file: https://bugs.python.org/file49368/benchmark_real.py
___
Python tracker
<https://bugs.python.org/issue41486>
___
Change by Ma Lin :
--
keywords: +patch
pull_requests: +20886
stage: -> patch review
pull_request: https://github.com/python/cpython/pull/21740
___
Python tracker
<https://bugs.python.org/issu
Ma Lin added the comment:
A more thorough solution was used, see issue41486.
So I close this issue.
--
stage: -> resolved
status: open -> closed
___
Python tracker
<https://bugs.python.org/i
Ma Lin added the comment:
The re.sub() documentation says:
Changed in version 3.7: Empty matches for the pattern are replaced when
adjacent to a previous non-empty match.
IMO 3.7+ behavior is more reasonable, and it fixed a bug, see issue25054.
--
nosy: +malin
Ma Lin added the comment:
There can be at most one empty match at a position. IIRC, Perl's regex engine
has very similar behavior.
If you don't want empty matches, using + is fine.
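Both points can be seen with the `x*` example from the re.sub() documentation, run on Python 3.7 or later. The variable names are illustrative.

```python
import re

# x* also matches the empty string, so empty matches are replaced too,
# including the one right after the non-empty match "x".
with_star = re.sub("x*", "-", "abxd")

# x+ never matches the empty string, so only the real "x" is replaced.
with_plus = re.sub("x+", "-", "abxd")

print(with_star, with_plus)
```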
--
___
Python tracker
<https://bugs.python.o
Ma Lin added the comment:
There are two zstd modules on pypi:
https://pypi.org/project/zstd/
https://pypi.org/project/zstandard/
The first one is too simple.
The second one is powerful, but has too many APIs:
ZstdCompressorIterator
ZstdDecompressorIterator
Ma Lin added the comment:
> More realistically, including the docs as unbundled HTML files
> and relying on the default browser is probably an all-around better idea.
CHM's index function is very convenient, I almost always use this feature when
I use CHM.
How about use tkinter
Ma Lin added the comment:
> when I delete the file %APPDATA%\Microsoft\HTML Help\hh.dat,
> the problem seems to go away.
It doesn't work for me.
Moreover, `Binary Index=Yes` no longer works on my PC.
A few days ago, I installed a clean Windows 10 2004, then CHM's index
Ma Lin added the comment:
I have spent two weeks on this and the code is almost complete; a preview:
https://github.com/animalize/cpython/pull/8/files
It is written directly for stdlib, since there are already zstd modules on pypi.
In addition, the API of zstd is simple, not as complicated as lzma's.
Can also use
New submission from Ma Lin :
The code in zlib module:
self->zst.next_in = data->buf; // set next_in
...
ENTER_ZLIB(self); // acquire thread lock
`self->zst` is a `z_stream` struct defined in zlib, used to record states of a
compress/decompress stream:
typed
Change by Ma Lin :
--
keywords: +patch
pull_requests: +21208
stage: -> patch review
pull_request: https://github.com/python/cpython/pull/22126
___
Python tracker
<https://bugs.python.org/issu
Change by Ma Lin :
--
pull_requests: +21211
pull_request: https://github.com/python/cpython/pull/22130
___
Python tracker
<https://bugs.python.org/issue41
Change by Ma Lin :
--
pull_requests: +21213
pull_request: https://github.com/python/cpython/pull/22132
___
Python tracker
<https://bugs.python.org/issue41
Ma Lin added the comment:
Although the improvement is not great, it's a very hot code path.
Could you review the PR?
--
components: +Windows
nosy: +paul.moore, tim.golden
___
Python tracker
<https://bugs.python.org/is
Ma Lin added the comment:
I modified the lzma module to use different growth factors; see the attached picture
different_factors.png.
1.5x should be the growth factor of _PyBytesWriter on Windows.
So if _PyBytesWriter is changed to use memory blocks, maybe there will be no
performance improvement
New submission from Ma Lin :
The C type `long` is a 4-byte integer in the 64-bit Windows build (MSVC behavior). [1]
On most other 64-bit platforms (LP64, e.g. GCC on Linux), `long` is an 8-byte integer.
This leads to some unnecessary performance loss; issue38252 fixed this
problem in one case.
Search `SIZEOF_LONG` in
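The size of `long` on the current platform can be checked from Python without compiling anything. This sketch only reports the sizes; it does not touch the `SIZEOF_LONG` code paths mentioned above.

```python
import ctypes
import struct

# Size in bytes of the platform's C long and of a data pointer.
long_size = ctypes.sizeof(ctypes.c_long)
pointer_size = ctypes.sizeof(ctypes.c_void_p)

# struct reports the same native size for the "l" format code.
struct_long_size = struct.calcsize("l")

print(long_size, pointer_size, struct_long_size)
```

On a 64-bit Windows build this is expected to show a 4-byte `long` next to an 8-byte pointer, which is the mismatch the message describes.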
Ma Lin added the comment:
> What is the problem exactly?
There are several different problems, such as:
https://github.com/python/cpython/blob/v3.10.0a2/Modules/mathmodule.c#L2033
In addition, `utf16_decode` also has this problem, I forgot this:
https://github.com/python/cpython/b
Ma Lin added the comment:
> I do not think that this is suitable for newcomers because you need to have
> deep understanding why it was written in such form at first place and what
> will be changed if you change it.
I agree contributors need to understand code, rather than simpl
New submission from Ma Lin :
MSVC2019 has a new option `/Ob3`, it specifies more aggressive inlining than
/Ob2:
https://docs.microsoft.com/en-us/cpp/build/reference/ob-inline-function-expansion?view=msvc-160
If this option is used with MSVC2017, it emits a warning:
cl : Command line warning
Ma Lin added the comment:
> Could you please try again with PGO?
Please wait.
BTW, this option was suggested in another project.
In that project, even with `/Ob3` enabled, the build is still slower than the GCC 9 build.
If you are interested, see: https://github.com/facebook/zstd/issues/2
Ma Lin added the comment:
In PGO build, the improvement is not much.
(3.9 branch, with PGO, build.bat -p X64 --pgo)
+-+--+--+
| Benchmark | baseline-pgo | ob3-pgo
Change by Ma Lin :
--
nosy: +malin
___
Python tracker
<https://bugs.python.org/issue42369>
___
Ma Lin added the comment:
The last benchmark was wrong; the /Ob3 option was not enabled.
With `pgo_ob3.diff` applied, it is slower, so I am closing this issue.
+-++--+
| Benchmark | py39_pgo_a | py39_pgo_b
Ma Lin added the comment:
@Mariatta Wijaya, would you update SQLite?
I wanted to do it myself by following your patch in issue28791,
but I found that I have to commit SQLite's source code to
https://github.com/python/cpython-source-deps, so I think this should be done
by a core deve