New submission from speedrun-program <xboxlivewill...@gmail.com>:

In string.py, the capwords function passes str.join a generator expression, but
the map function could be used instead. This is how capwords is currently
written:

--------------------
```py
def capwords(s, sep=None):
    """capwords(s [,sep]) -> string
    
    Split the argument into words using split, capitalize each
    word using capitalize, and join the capitalized words using
    join.  If the optional second argument sep is absent or None,
    runs of whitespace characters are replaced by a single space
    and leading and trailing whitespace are removed, otherwise
    sep is used to split and join the words.
    
    """
    return (sep or ' ').join(x.capitalize() for x in s.split(sep))
```
--------------------

This is how capwords could be written:

--------------------
```py
def capwords(s, sep=None):
    """capwords(s [,sep]) -> string
    
    Split the argument into words using split, capitalize each
    word using capitalize, and join the capitalized words using
    join.  If the optional second argument sep is absent or None,
    runs of whitespace characters are replaced by a single space
    and leading and trailing whitespace are removed, otherwise
    sep is used to split and join the words.
    
    """
    return (sep or ' ').join(map(str.capitalize, s.split(sep)))
```
--------------------
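
The two versions should return the same result for any str input. A minimal sanity check, using hypothetical function names and sample inputs chosen only for illustration:

--------------------
```python
def capwords_genexpr(s, sep=None):
    # Current implementation: generator expression passed to join.
    return (sep or ' ').join(x.capitalize() for x in s.split(sep))


def capwords_map(s, sep=None):
    # Proposed implementation: map passed to join.
    return (sep or ' ').join(map(str.capitalize, s.split(sep)))


# Illustrative inputs: empty string, extra whitespace, custom separator, mixed case.
samples = [("", None), ("  hello   world ", None), ("a-b-c", "-"), ("fOO bAR", None)]

for s, sep in samples:
    assert capwords_genexpr(s, sep) == capwords_map(s, sep)
```
--------------------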

These are the benefits:

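1. Faster performance. In the timings below, the map version takes roughly 30% less time for strings that split into 10 to 10**8 words, so the absolute time saved grows with the number of words.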

2. Very slightly smaller .py and .pyc file sizes.

3. Source code is slightly more concise.
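
Point 2 presumably follows from the generator expression compiling to its own nested code object, which the map call does not need. A minimal sketch for inspecting this, assuming docstring-free copies of the two versions (the source strings and helper names are illustrative, and exact byte counts vary by Python version):

--------------------
```python
import marshal
import types

# Hypothetical stand-ins for the two versions, with the docstring omitted.
CURRENT_SRC = (
    "def capwords(s, sep=None):\n"
    "    return (sep or ' ').join(x.capitalize() for x in s.split(sep))\n"
)
NEW_SRC = (
    "def capwords(s, sep=None):\n"
    "    return (sep or ' ').join(map(str.capitalize, s.split(sep)))\n"
)


def nested_code_objects(code):
    """Return the code objects stored directly in code.co_consts."""
    return [c for c in code.co_consts if isinstance(c, types.CodeType)]


def describe(src):
    """Return (marshalled size in bytes, nested code objects in capwords)."""
    module_code = compile(src, "<capwords>", "exec")
    # The module holds exactly one code object: the capwords function body.
    (func_code,) = nested_code_objects(module_code)
    return len(marshal.dumps(module_code)), len(nested_code_objects(func_code))


for label, src in (("genexpr", CURRENT_SRC), ("map", NEW_SRC)):
    size, nested = describe(src)
    print(f"{label}: {size} bytes marshalled, {nested} nested code object(s)")
```
--------------------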

This is the performance test code in ipython:

--------------------
```py
def capwords_current(s, sep=None):
    return (sep or ' ').join(x.capitalize() for x in s.split(sep))

def capwords_new(s, sep=None):
    return (sep or ' ').join(map(str.capitalize, s.split(sep)))

tests = ["a " * 10**n for n in range(9)]
tests.append("a " * (10**9 // 2)) # I only have 16GB of RAM
```
--------------------
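
The same comparison can be run outside ipython with the standard timeit module. This is only a sketch: the best_time helper and its number/repeat defaults are assumptions, and the inputs are kept small so it finishes quickly.

--------------------
```python
import timeit


def capwords_current(s, sep=None):
    return (sep or ' ').join(x.capitalize() for x in s.split(sep))


def capwords_new(s, sep=None):
    return (sep or ' ').join(map(str.capitalize, s.split(sep)))


def best_time(func, arg, number=200, repeat=3):
    """Return the best per-call time in seconds over `repeat` runs."""
    timer = timeit.Timer(lambda: func(arg))
    return min(timer.repeat(repeat=repeat, number=number)) / number


# Small inputs only (1 to 1000 words), so the script finishes quickly.
for n in range(4):
    s = "a " * 10**n
    current = best_time(capwords_current, s)
    new = best_time(capwords_new, s)
    print(f"10**{n} words: current {current:.3e} s, new {new:.3e} s")
```
--------------------

Taking the minimum over the repeats reduces noise from other processes. The %timeit figures reported here are means, so the numbers are not directly comparable.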

These are the results of a performance test using %timeit in ipython:

--------------------

%timeit x = capwords_current("")
835 ns ± 15.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

%timeit x = capwords_new("")
758 ns ± 35.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
- - - - - - - - - - - - - - - - - - - - 
%timeit x = capwords_current(tests[0])
977 ns ± 16.9 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

%timeit x = capwords_new(tests[0])
822 ns ± 30 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
- - - - - - - - - - - - - - - - - - - - 
%timeit x = capwords_current(tests[1])
3.07 µs ± 88.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%timeit x = capwords_new(tests[1])
2.17 µs ± 194 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
- - - - - - - - - - - - - - - - - - - - 
%timeit x = capwords_current(tests[2])
28 µs ± 896 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit x = capwords_new(tests[2])
19.4 µs ± 352 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
- - - - - - - - - - - - - - - - - - - - 
%timeit x = capwords_current(tests[3])
236 µs ± 14.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit x = capwords_new(tests[3])
153 µs ± 2 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
- - - - - - - - - - - - - - - - - - - - 
%timeit x = capwords_current(tests[4])
2.12 ms ± 106 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit x = capwords_new(tests[4])
1.5 ms ± 9.61 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
- - - - - - - - - - - - - - - - - - - - 
%timeit x = capwords_current(tests[5])
23.8 ms ± 1.38 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit x = capwords_new(tests[5])
15.6 ms ± 355 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
- - - - - - - - - - - - - - - - - - - - 
%timeit x = capwords_current(tests[6])
271 ms ± 10.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit x = capwords_new(tests[6])
192 ms ± 807 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
- - - - - - - - - - - - - - - - - - - - 
%timeit x = capwords_current(tests[7])
2.66 s ± 14.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit x = capwords_new(tests[7])
1.95 s ± 26.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
- - - - - - - - - - - - - - - - - - - - 
%timeit x = capwords_current(tests[8])
25.9 s ± 80.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit x = capwords_new(tests[8])
18.4 s ± 123 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
- - - - - - - - - - - - - - - - - - - - 
%timeit x = capwords_current(tests[9])
6min 17s ± 29 s per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit x = capwords_new(tests[9])
5min 36s ± 24.8 s per loop (mean ± std. dev. of 7 runs, 1 loop each)

--------------------
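
To make the relative gain easier to read, the mean times above can be converted into the fraction of time saved. A small sketch, using only the means reported above (the variable and function names are illustrative):

--------------------
```python
# (input, current seconds, new seconds), copied from the %timeit means above.
timings = [
    ('""', 835e-9, 758e-9),
    ("tests[0]", 977e-9, 822e-9),
    ("tests[1]", 3.07e-6, 2.17e-6),
    ("tests[2]", 28e-6, 19.4e-6),
    ("tests[3]", 236e-6, 153e-6),
    ("tests[4]", 2.12e-3, 1.5e-3),
    ("tests[5]", 23.8e-3, 15.6e-3),
    ("tests[6]", 271e-3, 192e-3),
    ("tests[7]", 2.66, 1.95),
    ("tests[8]", 25.9, 18.4),
    ("tests[9]", 6 * 60 + 17, 5 * 60 + 36),
]


def reduction(current, new):
    """Fraction of the current running time saved by the new version."""
    return 1 - new / current


for label, current, new in timings:
    print(f"{label}: {reduction(current, new):.1%} less time")
```
--------------------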

----------
components: Library (Lib)
messages: 401981
nosy: speedrun-program
priority: normal
pull_requests: 26808
severity: normal
status: open
title: use map function instead of genexpr in capwords
type: performance

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue45225>
_______________________________________
