[issue29882] Add an efficient popcount method for integers

2022-01-23 Thread Mark Dickinson


Mark Dickinson  added the comment:


New changeset 83a0ef2162aa379071e243f1b696aa6814edcd2a by Mark Dickinson in 
branch 'main':
bpo-29882: Fix portability bug introduced in GH-30774 (#30794)
https://github.com/python/cpython/commit/83a0ef2162aa379071e243f1b696aa6814edcd2a


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29882] Add an efficient popcount method for integers

2022-01-22 Thread Mark Dickinson


Change by Mark Dickinson :


--
pull_requests: +28979
pull_request: https://github.com/python/cpython/pull/30794

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29882] Add an efficient popcount method for integers

2022-01-21 Thread STINNER Victor


STINNER Victor  added the comment:


New changeset cd8de40b3b10311de2db7b90abdf80af9e35535f by Victor Stinner in 
branch 'main':
bpo-29882: _Py_popcount32() doesn't need 64x64 multiply (GH-30774)
https://github.com/python/cpython/commit/cd8de40b3b10311de2db7b90abdf80af9e35535f


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29882] Add an efficient popcount method for integers

2022-01-21 Thread STINNER Victor


Change by STINNER Victor :


--
pull_requests: +28959
pull_request: https://github.com/python/cpython/pull/30774

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29882] Add an efficient popcount method for integers

2020-06-28 Thread Vedran Čačić

Vedran Čačić  added the comment:

Well, bit_sum is what it really is. But I agree it's a terrible name. :-)

--
nosy: +veky

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29882] Add an efficient popcount method for integers

2020-06-08 Thread STINNER Victor


STINNER Victor  added the comment:


New changeset c6b292cdeee689f0bfac6c1e2c2d4e4e01fa8d9e by Victor Stinner in 
branch 'master':
bpo-29882: Add _Py_popcount32() function (GH-20518)
https://github.com/python/cpython/commit/c6b292cdeee689f0bfac6c1e2c2d4e4e01fa8d9e


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29882] Add an efficient popcount method for integers

2020-05-31 Thread Mark Dickinson


Mark Dickinson  added the comment:

A couple of other data points:

- Swift has nonzeroBitCount: 
https://developer.apple.com/documentation/swift/int/2886050-nonzerobitcount

- Rust has count_ones: https://doc.rust-lang.org/std/primitive.u64.html

- Go's math/bits package has OnesCount

- The closest thing in Mathematica appears to be DigitCount, which isn't 
base-specific.

@Mark Shannon: what name would you suggest, and why? The term "population 
count" feels too non-obvious and specialist to me, and anything involving 
"Hamming" likewise.

"count_ones" isn't obviously a bit operation.

"count_set_bits"?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29882] Add an efficient popcount method for integers

2020-05-31 Thread Mark Dickinson


Mark Dickinson  added the comment:

> Why are calling a population count method "bit_count()"?

Naming things is hard, but I don't think this is an unreasonable name, and it's 
not without precedent. Java similarly has Integer.bitCount and 
BigInteger.bitCount. MySQL has BIT_COUNT.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29882] Add an efficient popcount method for integers

2020-05-31 Thread Mark Shannon


Mark Shannon  added the comment:

Why are calling a population count method "bit_count()"?
That seems likely to cause confusion with "bit_length()".

I might reasonable expect that 0b1000.bit_count() be 4, not 1. It does have 4 
bits.
Whereas 0b1000.population_count() is clearly 1.

I have no objection to adding this method, just the choice of name.

--
nosy: +Mark.Shannon

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29882] Add an efficient popcount method for integers

2020-05-29 Thread STINNER Victor


Change by STINNER Victor :


--
pull_requests: +19762
pull_request: https://github.com/python/cpython/pull/20518

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29882] Add an efficient popcount method for integers

2020-05-29 Thread Mark Dickinson


Mark Dickinson  added the comment:


New changeset 8bd216dfede9cb2d5bedb67f20a30c99844dbfb8 by Niklas Fiekas in 
branch 'master':
bpo-29882: Add an efficient popcount method for integers (#771)
https://github.com/python/cpython/commit/8bd216dfede9cb2d5bedb67f20a30c99844dbfb8


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29882] Add an efficient popcount method for integers

2020-05-29 Thread Mark Dickinson


Change by Mark Dickinson :


--
resolution:  -> fixed
stage:  -> resolved
status: open -> closed
versions: +Python 3.10 -Python 3.7

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29882] Add an efficient popcount method for integers

2020-05-26 Thread Mark Dickinson


Mark Dickinson  added the comment:

PR is updated and mergeable.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29882] Add an efficient popcount method for integers

2020-05-25 Thread Mark Dickinson


Mark Dickinson  added the comment:

> For example, should numpy.int64 get this method as well?

That's for the NumPy folks to decide (and I've added Nathaniel Smith to the 
nosy in case he wants to comment), but I don't see any particularly strong 
reason that NumPy would need to add it. It looks as though the NumPy integer 
types have survived happily without a bit_length method, for example - I don't 
even see any issues in the NumPy tracker suggesting that anyone missed it. 
(Though perhaps that's because in the case of a NumPy int one always has at 
least an upper bound on the bit_length available.)

> What is the effect on https://docs.python.org/3.9/library/numbers.html?

No effect, just as int.bit_length has no effect.

> Does it make sense to call (True).popcount()?

It would be spelled `True.bit_count()` if the PR goes in as-is, but sure, why 
not. :-)

--
nosy: +njs

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29882] Add an efficient popcount method for integers

2020-05-25 Thread STINNER Victor


STINNER Victor  added the comment:

> In CPython itself: See count_set_bits in Modules/mathmodule.c

Python/hamt.c contains an optimized function:

static inline uint32_t
hamt_bitcount(uint32_t i)
{
/* We could use native popcount instruction but that would
   require to either add configure flags to enable SSE4.2
   support or to detect it dynamically.  Otherwise, we have
   a risk of CPython not working properly on older hardware.

   In practice, there's no observable difference in
   performance between using a popcount instruction or the
   following fallback code.

   The algorithm is copied from:
   https://graphics.stanford.edu/~seander/bithacks.html
*/
i = i - ((i >> 1) & 0x);
i = (i & 0x) + ((i >> 2) & 0x);
return (((i + (i >> 4)) & 0xF0F0F0F) * 0x1010101) >> 24;
}


Python/pymath.c provides a "unsigned int _Py_bit_length(unsigned long d)" 
function used by math.factorial, _PyLong_NumBits(), int.__format__(), long / 
long, _PyLong_Frexp() and PyLong_AsDouble(), etc.

Maybe we could add a _Py_bit_count().

See also bpo-29782: "Use __builtin_clzl for bits_in_digit if available" which 
proposes to micro-optimize _Py_bit_length().

--

In the meanwhile, I also added pycore_byteswap.h *internal* header which 
provides static inline function which *do* use builtin functions like 
__builtin_bswap32().

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29882] Add an efficient popcount method for integers

2020-05-25 Thread STINNER Victor


STINNER Victor  added the comment:

Adding a function to a new hypothetical imath module sounds reasonable.

I'm less comfortable with adding a new method to int type: it would mean that 
any int subtype "should" implement it.

For example, should numpy.int64 get this method as well?

What is the effect on https://docs.python.org/3.9/library/numbers.html?

Does it make sense to call (True).popcount()?

--
nosy: +vstinner

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29882] Add an efficient popcount method for integers

2020-05-25 Thread Tim Peters


Tim Peters  added the comment:

I see I never explicitly said +1, so I will now: +1 on merging this :-)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29882] Add an efficient popcount method for integers

2020-05-25 Thread Mark Dickinson


Mark Dickinson  added the comment:

I'm re-reviewing this discussion three years on. I'd be happy for this to go 
into 3.10. Are there other strong opinions? It would be good to either update 
and merge the PR, or close as rejected.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29882] Add an efficient popcount method for integers

2019-06-01 Thread Tim Peters


Tim Peters  added the comment:

I prefer that a negative int raise ValueError, but am OK with it using the 
absolute value instead (i.e., what it does now).

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29882] Add an efficient popcount method for integers

2019-06-01 Thread Mark Dickinson


Mark Dickinson  added the comment:

> I am going to add the imath module.

Aimed at 3.8, or 3.9? This seems somewhat rushed for 3.8.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29882] Add an efficient popcount method for integers

2019-06-01 Thread Mark Dickinson


Mark Dickinson  added the comment:

> Is everyone comfortable with how negative numbers are handled by this patch?

Not entirely, but it's not terribly wrong and it's consistent with how 
`int.bit_length` works. (I'm also a bit uncomfortable about `int.bit_length`s 
behaviour on negative numbers, but it is the way it is.)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29882] Add an efficient popcount method for integers

2019-06-01 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:

I am going to add the imath module. If we decide to add popcount(), it may be 
better to add it in this module instead of int class.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29882] Add an efficient popcount method for integers

2019-06-01 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

Is everyone comfortable with how negative numbers are handled by this patch?  
It might be better to limit the domain and raise a ValueError rather than make 
a presumption about what the user intends.

--
nosy: +rhettinger

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29882] Add an efficient popcount method for integers

2018-01-30 Thread Tamás Bajusz

Change by Tamás Bajusz :


--
nosy: +gbtami

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29882] Add an efficient popcount method for integers

2017-03-23 Thread Case Van Horsen

Case Van Horsen added the comment:

I like the name bit_count and I'll gladly add it to gmpy2 with the appropriate 
changes to exceptions, etc.

--
nosy: +casevh

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29882] Add an efficient popcount method for integers

2017-03-22 Thread Tim Peters

Tim Peters added the comment:

See also:

https://en.wikipedia.org/wiki/Hamming_weight

As that says, there are a number of languages and processors with first class 
support for a popcount function.  I've frequently implemented it in Python when 
using integers as integer bitsets (`i` is in the set if and only if bit `2**i` 
is set in the integer), which often - except for finding the cardinality - runs 
much faster than using general Python sets.

--
nosy: +tim.peters

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29882] Add an efficient popcount method for integers

2017-03-22 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

I think that adding bitarray or bitset (or both) in the stdlib would better 
satisfy the needs. There are open issues for adding ability to read or set 
selected bits or range of bits in int or for bitwise operations on bytes. I 
think that bitarray and bitset would provide better interface for these 
operations.

--
nosy: +serhiy.storchaka

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29882] Add an efficient popcount method for integers

2017-03-22 Thread Mark Dickinson

Mark Dickinson added the comment:

Many of those applications are really for bitstrings (chess bitboards, hamming 
distance), which aren't really the same thing as integers.

Nice find for the mathmodule.c case. I'd forgotten about that one (though 
according to git blame, apparently I'm responsible for checking it in). It's a 
fairly obscure corner case, though.

Overall, I'm -1 on adding this: I don't think it meets the bar of being useful 
enough to justify the extra method. I'd suggest that people needing this kind 
of efficient bitstring operation use a 3rd-party bitstring library instead.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29882] Add an efficient popcount method for integers

2017-03-22 Thread Niklas Fiekas

Niklas Fiekas added the comment:

Searching popcount in Python files on GitHub yields
a considerable number of examples:

https://github.com/search?utf8=%E2%9C%93=popcount+extension%3Apy=Code

Perhaps intresting:

* In CPython itself: See count_set_bits in
  Modules/mathmodule.c

* Domain specific applications: Bitboards in Chess,
  fairly shuffling cards in Poker, comparing molecules

* Size of bitsets (see bitarray and bitsets I listed above).
  Probably for this reason also as a first class citizen
  in Redis: https://redis.io/commands/bitcount.

Probably most important:

* As the Hamming Distance:
  https://en.wikipedia.org/wiki/Hamming_distance#History_and_applications

---

Btw. not a concrete application. I just stumbled upon this.
popcnt was considered important enough to be included in the
rather limited WebAssembly instruction set:
https://github.com/WebAssembly/spec/raw/master/papers/pldi2017.pdf

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29882] Add an efficient popcount method for integers

2017-03-22 Thread Mark Dickinson

Mark Dickinson added the comment:

Can you give some examples of concrete use-cases? I've spent the last six years 
or so writing scientific applications and parsing all sorts of odd binary 
formats, and haven't needed or wanted a popcount yet.

> (I am not a fan of the arbitrary return value).

Agreed: if this were implemented, I think raising ValueError would be the most 
appropriate thing to do for negative inputs.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29882] Add an efficient popcount method for integers

2017-03-22 Thread Jim Fasarakis-Hilliard

Changes by Jim Fasarakis-Hilliard :


--
nosy: +Jim Fasarakis-Hilliard

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29882] Add an efficient popcount method for integers

2017-03-22 Thread Niklas Fiekas

Changes by Niklas Fiekas :


--
pull_requests: +678

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29882] Add an efficient popcount method for integers

2017-03-22 Thread Niklas Fiekas

New submission from Niklas Fiekas:

An efficient popcount (something equivalent to bin(a).count("1")) would
be useful for numerics, parsing binary formats, scientific applications
and others.

DESIGN DECISIONS

 * Is a popcount method useful enough?
 * How to handle negative values?
 * How should the method be named?

SURVEY

gmpy calls the operation popcount and returns -1/None for negative values:

>>> import gmpy2
>>> gmpy2.popcount(-10)
-1

>>> import gmpy
>>> gmpy.popcount(-10)

>From the documentation [1]:

> If x < 0, the number of bits with value 1 is infinite
> so -1 is returned in that case.

(I am not a fan of the arbitrary return value).

The bitarray module has a count(value=True) method:

>>> from bitarray import bitarray
>>> bitarray(bin(123456789).strip("0b")).count()
16

Bitsets [2] exposes __len__.

There is an SSE4 POPCNT instruction. C compilers call the corresponding
intrinsic popcnt or popcount (with some prefix and suffix) and they
accept unsigned arguments.

Rust calls the operation count_ones [3]. Ones are counted in the binary
representation of the *absolute* value. (I propose to do the same).

Introducing popcount was previously considered here but closed for lack
of a PEP or patch: http://bugs.python.org/issue722647

Sensible names could be bit_count along the lines of the existing
bit_length or popcount for gmpy compability and to distinguish it more.

PERFORMANCE

$ ./python -m timeit "bin(123456789).count('1')"  # equivalent
100 loops, best of 5: 286 nsec per loop
$ ./python -m timeit "(123456789).bit_count()"  # fallback
500 loops, best of 5: 46.3 nsec per loop

[1] https://gmpy2.readthedocs.io/en/latest/mpz.html#mpz-functions
[2] https://pypi.python.org/pypi/bitsets/0.7.9
[3] https://doc.rust-lang.org/std/primitive.i64.html#method.count_ones

--
components: Interpreter Core
messages: 290003
nosy: mark.dickinson, niklasf
priority: normal
severity: normal
status: open
title: Add an efficient popcount method for integers
type: enhancement
versions: Python 3.7

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com