It may be worth noting that NumPy provides various set routines that
operate on its arrays (and hence on lists as well, by conversion):
https://docs.scipy.org/doc/numpy/reference/routines.set.html

For [intersection](https://docs.scipy.org/doc/numpy/reference/generated/numpy.intersect1d.html),
for example, they do the following:
1. Concatenate the arrays,
2. Sort the result,
3. Compare subsequent elements for equality.

This is most likely because each of these steps has an efficient
implementation as a C extension.
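
The three steps can be sketched roughly as follows. This is a simplified illustration, not NumPy's actual code: the function name `intersect_sorted` is made up for this example, and it deduplicates both inputs first so that two equal neighbours after sorting can only come from one element of each array.

```python
import numpy as np

def intersect_sorted(a, b):
    """Intersection of two 1-D inputs via concatenate, sort, compare neighbours."""
    # Deduplicate each input (assumption of this sketch): afterwards an equal
    # adjacent pair in the sorted result must contain one value from each input.
    a = np.unique(a)
    b = np.unique(b)
    merged = np.concatenate((a, b))  # step 1: concatenate
    merged.sort()                    # step 2: sort in place
    # step 3: keep every element that equals its successor
    return merged[:-1][merged[1:] == merged[:-1]]

print(intersect_sorted([1, 3, 4, 3], [3, 1, 2, 1]))
```

Every step here is a single vectorized call, which fits the guess above about why this approach was chosen.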

For [membership testing](https://github.com/numpy/numpy/blob/d9b1e32cb8ef90d6b4a47853241db2a28146a57d/numpy/lib/arraysetops.py#L560),
i.e. checking which elements of `a` are in `b`, however, they use a condition
to decide when it is faster to loop over `a` and `b` instead of using the
concat+sort approach:

    if len(b) < 10 * len(a) ** 0.145

I'm not sure where the exact numbers come from (perhaps from benchmarks?).
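
To make the trade-off concrete, here is a rough sketch of how such a switch could look. It is not the linked NumPy code: the names `is_member`, `loop_membership` and `sort_membership` are invented for this example, and only the threshold expression is taken from the condition quoted above.

```python
import numpy as np

def loop_membership(a, b):
    """One vectorized comparison against `a` per element of `b`."""
    mask = np.zeros(len(a), dtype=bool)
    for value in b:
        mask |= (a == value)
    return mask

def sort_membership(a, b):
    """Concat+sort variant: an element of `a` is in `b` if, after a stable
    sort of the combined unique values, it is followed by an equal value."""
    a_unique, inverse = np.unique(a, return_inverse=True)
    b_unique = np.unique(b)
    merged = np.concatenate((a_unique, b_unique))
    # A stable sort keeps the value from `a` ahead of an equal value from `b`.
    order = merged.argsort(kind="stable")
    s = merged[order]
    dup = np.zeros(len(s), dtype=bool)
    dup[:-1] = s[1:] == s[:-1]
    flags = np.empty(len(merged), dtype=bool)
    flags[order] = dup  # scatter the flags back to pre-sort positions
    return flags[:len(a_unique)][inverse]

def is_member(a, b):
    """Boolean mask telling which elements of `a` occur in `b`."""
    a = np.asarray(a).ravel()
    b = np.asarray(b).ravel()
    if len(b) < 10 * len(a) ** 0.145:  # threshold as quoted above
        return loop_membership(a, b)
    return sort_membership(a, b)

print(is_member([0, 1, 2, 5, 0], [0, 2]))
```

The loop does one pass over `a` for each element of `b`, so it only pays off while `b` is very small compared to `a`; that is what the slowly growing threshold expresses.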