On Sat, Sep 21, 2019 at 7:54 AM Richard Higginbotham <higgi...@gmail.com> wrote:
> I'm not concerned about micro-optimization, I'm concerned with macro
> performance on large data sets. Right now if someone comes to you with the
> simple use case of comparing two large lists, you have to tell them to
> convert to sets and then back again. Forget the new programmers who tried
> the naive way and quit because "python is too slow" -- they are waiting 20
> seconds when we could deliver it to them in 1 second. If I can get within
> 10x of C code, that seems like it would be useful for that type of use
> case. I don't know if my use case is common; I've used it on more esoteric
> problems that I wouldn't expect anyone else to tackle, and I'm used to
> dealing with a lot of data. But if it can be of help, I wanted to make it
> available.
>
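For context, here is a sketch of the two approaches under discussion: the naive nested scan that new programmers tend to write, and the "convert to sets first" idiom. (Function names are mine, just for illustration.)

```python
def intersect_naive(a, b):
    # O(len(a) * len(b)): each membership test scans the whole list b.
    # This is the version that feels "too slow" on large inputs.
    return [x for x in a if x in b]

def intersect_via_set(a, b):
    # O(len(a) + len(b)) on average: build a hash set once, then each
    # membership probe is constant time.
    b_set = set(b)
    return [x for x in a if x in b_set]
```

Both return the matching elements of `a` in their original order; only their scaling differs.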
If you directly construct a set from your source of truth (eg from a directory iterator, if you're looking at file names), you don't have to "convert a list to a set".

Even if you did have to convert (or technically, to construct a set with the elements of that list), how many elements do you need before it's actually going to take an appreciable amount of time? I don't mean a measurable amount of time - a handful of microseconds can be measured easily - but a length of time that would actually make people consider your program slow, which probably means half a second. Now multiply that number of elements by, say, 64 bytes apiece (15-character strings in Python 3.4+), and see if that would mean you have other issues than the actual hashing.

I think you're still looking at a micro-optimization here.

ChrisA
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/D7T744VLU66R66PHUXZ7Q2YKJE6BVJHV/
Code of Conduct: http://python.org/psf/codeofconduct/
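The back-of-the-envelope argument above is easy to check directly: time how long `set()` actually takes over a large list of short strings. This is only a sketch (absolute numbers depend on the machine), using ~15-character filenames as in the example; the directory-iterator point corresponds to building the set directly, e.g. `{e.name for e in os.scandir(path)}`, with no intermediate list at all.

```python
import timeit

# Build a list of one million ~15-character strings, standing in for
# file names. The claim being tested: set construction at this scale
# is nowhere near the half-second threshold of perceived slowness.
n = 1_000_000
data = [f"file_{i:010d}"[:15] for i in range(n)]

# Average over a few passes; prints elapsed seconds per set() call.
per_pass = timeit.timeit(lambda: set(data), number=5) / 5
print(f"set() over {n:,} strings: {per_pass:.4f} s per pass")
```

On typical hardware this lands in the tens of milliseconds, which supports the "you have other issues before the hashing matters" point.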