Thank you, I have spent a fair bit of time on these and found optimal solutions (at least I think so), but they are still multiple times slower than python builtin function.
I am currently out of ideas, maybe will dig something out in time. > On 14 Nov 2023, at 07:23, Thomas Passin via Python-list > <python-list@python.org> wrote: > > On 11/13/2023 11:44 PM, AVI GROSS via Python-list wrote: >> Dom, >> I hear you. >> As you say, writing your own extension in something like C++ may not appeal >> to you even if it is faster. >> I was wondering if using a generator or something similar in R might make >> sense. >> I mean what happens if you write a function that includes a "yield" or two >> and does a part of what you want. It maintains some internal state between >> invocations. So you can call it once to setup things then call it repeatedly >> to keep processing the next item. You stop calling it when you get a result >> you want, such as that it has seen what you want N times. >> Since the code stays in memory, it may effectively run faster than some >> other kinds of functions calls. It can keep things in internal storage such >> as not just how many N you want but how many it has seen. > > I'm inclined to just turn the iterable into a set to get the values, then > iterate through those values calling count() on a listified version of the > iterable. If the count >= target, return. > > It may not be the fastest one could do but it's simple and probably pretty > fast for many uses. I suppose that for some iterables it would be better not > to turn them into lists, but one could figure out about that after working > out more carefully what cases need to be covered. > >> Your outer function can maintain a list of the items you want to XOR or >> generate a new one dynamically as needed. It can use functional programming >> techniques to create a new customized version of the iterator, such as with >> a value of N built in. You would then call the outer function and let it use >> the inner function till the result is available or until the data in the >> iterator runs out or perhaps other tweaks involving two way communication of >> sorts between the functions. >> I am NOT suggesting this approach is optimal or fast but merely wondering if >> something along these lines is worth trying that might speed things up even >> if not very fast. Such approaches can be even more effective if what you are >> working on need not all be instantiated up front but can be dynamically >> calculated or incrementally read from files. With care, you can make >> multiple instantiations that each iterate over their own sets of data >> without interference. >> Just a thought. In a sense, this can be a slightly decent substitute for the >> non-standard evaluation in R where you can arrange for lots of your data to >> not be interpreted till absolutely needed. >> -----Original Message----- >> From: Dom Grigonis <dom.grigo...@gmail.com> >> Sent: Monday, November 13, 2023 10:12 PM >> To: avi.e.gr...@gmail.com >> Cc: Grant Edwards <grant.b.edwa...@gmail.com>; Python >> <python-list@python.org> >> Subject: Re: xor operator (DEPRECATED) >> Fair point. However, I gave it a shot for the following reason: >> I couldn’t find a way to make such performant function. Using python builtin >> components still ends up several times slower than builtin `all`. Cython or >> numba or similar is not an option as they do not support `truth` values. Or >> if they do, it ends up slower than pure python variant. >> So all what is left is writing a proper extension. Which I would prefer not >> to do for 1 function. I thought maybe `xor`, as in logical XOR functionality >> in its vanilla case could be compelling. And after doing a bit of search I >> see that very very few languages have that and it seems for a good reason. >> Some that do: R, thats all I could find. Although some (if not many) went >> through the proposal phase. And yes, none of them have a function that I am >> proposing. >> So yes, you are right, not a good proposal. >> But there still seems to be the need for short-circuiting performant >> implementations in python space. The issue is that there are many variants >> of what might be needed while there is no efficient solution to sourcing >> predicates from python to lower level implementations. Someone mentioned >> that numpy experimented with such implementations in C, but they did not get >> anywhere with it. >> The best I could come up with is cached numba for numpy problems, which does >> perform very well and more than worth it if function is re-used. It even >> ends up faster than cython or cffi extensions, however can’t have many of >> those due to JIT and AOT is currently being deprecated (which wouldn’t solve >> anything anyway). However, as I mentioned earlier it does not apply to this >> case. >> So it’s either: >> a) Something very clever and flexible implemented that covers most of such >> needs and doesn’t require predicates. >> b) I welcome any thoughts on this. >> DG >>> On 14 Nov 2023, at 04:27, AVI GROSS via Python-list >>> <python-list@python.org> wrote: >>> >>> I was going to ask a dumb question. Has any other language you know of made >>> something available that does what is being asked for and included it in the >>> main program environment rather than an add-on? >>> >>> A secondary mention here has been whether short-circuiting functions like >>> "any" and "all" have been augmented with something like "has_n" that >>> evaluates arguments till it has found n or perhaps n+1 of what it wants then >>> skips the rest. Does any language supply something like that? What would >>> such a function return and does it have an "any" or an "all" side? >>> >>> It sounds like if I asked if a list of integers has at least n prime numbers >>> in "any" mode, it should ignore any that are not primes till it finds n >>> primes or fails and returns true or false. If in "all" mode, I assume it >>> would have to be the first n items without a failure. >>> >>> Fine, but then someone may want to know WHERE you stopped or for you to >>> return the sublist of the ones that made the match, or even return >>> everything that was skipped so you can later process that. Consider a long >>> list of jurors you process to place a dozen that qualify on a jury and then >>> later you want to choose from among the rest for another jury. >>> >>> Human minds can come up with an amazing number of ideas including for >>> "useful" functions or features but I find the vast majority would rarely be >>> used as nobody remembers it is available and some fairly simple method using >>> other functions can easily be cobbled together. >>> >>> -----Original Message----- >>> From: Python-list <python-list-bounces+avi.e.gross=gmail....@python.org> On >>> Behalf Of Grant Edwards via Python-list >>> Sent: Monday, November 13, 2023 8:19 PM >>> To: python-list@python.org >>> Subject: Re: xor operator >>> >>> On 2023-11-14, Dom Grigonis via Python-list <python-list@python.org> wrote: >>>> >>>>> Except the 'any' and 'all' builtins are _exactly_ the same as bitwise >>>>> or and and applided to many bits. To do something "in line" with that >>>>> using the 'xor' operator would return True for an odd number of True >>>>> values and False for an even Number of True values. >>>> >>>> Fair point. >>>> >>>> Have you ever encountered the need for xor for many bits (the one >>>> that I am NOT referring to)? Would be interested in what sort of >>>> case it could be useful. >>> >>> Yes, it's used all the time in low-level communications protocols, >>> where it's often implemented in hardware. But, it is also not at all >>> unusual to implement it in software. >>> >>> It's also not that unusual for the "count-ones" part of the function >>> you're asking for to be implemented in hardware by a CPU having an >>> instruction that counts the number of 1 bits in a register. >>> >>> GCC has a low-level builtins called __builtin_popcount() and >>> __builtin-popcountl() that counts the number of 1's in an unsigned >>> (long) int. >>> >>> >>> -- >>> https://mail.python.org/mailman/listinfo/python-list >>> >>> -- >>> https://mail.python.org/mailman/listinfo/python-list > > -- > https://mail.python.org/mailman/listinfo/python-list > <https://mail.python.org/mailman/listinfo/python-list> -- https://mail.python.org/mailman/listinfo/python-list