On 7/4/12 3:19 PM, Jonathan M Davis wrote:
In theory all of these redundant operations can be done away with
optimization techniques, but probably some aren't. So we should look
first at optimizing this flow heavily before looking at an API addition
that has disadvantages in other dimensions.

The problem is that there are inherent costs in calling both front and
popFront which can be avoided if they're done as a single operation. front
must call decode (or code that does its equivalent), and popFront must call
stride (or code that does its equivalent). There are duplicate calculations
there that cannot be avoided as long as they're separate operations.
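
For concreteness, here is a minimal sketch of the two iteration styles being compared. It assumes the string front and popFront primitives reachable through std.range, plus std.utf.decode. The consumeFront below is a hypothetical helper written only for illustration; it is not part of Phobos and is not necessarily the implementation that was benchmarked.

```d
import std.range : empty, front, popFront;
import std.utf : decode;

// Hypothetical helper (not in Phobos): decode the first code point and
// advance the string past it in a single step.
dchar consumeFront(ref string s)
{
    size_t i = 0;
    immutable dchar c = decode(s, i); // decode advances i past the sequence
    s = s[i .. $];
    return c;
}

void main()
{
    string a = "héllo";
    string b = a;

    // Style 1: front decodes the sequence, then popFront has to work out
    // the length of that same sequence again before it can advance.
    dchar[] viaFrontPop;
    while (!a.empty)
    {
        viaFrontPop ~= a.front;
        a.popFront();
    }

    // Style 2: one call decodes and advances, reusing the index that
    // decode has already computed.
    dchar[] viaConsume;
    while (b.length)
        viaConsume ~= consumeFront(b);

    // Both styles must yield the same code points.
    assert(viaFrontPop == viaConsume);
}
```

The sketch only shows where the duplicated work sits; it says nothing about how large the difference is once the optimizer has had its turn.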

This pretty much rehashes stuff that has been discussed already. (I'm only mentioning this because it only adds uninformative chaff to an already long message, which discourages one from reading it.)

Now, if you want to argue that if we can get that cost down low enough, adding
consumeFront is not worth the extra burden on the range API and the developers
using it, then okay. That's a valid argument, and looking to further optimize
front and popFront is beneficial regardless.

That too :o).

But at present, I'm seeing a performance improvement of approximately 70 - 80%
in iterating over strings with consumeFront rather than front and popFront
(depending on the compiler flags and strings used).

Great. Could you please post some code so we can play with it? Thanks.

And odds are, if you're
optimizing front and popFront, then you'll likely get the same optimizations
in consumeFront.

That reasoning is only partly applicable. To make a long story short, many of the optimizations I mentioned will indeed accelerate both approaches, but will also reduce the speed difference between them because they'll benefit front/popFront twice as much.


Andrei
