On Sunday, 31 January 2021 at 00:53:05 UTC, Steven Schveighoffer wrote:
I'd suggest trying it in reverse. If you have the sequence "cba", "ba", "a", then determining "a" is in "ba" is probably cheaper than determining "a" is in "cba".

I have user requirements that this application track string IDs that get collapsed under parents. To minimize/simplify array concatenations, I figured that going in descending order might make those operations less expensive.

Though I think this is also worth trying.

Are you still convinced that it's possible to do it in under 2 seconds? That would seem a huge discrepancy. If not, what specifically are you looking for in terms of performance?

Good question. As a sanity check, at some point this past week, I re-wrote everything in a single contained Perl program, and running that program on the dataset I've shared above took well over 2 hours of wall time.

Considering that the index() function in Perl is pre-compiled, I imagine that Dlang running 4-5x faster is a pretty good speed boost, as it may only be benefiting from the speed of compiled loops over interpreted loops.

What confuses me, at this point, is this: I originally wrote the D code using foreach in this style:

foreach( i, ref parentString; strings )
{
    foreach( j, ref childString; strings[ i + 1 .. $ ] )
    {
        // ...
    }
}

Essentially, the value of j printed to stdout should always be larger than the value of i. Yet, when running in the above style, the values of j reported to stdout inevitably become smaller than i, suggesting that the loop is somehow traversing backwards. How can this be explained?

Reply via email to