On Tuesday, 26 January 2021 at 18:17:31 UTC, H. S. Teoh wrote:
> Do not do this. Every time you call .array it allocates a new
> array and copies all its contents over. If this code runs
> frequently, it will cause a big performance hit, not to mention
> high GC load.
> The function you're looking for is .release, not .array.
Many thanks for the tip! I look forward to trying this soon. For
reference, the .array call is only performed once.
> That nested loop is an O(n^2) algorithm. Meaning it will slow
> down *very* quickly as the size of the array n increases. You
> might want to think about how to improve this algorithm.
Nice observation, and yes, a nested loop like this is O(n^2) in
the general case.
However, because the input dataset is first reduced to unique
strings and then sorted in descending length, the inner foreach
does not iterate over all of n; it only runs from index i+1
through the end of the array.
So the comparison count is roughly n^2/2; more precisely, it is
(n^2 - n)/2. Asymptotically that is still O(n^2), but the
constant factor is halved.
Further: the original dataset has 64k strings; squaring that
yields roughly 4.1 billion string comparisons.
Once de-duplicated, the dataset shrinks to ~46k strings. At
roughly n^2/2 comparisons, that works out to about 1.06 billion.
So steps 1 through 3 cut the brute-force string comparison count
roughly four-fold on my development dataset.
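For reference, here is a minimal sketch of the triangular loop
pattern described above (the data and names are hypothetical, not
my actual code):

```d
import std.stdio;

void main()
{
    // Stand-in for the de-duplicated, length-sorted dataset.
    string[] words = ["banana", "apple", "nan", "app"];
    size_t comparisons = 0;

    foreach (i, a; words)
    {
        // The inner loop starts at i+1, so each unordered pair
        // is examined exactly once.
        foreach (b; words[i + 1 .. $])
        {
            ++comparisons;
            // ... the actual substring test would go here ...
        }
    }

    // For n items this performs (n^2 - n) / 2 comparisons:
    // here n = 4, so 4 * 3 / 2 = 6.
    writeln(comparisons); // prints 6
}
```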
> Using AA's may not necessarily improve performance. It depends
> on what your code does with it. Because AA's require random
> access to memory, it's not friendly to the CPU cache hierarchy,
> whereas traversing linear arrays is more cache-friendly and in
> some cases will out-perform AA's.
I figured a built-in AA might be an efficient path to performing
unique string de-duplication. If there's a more performant method
available, I'll certainly try it.
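For concreteness, this is roughly what I mean by using a built-in
AA as a set for de-duplication (a sketch with made-up data, not
my actual code):

```d
import std.stdio;

void main()
{
    string[] input = ["apple", "banana", "apple", "cherry", "banana"];

    // The AA serves purely as a set; the value type is irrelevant.
    bool[string] seen;
    string[] unique;
    foreach (s; input)
    {
        if (s !in seen)
        {
            seen[s] = true;
            unique ~= s;
        }
    }

    writeln(unique); // ["apple", "banana", "cherry"]
}
```

If cache behavior turns out to matter, an alternative that stays
linear in memory is to sort the array and then apply `uniq` from
std.algorithm, which drops adjacent duplicates without any
hashing.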
> First of all, you need to use a profiler to identify where the
> hotspots are. Otherwise you could well be pouring tons of
> effort into "optimizing" code that doesn't actually need
> optimizing, while completely missing the real source of the
> problem. Whenever you run into performance problems, do not
> assume you know where the problem is, profile, profile, profile!
Message received. Given that D is the first compiled language
I've semi-seriously dabbled with, I have no real experience with
profiler usage.
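For anyone else in the same boat: as I understand it, DMD ships a
built-in instrumenting profiler, so a first pass needs no external
tooling (flags sketched below; `app.d` is a placeholder filename):

```shell
# Build with instrumentation; the binary writes trace.log on exit,
# listing per-function call counts and timings.
dmd -profile app.d
./app

# GC allocation profiling is a separate mode; it writes
# profilegc.log on exit, showing allocations per call site.
dmd -profile=gc app.d
./app
```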
> Second, you only posted a small fragment of your code, so it's
> hard to say where the problem really is. I can only guess
> based on what you described. If you could post the entire
> program, or at least a complete, compilable and runnable
> excerpt thereof that displays the same (or similar) performance
> problems, then we could better help you pinpoint where the
> problem is.
Yes, if subsequent efforts continue to fail, I'll put together a
complete, compilable, and runnable demo of the issue.
For professional reasons (I no longer work in academia), I cannot
share the original source code, but I can attempt to reproduce
the problem in a minimal, complete form against a public dataset.