On Wed, Jan 27, 2021 at 01:28:33AM +0000, Paul Backus via Digitalmars-d-learn wrote:
> On Tuesday, 26 January 2021 at 23:57:43 UTC, methonash wrote:
> > > Using AA's may not necessarily improve performance. It depends on
> > > what your code does with it. Because AA's require random access
> > > to memory, it's not friendly to the CPU cache hierarchy, whereas
> > > traversing linear arrays is more cache-friendly and in some cases
> > > will out-perform AA's.
> >
> > I figured a built-in AA might be an efficient path to performing
> > unique string de-duplication. If there's a more performant method
> > available, I'll certainly try it.
>
> You could try sorting the array first, and then using `uniq` [1] to
> discard duplicate elements. There's an example in the docs that shows
> how to do this in-place (without allocating additional memory).
>
> [1] http://phobos.dpldocs.info/std.algorithm.iteration.uniq.html
Yes, definitely try this. It will completely eliminate the overhead of
using an AA, which has to allocate memory (at least) once per entry
added. And since the data has to be sorted eventually anyway, you might
as well sort first and then use the sortedness as a convenient property
for fast de-duplication. Because .uniq traverses the range linearly,
this will be cache-friendly, and along with eliminating GC load it
should give you a speed boost.


T

-- 
Nearly all men can stand adversity, but if you want to test a man's
character, give him power. -- Abraham Lincoln
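For illustration, a minimal sketch of the sort-then-uniq approach (the
array name `words` and the sample data are made up; .uniq only removes
*adjacent* duplicates, which is exactly what sorting guarantees):

```d
import std.algorithm.iteration : uniq;
import std.algorithm.sorting : sort;
import std.array : array;
import std.stdio : writeln;

void main()
{
    string[] words = ["beta", "alpha", "beta", "gamma", "alpha"];

    // Sort in place so duplicates become adjacent, then lazily skip
    // consecutive duplicates with uniq. No AA, no per-entry allocation;
    // only the final .array call allocates, once, for the result.
    auto deduped = words.sort.uniq.array;

    writeln(deduped); // ["alpha", "beta", "gamma"]
}
```

If even that one allocation matters, the uniq docs linked above show how
to compact the unique elements into the front of the existing array
instead of materializing a new one.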