Re: AA Performance in Benchmarks

2015-04-26 Thread Martin Nowak via Digitalmars-d
On 04/26/2015 12:58 PM, Daniel Murphy wrote:
 
 Yes, if we had an AA in phobos.

Yes, I think phobos should get a few more optimized containers.

SparseSet
DenseSet
SparseHash
DenseHash

They should be configurable w.r.t. load factor, equality comparison, and
allocation policy (when std.allocator is available).

http://sparsehash.googlecode.com/svn/trunk/doc/index.html

As the built-in druntime AA will likely always remain a general-purpose
hash table, we can't apply all possible optimizations there.
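
For illustration, a rough sketch of what such a configurable container
could look like (all names here are hypothetical, nothing like this
exists in Phobos; load factor and equality predicate are template
parameters, and allocation is left to the GC until std.allocator lands):

```
// Hypothetical sketch only: a dense, open-addressing hash set with a
// configurable load factor and equality predicate.
struct DenseHashSet(K, double maxLoad = 0.75,
                    alias equal = (a, b) => a == b)
{
    private K[] keys;       // power-of-two sized table
    private bool[] filled;
    private size_t used;

    // Grow the table so it can hold n elements below the load factor.
    void reserve(size_t n)
    {
        size_t cap = 8;
        while (cap * maxLoad < n)
            cap *= 2;
        if (cap <= keys.length)
            return;
        auto oldKeys = keys;
        auto oldFilled = filled;
        keys = new K[cap];
        filled = new bool[cap];
        used = 0;
        foreach (i, ok; oldFilled)  // reinsert existing entries
            if (ok)
                insert(oldKeys[i]);
    }

    bool insert(K key)
    {
        if (keys.length == 0 || used + 1 > cast(size_t)(keys.length * maxLoad))
            reserve(used + 1);
        auto mask = keys.length - 1;
        auto i = hashOf(key) & mask;
        while (filled[i])
        {
            if (equal(keys[i], key))
                return false;       // already present
            i = (i + 1) & mask;     // linear probing
        }
        keys[i] = key;
        filled[i] = true;
        ++used;
        return true;
    }

    bool contains(K key)
    {
        if (keys.length == 0)
            return false;
        auto mask = keys.length - 1;
        auto i = hashOf(key) & mask;
        while (filled[i])
        {
            if (equal(keys[i], key))
                return true;
            i = (i + 1) & mask;
        }
        return false;
    }
}

unittest
{
    DenseHashSet!string set;
    set.reserve(100);
    set.insert("foo");
    assert(set.contains("foo") && !set.contains("bar"));
}
```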


Re: AA Performance in Benchmarks

2015-04-26 Thread Daniel Murphy via Digitalmars-d
Laeeth Isharc  wrote in message 
news:hnihyhgtmdzhszkpn...@forum.dlang.org...


At a slight tangent: is there a way to reserve capacity for an associative 
array?  The obvious way does not seem to work.


No.
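
(For context, the "obvious way" is presumably `reserve`, which exists
for dynamic arrays but has no counterpart for built-in AAs; a short
illustration, with `rehash` as the closest available knob:)

```
import std.conv : to;

void main()
{
    int[] arr;
    arr.reserve(1024);      // fine: dynamic arrays support reserve

    int[string] aa;
    // aa.reserve(1024);    // does not compile: built-in AAs have no reserve

    // The nearest substitute: fill the AA, then reorganize it in place.
    foreach (i; 0 .. 1024)
        aa[to!string(i)] = i;
    aa.rehash();
}
```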

I noticed also in a talk at nativecode (possibly by Herb Sutter) that C++ 
gives you the ability to tune almost any parameter of the associative 
array implementation in the standard library.  (I'm not a C++ guy).  Is 
this entirely spurious, or would it be helpful for D?


Yes, if we had an AA in phobos. 



Re: AA Performance in Benchmarks

2015-04-25 Thread Martin Nowak via Digitalmars-d
On 04/23/2015 05:29 PM, Steven Schveighoffer wrote:
 
 https://github.com/D-Programming-Language/druntime/pull/1229
 
 It would be interesting to know how the new AA performs in this test.

Went from 11s down to 9s, ~20% improvement.


Re: AA Performance in Benchmarks

2015-04-25 Thread Laeeth Isharc via Digitalmars-d
On Thursday, 23 April 2015 at 15:29:37 UTC, Steven Schveighoffer 
wrote:

On 4/23/15 10:40 AM, Shammah Chancellor wrote:

2) I saw a thread a while ago where someone had posted an 
updated implementation of the AA with benchmarks and showed 
that it was substantially faster in all cases.  I can't find 
the thread again -- what happened with this work?


There is a PR being discussed right now. Please take a look and 
chime in:


https://github.com/D-Programming-Language/druntime/pull/1229

It would be interesting to know how the new AA performs in this 
test.


-Steve


At a slight tangent: is there a way to reserve capacity for an 
associative array?  The obvious way does not seem to work.


I noticed also in a talk at nativecode (possibly by Herb Sutter) 
that C++ gives you the ability to tune almost any parameter of 
the associative array implementation in the standard library.  
(I'm not a C++ guy).  Is this entirely spurious, or would it be 
helpful for D?



Laeeth.


Re: AA Performance in Benchmarks

2015-04-24 Thread Chris via Digitalmars-d

On Friday, 24 April 2015 at 09:18:48 UTC, Chris wrote:
On Thursday, 23 April 2015 at 14:40:58 UTC, Shammah Chancellor 
wrote:
So, I was tooling around with one of the benchmarks I saw 
listed on twitter:


https://github.com/kostya/benchmarks/tree/master/brainfuck

This D benchmark spends most of its time on AA lookups. 
Discussions about the implementation aside, I am curious as to 
why the D AA is so slow even after calling AA.rehash.


I took a look at the Nim tables implementation, and it looks 
like they use power-of-two table sizes and rely on the 
full-period property of the step (5*h + 1) mod 2^k to get the 
next slot in the case of a collision:


```
proc nextTry(h, maxHash: THash): THash {.inline.} =
  result = ((5 * h) + 1) and maxHash
```
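
(A toy D sketch of that probe step, to make its behavior concrete:
with a power-of-two table size, h -> (5*h + 1) & mask has full period,
so it visits every slot once before repeating. An illustration only,
not the Nim or druntime code:)

```
// Nim's probe step, transcribed: with table size 2^k and
// mask = 2^k - 1, this is a full-period LCG, so successive
// probes visit every slot exactly once before repeating.
size_t nextTry(size_t h, size_t mask)
{
    return (5 * h + 1) & mask;
}

unittest
{
    enum size_t mask = 15;  // a 16-slot table
    bool[16] seen;
    size_t h = 0;
    foreach (_; 0 .. 16)
    {
        assert(!seen[h]);   // no slot is visited twice...
        seen[h] = true;
        h = nextTry(h, mask);
    }
    // ...so after 16 steps every slot has been probed.
}
```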

So, my questions:

1) What are the general purpose performance implications of 
something like this?
2) I saw a thread a while ago where someone had posted an 
updated implementation of the AA with benchmarks and showed 
that it was substantially faster in all cases.  I can't find 
the thread again -- what happened with this work?


-Shammah


Interesting. I noticed while doing some benchmarking that 
looking up data that is stored in an AA is slower than 
generating the data on the fly. I was really surprised.


AA lookup is more often than not slightly slower than, and at 
best about as fast as, generating the data on demand:


import std.datetime : benchmark, Duration;
import std.conv : to;
import std.stdio : writeln;

enum {
  // N.B.: enum AAs are manifest constants, so every use of `words`
  // constructs a fresh AA literal at runtime.
  string[string] words = [
    "Data1": "Blah",
    "Data2": "Blub",
  ],
  string[] letters = ["B", "l", "a", "h"]
}

void main() {
  words.rehash();  // rehashes a temporary copy of the enum literal
  auto result = benchmark!(lookup, make)(100_000);
  writeln(to!Duration(result[0]));
  writeln(to!Duration(result[1]));
}

string lookup() {
  auto blah = words["Data1"];
  return blah;
}

string make() {
  string res;
  foreach (c; letters)
    res ~= c;
  return res;
}

dmd v2.067.0 -release -O -boundscheck=off -inline
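
(A variant that builds the table once at startup, so the loop measures
the lookup alone rather than reconstruction of the enum AA literal,
might look like the following sketch; a module constructor is just one
way to set it up:)

```
import std.datetime : benchmark, Duration;
import std.conv : to;
import std.stdio : writeln;

immutable string[] letters = ["B", "l", "a", "h"];
string[string] words;

static this() {
  // Built once per thread at startup, so lookup() measures only the lookup.
  words = ["Data1": "Blah", "Data2": "Blub"];
  words.rehash();
}

string lookup() {
  return words["Data1"];
}

string make() {
  string res;
  foreach (c; letters)
    res ~= c;
  return res;
}

void main() {
  auto result = benchmark!(lookup, make)(100_000);
  writeln(to!Duration(result[0]));  // AA lookup
  writeln(to!Duration(result[1]));  // string construction
}
```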



Re: AA Performance in Benchmarks

2015-04-24 Thread Chris via Digitalmars-d

On Friday, 24 April 2015 at 11:30:14 UTC, Chris wrote:

[...]


Correction: I meant to say creating the data is slower here, of 
course. Ah, well, it's Friday!


Re: AA Performance in Benchmarks

2015-04-24 Thread Chris via Digitalmars-d
On Thursday, 23 April 2015 at 14:40:58 UTC, Shammah Chancellor 
wrote:
So, I was tooling around with one of the benchmarks I saw 
listed on twitter:


https://github.com/kostya/benchmarks/tree/master/brainfuck

This D benchmark spends most of its time on AA lookups. 
Discussions about the implementation aside, I am curious as to 
why the D AA is so slow even after calling AA.rehash.


I took a look at the Nim tables implementation, and it looks 
like they use power-of-two table sizes and rely on the 
full-period property of the step (5*h + 1) mod 2^k to get the 
next slot in the case of a collision:


```
proc nextTry(h, maxHash: THash): THash {.inline.} =
  result = ((5 * h) + 1) and maxHash
```

So, my questions:

1) What are the general purpose performance implications of 
something like this?
2) I saw a thread a while ago where someone had posted an 
updated implementation of the AA with benchmarks and showed 
that it was substantially faster in all cases.  I can't find 
the thread again -- what happened with this work?


-Shammah


Interesting. I noticed while doing some benchmarking that looking 
up data that is stored in an AA is slower than generating the 
data on the fly. I was really surprised.


Re: AA Performance in Benchmarks

2015-04-23 Thread Steven Schveighoffer via Digitalmars-d

On 4/23/15 10:40 AM, Shammah Chancellor wrote:


2) I saw a thread a while ago where someone had posted an updated
implementation of the AA with benchmarks and showed that it was
substantially faster in all cases.  I can't find the thread again --
what happened with this work?


There is a PR being discussed right now. Please take a look and chime in:

https://github.com/D-Programming-Language/druntime/pull/1229

It would be interesting to know how the new AA performs in this test.

-Steve