Why Seq search is faster than Table search

2020-07-06 Thread cyberlis
I have this piece of code


import std/monotimes
import tables
import random
var myTable = initTable[int, int]()
var myList: seq[int]

for i in 0 .. 1_000_000:
myTable[i] = i
myList.add(i)

myList.shuffle()

var start = getMonoTime()
for i in 0 ..< 1_000_000:
discard myList.contains(i)
echo("List: ", getMonoTime() - start)

start = getMonoTime()
for i in 0 ..< 1_000_000:
discard myTable.hasKey(i)
echo("Table: ", getMonoTime() - start)


Run

Compile with this command nim c -d:release -r speed_test.nim

And here is my results 


List: (seconds: 0, nanosecond: 9384)
Table: (seconds: 0, nanosecond: 6553878)


Run

My question is why linear search of seq is faster than constant search of 
tables?

I know how tables work and I understand that Tables has to calculate hash for 
each key But it is main feature of hash maps to give constant search. I 
shuffled list before search

P.S. When I asked this question on Gitter I faced aggression and sarcasm 


 shit, we have tables for no reason
 it's not that tables are expensive, it's that finding the first 
element in a seq is fast.


Run

I sent this link 
[https://play.nim-lang.org/#ix=2r1w](https://play.nim-lang.org/#ix=2r1w) and if 
you look carefully my list has reverse order 


for i in countdown(1_000_000, 0):
myList.add(i)


Run

It was rude and not very welcoming


Re: Why Seq search is faster than Table search

2020-07-06 Thread Yardanico
Just about the messages, you didn't include this one (and that's why narimiran 
replied like that): 


  Than why we need tables if array search faster?


Run


Re: Why Seq search is faster than Table search

2020-07-06 Thread cyberlis
And still I just asked very curious question and I didn't accuse anybody. I 
didn't said that hash maps useless. I asked why tables are slow


Re: Why Seq search is faster than Table search

2020-07-06 Thread jasonfi
When performing benchmarks previously in C/C++ I noticed that lookup operations 
on vectors were extremely fast, so fast that I thought that perhaps the code 
didn't run. All of these tests only performed a simple operation(s) in a loop, 
similar to what you are doing. I think it has to do with highly efficient 
caching when this type of structure used in a loop.

In more realistic code, as opposed to this benchmark-style code, your results 
could be different.


Re: Why Seq search is faster than Table search

2020-07-06 Thread cyberlis
" Than why we need tables if array search faster?" I can't find here 
word useless. And I said Tables which is nim's implementation of hash map. I 
didn't say why we need hash maps


Re: Why Seq search is faster than Table search

2020-07-06 Thread Yardanico
Well we need Tables for the same reason we need hash tables. You shouldn't 
generalise your results from this microbenchmark onto all possible uses of 
Tables


Re: Why Seq search is faster than Table search

2020-07-06 Thread cyberlis
Right now I understand this idea. Thank you


Re: Why Seq search is faster than Table search

2020-07-06 Thread cyberlis
Thank you. I highly appreciate your answer


Re: Why Seq search is faster than Table search

2020-07-06 Thread miran
> I faced aggression and sarcasm

I can assure you that was just a plain sarcasm without even a slightest bit of 
aggression.

> I didn't accuse anybody

...except now on the forum, when you accused others for aggression because you 
didn't get/like their humor.

> I didn't said that hash maps useless

I quote your exact comment: " Than why we need tables if array search 
faster?"


Re: Why Seq search is faster than Table search

2020-07-06 Thread Yardanico
Yeah it seems like the C compiler is optimising a lot of stuff away


Re: Why Seq search is faster than Table search

2020-07-06 Thread jackhftang
The list one should O(N^2) and here N = 10^6. Unless you have a very very 
special CPU, it can't complete in 9384ns...


Re: Why Seq search is faster than Table search

2020-07-06 Thread spip
Please note also that the code version in the 
[playground](https://play.nim-lang.org/#ix=2r1w) tests only on the `0` element 
of `seq` and `Table`.


...
for _ in 0 ..< 1_000_000:
discard myList.contains(0)
echo("List: ", getMonoTime() - start)

start = getMonoTime()
for _ in 0 ..< 1_000_000:
discard myTable.hasKey(0)
...


Run

Also, the `MyTable` and `MyList` don't contain the same elements as you used 
`countdown(1_000_000, 0)` to fill the `seq` but excluded `1_000_000` when doing 
the tests.


Re: Why Seq search is faster than Table search

2020-07-06 Thread jackhftang
This this is the result on my machine if I change N to 1000 


List: (seconds: 0, nanosecond: 296203084)
Table: (seconds: 0, nanosecond: 4101)


Run


Re: Why Seq search is faster than Table search

2020-07-06 Thread lscrd
I changed the count for 1_000_000 to 100_000 and compiled your test with stable 
and devel in release mode and my results are totally different. With version 
1.2.4:


List: (seconds: 3, nanosecond: 71413798)
Table: (seconds: 0, nanosecond: 496792)


Run

So, I retried with 1_000_000 and got these results:


List: (seconds: 326, nanosecond: 776822305)
Table: (seconds: 0, nanosecond: 4693424)


Run

The times are consistent. There are 10 more values in the list and we do 10 
more searches, so we can expect 100 more time to do the search which is roughly 
what we get. For the table, as we do 10 more accesses, we need 10 more time 
which is also roughly what we get.

The time for the table seems to be in par which what you obtain, my CPU being 
probably somewhat faster. But there is clearly a problem with the time you get 
for the list. And I have no idea what the problem could be. 


Re: Why Seq search is faster than Table search

2020-07-06 Thread jasonfi
It could be that the optimizer somehow realizes that your loop will never do 
anything meaningful and never runs it.


Re: Why Seq search is faster than Table search

2020-07-06 Thread treeform
I love performance puzzles like this!

Answer: Turns it it just optimizes the for loops away!

Timings i get with the code provided:


List: (seconds: 0, nanosecond: 9180)
Table: (seconds: 0, nanosecond: 2794545)


Run

Really strange!

Added dummy counter so that for loops don't get optimized out. In fact, program 
ran so slow I had to change the counter to 1000:


import std/monotimes
import tables
import random
var myTable = initTable[int, int]()
var myList: seq[int]

for i in 0 .. 1_000:
myTable[i] = i
myList.add(i)

myList.shuffle()

var start = getMonoTime()
var dummyCounter = 0
for i in 0 ..< 1_000:
if myList.contains(i):
  inc dummyCounter
echo("List: ", getMonoTime() - start)

start = getMonoTime()
for i in 0 ..< 1_000:
if myTable.hasKey(i):
  inc dummyCounter
echo("Table: ", getMonoTime() - start)
echo dummyCounter


Run


List: (seconds: 0, nanosecond: 142290)
Table: (seconds: 0, nanosecond: 3060)


Run

Tables are soo soo much faster!


Re: Why Seq search is faster than Table search

2020-07-06 Thread treeform
I did some more work on it, I measured time at each array size from 0 to 
10_000, see chart: 
[https://dl3.pushbulletusercontent.com/4m74W8NlcdcrpJnVgCT53RkL2nxUPimc/image.png](https://dl3.pushbulletusercontent.com/4m74W8NlcdcrpJnVgCT53RkL2nxUPimc/image.png)

It shows you how much tables are faster during each number of elements.

run with nim c -r --d:danger


import times
import tables
import random
var myTable = initTable[int, int]()
var myList: seq[int]

for i in 0 .. 1_000:
myTable[i] = i
myList.add(i)

myList.shuffle()
var dummyCounter = 0
for n in 0 .. 10_000:
  
  const trytimes = 10
  
  var start = epochTime()
  for j in 0 .. trytimes:
for i in 0 ..< n:
if myList.contains(i):
  inc dummyCounter
  let listTime = epochTime() - start
  
  start = epochTime()
  for j in 0 .. trytimes:
for i in 0 ..< n:
if myTable.hasKey(i):
  inc dummyCounter
  let tableTime = epochTime() - start
  
  echo n, ", ", listTime/trytimes, ", ", tableTime/trytimes

echo dummyCounter


Run


Re: Why Seq search is faster than Table search

2020-07-06 Thread lscrd
This is strange. I get the long execution time in release mode without changing 
a line in the code.

Why is the code not optimized in my case? To get the same behavior, I have to 
use `-d:danger` instead of `-d:release`. May be it depends on some options of 
the C compiler.

But, at least, we have the explanation.


Re: Why Seq search is faster than Table search

2020-07-07 Thread erikenglund
Perhaps try the new --asm command option?


Re: Why Seq search is faster than Table search

2020-07-10 Thread snej
> My question is why linear search of seq is faster than constant search of 
> tables?

Sorry to pile on four days later, but the right question would be "why is my 
benchmark code producing crazy results?" since it's much more likely there's 
something wrong with your quick benchmark than with Nim's fairly-mature 
standard library.

I made the same mistake a few weeks ago when I posted here saying there seemed 
to be a bad performance bottleneck in the standard lib ... but it turned out 
that the real problem was that the build config I was using did not produce an 
optimized build at all. Oops.