Why Seq search is faster than Table search
I have this piece of code import std/monotimes import tables import random var myTable = initTable[int, int]() var myList: seq[int] for i in 0 .. 1_000_000: myTable[i] = i myList.add(i) myList.shuffle() var start = getMonoTime() for i in 0 ..< 1_000_000: discard myList.contains(i) echo("List: ", getMonoTime() - start) start = getMonoTime() for i in 0 ..< 1_000_000: discard myTable.hasKey(i) echo("Table: ", getMonoTime() - start) Run Compile with this command nim c -d:release -r speed_test.nim And here is my results List: (seconds: 0, nanosecond: 9384) Table: (seconds: 0, nanosecond: 6553878) Run My question is why linear search of seq is faster than constant search of tables? I know how tables work and I understand that Tables has to calculate hash for each key But it is main feature of hash maps to give constant search. I shuffled list before search P.S. When I asked this question on Gitter I faced aggression and sarcasm shit, we have tables for no reason it's not that tables are expensive, it's that finding the first element in a seq is fast. Run I sent this link [https://play.nim-lang.org/#ix=2r1w](https://play.nim-lang.org/#ix=2r1w) and if you look carefully my list has reverse order for i in countdown(1_000_000, 0): myList.add(i) Run It was rude and not very welcoming
Re: Why Seq search is faster than Table search
Just about the messages, you didn't include this one (and that's why narimiran replied like that): Than why we need tables if array search faster? Run
Re: Why Seq search is faster than Table search
And still I just asked very curious question and I didn't accuse anybody. I didn't said that hash maps useless. I asked why tables are slow
Re: Why Seq search is faster than Table search
When performing benchmarks previously in C/C++ I noticed that lookup operations on vectors were extremely fast, so fast that I thought that perhaps the code didn't run. All of these tests only performed a simple operation(s) in a loop, similar to what you are doing. I think it has to do with highly efficient caching when this type of structure used in a loop. In more realistic code, as opposed to this benchmark-style code, your results could be different.
Re: Why Seq search is faster than Table search
" Than why we need tables if array search faster?" I can't find here word useless. And I said Tables which is nim's implementation of hash map. I didn't say why we need hash maps
Re: Why Seq search is faster than Table search
Well we need Tables for the same reason we need hash tables. You shouldn't generalise your results from this microbenchmark onto all possible uses of Tables
Re: Why Seq search is faster than Table search
Right now I understand this idea. Thank you
Re: Why Seq search is faster than Table search
Thank you. I highly appreciate your answer
Re: Why Seq search is faster than Table search
> I faced aggression and sarcasm I can assure you that was just a plain sarcasm without even a slightest bit of aggression. > I didn't accuse anybody ...except now on the forum, when you accused others for aggression because you didn't get/like their humor. > I didn't said that hash maps useless I quote your exact comment: " Than why we need tables if array search faster?"
Re: Why Seq search is faster than Table search
Yeah it seems like the C compiler is optimising a lot of stuff away
Re: Why Seq search is faster than Table search
The list one should O(N^2) and here N = 10^6. Unless you have a very very special CPU, it can't complete in 9384ns...
Re: Why Seq search is faster than Table search
Please note also that the code version in the [playground](https://play.nim-lang.org/#ix=2r1w) tests only on the `0` element of `seq` and `Table`. ... for _ in 0 ..< 1_000_000: discard myList.contains(0) echo("List: ", getMonoTime() - start) start = getMonoTime() for _ in 0 ..< 1_000_000: discard myTable.hasKey(0) ... Run Also, the `MyTable` and `MyList` don't contain the same elements as you used `countdown(1_000_000, 0)` to fill the `seq` but excluded `1_000_000` when doing the tests.
Re: Why Seq search is faster than Table search
This this is the result on my machine if I change N to 1000 List: (seconds: 0, nanosecond: 296203084) Table: (seconds: 0, nanosecond: 4101) Run
Re: Why Seq search is faster than Table search
I changed the count for 1_000_000 to 100_000 and compiled your test with stable and devel in release mode and my results are totally different. With version 1.2.4: List: (seconds: 3, nanosecond: 71413798) Table: (seconds: 0, nanosecond: 496792) Run So, I retried with 1_000_000 and got these results: List: (seconds: 326, nanosecond: 776822305) Table: (seconds: 0, nanosecond: 4693424) Run The times are consistent. There are 10 more values in the list and we do 10 more searches, so we can expect 100 more time to do the search which is roughly what we get. For the table, as we do 10 more accesses, we need 10 more time which is also roughly what we get. The time for the table seems to be in par which what you obtain, my CPU being probably somewhat faster. But there is clearly a problem with the time you get for the list. And I have no idea what the problem could be.
Re: Why Seq search is faster than Table search
It could be that the optimizer somehow realizes that your loop will never do anything meaningful and never runs it.
Re: Why Seq search is faster than Table search
I love performance puzzles like this! Answer: Turns it it just optimizes the for loops away! Timings i get with the code provided: List: (seconds: 0, nanosecond: 9180) Table: (seconds: 0, nanosecond: 2794545) Run Really strange! Added dummy counter so that for loops don't get optimized out. In fact, program ran so slow I had to change the counter to 1000: import std/monotimes import tables import random var myTable = initTable[int, int]() var myList: seq[int] for i in 0 .. 1_000: myTable[i] = i myList.add(i) myList.shuffle() var start = getMonoTime() var dummyCounter = 0 for i in 0 ..< 1_000: if myList.contains(i): inc dummyCounter echo("List: ", getMonoTime() - start) start = getMonoTime() for i in 0 ..< 1_000: if myTable.hasKey(i): inc dummyCounter echo("Table: ", getMonoTime() - start) echo dummyCounter Run List: (seconds: 0, nanosecond: 142290) Table: (seconds: 0, nanosecond: 3060) Run Tables are soo soo much faster!
Re: Why Seq search is faster than Table search
I did some more work on it, I measured time at each array size from 0 to 10_000, see chart: [https://dl3.pushbulletusercontent.com/4m74W8NlcdcrpJnVgCT53RkL2nxUPimc/image.png](https://dl3.pushbulletusercontent.com/4m74W8NlcdcrpJnVgCT53RkL2nxUPimc/image.png) It shows you how much tables are faster during each number of elements. run with nim c -r --d:danger import times import tables import random var myTable = initTable[int, int]() var myList: seq[int] for i in 0 .. 1_000: myTable[i] = i myList.add(i) myList.shuffle() var dummyCounter = 0 for n in 0 .. 10_000: const trytimes = 10 var start = epochTime() for j in 0 .. trytimes: for i in 0 ..< n: if myList.contains(i): inc dummyCounter let listTime = epochTime() - start start = epochTime() for j in 0 .. trytimes: for i in 0 ..< n: if myTable.hasKey(i): inc dummyCounter let tableTime = epochTime() - start echo n, ", ", listTime/trytimes, ", ", tableTime/trytimes echo dummyCounter Run
Re: Why Seq search is faster than Table search
This is strange. I get the long execution time in release mode without changing a line in the code. Why is the code not optimized in my case? To get the same behavior, I have to use `-d:danger` instead of `-d:release`. May be it depends on some options of the C compiler. But, at least, we have the explanation.
Re: Why Seq search is faster than Table search
Perhaps try the new --asm command option?
Re: Why Seq search is faster than Table search
> My question is why linear search of seq is faster than constant search of > tables? Sorry to pile on four days later, but the right question would be "why is my benchmark code producing crazy results?" since it's much more likely there's something wrong with your quick benchmark than with Nim's fairly-mature standard library. I made the same mistake a few weeks ago when I posted here saying there seemed to be a bad performance bottleneck in the standard lib ... but it turned out that the real problem was that the build config I was using did not produce an optimized build at all. Oops.