Re: Execution speed Nim vs. Python
Python code is not optimised. def do_something(board): res = [] for i, p in enumerate(board): for d in directions: res.append(board[d[i]]) return res Run should be written as list comprehension: do_something(board): return [d[i] for d in directions for i, p in enumerate(board)] Run Comprehensions in Python are much faster than ordinary loops. You will gain 2-3x perfomance boost.
Execution speed Nim vs. Python
check with... [Compile time Vs. Runtime](http://net-informations.com/python/iq/checking.htm)
Re: Execution speed Nim vs. Python
> In C#, we call them IEnumerable which is quite a good name. That's quite incorrect. openarray is more like the C# params keyword.
Re: Execution speed Nim vs. Python
(I should perhaps qualify beyond -d:release using gcc or clang with high optimization levels on the back end since jil210 revealed in [another thread](https://forum.nim-lang.org/t/3198) the baseline 5X performance mystery arose from using tcc as a back end.) Also, for the curious, the reason C++ STL will usually underperform Nim's hash tables in benchmarks like this is that STL iterator deletion semantics make the natural STL hash table collision resolution implementation choice be external chaining. Those extra linked list indirections add latency, especially when tables do not fit in on-CPU cache.
Re: Execution speed Nim vs. Python
Nim `Table` and `HashSet` should be caching the hash code values in their tables and also using that as a "comparison prefix" (comparing string values only if the integer hash codes already match). The string hash of the new inputs is ineliminable - or rather has to be done at least once anyway. My guess is the lines are long-ish (greater than 10 chars, say). The default Nim hash string function could be faster, especially for longer strings. E.g., the current default does byte-at-a-time manipulations and usually 8-byte/word-at-a-time can get about as good bit mixing much faster, especially for long strings like lines in files. Such a replacement might take that 60% hashing time in this particular benchmark down to 10-15% for a 2X-ish overall speed up. I have benchmarked C++ for these kinds of tests and I suspect the current STL would not be faster than Nim with the -d:release in this case. (C++, too could benefit from faster default string hash).
Re: Execution speed Nim vs. Python
Python caches hashing of strings. Nim does not (it would be a challenge, as Nim strings are mutable). I suggest using string references or Hash objects if you want to compare performance. Has anyone benchmarked C++ for this kind of test?
Re: Execution speed Nim vs. Python
Please help speedup with the following nim code taking five times longer than python version: import os, sets, strutils if paramCount() != 2: echo "mydiff file1 file2" quit() let file1 = open(paramStr(1)) let file2 = open(paramStr(2)) let old_lines = file1.readAll().splitLines() let new_lines = file2.readAll().splitLines() file1.close() file2.close() let old_lines_set = toSet(old_lines) let new_lines_set = toSet(new_lines) let old_added = old_lines_set - new_lines_set let old_removed = new_lines_set - old_lines_set for line in old_lines: if line in old_added: echo "-", line.strip() elif line in old_removed: echo "+", line.strip() for line in new_lines: if line in old_added: echo "-", line.strip() elif line in old_removed: echo "+", line.strip()
Re: Execution speed Nim vs. Python
@didlybom: In C#, we call them IEnumerable which is quite a good name. Something like enumerable[int] or iterable[int] might make more sense than openarray does, though it's a fairly substantial change.
Re: Execution speed Nim vs. Python
I think it was already discussed a while ago but I think this thread shows that "openarray" is not a very good name. IMHO it really does not convey what it really means. Something else such as "anyarray" or "arraylike" might be less accurate but easiert o inderstand (again, IMVHO).
Re: Execution speed Nim vs. Python
Since seqs and strings have value semantics we can implement them with C++'s "Small vector optimization" and don't need more complexity in the language (but in the runtime).
Re: Execution speed Nim vs. Python
flyx wrote: > And yes, something between array and seq would be nice. But currently, Nim's > type system does not allow a type with a runtime-calculated length parameter. > Allowing it would be a hack similar to C VLAs. No hack involved here. The length parameter can be part of the object like seq rather than part of the type like array. openarray-accepting procs should not care since they can receive seqs anyway with run-time variable everything or arrays with compile-time fixed everything. One might argue that I/others could just do their own concept which is true. The standard lib has lots of openarray-accepting routines that would work fine with post-init-fixed seq-like objects, though. So, not having such a type in the standard lib that's part of openarray feels unnecessarily inflexible, but on this I think we agree since you said it would be nice. It may just be easier than first imagined. :)
Re: Execution speed Nim vs. Python
> Weirdly enough that code runs ~25% faster for me when changing res from > string to seq[char], 1.51 instead of 1.99. I have no clue why. @Tarmean: Strange indeed ... I did try that too, but the performance was the same.
Re: Execution speed Nim vs. Python
cblake: C11 made VLAs an optional feature though. I consider C VLAs a very poor solution; as you say, it's mostly syntactic sugar for alloca. And yes, something between array and seq would be nice. But currently, Nim's type system does not allow a type with a runtime-calculated length parameter. Allowing it would be a hack similar to C VLAs.
Re: Execution speed Nim vs. Python
@flyx - c99 (and gcc long before it) have VLAs of the fixed-after-initialization variety...mostly syntactic sugar for alloca, of course. All that is surely more commonly used than Ada and required no fancy containers or any constraints about being a leaf function. Such an array style may be different enough from the post-init-resizable Nim seqs to warrant a different type, but it could still be an openarray. I think Nim needs a kind of openarray between array and seq - one which is run-time-sized, but fixed after initialization and with backing memory not necessarily on the heap. For example, an array of char that is backed by a read-only memory mapped file and so cannot be zero-terminated, but being from a file its size cannot change after initialization and yet the size can also never be known at compile time. Then it could get bounds checking like other openarrays and otherwise behave similarly. Maybe that's not the best example, but I'm sure there are many more. Lots of times the sizes of things are only known at run-time, but don't change once known/computed. Also, many openarray accepting routines need not modify their inputs -- why, they cannot change lengths of arrays, for example. So, it would be nice if all openarray accepting routines could work with such a type, whatever it might be called. Maybe there is one and I am just unaware.
Re: Execution speed Nim vs. Python
Just a quick idea: if you don't understand what happens under the hood, then you can peek at nimcache/*.c (or even better, you can diff two versions). > Peter
Re: Execution speed Nim vs. Python
wiffel Weirdly enough that code runs ~25% faster for me when changing res from string to seq[char], 1.51 instead of 1.99. I have no clue why.
Re: Execution speed Nim vs. Python
Stefan_Salewski: > May it be possible to have one single Seq in each proc on the stack? Of > course it has to be the last element, and can only work when that proc do not > call other procs. That may be possible, but far too complicated with far too few impact on performance. The compiler would need to properly place the seq, statically analyze that no other procs are called, and add special transformation code every time you copy this seq. If you have, somewhere in your proc, some code `a.b`, where `b` is some field in `a``s type, and you transform ``b` into a getter proc at some point in the future, you code suddenly wouldn't be able to compile anymore. And this is just one example why it is a really bad idea. Ada is the one language I know which went a long way for arrays of dynamic length which can still be stored on the stack. The only restriction is that an array cannot change size after initialization. Therefore, it does not cover all use cases and it still needs dynamic container types in its standard library. Also, handling those arrays is a bit more difficult than in other programming languages. Everything comes at a price.
Re: Execution speed Nim vs. Python
> Seqs in nim have value type semantics. Your inner loop ( for d in Directions) > actually assigns Directions[...] to d on every iteration which involves full > copy of the seq. How strange that it actually makes a new copy when mitems doesn't. What's the reason of this?
Re: Execution speed Nim vs. Python
@yglukhov, @def : If this is indeed a performance contest ... :) This code is about 10x faster. (and I only cheated a tiny little bit :) ) A 100 million iterations runs in less than 2 seconds on my machine. import times const count = 100_000_000 NE = [8, 2, 13, 4, 5, 7, 8, 3, 9, 12, 7, 8, 3, 9, 12] Directions = [NE, NE, NE, NE] proc do_something(board: string, res: var string) = for b in board.low .. board.high: for d in Directions.low .. Directions.high: res[b * Directions.len + d] = board[Directions[d][b]] var board = "0p.pP.pP.p.pP.p" res = newStringOfCap(board.len * Directions.len) let start = cpuTime() for i in 1 .. count: do_something(board, res) echo "** Nim " echo "** Time : ", $(cpuTime() - start), " seconds" echo "** Counts : ", count echo "** Result : ", res echo "***"
Re: Execution speed Nim vs. Python
The const array is fast, the const seq seems to be very slow, much worse than in the var array vs. var seq case. So, const seq definitely is not identical to const array. To me, the takeaway message is: do not use const seq (probably does not make much sense to use anyway) and use const array instead.
Re: Execution speed Nim vs. Python
> that it is really the const seq which is causing most of the slowdown That is very interesting! I strongly assume that const seg is identical to const array? Have you tested? So this may indicate that we should avoid const arrays when speed is important.
Re: Execution speed Nim vs. Python
> flyx: Seqs always need to be allocated on the heap. Sorry, do not now much about it... May it be possible to have one single Seq in each proc on the stack? Of course it has to be the last element, and can only work when that proc do not call other procs.
Re: Execution speed Nim vs. Python
I would like to point out, that it is really the **const seq** which is causing most of the slowdown in the original version (and a const seq does not make much sense, if you think about it and should maybe be complained about by the compiler). If you use a var seq instead, the difference to the var array is much less pronounced. This takes 0.9 sec for 1_000_000 iterations: var NE = @[8, 2, 13, 4, 5, 7, 8, 3, 9, 12, 7, 8, 3, 9, 12] var Directions = [NE, NE, NE, NE] while this takes 0.6 sec: var NE = [8, 2, 13, 4, 5, 7, 8, 3, 9, 12, 7, 8, 3, 9, 12] var Directions = [NE, NE, NE, NE] Compare this to 11 vs. 1 sec for the const case mentioned earlier.
Re: Execution speed Nim vs. Python
> I do not understand the use of the openarray parameter in relation to > performance. `openarray` has nothing to do with performance. It is just a feature to allow implementing procs that take both an `array` and a `seq` as parameter. The actual performance gain stems from using arrays instead of seqs. Because an array's length is known at compile time, it can be allocated on the stack. Seqs always need to be allocated on the heap. Therefore, it is always advisable to use arrays when possible. Other performance optimizations are explained in the previous posts. > If the length of the output array of a proc is not known in advance, are > there general rules to accomplish a good performance? If you can calculate the target length of the array efficiently before actually filling it, initialize the result seq with as many elements as needed (as shown in the other posts) or use `newSeqOfCap` if you're on devel or once this has made it into a release.
Re: Execution speed Nim vs. Python
The answers were very instructive. Thanks. Two questions. I do not understand the use of the openarray parameter in relation to performance. Is it better for the speed than using seq[char] ? If the length of the output array of a proc is not known in advance, are there general rules to accomplish a good performance? For example using tables.
Re: Execution speed Nim vs. Python
Hi, I am still very new to Nim and played around with the different solutions given here. To me, the biggest difference is made by changing the type of NE from seq to array: With this line: const NE = @[8, 2, 13, 4, 5, 7, 8, 3, 9, 12, 7, 8, 3, 9, 12] and 1_000_000 iterations it takes around 11 sec on my Linux box. With this line instead: const NE = [8, 2, 13, 4, 5, 7, 8, 3, 9, 12, 7, 8, 3, 9, 12] the time is reduced to ~ 1 sec.
Re: Execution speed Nim vs. Python
With the original version I got 1.822 s runtime. With yglukhov's (first) version, I got 0.012 s runtime, replacing Directions.len with Directions.high. Python runtime (for comparison) not available.
Re: Execution speed Nim vs. Python
@yglukhov: Ah, you don't make copies of the entries of Directions. I think that's an optimization the compiler should be able to do somehow. Also, you can use `0 .. Directions.high` instead of `0 ..< Directions.len`. Using `mitems` instead achieves the same effect, but needs a `var`: # Nim example to test execution speed against a Python version import times, os const board = ['0', 'p', '.', 'p', 'P', '.', 'p', 'P', '.', 'p', '.', 'p', 'P', '.', 'p'] NE = [8, 2, 13, 4, 5, 7, 8, 3, 9, 12, 7, 8, 3, 9, 12] var Directions = [NE, NE, NE, NE] # for short: in real application the array elements differ proc do_something(board: openarray[char]): seq[char] = result = newSeqOfCap[char](board.len * Directions.len) for i in 0 .. board.high: for d in Directions.mitems: result.add board[d[i]] let t0 = cpuTime() var res: seq[char] const count = 100_000 for i in 1..count: res = do_something(board) let t1 = cpuTime() echo "* Time elapsed for Nim: ", t1 - t0, " Counts: ", count echo res.repr
Re: Execution speed Nim vs. Python
@def, your version is 0.08 for me. My version is 0.05 ;) import times, os const board = ['0', 'p', '.', 'p', 'P', '.', 'p', 'P', '.', 'p', '.', 'p', 'P', '.', 'p'] NE = [8, 2, 13, 4, 5, 7, 8, 3, 9, 12, 7, 8, 3, 9, 12] Directions = [NE, NE, NE, NE] # for short: in real application the array elements differ proc do_something(board: openarray[char]): seq[char] = result = newSeqOfCap[char](board.len * Directions.len) for i in 0 .. board.high: for d in 0 ..< Directions.len: result.add board[Directions[d][i]] let t0 = cpuTime() var res: seq[char] const count = 10 for i in 1..count: res = do_something(board) let t1 = cpuTime() echo "* Time elapsed for Nim: ", t1 - t0, " Counts: ", count
Re: Execution speed Nim vs. Python
Optimization, my favorite Nim topic! Runtime of your version for me: 0.766 s res.add board[d[i]] This line resizes the `res` seq, instead you should allocate it to have the correct size directly. Like this: proc do_something(board: seq[char]): seq[char] = var res = newSeq[char](board.len * Directions.len) for i, p in board: for j, d in Directions: res[i * Directions.len + j] = board[d[i]] return res New runtime: 0.073 s With the new devel branch of Nim you can also use `newSeqOfCap` if you don't want to manually calculate the index: proc do_something(board: seq[char]): seq[char] = var res = newSeqOfCap[char](board.len * Directions.len) for i, p in board: for d in Directions: res.add board[d[i]] return res As a workaround you can also do `var res = newSeq[char](board.len * Directions.len); res.setLen(0)` to achieve the same effect. res = do_something(board) You create a new `res` seq for every call to do_something. Instead you could reuse a single data structure and pass it to `do_something`. But that's probably not what you want to benchmark then. Full code with some cleanup (use implicit result variable, no `$` for echo, arrays instead of seqs, more consts): # Nim example to test execution speed against a Python version import times, os const board = ['0', 'p', '.', 'p', 'P', '.', 'p', 'P', '.', 'p', '.', 'p', 'P', '.', 'p'] NE = [8, 2, 13, 4, 5, 7, 8, 3, 9, 12, 7, 8, 3, 9, 12] Directions = [NE, NE, NE, NE] # for short: in real application the array elements differ proc do_something(board: openarray[char]): seq[char] = result = newSeqOfCap[char](board.len * Directions.len) for i in 0 .. board.high: for d in Directions: result.add board[d[i]] let t0 = cpuTime() var res: seq[char] const count = 100_000 for i in 1..count: res = do_something(board) let t1 = cpuTime() echo "* Time elapsed for Nim: ", t1 - t0, " Counts: ", count echo res.repr New runtime: 0.065 s For comparison, Python runtime: 0.783 s
Re: Execution speed Nim vs. Python
Seqs in nim have value type semantics. Your inner loop ( for d in Directions) actually assigns Directions[...] to d on every iteration which involves full copy of the seq. Here's the patched version. I have also preallocated result seq. import times, os var board: seq[char] = @['0', 'p', '.', 'p', 'P', '.', 'p', 'P', '.', 'p', '.', 'p', 'P', '.', 'p'] const NE: seq[int] = @[8, 2, 13, 4, 5, 7, 8, 3, 9, 12, 7, 8, 3, 9, 12] const Directions = [NE, NE, NE, NE] # for short: in real application the array elements differ proc do_something(board: openarray[char]): seq[char] = result = newSeq[char](board.len * Directions.len) var ir = 0 for i, p in board: for d in 0 ..< Directions.len: result[ir] = board[Directions[d][i]] inc ir let t0 = cpuTime() var res: seq[char] let count = 10 for i in 1..count: res = do_something(board) let t1 = cpuTime() echo "* Time elapsed for Nim: ", $(t1 - t0), " Counts: ", $count
Execution speed Nim vs. Python
I wrote an application with Nim and discovered that an application with the same logic and data structures written in Python has a better execution speed than the Nim version. I was trying to find out about the reason why Nim has a larger execution speed than Python. I isolated some code and tested two small and equivalent Python and Nim programs. For a loop count of 10 Python takes 1.0 sec and Nim takes 1.8 sec This speed ratio corresponds to the speed ratio in the real application. I use Python version 2.7.3 and Nim compiler version 0.13.0 I compiled Nim with with -d:release option. I'd like to know what is the reason why in this case Python beats Nim in execution speed. Here are the both programs. The problem is the proc "do_something" # Nim example to test execution speed against a Python version import times, os var board: seq[char] = @['0', 'p', '.', 'p', 'P', '.', 'p', 'P', '.', 'p', '.', 'p', 'P', '.', 'p'] const NE: seq[int] = @[8, 2, 13, 4, 5, 7, 8, 3, 9, 12, 7, 8, 3, 9, 12] const Directions = [NE, NE, NE, NE] # for short: in real application the array elements differ proc do_something(board: seq[char]): seq[char] = var res: seq[char] = @[] for i, p in board: for d in Directions: res.add board[d[i]] return res let t0 = cpuTime() var res: seq[char] let count = 10 for i in 1..count: res = do_something(board) let t1 = cpuTime() echo "* Time elapsed for Nim: ", $(t1 - t0), " Counts: ", $count ##echo res.repr # Python example to test execution speed against a Nim version import time, sys board = ['0', 'p', '.', 'p', 'P', '.', 'p', 'P', '.', 'p', '.', 'p', 'P', '.', 'p'] NE = [8, 2, 13, 4, 5, 7, 8, 3, 9, 12, 7, 8, 3, 9, 12] directions = [NE, NE, NE, NE] # for short: in real application the array elements differ def do_something(board): res = [] for i, p in enumerate(board): for d in directions: res.append(board[d[i]]) return res t0 = time.time() res = [] count = 10 for i in range(1, count): res = do_something(board) t1 = time.time() print("* Time elapsed for Python: ", str(t1 - t0)), " Counts: ", str(count) ##print res