Re: Fastest way to count number of lines
Nice, boia01! In my same `/usr/share/nim/**.nim` test I get 768 microseconds for your version and 2080 us for just doing the memSlices approach. So, a 2.7X speed-up, a bit less than the 4X I saw when I last compared the approach in two C versions. Maybe that's the unrolling; I don't know. @alfrednewman - if the statistics of these data files are stable over the whole file, you could always stop after the first gigabyte (or maybe less), figure out the average line length, and use the file size to estimate the number of lines; then it would only be a ~1 second delay. A MemFile knows its size, as does each slice, so this is pretty easy to code. Of course, if the first gig is not always representative of the remainder that might not work, but it sounds like you probably don't need an exact answer. This is sort of a simplified version of one of my/jlp765's suggestions. 2.7 * 1.3 GB/s =~ 3.5 GB/s is faster IO than many people have. So, even using boia01's code you may not see a great speed-up.
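The sampling idea above could be sketched roughly as follows. This is only an illustration, not code from the thread: the proc name `estimateLines` and the `sampleBytes` parameter are made up here, and it assumes a non-empty file with `\n` line endings whose early lines are representative of the rest.

```nim
import memfiles

proc estimateLines(path: string; sampleBytes = 1 shl 30): int =
  ## Hypothetical helper: counts lines in the first `sampleBytes` bytes,
  ## then scales by the file size. Returns the exact count if the whole
  ## file is consumed before the sample limit is reached.
  var mf = memfiles.open(path)
  defer: mf.close()
  var lines = 0
  var bytes = 0
  # eat = '\0' keeps any '\r' inside the slice, so slice.size + 1
  # is the number of bytes the line really occupies in the file.
  for slice in memSlices(mf, eat = '\0'):
    inc lines
    inc bytes, slice.size + 1
    if bytes >= sampleBytes and bytes < mf.size:
      # Extrapolate: total lines ~ file size / average line length.
      return int(mf.size.float * lines.float / bytes.float)
  result = lines
```

Calling `estimateLines("data.txt")` samples up to the first gigabyte; a smaller `sampleBytes` trades accuracy for a shorter delay.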
Re: progress while binding libxl
Hello @oyster. Sorry to ask, but how is the process going with the libxl binding? I need to work on a project where reading and writing Excel files will be an important prerequisite. Since there is not yet a component in Nim to get the job done, I will have to deliver it using Python. Anyway, despite not having very advanced knowledge of Nim, if I can help in any way with its development, please let me know. I'm sure more people here in our group will benefit from being able to do some CRUD in .xlsx files directly from Nim. Cheers
Re: procs where you forget to return a value
Thanks Dom, done.
Re: procs where you forget to return a value
Please report this as an issue on GitHub.
Re: Which FUSE library shall I use?
I would say zielmicha's library. It seems to be more recently maintained, and he's more active in the Nim community. That said, I haven't used either of these, so I could be very wrong.
Re: How to store procs of different arguments and return values
You can if you add the `{.nimcall.}` pragma:

    cast[(proc(x: int): int {.nimcall.})](p)

I think it doesn't work because, without this pragma, the compiler thinks this is a closure, and closures have a greater size than normal proc pointers. That's the reason why the cast is not allowed. [https://nim-lang.org/docs/manual.html#types-procedural-type](https://nim-lang.org/docs/manual.html#types-procedural-type)

To avoid the boilerplate you can define your own proc types:

    type MyProc = proc(x: int): int {.nimcall.}

    proc myProc(x: int): int = x + 1

    let f: MyProc = myProc
    let p: pointer = cast[pointer](myProc)
    let casted = cast[MyProc](p)
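If the set of signatures is known up front, one way to avoid `cast` altogether is an object variant that records which kind of proc is stored. The sketch below is only an illustration of that alternative; the type and field names (`ProcKind`, `AnyProc`, `printFn`, `plusFn`) are made up for this example.

```nim
type
  ProcKind = enum pkPrint, pkPlus
  AnyProc = object
    # The discriminator tells us which signature is stored.
    case kind: ProcKind
    of pkPrint:
      printFn: proc (i: int) {.nimcall.}
    of pkPlus:
      plusFn: proc (a, b: int): int {.nimcall.}

proc print(i: int) = echo i
proc plus(a, b: int): int = a + b

let procs = [AnyProc(kind: pkPrint, printFn: print),
             AnyProc(kind: pkPlus, plusFn: plus)]

for p in procs:
  # Dispatch on the stored kind instead of casting a raw pointer.
  case p.kind
  of pkPrint: p.printFn(1)
  of pkPlus: echo p.plusFn(1, 2)
```

The cost is one enum value and one branch per signature, but a mismatched call becomes a compile-time or field-check error rather than undefined behaviour.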
Re: How to store procs of different arguments and return values
Well then, why can't I specify the proc's type instead of relying on `type()`?

    proc myProc(x: int): int = x + 1

    let f: proc(x: int): int = myProc
    let p: pointer = cast[pointer](myProc)
    let casted = cast[proc(x: int): int](p) # expression cannot be cast to proc(x: int): int{.closure.}
Re: How to store procs of different arguments and return values
[Run code](https://glot.io/snippets/eur6yblvzq):

    proc print(i: int) = echo i
    proc plus(a, b: int): int = a + b

    let procs = [cast[pointer](print), cast[pointer](plus)]

    cast[type(print)](procs[0])(1)
    echo cast[type(plus)](procs[1])(1, 2)
Re: project organization question
You can also use nimscript to create the dir (e.g. `mkDir "tests"`), or do some nice things like build all tests in a dir, e.g.

    import ospaths

    task tests, "Runs tests":
      withDir "tests":
        for file in listFiles("."):
          if splitFile(file).ext == ".nim":
            exec "nim c -r --verbosity:0 --hints:off " & file
How to store procs of different arguments and return values
Since cast doesn't seem to work on procs, what are the alternatives?
Re: project organization question
I think I found the answers I need here: [https://github.com/nim-lang/nimble#project-structure](https://github.com/nim-lang/nimble#project-structure)

In case anyone else finds this with a search, this worked:

    # Package
    version     = "0.1.0"
    author      = "Jack Mott"
    description = "type safe opengl wrapper"
    license     = "MIT"
    srcDir      = "src"

    # Dependencies
    requires "nim >= 0.17.0"
    requires "sdl2"
    requires "opengl"
    requires "stb_image"

    task hello_triangle, "Runs hello triangle":
      exec "nim c -r examples/hello_triangle"

    task shaders, "Runs shaders":
      exec "nim c -r examples/shaders"
project organization question
From looking at other Nim repos, it looks like the accepted pattern if you are making a library is to have a src directory under your project's root directory with the library code, then an examples directory under the root for example code. How do you set this up so that you can conveniently compile and run the example programs? Currently my nimble file is:

    # Package
    version     = "0.1.0"
    author      = "Jack Mott"
    description = "type safe opengl wrapper"
    license     = "MIT"
    bin         = @["../examples/example02"]
    srcDir      = "src"

    # Dependencies
    requires "nim >= 0.17.0"
    requires "sdl2"
    requires "opengl"

and my examples dir has a nim.cfg with

    --path:"../src"

This works until it tries to produce the executable at root/../examples/example02.exe. That dir doesn't exist; it should be root/examples/example02.exe. Any tips?
Re: Arrays, openarrays, and sequences
awesome, yes this is perfect!
Re: Fastest way to count number of lines
Just for fun, I ported and hacked together a self-contained Nim version of Daniel Lemire's avxcount: [https://gist.github.com/aboisvert/3f89bc0ae0a2168fcf35ccca98177f6a](https://gist.github.com/aboisvert/3f89bc0ae0a2168fcf35ccca98177f6a) (I didn't bother with the loop-unrolled versions)
Re: What's happening with destructors?
@rayman22201 What I have seen is that each proc that has a var of a type handled by a destructor needs a hidden try/finally block (currently even if the var is not used). This can be a big overhead for small routines. The C target needs setjmp/longjmp, which is a big overhead. The C++ target is better here, because it uses native exception handling.
Re: What's happening with destructors?
@Araq I wanted to "show my appreciation" for this new development, but Bountysource doesn't let me leave a comment. I'm "skunkiferous" there; I think "monster" was already taken...
Re: procs where you forget to return a value
I agree that simple code like this:

    proc p: int =
      echo "test"

    var x = p()
    echo x

should issue a warning (like in Pascal). An error is fine, too (like in Java), since it is very seldom intentional.
Re: procs where you forget to return a value
In my experience, this is an issue you run into when you lack experience with Nim. Then it stops giving you problems. It could be a good idea to warn under these circumstances.
Re: Arrays, openarrays, and sequences
> does the array get copied when you call the function

No.

> accept a sequence OR an array, in an efficient manner

`openArray` is the way.
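A minimal sketch of that advice, with a made-up proc name `total`: a single `openArray` parameter accepts both an array and a seq, and the elements are not copied on the call.

```nim
proc total(xs: openArray[int]): int =
  # xs is a view over the caller's data; no copy is made.
  for x in xs:
    result += x

let arr = [1, 2, 3]     # array[3, int]
let sq = @[4, 5, 6]     # seq[int]

echo total(arr)
echo total(sq)
```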
Re: Problem using
The `cnt +=` operation in a parallel loop is a good candidate for something like OpenMP's `reduction` clause, which Nim could offer as an option. [https://stackoverflow.com/questions/13290245/reduction-with-openmp#13290673](https://stackoverflow.com/questions/13290245/reduction-with-openmp#13290673)
procs where you forget to return a value
I spent about an hour tonight tracking down a weird bug where it turned out I had a proc with a return type, but I forgot to return a value. This compiled fine, but ran wrong. I assume what happened is that the implicit `result` variable was returned, holding the default value for the type. Should it perhaps be a compiler error if no value is returned and the return value is never assigned anywhere in the proc? Or at least a warning?
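The behaviour described above can be reproduced with a small sketch. The proc names are hypothetical; the point is only that a proc which never assigns its implicit `result` still compiles and hands back the type's default value.

```nim
proc forgotReturn(): int =
  # Hypothetical example: the body never assigns `result`,
  # so the caller receives the default value for int.
  discard

proc withReturn(): int =
  result = 42

echo forgotReturn()   # prints 0
echo withReturn()     # prints 42
```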
Re: Problem using
OK, I had to clean it up a little to make it work, but here is the code that gets it to compile.

    var cnt: array[rescnt, uint]
    parallel:
      for i in 0..rescnt-1:
        cnt[i] = spawn segcount(i*KB, Kn)
    sync()
    for i in 0..rescnt-1: primecnt += cnt[i].uint

So was the issue that the single value of `cnt` was getting clobbered by each thread's return value, causing the type mismatch? Thanks for getting it to compile, but if you can explain why this works I'd appreciate it even more.
Re: What's happening with destructors?
@araq, I missed the live stream, but watched the recording on YouTube. Thank you for taking the time to do the presentation! Very awesome.

> The really hard part is replacing the existing runtime with one with a different performance profile ("yay, deterministic freeing, yay more efficient multi threading possibilities, ugh, overall slower?!")

Can you go into more detail about why the new destructor-based string implementation is slower than the current GC string implementation? More generally, what is the performance profile of the destructor-based runtime? Why is it "overall slower", as you say? Thank you again, super awesome presentation!
Re: What's happening with destructors?
> It's a bit early to say but I think so, yes.

I think so too, though regions (we used to call them "arenas" a long time ago) have the ability to free many allocated objects at once, even if they aren't in the same container. I don't think destructors give you that. I also remember using region APIs that would have `setMark` and `freeToMark` and stuff like that, but I don't think those kinds of APIs lasted, mostly just the `allocateFromRegion`/`freeAllInRegion` ones. Good stuff, I hope to see it in Nim 1.0 really soon, like in a few weeks.
Re: Fastest way to count number of lines
Hi, thanks for the help of all of you. Yes, I'm pre-calculating things. In the data orchestration process I'm involved in, I can usually estimate the time of a rendering based on the number of rows I'm processing. It is a linear process and the processing time is typically not much affected as a function of the size of each line. The files have an average of several gigabytes in size. Some have tens of millions of lines. Knowing the amount of lines before processing guarantees some interesting benefits to the user. I'm lucky because the data is on an SSD server... I already had a performance gain using the tips provided by @cblake (using import memfiles). All the best AN
Arrays, openarrays, and sequences
A couple of quick questions: If I have a function that accepts an array as input via an openArray, does the array get copied when you call the function, or not? Is it possible to write a function using generics, or otherwise, that can accept a sequence OR an array, in an efficient manner?
Re: Fastest way to count number of lines
Yeah. Depending on what he's doing, same-file dynamic estimation might also work. Good point, @jlp765. On my system the 5.4 MB of `/usr/share/nim/**.nim` gets counted in about 4 milliseconds, which is over 1.3 GB/sec and probably faster than all but the most powerhouse NVMe/disk array IO. This is why I suspect @alfrednewman might be re-calculating things instead of saving the answer either in RAM or as files. I'm sure a pre-pass calculating the number of lines can avoid certain complexities. However, once you start doing assembly hijinks that are not even portable through a given CPU family (e.g., using SSE, AVX2, AVX512, ...) performance becomes very deployment sensitive. Meanwhile, eliminating the entire pre-pass by merging it with per-line allocations/whatever costs complexity, too, but yields portable performance gains. If it's really ineliminable then have at it, asm-wise, I guess. I just suspect it's off-track.
Re: Problem using
Try:

    var cnt: array[rescnt, int]
    parallel:
      for i in 0..rescnt-1:
        cnt[i] = spawn segcount(i*KB, Kn)
    sync()
    for i in 0..rescnt-1: primecnt += cnt[i].uint