Re: Fastest way to count number of lines

2017-10-21 Thread cblake
Nice, boia01! In my same `/usr/share/nim/**.nim` test I get 768 microseconds 
for your version and 2080us for just doing the memSlices approach. So, a 2.7X 
speed-up, a bit less than the 4X I saw when I last compared the approach in two 
C versions. Maybe that's the unrolling. Dunno.

@alfrednewman - if the statistics of these data files are stable over the whole 
file, you could always stop after the first gigabyte (or maybe less), figure 
out the average line length and use the file size to estimate the number of 
lines, and then it would only be a ~1 second delay. A MemFile knows its size, 
as does each slice, so this is pretty easy to code. Of course, if the first 
gig is not always representative of the remainder that might not work, but it 
sounds like you probably don't need an exact answer. This is sort of a 
simplified version of one of my/jlp765's suggestions. 2.7 * 1.3 GB/s =~ 3.5 
GB/s is faster IO than many people have. So, even using boia01's code you may 
not see a great speed-up.
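
A minimal sketch of that sampling idea, assuming the stdlib `memfiles` module. The proc name `estimateLines` and the `sampleBytes` parameter are made up for illustration, and the estimate is only as good as the assumption that the sampled prefix is representative of the rest of the file:

```nim
import memfiles

proc estimateLines(path: string, sampleBytes = 1 shl 30): int =
  ## Hypothetical helper: counts lines in roughly the first `sampleBytes`
  ## bytes, then scales by the total file size. Exact when the whole file
  ## fits inside the sample. Mapping an empty file may raise an error.
  var mf = memfiles.open(path)        # default mode is fmRead
  defer: mf.close()
  var seenBytes = 0
  var seenLines = 0
  for slice in memSlices(mf):
    seenBytes += slice.size + 1       # +1 for the '\n' delimiter
    inc seenLines
    if seenBytes >= sampleBytes:
      break
  if seenLines == 0:
    result = 0
  elif seenBytes >= mf.size:
    result = seenLines                # scanned everything: exact count
  else:
    # average bytes per line in the sample, extrapolated to the whole file
    result = int(mf.size.float * seenLines.float / seenBytes.float)
```

Note that `memSlices` strips a trailing `'\r'` by default, so on CRLF files the byte tally above undercounts by one byte per line; for a rough estimate that is probably acceptable.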


Re: progress while binding libxl

2017-10-21 Thread alfrednewman
Hello @oyster

Sorry to ask, but how is the process going with the libxl binding?

I need to work with a project where reading and writing Excel files will be an 
important prerequisite ... figuring that there is not yet a component in Nim to 
get the job done, I will have to perform the delivery using Python.

Anyway, despite not having a very advanced knowledge of Nim, if I can help in 
anything with its development, please let me know.

I'm sure more people here in our group will benefit from being able to do some 
CRUD in .xlsx files directly from Nim.

Cheers


Re: procs where you forget to return a value

2017-10-21 Thread jackmott
Thanks Dom, done.


Re: procs where you forget to return a value

2017-10-21 Thread dom96
Please report this as an issue on GitHub.


Re: Which FUSE library shall I use?

2017-10-21 Thread dom96
I would say zielmicha's library. It seems to be more recently maintained, and 
he's more active in the Nim community. That said I haven't used either of these 
so I could be very wrong.


Re: How to store procs of different arguments and return values

2017-10-21 Thread Arrrrrrrrr
You can if you add the {.nimcall.} pragma: 


cast[(proc(x: int): int {.nimcall.})](p)


I think it doesn't work because, without this pragma, the compiler treats the 
type as a closure, and closures are larger than plain proc pointers. That's 
the reason why the cast is not allowed.

[https://nim-lang.org/docs/manual.html#types-procedural-type](https://nim-lang.org/docs/manual.html#types-procedural-type)

To avoid the boilerplate you can define your own proc types: 


type MyProc = proc(x: int): int {.nimcall.}

proc myProc(x: int): int = x + 1

let f: MyProc = myProc

let p: pointer = cast[pointer](myProc)
let casted = cast[MyProc](p)



Re: How to store procs of different arguments and return values

2017-10-21 Thread dawkot
Well then, why can't I specify the proc's type instead of relying on type()?


proc myProc(x: int): int = x + 1

let f: proc(x: int): int = myProc

let p: pointer = cast[pointer](myProc)
# Error: expression cannot be cast to proc(x: int): int{.closure.}
let casted = cast[proc(x: int): int](p)



Re: How to store procs of different arguments and return values

2017-10-21 Thread Arrrrrrrrr
[Run code](https://glot.io/snippets/eur6yblvzq): 


proc print(i: int) = echo i
proc plus(a, b: int): int = a + b

let procs = [cast[pointer](print), cast[pointer](plus)]

cast[type(print)](procs[0])(1)
echo cast[type(plus)](procs[1])(1, 2)
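
If the set of signatures is known up front, a type-checked alternative to casting through `pointer` is an object variant with one branch per signature. This is only a sketch: `ProcKind`, `StoredProc`, `printFn`, and `plusFn` are hypothetical names, not anything from the posts above.

```nim
type
  ProcKind = enum
    pkPrint, pkPlus
  StoredProc = object
    # one branch per signature we want to store
    case kind: ProcKind
    of pkPrint:
      printFn: proc(i: int) {.nimcall.}
    of pkPlus:
      plusFn: proc(a, b: int): int {.nimcall.}

proc print(i: int) = echo i
proc plus(a, b: int): int = a + b

let procs = [
  StoredProc(kind: pkPrint, printFn: print),
  StoredProc(kind: pkPlus, plusFn: plus)
]

for p in procs:
  case p.kind
  of pkPrint:
    let f = p.printFn
    f(1)
  of pkPlus:
    let f = p.plusFn
    echo f(1, 2)
```

The trade-off is that every signature has to be listed in the type, whereas the `pointer` array accepts anything but gives up compile-time checking.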



Re: project organization question

2017-10-21 Thread stisa
You can also use nimscript to create the dir (e.g. `mkdir "tests"`), or do some 
nice things like build all tests in a dir, e.g. 


import ospaths
task tests, "Runs tests":
  withDir "tests":
    for file in listFiles("."):
      if splitFile(file).ext == ".nim":
        exec "nim c -r --verbosity:0 --hints:off " & file



How to store procs of different arguments and return values

2017-10-21 Thread dawkot
Since cast doesn't seem to work on procs, what are the alternatives?


Re: project organization question

2017-10-21 Thread jackmott
I think I found the answers I need here: 
[https://github.com/nim-lang/nimble#project-structure](https://github.com/nim-lang/nimble#project-structure)

In case anyone else finds this with a search, this worked:


# Package

version       = "0.1.0"
author        = "Jack Mott"
description   = "type safe opengl wrapper"
license       = "MIT"

srcDir        = "src"

# Dependencies

requires "nim >= 0.17.0"
requires "sdl2"
requires "opengl"
requires "stb_image"

task hello_triangle, "Runs hello triangle":
  exec "nim c -r examples/hello_triangle"

task shaders, "Runs shaders":
  exec "nim c -r examples/shaders"



project organization question

2017-10-21 Thread jackmott
From looking at other Nim repos, it looks like the accepted pattern when making 
a library is to have a src directory under your project's root directory with 
the library code, and an examples directory under the root for example code. 
How do you set this up so that you can conveniently compile and run the 
example programs?

Currently my nimble file is:


# Package

version       = "0.1.0"
author        = "Jack Mott"
description   = "type safe opengl wrapper"
license       = "MIT"

bin           = @["../examples/example02"]
srcDir        = "src"

# Dependencies

requires "nim >= 0.17.0"
requires "sdl2"
requires "opengl"


and my examples dir has a nim.cfg with `--path:"../src"`

This works until it tries to produce the executable at 
root/../examples/example02.exe, a dir that doesn't exist. It should be 
root/examples/example02.exe.

Any tips? 


Re: Arrays, openarrays, and sequences

2017-10-21 Thread jackmott
awesome, yes this is perfect!


Re: Fastest way to count number of lines

2017-10-21 Thread boia01
Just for fun, I ported and hacked together a self-contained Nim version of 
Daniel Lemire's avxcount: 
[https://gist.github.com/aboisvert/3f89bc0ae0a2168fcf35ccca98177f6a](https://gist.github.com/aboisvert/3f89bc0ae0a2168fcf35ccca98177f6a)

(I didn't bother with the loop-unrolled versions) 


Re: What's happening with destructors?

2017-10-21 Thread adrianv
@rayman22201 what I have seen is that each proc that has a var of a type which 
is handled by a destructor needs a hidden try/finally block (currently even if 
the var is not used). This can be a big overhead for small routines. The C 
target needs setjmp/longjmp, which is a big overhead. The C++ target is better 
here, because it uses native exception handling. 
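
To make that overhead concrete, here is a hand-written sketch of roughly what such a hidden block amounts to. `Buffer`, `release`, and `useBuffer` are hypothetical names, and the code the compiler actually generates may well look different; the point is only that even a tiny proc ends up wrapped in try/finally.

```nim
type Buffer = object
  data: seq[int]

var released = 0            # counts cleanups, for demonstration only

proc release(b: var Buffer) =
  # stand-in for a destructor: drop the payload and record the call
  b.data = @[]
  inc released

proc useBuffer(): int =
  var b = Buffer(data: @[1, 2, 3])
  try:
    # the proc's real body; it may raise
    result = b.data.len
  finally:
    # cleanup must run on every exit path, hence the try/finally
    release(b)

echo useBuffer()
```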


Re: What's happening with destructors?

2017-10-21 Thread monster
@Araq I wanted to "show my appreciation" for this new development, but 
Bountysource doesn't let me leave a comment. I'm "skunkiferous" there; I think 
"monster" was already taken...


Re: procs where you forget to return a value

2017-10-21 Thread leledumbo
I agree that a simple code like this:


proc p: int =
  echo "test"

var x = p()

echo x


should issue a warning (like in Pascal). An error is fine, too (like in Java), 
since this is very seldom intentional.


Re: procs where you forget to return a value

2017-10-21 Thread Arrrrrrrrr
In my experience, this is an issue you run into when lacking experience with 
Nim. Then it stops giving you problems. It could be a good idea to warn under 
these circumstances.


Re: Arrays, openarrays, and sequences

2017-10-21 Thread Arrrrrrrrr
> does the array get copied when you call the function

No.

> accept a sequence OR an array, in an efficient manner

openArray is the way.
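
A small sketch of what that looks like; `total` is a made-up proc name. The same proc accepts both an array and a seq, with no generics needed:

```nim
proc total(xs: openArray[int]): int =
  # works for arrays and seqs alike
  for x in xs:
    result += x

let a = [1, 2, 3]       # array[3, int]
let s = @[4, 5, 6]      # seq[int]

echo total(a)
echo total(s)
```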


Re: Problem using

2017-10-21 Thread jzakiya
The `cnt +=` operation in parallel is a ripe candidate for a `reduction`-like 
option in Nim, similar to the one in `OpenMP`.

[https://stackoverflow.com/questions/13290245/reduction-with-openmp#13290673](https://stackoverflow.com/questions/13290245/reduction-with-openmp#13290673)
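
Until something like that exists, a reduction can be written by hand. The sketch below assumes the stdlib `threadpool` module and compilation with `--threads:on`; `partialSum` and `parallelSum` are hypothetical names. Each task returns its own partial result, and the partials are combined serially, so no thread ever writes to a shared accumulator.

```nim
import threadpool

proc partialSum(lo, hi: int): int =
  # each task reduces its own range independently
  for i in lo..hi:
    result += i

proc parallelSum(): int =
  var parts: array[4, FlowVar[int]]
  for i in 0..3:
    parts[i] = spawn partialSum(i * 25 + 1, (i + 1) * 25)
  # combine step: `^` blocks until the corresponding task has finished
  for p in parts:
    result += ^p

echo parallelSum()
```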


procs where you forget to return a value

2017-10-21 Thread jackmott
I spent about an hour tonight tracking down a weird bug where it turned out I 
had a proc with a return value, but I forgot to return a value. This compiled 
fine, but ran wrong. I assume what happened is that the implicit return value 
was returned, with the default value for the type. Should it perhaps be a 
compiler error if no value is returned and the return value is never assigned 
to anywhere in the proc? Or at least a warning?


Re: Problem using

2017-10-21 Thread jzakiya
OK, I had to clean it up a little to make it work, but here is the code that 
gets it to compile.


  var cnt: array[rescnt, uint]
  parallel:
    for i in 0..rescnt-1:
      cnt[i] = spawn segcount(i*KB, Kn)
  sync()
  for i in 0..rescnt-1:
    primecnt += cnt[i].uint



So was the issue that the single value of `cnt` was getting clobbered by each 
thread's return value, causing the type mismatch?

Thanks for getting it to compile, but if you can explain why this works I'd 
appreciate it even more.


Re: What's happening with destructors?

2017-10-21 Thread rayman22201
@araq,

I missed the live stream, but watched the recording on youtube. Thank you for 
taking the time to do the presentation! Very awesome. 

> The really hard part is replacing the existing runtime with one with a 
> different performance profile ("yay, deterministic freeing, yay more 
> efficient multi threading possibilities, ugh, overall slower?!")

Can you go into more detail about why the new destructor based string 
implementation is slower than the current GC string implementation?

More generally, what is the performance profile of the destructor based 
runtime? Why is it, "overall slower" as you say?

Thank you again, super awesome presentation!


Re: What's happening with destructors?

2017-10-21 Thread bpr
> It's a bit early to say but I think so, yes.

I think so too, though regions (we used to call them "arenas" a long time ago) 
can free many allocated objects at once, even if they aren't in the same 
container. I don't think destructors give you that. I also remember using 
region APIs that would have `setMark` and `freeToMark` and stuff like that, 
but I don't think those kinds of APIs lasted, mostly just the 
`/allocateFromRegion/freeAllInRegion` ones.

Good stuff, I hope to see it in Nim 1.0 really soon, like in a few weeks.


Re: Fastest way to count number of lines

2017-10-21 Thread alfrednewman
Hi, thanks for the help of all of you.

Yes, I'm pre-calculating things. In the data orchestration process I'm involved 
in, I can usually estimate the time of a rendering based on the number of rows 
I'm processing. It is a linear process and the processing time is typically not 
much affected as a function of the size of each line.

The files average several gigabytes in size. Some have tens of millions of 
lines. Knowing the number of lines before processing guarantees some 
interesting benefits to the user.

I'm lucky because the data is on an SSD server... I already had a performance 
gain using the tips provided by @cblake (using import memfiles).

All the best AN


Arrays, openarrays, and sequences

2017-10-21 Thread jackmott
A couple of quick questions:

If I have a function that accepts an array as input via an openarray, does the 
array get copied when you call the function, or not?

Is it possible to write a function using generics, or otherwise, that can 
accept a sequence OR an array, in an efficient manner?


Re: Fastest way to count number of lines

2017-10-21 Thread cblake
Yeah. Depending on what he's doing, same-file dynamic estimation might also 
work. Good point, @jlp765.

On my system the 5.4 MB of `/usr/share/nim/**.nim` gets counted in about 4 
milliseconds - over 1.3 GB/sec, probably faster than all but the most 
powerhouse nvme/disk array IO. This is why I suspect @alfrednewman might be 
re-calculating things instead of saving the answer either in RAM or as files.

I'm sure a pre-pass calculating the number of lines can avoid certain 
complexities. However, once you start doing assembly hijinks that are not even 
portable through a given CPU family (e.g., using SSE, AVX2, AVX512, ...) 
performance becomes very deployment sensitive. Meanwhile, eliminating the 
entire pre-pass by merging it with per-line allocations/whatever costs 
complexity, too, but yields portable performance gains. If it's really 
ineliminable then have at it, asm-wise, I guess. I just suspect it's off-track.


Re: Problem using

2017-10-21 Thread jlp765
try 


var cnt: array[rescnt, int]
parallel:
  for i in 0..rescnt-1:
    cnt[i] = spawn segcount(i*KB, Kn)
sync()
for i in 0..rescnt-1:
  primecnt += cnt[i].uint