@snej @mratsim \- Thank you both very much. I've re-watched Andreas' talk and
gone through the String CoW and mratsim's example. It looks like I'm covered,
but there's one thing that I haven't managed to convince myself of, described
below.
Is this enough to guarantee that the compiler won't
I've read that C++ abandoned CoW for `std::string` because the atomic
ref-counting turned out to be more expensive on average than simply copying the
string every time. But of course YMMV; the tradeoff depends on the size of the
objects and how expensive they are to copy.
And for objects that
This is an RFC on copy-on-write strings:
[https://github.com/nim-lang/RFCs/issues/221](https://github.com/nim-lang/RFCs/issues/221)
(no code or pseudo-code though)
For your own refcounting scheme, as @snej mentions, it's easy to do with
destructors, see my atomic ref counted type here:
Did you watch Andreas's talk on ARC and ORC from Saturday's NimConf? He has a
running example where he builds a toy `seq` from scratch, including the
ref-counting.
Basically you can implement your own ref-counting if your object is a
non-`ref` type. You give it a `ptr` to the shared state, and put
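The pattern described above can be sketched like this — an illustrative toy, not mratsim's actual code, assuming `--gc:arc` and Nim >= 1.2 (where the copy hook is spelled `=copy`). A thread-safe version would use `atomicInc`/`atomicDec` on the counter:

```nim
# Sketch: a plain (non-ref) value type holding a `ptr` to shared
# state that carries the refcount and the payload.
type
  Shared = object
    refcount: int
    data: seq[int]
  CowBuf* = object
    p: ptr Shared

proc initCowBuf*(data: seq[int]): CowBuf =
  result.p = create(Shared)      # zeroed allocation
  result.p.refcount = 1
  result.p.data = data

proc `=destroy`*(b: var CowBuf) =
  if b.p != nil:
    dec b.p.refcount
    if b.p.refcount == 0:
      `=destroy`(b.p.data)       # release the payload
      dealloc(b.p)

proc `=copy`*(dst: var CowBuf, src: CowBuf) =
  if dst.p == src.p: return      # self-assignment
  `=destroy`(dst)
  dst.p = src.p
  if dst.p != nil:
    inc dst.p.refcount           # copies only bump the counter

proc `[]`*(b: CowBuf, i: int): int = b.p.data[i]

proc `[]=`*(b: var CowBuf, i, val: int) =
  # Copy-on-write: clone the shared state only if it is actually shared.
  if b.p.refcount > 1:
    let fresh = create(Shared)
    fresh.refcount = 1
    fresh.data = b.p.data        # deep copy of the payload
    dec b.p.refcount
    b.p = fresh
  b.p.data[i] = val
```

Reads and plain assignments stay cheap (one pointer copy plus a counter bump); only the first mutation of a shared buffer pays for a deep copy.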
So ... It's been a while, and an awful lot has happened on the arc/orc front;
so hopefully it's OK to bump this up:
* What's a good strategy for implementing copy-on-write with `--gc:arc` and/or
`--gc:orc`?
* Is it possible to inspect/access the refcounter, implicit or explicit, from
`--gc:arc`?
So ... I was trying to read up on `=sink` and `=destroy`, and am a bit confused:
There's
[https://github.com/nim-lang/Nim/wiki/Destructors](https://github.com/nim-lang/Nim/wiki/Destructors)
, which appears to be outdated, forwarding to
> Overloading = is really troublesome, I don't think it changed since then.
Wait, what? Everything changed about it; it now has a spec and a different
implementation. Containers with `=`, `=sink` and `=destroy` have never been
easier. Caveat: The old default `seq` implementation doesn't call
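For reference, the hook trio from the destructors spec looks roughly like this on a toy owning container — a sketch for illustration (names are mine, not the stdlib `seq` implementation):

```nim
# Toy owning container with the three hooks: `=destroy`, `=sink`, `=copy`.
type MySeq*[T] = object
  len: int
  data: ptr UncheckedArray[T]

proc newMySeq*[T](len: int): MySeq[T] =
  result.len = len
  result.data = cast[ptr UncheckedArray[T]](alloc0(len * sizeof(T)))

proc `=destroy`*[T](s: var MySeq[T]) =
  if s.data != nil:
    for i in 0 ..< s.len:
      `=destroy`(s.data[i])      # destroy each element
    dealloc(s.data)

proc `=sink`*[T](dst: var MySeq[T], src: MySeq[T]) =
  # A move: free dst's old buffer, then steal src's pointer. No copying.
  `=destroy`(dst)
  dst.len = src.len
  dst.data = src.data

proc `=copy`*[T](dst: var MySeq[T], src: MySeq[T]) =
  if dst.data == src.data: return
  `=destroy`(dst)
  dst.len = src.len
  if src.data != nil:
    dst.data = cast[ptr UncheckedArray[T]](alloc0(src.len * sizeof(T)))
    for i in 0 ..< src.len:
      dst.data[i] = src.data[i]  # deep copy
  else:
    dst.data = nil

proc `[]`*[T](s: MySeq[T], i: int): T = s.data[i]
proc `[]=`*[T](s: var MySeq[T], i: int, v: T) = s.data[i] = v
```

The compiler decides per call site whether an assignment is a copy or a sink (last use), which is exactly the mechanism a CoW container can piggyback on.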
Thanks! I'm still going to try CoW, using a similar style to C++, though with
the standard GC that would essentially mean the refcounting happens twice
every time ... I'll look for a way to piggyback on Nim's refcounter
when I have the time.
I might need a copy-on-write implementation, and was wondering if this is still
a good way.
Am I right in thinking that this will only work with the refcount gc, and not
with the new (owned/bd) or alternative (mark & sweep, boehm, regions) gcs?
Yes, it's possible to have a Tensor of x, y, z points.
Sorting is not implemented yet. (And will not be implemented in terms of
map/parallel map but in terms of whatever algo I find the fastest/most memory
efficient).
What I mean is a vectorized structure-of-arrays for x, y, z (and possibly
others) for a set of particles. They should be ordered according to their place
in the space grid. As I said, in numpy I can have an any-D numpy array and sort it,
no python lists involved. I imagine the same for tensors in
I'm not sure we are speaking of the same thing.
Which case are we in: a Tensor containing seqs, or a pure rank-2 (matrix) or
rank-3 tensor?
import arraymancer
import sequtils, random
randomize()  # seed the RNG; `rand` replaces the old `random` proc
let p1 = newSeqWith(4, rand(0..10))
let p2 = newSeqWith(4, rand(0..10))
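If the goal is a pure rank-2 tensor (no seqs stored inside the tensor), those two seqs can be stacked directly; a small sketch using Arraymancer's `toTensor`, with variable names mirroring the snippet above:

```nim
import arraymancer
import sequtils, random

randomize()
let p1 = newSeqWith(4, rand(0..10))
let p2 = newSeqWith(4, rand(0..10))

# A seq of seqs converts into a pure rank-2 tensor: 2 rows, 4 columns.
let t = @[p1, p2].toTensor()
echo t.rank    # 2, i.e. a matrix, not a container of seqs
```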
Let's say I want to do some operations on particles. They should be vectorized
(and maybe parallelized; some calculations could also benefit from GPU) and it
would be really nice if particles from the same space grid would be in the same
place in the sequence, as they will need to access the
In which cases would you need to store a seq or a list in a tensor or a Numpy
ndarray?
@mratsim Oh, really, you don't know any example of an operation whose cost
depends on the values? Well, I easily know one: sorting.
Ah right, indeed. Well, let's say that it's a non-supported use case *shrugs* and
that Arraymancer tensors are the wrong container for that. I'm not aware of any
scientific/numerical computing situation where an operation's cost depends not
only on the tensor size but also on the value of what is wrapped.
@mratsim No, it's not. That's why I asked whether you use dynamic scheduling.
Imagine you have a sequence of 1, 2, 4, 8, ..., 1048576. Now map it with an
operation of O(N) complexity, where N is the value of the number. If you use
static scheduling, it's entirely possible that most of the work lands on a
single thread.
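A quick back-of-the-envelope illustration of that imbalance (plain Nim, just arithmetic, no threading): splitting those 21 values down the middle, as static scheduling on two threads would, gives one half roughly a thousand times more work than the other.

```nim
import math, sequtils

# Element values 2^0 .. 2^20; the op on value N costs O(N),
# so each element's cost is proportional to its value.
let costs = toSeq(0..20).mapIt(2 ^ it)

# Static scheduling on 2 threads splits the index range in half:
let firstHalf  = sum(costs[0 .. 10])    # 2^11 - 1    = 2047
let secondHalf = sum(costs[11 .. 20])   # 2^21 - 2^11 = 2095104
echo firstHalf, " vs ", secondHalf
```

With dynamic scheduling (e.g. OpenMP's `schedule(dynamic)`), threads grab chunks as they finish, so the thread stuck with 1048576 doesn't also get handed the rest of the heavy tail.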
@Udiknedormin: Actually `map` is the easiest, you can just do:
    for i in 0 || (container.len - 1):
      result[i] = mapped_op(container[i])

OpenMP will statically divide the work into `threads_count` chunks of roughly
`container.len / threads_count` elements each.
Right now Arraymancer offers 2 choices, value and ref [...] in the main
thread.
I don't have a use case (yet?) to introduce even more parallelism. That would
require locks/guards or atomics (not sure we can compare-and-swap a ref though)
in the proposed copy-on-write container as well, which would make it harder to
implement, and maybe slower for the general case.
Could you elaborate about the main thread being the only one being able to
create and destroy the objects? It sounds quite restrictive so I'd like to hear
what your motivation and the general idea was.
After exploring various designs for a container (Tensor) which only copies
memory when necessary (shallow `let` and deep `var`, using the GC refcount)
and hitting technical roadblocks, I've settled on this one. Note that this
should work nicely with the future `=move` and `=sink` operators: