Hello Jürgen,

I don't know if you have given this issue any thought, but it has certainly
occupied my mind for the last few days.

It's clear that heavy array processing does far too much cloning than
should be necessary. Especially in cases where you have lots of operations
on smaller arrays (as opposed to few operations on large arrays).

This is because the code always performs a clone prior to a destructive
operation because at that point it can never be sure that the array will
not be used again. The (very few) exceptions to this is handled by the
"temp" system, which is really only used in the ravel function.

The way I see it, there are two approaches to solving this: The first one
being to go through the code with a fine-toothed comb and implement the
temp system everywhere. This is lots of work and is error-prone.

The other solution would be to re-engineer the Value class so that it can
share underlying data with other Value instances. The Value class would
then get a counter called share_count or something, indicating how many
other references there are to the same data. When clone() is called, it
will simply create a new Value instance, share the underlying data and
increase share_count. Any destructive operations would only copy the
content if it's not already shared.

*Now, Value_P already implements some of these semantics. Could it be
reused for this?*

While appealing, this copy-on-write solution might not be perfect though.
The assumption is that the caller would have decremented the share count
(effectively "releasing" the Value) before the called function tries to
modify it. Now, this "release" is similar to the the "temp" marking. Could
there be a better way?

I've been experimenting with this on and off, and my interim results (as
I've mentioned before) show that the potential performance boosts are
massive. We're talking about 3-4 orders of magnitude. Definitely worth
quite a lot of effort, IMHO.

I'm likely going to continue pursuing this, because I personally feel
frustrated when I do something and it's not instantaneous, knowing that I'm
waiting for the interpreter to perform unnecessary operations. :-)

What are your thoughts on this? What would be the best approach?

Regards,
Elias

Reply via email to