On 1 February 2014 22:04, JR <zor...@gmail.com> wrote:
On Saturday, 1 February 2014 at 05:36:44 UTC, Manu wrote:
I write realtime and memory-constrained software (console games), and
for me, I think the biggest issue that can never be solved is the
non-deterministic nature of the collect cycles, and the unknowable
memory footprint of the application. You can't make any guarantees or
predictions about the GC, which is fundamentally incompatible with
realtime software.
(tried to manually fix ugly linebreaks here, so apologies if it turns
out even worse.)
(Maybe this would be better posted in D.learn; if so I'll crosspost.)
In your opinion, of how much value would deadlining be? As in, "okay
handyman, you may sweep the floor now BUT ONLY FOR 6 MILLISECONDS;
whatever's left after that you'll have to take care of next time, your
pride as a professional Sweeper be damned"?
This has been my only suggestion for years. Although 6ms is way too much
(almost half a frame), 500us is more realistic (a little over 3% of a
frame, still quite a lot of time).
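
To put rough numbers on the budget idea, here is a minimal sketch of a deadline-bound sweep. It assumes a 60 Hz frame (about 16.67 ms), which the thread does not state outright, and the names `sweep_with_deadline`, `pending` and `release` are invented for illustration. It is not how D's collector works; it only shows the shape of "clean up until the budget runs out, leave the rest for next frame".

```python
import time
from collections import deque

FRAME_SECONDS = 1.0 / 60.0   # assumption: 60 Hz frame, about 16.67 ms
BUDGET_SECONDS = 500e-6      # the 500us budget suggested above


def sweep_with_deadline(pending, release, budget=BUDGET_SECONDS,
                        clock=time.perf_counter):
    """Release items from `pending` until the time budget is used up.

    Returns the number of items released. Whatever is left stays in
    `pending` for the next call (i.e. the next frame).
    """
    deadline = clock() + budget
    released = 0
    while pending and clock() < deadline:
        release(pending.popleft())
        released += 1
    return released


def budget_fraction(budget=BUDGET_SECONDS, frame=FRAME_SECONDS):
    """Share of one frame taken by the cleanup budget."""
    return budget / frame
```

Under the 60 Hz assumption, `budget_fraction()` comes out at about 0.03 (3% of a frame) and `budget_fraction(6e-3)` at about 0.36. The `clock` parameter is only there so the loop can be driven by a fake clock when testing.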
It obviously doesn't address memory footprint, but you would get the
illusion of determinism in cases similar to where race-to-idle
approaches work. Inarguably, this wouldn't apply if the goal is to
render as many frames per second as possible, such as for non-console
shooters where tearing is not a concern but latency is very much so.
If it were running a small amount every frame, maybe the memory footprint
would stay within a predictable level... don't know. I'd be interested in
experimenting with this, but as far as I know, nobody knows how to do it.
I'm very much a layman in this field, but I'm trying to soak up as much
knowledge as possible, and most of it from threads like these. To my
uneducated eyes, an ARC collector does seem like the near-ideal
solution -- assuming, as always, the code is written with the GC in
mind. But am I right in gathering that it solves two-thirds of the
problem? You don't need to scan the managed heap, but when memory is
actually freed is still non-deterministic and may incur pauses, though
not necessarily a program-wide stop. Aye?
I'm not sure what you mean. If you mean pauses because it's simply doing
work to free things, then that's deterministically issued workload just
like anything else and can (should) be budgeted. It doesn't disappear in
a GC context, it just happens in bursts at unpredictable times.
Like you say, it doesn't stop the world, no need for that. Additionally,
it leaves room for case-by-case flexibility in destruction approach. If
something takes some time to destroy, and you're okay to have it hang
around for a while, you can easily plonk it in a dead list waiting for
some idle time to work through it. This truly allows you to use idle
time to clean stuff up: you don't need to scan the heap every time,
because the list of things waiting to be destroyed is already known.
You could also add dead objects to a list which can be processed by
another thread (without stopping all threads).
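
The "dead list processed by another thread" idea could look roughly like the sketch below. Everything here (`dead_list`, `retire`, `destroyer`, `run_destroyer`, `shutdown`) is a made-up name, and a real ARC runtime would hook this in where the reference count reaches zero; this only illustrates that the worker knows exactly what to destroy without scanning anything and without pausing other threads.

```python
import queue
import threading

dead_list = queue.Queue()   # hypothetical: objects awaiting destruction
_STOP = object()            # sentinel telling the worker to exit


def retire(obj):
    """Called where the last reference goes away: queue it, don't destroy it."""
    dead_list.put(obj)


def destroyer(destroy):
    """Worker loop: destroys queued objects while other threads keep running."""
    while True:
        obj = dead_list.get()   # blocks until something is retired
        if obj is _STOP:
            break
        destroy(obj)


def run_destroyer(destroy):
    """Start the background destroyer thread and return it."""
    worker = threading.Thread(target=destroyer, args=(destroy,), daemon=True)
    worker.start()
    return worker


def shutdown(worker):
    """Let the worker finish everything queued so far, then stop it."""
    dead_list.put(_STOP)
    worker.join()
```

Typical use would be `worker = run_destroyer(some_destroy_function)` at startup, `retire(obj)` wherever an object dies, and `shutdown(worker)` on exit.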
There is far more opportunity to implement the destruction pattern that
suits your application, and for those not interested in implementing
their own destruction patterns, it would be easy to offer a library
function setDestructionPattern(patternType); which would configure a
particular policy application-wide, and they'd never have to think about
it again.
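
As a rough illustration of what such a policy switch might look like: the sketch below is modelled on the hypothetical setDestructionPattern(patternType) mentioned above, which is not an existing library function. The two policies and the names `DestructionPattern`, `release` and `on_idle` are assumptions made up for this example.

```python
from collections import deque
from enum import Enum


class DestructionPattern(Enum):
    IMMEDIATE = "immediate"   # destroy as soon as the object is released
    DEFERRED = "deferred"     # park it on a dead list until idle time


_pattern = DestructionPattern.IMMEDIATE
_dead_list = deque()


def set_destruction_pattern(pattern):
    """Configure the application-wide policy once."""
    global _pattern
    _pattern = DestructionPattern(pattern)


def release(obj, destroy):
    """Apply the current policy to an object that is no longer needed."""
    if _pattern is DestructionPattern.IMMEDIATE:
        destroy(obj)
    else:
        _dead_list.append((obj, destroy))


def on_idle(max_items=None):
    """Destroy parked objects, optionally capping how many per idle slot."""
    done = 0
    while _dead_list and (max_items is None or done < max_items):
        obj, destroy = _dead_list.popleft()
        destroy(obj)
        done += 1
    return done
```

Code that releases objects never needs to know which policy is active; only the call to `set_destruction_pattern` and the place that calls `on_idle` change.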
At the same time, Lucarella's DConf slides were very, very attractive. I
gather that allocations themselves may become slower with a concurrent
collector, but collection times in essence become non-issues.
Technically parallelism doesn't equate to free CPU time, but it more or
less *is*, assuming there is a core/thread to spare. Wrong?
IIRC, it assumed a particular operating system, and required significant
resource overhead. I recall being impressed, but concluding that it
wasn't a solution that could be merged into D across the board.
I'd need to go back and revisit the slides...
Lastly, am I right in understanding precise collectors as identical to
the stop-the-world collector we currently have, but with a smarter
allocation scheme resulting in a smaller managed heap to scan? With the
additional bonus of fewer false pointers. If so, this would seem like a
good improvement to the current implementation, with the next increment
in the same direction being a generational GC.
I think the reduction in heap to scan is due to only following what are
known to be pointers, but I don't think that reduces the scan volume.
The memory footprint of your app is the same (more, in fact, because
there's new metadata); it just wouldn't chase false pointers anymore.
I recall most agreed that it was a good improvement (although it
benchmarked something like 5% slower). I would appreciate it simply for
the improved precision... but it doesn't solve the problem in any way.
The practical application for me is that it wouldn't leak (much) memory.
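
For anyone unsure what a "false pointer" is, here is a toy sketch of the difference between a conservative scan and a precise one. The heap layout, addresses and the `(value, is_pointer)` root format are all invented for illustration and do not reflect how D's runtime represents anything.

```python
# Toy heap: block start address -> size in bytes (addresses are invented).
HEAP = {0x1000: 64, 0x2000: 128, 0x3000: 32}


def block_containing(word):
    """Return the start of the heap block that `word` points into, if any."""
    for start, size in HEAP.items():
        if start <= word < start + size:
            return start
    return None


def conservative_scan(roots):
    """Treat every root word that looks like a heap address as a pointer."""
    live = set()
    for word, _is_pointer in roots:
        block = block_containing(word)
        if block is not None:
            live.add(block)
    return live


def precise_scan(roots):
    """Follow only the words that type metadata marks as pointers."""
    live = set()
    for word, is_pointer in roots:
        if is_pointer:
            block = block_containing(word)
            if block is not None:
                live.add(block)
    return live


# (value, is_pointer): the second root is a plain integer that happens to
# equal an address inside the 0x2000 block, i.e. a false pointer.
ROOTS = [(0x1000, True), (0x2010, False), (42, False)]
```

With these roots the conservative scan keeps both the 0x1000 and 0x2000 blocks alive, while the precise scan keeps only 0x1000. The heap itself is no smaller, and the precise version needs the extra `is_pointer` metadata, but the 0x2000 block no longer leaks.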
I would *dearly* love to have concurrency in whatever we end up with,
though. For a multi-core personal computer threads are free lunches, or
close enough so. Concurrentgate and all that jazz.
The kicker for me with the whole GC thing is that as long as I've been
talking about it around here (among a whole bunch of experts), I've been
saying that an acceptable GC would be required to 1. not stop the world,
and 2. support incremental collection/time-slicing, so I can budget it a
maximum amount of time per frame.
I'm yet to hear anyone suggest how they can even IMAGINE writing a
collector like that. As far as I can tell, the consensus is that it's
impossible.
Certainly, nobody has ever given any suggestions how it could possibly be
done in the future, or steps in that direction, which says a lot to me.
I've abandoned the idea as unrealistic fantasy.