Re: low-latency GC

2020-12-08 Thread oddp via Digitalmars-d-learn

On 06.12.20 06:16, Bruce Carneal via Digitalmars-d-learn wrote:

How difficult would it be to add a, selectable, low-latency GC to dlang?

Is it closer to "we can't get there from here" or "no big deal if you already have the low-latency GC 
in hand"?


I've heard Walter mention performance issues (write barriers IIRC).  I'm also interested in the 
GC-flavor performance trade offs but here I'm just asking about feasibility.



What our closest competition, Nim, is up to with their mark-and-sweep 
replacement ORC [1]:

ORC is the existing ARC algorithm (first shipped in version 1.2) plus a cycle 
collector

[...]

ARC is Nim’s pure reference-counting GC, however, many reference count operations are optimized 
away: Thanks to move semantics, the construction of a data structure does not involve RC operations. 
And thanks to “cursor inference”, another innovation of Nim’s ARC implementation, common data 
structure traversals do not involve RC operations either!


[...]

Benchmark:

Metric/algorithm          ORC    Mark&Sweep
Latency (Avg)       320.49 us      65.31 ms
Latency (Max)         6.24 ms     204.79 ms
Requests/sec         30963.96        282.69
Transfer/sec          1.48 MB      13.80 KB
Max memory            137 MiB       153 MiB

That’s right, ORC is over 100 times faster than the M&S GC. The reason is that ORC only touches 
memory that the mutator touches, too.
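
For reference, the ratios implied by the benchmark figures above work out as follows. This is plain arithmetic on the quoted numbers, nothing more:

```latex
\begin{align*}
\text{throughput:}   \quad & \frac{30963.96\ \text{req/s}}{282.69\ \text{req/s}} \approx 109.5 \\
\text{avg latency:}  \quad & \frac{65.31\ \text{ms}}{0.32049\ \text{ms}} \approx 203.8 \\
\text{max latency:}  \quad & \frac{204.79\ \text{ms}}{6.24\ \text{ms}} \approx 32.8 \\
\text{max memory:}   \quad & \frac{153\ \text{MiB}}{137\ \text{MiB}} \approx 1.12
\end{align*}
```

So the "over 100 times" figure corresponds to the requests/sec ratio, while peak memory in this particular run differs by roughly 12%.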


[...]

- uses 2x less memory than classical GCs
- can be orders of magnitudes faster in throughput
- offers sub-millisecond latencies
- suited for (hard) realtime systems
- no “stop the world” phase
- oblivious to the size of the heap or the used stack space.

There's also some discussion on /r/programming [2] and hackernews [3], but it 
hasn't taken off yet.

[1] https://nim-lang.org/blog/2020/12/08/introducing-orc.html
[2] 
https://old.reddit.com/r/programming/comments/k95cc5/introducing_orc_nim_nextgen_memory_management/
[3] https://news.ycombinator.com/item?id=25345770


Re: low-latency GC

2020-12-06 Thread Ola Fosheim Grostad via Digitalmars-d-learn

On Sunday, 6 December 2020 at 17:28:52 UTC, Bruce Carneal wrote:
D is good for systems level work but that's not all.  I use it 
for projects where, in the past, I'd have split the work 
between two languages (Python and C/C++).  I much prefer 
working with a single language that spans the problem space.


My impression from reading the forums is that people either use D 
as a replacement for C/C++ or Python/numpy, so I think your 
experience covers the essential use case scenario that is 
dominating current D usage? Any improvements have to improve both 
dimensions, I agree.


If there is a way to extend D's reach with zero or a near-zero 
complexity increase as seen by the programmer, I believe we 
should (as/when resources allow of course).


ARC involves a complexity increase, to some extent. Library 
authors have to think in a more principled way about when objects 
should be phased out and destructed, which I think tends to lead 
to better programs. It would also allow for faster precise 
collection. So it could be beneficial for all.




Re: low-latency GC

2020-12-06 Thread Ola Fosheim Grostad via Digitalmars-d-learn

On Sunday, 6 December 2020 at 17:35:19 UTC, IGotD- wrote:
Is automatic atomic reference counting a contender for kernels? 
In kernels you want to reduce the increase/decrease of the 
counts. Therefore the Rust approach using 'clone' is better 
unless there is some optimizer that can figure it out. 
Performance is important in kernels, you don't want the kernel 
to steal useful CPU time that otherwise should go to programs.


I am not sure if kernel authors want automatic memory management, 
they tend to want full control and transparency. Maybe something 
people who write device drivers would consider.


In general I think that reference counting should be supported 
in D, not only implicitly but also under the hood with fat 
pointers. This will make D more attractive to performance 
applications. Another advantage is the reference counting can 
use malloc/free directly if needed without any complicated GC 
layer with associated meta data.


Yes, I would like to see it, just expect that there will be 
protests when people realize that they have to make ownership 
explicit.


Also tracing GC in a kernel is in my opinion not desirable. For 
the reason I previously mentioned, you want to reduce meta 
data, you want to reduce CPU time, you want to reduce 
fragmentation. Special allocators for structures are often used.


Yes, an ARC solution should support fixed size allocators for 
types that are frequently allocated to get better speed.
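
To make the "fixed size allocator" idea concrete, here is a rough C++ sketch of a free-list pool. `FixedPool`, `Node` and `pool_reuses_slot` are made-up names for illustration, not part of any D or C++ library, and the sketch is single-threaded and ignores over-aligned types.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>

// Hypothetical fixed-size pool (illustration only). All slots have the same
// size, so allocation and release are a pop/push on a free list.
template <typename T, std::size_t Capacity>
class FixedPool {
    union Slot {
        Slot* next;                                   // used while the slot is free
        alignas(T) unsigned char storage[sizeof(T)];  // used while the slot holds a T
    };
    Slot slots_[Capacity];
    Slot* free_ = nullptr;
    std::size_t live_ = 0;

public:
    FixedPool() {
        for (std::size_t i = 0; i < Capacity; ++i) {  // thread every slot onto the free list
            slots_[i].next = free_;
            free_ = &slots_[i];
        }
    }
    FixedPool(const FixedPool&) = delete;
    FixedPool& operator=(const FixedPool&) = delete;

    template <typename... Args>
    T* create(Args&&... args) {
        if (free_ == nullptr) return nullptr;         // pool exhausted
        Slot* s = free_;
        free_ = s->next;
        ++live_;
        return new (s->storage) T(std::forward<Args>(args)...);
    }

    void destroy(T* p) {
        assert(p != nullptr);
        p->~T();
        Slot* s = reinterpret_cast<Slot*>(p);         // storage sits at offset 0 of the slot
        s->next = free_;
        free_ = s;
        --live_;
    }

    std::size_t live() const { return live_; }
};

struct Node { int value; Node* next; };

// Returns true when a released slot is handed out again by the next create().
bool pool_reuses_slot() {
    FixedPool<Node, 4> pool;
    Node* a = pool.create(Node{1, nullptr});
    void* first = a;
    pool.destroy(a);
    Node* b = pool.create(Node{2, nullptr});
    bool reused = (first == b) && pool.live() == 1 && b->value == 2;
    pool.destroy(b);
    return reused;
}
```

An ARC runtime could route allocations of a hot type through such a pool instead of the general-purpose heap; whether that pays off depends on the allocation pattern.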





Re: low-latency GC

2020-12-06 Thread IGotD- via Digitalmars-d-learn
On Sunday, 6 December 2020 at 15:44:32 UTC, Ola Fosheim Grøstad 
wrote:


It was more a hypothetical, as read barriers are too expensive. 
But write barriers should be ok, so a single-threaded 
incremental collector could work well if D takes a principled 
stance that objects not marked 'shared' are not handed over to 
other threads without pinning them in the GC.


Maybe a better option for D than ARC, as it is closer to what 
people are used to.


In kernel programming there are plenty of atomic reference 
counted objects. The reason is that if you have a kernel that 
supports SMP you must have them, because you don't really know which 
CPU is working with a structure at any given time. These are 
often manually reference counted objects, which can lead to 
memory leak bugs, but those are not that hard to find.
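
A minimal sketch of what such a manually counted object tends to look like, written in C++ rather than kernel C. `Buffer`, `buffer_get` and `buffer_put` are invented names, and the memory orderings are the conventional choice for this pattern, not taken from any particular kernel:

```cpp
#include <atomic>
#include <cassert>

// Hypothetical kernel-style object with an embedded atomic reference count.
struct Buffer {
    std::atomic<int> refs{1};   // the creator holds the first reference
    int payload = 0;
};

// Take an extra reference. Relaxed ordering is enough here because the
// caller must already hold a reference to call this safely.
inline Buffer* buffer_get(Buffer* b) {
    assert(b != nullptr);
    b->refs.fetch_add(1, std::memory_order_relaxed);
    return b;
}

// Drop a reference. Returns true if this call freed the object.
inline bool buffer_put(Buffer* b) {
    assert(b != nullptr);
    // acq_rel: earlier writes to *b by any holder are visible before the delete.
    if (b->refs.fetch_sub(1, std::memory_order_acq_rel) == 1) {
        delete b;
        return true;
    }
    return false;
}

// Counts how many of the two put calls actually released the object.
int count_releases() {
    Buffer* b = new Buffer;          // refs == 1
    buffer_get(b);                   // refs == 2
    int released = 0;
    if (buffer_put(b)) ++released;   // refs == 1, object still alive
    if (buffer_put(b)) ++released;   // refs == 0, object deleted
    return released;
}
```

Forgetting one `buffer_put` leaves the count above zero forever, which is the leak mentioned above.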


Is automatic atomic reference counting a contender for kernels? 
In kernels you want to reduce the increase/decrease of the 
counts. Therefore the Rust approach using 'clone' is better 
unless there is some optimizer that can figure it out. 
Performance is important in kernels, you don't want the kernel to 
steal useful CPU time that otherwise should go to programs.


In general I think that reference counting should be supported in 
D, not only implicitly but also under the hood with fat pointers. 
This will make D more attractive to performance applications. 
Another advantage is the reference counting can use malloc/free 
directly if needed without any complicated GC layer with 
associated meta data.


Also tracing GC in a kernel is in my opinion not desirable. For the 
reason I previously mentioned, you want to reduce meta data, you 
want to reduce CPU time, you want to reduce fragmentation. Special 
allocators for structures are often used.




Re: low-latency GC

2020-12-06 Thread Bruce Carneal via Digitalmars-d-learn
On Sunday, 6 December 2020 at 16:42:00 UTC, Ola Fosheim Grostad 
wrote:

On Sunday, 6 December 2020 at 14:44:25 UTC, Paulo Pinto wrote:
And while on the subject of low level programming in JVM or 
.NET.


https://www.infoq.com/news/2020/12/net-5-runtime-improvements/


Didn't say anything about low level, only SIMD intrinsics, which 
isn't really low level?


It also stated "When it came to something that is pure CPU raw 
computation doing nothing but number crunching, in general, you 
can still eke out better performance if you really focus on 
"pedal to the metal" with your C/C++ code."


So you must make the familiar "ease-of-programming" vs "x% of 
performance" choice, where 'x' is presumably much smaller than 
earlier.




So it is more of a Go contender, and Go is not a systems level 
language... Apples and oranges.




D is good for systems level work but that's not all.  I use it 
for projects where, in the past, I'd have split the work between 
two languages (Python and C/C++).  I much prefer working with a 
single language that spans the problem space.


If there is a way to extend D's reach with zero or a near-zero 
complexity increase as seen by the programmer, I believe we 
should (as/when resources allow of course).




Re: low-latency GC

2020-12-06 Thread Ola Fosheim Grostad via Digitalmars-d-learn

On Sunday, 6 December 2020 at 14:44:25 UTC, Paulo Pinto wrote:
And while on the subject of low level programming in JVM or 
.NET.


https://www.infoq.com/news/2020/12/net-5-runtime-improvements/


Didn't say anything about low level, only SIMD intrinsics, which 
isn't really low level?


It also stated "When it came to something that is pure CPU raw 
computation doing nothing but number crunching, in general, you 
can still eke out better performance if you really focus on 
"pedal to the metal" with your C/C++ code."


So it is more of a Go contender, and Go is not a systems level 
language... Apples and oranges.


As I already mentioned in another thread, rebooting the 
language to pull in imaginary crowds will only do more damage 
than good, while the ones deemed unusable by the same imaginary 
crowd just keep winning market share, slowly and steadily, even 
if it takes yet another couple of years.


A fair number of people here are in that imaginary crowd.
So, I guess it isn't imaginary...


Re: low-latency GC

2020-12-06 Thread Ola Fosheim Grostad via Digitalmars-d-learn

On Sunday, 6 December 2020 at 14:11:41 UTC, Max Haughton wrote:
On Sunday, 6 December 2020 at 11:35:17 UTC, Ola Fosheim Grostad 
wrote:

On Sunday, 6 December 2020 at 11:27:39 UTC, Max Haughton wrote:

[...]


No, unique doesn't need indirection, neither does ARC, we put 
the ref count at a negative offset.


shared_ptr is a fat pointer with the ref count as a separate 
object to support existing C libraries, and make weak_ptr easy 
to implement. But no need for indirection.



[...]


I think you need a new IR, but it does not have to be used for 
code gen, it can point back to the ast nodes that represent 
ARC pointer assignments.


One could probably translate the one used in Rust, even.


https://gcc.godbolt.org/z/bnbMeY


If you pass something as a parameter then there may or may not be 
an extra reference involved. Not specific for smart pointers, but 
ARC optimization should take care of that.




Re: low-latency GC

2020-12-06 Thread Ola Fosheim Grøstad via Digitalmars-d-learn

On Sunday, 6 December 2020 at 14:45:21 UTC, Bruce Carneal wrote:
Well, you could in theory avoid putting owning pointers on the 
stack/globals or require that they are registered as gc roots. 
Then you don't have to scan the stack. All you need then is 
write barriers. IIRC


'shared' with teeth?


It was more a hypothetical, as read barriers are too expensive. 
But write barriers should be ok, so a single-threaded incremental 
collector could work well if D takes a principled stance that 
objects not marked 'shared' are not handed over to other threads 
without pinning them in the GC.
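
To illustrate why an incremental collector needs a write barrier at all, here is a toy single-threaded tri-color marker in C++. Everything in it (`Obj`, `Marker`, `store`, `color_of_c_after_marking`) is a hypothetical sketch, and the Dijkstra-style insertion barrier is just one common choice, not something proposed in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Color { White, Gray, Black };  // unreached, reached-but-unscanned, scanned

struct Obj {
    Color color = Color::White;
    std::vector<Obj*> fields;
};

// Hypothetical single-threaded incremental marker.
struct Marker {
    std::vector<Obj*> gray;        // worklist of gray objects
    bool barrier_enabled = true;

    void shade(Obj* o) {
        if (o != nullptr && o->color == Color::White) {
            o->color = Color::Gray;
            gray.push_back(o);
        }
    }

    // Scan at most `budget` gray objects, then hand control back to the mutator.
    void step(std::size_t budget) {
        for (; budget > 0 && !gray.empty(); --budget) {
            Obj* o = gray.back();
            gray.pop_back();
            for (Obj* child : o->fields) shade(child);
            o->color = Color::Black;
        }
    }

    // Write barrier: every pointer store by the mutator goes through here.
    void store(Obj* holder, std::size_t slot, Obj* target) {
        holder->fields[slot] = target;
        if (barrier_enabled) shade(target);  // insertion barrier: shade the new target
    }
};

// Returns the final color of c, which stays reachable the whole time.
// White at the end would mean a live object is about to be swept.
Color color_of_c_after_marking(bool with_barrier) {
    Obj a, b, c;
    a.fields.push_back(nullptr);
    b.fields.push_back(&c);          // initially only b points to c

    Marker m;
    m.barrier_enabled = with_barrier;
    m.shade(&b);                     // roots: b and a
    m.shade(&a);
    m.step(1);                       // scans a only; a is now black, b still gray

    m.store(&a, 0, &c);              // mutator: black a now points to white c
    m.store(&b, 0, nullptr);         // mutator: the old path to c disappears

    m.step(100);                     // finish marking
    return c.color;
}
```

With the barrier enabled `c` ends up black; with it disabled `c` stays white although `a` still points to it, which is exactly the lost-object case the barrier prevents.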


Maybe a better option for D than ARC, as it is closer to what 
people are used to.





Re: low-latency GC

2020-12-06 Thread Bruce Carneal via Digitalmars-d-learn
On Sunday, 6 December 2020 at 08:59:49 UTC, Ola Fosheim Grostad 
wrote:

On Sunday, 6 December 2020 at 08:36:49 UTC, Bruce Carneal wrote:
Yes, but they don't allow low level programming. Go also 
freezes to sync threads; this has a rather profound impact on 
code generation. They have spent a lot of effort on sync 
instructions in code gen to lower the latency AFAIK.


So, much of the difficulty in bringing low-latency GC to dlang 
would be the large code gen changes required.  If it is a 
really big effort then that is all we need to know.  Not worth 
it until we can see a big payoff and have more resources.


Well, you could in theory avoid putting owning pointers on the 
stack/globals or require that they are registered as gc roots. 
Then you don't have to scan the stack. All you need then is 
write barriers. IIRC


'shared' with teeth?




Re: low-latency GC

2020-12-06 Thread Paulo Pinto via Digitalmars-d-learn
On Sunday, 6 December 2020 at 08:12:58 UTC, Ola Fosheim Grostad 
wrote:

On Sunday, 6 December 2020 at 07:45:17 UTC, Bruce Carneal wrote:
GCs scan memory, sure.  Lots of variations.  Not germane.  Not 
a rationale.


We need to freeze the threads when collecting stacks/globals.

D is employed at multiple "levels".  Whatever level you call 
it, Go and modern JVMs employ low latency GCs in 
multi-threaded environments.  Some people would like to use D 
at that "level".


Yes, but they don't allow low level programming. Go also freezes 
to sync threads; this has a rather profound impact on code 
generation. They have spent a lot of effort on sync 
instructions in code gen to lower the latency AFAIK.




They surely do.

Looking forward to seeing D achieve the same performance level 
that .NET 5 is capable of: it beats Google's own gRPC C++ 
implementation, and only the Rust implementation beats it.


https://www.infoq.com/news/2020/12/aspnet-core-improvement-dotnet-5/

And while on the subject of low level programming in JVM or .NET.

https://www.infoq.com/news/2020/12/net-5-runtime-improvements/

Many of the performance improvements in the HTTP/2 
implementation are related to the reimplementation from 
unmanaged C++ code to managed C# code. Lander notes that there 
"still is this kind of idea that managed languages are not 
quite up to the task for some of those low-level super 
performance sensitive components,


Rich Lander being one of the main .NET architects, and upcoming 
Java 16 features, http://openjdk.java.net/jeps/389 (JNI 
replacement), http://openjdk.java.net/jeps/393 (native memory 
management).


As I already mentioned in another thread, rebooting the language 
to pull in imaginary crowds will only do more damage than good, 
while the ones deemed unusable by the same imaginary crowd just 
keep winning market share, slowly and steadily, even if it takes 
yet another couple of years.


Re: low-latency GC

2020-12-06 Thread Max Haughton via Digitalmars-d-learn
On Sunday, 6 December 2020 at 11:35:17 UTC, Ola Fosheim Grostad 
wrote:

On Sunday, 6 December 2020 at 11:27:39 UTC, Max Haughton wrote:

[...]


No, unique doesn't need indirection, neither does ARC, we put 
the ref count at a negative offset.


shared_ptr is a fat pointer with the ref count as a separate 
object to support existing C libraries, and make weak_ptr easy 
to implement. But no need for indirection.



[...]


I think you need a new IR, but it does not have to be used for 
code gen, it can point back to the ast nodes that represent ARC 
pointer assignments.


One could probably translate the one used in Rust, even.


https://gcc.godbolt.org/z/bnbMeY


Re: low-latency GC

2020-12-06 Thread Ola Fosheim Grostad via Digitalmars-d-learn

On Sunday, 6 December 2020 at 12:58:44 UTC, IGotD- wrote:
I was thinking about how to deal with this in D and the 
question is if it would be better to be able to control move as 
default on a per-type basis. This way we can implement Rust style 
reference counting without intruding too much on the rest of 
the language. The question is if we want this or if we should 
go for a fully automated approach where the programmer doesn't 
need to worry about 'clone'.


I don't know, but I suspect that people that use D want something 
more high level than Rust? But I don't use Rust, so...




Re: low-latency GC

2020-12-06 Thread IGotD- via Digitalmars-d-learn
On Sunday, 6 December 2020 at 11:07:50 UTC, Ola Fosheim Grostad 
wrote:


ARC can be done incrementally, we can do it as a library first 
and use a modified version of the existing GC for detecting failed 
borrows at runtime during testing.


But all libraries that use owning pointers need ownership to be 
made explicit.


A static borrow checker and an ARC optimizer need a high level IR 
though. A lot of work.


The Rust approach is interesting as it doesn't need an ARC 
optimizer. Everything is a move, so no increase/decrease is done 
for a move. The count is increased only when the programmer 
decides to 'clone' the reference. This inherently becomes 
optimized without any compiler support. However, this requires 
that the programmer inserts 'clone' when necessary so it isn't 
really automatic.
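
The same move-versus-share distinction can be seen with C++'s `std::shared_ptr`, used here only as a stand-in. One difference from Rust: in C++ the copy that bumps the count is implicit, while Rust makes it an explicit `clone`. `Counts` and `move_versus_copy` are names made up for this sketch.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Counts {
    long after_move;
    long after_copy;
};

// Shows that a move transfers ownership without touching the count,
// while a copy (the analogue of Rust's clone) increments it.
Counts move_versus_copy() {
    auto a = std::make_shared<int>(42);   // count == 1
    auto b = std::move(a);                // ownership moves to b, count stays 1
    assert(a == nullptr);                 // the moved-from shared_ptr is empty
    long after_move = b.use_count();

    auto c = b;                           // shared ownership, count goes to 2
    long after_copy = b.use_count();
    (void)c;
    return {after_move, after_copy};
}
```

Calling `move_versus_copy()` yields a count of 1 after the move and 2 after the copy.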


I was thinking about how to deal with this in D and the question 
is if it would be better to be able to control move as default 
on a per-type basis. This way we can implement Rust style reference 
counting without intruding too much on the rest of the language. 
The question is if we want this or if we should go for a fully 
automated approach where the programmer doesn't need to worry 
about 'clone'.


Re: low-latency GC

2020-12-06 Thread Ola Fosheim Grostad via Digitalmars-d-learn

On Sunday, 6 December 2020 at 11:27:39 UTC, Max Haughton wrote:
ARC with a library will have overhead unless the compiler/ABI 
is changed e.g. unique_ptr in C++ has an indirection.


No, unique doesn't need indirection, neither does ARC, we put the 
ref count at a negative offset.
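
A rough sketch of the "ref count at a negative offset" layout in C++. The names (`RcHeader`, `rc_new`, `rc_retain`, `rc_release`) are invented for illustration; the count is not atomic, over-aligned types are rejected, and a throwing constructor would leak the block.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <utility>

// Header stored immediately in front of the object.
struct RcHeader { std::size_t count; };

// Header size rounded up so the object behind it stays suitably aligned.
constexpr std::size_t kAlign = alignof(std::max_align_t);
constexpr std::size_t kHeaderSize = (sizeof(RcHeader) + kAlign - 1) / kAlign * kAlign;

template <typename T, typename... Args>
T* rc_new(Args&&... args) {
    static_assert(alignof(T) <= kAlign, "over-aligned types are not handled");
    void* raw = std::malloc(kHeaderSize + sizeof(T));
    if (raw == nullptr) throw std::bad_alloc();
    new (raw) RcHeader{1};                               // count starts at 1
    void* obj = static_cast<unsigned char*>(raw) + kHeaderSize;
    return new (obj) T(std::forward<Args>(args)...);     // object follows the header
}

// The count is found by stepping back a fixed distance from the object pointer.
template <typename T>
RcHeader* rc_header(T* obj) {
    return reinterpret_cast<RcHeader*>(reinterpret_cast<unsigned char*>(obj) - kHeaderSize);
}

template <typename T>
void rc_retain(T* obj) { ++rc_header(obj)->count; }

// Returns true if this call destroyed the object and freed the block.
template <typename T>
bool rc_release(T* obj) {
    RcHeader* h = rc_header(obj);
    if (--h->count != 0) return false;
    obj->~T();
    std::free(h);                                        // header is the start of the block
    return true;
}

struct Point { int x, y; };

bool negative_offset_demo() {
    Point* p = rc_new<Point>(Point{1, 2});
    bool thin = sizeof(p) == sizeof(void*);   // the handle is a plain pointer
    rc_retain(p);                             // count == 2
    bool first = rc_release(p);               // count == 1, still alive
    bool second = rc_release(p);              // count == 0, freed
    return thin && !first && second;
}
```

The handle stays an ordinary `T*`, and reaching the count costs one subtraction by a compile-time constant, not an extra pointer load.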


shared_ptr is a fat pointer with the ref count as a separate 
object to support existing C libraries, and make weak_ptr easy to 
implement. But no need for indirection.
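
A small sketch of the two properties mentioned for `shared_ptr`: it can adopt memory that came from a C-style allocator, because the count lives in a separate control block, and that same control block lets a `weak_ptr` observe expiry. `weak_observer_expires` is a made-up name, and `malloc`/`free` stand in for a C library's allocator here.

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>

// Returns true if the weak observer is alive while the owner exists
// and reports expiry once the last owner is gone.
bool weak_observer_expires() {
    // Adopt a malloc'ed int; std::free is the deleter. The int itself carries
    // no header, the reference counts live in a separately allocated block.
    std::shared_ptr<int> owner(static_cast<int*>(std::malloc(sizeof(int))), std::free);
    if (!owner) return false;             // allocation failed
    *owner = 7;

    std::weak_ptr<int> observer = owner;  // does not keep the int alive
    bool alive_before = !observer.expired();

    owner.reset();                        // last strong reference: std::free runs
    return alive_before && observer.expired();
}
```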


The AST effectively is a high-level IR. Not a good one, but 
good enough. The system Walter has built shows the means are 
there in the compiler already.


I think you need a new IR, but it does not have to be used for 
code gen, it can point back to the ast nodes that represent ARC 
pointer assignments.


One could probably translate the one used in Rust, even.


Re: low-latency GC

2020-12-06 Thread Max Haughton via Digitalmars-d-learn
On Sunday, 6 December 2020 at 11:07:50 UTC, Ola Fosheim Grostad 
wrote:

On Sunday, 6 December 2020 at 10:44:39 UTC, Max Haughton wrote:
On Sunday, 6 December 2020 at 05:29:37 UTC, Ola Fosheim 
Grostad wrote:
It has to be either some kind of heavily customisable small GC 
(i.e. with our resources the GC cannot please everyone), or 
arc. The GC as it is just hurts the language.


Realistically, we probably need some kind of working group or 
at least serious discussion to really narrow down where to go 
in the future. The GC as it is now must go, we need borrowing 
to work with more than just pointers, etc.


The issue is that it can't just be done incrementally, it 
needs to be specified beforehand.


ARC can be done incrementally, we can do it as a library first 
and use a modified version of the existing GC for detecting failed 
borrows at runtime during testing.


But all libraries that use owning pointers need ownership to be 
made explicit.


A static borrow checker and an ARC optimizer need a high level IR 
though. A lot of work.


ARC with a library will have overhead unless the compiler/ABI is 
changed e.g. unique_ptr in C++ has an indirection.


The AST effectively is a high-level IR. Not a good one, but good 
enough. The system Walter has built shows the means are there in 
the compiler already.


As things are at the moment, the annotations we have for pointers 
like scope go a long way, but the language doesn't deal with 
things like borrowing structs (and the contents of structs i.e. 
making a safe vector) properly yet. That is what needs thinking 
about.


Re: low-latency GC

2020-12-06 Thread Ola Fosheim Grostad via Digitalmars-d-learn

On Sunday, 6 December 2020 at 10:44:39 UTC, Max Haughton wrote:
On Sunday, 6 December 2020 at 05:29:37 UTC, Ola Fosheim Grostad 
wrote:
It has to be either some kind of heavily customisable small GC 
(i.e. with our resources the GC cannot please everyone), or 
arc. The GC as it is just hurts the language.


Realistically, we probably need some kind of working group or 
at least serious discussion to really narrow down where to go 
in the future. The GC as it is now must go, we need borrowing 
to work with more than just pointers, etc.


The issue is that it can't just be done incrementally, it needs 
to be specified beforehand.


ARC can be done incrementally, we can do it as a library first 
and use a modified version of the existing GC for detecting failed 
borrows at runtime during testing.


But all libraries that use owning pointers need ownership to be 
made explicit.


A static borrow checker and an ARC optimizer need a high level IR 
though. A lot of work.






Re: low-latency GC

2020-12-06 Thread Max Haughton via Digitalmars-d-learn
On Sunday, 6 December 2020 at 05:29:37 UTC, Ola Fosheim Grostad 
wrote:

On Sunday, 6 December 2020 at 05:16:26 UTC, Bruce Carneal wrote:
How difficult would it be to add a, selectable, low-latency GC 
to dlang?


Is it closer to "we can't get there from here" or "no big deal 
if you already have the low-latency GC in hand"?


I've heard Walter mention performance issues (write barriers 
IIRC).  I'm also interested in the GC-flavor performance trade 
offs but here I'm just asking about feasibility.


The only reasonable option for D is single threaded GC or ARC.


It has to be either some kind of heavily customisable small GC 
(i.e. with our resources the GC cannot please everyone), or arc. 
The GC as it is just hurts the language.


Realistically, we probably need some kind of working group or at 
least serious discussion to really narrow down where to go in the 
future. The GC as it is now must go, we need borrowing to work 
with more than just pointers, etc.


The issue is that it can't just be done incrementally, it needs 
to be specified beforehand.




Re: low-latency GC

2020-12-06 Thread Ola Fosheim Grostad via Digitalmars-d-learn
On Sunday, 6 December 2020 at 08:59:49 UTC, Ola Fosheim Grostad 
wrote:
Well, you could in theory avoid putting owning pointers on the 
stack/globals or require that they are registered as gc roots. 
Then you don't have to scan the stack. All you need then is 
write barriers. IIRC


And read barriers... I assume. However with single threaded 
incremental, write barriers should be enough.


Re: low-latency GC

2020-12-06 Thread Ola Fosheim Grostad via Digitalmars-d-learn

On Sunday, 6 December 2020 at 08:36:49 UTC, Bruce Carneal wrote:
Yes, but they don't allow low level programming. Go also 
freezes to sync threads; this has a rather profound impact on 
code generation. They have spent a lot of effort on sync 
instructions in code gen to lower the latency AFAIK.


So, much of the difficulty in bringing low-latency GC to dlang 
would be the large code gen changes required.  If it is a 
really big effort then that is all we need to know.  Not worth 
it until we can see a big payoff and have more resources.


Well, you could in theory avoid putting owning pointers on the 
stack/globals or require that they are registered as gc roots. 
Then you don't have to scan the stack. All you need then is write 
barriers. IIRC
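
A sketch of what "registered as gc roots" could look like, in C++ for illustration. `RootSet`, `Root` and `marked_from_registered_roots` are hypothetical names; the point is only that the marker starts from an explicit registry and never looks at the stack.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

struct Obj {
    bool marked = false;
    std::vector<Obj*> fields;
};

// Hypothetical registry: the collector only knows roots that were registered.
class RootSet {
    std::unordered_multiset<Obj*> roots_;

public:
    void add(Obj* o) { roots_.insert(o); }
    void remove(Obj* o) {
        auto it = roots_.find(o);
        if (it != roots_.end()) roots_.erase(it);   // drop one registration
    }

    // Marks everything reachable from the registered roots; returns how many
    // objects were newly marked. No stack scanning takes place.
    std::size_t mark() {
        std::vector<Obj*> work(roots_.begin(), roots_.end());
        std::size_t n = 0;
        while (!work.empty()) {
            Obj* o = work.back();
            work.pop_back();
            if (o == nullptr || o->marked) continue;
            o->marked = true;
            ++n;
            for (Obj* child : o->fields) work.push_back(child);
        }
        return n;
    }
};

// RAII handle: an owning pointer held on the stack registers itself as a root
// and unregisters when it goes out of scope.
class Root {
    RootSet& set_;
    Obj* obj_;

public:
    Root(RootSet& set, Obj* obj) : set_(set), obj_(obj) {
        assert(obj != nullptr);
        set_.add(obj_);
    }
    ~Root() { set_.remove(obj_); }
    Root(const Root&) = delete;
    Root& operator=(const Root&) = delete;
    Obj* get() const { return obj_; }
};

std::size_t marked_from_registered_roots() {
    RootSet roots;
    Obj a, b, c;
    a.fields.push_back(&b);
    Obj* unregistered = &c;      // a raw stack pointer the collector never sees
    (void)unregistered;
    Root r(roots, &a);
    return roots.mark();         // reaches a and b only
}
```

Here `c` is not marked even though a raw pointer to it sits on the stack, which is why such a scheme only works if owning pointers outside the registry are disallowed.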







Re: low-latency GC

2020-12-06 Thread Bruce Carneal via Digitalmars-d-learn
On Sunday, 6 December 2020 at 08:12:58 UTC, Ola Fosheim Grostad 
wrote:

On Sunday, 6 December 2020 at 07:45:17 UTC, Bruce Carneal wrote:
GCs scan memory, sure.  Lots of variations.  Not germane.  Not 
a rationale.


We need to freeze the threads when collecting stacks/globals.

OK.  Low latency GCs exist.



D is employed at multiple "levels".  Whatever level you call 
it, Go and modern JVMs employ low latency GCs in 
multi-threaded environments.  Some people would like to use D 
at that "level".


Yes, but they don't allow low level programming. Go also freezes 
to sync threads; this has a rather profound impact on code 
generation. They have spent a lot of effort on sync 
instructions in code gen to lower the latency AFAIK.


So, much of the difficulty in bringing low-latency GC to dlang 
would be the large code gen changes required.  If it is a really 
big effort then that is all we need to know.  Not worth it until 
we can see a big payoff and have more resources.




My question remains: how difficult would it be to bring such 
technology to D as a GC option?  Is it precluded somehow by 
the language?   Is it doable but quite a lot of effort because 
...?
 Is it no big deal once you have the GC itself because you 
only need xyz hooks? Is it ...?


Get rid of the system stack and globals. Use only closures and 
put in a restrictive memory model. Then maybe you can get a 
fully no freeze multi threaded GC.  That would be a different 
language.


It would be, but I don't think it is the only way to get lower 
latency GC.  That said, if the code gen effort you mentioned 
earlier is a big deal, then no need to speculate/examine further.




Also, I think Walter may have been concerned about read 
barrier overhead but, again, I'm looking for feasibility 
information.  What would it take to get something that we 
could compare?


Just add ARC + single threaded GC. And even that is quite 
expensive.


Thanks for the feedback.



Re: low-latency GC

2020-12-06 Thread Ola Fosheim Grostad via Digitalmars-d-learn

On Sunday, 6 December 2020 at 07:45:17 UTC, Bruce Carneal wrote:
GCs scan memory, sure.  Lots of variations.  Not germane.  Not 
a rationale.


We need to freeze the threads when collecting stacks/globals.

D is employed at multiple "levels".  Whatever level you call 
it, Go and modern JVMs employ low latency GCs in multi-threaded 
environments.  Some people would like to use D at that "level".


Yes, but they don't allow low level programming. Go also freezes 
to sync threads; this has a rather profound impact on code 
generation. They have spent a lot of effort on sync instructions 
in code gen to lower the latency AFAIK.


My question remains: how difficult would it be to bring such 
technology to D as a GC option?  Is it precluded somehow by the 
language?   Is it doable but quite a lot of effort because ...?
 Is it no big deal once you have the GC itself because you only 
need xyz hooks? Is it ...?


Get rid of the system stack and globals. Use only closures and 
put in a restrictive memory model. Then maybe you can get a fully 
no freeze multi threaded GC.  That would be a different language.


Also, I think Walter may have been concerned about read barrier 
overhead but, again, I'm looking for feasibility information.  
What would it take to get something that we could compare?


Just add ARC + single threaded GC. And even that is quite 
expensive.





Re: low-latency GC

2020-12-05 Thread Bruce Carneal via Digitalmars-d-learn
On Sunday, 6 December 2020 at 06:52:41 UTC, Ola Fosheim Grostad 
wrote:

On Sunday, 6 December 2020 at 05:41:05 UTC, Bruce Carneal wrote:
OK.  Some rationale?  Do you, for example, believe that 
no-probable-dlanger could benefit from a low-latency GC?  That 
it is too hard to implement?  That the language is somehow 
incompatible? That ...


The GC needs to scan all the affected call stacks before it can 
do incremental collection. Multi threaded GC is generally not 
compatible with low level programming.


GCs scan memory, sure.  Lots of variations.  Not germane.  Not a 
rationale.


D is employed at multiple "levels".  Whatever level you call it, 
Go and modern JVMs employ low latency GCs in multi-threaded 
environments.  Some people would like to use D at that "level".


My question remains: how difficult would it be to bring such 
technology to D as a GC option?  Is it precluded somehow by the 
language?   Is it doable but quite a lot of effort because ...?  
Is it no big deal once you have the GC itself because you only 
need xyz hooks? Is it ...?


Also, I think Walter may have been concerned about read barrier 
overhead but, again, I'm looking for feasibility information.  
What would it take to get something that we could compare?




Re: low-latency GC

2020-12-05 Thread Ola Fosheim Grostad via Digitalmars-d-learn

On Sunday, 6 December 2020 at 05:41:05 UTC, Bruce Carneal wrote:
OK.  Some rationale?  Do you, for example, believe that 
no-probable-dlanger could benefit from a low-latency GC?  That 
it is too hard to implement?  That the language is somehow 
incompatible? That ...


The GC needs to scan all the affected call stacks before it can 
do incremental collection. Multi threaded GC is generally not 
compatible with low level programming.





Re: low-latency GC

2020-12-05 Thread Bruce Carneal via Digitalmars-d-learn
On Sunday, 6 December 2020 at 05:29:37 UTC, Ola Fosheim Grostad 
wrote:

On Sunday, 6 December 2020 at 05:16:26 UTC, Bruce Carneal wrote:
How difficult would it be to add a, selectable, low-latency GC 
to dlang?


Is it closer to "we can't get there from here" or "no big deal 
if you already have the low-latency GC in hand"?


I've heard Walter mention performance issues (write barriers 
IIRC).  I'm also interested in the GC-flavor performance trade 
offs but here I'm just asking about feasibility.


The only reasonable option for D is single threaded GC or ARC.


OK.  Some rationale?  Do you, for example, believe that 
no-probable-dlanger could benefit from a low-latency GC?  That it 
is too hard to implement?  That the language is somehow 
incompatible? That ...




Re: low-latency GC

2020-12-05 Thread Ola Fosheim Grostad via Digitalmars-d-learn

On Sunday, 6 December 2020 at 05:16:26 UTC, Bruce Carneal wrote:
How difficult would it be to add a, selectable, low-latency GC 
to dlang?


Is it closer to "we can't get there from here" or "no big deal 
if you already have the low-latency GC in hand"?


I've heard Walter mention performance issues (write barriers 
IIRC).  I'm also interested in the GC-flavor performance trade 
offs but here I'm just asking about feasibility.


The only reasonable option for D is single threaded GC or ARC.






low-latency GC

2020-12-05 Thread Bruce Carneal via Digitalmars-d-learn
How difficult would it be to add a, selectable, low-latency GC to 
dlang?


Is it closer to "we can't get there from here" or "no big deal if 
you already have the low-latency GC in hand"?


I've heard Walter mention performance issues (write barriers 
IIRC).  I'm also interested in the GC-flavor performance trade 
offs but here I'm just asking about feasibility.




Re: Go’s march to low-latency GC

2016-07-06 Thread Jack Stouffer via Digitalmars-d-learn

On Wednesday, 6 July 2016 at 16:58:45 UTC, chmike wrote:

In case you missed it

https://blog.twitch.tv/gos-march-to-low-latency-gc-a6fa96f06eb7#.emwja62y1


This should have been posted in General.


Go’s march to low-latency GC

2016-07-06 Thread chmike via Digitalmars-d-learn

In case you missed it

https://blog.twitch.tv/gos-march-to-low-latency-gc-a6fa96f06eb7#.emwja62y1