Re: radical ideas about GC and ARC : need to be time driven?
On Thursday, 15 May 2014 at 12:28:47 UTC, Marc Schütz wrote: But as long as there can be false pointers, no matter how improbable, there can be no guaranteed destruction, which was my point. Maybe it becomes acceptable at very low probabilities, but it's still a gamble... A couple of unfreed resources should not cause the program to run out of resources. With a precise GC you can get memory leaks too; it's always a gamble, so the GC may or may not save you, but it still can.
Re: radical ideas about GC and ARC : need to be time driven?
On Thursday, 15 May 2014 at 12:56:13 UTC, Ola Fosheim Grøstad wrote: On Thursday, 15 May 2014 at 12:44:56 UTC, Marc Schütz wrote: On Wednesday, 14 May 2014 at 20:02:08 UTC, Ola Fosheim Grøstad wrote: However, you could have rules for collection and FFI (calling C). Like only allowing collection if all C parameters that point to GC memory have a shorter life span than other D pointers to the same memory (kind of like borrowed pointers in Rust). Some kind of lifetime annotation would be required for this. Not that this is a bad idea, but it will require some work... Isn't it sufficient to let the backend always push pointers that could point to GC memory onto the stack in functions that call C? You don't know what the C function does with them. `scope` can be used to tell the compiler that the function doesn't keep them after it returns. Of course the compiler can't verify it, but it could only allow GC pointers to be passed as scope arguments. And of course, it would need to know which pointers _are_ GC pointers in the first place. The easy solution is to define safe zones where you can freeze (kind of like rendezvous semaphores, but not quite). This helps with getting the registers on the stack, but we still need type information for them. Yes, but you have that in the description of the stack frame that you look up when doing precise collection? You need such stack frame identification utilities for doing exception handling too. Exception handling info is not detailed enough. It only contains addresses of cleanup code that needs to be called during stack unwinding, but nothing about the objects on the stack, AFAIK. Which of course requires type information. And existing unions need to be updated to implement this function. I guess sometimes it might not even be possible to implement it, because the state information is not present in the union itself. Then the compiler could complain or insist that you use a conservative GC.
Re: radical ideas about GC and ARC : need to be time driven?
On Wednesday, 14 May 2014 at 09:39:01 UTC, Marc Schütz wrote: RC is done by the object itself, so by definition it knows its own type, while the GC needs to be told about the type on allocation. AFAIK there is ongoing work to make this information available for non-class types. If you can unify RC at the binary level for any type, the GC can use that unification too: when you allocate the object, you have its type and can set up the structures needed to call the destructor. Well, it cannot be made 100% reliable by principle. That's just an inherent property of tracing GCs. The question is, can we define which uses of destructors are safe in this sense and which ones are not, and ideally find ways to detect unsafe uses at compile time... That's very much in the spirit of D: Something that looks right, should be right. If it is not, it should be rejected by the compiler. Does this suggest that if you slip a type with a destructor into your code, it will force everything to be refcounted?
Re: radical ideas about GC and ARC : need to be time driven?
On Wednesday, 14 May 2014 at 19:45:20 UTC, Marc Schütz wrote: - We have external code programmed in languages other than D, most prominently C and C++. These don't provide any type information, therefore the GC needs to handle their memory conservatively, which means there can be false pointers = no deterministic destruction. It's a very rare scenario to escape GC memory into foreign opaque data structures. Usually you don't see where your pointer goes, as a foreign API is usually completely opaque, so you have nothing to scan even if you have a precise GC. Sometimes a C API will notify your code when it releases your data; in other cases you can store your data in managed memory and release it after you release the foreign data structure. - Variables on the stack and in registers. In theory, the compiler could generate that information, or existing debug information might be used, but that's complicated for the GC to handle and will probably have runtime costs. I guess it's unlikely to happen. And of course, when we call a C function, we're lost again. A precise GC is needed to implement a moving GC; it's not needed to implement good memory management, at least on a 64-bit architecture. On a 32-bit architecture false pointers are possible when you have lots of data without pointers. This can be treated simply by allocating data without pointers (like strings) in unscanned blocks. The more valid pointers you have in your data, the smaller the probability of false pointers. The smaller the handle wrapper, the smaller the probability of a false pointer holding it. If you manage resources well, the probability of a handle leak gets even smaller (in C# it doesn't pose any notable difficulty even though you don't have any mechanism for eager resource management at all, only GC; in D you have a non-zero opportunity for eager resource management). All these small probabilities multiply, and you get an even smaller probability of an eventual resource leak.
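To illustrate the false-pointer argument above, here is a small C++ sketch (C++ used purely for illustration; the function names are hypothetical, not any real collector's API) of the core rule of conservative scanning: any machine word whose value happens to fall inside the managed heap's address range must be treated as a live pointer, because the scanner has no type information to rule it out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of conservative root scanning: a word is a "potential pointer"
// whenever its value lies inside the managed heap's address range.
bool looksLikePointer(std::uintptr_t word,
                      std::uintptr_t heapLo, std::uintptr_t heapHi) {
    return word >= heapLo && word < heapHi;
}

// Count potential pointers in an untyped block of words. A plain integer
// whose value lands in [heapLo, heapHi) is counted too -- that is exactly
// a false pointer keeping some object (and its destructor) alive.
std::size_t countConservativeHits(const std::vector<std::uintptr_t>& words,
                                  std::uintptr_t heapLo,
                                  std::uintptr_t heapHi) {
    std::size_t hits = 0;
    for (auto w : words)
        if (looksLikePointer(w, heapLo, heapHi)) ++hits;
    return hits;
}
```

On a 64-bit machine the address space is sparse, so a random integer rarely lands inside the heap range; on 32-bit the hit probability is much higher, which is the point being made above.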
Re: radical ideas about GC and ARC : need to be time driven?
On Thursday, 15 May 2014 at 07:27:41 UTC, Kagamin wrote: On Wednesday, 14 May 2014 at 09:39:01 UTC, Marc Schütz wrote: RC is done by the object itself, so by definition it knows its own type, while the GC needs to be told about the type on allocation. AFAIK there is ongoing work to make this information available for non-class types. If you can unify RC at the binary level for any type, the GC can use that unification too: when you allocate the object, you have its type and can set up the structures needed to call the destructor. Exactly. Well, it cannot be made 100% reliable by principle. That's just an inherent property of tracing GCs. The question is, can we define which uses of destructors are safe in this sense and which ones are not, and ideally find ways to detect unsafe uses at compile time... That's very much in the spirit of D: Something that looks right, should be right. If it is not, it should be rejected by the compiler. Does this suggest that if you slip a type with a destructor into your code, it will force everything to be refcounted? Hmm... that's probably too strict. There are often non-critical resources that need to be released on destruction, like a hypothetical String class which owns its data and which is itself allocated on the GC heap, because we don't need eager destruction for it. We'd want the data buffer to be released as soon as the String object is destroyed. This buffer might even be allocated on the C heap, so we cannot rely on the garbage collector to clean it up later. Is that a job for a finalizer?
Re: radical ideas about GC and ARC : need to be time driven?
On Thursday, 15 May 2014 at 12:05:27 UTC, Kagamin wrote: On Wednesday, 14 May 2014 at 19:45:20 UTC, Marc Schütz wrote: - We have external code programmed in languages other than D, most prominently C and C++. These don't provide any type information, therefore the GC needs to handle their memory conservatively, which means there can be false pointers = no deterministic destruction. It's a very rare scenario to escape GC memory into foreign opaque data structures. Usually you don't see where your pointer goes, as a foreign API is usually completely opaque, so you have nothing to scan even if you have a precise GC. Sometimes a C API will notify your code when it releases your data; in other cases you can store your data in managed memory and release it after you release the foreign data structure. Fair point. But can this be made safer? Currently you don't get any warning if a GC pointer escapes into C land. - Variables on the stack and in registers. In theory, the compiler could generate that information, or existing debug information might be used, but that's complicated for the GC to handle and will probably have runtime costs. I guess it's unlikely to happen. And of course, when we call a C function, we're lost again. A precise GC is needed to implement a moving GC; it's not needed to implement good memory management, at least on a 64-bit architecture. On a 32-bit architecture false pointers are possible when you have lots of data without pointers. This can be treated simply by allocating data without pointers (like strings) in unscanned blocks. The more valid pointers you have in your data, the smaller the probability of false pointers. The smaller the handle wrapper, the smaller the probability of a false pointer holding it.
If you manage resources well, the probability of a handle leak gets even smaller (in C# it doesn't pose any notable difficulty even though you don't have any mechanism for eager resource management at all, only GC; in D you have a non-zero opportunity for eager resource management). All these small probabilities multiply, and you get an even smaller probability of an eventual resource leak. But as long as there can be false pointers, no matter how improbable, there can be no guaranteed destruction, which was my point. Maybe it becomes acceptable at very low probabilities, but it's still a gamble...
Re: radical ideas about GC and ARC : need to be time driven?
On Wednesday, 14 May 2014 at 20:02:08 UTC, Ola Fosheim Grøstad wrote: On Wednesday, 14 May 2014 at 19:45:20 UTC, Marc Schütz wrote: - We have external code programmed in languages other than D, most prominently C and C++. These don't provide any type information, therefore the GC needs to handle their memory conservatively, which means there can be false pointers = no deterministic destruction. Oh yes, I agree. However, you could have rules for collection and FFI (calling C). Like only allowing collection if all C parameters that point to GC memory have a shorter life span than other D pointers to the same memory (kind of like borrowed pointers in Rust). Some kind of lifetime annotation would be required for this. Not that this is a bad idea, but it will require some work... - Variables on the stack and in registers. In theory, the compiler could generate that information, or existing debug information might be used, but that's complicated for the GC to handle and will probably have runtime costs. The easy solution is to define safe zones where you can freeze (kind of like rendezvous semaphores, but not quite). This helps with getting the registers on the stack, but we still need type information for them. - Untagged unions. The GC has no way to figure out which of the union fields is currently valid. If any of them is a pointer, it needs to treat them conservatively. So you need a function that can help the GC if the pointer fields of the union don't match up or don't point to class instances. Which of course requires type information. And existing unions need to be updated to implement this function. I guess sometimes it might not even be possible to implement it, because the state information is not present in the union itself.
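The untagged-union point can be sketched in C++ (a hypothetical illustration, not D's actual runtime interface). Without a discriminant, nothing in the object tells the collector whether the pointer member is live; a user-maintained tag plus a query method is exactly the kind of per-union helper function described above:

```cpp
#include <cassert>

// Untagged: nothing in the object says whether 'i' or 'p' is the live
// member, so even a precise GC would have to scan 'p' conservatively.
union Untagged {
    long  i;
    long* p;
};

// Tagged: the program maintains a discriminant, and a helper the GC
// could call reports whether the pointer member is currently valid.
struct Tagged {
    bool     holdsPointer;  // discriminant maintained by the program
    Untagged u;
    bool pointerIsLive() const { return holdsPointer; }
};
```

The catch noted in the post applies here too: if the program tracks the live member somewhere outside the union (or not at all), no such helper can be written, and the collector has to fall back to conservative treatment.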
Re: radical ideas about GC and ARC : need to be time driven?
On Thursday, 15 May 2014 at 12:44:56 UTC, Marc Schütz wrote: On Wednesday, 14 May 2014 at 20:02:08 UTC, Ola Fosheim Grøstad wrote: However, you could have rules for collection and FFI (calling C). Like only allowing collection if all C parameters that point to GC memory have a shorter life span than other D pointers to the same memory (kind of like borrowed pointers in Rust). Some kind of lifetime annotation would be required for this. Not that this is a bad idea, but it will require some work... Isn't it sufficient to let the backend always push pointers that could point to GC memory onto the stack in functions that call C? The easy solution is to define safe zones where you can freeze (kind of like rendezvous semaphores, but not quite). This helps with getting the registers on the stack, but we still need type information for them. Yes, but you have that in the description of the stack frame that you look up when doing precise collection? You need such stack frame identification utilities for doing exception handling too. Which of course requires type information. And existing unions need to be updated to implement this function. I guess sometimes it might not even be possible to implement it, because the state information is not present in the union itself. Then the compiler could complain or insist that you use a conservative GC.
Re: radical ideas about GC and ARC : need to be time driven?
On Tuesday, 13 May 2014 at 17:53:10 UTC, Marc Schütz wrote: Currently it isn't, because the GC sometimes lacks type information, e.g. for dynamic arrays. Will RC be guaranteed to always have type information? If it can, why can't the GC? If it can't, what's the difference? On Tuesday, 13 May 2014 at 18:07:42 UTC, Marc Schütz wrote: It's not (memory) unsafe because you cannot delete live objects accidentally, but it's unsafe because it leaks resources. Imagine a file object that relies on the destructor closing the file descriptor. You will quickly run out of FDs... It's the same situation in .NET, where the GC doesn't guarantee calling finalizers of arbitrary classes in all scenarios; they have to be special classes like SafeHandle, and resource handles are usually implemented by deriving from SafeHandle. Is it constructive to require that D's GC be better than the .NET GC?
Re: radical ideas about GC and ARC : need to be time driven?
On Wednesday, 14 May 2014 at 06:44:44 UTC, Kagamin wrote: On Tuesday, 13 May 2014 at 17:53:10 UTC, Marc Schütz wrote: Currently it isn't, because the GC sometimes lacks type information, e.g. for dynamic arrays. Will RC be guaranteed to always have type information? If it can, why can't the GC? If it can't, what's the difference? RC is done by the object itself, so by definition it knows its own type, while the GC needs to be told about the type on allocation. AFAIK there is ongoing work to make this information available for non-class types. On Tuesday, 13 May 2014 at 18:07:42 UTC, Marc Schütz wrote: It's not (memory) unsafe because you cannot delete live objects accidentally, but it's unsafe because it leaks resources. Imagine a file object that relies on the destructor closing the file descriptor. You will quickly run out of FDs... It's the same situation in .NET, where the GC doesn't guarantee calling finalizers of arbitrary classes in all scenarios; they have to be special classes like SafeHandle, and resource handles are usually implemented by deriving from SafeHandle. Is it constructive to require that D's GC be better than the .NET GC? Well, it cannot be made 100% reliable by principle. That's just an inherent property of tracing GCs. The question is, can we define which uses of destructors are safe in this sense and which ones are not, and ideally find ways to detect unsafe uses at compile time... That's very much in the spirit of D: Something that looks right, should be right. If it is not, it should be rejected by the compiler.
Re: radical ideas about GC and ARC : need to be time driven?
On Wednesday, 14 May 2014 at 09:39:01 UTC, Marc Schütz wrote: Well, it cannot be made 100% reliable by principle. That's just an inherent property of tracing GCs. I don't think this is true. Why is this an inherent property of tracing GCs?
Re: radical ideas about GC and ARC : need to be time driven?
On Wednesday, 14 May 2014 at 10:00:29 UTC, Ola Fosheim Grøstad wrote: On Wednesday, 14 May 2014 at 09:39:01 UTC, Marc Schütz wrote: Well, it cannot be made 100% reliable by principle. That's just an inherent property of tracing GCs. I don't think this is true. Why is this an inherent property of tracing GCs? You're right, theoretically it's possible. I was only considering the situation with D: - We have external code programmed in languages other than D, most prominently C and C++. These don't provide any type information, therefore the GC needs to handle their memory conservatively, which means there can be false pointers = no deterministic destruction. - Variables on the stack and in registers. In theory, the compiler could generate that information, or existing debug information might be used, but that's complicated for the GC to handle and will probably have runtime costs. I guess it's unlikely to happen. And of course, when we call a C function, we're lost again. - Untagged unions. The GC has no way to figure out which of the union fields is currently valid. If any of them is a pointer, it needs to treat them conservatively. There are probably other things...
Re: radical ideas about GC and ARC : need to be time driven?
On Wednesday, 14 May 2014 at 19:45:20 UTC, Marc Schütz wrote: - We have external code programmed in languages other than D, most prominently C and C++. These don't provide any type information, therefore the GC needs to handle their memory conservatively, which means there can be false pointers = no deterministic destruction. Oh yes, I agree. However, you could have rules for collection and FFI (calling C). Like only allowing collection if all C parameters that point to GC memory have a shorter life span than other D pointers to the same memory (kind of like borrowed pointers in Rust). - Variables on the stack and in registers. In theory, the compiler could generate that information, or existing debug information might be used, but that's complicated for the GC to handle and will probably have runtime costs. The easy solution is to define safe zones where you can freeze (kind of like rendezvous semaphores, but not quite). - Untagged unions. The GC has no way to figure out which of the union fields is currently valid. If any of them is a pointer, it needs to treat them conservatively. So you need a function that can help the GC if the pointer fields of the union don't match up or don't point to class instances. Ola.
Re: radical ideas about GC and ARC : need to be time driven?
On Mon, 12 May 2014 08:44:51 +, Marc Schütz schue...@gmx.net wrote: On Monday, 12 May 2014 at 04:22:21 UTC, Marco Leise wrote: On the positive side the talk about Rust, in particular how reference counted pointers decay to borrowed pointers, made me think the same could be done for our scope args. A reference counted slice with 3 machine words could decay to a 2 machine word scoped slice. Most of my code at least just works on the slices and doesn't keep a reference to them. A counter example is when you have something like an XML parser - a use case that D traditionally (see Tango) excelled in. The GC environment and slices make it possible to replace string copies with cheap slices into the original XML string. Rust also has a solution for this: They have lifetime annotations. D's scope could be extended to support something similar: scope(input) string getSlice(scope string input); or with methods: struct Xml { scope(this) string getSlice(); } scope(symbol) means this value references/aliases (parts of) the value referred to by symbol. The compiler can then make sure it is never assigned to variables with longer lifetimes than symbol. Crazy shit, now we are getting into concepts that I have no idea of how well they play in real code. There are no globals, but threads all create their own call stacks with independent lifetimes. So at that point lifetime annotations become interesting. -- Marco
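The owning-slice-to-borrowed-slice decay discussed above has a rough C++ analogue (a sketch of the idea, not the proposed D semantics): std::string_view is a two-word non-owning view that must not outlive the buffer it points into, which is precisely the constraint a scope(symbol) annotation would let the compiler check.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// The owning std::string plays the role of the reference-counted slice;
// the string_view is the two-word "borrowed" slice into it. Nothing here
// enforces that the view dies before the buffer -- adding that check is
// exactly what the lifetime annotation is for.
std::string_view getSlice(const std::string& xml,
                          std::size_t pos, std::size_t len) {
    return std::string_view{xml}.substr(pos, len);
}
```

This is the XML-parser use case: the view slices a tag body out of the original string without copying, at the cost of a dangling reference if the buffer is freed first.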
Re: radical ideas about GC and ARC : need to be time driven?
On 13 May 2014 14:39, Kagamin via Digitalmars-d digitalmars-d@puremagic.com wrote: On Saturday, 10 May 2014 at 19:17:02 UTC, Xavier Bigand wrote: My concerns as a Dlang user are: - Even if GC is the solution, how long do I need to suffer with destructor issues (call order)? What issues do you have with destructors and how do they affect you? - When will we be able to see a performant GC implementation that can satisfy someone like Manu :) ? Months, years, a decade? Neither GC nor the C heap will satisfy Manu's requirements. When it comes to shooters, the only way is to not allocate and write accurate code, even in C++. Even substitution of the allocator won't help him if the code relies on GC in a non-trivial way. I'm not quite sure what you're saying, but I don't think it's quite as stringent as you suggest, at least, not anymore. We do try to minimise allocations, but some dynamic memory usage is just a modern reality. There is only a single requirement I see for automatic memory management to be practical, and it is very simple: time associated with whatever memory management needs to be fine-grained and evenly distributed. It needs to be amortised in some way. This obviously lends itself to eager-freeing systems like ARC. If a GC can be made that is reasonably nonintrusive; like one that is decently concurrent, ideally incremental in some way, maybe has split pools, where a high-frequency/small-allocation/temporary pool (i.e., runtime temp data, closures, strings) may clean up very quickly without resulting in full memory scans... maybe it's possible. I don't know. The problem is, it's been years, and nobody seems to know how to do it. It's still not clear that would be acceptable, and I have no reason to believe such a GC would be higher performance than ARC anyway... but I'm more than happy to be surprised, if someone really thinks it can be done. The other topic is still relevant to me too however (and many others). We still need to solve the problem with destructors.
I agree with Andrei, they should be removed from the language as they are. You can't offer destructors if they don't get called. And the usefulness of destructors is seriously compromised if you can't rely on them being executed eagerly. Without eagerly executed destructors, in many situations, you end up effectively manually releasing the object anyway (*cough* C#), and that implies manually maintaining knowledge of lifetime/end of life and calling some release. I see memory management as a moot offering as soon as that reality exists. If we do end up with a GC that is somehow acceptable (I'm still skeptical), then this discussion about what to do with destructors is still ongoing. Do we ARC just those objects that have destructors like Andrei suggested? It's a possibility, I can't think of any other solution. In lieu of any other solution, it sounds like we could very well end up with ARC tech available one way or another, even if it's not pervasive, just applied implicitly to things with destructors.
Re: radical ideas about GC and ARC : need to be time driven?
On Monday, 12 May 2014 at 04:22:21 UTC, Marco Leise wrote: On the positive side the talk about Rust, in particular how reference counted pointers decay to borrowed pointers, made me think the same could be done for our scope args. A reference counted slice with 3 machine words could decay to a 2 machine word scoped slice. Most of my code at least just works on the slices and doesn't keep a reference to them. I wouldn't mind banning slices on the heap, but what D needs is to ban having pointers to internal data outlive allocation base pointers. I think that is a bad practice anyway and consider it to be a bug. If you can establish that constraint then you can avoid tracing pointers to non-aligned addresses: if ((addr & MASK) == 0) trace... You could also statically annotate pointers as known to be a guaranteed allocation base pointer or as known to be traced already (e.g. borrowed pointers in Rust). A counter example is when you have something like an XML parser - a use case that D traditionally (see Tango) excelled in. The GC environment and slices make it possible to replace string copies with cheap slices into the original XML string. As pointed out by others, this won't work for XML. It will work for some binary formats, but you usually want to map a struct onto the data (or copy) anyway. I have little need for slices on the heap... I'd much rather have it limited to registers (conceptually) if that means faster GC.
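The mask test above can be written out concretely. In this C++ sketch, ALIGN is an assumed allocation granularity for the sake of illustration, not anything D's GC actually guarantees:

```cpp
#include <cassert>
#include <cstdint>

// If every allocation is aligned to ALIGN bytes and interior pointers are
// banned, a word can only be an allocation base pointer when its low bits
// are zero -- any other value can be skipped without tracing.
constexpr std::uintptr_t ALIGN = 16;        // assumed allocation alignment
constexpr std::uintptr_t MASK  = ALIGN - 1;

bool mayBeBasePointer(std::uintptr_t addr) {
    return (addr & MASK) == 0;
}
```

This is why the constraint matters: it cheaply filters out most integer values that would otherwise look like pointers to a conservative scanner.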
Re: radical ideas about GC and ARC : need to be time driven?
On Tuesday, 13 May 2014 at 07:12:02 UTC, Marco Leise wrote: On Mon, 12 May 2014 08:44:51 +, Marc Schütz schue...@gmx.net wrote: On Monday, 12 May 2014 at 04:22:21 UTC, Marco Leise wrote: On the positive side the talk about Rust, in particular how reference counted pointers decay to borrowed pointers, made me think the same could be done for our scope args. A reference counted slice with 3 machine words could decay to a 2 machine word scoped slice. Most of my code at least just works on the slices and doesn't keep a reference to them. A counter example is when you have something like an XML parser - a use case that D traditionally (see Tango) excelled in. The GC environment and slices make it possible to replace string copies with cheap slices into the original XML string. Rust also has a solution for this: They have lifetime annotations. D's scope could be extended to support something similar: scope(input) string getSlice(scope string input); or with methods: struct Xml { scope(this) string getSlice(); } scope(symbol) means this value references/aliases (parts of) the value referred to by symbol. The compiler can then make sure it is never assigned to variables with longer lifetimes than symbol. Crazy shit, now we are getting into concepts that I have no idea of how well they play in real code. There are no globals, but threads all create their own call stacks with independent lifetimes. So at that point lifetime annotations become interesting. I don't really know a lot about Rust, but I believe this is not an issue with Rust, as its variables are only thread-local. You can send things to other threads, but then they become inaccessible in the current thread. In general, lifetime annotations can only be used for simple relationships. It's also not a way to keep objects alive as long as they are referenced, but rather a way to disallow references to exist longer than the objects they point to.
Re: radical ideas about GC and ARC : need to be time driven?
On Tuesday, 13 May 2014 at 07:42:26 UTC, Manu via Digitalmars-d wrote: The other topic is still relevant to me too however (and many others). We still need to solve the problem with destructors. I agree with Andrei, they should be removed from the language as they are. You can't offer destructors if they don't get called. Andrei only said they are sometimes not called, not never, so we can guarantee destructor calls where they can be guaranteed. And the usefulness of destructors is seriously compromised if you can't rely on them being executed eagerly. Without eagerly executed destructors, in many situations, you end up effectively manually releasing the object anyway (*cough* C#), and that implies manually maintaining knowledge of lifetime/end of life and calling some release. I use finalizers in C#; they're useful. I understand it's a popular misunderstanding that GC must work like RAII. But the GC manages only its own resources, not yours. It can manage memory without RAII, and it does so. Speaking about eager resource management, we have Unique and RefCounted in Phobos; in fact, files are already managed that way. What's the problem?
Re: radical ideas about GC and ARC : need to be time driven?
On Tuesday, 13 May 2014 at 07:42:26 UTC, Manu via Digitalmars-d wrote: Do we ARC just those objects that have destructors like Andrei suggested? It's a possibility, I can't think of any other solution. In lieu of any other solution, it sounds like we could very well end up with ARC tech available one way or another, even if it's not pervasive, just applied implicitly to things with destructors. BTW, I don't see how ARC would be more able to call destructors than a GC. If ARC can call a destructor, so can the GC. Where's the difference?
Re: radical ideas about GC and ARC : need to be time driven?
On 13 May 2014 21:42, Kagamin via Digitalmars-d digitalmars-d@puremagic.com wrote: On Tuesday, 13 May 2014 at 07:42:26 UTC, Manu via Digitalmars-d wrote: The other topic is still relevant to me too however (and many others). We still need to solve the problem with destructors. I agree with Andrei, they should be removed from the language as they are. You can't offer destructors if they don't get called. Andrei only said they are sometimes not called, not never, so we can guarantee destructor calls where they can be guaranteed. ... what? And the usefulness of destructors is seriously compromised if you can't rely on them being executed eagerly. Without eagerly executed destructors, in many situations, you end up effectively manually releasing the object anyway (*cough* C#), and that implies manually maintaining knowledge of lifetime/end of life and calling some release. I use finalizers in C#; they're useful. I understand it's a popular misunderstanding that GC must work like RAII. But the GC manages only its own resources, not yours. It can manage memory without RAII, and it does so. Speaking about eager resource management, we have Unique and RefCounted in Phobos; in fact, files are already managed that way. What's the problem? It completely undermines the point. If you're prepared to call finalise, then you might as well call free... Every single detail required to perform full manual memory management is required to use finalise correctly. I see absolutely no point in a GC when used with objects that require you to manually call finalise anyway. Do we ARC just those objects that have destructors like Andrei suggested? It's a possibility, I can't think of any other solution. In lieu of any other solution, it sounds like we could very well end up with ARC tech available one way or another, even if it's not pervasive, just applied implicitly to things with destructors. BTW, I don't see how ARC would be more able to call destructors than a GC.
If ARC can call a destructor, so can the GC. Where's the difference? ARC release is eager. It's extremely common that destructors either expect to be called eagerly, or rely on proper destruction ordering. Otherwise you end up with finalise again, read: unsafe manual memory management :/
Re: radical ideas about GC and ARC : need to be time driven?
On 13/05/14 13:46, Kagamin wrote: BTW, I don't see how ARC would be more able to call destructors, than GC. If ARC can call destructor, so can GC. Where's the difference? The GC will only call destructors when it deletes an object, i.e. when it runs a collection. There's no guarantee that a collection will happen. With ARC, as soon as a reference goes out of scope it's decremented. If the reference count then goes to zero it will call the destructor and delete the object. -- /Jacob Carlborg
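C++'s shared_ptr behaves like ARC in this respect, so the difference described above is easy to demonstrate: the destructor runs at the exact point the last reference disappears, with no collection cycle involved (Resource and demoEagerRelease are just illustrative names):

```cpp
#include <cassert>
#include <memory>

// A resource whose destructor records that cleanup happened.
struct Resource {
    bool* destroyed;
    explicit Resource(bool* flag) : destroyed(flag) {}
    ~Resource() { *destroyed = true; }  // runs eagerly under refcounting
};

bool demoEagerRelease() {
    bool destroyed = false;
    {
        auto r  = std::make_shared<Resource>(&destroyed);
        auto r2 = r;          // refcount is now 2; the object stays alive
        // 'destroyed' is still false here
    }                         // both references gone -> destructor runs NOW
    return destroyed;         // true immediately, no collection needed
}
```

Under a tracing GC, the equivalent destructor would run at some unspecified later collection, or possibly never if no collection happens before the program exits.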
Re: radical ideas about GC and ARC : need to be time driven?
On Tuesday, 13 May 2014 at 13:21:04 UTC, Jacob Carlborg wrote: The GC will only call destructors when it deletes an object, i.e. when it runs a collection. There's no guarantee that a collection will happen. Ah, so when the GC collects an object, it calls the destructor. It sounded as if it's not guaranteed at all.
Re: radical ideas about GC and ARC : need to be time driven?
On Tuesday, 13 May 2014 at 12:18:06 UTC, Manu via Digitalmars-d wrote: It completely undermines the point. If you're prepared to call finalise, then you might as well call free... Every single detail required to perform full manual memory management is required to use finalise correctly. I see absolutely no point in a GC when used with objects that require you to manually call finalise anyway. Well, the GC doesn't run immediately, so you can't do eager resource management with it. The GC manages memory, not other resources, and lots of people do see a point in it: Java and C# are industry-quality technologies in wide use. ARC release is eager. It's extremely common that destructors either expect to be called eagerly, or rely on proper destruction ordering. Otherwise you end up with finalise again, read: unsafe manual memory management :/ No language will figure out all algorithms for you, but this looks like a rare scenario: for example, kernel objects don't require ordered destruction. The finalizer will be called when the GC collects the object; it's a last-resort cleanup, but it's not as unsafe as it used to be.
Re: radical ideas about GC and ARC : need to be time driven?
On Tuesday, 13 May 2014 at 14:46:18 UTC, Kagamin wrote: On Tuesday, 13 May 2014 at 13:21:04 UTC, Jacob Carlborg wrote: The GC will only call destructors when it deletes an object, i.e. when it runs a collection. There's no guarantee that a collection will happen. Ah, so when the GC collects an object, it calls the destructor. It sounded as if it's not guaranteed at all. Currently it isn't, because the GC sometimes lacks type information, e.g. for dynamic arrays.
Re: radical ideas about GC and ARC : need to be time driven?
On Tuesday, 13 May 2014 at 14:59:42 UTC, Kagamin wrote: [...] The finalizer will be called when the GC collects the object; it's a last-resort cleanup, but it's not as unsafe as it used to be. It's not (memory-)unsafe because you cannot accidentally delete live objects, but it's unsafe because it leaks resources. Imagine a file object that relies on the destructor closing the file descriptor. You will quickly run out of FDs... I only see two use cases for finalizers (as opposed to destructors): 1.) Releasing manually allocated objects (or even ARC objects) that belong to the finalized object, i.e. releasing dependent objects. This, of course, _must not_ involve critical external resources like FDs or temporary files. 2.) Implementing weak references.
Re: radical ideas about GC and ARC : need to be time driven?
On Monday, 12 May 2014 at 04:22:21 UTC, Marco Leise wrote: On the positive side, the talk about Rust, in particular how reference-counted pointers decay to borrowed pointers, made me think the same could be done for our scope args. A reference-counted slice with 3 machine words could decay to a 2-machine-word scoped slice. Most of my code at least just works on the slices and doesn't keep a reference to them. A counter example is when you have something like an XML parser - a use case that D traditionally (see Tango) excelled in. The GC environment and slices make it possible to replace string copies with cheap slices into the original XML string. Rust also has a solution for this: they have lifetime annotations. D's scope could be extended to support something similar: scope(input) string getSlice(scope string input); or with methods: struct Xml { scope(this) string getSlice(); } scope(symbol) means that this value references/aliases (parts of) the value referred to by symbol. The compiler can then make sure it is never assigned to variables with longer lifetimes than symbol.
Re: radical ideas about GC and ARC : need to be time driven?
On Sunday, 11 May 2014 at 05:16:26 UTC, Paulo Pinto wrote: This is what java.lang.ref.ReferenceQueue is for in Java, but one needs to be a GC expert to use it; otherwise it will hinder the GC's work. I think all memory-partitioning-related performance requires expert knowledge. If people care about performance and reliability, they have to accept that they cannot blindly use abstractions or throw everything into the same bag. Java is probably a good example of how unrealistic it is to have a general programming language that does reasonably well in most domains. The outcome has not been everybody under the Sun umbrella, but a wide variety of Java runtime solutions and special systems.
Re: radical ideas about GC and ARC : need to be time driven?
Le 12/05/2014 06:26, Marco Leise a écrit : [...] On the positive side, the talk about Rust, in particular how reference-counted pointers decay to borrowed pointers, made me think the same could be done for our scope args. A reference-counted slice with 3 machine words could decay to a 2-machine-word scoped slice. Most of my code at least just works on the slices and doesn't keep a reference to them. A counter example is when you have something like an XML parser - a use case that D traditionally (see Tango) excelled in. The GC environment and slices make it possible to replace string copies with cheap slices into the original XML string.
I don't really understand why there is no parser with something like slices in a language without a GC. Isn't it possible to put the array in a more global place, so that the parser API uses two indexes instead of the buffer as a parameter?
Re: radical ideas about GC and ARC : need to be time driven?
On Saturday, 10 May 2014 at 19:17:02 UTC, Xavier Bigand wrote: My concerns as a Dlang user are : - Even if the GC is the solution, how long do I need to suffer with destructor issues (call order)? What issues do you have with destructors and how do they affect you? - When will we be able to see a performant GC implementation that can satisfy someone like Manu :) ? Months, years, a decade? Neither the GC nor the C heap will satisfy Manu's requirements. When it comes to shooters, the only way is to not allocate and to write accurate code, even in C++. Even substituting the allocator won't help him if the code relies on the GC in a non-trivial way.
Re: radical ideas about GC and ARC : need to be time driven?
On Monday, 12 May 2014 at 21:54:51 UTC, Xavier Bigand wrote: I don't really understand why there is no parser with something like slices in a language without a GC. Isn't it possible to put the array in a more global place, so that the parser API uses two indexes instead of the buffer as a parameter? Slices are counterproductive if you want to provide a standards-compliant XML implementation, i.e. unescape strings. It also requires more memory to hold the entire XML document, and you can't collect nodes which become unused. Usually XML parsers use a string table to reuse all repetitive strings in the XML, reducing memory requirements.
Re: radical ideas about GC and ARC : need to be time driven?
On Saturday, 10 May 2014 at 19:41:15 UTC, H. S. Teoh wrote: On Sat, May 10, 2014 at 09:16:54PM +0200, Xavier Bigand wrote: Why these questions : - Memory management seems to be one of the last (or the last) critical points for vast adoption in production. [...] Nah, it's just the thing that gets complained about the most. There are other big issues that need to be fixed: compatibility with handheld architectures, completing the implementation of @safe, fixing the holes in the type system (esp. w.r.t. const/immutable), issues with 'shared', the AA implementation. +1000! :-P --- Paolo
Re: radical ideas about GC and ARC : need to be time driven?
Am Sun, 11 May 2014 14:52:50 +1000 schrieb Manu via Digitalmars-d digitalmars-d@puremagic.com: On 11 May 2014 05:39, H. S. Teoh via Digitalmars-d digitalmars-d@puremagic.com wrote: On Sat, May 10, 2014 at 09:16:54PM +0200, Xavier Bigand via Digitalmars-d wrote: - Same question if D migrate to ARC? I highly doubt D will migrate to ARC. ARC will probably become *possible*, but some language features fundamentally rely on the GC, and I can't see how that will ever be changed. Which ones are incompatible with ARC? Pass-by-value slices as 2 machine words -- Marco
Re: radical ideas about GC and ARC : need to be time driven?
On 12 May 2014 02:38, Marco Leise via Digitalmars-d digitalmars-d@puremagic.com wrote: Am Sun, 11 May 2014 14:52:50 +1000 schrieb Manu via Digitalmars-d digitalmars-d@puremagic.com: On 11 May 2014 05:39, H. S. Teoh via Digitalmars-d digitalmars-d@puremagic.com wrote: On Sat, May 10, 2014 at 09:16:54PM +0200, Xavier Bigand via Digitalmars-d wrote: - Same question if D migrates to ARC? I highly doubt D will migrate to ARC. ARC will probably become *possible*, but some language features fundamentally rely on the GC, and I can't see how that will ever be changed. Which ones are incompatible with ARC? Pass-by-value slices as 2 machine words 64-bit pointers are only 40-48 bits, so there are 32 wasted bits for an offset... and if the base pointer is 32-byte aligned (all allocated memory is aligned), then you can reclaim another 5 bits there... I think saving an arg register would probably be worth a shift. 32-bit pointers... not so lucky :/ video game consoles though have bugger all memory, so heaps of spare bits in the pointers! :P
Re: radical ideas about GC and ARC : need to be time driven?
On Sunday, 11 May 2014 at 17:36:44 UTC, Manu via Digitalmars-d wrote: [...] 64-bit pointers are only 40-48 bits, so there are 32 wasted bits for an offset... and if the base pointer is 32-byte aligned (all allocated memory is aligned), then you can reclaim another 5 bits there... I think saving an arg register would probably be worth a shift. 32-bit pointers... not so lucky :/ video game consoles though have bugger all memory, so heaps of spare bits in the pointers! :P I thought x86_64 pointers are sign-extended? Or does that just apply to real memory pointers and not virtual memory pointers?
Re: radical ideas about GC and ARC : need to be time driven?
Am Mon, 12 May 2014 03:36:34 +1000 schrieb Manu via Digitalmars-d digitalmars-d@puremagic.com: [...] 32-bit pointers... not so lucky :/ video game consoles though have bugger all memory, so heaps of spare bits in the pointers! :P And remember how people abused the high bit in 32-bit until kernels were modified to support the full address space, and the Windows world got that LARGE_ADDRESS_AWARE flag to mark executables that do not gamble with the high bit. On the positive side, the talk about Rust, in particular how reference-counted pointers decay to borrowed pointers, made me think the same could be done for our scope args. A reference-counted slice with 3 machine words could decay to a 2-machine-word scoped slice. Most of my code at least just works on the slices and doesn't keep a reference to them. A counter example is when you have something like an XML parser - a use case that D traditionally (see Tango) excelled in. The GC environment and slices make it possible to replace string copies with cheap slices into the original XML string. -- Marco
Re: radical ideas about GC and ARC : need to be time driven?
On Sunday, 11 May 2014 at 20:18:23 UTC, luminousone wrote: I thought x86_64 pointers are sign extended? or does that just apply to real memory pointers and not virtual memory pointers? With typed slices you can use one 64 bit base pointer and two 32 bit indexes instead of two pointers.
Re: radical ideas about GC and ARC : need to be time driven?
On Sat, May 10, 2014 at 09:16:54PM +0200, Xavier Bigand via Digitalmars-d wrote: [...] My concerns as a Dlang user are : - Even if the GC is the solution, how long do I need to suffer with destructor issues (call order)? Dtor calling order and GC are fundamentally incompatible. I don't think this will ever be changed. The problem is, how do you guarantee that the GC will only clean up garbage in the order of reference? You can't do this without killing GC performance. - When will we be able to see a performant GC implementation that can satisfy someone like Manu :) ? Months, years, a decade? I think somebody is working on porting a D1 concurrent GC to D2, so hopefully that will be done sometime in the near future... But I don't know if that's enough to satisfy Manu. His requirements are pretty high. :) - Same question if D migrates to ARC? I highly doubt D will migrate to ARC. ARC will probably become *possible*, but some language features fundamentally rely on the GC, and I can't see how that will ever be changed. Why these questions : - Memory management seems to be one of the last (or the last) critical points for vast adoption in production. [...] Nah, it's just the thing that gets complained about the most. There are other big issues that need to be fixed: compatibility with handheld architectures, completing the implementation of @safe, fixing the holes in the type system (esp. w.r.t. const/immutable), issues with 'shared', the AA implementation. I'm sure people can come up with many other big items that need to be addressed. T -- Mediocrity has been pushed to extremes.
Re: radical ideas about GC and ARC : need to be time driven?
I think the reason people ask about improving the GC so frequently is that it's not clear when any potential future improvements will come around. Perhaps things would become clearer to those concerned if they were told who is working on the GC, roughly when they can expect to see certain improvements, and so on. I think the recent work Walter has been doing on the @nogc attribute is valuable, and it will be a major contributor to reducing the number of undue allocations in the standard library. While this will not address quality-of-implementation issues with the garbage collector, it will certainly reduce the impact of performance problems caused by pause times. Culprits will be easier to track down and eliminate with the new attribute and its semantics.
Re: radical ideas about GC and ARC : need to be time driven?
On Saturday, 10 May 2014 at 19:41:15 UTC, H. S. Teoh via Digitalmars-d wrote: On Sat, May 10, 2014 at 09:16:54PM +0200, Xavier Bigand via Digitalmars-d wrote: [...] My concerns as a Dlang user are : - Even if the GC is the solution, how long do I need to suffer with destructor issues (call order)? Dtor calling order and GC are fundamentally incompatible. I don't think this will ever be changed. The problem is, how do you guarantee that the GC will only clean up garbage in the order of reference? You can't do this without killing GC performance. You can build a queue of root nodes in terms of parent-child ownership if you have parent backpointers. That allows you to separate scanning from releasing. You can then release when idle using a priority queue. You can optimize scanning by tracing parent pointers first, then marking parent-child trees as live when hitting roots, using extra data structures and meta-information (assuming the tree has no external pointers below the root). It has language and runtime consequences, but I doubt it will kill performance. (I don't think it belongs in a system-level language, though.)
Re: radical ideas about GC and ARC : need to be time driven?
On Saturday, 10 May 2014 at 19:17:02 UTC, Xavier Bigand wrote: - When will we be able to see a performant GC implementation that can satisfy someone like Manu :) ? Months, years, a decade? Never; Manu wants to do fine-granularity allocations. - Same question if D migrates to ARC? You would probably also need to use an autorelease pool for realtime callbacks and add whole-program analysis to avoid excessive ref counting. For DQuick we completely lose our motivation because of the lack of clear plans on decisions in the language that have a deep impact on this project. Yes, evolutionary drift is a key problem. Not having worked out the core memory model before adding language features is also a problem. Don't forget the ecosystem around the language will also take years to grow before becoming really interesting. For applications, yes. But I don't think that is true for a system-level language... A solid language and compiler with minimal runtime requirements would cut it. In my view, C++ fails to evolve because of conservative decisions; Apple, with Clang, is working on include removal. I'd prefer to see the C++ community take the decision to make it standard even if it breaks the language. I would be sad to see D do the same. D is limiting itself by: 1. Requiring a C/C++-compatible runtime for all threads. You could do better by having some threads with no FFI. 2. Limiting breaking changes for a development branch... 3. Having 3 different backends. Language design does affect the IR/backend if you want performance. Rewriting 3 backends is... not realistic. If Phobos becomes allocation-free or uses allocators, then it will be easier to migrate from one memory management scheme to another. Phobos is a non-issue; language constructs and the runtime are the primary issue. (You can have a real-time library.) In conclusion, would it be possible to migrate to ARC if the time gap in comparison to an ideal GC implementation is on the order of many years or a decade?
No, because it still won't perform well with multithreading, and by that time CPUs will support transactional memory. RC kills transactions by doing writes. Isolates are easy to do, but they are not performant.
Re: radical ideas about GC and ARC : need to be time driven?
On 11 May 2014 05:39, H. S. Teoh via Digitalmars-d digitalmars-d@puremagic.com wrote: On Sat, May 10, 2014 at 09:16:54PM +0200, Xavier Bigand via Digitalmars-d wrote: - Same question if D migrates to ARC? I highly doubt D will migrate to ARC. ARC will probably become *possible*, but some language features fundamentally rely on the GC, and I can't see how that will ever be changed. Which ones are incompatible with ARC?
Re: radical ideas about GC and ARC : need to be time driven?
Am 11.05.2014 03:31, schrieb Ola Fosheim Grøstad ola.fosheim.grostad+dl...@gmail.com: [...] You can build a queue of root nodes in terms of parent-child ownership if you have parent backpointers. That allows you to separate scanning from releasing. You can then release when idle using a priority queue. This is what java.lang.ref.ReferenceQueue is for in Java, but one needs to be a GC expert to use it; otherwise it will hinder the GC's work. -- Paulo