Re: Yet another optparse
On Wednesday, 10 January 2007 at 03:57:58 UTC, Kirk McDonald wrote: Knowing that D already has (by my count) three command-line argument parsers, I have gone and written my own, anyway. As with at least one other of the parsers that I've seen, it is (at least loosely) based on Python's optparse library. You can find it here: http://dsource.org/projects/pyd/browser/misc/optparse.d An example of its use can be found here: http://dsource.org/projects/pyd/browser/misc/opttest.d This code does not compile with the current version of phobos. Most updates are straightforward except for one loop using an old version of find. Has anyone out there updated this old module? If so, I would find it useful. Please let me know if it's considered bad form on this forum to revive old (in this case ancient) threads. Thanks -- Chris
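For anyone attempting the same update: the exact loop in optparse.d isn't shown here, but the usual migration for old Phobos `find` (which returned an index) looks roughly like this sketch, using today's `std.string.indexOf` and `std.algorithm.searching.find`:

```d
import std.string : indexOf;
import std.algorithm.searching : find;

void main()
{
    string args = "--verbose --output=log.txt";

    // Old Phobos: int i = find(args, "--output");  // returned an index, -1 if absent
    // Modern equivalent when an index is wanted:
    auto i = args.indexOf("--output");
    assert(i == 10);

    // Modern std.algorithm find returns the rest of the range instead:
    auto rest = args.find("--output");
    assert(rest == "--output=log.txt");
}
```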
Re: GC experiments. Writing my own GC.
Existing GC code: 15700ms (average) My GC code: 500ms (average) Congratulations! Can you share your good work with us? Or an exe? dll? Thank you. Frank
Re: Next step on reference counting topics
On 12/05/14 21:00, Andrei Alexandrescu wrote: There's been a lot of talk lately regarding improving resource management for D, and I'd like to figure the next logical step to take. It seems clear that we have reached a collective impasse on a few fundamentals, and that more just talk about it all is at the point of diminishing returns. One action item that is hopefully useful to people of all viewpoints is to double down on library support, and see how far we can get and what insights we collect from the experience. For that I'm proposing we start real work toward a state-of-the-art std.refcounted module. It would include adapters for class, array, and pointer types, and should inform language improvements for qualifiers (i.e. the tail-const problem), copy elision, literals, operators, and such. Perhaps, as has already been started, sprinkle Phobos with output ranges and/or allocators. -- /Jacob Carlborg
Re: Some simple ideas about GC
On Monday, 12 May 2014 at 23:44:09 UTC, Andrei Alexandrescu wrote: I'll keep those with which std.allocator is likely to help: - The current GC code is not hackable. First rewrite then improve. - A testable and more modular rewrite (using recent D practices) would encourage more contribution and is necessary for experimentation. I think std.allocator is some 15 work-hours from reviewable form, and std.typed_allocator (with tracing and all) some 50 more work-hours. Unfortunately these numbers grow due to fragmentation - and OMG I made a pun too. Let's hope it doesn't become a fractal :). There are already some existing allocators, e.g. vibe.d. If you make it possible to try out the allocator, report bugs and contribute fixes, this should help to polish the implementation. You could do this by moving your work to a separate repo and registering a dub package, instead of using a phobos branch. I haven't yet looked at typed_allocator, but the heap layers concept is just about right for a GC rewrite. Maybe we'll use multiple specialized GCs in the future, instead of one generic GC.
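To make the heap-layers idea concrete, here is a rough sketch of allocator composition, written with names from the std.experimental.allocator API that eventually shipped (not yet available at the time of this post): small requests are served from a free list layered on a backing allocator.

```d
// Composition in the "heap layers" style: a FreeList handles all requests
// up to 128 bytes, delegating cache misses to Mallocator underneath.
import std.experimental.allocator.building_blocks.free_list : FreeList;
import std.experimental.allocator.mallocator : Mallocator;

void main()
{
    FreeList!(Mallocator, 0, 128) smallPool;

    void[] a = smallPool.allocate(64);   // falls in the free-list range
    assert(a.length == 64);
    smallPool.deallocate(a);             // block goes back on the free list
    void[] b = smallPool.allocate(64);   // can be served from the free list
    assert(b.length == 64);
}
```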
Re: borrowed pointers vs ref
On Tuesday, 13 May 2014 at 04:46:41 UTC, Russel Winder via Digitalmars-d wrote: On Tue, 2014-05-13 at 04:07 +, logicchains via Digitalmars-d wrote: […] This sounds a bit like an 'issue' of sorts that Rust has with borrowed pointers, where certain types of datastructures cannot be written without resorting to the 'unsafe' parts of the language. The solution they've adopted is having such code written in libraries so that the user doesn't have to mess around with 'unsafe'. Probably re-finding many of the things people have to use sun.misc.Unsafe for on the JVM. Which is why the Java designers are looking at how to make Unsafe an official package as of Java 9. They also did a survey a few months ago about how Unsafe is being used in major Java projects. -- Paulo
Re: More radical ideas about gc and reference counting
On 13.05.2014 00:15, Martin Nowak wrote: On 05/11/2014 08:18 PM, Rainer Schuetze wrote: 1. Use a scheme that takes a snapshot of the heap, stack and registers at the moment of collection and does the actual collection in another thread/process while the application can continue to run. This is the way Leandro Lucarella's concurrent GC works (http://dconf.org/2013/talks/lucarella.html), but it relies on "fork", which doesn't exist on every OS/architecture. A manual copy of the memory won't scale to very large memory, though it might be compressed to possible pointers. Worst case it will need twice as much memory as the current heap. There is a problem with this scheme: copy-on-write is extremely expensive when a mutation happens. That's one page fault (context switch) + copying a whole page + mapping the new page. I agree that this might be critical, but it is a one-time cost per page. It seems unrealistic to do this with user-mode exceptions, but the OS should have this optimized pretty well. As for "It's much worse with huge pages (2MB page size)": how common are huge pages nowadays?
Re: More radical ideas about gc and reference counting
On 12.05.2014 13:53, "Marc Schütz" wrote: I'm surprised that you didn't include: 3. Thread-local GC, isolated zones (restricting where references to objects of a particular heap can be placed), exempting certain threads from GC completely, ... This comes up from time to time, but to me it is very blurry how this can work in reality. Considering how "shared" is supposed to be used to be useful (do some locking, then cast away "shared"), there is no guarantee by the language that any object is actually thread-local (no references from other threads). And immutable data (e.g. strings) is shared by design.
Re: radical ideas about GC and ARC : need to be time driven?
On Monday, 12 May 2014 at 21:54:51 UTC, Xavier Bigand wrote: I don't really understand why there is no parser using something like slices in a language without GC. Couldn't the array be put in a more global place, so that the parser API takes 2 indexes instead of the buffer as a parameter? Slices are counterproductive if you want to provide a standard-compliant xml implementation, i.e. unescape strings. It also requires more memory to hold the entire xml document, and nodes that become unused can't be collected. Usually xml parsers use a string table to reuse all repetitive strings in the xml, reducing memory requirements.
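The string-table idea mentioned above is just string interning. A minimal sketch (internTable and intern are hypothetical names, not part of any real parser): repeated tag/attribute names end up sharing one copy, so the original document buffer can be released.

```d
// One owned copy per distinct string; subsequent lookups return the
// already-interned copy instead of keeping a slice into the document.
string[string] internTable;

string intern(string s)
{
    if (auto p = s in internTable)
        return *p;
    auto copy = s.idup;      // detach from the document buffer
    internTable[copy] = copy;
    return copy;
}

void main()
{
    string doc = "<item/><item/><item/>";
    auto a = intern(doc[1 .. 5]);   // "item"
    auto b = intern(doc[8 .. 12]);  // "item" again
    assert(a is b);                 // same memory, not just equal content
}
```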
Re: borrowed pointers vs ref
On Tue, 2014-05-13 at 04:07 +, logicchains via Digitalmars-d wrote:
[…]
> This sounds a bit like an 'issue' of sorts that Rust has with borrowed pointers, where certain types of datastructures cannot be written without resorting to the 'unsafe' parts of the language. The solution they've adopted is having such code written in libraries so that the user doesn't have to mess around with 'unsafe'.

Probably re-finding many of the things people have to use sun.misc.Unsafe for on the JVM.
--
Russel.
=============================================================================
Dr Russel Winder      t: +44 20 7585 2200   voip: sip:russel.win...@ekiga.net
41 Buckmaster Road    m: +44 7770 465 077   xmpp: rus...@winder.org.uk
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder
GC experiments. Writing my own GC.
Hi all, As a learning exercise I've just been doing some experimenting with rewriting the garbage collection code, and thought I might share some of the initial results. I only program as a hobby these days, and I'm certainly no expert, but I thought some people might find it interesting. My interest started because I wrote an LR(1) file parser in D. I then multi-threaded the application so multiple files could be parsed simultaneously. (Disk IO was all on one thread.) To my surprise, the throughput dropped significantly. I could process the files a lot faster using only one thread. It turns out the delays were due to my liberal use of the "new" statement. Rather than block-allocate the memory I thought I would just hack the GC. So I cleared out gc.d in the runtime and started again. The basic plan was to make it more multi-thread friendly. Already I have learnt quite a lot. There are a number of things I would do differently if I tried another rewrite. However, so far it has been a good learning experience. Currently, allocations are all working, and the mark phase is running. It still will not sweep and free the memory. All memory allocations are entirely lock-free (using CAS instructions). So a pre-empted thread will never block another. For allocations of less than 128 bytes, each thread is allocated memory from its own memory pool to avoid false sharing in the CPU's cache. The collector component runs on a background thread using a mark-and-sweep algorithm which is basically the same as the existing algorithm. Currently the thread will wake up every 100ms and decide if a collection should be performed. An emergency collection will run in the foreground if a memory allocation fails during that period. The mark phase needs to stop the world. The sweeping portion of the collection will run in the background.
This is similar to the current implementation, as the world is restarted after the mark phase; however, the thread doing the collection will not allocate the requested memory to the calling thread until after the sweep has completed. This means that single-threaded applications always wait for the full garbage collection cycle. So far allocation speed seems to have improved. I can't test collection speed as it's not complete. As a test I wrote a simple function that allocates a linked list of 2 million items. This function is then spawned by 20 threads. This test script is shown below. Timing for allocation (with GC disabled) is as follows. (Using DMD 2.065) Existing GC code: 15700ms (average) My GC code: 500ms (average) When performing the same amount of allocations on a single thread, the new code is still slightly faster than the old. What this demonstrates is that the locking mechanism in the current GC code is a huge overhead for multi-threaded applications that perform a lot of memory allocations (i.e. use the "new" operator or dynamic arrays). It would be nice to see the default GC and memory allocator improved. There is certainly room for improvement on the allocator end, which may mask some of the performance issues associated with garbage collection. In the future I think D needs to look at making collection precise. It would not be too hard to adjust the mark-and-sweep GC to be nearly precise. The language needs to support precise GC before things like moving garbage collection become feasible. Anyway, I just thought I'd share the results of my experimenting. I would be happy to make the code available in a few weeks' time. Perhaps someone might find it useful. I need to get it finished and tested first. :-) Cheers!
Adam

--

//Test script that generated these results:
import std.stdio;
import std.datetime;
import std.concurrency;
import core.memory;

class LinkedList {
    long value = 0;
    LinkedList next;
}

shared int threadCount = 0;

void main() {
    core.memory.GC.disable();
    auto start = Clock.currSystemTick();
    foreach (i; 0 .. 20) {
        auto tid = spawn(&doSomething, thisTid);
        threadCount++;
    }
    while (threadCount > 0) {}
    auto ln = Clock.currSystemTick() - start;
    writeln(ln.msecs, "ms");
}

void doSomething(Tid tid) {
    auto top = new LinkedList;
    auto recent = top;
    // Create the linked list.
    foreach (i; 1 .. 2_000_000) {
        auto newList = new LinkedList;
        newList.value = i;
        recent.next = newList;
        recent = newList;
    }
    // Sum the values. (Just spends some time walking the memory.)
    recent = top;
    long total = 0;
    while (recent !is null) {
        total += recent.value;
        recent = recent.next;
    }
    writeln("Total : ", total);
    threadCount--;
}
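The lock-free allocation scheme described in the post (CAS-based allocation, no locks) might look roughly like this simplified bump-allocator sketch; this is an illustration of the technique, not Adam's actual code, and the per-thread pools and real GC bookkeeping are elided:

```d
import core.atomic : atomicLoad, cas;

shared size_t next;                 // next free offset into the pool
__gshared ubyte[1 << 20] pool;      // pre-reserved 1 MiB pool

void* tryAllocate(size_t bytes)
{
    for (;;)
    {
        size_t old = atomicLoad(next);
        size_t newOff = old + bytes;
        if (newOff > pool.length)
            return null;            // pool exhausted: would trigger a collection

        // CAS retries only if another thread allocated concurrently;
        // a pre-empted thread can never block this one.
        if (cas(&next, old, newOff))
            return pool.ptr + old;
    }
}

void main()
{
    auto p = tryAllocate(128);
    assert(p !is null);
    auto q = tryAllocate(128);
    assert(q !is null && q !is p); // distinct, non-overlapping blocks
}
```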
Re: radical ideas about GC and ARC : need to be time driven?
On Saturday, 10 May 2014 at 19:17:02 UTC, Xavier Bigand wrote: My concerns as Dlang user are : - Even if GC is the solution, how long I need suffer with destructor's issues (calls order)? What issues do you have with destructors and how they affect you? - When we will able to see a performant GC implementation can satisfy someone like Manu :) ? Months, years, a decade? Neither GC nor C heap will satisfy Manu's requirements. When it comes to shooters, the only way is to not allocate and write accurate code, even in C++. Even substitution of allocator won't help him, if the code relies on GC in a non-trivial way.
Re: borrowed pointers vs ref
On 13 May 2014 06:36, Walter Bright via Digitalmars-d wrote:
> It's been brought up more than once that the 'scope' storage class is an unimplemented borrowed pointer. But thinking a bit more along those lines, actually 'ref' fills the role of a borrowed pointer.
>
> One particularly apropos behavior is that struct member functions pass 'this' by ref, meaning that members can be called without the inc/dec millstone.
>
> ref is still incomplete as far as this goes, but we can go the extra distance with it, and then it will be of great help in supporting any ref counting solution.
>
> What it doesn't work very well with are class references. But Andrei suggested that we can focus the use of 'scope' to deal with that in an analogous way.
>
> What do you think?
>
> Anyone want to enumerate a list of the current deficiencies of 'ref' in regards to this, so we can think about solving it?

I agree; finishing scope appears to deserve a priority boost. It would be enabling for a lot of developments in D to have reliable escape analysis. It seems more problematic to repurpose ref than to finish scope, though. ref would change meaning quite significantly. ref would probably have to become part of the type (I can imagine needs for overloads arising?). You would need to be able to make ref locals, and ref members of structs, so you can do useful work with them. You'd need to be able to create an array of 'ref's. I think by-value scope still has some value too: a small struct that's passed by value (like slices) may contain a pointer, and you shouldn't need to handle that small struct by reference when you really just wanted to attribute it with scope. I never saw any problems with the scope idea as it stood, and I think ref is still useful in its existing incarnation, the same way that it's useful in C++, i.e. a pointer that must be initialised and hides reassignment and offset/indexing semantics (which can often interfere with generic code).
extern(C++) would gain a new problem if ref were repurposed.
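For concreteness, here is a small illustration of the 'ref' gaps listed above, as they stood against compilers of roughly the 2.065 era (the rejected forms are shown commented out):

```d
int g;

ref int getG() { return g; }    // ref returns already work

void main()
{
    getG() = 5;                  // assignment through the returned ref
    assert(g == 5);

    // Each of these is rejected by the compiler:
    // ref int local = getG();   // no ref locals
    // struct S { ref int m; }   // no ref members of structs
    // ref int[4] refs;          // no arrays of refs
}
```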
Re: borrowed pointers vs ref
On Monday, 12 May 2014 at 21:15:38 UTC, Steven Schveighoffer wrote: Hm... the one piece that I think would be damaging is to not be able to take an address of the 'this' reference. It's probably OK to just use pointers and static functions in some cases, but member functions do not have that luxury. In other words, operators. Big example would be a doubly linked list used with ~=. -Steve This sounds a bit like an 'issue' of sorts that Rust has with borrowed pointers, where certain types of datastructures cannot be written without resorting to the 'unsafe' parts of the language. The solution they've adopted is having such code written in libraries so that the user doesn't have to mess around with 'unsafe'.
Re: Some simple ideas about GC
On Monday, 12 May 2014 at 23:09:07 UTC, Martin Nowak wrote: - The current GC code is not hackable. First rewrite then improve. I believe this might be one of the bigger factors in why we still (3 years later) do not have a GC that allows allocation during finalization. Allocation during finalization isn't an overly ambitious project, but between the ugliness of the GC code and the low demand for this feature, it is unlikely that somebody will tackle it.
Re: More radical ideas about gc and reference counting
On 5/12/2014 2:28 PM, Xavier Bigand wrote: All the compile-time features of D are marvelous. These, together with the language being less error-prone, make me want D. I am not sure I need safety so much. It's nice, but not mandatory for any of my projects. The only one which has to be safe is DQuick. Safety becomes a big concern when you're developing code as part of a team.
Re: Some simple ideas about GC
On 5/12/14, 4:09 PM, Martin Nowak wrote: I'd like to share some thoughts on improving D's GC, nothing radically different though. A few observations I'll keep those with which std.allocator is likely to help: - The current GC code is not hackable. First rewrite then improve. - A testable and more modular rewrite (using recent D practices) would encourage more contribution and is necessary for experimentation. I think std.allocator is some 15 work-hours from reviewable form, and std.typed_allocator (with tracing and all) some 50 more work-hours. Unfortunately these numbers grow due to fragmentation - and OMG I made a pun too. Andrei
Re: FYI - mo' work on std.allocator
On 5/12/14, 3:58 PM, Brian Schott wrote: On Sunday, 27 April 2014 at 05:43:07 UTC, Andrei Alexandrescu wrote: Added SbrkRegion, SimpleBlocklist, and Blocklist. http://erdani.com/d/phobos-prerelease/std_allocator.html#.SbrkRegion http://erdani.com/d/phobos-prerelease/std_allocator.html#.SimpleBlocklist http://erdani.com/d/phobos-prerelease/std_allocator.html#.Blocklist https://github.com/andralex/phobos/blob/allocator/std/allocator.d Destruction is as always welcome. I plan to get into tracing tomorrow morning. Andrei
Some comments on the version currently checked in (916032a0b6a76b6e37169121ee5cc680bb40b4c4):
- Line 3173: b2 is unused
- Line 3177: b3 is unused
- Line 3496: tids is unused
- Line 3512: b is unused
- Line 4235: r2 is never used. This one is probably a bug.
- Line 5426: b2 is never used
- Line 5958: b is unused
- Line 6337: alloc2 is unused. Looks like another bug
Will fix. Awesome, thanks! -- Andrei
Re: 64-bit DMD for windows?
On 5/12/2014 5:01 PM, Andrej Mitrovic via Digitalmars-d wrote: On 5/12/14, Nick Sabalausky via Digitalmars-d wrote: You don't need a 64-bit version: Compiling 64-bit programs doesn't require a 64-bit compiler. Just install VC++, use the DMD 2.065 Win installer, and then toss in the -m64 flag when compiling. Works fine. Doesn't matter if DMD itself is 32-bit. As Vladimir in IRC reminded me, there is one use-case: You may need it for some intensive CTFE stuff (excessive memory allocations and no freeing by the compiler). That is, if you need more than 3/4 gigs. Right, there's certainly that. But that has nothing to do with whether you're trying to build a 64-bit or 32-bit program, and (at least for Windows) it isn't even an issue at all unless you actually are hitting that limit (unlikely for a newcomer to D). It sounded like steven kladitis was worried about just being able to create 64-bit programs. For that, it makes no difference if the compiler itself is a 32- or 64-bit build.
Some simple ideas about GC
I'd like to share some thoughts on improving D's GC, nothing radically different though. A few observations:
- Pause times seem to be a much bigger problem than CPU usage or memory bandwidth. Focus on reducing the pause times.
- The GC code is already fairly optimized, so there is very low profitability in small-scale code optimizations. Improve the algorithms, not the code.
- The current GC code is not hackable. First rewrite, then improve.
And corresponding ideas:
- Marking could be parallelized, sweeping should be done in the background, and the GC could serve allocations during the sweep from a separate pool (e.g. thread-local).
- The current GC does a lot of bookkeeping work due to how the pools are organized (heterogeneous bin sizes). I suspect (but don't know) that there are big gains in organizing this differently.
- A testable and more modular rewrite (using recent D practices) would encourage more contribution and is necessary for experimentation.
-Martin
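The parallel-marking idea can be sketched as follows; a toy illustration only (the real GC's object graph, mark bits, and work stealing are elided), splitting the root set across the task pool:

```d
import std.parallelism : parallel;

struct Node { Node*[] children; bool marked; }

// Sequential tracing from one root. The roots here lead to disjoint
// subgraphs, so the workers never race; a real collector would need
// atomic mark bits (or per-worker work lists) for shared subgraphs.
void mark(Node* n)
{
    if (n is null || n.marked) return;
    n.marked = true;
    foreach (c; n.children) mark(c);
}

void main()
{
    auto roots = new Node*[](8);
    foreach (ref r; roots)
    {
        r = new Node;
        r.children ~= new Node;   // give each root a small subgraph
    }

    // Each task-pool worker traces its share of the root set.
    foreach (r; parallel(roots))
        mark(r);

    foreach (r; roots)
        assert(r.marked && r.children[0].marked);
}
```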
Re: FYI - mo' work on std.allocator
On Sunday, 27 April 2014 at 05:43:07 UTC, Andrei Alexandrescu wrote: Added SbrkRegion, SimpleBlocklist, and Blocklist. http://erdani.com/d/phobos-prerelease/std_allocator.html#.SbrkRegion http://erdani.com/d/phobos-prerelease/std_allocator.html#.SimpleBlocklist http://erdani.com/d/phobos-prerelease/std_allocator.html#.Blocklist https://github.com/andralex/phobos/blob/allocator/std/allocator.d Destruction is as always welcome. I plan to get into tracing tomorrow morning. Andrei
Some comments on the version currently checked in (916032a0b6a76b6e37169121ee5cc680bb40b4c4):
- Line 3173: b2 is unused
- Line 3177: b3 is unused
- Line 3496: tids is unused
- Line 3512: b is unused
- Line 4235: r2 is never used. This one is probably a bug.
- Line 5426: b2 is never used
- Line 5958: b is unused
- Line 6337: alloc2 is unused. Looks like another bug
Re: More radical ideas about gc and reference counting
On 5/12/2014 2:32 PM, Steven Schveighoffer wrote: It's still forbidden. Andrei wrote a template that will verify this at runtime, but I don't recall its name. Can you cite the spec where it says it's forbidden? Forgotten templates are not a convincing argument. Regardless, Java can use a moving GC, and allows self references. The idea that self references prevent a moving GC is simply false. If you think about it a bit, you will understand why. I see this is not specified in the documentation. Not sure what happened here, but I'll have to think about it.
Re: More radical ideas about gc and reference counting
On Monday, 12 May 2014 at 22:27:06 UTC, Kapps wrote: because it's so short. This is quite sufficient for most projects, but perhaps could be tweaked a bit more for certain aspects like gaming, possibly even enabling concurrent collection for generation 0/1, but I'm not sure if this works well or is feasible. Still, the important thing is to get a good general one to use first, like the default one .NET uses for workstation applications. I agree that getting a good (100% precise) GC is an important first step. I am not so sure about generation-based GC when you have a window on a world map that you move around, which is roughly FIFO (first in, first out). But to get good speed I think you are better off having multiple pools that can be released with no collection when a network connection drops (if you have one conceptual pool per connection), and optimized allocators that give you pre-initialized objects, etc. In an ideal world all of this is transparent once you have specified your memory model (in detail), so you only have to issue a "new PlayerConnection" in the main logic of your program and can tweak the memory handling elsewhere. That is not the D way, from what I can tell from the forum posts so far, because "new" is going to stay tied to one global GC heap. So you have to write utility functions… which makes programs less legible.
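The pool-per-connection idea can be sketched with a trivial bump region whose release is a single O(1) operation; Region here is a hypothetical helper written for illustration, not a Phobos type, and PlayerConnection data is stand-in:

```d
// All of a connection's allocations come from its own region; when the
// connection drops, the whole region is released with no collection.
struct Region
{
    ubyte[] buf;
    size_t used;

    void[] allocate(size_t n)
    {
        if (used + n > buf.length) return null;
        auto p = buf[used .. used + n];
        used += n;
        return p;
    }

    void releaseAll() { used = 0; }   // frees everything at once
}

void main()
{
    auto region = Region(new ubyte[](4096));
    auto conn = region.allocate(256);  // stand-in for PlayerConnection data
    assert(conn !is null);
    region.releaseAll();               // connection dropped: O(1) cleanup
    assert(region.used == 0);
}
```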
Re: borrowed pointers vs ref
On 5/12/2014 2:15 PM, Steven Schveighoffer wrote: Hm... the one piece that I think would be damaging is to not be able to take an address of the 'this' reference. It's probably OK to just use pointers and static functions in some cases, but member functions do not have that luxury. In other words, operators. Big example would be a doubly linked list used with ~=. @trusted/@system code will be able to take the address of a ref.
Re: More radical ideas about gc and reference counting
On Monday, 12 May 2014 at 19:13:50 UTC, Ola Fosheim Grøstad wrote: On Monday, 12 May 2014 at 18:07:51 UTC, Kapps wrote: Depending on how tunable the GC is, I feel like it should be possible to get away with a GC even for soft real-time programs like games. Even if you manage to make it work for game clients you also should make it work for low latency game servers, as code sharing is an important advantage. What a game/world server requires differs a lot, but highly dynamic and flexible worlds have to keep the physics to a single node (or tight cluster) for a region. That means you want to have as many players as possible tied to that node. In essence you want both performance, low latency, reliability, and little overhead in an evolutionary context (it has to support heavy modification over time). My gut feeling is that a runtime satisfying one game design will not satisfy another one as long as one insists on one global GC. In essence, it will never really work well. IMO, the same goes for ARC since RC does not perform well with multi-threading even when you use near optimal patterns and strategies. If ARC is only to be used where speed does not matter then you might as well use shared_ptr. .NET allows configuring the garbage collector by specifying workstation (concurrent, background [allow generation 0/1 collection while a generation 2 collection is going], one primary heap and a large object heap) or server (not certain if concurrent/background, but multiple heaps that get handled in parallel during collections). Or in situations where you have many processes running at once, disabling concurrent collection to reduce context switching overhead. In reality, most people leave the default concurrent collector, which is what I'd hope the default for D would be, but if it was sufficiently tunable something like vibe.d could decide to go with something more similar to what .NET uses for servers (which ASP.NET uses by default). 
I haven't been able to find good concrete numbers online, but the few sources I've found say that generation 0/1 collection tends to take <1 to 2-3 milliseconds and is not run concurrently because it's so short. This is quite sufficient for most projects, but perhaps could be tweaked a bit more for certain aspects like gaming, possibly even enabling concurrent collection for generation 0/1, but I'm not sure if this works well or is feasible. Still, the important thing is to get a good general one to use first, like the default one .NET uses for workstation applications.
Re: More radical ideas about gc and reference counting
On 05/11/2014 08:18 PM, Rainer Schuetze wrote: 1. Use a scheme that takes a snapshot of the heap, stack and registers at the moment of collection and does the actual collection in another thread/process while the application can continue to run. This is the way Leandro Lucarella's concurrent GC works (http://dconf.org/2013/talks/lucarella.html), but it relies on "fork", which doesn't exist on every OS/architecture. A manual copy of the memory won't scale to very large memory, though it might be compressed to possible pointers. Worst case it will need twice as much memory as the current heap. There is a problem with this scheme: copy-on-write is extremely expensive when a mutation happens. That's one page fault (context switch) + copying a whole page + mapping the new page. It's much worse with huge pages (2MB page size).
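The fork-based snapshot scheme can be sketched as below (POSIX only; markSnapshot is a hypothetical placeholder): the child process sees a copy-on-write view of the heap frozen at fork time, so the parent keeps mutating while the child marks.

```d
import core.sys.posix.unistd : fork, _exit;
import core.sys.posix.sys.wait : waitpid;

void collectConcurrently()
{
    auto pid = fork();          // child gets a COW snapshot of the heap
    if (pid == 0)
    {
        // markSnapshot();      // hypothetical: trace the frozen heap image;
        //                      // parent mutations hit COW page copies,
        //                      // never this view
        _exit(0);
    }
    // The parent continues running. A real collector would reap the child
    // asynchronously and free what it reported unreachable, rather than
    // blocking here.
    int status;
    waitpid(pid, &status, 0);
}

void main()
{
    collectConcurrently();
}
```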
Re: radical ideas about GC and ARC : need to be time driven?
On 12/05/2014 06:26, Marco Leise wrote: On Mon, 12 May 2014 03:36:34 +1000, Manu via Digitalmars-d wrote: On 12 May 2014 02:38, Marco Leise via Digitalmars-d wrote: On Sun, 11 May 2014 14:52:50 +1000, Manu via Digitalmars-d wrote: On 11 May 2014 05:39, H. S. Teoh via Digitalmars-d wrote: On Sat, May 10, 2014 at 09:16:54PM +0200, Xavier Bigand via Digitalmars-d wrote: - Same question if D migrates to ARC? I highly doubt D will migrate to ARC. ARC will probably become *possible*, but some language features fundamentally rely on the GC, and I can't see how that will ever be changed. Which ones are incompatible with ARC? Pass-by-value slices as 2 machine words. 64-bit pointers are only 40-48 bits, so there are 32 bits of waste for an offset... and if the base pointer is 32-byte aligned (all allocated memory is aligned), then you can reclaim another 5 bits there... I think saving an arg register would probably be worth a shift. 32-bit pointers... not so lucky :/ Video game consoles though have bugger all memory, so heaps of spare bits in the pointers! :P And remember how people abused the high bit in 32-bit until kernels were modified to support the full address space and the Windows world got that LARGE_ADDRESS_AWARE flag to mark executables that do not gamble with the high bit. On the positive side, the talk about Rust, in particular how reference counted pointers decay to borrowed pointers, made me think the same could be done for our "scope" args. A reference counted slice with 3 machine words could decay to a 2 machine word "scoped" slice. Most of my code at least just works on the slices and doesn't keep a reference to them. A counter example is when you have something like an XML parser - a use case that D traditionally (see Tango) excelled in. The GC environment and slices make it possible to replace string copies with cheap slices into the original XML string. I don't really understand why there is no parser using something like slices in a language without GC.
Couldn't the array be put in a more global place, so that the parser API takes 2 indexes instead of the buffer as a parameter?
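For readers wondering what the "pass-by-value slice" in this subthread actually is: a D slice is just (length, pointer), two machine words, so slicing the XML buffer never copies. A quick demonstration:

```d
void main()
{
    // A dynamic array/slice is exactly two machine words: length + pointer.
    static assert(string.sizeof == 2 * size_t.sizeof);

    string xml = "<name>value</name>";
    string inner = xml[6 .. 11];        // "value": shares memory with xml
    assert(inner == "value");
    assert(inner.ptr == xml.ptr + 6);   // no copy was made
}
```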
Re: More radical ideas about gc and reference counting
On Monday, 12 May 2014 at 21:22:09 UTC, Steven Schveighoffer wrote: On Mon, 12 May 2014 14:14:28 -0400, Ola Fosheim Grøstad wrote: On Monday, 12 May 2014 at 17:52:18 UTC, Walter Bright wrote: On 5/12/2014 7:46 AM, Steven Schveighoffer wrote: pointing at it is roughly 1/256. This problem is just about eliminated with 64-bit pointers. Not generally true. This presumes that the heap is not in the lower region of the address space and that you don't use 64 bit ints on the stack. I was thinking in terms of purely a random number happening to point at heap data. Practically speaking, I don't know the true likelihood based on the heap address scheme of 64-bit OSes, but Wicked topic. In AMD64 mode hi-mem is usually reserved for kernel etc. Traditionally the unixy heap grew from low towards high addresses: http://en.wikipedia.org/wiki/Sbrk But that is legacy. I think mmap is it… :-P And layout is randomized to reduce the effect of buffer overflow etc. :-( I know that we always have a complainer who will try and do an array-append test on 32-bit code, and end up exhausting memory unexpectedly. Uhuh. Not focusing on precise collection gets ugly.
Re: More radical ideas about gc and reference counting
On Mon, 12 May 2014 17:32:09 -0400, Steven Schveighoffer wrote: The workaround is simply to keep it around, but that's not always a scalable solution. Sorry, actually you can free it. That's the correct workaround. -Steve
Re: More radical ideas about gc and reference counting
On Mon, 12 May 2014 13:52:20 -0400, Walter Bright wrote: On 5/12/2014 7:46 AM, Steven Schveighoffer wrote: It doesn't matter where the false pointers are. The largest issue with false pointers is not how many false pointers there are. It only matters how large the block is that it "points" at. The larger your blocks get, the more likely they are "pointed" at by the stack. On 32-bit systems, allocate 1/256th of your memory space (i.e. 16.5MB), and the likelihood of random data on the stack pointing at it is roughly 1/256. This problem is just about eliminated with 64-bit pointers. Generally, it is a bad idea to allocate such large blocks on the GC heap. GC's work best when the size of the objects being allocated is very small relative to the size of the heap space. Fortunately, it's a mathematical inevitability that large allocations relative to the GC size are rare, and so it isn't much of a pain to handle them manually. The issue arises when one allocates such a large block for temporary use repeatedly, but expects it to be collected between allocations. The consequences are extremely disastrous. The workaround is simply to keep it around, but that's not always a scalable solution. And in fact, even if it's forbidden, "requires" is too strong a word -- there is no static or runtime prevention of this. It's still forbidden. Andrei wrote a template that will verify this at runtime, but I don't recall its name. Can you cite the spec where it says it's forbidden? Forgotten templates are not a convincing argument. Regardless, Java can use a moving GC, and allows self references. The idea that self references prevent a moving GC is simply false. If you think about it a bit, you will understand why. -Steve
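Spelling out the 1/256 arithmetic from the exchange above (using a 16 MiB block, i.e. 1/256th of a 4 GiB address space; the post's "16.5MB" is the same ballpark figure):

```d
void main()
{
    enum blockSize = 1UL << 24;              // 16 MiB block on the GC heap
    enum space32   = 1UL << 32;              // 32-bit address space

    // A uniformly random word "points" into the block with chance ~1/256.
    assert(space32 / blockSize == 256);

    // With 64-bit pointers (48 usable address bits is typical), the same
    // block covers a vanishing fraction of the space.
    enum space64 = 1UL << 48;
    assert(space64 / blockSize == 1UL << 24); // ~1 in 16.7 million
}
```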
Re: More radical ideas about gc and reference counting
On 12/05/2014 19:14, Dicebot wrote: On Monday, 12 May 2014 at 17:03:41 UTC, Manu via Digitalmars-d wrote: But D is *so close*... and I like it! >_< I have to say that this discussion has certainly left me somewhat intrigued by Rust though. I've never given it a fair go because I find the syntax so distasteful and deterring. I wonder if there's a market for a Rust fork that re-skins the language ;) Right now D has the practical benefit of being more stable and library-rich. But switching to Rust eventually does seem tempting, as I find the foundations of their type system much closer to my beliefs about "good coding practices". It lacks any good static reflection though. And this stuff is damn addictive once you have tried it at D's caliber. All the compile-time things of D are marvelous. This, together with the compile-time features and a less error-prone language, makes me want D. I am not sure I need safety so much. It's nice but not mandatory for any of my projects. The only one which has to be safe is DQuick.
Re: borrowed pointers vs ref
On Monday, 12 May 2014 at 20:36:10 UTC, Walter Bright wrote: It's been brought up more than once that the 'scope' storage class is an unimplemented borrowed pointer. But thinking a bit more along those lines, actually 'ref' fills the role of a borrowed pointer. One particularly apropos behavior is that struct member functions pass 'this' by ref, meaning that members can be called without the inc/dec millstone. ref is still incomplete as far as this goes, but we can go the extra distance with it, and then it will be of great help in supporting any ref counting solution. What it doesn't work very well with are class references. But Andrei suggested that we can focus the use of 'scope' to deal with that in an analogous way. What do you think? Anyone want to enumerate a list of the current deficiencies of 'ref' in regards to this, so we can think about solving it? I would prefer 'scope ref', which would allow the solution for classes and everything else to be unified, i.e. everything uses scope. When it comes to the implicit 'this' by ref, it could be redefined to pass by scope ref. Another reason is: I know this doesn't (and might never) work in D, but based on the intuitive meaning of 'ref' I fully expected the example below to work when I first started learning the language. struct A { ref int a_m; this(ref int a) { a_m = a; } } 'scope', on the other hand, is self-documenting imho.
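[For comparison, the struct above can be approximated in current D by storing a pointer and exposing it through a ref-returning member; a minimal sketch, not a substitute for real ref members or 'scope ref' — nothing stops the pointer from outliving the referent:]

```d
// What works in current D: store a pointer, expose it as a ref.
// A sketch only; the caller must keep 'a' alive, since escaping
// the address of a ref parameter is not checked today.
struct A
{
    private int* a_m;
    this(ref int a) { a_m = &a; }
    ref int value() { return *a_m; }
}

unittest
{
    int x = 1;
    auto a = A(x);
    a.value = 42;     // writes through the stored pointer
    assert(x == 42);
}
```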
Re: borrowed pointers vs ref
On 05/12/2014 10:36 PM, Walter Bright wrote: It's been brought up more than once that the 'scope' storage class is an unimplemented borrowed pointer. But thinking a bit more along those lines, actually 'ref' fills the role of a borrowed pointer. One particularly apropos behavior is that struct member functions pass 'this' by ref, meaning that members can be called without the inc/dec millstone. ref is still incomplete as far as this goes, but we can go the extra distance with it, and then it will be of great help in supporting any ref counting solution. What it doesn't work very well with are class references. But Andrei suggested that we can focus the use of 'scope' to deal with that in an analogous way. What do you think? I think everything should be treated uniformly. But a storage class is not sufficient. Anyone want to enumerate a list of the current deficiencies of 'ref' in regards to this, so we can think about solving it? E.g.:
- Cannot make tail const. / Cannot be reassigned.
- Cannot store in data structures.
- Cannot borrow slices of memory.
- Closures?
- (Probably more)
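[The first two deficiencies can be shown concretely; a sketch of what current D actually does (or refuses to compile) with ref:]

```d
struct S { int x; }

void demo(ref S s, ref S t)
{
    // 1. Cannot be reassigned: this copies t into s's referent;
    //    it does NOT rebind the reference itself.
    s = t;

    // 2. Cannot store in data structures: none of these compile.
    // ref S member;  // error: ref not allowed as a field
    // ref S[] refs;  // error: no arrays of ref
}
```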
Re: More radical ideas about gc and reference counting
On Mon, 12 May 2014 14:14:28 -0400, Ola Fosheim Grøstad wrote: On Monday, 12 May 2014 at 17:52:18 UTC, Walter Bright wrote: On 5/12/2014 7:46 AM, Steven Schveighoffer wrote: pointing at it is roughly 1/256. This problem is just about eliminated with 64-bit pointers. Not generally true. This presumes that the heap is not in the lower region of the address space and that you don't use 64 bit ints on the stack. I was thinking in terms of purely a random number happening to point at heap data. Practically speaking, I don't know the true likelihood based on the heap address scheme of 64-bit OSes, but I know that we always have a complainer who will try and do an array-append test on 32-bit code, and end up exhausting memory unexpectedly. -Steve
Re: borrowed pointers vs ref
On Mon, 12 May 2014 16:36:12 -0400, Walter Bright wrote: It's been brought up more than once that the 'scope' storage class is an unimplemented borrowed pointer. But thinking a bit more along those lines, actually 'ref' fills the role of a borrowed pointer. One particularly apropos behavior is that struct member functions pass 'this' by ref, meaning that members can be called without the inc/dec millstone. ref is still incomplete as far as this goes, but we can go the extra distance with it, and then it will be of great help in supporting any ref counting solution. Hm... the one piece that I think would be damaging is to not be able to take an address of the 'this' reference. It's probably OK to just use pointers and static functions in some cases, but member functions do not have that luxury. In other words, operators. Big example would be a doubly linked list used with ~=. -Steve
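[A hypothetical sketch of the kind of member operator meant above: a doubly linked list whose ~= operator needs the address of 'this' so nodes can record their owner. All names are illustrative, not from any real library:]

```d
// Why member functions may need the address of 'this': nodes
// appended via ~= record which list owns them. Illustrative
// sketch; a borrow rule forbidding &this would break this code.
struct Node { DList* owner; Node* prev, next; int value; }

struct DList
{
    Node* head, tail;

    ref DList opOpAssign(string op : "~")(Node* n)
    {
        n.owner = &this; // <-- taking the address of 'this'
        n.prev  = tail;
        n.next  = null;
        if (tail) tail.next = n;
        else      head = n;
        tail = n;
        return this;
    }
}

unittest
{
    DList list;
    auto n = new Node;
    list ~= n;
    assert(n.owner is &list && list.head is n);
}
```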
Re: borrowed pointers vs ref
The first thing that comes to my mind is applying this somehow to the (T) vs (ref T) function problem. (const ref, scope ref, references to r-values, you know the problem.) At the moment I just follow this pattern. void foo(ref const T bar) { /* ... */ } // Second overload to make r-values just work. void foo(const T bar) { foo(bar); } auto ref sometimes works, sometimes it's more trouble than it's worth.
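[For reference, the two overloads above collapse into one with a templated auto ref parameter, at the cost of making foo a template; a sketch:]

```d
// One definition instead of two overloads: auto ref binds
// l-values by ref and r-values by value. The cost is that foo
// must now be a template (a separate instantiation per case).
void foo(T)(auto ref const T bar) { /* ... */ }

unittest
{
    int x = 1;
    foo(x);  // l-value: bound by ref
    foo(42); // r-value: bound by value
}
```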
Re: borrowed pointers vs ref
On 5/12/2014 2:13 PM, Walter Bright wrote: On 5/12/2014 1:49 PM, Kagamin wrote: How would you assign a borrowed pointer? A ref could only be assigned to another ref. I mean to a ref of the same or smaller scope.
Re: borrowed pointers vs ref
On 5/12/2014 1:49 PM, Kagamin wrote: How would you assign a borrowed pointer? A ref could only be assigned to another ref.
Re: 64-bit DMD for windows?
On 5/12/14, Nick Sabalausky via Digitalmars-d wrote: > You don't need a 64-bit version: Compiling 64-bit programs doesn't > require a 64-bit compiler. Just install VC++, use the DMD 2.065 Win > installer, and then toss in the -m64 flag when compiling. Works fine. > Doesn't matter if DMD itself is 32-bit. As Vladimir in IRC reminded me, there is one use-case: You may need it for some intensive CTFE stuff (excessive memory allocations and no freeing by the compiler). That is, if you need more than 3/4 gigs.
Re: borrowed pointers vs ref
How would you assign a borrowed pointer?
borrowed pointers vs ref
It's been brought up more than once that the 'scope' storage class is an unimplemented borrowed pointer. But thinking a bit more along those lines, actually 'ref' fills the role of a borrowed pointer. One particularly apropos behavior is that struct member functions pass 'this' by ref, meaning that members can be called without the inc/dec millstone. ref is still incomplete as far as this goes, but we can go the extra distance with it, and then it will be of great help in supporting any ref counting solution. What it doesn't work very well with are class references. But Andrei suggested that we can focus the use of 'scope' to deal with that in an analogous way. What do you think? Anyone want to enumerate a list of the current deficiencies of 'ref' in regards to this, so we can think about solving it?
Re: More radical ideas about gc and reference counting
On 5/12/2014 12:36 PM, Timon Gehr wrote: Do you mean the table is not actually global but passed by parameter, Yes. But note that the distinction between the two is often blurry. Under the hood on some systems, global data is accessed via the equivalent of a hidden parameter.
Re: More radical ideas about gc and reference counting
Andrei Alexandrescu: How did I give the impression it has anything to do with unions? -- Andrei OK, so yours is not an answer to my proposal, nor related to it. Bye, bearophile
Re: More radical ideas about gc and reference counting
On 5/12/14, 12:59 PM, bearophile wrote: Andrei Alexandrescu: I, too, felt the need of onGC() - actually preGC() - in my allocators implementation. ... A hook that nulls all freelist heads just as the collection process starts would be helpful. How is this going to help increase tracing precision of unions (and Algebraic built on top of unions)? How did I give the impression it has anything to do with unions? -- Andrei
Re: More radical ideas about gc and reference counting
Andrei Alexandrescu: I, too, felt the need of onGC() - actually preGC() - in my allocators implementation. ... A hook that nulls all freelist heads just as the collection process starts would be helpful. How is this going to help increase tracing precision of unions (and Algebraic built on top of unions)? Bye, bearophile
Re: More radical ideas about gc and reference counting
On 2014-05-12 19:14, Dicebot wrote: It lacks any good static reflection though. And this stuff is damn addictive once you have tried it at D's caliber. It has macros, which basically require great support for static reflection to be usable. -- /Jacob Carlborg
Re: More radical ideas about gc and reference counting
On 05/12/2014 06:37 PM, Walter Bright wrote: On 5/12/2014 5:15 AM, Timon Gehr wrote: On 05/12/2014 10:54 AM, Walter Bright wrote: On 5/11/2014 10:57 PM, Marco Leise wrote: Am Sun, 11 May 2014 17:50:25 -0700 schrieb Walter Bright : As long as those pointers don't escape. Am I right in that one cannot store a borrowed pointer into a global data structure? Right, and that's the point and entirely positive-to-do™. This means that a global data structure in Rust has to decide what memory allocation scheme its contents must use, Global variables are banned in Rust code outside of unsafe blocks. Global can also mean assigning through a reference passed as a parameter. Do you mean the table is not actually global but passed by parameter, or that the global table is accessed in unsafe code and then passed by parameter or something else?
Re: More radical ideas about gc and reference counting
On Monday, 12 May 2014 at 18:07:51 UTC, Kapps wrote: Depending on how tunable the GC is, I feel like it should be possible to get away with a GC even for soft real-time programs like games. Even if you manage to make it work for game clients, you should also make it work for low-latency game servers, as code sharing is an important advantage. What a game/world server requires differs a lot, but highly dynamic and flexible worlds have to keep the physics to a single node (or tight cluster) for a region. That means you want to have as many players as possible tied to that node. In essence you want performance, low latency, reliability, and little overhead, all in an evolutionary context (it has to support heavy modification over time). My gut feeling is that a runtime satisfying one game design will not satisfy another one as long as one insists on one global GC. In essence, it will never really work well. IMO, the same goes for ARC, since RC does not perform well with multi-threading even when you use near-optimal patterns and strategies. If ARC is only to be used where speed does not matter, then you might as well use shared_ptr.
Next step on reference counting topics
There's been a lot of talk lately regarding improving resource management for D, and I'd like to figure the next logical step to take. It seems clear that we have reached a collective impasse on a few fundamentals, and that more just talk about it all is at the point of diminishing returns. One action item that is hopefully useful to people of all viewpoints is to double down on library support, and see how far we can get and what insights we collect from the experience. For that I'm proposing we start real work toward a state-of-the-art std.refcounted module. It would include adapters for class, array, and pointer types, and should inform language improvements for qualifiers (i.e. the tail-const problem), copy elision, literals, operators, and such. Who wants to champion this effort? Thanks, Andrei
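[No std.refcounted exists yet; as a starting point for the proposed module, a deliberately minimal hand-rolled adapter in the spirit of std.typecons.RefCounted. All names are hypothetical; no thread safety, no qualifier/tail-const handling, plain value payloads only:]

```d
// Minimal ref-counted adapter sketch for the proposed
// std.refcounted. Hypothetical; single-threaded, POD payloads.
import core.stdc.stdlib : malloc, free;

struct RefCounted(T)
{
    private static struct Payload { T value; size_t count; }
    private Payload* p;

    this(T value)
    {
        p = cast(Payload*) malloc(Payload.sizeof);
        p.value = value;   // fine for plain value types
        p.count = 1;
    }
    this(this) { if (p) ++p.count; }          // copy: bump count
    ~this() { if (p && --p.count == 0) free(p); } // last owner frees
    ref T get() { return p.value; }
}

unittest
{
    auto a = RefCounted!int(5);
    auto b = a;          // count is now 2
    b.get = 7;
    assert(a.get == 7);  // both views share one payload
}   // both destructors run; payload freed exactly once
```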
Re: More radical ideas about gc and reference counting
On 13 May 2014 04:07, Kapps via Digitalmars-d wrote: > On Monday, 12 May 2014 at 16:03:28 UTC, Manu via Digitalmars-d > wrote: >> >> How long is a collect liable to take in the event the GC threads need >> >> to collect? Am I likely to lose my service threads for 100s of >> milliseconds at a time? >> >> I'll think on it, but I don't think there's anything practically >> applicable here, and it really sounds like it creates a lot more >> trouble and complexity than it addresses. > > > > Your concerns stem not as much from the speed concern of the GC, > but from the freeze-the-world aspect of it. Would a concurrent > collector not solve these issues? I originally thought it would... but the more I think on it, I don't think it would make an awful lot of difference in practise. If the stalls were 'short' (like 1-5ms on the background threads, 500µs-1ms on the realtime threads), then maybe it would be workable, but I don't know that it would be even close to that? Also, I think it would be very difficult to implement on a machine without virtual memory, or much of an operating system in general? The problem remains that with no free memory, frequency of collection becomes so high, that it's extremely unlikely full collection so often would be better than ARC. > As > http://msdn.microsoft.com/en-us/library/ee787088%28v=vs.110%29.aspx#concurrent_garbage_collection > explains a little bit, the actual time your threads spend frozen > should be little (but I admit I don't know exactly how little), > and so long as you don't allocate too much during the collection > itself (which you say you don't), you should be able to keep > running your code during the collection. If it's not possible to > implement concurrent collection in D (and it's already been shown > it is possible), then I'd agree that ARC is very important. 
But > depending on how little the stop-the-world time from a concurrent > GC can get, perhaps this could work around some issues that > you're desiring ARC for. A generational collector could help in > theory with your high memory usage situations. I doubt you > allocate a gigabyte each frame, so the actual generation 0 > content should be fairly low. Much of your memory usage should be > allocations that will not be freed for long periods of time, > while the per-frame and other short allocations should be fast to > collect as there aren't many of them. Yeah, it would probably be better, if it's possible. Implementation needs to be considered from the perspective of embedded systems with no OS or MMU, and as little as 64mb of ram (the smallest modern systems). Mid-range systems are 512mb and no MMU. 'next-gen' systems are basically like little PC's with crappy OS's, so more likely a decent GC is possible on a ps4/xbone... but very few have the luxury of developing for just one system. It occurred to me earlier that things like strings might enjoy their own separate heap. And maybe some special logic for strings that outlived their scope to be actively returned to their heap rather than waiting for collection. If the heap were successfully broken down into a suite of sub-heaps, I have absolutely no idea how to make estimates about the performance of this system, and if it would approach an acceptable level. I'm skeptical it would, and it still won't decrease collection frequency. But I'd be happy to be surprised. > Depending on how tunable the GC is, I feel like it should be > possible to get away with a GC even for soft real-time programs > like games. The problem is it's hard to tell until we get a > proper concurrent collector in D2, just like it's hard to tell > how significant the impact of ARC is until we get an optimized > implementation of it in the compiler. Neither of these is simple. 
> I do quite like the idea of ARC, it's just something that someone > would have to actually implement (well) in order to see how much > of an impact it really has in D. I understand the problem. The first hurdle is overcoming the hostility against it though. There is a severe prejudice. > For the truly low frequency > situations, you could get away with a library type for ARC as > well, and as you mentioned, for high frequency you would get > around ARC regardless. Yup.
Re: More radical ideas about gc and reference counting
On 5/12/14, 11:17 AM, Dmitry Olshansky wrote: On 12-May-2014 22:08, Andrei Alexandrescu wrote: On 5/12/14, 10:25 AM, bearophile wrote: A hook that nulls all freelist heads just as the collection process starts would be helpful. One word - weak pointers. Then head of freelist is weak and can be collected at whim. Of course. My point is that here you need simpler support than full-blown weak pointers. -- Andrei
Re: radical ideas about GC and ARC : need to be time driven?
On Sunday, 11 May 2014 at 05:16:26 UTC, Paulo Pinto wrote: This is what java.lang.ref.ReferenceQueue is for in Java, but one needs to be a GC expert on how to use it, otherwise it will hinder the GC's work. I think all memory-partitioning-related performance requires expert knowledge. If people care about performance and reliability they have to accept that they cannot blindly use abstractions or throw everything into the same bag. Java is probably a good example of how unrealistic it is to have a general programming language that does reasonably well in most domains. The outcome has not been "everybody under the Sun umbrella", but a wide variety of Java runtime solutions and special systems.
Re: 64-bit DMD for windows?
On 5/12/2014 2:04 PM, steven kladitis wrote: On Monday, 12 May 2014 at 17:46:21 UTC, Kapps wrote: On Monday, 12 May 2014 at 16:47:21 UTC, steven kladitis wrote: It is NOT just for memory addressing, which is very simple under 64 bit, but also 64-bit registers, 16 of them, not just eight 32-bit ones. I think there should be a 64-bit version. This topic is 3 years old, DMD can already generate 64-bit programs on Windows (although I don't think DMD itself is 64-bit). I still only see a 32-bit version for Windows. I admit I have a 32-bit laptop, over 10 years old :). All other laptops and PCs I have are 64-bit processors. If anyone out there has a 64-bit version 2.065 for Windows, let me know. You don't need a 64-bit version: Compiling 64-bit programs doesn't require a 64-bit compiler. Just install VC++, use the DMD 2.065 Win installer, and then toss in the -m64 flag when compiling. Works fine. Doesn't matter if DMD itself is 32-bit.
Re: More radical ideas about gc and reference counting
On 12-May-2014 22:08, Andrei Alexandrescu wrote: On 5/12/14, 10:25 AM, bearophile wrote: A hook that nulls all freelist heads just as the collection process starts would be helpful. One word - weak pointers. Then head of freelist is weak and can be collected at whim. Andrei -- Dmitry Olshansky
Re: isUniformRNG
On 5/11/2014 8:16 AM, Joseph Rushton Wakeling via Digitalmars-d wrote:> On 11/05/14 05:58, Nick Sabalausky via Digitalmars-d wrote: >> The seed doesn't need to be compromised for synchronized RNGs to fubar >> security. > > Before we go onto the detail of discussion, thanks very much for the > extensive explanation. I was slightly worried that my previous email > might have come across as dismissive of your (completely understandable) > concerns. Oh, not at all. I've been finding the discussion rather interesting. :) > I'm actually quite keen that we can find a mutually agreeable > solution :-) > I agree. And I think our stances aren't quite as opposed as they may seem. > Sure, but this is a consequence of two things (i) CryptoRng is a value > type and (ii) it gets passed by value, not by reference. > > In your example, obviously one can fix the problem by having the > function declared instead as > > ubyte[] getRandomJunk(ref CryptoRng rand) { ... } > > but I suspect you'd say (and I would agree) that this is inadequate: > it's relying on programmer virtue to ensure the correct behaviour, and > sooner or later someone will forget that "ref". (It will also not > handle other cases, such as an entity that needs to internally store the > RNG, as we've discussed many times on the list with reference to e.g. > RandomSample or RandomCover.) > Right. Agreed on all points. > Obviously one _can_ solve the problem by the internal state variables of > the RNG being static, but I'd suggest to you that RNG-as-reference-type > (which doesn't necessarily need to mean "class") Yea, doesn't necessarily mean class, but if it is made a reference type then class is likely the best option. For example, I'd typically regard struct* in a D API as a code smell. > solves that particular > problem without the constraints that static internal variables have. 
> Pretty much agreed, but the only question is whether those constraints of using static internals are good/bad/inconsequential: For non-crypto RNGs: While I've tended to think the usefulness of a library-provided RNG that permits independent-but-identically-seeded instances is small and debatable, through this discussion I have become sufficiently convinced that they're worth permitting. Besides, even if there weren't a need for it, the downsides of permitting such a feature (as long as it's not accident-prone) are minimal, if not zero. So I'm fine going the class route (or otherwise reference-based) and making internal state per-instance. Or even having a "duplicate this RNG with identical state" function, if people want it. I think we're pretty well agreed on non-crypto RNGs. Your stance is convincing here. For crypto-RNGs: A crypto-RNG exists for exactly one reason only: To stretch the useful lifetime of a limited source of truly non-deterministic randomness/entropy (or nearly-true randomness, if that's the best available). Because of this, any and all determinism is a concession, not a feature (unlike for non-crypto deterministic-RNGs). Even *how* you use it deliberately affects the internal state, not just "how many times you asked for a value". These things go all-out to throw any wrenches they can into any sources of determinism. I was actually quite impressed :) In fact, the seeding/reseeding is specifically defined to be completely internal to the crypto-RNG (at least with Hash_DRBG anyway, probably others) - the user *never* provides a seed - which intentionally makes it that much harder for an application to use deterministic seeds (which would compromise the security implications of the crypto-RNG, and therefore defeat the whole point of using a crypto-RNG instead of a normal RNG). All this is because determinism is NOT what a crypto-RNG is for, it's exactly what a crypto-RNG is specifically designed to fight against.
What all that implies: A crypto-RNG shouldn't *explicitly* provide a way to get different instances with identical internal states, and definitely shouldn't let it happen by accident. It's also under absolutely no obligation whatsoever to make relying on determinism even possible (it would carry certain difficulties anyway). Luckily though, that doesn't imply anything particularly profound. *If* it's even possible to get identical crypto-RNGs at all, then as long as you have to work to do it (memcopying raw class data, providing a custom entropy source that's written to be deterministic, or even using multiple threads/processes, etc), then everything's all good. Therefore, a class or otherwise reference-based approach is fine for crypto-RNGs, too. I think my preference would still be to keep the internal state static here though (again, just speaking for crypto-RNGs only). As I've argued, the determinism is a non-feature for crypto-RNGs (they deliberately fight it every step of the way), and the shared state carries a couple entropy-related benefits (Reseeding one, ie acc
Re: More radical ideas about gc and reference counting
On 5/12/14, 10:25 AM, bearophile wrote: Walter Bright: Unions of pointers are so rare in actual code that treating them conservatively is not a big problem. std.variant.Algebraic is based on std.variant.VariantN, and std.variant.VariantN is based on a union, and often you use algebraic data types to represent trees and similar data structures that contain many references/pointers. By adding an onGC() method to std.variant.VariantN you allow the GC to manage Algebraic well enough. I, too, felt the need of onGC() - actually preGC() - in my allocators implementation. Specifically, a thread-local freelist would save a pointer to the root in thread-local storage (i.e. a traditional D global variable). That would thread through a number of free nodes available for allocation. When a GC cycle occurs, it's okay if the list stays referenced; the GC will consider it "used" and won't do anything in particular about it. However, the GC cycle is a good opportunity to clean these freelists and offer the memory for other size classes, seeing as the freelists may grow unreasonably large and then just hold memory for no good reason. A hook that nulls all freelist heads just as the collection process starts would be helpful. Andrei
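[The freelist-plus-hook idea described above might look like this; a sketch only, where registerPreGCHook is hypothetical (druntime has no such API) and stands in for whatever callback the GC would offer. Blocks are assumed to be at least pointer-sized:]

```d
// Thread-local freelist whose head is dropped just before each
// collection, so the GC can reclaim the cached blocks.
struct FreeNode { FreeNode* next; }

FreeNode* freelistHead; // module-level = thread-local in D

void* allocate(size_t size)
{
    if (auto n = freelistHead)   // reuse a cached block if any
    {
        freelistHead = n.next;
        return n;
    }
    return new void[size].ptr;   // otherwise hit the GC heap
}

void deallocate(void* p)
{
    auto n = cast(FreeNode*) p;  // thread block back onto the list
    n.next = freelistHead;
    freelistHead = n;
}

// Hypothetical: stands in for a real druntime pre-collection hook.
void registerPreGCHook(void delegate() hook) { /* not a real API */ }

static this()
{
    // Null the head as collection starts, so cached blocks stop
    // looking live and can be offered to other size classes.
    registerPreGCHook({ freelistHead = null; });
}
```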
Re: More radical ideas about gc and reference counting
On Monday, 12 May 2014 at 17:52:18 UTC, Walter Bright wrote: On 5/12/2014 7:46 AM, Steven Schveighoffer wrote: pointing at it is roughly 1/256. This problem is just about eliminated with 64-bit pointers. Not generally true. This presumes that the heap is not in the lower region of the address space and that you don't use 64 bit ints on the stack. Generally, it is a bad idea to allocate such large blocks on the GC heap. GC's work best when the size of the objects being allocated is very small relative to the size of the heap space. Generally not true. This is a deficiency of not having a smart allocator / precise scanning that uses available meta information properly (obtained statically or by profiling). Fortunately, it's a mathematical inevitability that large allocations relative to the GC size are rare, and so it isn't much of a pain to handle them manually. Programmer pain is not measured in number of instances, but in terms of model complexity. Ola.
Re: More radical ideas about gc and reference counting
On Monday, 12 May 2014 at 16:03:28 UTC, Manu via Digitalmars-d wrote: How long is a collect liable to take in the event the GC threads need to collect? Am I likely to lose my service threads for 100s of milliseconds at a time? I'll think on it, but I don't think there's anything practically applicable here, and it really sounds like it creates a lot more trouble and complexity than it addresses. Your concerns stem not as much from the speed concern of the GC, but from the freeze-the-world aspect of it. Would a concurrent collector not solve these issues? As http://msdn.microsoft.com/en-us/library/ee787088%28v=vs.110%29.aspx#concurrent_garbage_collection explains a little bit, the actual time your threads spend frozen should be little (but I admit I don't know exactly how little), and so long as you don't allocate too much during the collection itself (which you say you don't), you should be able to keep running your code during the collection. If it's not possible to implement concurrent collection in D (and it's already been shown it is possible), then I'd agree that ARC is very important. But depending on how little the stop-the-world time from a concurrent GC can get, perhaps this could work around some issues that you're desiring ARC for. A generational collector could help in theory with your high memory usage situations. I doubt you allocate a gigabyte each frame, so the actual generation 0 content should be fairly low. Much of your memory usage should be allocations that will not be freed for long periods of time, while the per-frame and other short allocations should be fast to collect as there aren't many of them. Depending on how tunable the GC is, I feel like it should be possible to get away with a GC even for soft real-time programs like games. The problem is it's hard to tell until we get a proper concurrent collector in D2, just like it's hard to tell how significant the impact of ARC is until we get an optimized implementation of it in the compiler. 
Neither of these is simple. I do quite like the idea of ARC, it's just something that someone would have to actually implement (well) in order to see how much of an impact it really has in D. For the truly low frequency situations, you could get away with a library type for ARC as well, and as you mentioned, for high frequency you would get around ARC regardless.
Re: 64-bit DMD for windows?
On Monday, 12 May 2014 at 17:46:21 UTC, Kapps wrote: On Monday, 12 May 2014 at 16:47:21 UTC, steven kladitis wrote: It is NOT just for memory addressing, which is very simple under 64 bit, but also 64-bit registers, 16 of them, not just eight 32-bit ones. I think there should be a 64-bit version. This topic is 3 years old, DMD can already generate 64-bit programs on Windows (although I don't think DMD itself is 64-bit). I still only see a 32-bit version for Windows. I admit I have a 32-bit laptop, over 10 years old :). All other laptops and PCs I have are 64-bit processors. If anyone out there has a 64-bit version 2.065 for Windows, let me know.
dmd and pkg-config
Since "dmd.conf" has specific flags depending on the word size generated by dmd, is there any way to know this state, e.g. via an environment variable? The "pkg-config" command is an easy way to give the compiler the right flags for a specific library. Right now these flags cover both 32 and 64 bit. If "pkg-config" were able to know this information, then it would first search the right library path by setting the PKG_CONFIG_PATH environment variable, avoiding the use of the "--no-warn-search-mismatch" flag and resulting in faster linking. Another solution would be to include the "pkg-config" functionality in the compiler itself. Regards, -- Jordi Sayol
Re: More radical ideas about gc and reference counting
On 13 May 2014 03:44, Walter Bright via Digitalmars-d wrote: > On 5/12/2014 10:31 AM, Manu via Digitalmars-d wrote: >> >> I just searched through my code, and 7 out of 12 unions had pointers. > > > Relative number of objects with unions, not declarations with unions! Ah, well I have 3 different tree/graph structures with unions, and tree/graph nodes have a tendency to accumulate many instances.
Re: More radical ideas about gc and reference counting
On 5/12/2014 7:46 AM, Steven Schveighoffer wrote: It doesn't matter where the false pointers are. The largest issue with false pointers is not how many false pointers there are. It only matters how large the block is that it "points" at. The larger your blocks get, the more likely they are "pointed" at by the stack. On 32-bit systems, allocate 1/256th of your memory space (i.e. 16.5MB), and the likelihood of random data on the stack pointing at it is roughly 1/256. This problem is just about eliminated with 64-bit pointers. Generally, it is a bad idea to allocate such large blocks on the GC heap. GC's work best when the size of the objects being allocated is very small relative to the size of the heap space. Fortunately, it's a mathematical inevitability that large allocations relative to the GC size are rare, and so it isn't much of a pain to handle them manually. And in fact, even if it's forbidden, "requires" is too strong a word -- there is no static or runtime prevention of this. It's still forbidden. Andrei wrote a template that will verify this at runtime, but I don't recall its name.
Re: More radical ideas about gc and reference counting
Walter Bright: BTW, the RTinfo can be used to discriminate unions. I don't know if std.variant.VariantN is already using such RTinfo. I don't know much about RTinfo. Bye, bearophile
Re: 64-bit DMD for windows?
On Monday, 12 May 2014 at 16:47:21 UTC, steven kladitis wrote: It is NOT just for memory addressing, which is very simple under 64 bit, but also 64-bit registers, 16 of them, not just eight 32-bit ones. I think there should be a 64 bit version. This topic is 3 years old; DMD can already generate 64-bit programs on Windows (although I don't think DMD itself is 64-bit).
Re: More radical ideas about gc and reference counting
On 5/12/2014 10:25 AM, bearophile wrote: Walter Bright: Unions of pointers are so rare in actual code that treating them conservatively is not a big problem. std.variant.Algebraic is based on std.variant.VariantN, and std.variant.VariantN is based on a union, and often you use algebraic data types to represent trees and similar data structures that contain many references/pointers. Adding an onGC() method to std.variant.VariantN would allow the GC to manage Algebraic well enough. BTW, the RTinfo can be used to discriminate unions.
Re: More radical ideas about gc and reference counting
On 5/12/2014 10:07 AM, Dicebot wrote: We have already had discussion where I did state that current @nogc implementation is not robust enough and failed to explain the use case for weaker @nogc clearly. Conclusion was that we should return to this topic after Don's DConf talk ;) Sure - next week!
Re: More radical ideas about gc and reference counting
On 5/12/2014 10:31 AM, Manu via Digitalmars-d wrote: I just searched through my code, and 7 out of 12 unions had pointers. Relative number of objects with unions, not declarations with unions!
Re: More radical ideas about gc and reference counting
On 13 May 2014 03:14, Dicebot via Digitalmars-d wrote: > On Monday, 12 May 2014 at 17:03:41 UTC, Manu via Digitalmars-d wrote: >> >> But D is *so close*... and I like it! >_< >> >> I have to say that this discussion has certainly left me somewhat >> intrigued by Rust though. >> I've never given it a fair go because I find the syntax so distasteful >> and deterring. >> I wonder if there's a market for a rust fork that re-skins the language >> ;) > > > Right now D has the practical benefit of being more stable and library rich. But > switching to Rust eventually does seem tempting as I find foundations of > their type system much closer to my beliefs about "good coding practices". > > It lacks any good static reflection though. And this stuff is damn addictive > when you try it at D's caliber. They have a lot more work to do. There doesn't seem to be a useful windows compiler for a start... >_<
Re: More radical ideas about gc and reference counting
On 13 May 2014 03:17, Walter Bright via Digitalmars-d wrote: > On 5/12/2014 4:35 AM, bearophile wrote: >> >> I suggested to add an optional method named "onGC" to unions that if >> present is >> called at run-time by the GC to know what's the real type of stored data, >> to >> make tracing more precise. > > > Unions of pointers are so rare in actual code that treating them > conservatively is not a big problem. I find it fairly common. I just searched through my code, and 7 out of 12 unions had pointers.
Re: More radical ideas about gc and reference counting
Walter Bright: Unions of pointers are so rare in actual code that treating them conservatively is not a big problem. std.variant.Algebraic is based on std.variant.VariantN, and std.variant.VariantN is based on a union, and often you use algebraic data types to represent trees and similar data structures that contain many references/pointers. Adding an onGC() method to std.variant.VariantN would allow the GC to manage Algebraic well enough. Bye, bearophile
Re: More radical ideas about gc and reference counting
On 5/12/2014 4:35 AM, bearophile wrote: I suggested to add an optional method named "onGC" to unions that if present is called at run-time by the GC to know what's the real type of stored data, to make tracing more precise. Unions of pointers are so rare in actual code that treating them conservatively is not a big problem.
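To make the disagreement concrete, here is a minimal D sketch of the situation being argued about; the `onGC` hook in the comment is bearophile's proposal, not an existing language feature, and the names are made up:

```d
// A union that overlays a pointer and plain bits. A conservative GC
// cannot know which member is live, so it must treat the word as a
// possible pointer (pinning whatever it happens to "point" at).
union Slot
{
    int*   ptr;  // live when tag == Tag.ptr; must keep its target alive
    size_t raw;  // live when tag == Tag.raw; just data
}

struct Tagged
{
    enum Tag { ptr, raw }
    Tag  tag;
    Slot slot;

    // bearophile's proposed hook, sketched: the GC would call this at
    // run time to learn the real type of the stored data.
    // TypeInfo onGC() { return tag == Tag.ptr ? typeid(int*) : typeid(size_t); }
}
```

std.variant.VariantN is essentially this pattern generalized, which is why the thread keeps returning to it as the place where imprecise union scanning bites real code.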
Re: More radical ideas about gc and reference counting
On Monday, 12 May 2014 at 17:03:41 UTC, Manu via Digitalmars-d wrote: But D is *so close*... and I like it! >_< I have to say that this discussion has certainly left me somewhat intrigued by Rust though. I've never given it a fair go because I find the syntax so distasteful and deterring. I wonder if there's a market for a rust fork that re-skins the language ;) Right now D has the practical benefit of being more stable and library rich. But switching to Rust eventually does seem tempting as I find foundations of their type system much closer to my beliefs about "good coding practices". It lacks any good static reflection though. And this stuff is damn addictive when you try it at D's caliber.
Re: More radical ideas about gc and reference counting
On Monday, 12 May 2014 at 17:03:18 UTC, Walter Bright wrote: On 5/12/2014 2:12 AM, Dicebot wrote: I think this is more of a library writing culture problem than an engineering problem. High quality library shouldn't rely on any internal allocations at all, deferring this decision to user code. Otherwise you will eventually have problems, GC or not. Consider my PR: https://github.com/D-Programming-Language/phobos/pull/2149 This is exactly what it does - it 'pushes' the decisions about allocating memory up out of the library to the user. I suspect a great deal of storage allocation can be removed from Phobos with this technique, without sacrificing performance, flexibility, or memory safety. (In fact, it improves on performance and flexibility!) We have already had discussion where I did state that current @nogc implementation is not robust enough and failed to explain the use case for weaker @nogc clearly. Conclusion was that we should return to this topic after Don's DConf talk ;)
Re: More radical ideas about gc and reference counting
On Monday, 12 May 2014 at 16:16:06 UTC, bearophile wrote: Perhaps the game industry has to start the creation of a language designed for its needs, like the scientific people have done (Julia), the browser ones (Rust), the Web ones have done, etc. With a lot of work, in less than ten years you can have a usable language. I don't think games are unique or special. Most games are even in the "easy" space by having mostly static data. Meaning the amount of unexpected dynamic data is pretty low. Games also have the luxury of redefining the requirements spec to match available technology. The games industry does however have its own culture and paradigms and fashions… With subcultures. However, most interactive applications will suffer from the same issues if you increase the load so that they run out of headroom. Even unix commands like find and grep have latency requirements if the interaction is to be pleasant. By good fortune "find" and "grep" haven't changed their interface for 40+ years, so they were designed for low performance CPUs. That does not mean that you cannot design a better "find"-like application today that will run into runtime related usability issues if you freeze the world. At the end of the day, a system level language should support key strategies used for writing performant system level code in a reliable manner. It should also not lock you to a specific runtime that you couldn't easily write yourself. It should also not lock you to a specific model of how to structure your code (like monitors). I am not even sure it should provide OS abstractions, because that is not really system level programming. That is unixy (Posix) programming. A system level programming language should be free of OS and modelling related legacy.
Re: More radical ideas about gc and reference counting
On 5/12/2014 2:12 AM, Dicebot wrote: I think this is more of a library writing culture problem than an engineering problem. High quality library shouldn't rely on any internal allocations at all, deferring this decision to user code. Otherwise you will eventually have problems, GC or not. Consider my PR: https://github.com/D-Programming-Language/phobos/pull/2149 This is exactly what it does - it 'pushes' the decisions about allocating memory up out of the library to the user. I suspect a great deal of storage allocation can be removed from Phobos with this technique, without sacrificing performance, flexibility, or memory safety. (In fact, it improves on performance and flexibility!) I also agree with your larger point that if you are relying on an unknown library for time critical code, and that library was not designed with time criticality guarantees in mind, you're going to have nothing but trouble. Regardless of GC or RC.
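The "push allocation up to the caller" idea can be sketched in a few lines of D; this is a hedged illustration, not the actual Phobos change, and `hexDigits` is a hypothetical function:

```d
import std.range.primitives : put;

// Instead of returning a freshly GC-allocated string, write into
// whatever output range the caller supplies: the allocation decision
// moves out of the library and up to the user.
void hexDigits(Output)(uint value, ref Output sink)
{
    static immutable digits = "0123456789abcdef";
    foreach_reverse (shift; 0 .. 8)          // emit most significant nibble first
        put(sink, digits[(value >> (4 * shift)) & 0xF]);
}

// The caller chooses the storage: a fixed stack buffer, an Appender,
// a file writer... e.g.
//     char[8] buf; auto s = buf[]; hexDigits(0xDEADBEEF, s);
```

The same function then serves @nogc real-time code (stack buffer) and convenience code (Appender) without two implementations, which is the flexibility gain claimed above.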
Re: More radical ideas about gc and reference counting
On 13 May 2014 02:16, bearophile via Digitalmars-d wrote: > Manu: > > >> we are an industry in desperate need of salvation, >> it's LONG overdue, and I want something that actually works well for us, >> not a crappy set of compromises because the >> language has a fundamental incompatibility with my industry :/ > > > Perhaps the game industry has to start the creation of a language designed > for its needs, like the scientific people have done (Julia), the browser > ones (Rust), the Web ones have done, etc. With a lot of work in less than ten > years you can have a usable language. But D is *so close*... and I like it! >_< I have to say that this discussion has certainly left me somewhat intrigued by Rust though. I've never given it a fair go because I find the syntax so distasteful and deterring. I wonder if there's a market for a rust fork that re-skins the language ;)
Druntime regression
The latest git HEAD druntime has broken existing code: https://issues.dlang.org/show_bug.cgi?id=12738 :-( I'm no longer so sure it's a *regression*, strictly speaking, but it is definitely a breakage that's going to cause users pain. I.e., it better be in large bold print in the changelog for the next release if this change is going to stay. T -- The best way to destroy a cause is to defend it poorly.
Re: 64-bit DMD for windows?
On Thursday, 15 December 2011 at 21:05:05 UTC, captaindet wrote: On 2011-12-15 04:47, torhu wrote: On 14.12.2011 12:54, dmd.20.browse...@xoxy.net wrote: Hi, Is there a 64-bit version of DMD for windows? The download page offers only an x86 version. Or am I reading too much into that? Cheers, buk There's not much you would need a 64-bit compiler for on Windows. What are you going to use it for? now what kind of strange comment is this? you need 64-bit on windows for the same reasons as on any other platform: accessing loads of mem. yes, for some this is really important! for me it is actually a dealbreaker - i'd love to use D for my scientific programming, but my datasets often reach several GB... my computer has 16GB and i intend to make use of them. det It is NOT just for memory addressing, which is very simple under 64 bit, but also 64-bit registers, 16 of them, not just eight 32-bit ones. I think there should be a 64 bit version.
Re: More radical ideas about gc and reference counting
On 5/12/2014 5:15 AM, Timon Gehr wrote: On 05/12/2014 10:54 AM, Walter Bright wrote: On 5/11/2014 10:57 PM, Marco Leise wrote: Am Sun, 11 May 2014 17:50:25 -0700 schrieb Walter Bright : As long as those pointers don't escape. Am I right in that one cannot store a borrowed pointer into a global data structure? Right, and that's the point and entirely positive-to-do™. This means that a global data structure in Rust has to decide what memory allocation scheme its contents must use, Global variables are banned in Rust code outside of unsafe blocks. Global can also mean assigning through a reference passed as a parameter.
Re: More radical ideas about gc and reference counting
On 5/12/2014 3:18 AM, Marco Leise wrote: Your were arguing against Michel Fortin's proposal on the surface, when your requirement cannot even be fulfilled theoretically it seems. Lots of people use ARC without a GC. Which could mean that you don't like the idea of replacing D's GC with an ARC solution. I don't like the idea of replacing D's GC with ARC. But for different reasons.
Re: More radical ideas about gc and reference counting
On Monday, 12 May 2014 at 08:45:56 UTC, Walter Bright wrote: 2. you can have the non-pausible code running in a thread that is not registered with the gc, so the gc won't pause it. This requires that this thread not allocate gc memory, but it can use gc memory allocated by other threads, as long as those other threads retain a root to it. This and @nogc are a very promising trend, but you should still be able to partition the GC search space. The key to controlled real time performance is to partition the search space; that goes for anything algorithmic, memory management included. That applies to any scheme like owned, ARC, GC etc. It makes little sense having to trace everything if only the physics engine is the one churning memory like crazy. And fork() is not a solution without extensive structuring of allocations. Stuff like physics touches all the pages its objects sit on 100+ times per second, so you need to group allocations to pages based on usage patterns. (No point in forking if you get write traps on 50,000 pages the next time the physics engine runs :-). 3. D allows you to create and use any memory management scheme you want. You are simply not locked into GC. For example, I rewrote my Empire game into D and it did not do any allocation at all - no GC, not even malloc. I know that you'll need to do allocation, I'm just pointing out that GC allocations and pauses are hardly inevitable. This is no doubt the best approach for an MMO client. You have a window on the world and cache as much as possible both to memory and disk. Basically get as much memory from the OS as you can hoard (with headroom set by heuristics) when your application has focus and release caches that are outside the window when you lose focus to another application. This means you need a dedicated runtime for games that can delay GC collection and eat into the caches when you are low on computational resources.
You also need to distinguish between memory that is locked to RAM and memory that can swap. You should always lock memory for real time threads. So if you want to GC this, you need a GC that supports multiple heaps. (Some hardware might also distinguish between RAM that is accessible by the GPU or that has different characteristics in areas such as persistence or performance.) 5. you can divide your app into multiple processes that communicate via interprocess communication. One of them pausing will not pause the others. You can even do things like turn off [...] Why processes and not threads with their own local GC? 6. If you call C++ libs, they won't be allocating memory with the D GC. D code can call C++ code. If you run those C++ libs [...] But what happens if that C++ code does "new HeapStuff(D_allocated_memory)" and then calls back to D? You cannot presume that C++ coders have the discipline to always allocate local memory from the stack, so basically you cannot GC collect while there are C++ functions on the stack. In order to get there the GC collector needs to understand the malloc heap and trace that one too. Auditing all C++ libraries I want to use is too much work, and tracing the malloc heap is too time consuming, so at the end of the day you'll get a more robust environment by only scanning (tracing) the stacks when there are only D function calls on the stack, with a precise collector. That means you need to partition the search space, otherwise the collector might not run in time. Freezing the world is really ugly. Most applications are actually soft real time. Games are part hard real time, part soft real time. The difference between games and other applications is that there is less headroom so you have to do more work to make the "glitches" and "stuttering" occur sufficiently seldom to be perceived as acceptable by the end user. But games are not special.
Re: More radical ideas about gc and reference counting
Manu: we are an industry in desperate need of salvation, it's LONG overdue, and I want something that actually works well for us, not a crappy set of compromises because the language has a fundamental incompatibility with my industry :/ Perhaps the game industry has to start the creation of a language designed for its needs, like the scientific people have done (Julia), the browser ones (Rust), the Web ones have done, etc. With a lot of work, in less than ten years you can have a usable language. Bye, bearophile
Re: More radical ideas about gc and reference counting
On 12 May 2014 18:45, Walter Bright via Digitalmars-d wrote: > On 5/12/2014 12:12 AM, Manu via Digitalmars-d wrote: >> >> What? You've never offered me a practical solution. > > > I have, you've just rejected them. > > >> What do I do? > > > 1. you can simply do C++ style memory management. shared_ptr<>, etc. I already have C++. I don't want another one. > 2. you can have the non-pausible code running in a thread that is not > registered with the gc, so the gc won't pause it. This requires that this > thread not allocate gc memory, but it can use gc memory allocated by other > threads, as long as those other threads retain a root to it. It still sounds the same as manual memory management though in practise, like you say, the other thread must maintain a root to it, which means I need to manually retain it somehow, and when the worker thread finishes with it, it needs to send a signal or something back to say it's done so it can be released... it sounds more inconvenient than direct manual memory management in practise. Sounds slow too. Dec-ing a ref is certainly faster than inter-thread communication. This also makes library calls into effective RPC's if I can't call into them from the active threads. How long is a collect liable to take in the event the GC threads need to collect? Am I likely to lose my service threads for 100s of milliseconds at a time? I'll think on it, but I don't think there's anything practically applicable here, and it really sounds like it creates a lot more trouble and complexity than it addresses. > 3. D allows you to create and use any memory management scheme you want. You > are simply not locked into GC. For example, I rewrote my Empire game into D > and it did not do any allocation at all - no GC, not even malloc. I know > that you'll need to do allocation, I'm just pointing out that GC allocations > and pauses are hardly inevitable. C++ lets me create any memory management scheme I like by the same argument. 
I lose all the parts of the language that implicitly depend on the GC, and 3rd party libs (that don't care about me and my project). Why isn't it a reasonable argument to say that not having access to libraries is completely unrealistic? You can't write modern software without extensive access to libraries. Period. I've said before, I don't want to be a second class citizen with access to only a subset of the language. > 4. for my part, I have implemented @nogc so you can track down gc usage in > code. I have also been working towards refactoring Phobos to eliminate > unnecessary GC allocations and provide alternatives that do not allocate GC > memory. Unfortunately, these PR's just sit there. The effort is appreciated, but it was never a solution. I said @nogc was the exact wrong approach to my situation right from the start, and I predicted that would be used as an argument the moment it appeared. Tracking down GC usage isn't helpful when it leads you to a lib call that you can't change. And again, eliminating useful and productive parts of the language is not a goal we should be shooting for. I'll find it useful in the high-performance realtime bits; ie, the bits that I typically disassemble and scrutinise after every compile. But that's not what we're discussing here. I'm happy with D for my realtime code, I have the low-level tools I need to make the real-time code run fast. @nogc is a little bonus that will allow to guarantee no sneaky allocations are finding their way into the fast code, and that might save a little time, but I never really saw that as a significant problem in the first place. What we're talking about is productivity, convenience and safety in the non-realtime code. The vast majority of code, that programmers spend most of their days working on. Consider it this way... why do you have all these features in D that cause implicit allocation if you don't feel they're useful and important parts of the language? 
Assuming you do feel they're important parts of the language, why do you feel it's okay to tell me I don't deserve access to them? Surely I'm *exactly* the target market for D...? High-pressure, intensive production environments, still depending exclusively on native code, with code teams often in the realm of 50-100, containing many juniors, aggressive schedules which can't afford to waste engineering hours... this is a code environment that's prone to MANY bugs, and countless wasted hours as a consequence. Convenience and safety are important to me... I don't know what you think I'm interested in D for if you think I should be happy to abandon a whole chunk of the language, just because I have a couple of realtime threads :/ > 5. you can divide your app into multiple processes that communicate via > interprocess communication. One of them pausing will not pause the others. > You can even do things like turn off the GC collections in those processes, > and when they run out of memory just kill them and restart them. (This is
Re: More radical ideas about gc and reference counting
On Sun, 11 May 2014 16:33:04 -0400, Walter Bright wrote: On 5/11/2014 2:48 AM, Benjamin Thaut wrote: Mostly precise doesn't help. It's either fully precise or being stuck with an imprecise mark & sweep. This is not correct. It helps because most of the false pointers will be in the heap, and the heap will be accurately scanned, nearly eliminating false references to garbage. It doesn't matter where the false pointers are. The largest issue with false pointers is not how many false pointers there are. It only matters how large the block is that it "points" at. The larger your blocks get, the more likely they are "pointed" at by the stack. On 32-bit systems, allocate 1/256th of your memory space (i.e. 16MB), and the likelihood of random data on the stack pointing at it is roughly 1/256. This problem is just about eliminated with 64-bit pointers. Yes. D, for example, requires that objects not be self-referential for this reason. As previously stated, self referencing does not preclude GC moving. This statement is simply false: you can self-reference in D for objects. You cannot for structs, but not because of a possibility for the moving GC, but because of the requirement to be able to move a struct instance. And in fact, even if it's forbidden, "requires" is too strong a word -- there is no static or runtime prevention of this. -Steve
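Steve's point is easy to demonstrate in a minimal D sketch (the type names here are made up for illustration):

```d
// Classes are reference types, so self-reference is ordinary:
class Node
{
    int  value;
    Node next;  // a class can refer to its own type freely
}

// Structs may be moved (e.g. returned by value), which is why an
// interior pointer is the real hazard -- yet nothing statically or
// at run time prevents creating one:
struct S
{
    int  value;
    int* p;
}

void demo()
{
    S s;
    s.p = &s.value;  // compiles and runs; silently broken if s is moved
}
```

This is exactly the "no static or runtime prevention" situation described above: the self-reference rule for structs is a convention the compiler assumes, not one it enforces.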
Re: More radical ideas about gc and reference counting
On Mon, 12 May 2014 03:39:12 -0400, Manu via Digitalmars-d wrote: On 12 May 2014 17:24, Dicebot via Digitalmars-d You will like Don's talk this year ;) I'm super-disappointed I can't make it this year! ?!! http://dconf.org/2014/talks/evans.html We were evicted from our house, have to move, and I can't bail for a week and leave that all on my mrs while she kicks along the fulltime job :( Oh that sucks... -Steve
Re: More radical ideas about gc and reference counting
On 05/12/2014 10:54 AM, Walter Bright wrote: On 5/11/2014 10:57 PM, Marco Leise wrote: Am Sun, 11 May 2014 17:50:25 -0700 schrieb Walter Bright : As long as those pointers don't escape. Am I right in that one cannot store a borrowed pointer into a global data structure? Right, and that's the point and entirely positive-to-do™. This means that a global data structure in Rust has to decide what memory allocation scheme its contents must use, Global variables are banned in Rust code outside of unsafe blocks. and cannot (without tagging) mix memory allocation schemes. ... Tagging won't help with all memory allocation schemes. For example, let's say a compiler has internally a single hash table of strings. With a GC, those strings can be statically allocated, or on the GC heap, or anything with a lifetime longer than the table's. But I don't see how this could work in Rust. It's possible if you don't make the table global. (OTOH in D this is not going to work at all.)
Re: More radical ideas about gc and reference counting
On Sunday, 11 May 2014 at 18:18:41 UTC, Rainer Schuetze wrote: For a reasonable GC I currently see 2 possible directions: 1. Use a scheme that takes a snapshot of the heap, stack and registers at the moment of collection and do the actual collection in another thread/process while the application can continue to run. This is the way Leandro Lucarellas concurrent GC works (http://dconf.org/2013/talks/lucarella.html), but it relies on "fork" that doesn't exist on every OS/architecture. A manual copy of the memory won't scale to very large memory, though it might be compressed to possible pointers. Worst case it will need twice as much memory as the current heap. It would be very interesting how far we can push this model on the supported platforms. 2. Change the compiler to emit (library defined) write barriers for modifications of (possible) pointers. This will allow to experiment with more sophisticated GC algorithms (if the write barrier is written in D, we might also need pointers without barriers to implement it). I know Walter is against this, and I am also not sure if this adds acceptable overhead, but we don't have proof of the opposite, too. As we all know, the usual eager reference counting with atomic operations is not memory-safe, so my current favorite is "concurrent buffered reference counting" (see chapter 18.2/3 "The garbage collection handbook" by Richard Jones et al): reference count modifications are not performed directly by the write barrier, but it just logs the operation into a thread local buffer. This is then processed by a collector thread which also detects cycles (only on candidates which had their reference count decreased during the last cycle). Except for very large reference chains this scales with the number of executed pointer modifications and not with the heap size. I'm surprised that you didn't include: 3. 
Thread-local GC, isolated zones (restricting where references to objects of a particular heap can be placed), exempting certain threads from GC completely, ...
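The "concurrent buffered reference counting" scheme described above can be modelled in a few lines; this is a toy sketch under heavy simplification (a single plain list stands in for the thread-local buffers, strings stand in for objects, and all names are made up):

```python
# Toy model of buffered reference counting: the mutator's write
# barrier only logs inc/dec events; a collector thread applies them
# later, so the hot path needs no atomic counter updates.
from collections import Counter

log = []  # thread-local in a real runtime; one plain list here

def write_barrier(old, new):
    # Called on every pointer store `slot = new` where `slot` held `old`.
    if new is not None:
        log.append(("inc", new))
    if old is not None:
        log.append(("dec", old))

def process_log():
    # Collector thread: fold the buffered events into real counts.
    counts = Counter()
    for op, obj in log:
        counts[obj] += 1 if op == "inc" else -1
    return counts

write_barrier(None, "a")   # x = a
write_barrier("a", "b")    # x = b
assert dict(process_log()) == {"a": 0, "b": 1}
```

The cost of processing the log scales with the number of pointer stores rather than with heap size, which is the property the post highlights; cycle detection would run over the candidates whose count was decremented, which this sketch omits.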
Re: More radical ideas about gc and reference counting
Timon Gehr: (Probably more, actually, because it does not provide precision-unfriendly constructs such as undiscriminated unions.) I suggested to add an optional method named "onGC" to unions that if present is called at run-time by the GC to know what's the real type of stored data, to make tracing more precise. Bye, bearophile
Re: More radical ideas about gc and reference counting
On 05/12/2014 02:50 AM, Walter Bright wrote: On 5/11/2014 1:59 PM, Timon Gehr wrote: On 05/11/2014 10:05 PM, Walter Bright wrote: That's clearly an additional benefit of the borrowed pointer notion. But have you examined generated Rust code for the cost of inc/dec? I haven't, but I don't see any way they could avoid this (very expensive) cost without borrowed pointers. Sure, but performance is the additional benefit. One constant theme in this thread, one I find baffling, is the regular dismissal of the performance implications of inc/dec. Irrelevant, I'm not doing that. (And again, reference counting is not the only allocation mechanism in Rust. AFAICT, most allocations use owned pointers.) Borrowed pointers are not necessary to support raw pointers - this can be (and is in some systems) supported by simply wrapping the raw pointer with a dummy reference count. ... I have no idea what this part is trying to bring across. The reason for borrowed pointers is performance. No, it is safety. Raw pointers give you all of the performance. Rust would be non-viable without them. True in that it would fail to meet its design goals. Rust provides a tracing garbage collector as well, so it is at least as viable as D regarding performance of safe memory management. (Probably more, actually, because it does not provide precision-unfriendly constructs such as undiscriminated unions.) I strongly suggest writing a snippet in [[insert your favorite proven technology RC language here]] and disassembling the result, and have a look at what inc/dec entails. ... I don't have trouble seeing the cost of reference counting. (And it's you who claimed that this is going to be the only memory allocation scheme in use in Rust code.) The thing is, if the compiler is capable of figuring out these lifetimes by examining the code, There are explicit lifetime annotations in function signatures. Yes, because the compiler cannot figure it out itself, so the programmer has to annotate. ... 
You are saying 'if the compiler is capable of figuring out these lifetimes by examining the code, then ...' and then you are saying that the compiler is incapable of figuring them out itself. What is it that we are arguing here? Are you saying the Rust compiler should infer all memory management automatically or that it cannot possibly do that, or something else? It is simply not true that type systems are inherently restricted to checking trivial properties. They can be made as strong as mathematical logic without much fuss. Again, Rust would not need borrowed pointers nor the annotations for them if this knowledge could be deduced by the compiler. Heck, if the compiler can deduce lifetimes accurately, It does not deduce anything. It checks that borrowed pointers do not outlive their source. Lifetime parameters are used to transport the required information across function signatures. you can get rid of GC and RC, and just have the compiler insert malloc/free in the right spots. ... That's a form of GC, and I already acknowledged that global region inference exists, but noted that this is not what is used. Note that there is a Java version that does this partway, sometimes it will replace a GC object with a stack allocated one if it is successful in deducing that the object lifetime does not exceed the lifetime of the function. ... I know. Also, inference is harder to control and less efficient than simply making the relevant information part of type signatures. http://en.wikipedia.org/wiki/Region-based_memory_management#Region_inference "This work was completed in 1995[9] and integrated into the ML Kit, a version of ML based on region allocation in place of garbage collection. This permitted a direct comparison between the two on medium-sized test programs, yielding widely varying results ("between 10 times faster and four times slower") depending on how "region-friendly" the program was; compile times, however, were on the order of minutes." 
Yes, one is & and the other is @. No, actually currently one is & and the other is RC AFAIK. Then Rust changed again. The document I read on borrowed pointers was likely out of date, though it had no date on it. ... Yes, most documents on Rust are at least slightly out of date. RC is not more general. It cannot refer to stack-allocated data, for instance. So there is no general pointer type that has an unbounded lifetime? ... How can it be general and have an unbounded lifetime and be safe? Sure, borrowing is very lightweight, but ultimately what is most important is that it solves the problem of multiple incompatible pointer types and makes the type system more expressive as well. Adding more pointer types makes a type system more expressive, by definition. ... No. Also, this is irrelevant, because I was highlighting the _importance_ of the fact that it does in this case. A function that uses none o
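The claim that lifetime parameters "transport the required information across function signatures" looks like this in modern Rust syntax; a hedged sketch, since the 2014-era syntax under discussion differs:

```rust
// The signature promises: the returned borrow lives no longer than
// 'a, i.e. no longer than the shorter-lived of the two inputs. The
// compiler checks callers against this promise instead of inferring
// lifetimes globally from function bodies.
fn longer<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() >= y.len() { x } else { y }
}

fn main() {
    let s = String::from("hello");
    let t = String::from("hi");
    let r = longer(&s, &t); // fine: s and t both outlive r
    assert_eq!(r, "hello");

    // Rejected if uncommented: the borrow would outlive its source.
    // fn bad() -> &'static str { let local = String::new(); &local }
}
```

This is the checking-not-inference distinction made above: the annotation is part of the interface, so the body of `longer` and its callers can be verified independently.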
Re: More radical ideas about gc and reference counting
On Monday, 12 May 2014 at 10:51:33 UTC, Marco Leise wrote:

Time will tell if all "well written" D libraries will be @nogc to move the question of allocations to the user.

If there were such a thing as "weakly @nogc" then we could do this:

//some library function
void foo(OR, IR)(OR o, IR i) @weak-nogc
{
    //take things from i and put them in o
}

Allocations would be possible during the execution of foo, but *only* in the implementations of OR and IR, which means that the developer gets the control and guarantees they need, but doesn't have to explicitly pre-allocate (which might not even be possible). I don't see how it would work with UFCS though...
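For comparison, the same "allocation only in caller-supplied code" shape can already be expressed structurally (though not as a checked attribute) in Rust; a hedged sketch with made-up trait names `Input`/`Output`:

```rust
// `fill` itself performs no allocation; any allocation happens only
// inside the caller-supplied Input/Output implementations -- the
// same division of responsibility the "weakly @nogc" idea describes.
trait Input { fn next(&mut self) -> Option<i32>; }
trait Output { fn put(&mut self, v: i32); }

fn fill<I: Input, O: Output>(i: &mut I, o: &mut O) {
    // take things from i and put them in o -- no allocation here
    while let Some(v) = i.next() {
        o.put(v);
    }
}

// An allocation-free input: counts from .0 up to (excluding) .1.
struct Range(i32, i32);
impl Input for Range {
    fn next(&mut self) -> Option<i32> {
        if self.0 < self.1 { self.0 += 1; Some(self.0 - 1) } else { None }
    }
}

// This output *does* allocate (Vec growth) -- which is the point:
// the allocation decision lives with the caller, not with `fill`.
struct VecOut(Vec<i32>);
impl Output for VecOut {
    fn put(&mut self, v: i32) { self.0.push(v); }
}

fn main() {
    let mut i = Range(0, 3);
    let mut o = VecOut(Vec::new());
    fill(&mut i, &mut o);
    assert_eq!(o.0, vec![0, 1, 2]);
}
```

The difference is that Rust gives no compiler-enforced guarantee here; the discipline is by convention, which is exactly what a "weak @nogc" attribute would add.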
Re: More radical ideas about gc and reference counting
Am Mon, 12 May 2014 09:32:58 + schrieb "Paulo Pinto" : > On Monday, 12 May 2014 at 09:05:39 UTC, John Colvin wrote: > > On Monday, 12 May 2014 at 08:45:56 UTC, Walter Bright wrote: > >> On 5/12/2014 12:12 AM, Manu via Digitalmars-d wrote: > >>> What? You've never offered me a practical solution. > >> > >> I have, you've just rejected them. > >> > >> > >>> What do I do? > >> > >> 1. you can simply do C++ style memory management. > >> shared_ptr<>, etc. > >> > >> 2. you can have the non-pausible code running in a thread that > >> is not registered with the gc, so the gc won't pause it. This > >> requires that this thread not allocate gc memory, but it can > >> use gc memory allocated by other threads, as long as those > >> other threads retain a root to it. > >> > >> 3. D allows you to create and use any memory management scheme > >> you want. You are simply not locked into GC. For example, I > >> rewrote my Empire game into D and it did not do any allocation > >> at all - no GC, not even malloc. I know that you'll need to do > >> allocation, I'm just pointing out that GC allocations and > >> pauses are hardly inevitable. > >> > >> 4. for my part, I have implemented @nogc so you can track down > >> gc usage in code. I have also been working towards refactoring > >> Phobos to eliminate unnecessary GC allocations and provide > >> alternatives that do not allocate GC memory. Unfortunately, > >> these PR's just sit there. > >> > >> 5. you can divide your app into multiple processes that > >> communicate via interprocess communication. One of them > >> pausing will not pause the others. You can even do things like > >> turn off the GC collections in those processes, and when they > >> run out of memory just kill them and restart them. (This is > >> not an absurd idea, I've heard of people doing that > >> effectively.) > >> > >> 6. If you call C++ libs, they won't be allocating memory with > >> the D GC. D code can call C++ code. 
If you run those C++ libs > >> in separate threads, they won't get paused, either (see (2)). > >> > >> 7. The Warp program I wrote avoids GC pauses by allocating > >> ephemeral memory with malloc/free, and (ironically) only using > >> GC for persistent data structures that should never be free'd. > >> Then, I just turned off GC collections, because they'd never > >> free anything anyway. > >> > >> 8. you can disable and enable collections, and you can cause > >> collections to be run at times when nothing is happening (like > >> when the user has not input anything for a while). > >> > >> > >> The point is, the fact that D has 'new' that allocates GC > >> memory simply does not mean you are obliged to use it. The GC > >> is not going to pause your program if you don't allocate with > >> it. Nor will it ever run a collection at uncontrollable, > >> random, asynchronous times. > > > > The only solutions to the libraries problem that I can see here > > require drastic separation of calls to said libraries from any > > even vaguely time critical code. This is quite restrictive. > > > > Yes, calling badly optimised libraries from a hot loop is a bad > > idea anyway, but the GC changes this from > > > > "well it might take a little more time than usual, but we can > > spare a few nano-seconds and it'll show up easily in the > > profiler" > > > > to > > > > "it might, sometimes, cause the GC to run a full collection on > > our 3.96 / 4.00 GB heap with an associated half-second pause." > > > > And here we go again, "I can't use that library, it's memory > > management scheme is incompatible with my needs, I'll have to > > rewrite it myself..." > > A badly placed malloc() in library code can also trigger OS > virtualization mechanisms and make processes being swapped out to > disk, with the respective overhead in disk access and time spent > on kernel code. > > So it is just not the "we can spare a few nano-seconds". 
> > -- > Paulo

Yes, it could easily extend to a longer wait. I think we all know programs that hang while the system is swapping out. Don't let it get to that! A PC game would typically reduce caches or texture resolutions before running out of RAM. Linux has a threshold of free pages it tries to keep available at any time to satisfy occasional small allocations.

http://www.science.unitn.it/~fiorella/guidelinux/tlk/node39.html

All in all, malloc is less likely to cause long pauses. It just allocates and doesn't ask itself if there might be dead memory to salvage to satisfy a request. Time will tell if all "well written" D libraries will be @nogc to move the question of allocations to the user.

-- Marco
Re: More radical ideas about gc and reference counting
On Sunday, 11 May 2014 at 21:43:06 UTC, sclytrack wrote:

I like this owner/unique, borrow thing.

@ is managed (currently reference counted)
~ is owner
& is borrow

I like it too. But a few notes:

1) The managed pointer @T has been deprecated and you should use the standard library types Gc and Rc instead.

2) The owned pointer ~T has been largely removed from the language and you should use the standard library type Box instead.

The basic idea is that if a function needs to have ownership of its argument, the function should take its argument by value. And if the function doesn't need the ownership, it should take its argument either by a mutable or immutable reference (they don't like to call it "borrowed pointer" anymore; it's called simply a "reference" now). Owned types get moved by default when you pass them to a function that takes its argument by value. You call the 'clone' method to make a copy of a variable of an owned type.
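A minimal sketch of those by-value/move/clone rules in (post-`~T`) Rust; the `consume` helper is hypothetical:

```rust
// Box<T> owns its heap allocation. Passing it by value *moves*
// ownership into the callee; an explicit .clone() is the only way
// to get a copy of an owned value.
fn consume(b: Box<i32>) -> i32 {
    *b // `b` is dropped (freed) when this function returns
}

fn main() {
    let owned = Box::new(41);
    let copy = owned.clone(); // explicit deep copy; never implicit
    let v = consume(owned);   // `owned` is moved here; using `owned`
                              // afterwards would be a compile error
    assert_eq!(v, 41);
    assert_eq!(*copy, 41);
}
```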
Re: More radical ideas about gc and reference counting
On Mon, 12 May 2014 01:54:58 -0700, Walter Bright wrote:

> On 5/11/2014 10:57 PM, Marco Leise wrote:
> > On Sun, 11 May 2014 17:50:25 -0700, Walter Bright wrote:
> >
> >> As long as those pointers don't escape. Am I right in that one cannot
> >> store a borrowed pointer into a global data structure?
> >
> > Right, and that's the point and entirely positive-to-do™.
>
> This means that a global data structure in Rust has to decide what memory
> allocation scheme its contents must use, and cannot (without tagging) mix
> memory allocation schemes.
>
> For example, let's say a compiler has internally a single hash table of
> strings. With a GC, those strings can be statically allocated, or on the
> GC heap, or anything with a lifetime longer than the table's. But I don't
> see how this could work in Rust. :(

Good question. I have no idea.

-- Marco
Re: More radical ideas about gc and reference counting
On Sun, 11 May 2014 22:11:28 -0700, Walter Bright wrote:

> > But I thought ARC cannot be designed without GC to resolve cycles.
>
> It can be, there are various schemes to deal with that, including "don't
> create cycles". GC is just one of them.
>
> http://en.wikipedia.org/wiki/Reference_counting#Dealing_with_reference_cycles

Yes, that article mentions:

a) "avoid creating them"
b) "explicitly forbid reference cycles"
c) "Judicious use of weak references"
d) "manually track that data structure's lifetime"
e) "tracing garbage collector"
f) adding to a root list all objects whose reference count is decremented to a non-zero value and periodically searching all objects reachable from those roots.

To pick up your statement again: »Your proposal still relies on a GC to provide the memory safety, […] it is a hybrid ARC/GC system.«

a) and b): let's assume never creating cycles is not a feasible option in a systems programming language
c) and d): don't provide said memory safety
e) and f): ARE tracing garbage collectors

ergo: »But I thought ARC cannot be designed without GC to resolve cycles.«

You were arguing against Michel Fortin's proposal on the surface, when your requirement cannot even be fulfilled theoretically, it seems. Which could mean that you don't like the idea of replacing D's GC with an ARC solution.

»This is the best horse I could find for the price. It is pretty fast and ...«
»No, it still has four legs.«

-- Marco
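For the record, option (c) is roughly what Rust's standard library settled on: `Rc` for strong counts plus `Weak` for back edges. A small sketch with a hypothetical parent/child `Node` type, showing that the weak back-reference keeps the pair acyclic so plain reference counting frees everything with no tracing collector:

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// The child points back at its parent through Weak, so the
// parent<->child pair forms no strong cycle.
struct Node {
    value: i32,
    parent: RefCell<Weak<Node>>,
    children: RefCell<Vec<Rc<Node>>>,
}

fn main() {
    let parent = Rc::new(Node {
        value: 1,
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(Vec::new()),
    });
    let child = Rc::new(Node {
        value: 2,
        parent: RefCell::new(Rc::downgrade(&parent)),
        children: RefCell::new(Vec::new()),
    });
    parent.children.borrow_mut().push(child.clone());

    // The back edge is weak, so the parent's strong count stays at 1:
    // when `parent` goes out of scope, the whole structure is freed.
    assert_eq!(Rc::strong_count(&parent), 1);
    assert_eq!(Rc::strong_count(&child), 2);
    assert_eq!(child.parent.borrow().upgrade().unwrap().value, 1);
}
```

This matches point c)/d) in the taxonomy above: safe (upgrading a dead `Weak` yields `None` rather than a dangling pointer), but the programmer must decide which edges are weak.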
Re: The Current Status of DQt
On Fri, 09 May 2014 09:56:09 +, Kagamin via Digitalmars-d wrote:

> > Please see this public service announcement:
> > http://xkcd.com/1179/
>
> It lists 20130227 as a discouraged format, but it's a valid ISO 8601
> format, and Phobos' Date.toISOString generates strings in that format:
> http://dlang.org/phobos/std_datetime.html#.Date.toISOString

Yes, it's supported, because it's standard, but it's preferred that toISOExtString be used precisely because the non-extended format is not only discouraged, but harder to read (which is probably why it's discouraged).

- Jonathan M Davis
Re: More radical ideas about gc and reference counting
On Monday, 12 May 2014 at 09:05:39 UTC, John Colvin wrote:

On Monday, 12 May 2014 at 08:45:56 UTC, Walter Bright wrote:

On 5/12/2014 12:12 AM, Manu via Digitalmars-d wrote:

What? You've never offered me a practical solution.

I have, you've just rejected them.

What do I do?

1. you can simply do C++ style memory management. shared_ptr<>, etc.

2. you can have the non-pausible code running in a thread that is not registered with the gc, so the gc won't pause it. This requires that this thread not allocate gc memory, but it can use gc memory allocated by other threads, as long as those other threads retain a root to it.

3. D allows you to create and use any memory management scheme you want. You are simply not locked into GC. For example, I rewrote my Empire game into D and it did not do any allocation at all - no GC, not even malloc. I know that you'll need to do allocation, I'm just pointing out that GC allocations and pauses are hardly inevitable.

4. for my part, I have implemented @nogc so you can track down gc usage in code. I have also been working towards refactoring Phobos to eliminate unnecessary GC allocations and provide alternatives that do not allocate GC memory. Unfortunately, these PR's just sit there.

5. you can divide your app into multiple processes that communicate via interprocess communication. One of them pausing will not pause the others. You can even do things like turn off the GC collections in those processes, and when they run out of memory just kill them and restart them. (This is not an absurd idea, I've heard of people doing that effectively.)

6. If you call C++ libs, they won't be allocating memory with the D GC. D code can call C++ code. If you run those C++ libs in separate threads, they won't get paused, either (see (2)).

7. The Warp program I wrote avoids GC pauses by allocating ephemeral memory with malloc/free, and (ironically) only using GC for persistent data structures that should never be free'd. Then, I just turned off GC collections, because they'd never free anything anyway.

8. you can disable and enable collections, and you can cause collections to be run at times when nothing is happening (like when the user has not input anything for a while).

The point is, the fact that D has 'new' that allocates GC memory simply does not mean you are obliged to use it. The GC is not going to pause your program if you don't allocate with it. Nor will it ever run a collection at uncontrollable, random, asynchronous times.

The only solutions to the libraries problem that I can see here require drastic separation of calls to said libraries from any even vaguely time-critical code. This is quite restrictive.

Yes, calling badly optimised libraries from a hot loop is a bad idea anyway, but the GC changes this from

"well it might take a little more time than usual, but we can spare a few nano-seconds and it'll show up easily in the profiler"

to

"it might, sometimes, cause the GC to run a full collection on our 3.96 / 4.00 GB heap with an associated half-second pause."

And here we go again, "I can't use that library, its memory management scheme is incompatible with my needs, I'll have to rewrite it myself..."

A badly placed malloc() in library code can also trigger OS virtualization mechanisms and cause processes to be swapped out to disk, with the respective overhead in disk access and time spent on kernel code.

So it is just not the "we can spare a few nano-seconds".

-- Paulo
Re: More radical ideas about gc and reference counting
On Monday, 12 May 2014 at 09:05:39 UTC, John Colvin wrote:

On Monday, 12 May 2014 at 08:45:56 UTC, Walter Bright wrote:

The only solutions to the libraries problem that I can see here require drastic separation of calls to said libraries from any even vaguely time-critical code. This is quite restrictive.

I think this is more of a library-writing culture problem than an engineering problem. A high-quality library shouldn't rely on any internal allocations at all, deferring this decision to user code. Otherwise you will eventually have problems, GC or not.
Re: More radical ideas about gc and reference counting
On Monday, 12 May 2014 at 08:45:56 UTC, Walter Bright wrote:

On 5/12/2014 12:12 AM, Manu via Digitalmars-d wrote:

What? You've never offered me a practical solution.

I have, you've just rejected them.

What do I do?

1. you can simply do C++ style memory management. shared_ptr<>, etc.

2. you can have the non-pausible code running in a thread that is not registered with the gc, so the gc won't pause it. This requires that this thread not allocate gc memory, but it can use gc memory allocated by other threads, as long as those other threads retain a root to it.

3. D allows you to create and use any memory management scheme you want. You are simply not locked into GC. For example, I rewrote my Empire game into D and it did not do any allocation at all - no GC, not even malloc. I know that you'll need to do allocation, I'm just pointing out that GC allocations and pauses are hardly inevitable.

4. for my part, I have implemented @nogc so you can track down gc usage in code. I have also been working towards refactoring Phobos to eliminate unnecessary GC allocations and provide alternatives that do not allocate GC memory. Unfortunately, these PR's just sit there.

5. you can divide your app into multiple processes that communicate via interprocess communication. One of them pausing will not pause the others. You can even do things like turn off the GC collections in those processes, and when they run out of memory just kill them and restart them. (This is not an absurd idea, I've heard of people doing that effectively.)

6. If you call C++ libs, they won't be allocating memory with the D GC. D code can call C++ code. If you run those C++ libs in separate threads, they won't get paused, either (see (2)).

7. The Warp program I wrote avoids GC pauses by allocating ephemeral memory with malloc/free, and (ironically) only using GC for persistent data structures that should never be free'd. Then, I just turned off GC collections, because they'd never free anything anyway.

8. you can disable and enable collections, and you can cause collections to be run at times when nothing is happening (like when the user has not input anything for a while).

The point is, the fact that D has 'new' that allocates GC memory simply does not mean you are obliged to use it. The GC is not going to pause your program if you don't allocate with it. Nor will it ever run a collection at uncontrollable, random, asynchronous times.

The only solutions to the libraries problem that I can see here require drastic separation of calls to said libraries from any even vaguely time-critical code. This is quite restrictive.

Yes, calling badly optimised libraries from a hot loop is a bad idea anyway, but the GC changes this from

"well it might take a little more time than usual, but we can spare a few nano-seconds and it'll show up easily in the profiler"

to

"it might, sometimes, cause the GC to run a full collection on our 3.96 / 4.00 GB heap with an associated half-second pause."

And here we go again, "I can't use that library, its memory management scheme is incompatible with my needs, I'll have to rewrite it myself..."
Re: More radical ideas about gc and reference counting
Walter Bright:

But I don't see how this could work in Rust.

Ask it to competent Rust developers/programmers.

Bye,
bearophile
Re: More radical ideas about gc and reference counting
On 5/11/2014 10:57 PM, Marco Leise wrote:

On Sun, 11 May 2014 17:50:25 -0700, Walter Bright wrote:

As long as those pointers don't escape. Am I right in that one cannot store a borrowed pointer into a global data structure?

Right, and that's the point and entirely positive-to-do™.

This means that a global data structure in Rust has to decide what memory allocation scheme its contents must use, and cannot (without tagging) mix memory allocation schemes.

For example, let's say a compiler has internally a single hash table of strings. With a GC, those strings can be statically allocated, or on the GC heap, or anything with a lifetime longer than the table's. But I don't see how this could work in Rust.
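One hedged sketch of how the hash-table example might be handled in Rust without a GC: make the long-lived table own its keys, using `Cow<'static, str>` so statically allocated and heap-allocated strings can coexist in a single table. The `intern` helper below is made up for illustration:

```rust
use std::borrow::Cow;
use std::collections::HashMap;

// Intern a string, assigning each distinct key a stable small id.
// Cow<'static, str> is either a &'static str (statically allocated,
// zero-copy) or an owned heap String -- two "allocation schemes"
// mixed in one table without tagging by the user.
fn intern(table: &mut HashMap<Cow<'static, str>, u32>, s: Cow<'static, str>) -> u32 {
    let next = table.len() as u32;
    *table.entry(s).or_insert(next)
}

fn main() {
    let mut table: HashMap<Cow<'static, str>, u32> = HashMap::new();
    // A static string and a heap string coexist in the same table.
    let a = intern(&mut table, Cow::Borrowed("static-str"));
    let b = intern(&mut table, Cow::Owned(String::from("heap-str")));
    let a2 = intern(&mut table, Cow::Borrowed("static-str"));
    assert_eq!(a, a2); // interning is stable
    assert_ne!(a, b);
}
```

What this does not allow is storing a *borrow* of shorter lifetime in the table, which is exactly the restriction Walter points out; the Rust answer is to transfer or share ownership (`String`, `Rc<str>`, `Cow`) rather than borrow.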