http://d.puremagic.com/issues/show_bug.cgi?id=3463
Sean Cavanaugh <worksonmymach...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |worksonmymach...@gmail.com

--- Comment #127 from Sean Cavanaugh <worksonmymach...@gmail.com> 2011-04-15 01:40:35 PDT ---
(In reply to comment #120)
> (In reply to comment #118)
> > True, and it works tolerably well. To do a moving gc, however, you need more
> > precise information.
>
> I don't want a moving GC. I want a fast GC.
>
> ("I" in this context means D users with the same requirements, mainly video
> game developers.)
>
> I understand the advantages of a moving GC - heap compaction allowing for an
> overall smaller managed heap etc., but I hope you understand that sacrificing
> speed for these goals is not an unilateral improvement for everyone.

I am a game developer, and this thread is fairly fascinating to me, as memory management and good support for Intel SSE2 (and AVX) or PowerPC VMX are two of the biggest issues to me when considering alternative languages, or the question of 'will this language be suitable in the future'. The SSE problem seems workable with extern "C" C++ DLL code to handle the heavy math, which leaves the GC as a big 'what does this mean' when evaluating the landscape.

The reality is a lot of game engines allocate a surprising amount of memory at run time. The speed of malloc itself is rarely an issue, as most searches take a reasonably similar amount of time. The real problems with heavy use of malloc are thread lock contention in the allocator, and fragmentation. Fragmentation causes two problems: large allocation failures when memory is low (say a 1 MB allocation failing when 30 MB is 'free'), and virtual pages that cannot be reclaimed due to a stray allocation or two within the page. Lock contention is solved by making separate heaps.
Fragmentation is also fought by separating the heaps, but with the allocations organized coherently, either time-wise or by allocation type, where like-sized objects are pooled into a special pool for objects of that size. As a bonus, fixed-size object pools have constant time for allocation, except when the pool has to grow, but we try real hard to pre-size these to the worst-case values. On my last project we had about 8 dlmalloc-based heaps and 15 fixed-size allocator pools to solve these problems.

I would greatly prefer a GC to compact the heap to keep the peak memory down, because in embedded (console) environments memory is a constant but time is fungible. VM might be available on these environments, but it isn't going to be backed by disk. Instead the idea of the VM is that it is a tool to fight fragmentation of the underlying physical pages, and to help you get contiguous space to work with. There is also pressure to use larger pages (64k, 1MB, 4MB) to keep the TLB lookups fast, which hurts even more with fragmentation. Tiny allocations holding onto these big pages prevent them from being reclaimed, which makes getting those allocations moved somewhere better pretty important.

Now the good news is a huge amount of resources in a game do not need to be allocated into a garbage collected space. For the most part, any data you send to the GPU is far better off being written into its memory system and left alone. Physics data and audio data have similar behaviors for the most part and can be allocated through malloc or aligned forms of malloc (for SSE friendliness).

So from a game developer's point of view, I need to know when the GC will run, either by configuration or by manually driving it. Both allow me to run a frame with most of the AI and physics disabled to give more of the time to the collector.
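
To make the fixed-size pool point from earlier in this comment concrete, here is a minimal C++ sketch of the general technique, not the allocator from any particular engine. The name `FixedPool` and its interface are made up for illustration. It assumes the pool is pre-sized up front and never grows, and it is not thread safe, so it would sit behind a per-thread or per-system heap.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical fixed-size pool: every block has the same size, and free
// blocks are chained through an intrusive singly linked free list.
class FixedPool {
public:
    FixedPool(std::size_t block_size, std::size_t block_count)
        : block_size_(round_up(block_size)),
          storage_(block_size_ * block_count),
          free_head_(nullptr),
          free_count_(0) {
        // Thread every block onto the free list, last block first, so the
        // first allocation returns the lowest address.
        for (std::size_t i = block_count; i-- > 0;) {
            push(storage_.data() + i * block_size_);
        }
    }

    // The free list points into storage_, so copying would be unsafe.
    FixedPool(const FixedPool&) = delete;
    FixedPool& operator=(const FixedPool&) = delete;

    // O(1): pop the head of the free list. Returns nullptr when the pool
    // is exhausted, because this sketch never grows.
    void* allocate() {
        if (free_head_ == nullptr) return nullptr;
        Node* n = free_head_;
        free_head_ = n->next;
        --free_count_;
        return n;
    }

    // O(1): push the block back onto the free list.
    void deallocate(void* p) {
        assert(owns(p));
        push(p);
    }

    std::size_t free_blocks() const { return free_count_; }
    std::size_t block_size() const { return block_size_; }

private:
    struct Node { Node* next; };

    // Blocks must be able to hold a Node and stay maximally aligned.
    static std::size_t round_up(std::size_t n) {
        const std::size_t a = alignof(std::max_align_t);
        if (n < sizeof(Node)) n = sizeof(Node);
        return (n + a - 1) / a * a;
    }

    // True if p is the start of a block inside this pool's storage.
    bool owns(const void* p) const {
        const unsigned char* b = static_cast<const unsigned char*>(p);
        const unsigned char* lo = storage_.data();
        return b >= lo && b < lo + storage_.size() &&
               static_cast<std::size_t>(b - lo) % block_size_ == 0;
    }

    void push(void* p) {
        free_head_ = new (p) Node{free_head_};
        ++free_count_;
    }

    std::size_t block_size_;
    std::vector<unsigned char> storage_;
    Node* free_head_;
    std::size_t free_count_;
};
```

Because freed blocks go back to the head of the list, the most recently freed block is the next one handed out, which tends to keep the hot blocks in cache. Since all blocks are the same size, the pool cannot fragment internally; the cost is that the worst-case count has to be known in advance.
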

A panic execution GC pass that I wasn't expecting is acceptable, provided I get notified of it, as I would expect this to be an indicator that memory is getting tight to the point where an Out of Memory is imminent. A panic GC is a QA problem, as we can tell them where and how often they are occurring, and they can in turn tell the designers making the art/levels that they need to start trimming the memory usage a bit.

Ideally the GC would be able to run in less time than a single frame (say 10-15ms for a 30fps game). Taking away some amount of time every frame is also acceptable. For example, spending 1ms of every frame to do 1ms worth of data movement or analysis for compacting would be a reasonable thing to allow, even if it was in addition to the multi-millisecond spikes at some time interval (30 frames, 30 seconds, whatever).

Making the whole thing friendly to having lots of CPU cores wouldn't hurt either.
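
The "1ms of every frame" idea can be sketched as a queue of small work steps that the game loop drains under a time budget. This is only the scheduling shape, written in C++ with a made-up `FrameBudgetQueue` class; it is not a collector, and it assumes the incremental work (data movement, analysis) can be cut into steps that are each much shorter than the budget.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical queue of small, independent work steps that the game loop
// drains a little at a time.
class FrameBudgetQueue {
public:
    using Clock = std::chrono::steady_clock;
    using Step = std::function<void()>;

    void push(Step step) {
        assert(step && "step must be callable");
        steps_.push_back(std::move(step));
    }

    std::size_t pending() const { return steps_.size(); }

    // Runs queued steps until the queue is empty or the budget is spent.
    // The clock is checked before each step, so an overrun is bounded by
    // the length of a single step.
    std::size_t run_for(Clock::duration budget) {
        const Clock::time_point deadline = Clock::now() + budget;
        std::size_t done = 0;
        while (!steps_.empty() && Clock::now() < deadline) {
            Step step = std::move(steps_.front());
            steps_.pop_front();
            step();
            ++done;
        }
        return done;
    }

private:
    std::deque<Step> steps_;
};
```

In a game loop this would be called once per frame, for example `queue.run_for(std::chrono::milliseconds(1))`, with a larger budget on the occasional frame where AI and physics are turned down. Whatever is still pending carries over to the next frame.
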