Upcoming ACM Lecture on D next Tuesday at George Mason University
http://events.insidenova.com/fairfax-va/events/show/180300086-dc-acm-the-d-programming-language-with-walter-bright

See you there, all you D.C.-area people!
Re: OOP, faster data layouts, compilers
Many thanks for the links; they lead to some very nice discussions. Especially the link below, which you can reach from your first link:
http://c0de517e.blogspot.com/2011/04/2011-current-and-future-programming.html

But as far as game development is concerned, D2 might already be too late. I know a bit about it, since I live a bit in that part of the universe.

Due to XNA (Windows and Xbox 360), Mono/Unity, and now WP7, many game studios have started to move their tooling to C#. And some of them are nowadays even using it for the server-side code.

Java used to have a foot there, especially due to J2ME game development, with a small push thanks to Android. That decreased once Google made the NDK available.

If one day Microsoft really sets C# free, the same way AT&T somehow did with C and C++, then C# might actually be the next C++, at least as far as game development is concerned.

And the dependency on a JIT environment is an implementation issue. The Bartok compiler in Singularity compiles to native code, and Mono also provides a similar option.

So who knows?

--
Paulo

bearophile bearophileh...@lycos.com wrote in message news:ioqdhe$2030$1...@digitalmars.com...
> Through Reddit I've found a set of wordy slides, "Design for Performance", on designing efficient game code:
> http://www.scribd.com/doc/53483851/Design-for-Performance
> http://www.reddit.com/r/programming/comments/guyb2/designing_code_for_performance/
>
> The slides touch on many small topics, like the need for prefetching, design for cache-aware code, etc. One of the main topics is how to better lay out data structures in memory for modern CPUs. They show how object-oriented style often leads to collections of little trees, for example arrays of object references (or struct pointers) that refer to objects that contain other references to sub-parts. Iterating over such data structures is not so efficient.
>
> The slides also briefly discuss the difference between creating an array of 2-item structs and a struct that contains two arrays of single native values. If the code needs to scan just one of those two fields, then the struct that contains the two arrays is faster.
>
> Similar topics were discussed better in "Pitfalls of Object Oriented Programming" (2009):
> http://research.scee.net/files/presentations/gcapaustralia09/Pitfalls_of_Object_Oriented_Programming_GCAP_09.pdf
>
> In my opinion, if D2 has some success then one of its significant uses will be writing fast games, so the design/performance concerns expressed in those two sets of slides need to be important for D's design.
>
> D probably allows data to be laid out in memory as shown in those slides, but I'd like some help from the compiler too. I don't think compilers will soon be able to turn an immutable binary tree into an array to speed up its repeated scanning, but maybe there are ways to express semantics in the code that will allow future, smarter compilers to perform some of those memory layout optimizations, like transposing arrays.
>
> A possible idea is a @no_inbound_pointers annotation that forbids taking the address of the items, and allows the compiler to modify the data layout a little.
>
> Bye,
> bearophile
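The two layouts compared in the quoted post can be sketched in D as follows. This is only an illustration under stated assumptions: the names `Particle`, `Particles`, `sumXAoS` and `sumXSoA` are hypothetical and do not come from the slides, and the sketch shows the memory layouts only, without measuring anything.

```d
import std.stdio;

// Array of structs (AoS): the x and y of each item are interleaved in memory.
struct Particle
{
    float x;
    float y;
}

// Struct of arrays (SoA): all x values are contiguous, and so are all y values.
struct Particles
{
    float[] x;
    float[] y;
}

// Scans only the x field of the AoS layout; the unused y values
// sit between the x values and are pulled into the cache as well.
float sumXAoS(const(Particle)[] items)
{
    float total = 0; // float defaults to NaN in D, so initialise explicitly
    foreach (ref p; items)
        total += p.x;
    return total;
}

// Scans only the x array of the SoA layout; the memory read is dense.
float sumXSoA(const Particles items)
{
    float total = 0;
    foreach (v; items.x)
        total += v;
    return total;
}

void main()
{
    auto aos = [Particle(1.0f, 10.0f), Particle(2.0f, 20.0f), Particle(3.0f, 30.0f)];
    auto soa = Particles([1.0f, 2.0f, 3.0f], [10.0f, 20.0f, 30.0f]);

    writeln(sumXAoS(aos));
    writeln(sumXSoA(soa));
}
```

Both functions compute the same sum; the difference is only in how much unrelated data is dragged through the cache while scanning one field, which is the effect the slides describe.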
Re: Implementing std.log
I currently use the logger written by Masahiro Nakagawa, and it has handled what I need. You can get it from:
http://www.bitbucket.org/repeatedly/scrap/src/tip/logger.d

Zz

Robert Clipsham Wrote:
> Hey folks,
>
> I've just finished porting my web framework from D1/Tango to D2/Phobos, and in the transition lost logging functionality. As I'll be writing a logging library anyway, I wondered if there'd be interest in a std.log? If so, is there a current logging library we would like it to be based on, or should we design from scratch?
>
> I know there has been discussion about Google's http://google-glog.googlecode.com/svn/trunk/doc/glog.html and another candidate may be http://logging.apache.org/log4j/ . Do we want a comprehensive logging library, or just the basics? (Possibly with some method for extension if needed.)
>
> --
> Robert
> http://octarineparrot.com/
Re: std.parallelism: VOTE IN THIS THREAD
YES, it is a step in the right direction. I have some comments, but I will put them in another thread.
Re: opDispatch, duck typing, and error messages
On 04/22/2011 12:24 AM, Adam D. Ruppe wrote:
> I just made an innocent little change to one of my programs, hit compile, and got this vomit:
>
> /home/me/d/dmd2/linux/bin/../../src/phobos/std/conv.d(97): Error: template std.conv.toImpl(T,S) if (!implicitlyConverts!(S,T) && isSomeString!(T) && isInputRange!(Unqual!(S)) && isSomeChar!(ElementType!(S))) toImpl(T,S) if (!implicitlyConverts!(S,T) && isSomeString!(T) && isInputRange!(Unqual!(S)) && isSomeChar!(ElementType!(S))) matches more than one template declaration, /home/me/d/dmd2/linux/bin/../../src/phobos/std/conv.d(185):toImpl(T,S) if (isSomeString!(T) && !isSomeChar!(ElementType!(S)) && (isInputRange!(S) || isInputRange!(Unqual!(S)))) and /home/me/d/dmd2/linux/bin/../../src/phobos/std/conv.d(289):toImpl(T,S) if (is(S : Object) && isSomeString!(T))
>
> Whoa... it took a bit to figure out what it was saying.
>
> The bottom line: one of my classes matched both Object and isInputRange because it offers an unrestricted opDispatch.
> [...]
> Things I think would help:
> [...]
> Or, there's a whole new approach:
>
> e) Duck typing for ranges in to!() might be a bad idea. Again, remember, a class might legitimately offer a range interface, so it would trigger this message without opDispatch.

Maybe we could replace template constraints, esp. the 'is' stuff, with (structural) interfaces. The difference, in my view, is that a structural interface is a compile-time/static feature, while duck typing is runtime/dynamic.

> If ranges are meant to be structs, maybe isInputRange should check is(T == struct)? This doesn't sit right with me though. The real problem is to!() - other range functions probably don't overload on classes separately from ranges, so it won't matter there.
>
> I think the best thing to do is simply to prefer Object over range.
>
> toImpl(T) if (isInputRange!(T) && (!is(T : Object)))
>
> Or something along those lines. Why? If the object has its own toString/writeTo methods, it seems fairly obvious to me anyway that to!string ought to simply call them, regardless of what else there is.

Sure. I hit and discussed a similar issue (maybe the same one, in fact). The problem was with the template formatValue, whose constraints:
(1) for structs, ignore a programmer-defined toString in favor of the standard format for ranges
(2) for classes, simply fail because of a conflict (double match)

There's a bug report (search for 'formatValue').

Denis
--
_
vita es estrany
spir.wikidot.com
Re: opDispatch, duck typing, and error messages
On 04/22/2011 01:25 AM, Adam D. Ruppe wrote:
> bearophile wrote:
>> Maybe the nature of exceptions should be changed a little so they store __FILE__ and __LINE__ by default (exceptions without this information are kept on request, for optimization purposes).
>
> I'd like that. Even with a stack trace, it's nice to have that available right at the top.
>
> A while ago, someone posted a stack trace printer for Linux to the newsgroup. Using that, this program:
>
>     void main() { throw new Exception("test"); }
>
>     dmd test60 -debug -g backtrace.d
>
> Prints:
>
>     object.Exception: test
>     ./test60(_Dmain+0x30) [0x807a5e8]
>     ./test60(extern (C) int rt.dmain2.main(int, char**) . void runMain()+0x1a) [0x807d566]
>     ./test60(extern (C) int rt.dmain2.main(int, char**) . void tryExec(void delegate())+0x24) [0x807d4c0]
>     ./test60(extern (C) int rt.dmain2.main(int, char**) . void runAll()+0x32) [0x807d5aa]
>     ./test60(extern (C) int rt.dmain2.main(int, char**) . void tryExec(void delegate())+0x24) [0x807d4c0]
>     ./test60(main+0x96) [0x807d466]
>     /lib/libc.so.6(__libc_start_main+0xe6) [0xf75a5b86]
>     ./test60() [0x807a4e1]
>
> No line or file info! I'd really like to have something there. Though, actually, whether it's in the message or in the stack trace doesn't really matter. As long as it's there somewhere.
>
> Most of my custom exceptions use default params in their constructor to add it. Perhaps the base Exception should too?

Also, the addresses could go (they're useless).

Denis
--
_
vita es estrany
spir.wikidot.com
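The "default params in the constructor" technique mentioned in the quoted text can be sketched like this. The class name `ParseError` is hypothetical, and the sketch assumes a recent druntime in which the base `Exception` constructor accepts `file` and `line` arguments and `Throwable` exposes them as fields; it is not taken from anyone's actual code in this thread.

```d
import std.stdio;

// Hypothetical custom exception. The default arguments __FILE__ and
// __LINE__ are evaluated at the call site, so they describe the place
// where the exception object was constructed, not this constructor.
class ParseError : Exception
{
    this(string msg, string file = __FILE__, size_t line = __LINE__)
    {
        // Assumes the base constructor takes (msg, file, line).
        super(msg, file, line);
    }
}

void main()
{
    try
    {
        throw new ParseError("test");
    }
    catch (ParseError e)
    {
        // Prints the source location first, then the message.
        writefln("%s(%s): %s", e.file, e.line, e.msg);
    }
}
```

Because the location is captured through default arguments, callers do not have to pass anything extra, which is what makes the approach practical for a base class.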
Re: opDispatch, duck typing, and error messages
On 04/22/2011 03:53 AM, Robert Jacques wrote:
> On Thu, 21 Apr 2011 18:24:55 -0400, Adam D. Ruppe destructiona...@gmail.com wrote:
> [snip]
>> I think the best thing to do is simply to prefer Object over range.
>>
>> toImpl(T) if (isInputRange!(T) && (!is(T : Object)))
>>
>> Or something along those lines. Why? If the object has its own toString/writeTo methods, it seems fairly obvious to me anyway that to!string ought to simply call them, regardless of what else there is.
>
> There's actually a bug report regarding the toString vs. range semantics issue; it's issue 5354 ( http://d.puremagic.com/issues/show_bug.cgi?id=5354 ). Also note that classes (but not structs as of yet, see bug 5719) can provide their own to!T conversions. However, what you ran into deserves a new bug report, since to!string should always be able to fall back to toString and it didn't.

Agreed. A programmer who defines toString *means* it to be used for conversion to string (esp. for write* funcs). Please support and vote for this bug ;-)

Denis
--
_
vita es estrany
spir.wikidot.com
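The "prefer Object over range" constraint proposed in the quoted text can be sketched with a pair of overloads. Everything here is hypothetical: `describe` is a stand-in for the idea, not the real `std.conv.toImpl`, and `Countdown` is an invented class that happens to satisfy `isInputRange`.

```d
import std.conv : to;
import std.range.primitives : isInputRange;
import std.stdio;

// Overload for class instances: always goes through toString.
string describe(T)(T value) if (is(T : Object))
{
    return value.toString();
}

// Overload for input ranges. Classes are excluded explicitly, so the
// two constraints can never match the same type at once.
string describe(T)(T value) if (isInputRange!T && !is(T : Object))
{
    string result = "[";
    bool first = true;
    foreach (element; value)
    {
        if (!first)
            result ~= ", ";
        result ~= to!string(element);
        first = false;
    }
    return result ~ "]";
}

// A class that also offers the input range interface.
class Countdown
{
    int n = 3;
    @property bool empty() { return n == 0; }
    @property int front() { return n; }
    void popFront() { --n; }
    override string toString() { return "Countdown"; }
}

void main()
{
    static assert(isInputRange!Countdown);
    writeln(describe(new Countdown)); // resolved by the Object overload
    writeln(describe([1, 2, 3]));     // resolved by the range overload
}
```

Without the `!is(T : Object)` part, a call with `Countdown` would match both overloads and fail with an ambiguity error of the same kind as the one reported in this thread.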
Re: link from a dll to another function in another dll?
That example was a bit incomplete; it was preceded by the following code:

    import std.c.windows.windows;
    import core.dll_helper;

    pragma(lib, "kernel33.lib");

    __gshared HINSTANCE g_hInst;

    // Standard DLL entry point: forwards each notification to the D runtime helpers.
    extern (Windows)
    BOOL DllMain(HINSTANCE hInstance, ULONG ulReason, LPVOID pvReserved)
    {
        switch (ulReason)
        {
            case DLL_PROCESS_ATTACH:
                g_hInst = hInstance;
                dll_process_attach( hInstance, true );
                break;

            case DLL_PROCESS_DETACH:
                dll_process_detach( hInstance, true );
                break;

            case DLL_THREAD_ATTACH:
                dll_thread_attach( true, true );
                break;

            case DLL_THREAD_DETACH:
                dll_thread_detach( true, true );
                break;

            default:
                // No other notification codes are handled.
                break;
        }
        return true;
    }
Re: OOP, faster data layouts, compilers
On 04/22/2011 02:55 AM, Paulo Pinto wrote:
> [...]
> If one day Microsoft really sets C# free, the same way AT&T somehow did with C and C++, then C# might actually be the next C++, at least as far as game development is concerned.
>
> And the dependency on a JIT environment is an implementation issue. The Bartok compiler in Singularity compiles to native code, and Mono also provides a similar option.
>
> So who knows?
>
> --
> Paulo

I don't think C# is the next C++; it's impossible for C# to be what C/C++ is. There is a purpose and a place for interpreted languages like C# and Java, just like there is for C/C++. What language do you think the interpreters for Java and C# are written in? (Hint: it's not Java or C#.) I also don't think that the core of Unity (or any decent game engine) is written in an interpreted language either, which basically means the guts are likely written in either C or C++.

The point being made is that systems programming languages like C/C++ and D are picked for their execution speed, and interpreted languages are picked for their ease of programming (or development speed). Since D is picked for execution speed, we should seriously consider every opportunity to improve in that arena. The OP wasn't just for the game developers, but for game framework developers as well.
Re: OOP, faster data layouts, compilers
On 22.04.2011 18:48, Kai Meyer wrote:
> [...]
> The point being made is that systems programming languages like C/C++ and D are picked for their execution speed, and interpreted languages are picked for their ease of programming (or development speed). Since D is picked for execution speed, we should seriously consider every opportunity to improve in that arena. The OP wasn't just for the game developers, but for game framework developers as well.

IMHO D won't be successful for games as long as it only supports Windows, Linux and OSX on PC(-like) hardware. We'd need support for modern game consoles (Xbox 360, PS3, maybe Wii) and for mobile devices (Android, iOS, maybe Win7 phones and other stuff). This means good PPC support (maybe the PS3's Cell CPU would need special support even though it understands PPC code? I don't know.) and ARM support, plus support for the operating systems and SDKs used on those platforms.

Of course execution speed is very important as well, but D in its current state is not *that* bad in this regard. Sure, the GC is a bit slow, but in high-performance games you shouldn't use it (or even malloc/free) all the time anyway; see http://www.digitalmars.com/d/2.0/memory.html#realtime

Another point: I find Minecraft pretty impressive. It really changed my view of games developed in Java.

Cheers,
- Daniel
Re: OOP, faster data layouts, compilers
On 04/22/2011 11:05 AM, Daniel Gibson wrote:
> [...]
> Another point: I find Minecraft pretty impressive. It really changed my view of games developed in Java.
>
> Cheers,
> - Daniel

Hah, Minecraft. Have you tried loading up a high-resolution texture pack yet? There's a reason why it looks like 8-bit graphics. It's not Java that makes Minecraft awesome, imo :)
Re: OOP, faster data layouts, compilers
On 22.04.2011 19:11, Kai Meyer wrote:
> [...]
> Hah, Minecraft. Have you tried loading up a high-resolution texture pack yet? There's a reason why it looks like 8-bit graphics. It's not Java that makes Minecraft awesome, imo :)

No, I haven't. What I find impressive is this (almost infinitely) big world that is completely changeable, i.e. you can build new stuff everywhere, you can dig tunnels everywhere (ok, somewhere really deep there's a limit), and the game still runs smoothly. I haven't seen something like that in any game before.
Re: OOP, faster data layouts, compilers
On 04/22/2011 11:20 AM, Daniel Gibson wrote:
> [...]
> No, I haven't. What I find impressive is this (almost infinitely) big world that is completely changeable, i.e. you can build new stuff everywhere, you can dig tunnels everywhere (ok, somewhere really deep there's a limit), and the game still runs smoothly. I haven't seen something like that in any game before.

The random world generator is amazing, but that's not about speed. The polygon count of the game is excruciatingly low because the client is smart enough to only draw the faces of blocks that are visible. From the very bottom (bedrock) to the very top of the sky (as high as you can build blocks), the world is 256 blocks tall. The game is full of low-level bit-stuffing (like stacks of 64). The genius of the game is not in any special features of Java; it's in the data structure and data generator, which can be done much faster in other languages. But that raises the question: why does it need to be faster? It is fast enough in the JVM (unless you load up the high-resolution textures, in which case the game becomes unbearably slow when viewing long distances).

The purpose of the original post was to indicate that some low-level research shows that underlying data structures (as applied to video game development) can have an impact on the performance of the application, which D (I think) cares very much about.
Linus with some good observations on garbage collection
http://gcc.gnu.org/ml/gcc/2002-08/msg00552.html
Re: Linus with some good observations on garbage collection
Just a reminder: that post is 9 years old.
Re: Linus with some good observations on garbage collection
On 22/04/2011 19:36, Walter Bright wrote:
> http://gcc.gnu.org/ml/gcc/2002-08/msg00552.html

I've always been surprised that discussions usually bring up garbage collection as the only alternative to explicit manual memory management. I imagined it as a garbage truck that has its own schedule and may let a lot of trash pile up before passing by. I always naively thought: why not just free immediately when an object has no references left?

I'm not an expert, so there may be reasons I don't see, but now that Linus says something along those lines, I'll ask. Why not? Isn't it much easier to do refcount++ and refcount--, and if refcount==0 immediately free()? Memory becomes available for other needs faster, there's no need for an additional thread, no large amount of memory consumed before the advanced garbage truck decides to come by, no slight pauses when collecting trash (maybe only in old implementations), and the implementation is much simpler...

OK, I knew about the cyclic references problem. But Linus doesn't seem to see a big problem, and solutions can be found with care...
Re: Linus with some good observations on garbage collection
This sort of reference counting with a cyclic dependency detector is how a lot of scripting languages do it, or did it in the past. The problem was that lazy generational GCs are believed to have better throughput in general.

I'd like to say "were proved" rather than "are believed", but I don't actually know where to go for such evidence. However, I do believe many scripting languages, such as Python, eventually ditched the reference counting technique for generational collection, and Java has a very fast GC, so I am inclined to believe those real-life solutions rather than Linus.

Mike

On Fri, Apr 22, 2011 at 2:32 PM, Alvaro alvaro.seg...@gmail.com wrote:
> [...]
> Why not? Isn't it much easier to do refcount++ and refcount--, and if refcount==0 immediately free()? [...]
>
> OK, I knew about the cyclic references problem. But Linus doesn't seem to see a big problem, and solutions can be found with care...
Re: Linus with some good observations on garbage collection
On Fri, 22 Apr 2011 14:32:06 -0400, Alvaro alvaro.seg...@gmail.com wrote:
> On 22/04/2011 19:36, Walter Bright wrote:
>> http://gcc.gnu.org/ml/gcc/2002-08/msg00552.html
>
> I've always been surprised that discussions usually bring up garbage collection as the only alternative to explicit manual memory management. I imagined it as a garbage truck that has its own schedule and may let a lot of trash pile up before passing by. I always naively thought: why not just free immediately when an object has no references left?

Because you then have to update potentially two reference counts every time you assign a pointer. GCs save you from doing that.

I know way, way less than Torvalds, but my naive brain says GCs still win because oftentimes slightly noticeable drops in performance are worth having code that doesn't corrupt memory. This may not be true for kernel development, but then again, we aren't all developing kernels ;)

-Steve
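The "two reference counts per assignment" cost can be made concrete with a minimal, single-threaded sketch. `Payload` and `Handle` are hypothetical names invented for this illustration; this is not `std.typecons.RefCounted`, it ignores cycles entirely, and the counter updates are plain (non-atomic) increments, so it is not safe to share between threads.

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical heap object with an embedded reference count.
struct Payload
{
    int value;
    size_t refs;
}

// Minimal reference-counted handle (a sketch under the assumptions above).
struct Handle
{
    private Payload* p;

    this(int value)
    {
        p = cast(Payload*) malloc(Payload.sizeof);
        p.value = value;
        p.refs = 1;
    }

    // Copying a handle bumps the count of the shared payload.
    this(this)
    {
        if (p)
            ++p.refs;
    }

    ~this()
    {
        release();
    }

    // Assignment touches two counts: the new target's goes up,
    // the old target's goes down (and may trigger a free).
    void opAssign(ref Handle rhs)
    {
        Payload* incoming = rhs.p; // read first, so self-assignment is safe
        if (incoming)
            ++incoming.refs;
        release();
        p = incoming;
    }

    size_t count() const
    {
        return p ? p.refs : 0;
    }

    private void release()
    {
        if (p && --p.refs == 0)
            free(p);
        p = null;
    }
}

void main()
{
    auto a = Handle(1);
    auto b = Handle(2);
    auto c = a;              // copy: payload 1 now has two references
    assert(a.count() == 2);

    b = a;                   // payload 2 drops to zero and is freed,
    assert(a.count() == 3);  // payload 1 rises to three
}
```

Every `b = a` performs one increment and one decrement in addition to the pointer copy itself, which is the overhead the reply refers to; a tracing collector pays nothing at assignment time and defers the work to collection.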
Re: OOP, faster data layouts, compilers
Kai Meyer: The purpose of the original post was to indicate that some low level research shows that underlying data structures (as applied to video game development) can have an impact on the performance of the application, which D (I think) cares very much about. The idea of the original post was a bit more complex: how can we invent new/better ways to express semantics in D code that will not forbid future D compilers to perform a bit of changes in the layout of data structures to increase code performance? Complex transforms of the data layout seem too much complex for even a good compiler, but maybe simpler ones will be possible. And I think to do this the D code needs some more semantics. I was suggesting an annotation that forbids inbound pointers, that allows the compiler to move data around a little, but this is just a start. Bye, bearophile
Re: Linus with some good observations on garbage collection
Also add to that the fact that in many cases you're dealing with a threaded environment, so those refcount updates have to be locked (either via mutexes or, more commonly, just atomic operations), which is far more expensive than non-atomic updates. More so when there's actual contention for the refcounted resource.

On 4/22/2011 11:53 AM, Michael Stover wrote:
> This sort of reference counting with a cyclic dependency detector is how a lot of scripting languages do it, or did it in the past. The problem was that lazy generational GCs are believed to have better throughput in general.
> [...]
>
> On Fri, Apr 22, 2011 at 2:32 PM, Alvaro alvaro.seg...@gmail.com wrote:
>> [...]
>> Why not? Isn't it much easier to do refcount++ and refcount--, and if refcount==0 immediately free()? [...]
Re: OOP, faster data layouts, compilers
On Fri, Apr 22, 2011 at 12:31 PM, Kai Meyer k...@unixlords.com wrote: On 04/22/2011 11:20 AM, Daniel Gibson wrote: Am 22.04.2011 19:11, schrieb Kai Meyer: On 04/22/2011 11:05 AM, Daniel Gibson wrote: Am 22.04.2011 18:48, schrieb Kai Meyer: I don't think C# is the next C++; it's impossible for C# to be what C/C++ is. There is a purpose and a place for Interpreted languages like C# and Java, just like there is for C/C++. What language do you think the interpreters for Java and C# are written in? (Hint: It's not Java or C#.) I also don't think that the core of Unity (or any decent game engine) is written in an interpreted language either, which basically means the guts are likely written in either C or C++. The point being made is that Systems Programming Languages like C/C++ and D are picked for their execution speed, and Interpreted Languages are picked for their ease of programming (or development speed). Since D is picked for execution speed, we should seriously consider every opportunity to improve in that arena. The OP wasn't just for the game developers, but for game framework developers as well. IMHO D won't be successful for games as long as it only supports Windows, Linux and OSX on PC (-like) hardware. We'd need support for modern game consoles (XBOX360, PS3, maybe Wii) and for mobile devices (Android, iOS, maybe Win7 phones and other stuff). This means good PPC (maybe the PS3's Cell CPU would need special support even though it's understands PPC code? I don't know.) and ARM support and support for the operating systems and SDKs used on those platforms. Of course execution speed is very important as well, but D in it's current state is not *that* bad in this regard. Sure, the GC is a bit slow, but in high performance games you shouldn't use it (or even malloc/free) all the time, anyway, see http://www.digitalmars.com/d/2.0/memory.html#realtime Another point: I find Minecraft pretty impressive. It really changed my view upon Games developed in Java. 
Cheers, - Daniel Hah, Minecraft. Have you tried loading up a high resolution texture pack yet? There's a reason why it looks like 8-bit graphics. It's not Java that makes Minecraft awesome, imo :) No I haven't. What I find impressive is this (almost infinitely) big world that is completely changeable, i.e. you can build new stuff everywhere, you can dig tunnels everywhere (ok, somewhere really deep there's a limit) and the game still runs smoothly. Haven't seen something like that in any game before. The random world generator is amazing, but it's not speed. The polygon count of the game is excruciatingly low because the client is smart enough to only draw the faces of blocks that are visible. The very bottom (bedrock) and they very top of the sky (as high as you can build blocks) is 256 blocks tall. The game is full of low-level bit-stuffing (like stacks of 64). The genius of the game is not in any special features of Java, it's in the data structure and data generator, which can be done much faster in other languages. But it begs the question, why does it need to be faster? It is fast enough in the JVM (unless you load up the high resolution textures, in which case the game becomes unbearably slow when viewing long distances.) Actually, the world is 128 blocks tall, and divided into 16x128x16 block chunks. To elaborate on the bit stuffing, at the end of the day, each block is 2.5 bytes (type, metadata, and some lighting info) with exceptions for things like chests. The reason Minecraft runs so well in Java, from my point of view, is that the authors resisted the Java urge to throw objects at the problem and instead put everything into large byte arrays and wrote methods to manipulate them. From that perspective, using Java would be about the same as using any language, which let them stick to what they knew without incurring a large performance penalty. 
However, it's also true that as soon as you try to use a 128x128 texture pack, you very quickly become disillusioned with Minecraft's performance.
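The byte-array layout described above can be sketched in D. This is only an illustration under stated assumptions: the chunk dimensions (16x128x16) and the 2.5 bytes per block come from the post, while the `Chunk` type, the index order, and the split of the lighting data into a full byte are hypothetical and not taken from Minecraft itself.

```d
import std.stdio;

// Chunk dimensions taken from the post: 16 x 128 x 16 blocks.
enum size_t SX = 16;
enum size_t SY = 128;
enum size_t SZ = 16;
enum size_t BLOCKS = SX * SY * SZ; // 32768 blocks per chunk

// Hypothetical layout: flat byte arrays instead of one object per block.
struct Chunk
{
    ubyte[BLOCKS] types;     // 1 byte per block
    ubyte[BLOCKS / 2] meta;  // 4 bits per block, two blocks per byte
    ubyte[BLOCKS] light;     // assumption: lighting kept as one full byte

    // Assumed index order (y varies fastest); the real game may differ.
    static size_t index(size_t x, size_t y, size_t z)
    {
        return (x * SZ + z) * SY + y;
    }

    // Read the 4-bit metadata value of block i.
    ubyte getMeta(size_t i) const
    {
        immutable b = meta[i / 2];
        return cast(ubyte)((i & 1) ? (b >> 4) : (b & 0x0F));
    }

    // Write the 4-bit metadata value of block i, leaving its neighbour intact.
    void setMeta(size_t i, ubyte v)
    {
        ubyte b = meta[i / 2];
        if (i & 1)
            b = cast(ubyte)((b & 0x0F) | ((v & 0x0F) << 4));
        else
            b = cast(ubyte)((b & 0xF0) | (v & 0x0F));
        meta[i / 2] = b;
    }
}

void main()
{
    auto c = new Chunk;
    immutable i = Chunk.index(3, 64, 7);
    c.types[i] = 1;
    c.setMeta(i, 0xA);
    assert(c.getMeta(i) == 0xA);
    // Bytes of storage per block for this layout.
    writeln(Chunk.sizeof / cast(double) BLOCKS);
}
```

The whole chunk is three contiguous arrays, so iterating over blocks touches memory linearly and creates no per-block garbage for the collector.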
using dylib with dmd
Greetings everyone, I desperately searched the net for how to compile a D2 program that uses a dylib on Mac OS, but unfortunately had no luck. What I've been doing: I have a C library compiled as (taken from http://developer.apple.com/):

    gcc -dynamiclib -std=gnu99 Ratings.c -current_version 0.1 -compatibility_version 0.1 -fvisibility=default -o libRatings.A.dylib

so I have libRatings.A.dylib that my test app will use. I've converted the Ratings.h header of the lib to a Ratings.d file, with all functions enclosed in extern(C). I created a test program, test_d_client.d:

    import Ratings;
    import std.stdio;

    void main()
    {
        writeln("Starting test");
        char[] value1 = "*".dup;
        addRating(value1.ptr);
        writeln("rating: %s", ratings());
    }

And now, when I execute:

    dmd test_d_client.d

I get the following output:

    Undefined symbols:
      _addRating, referenced from: __Dmain in test_d_client.o
      _ratings, referenced from: __Dmain in test_d_client.o
    ld: symbol(s) not found
    collect2: ld returned 1 exit status
    --- errorlevel 1

What am I doing wrong here?
Re: using dylib with dmd
frostmind Wrote: And now, when I execute: dmd test_d_client.d I get following output: Undefined symbols: _addRating, referenced from: __Dmain in test_d_client.o _ratings, referenced from: __Dmain in test_d_client.o ld: symbol(s) not found collect2: ld returned 1 exit status --- errorlevel 1 What I am doing wrong here? You need to tell the linker where your library is. You can just pass it to dmd on the commandline.
Re: using dylib with dmd
Thank you for your response! Hopefully I've done it right. Now when everything is within the same folder, and I execute: dmd test_d_client.d -L. (so I'm telling to look for libs in current dir) Response is now different: ld: in ., can't map file, errno=22 collect2: ld returned 1 exit status --- errorlevel 1 What else could be done here to resolve it?
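A note on the error above, with a hedged sketch. dmd's -L switch hands its argument to the linker unchanged, so `-L.` gives ld the directory `.` as if it were an input file, which would explain "can't map file". Since libRatings is not available here, the sketch below uses `strlen` from the C runtime as a stand-in: the binding is written the same way as the poster's Ratings.d, but it links without extra flags. The command line in the comment is an assumption about the poster's setup, not something confirmed in the thread.

```d
import std.stdio;
import std.string : toStringz;

// Hand-written C binding, in the same style as the poster's Ratings.d.
// strlen lives in the C runtime that dmd links by default, so this
// stand-in needs no extra linker flags.
extern (C) size_t strlen(const(char)* s);

// For the Ratings library the linker has to be told about the dylib.
// Assumed invocation (the -l form looks for a file named libRatings.dylib,
// so a symlink to libRatings.A.dylib may be needed):
//     dmd test_d_client.d -L-L. -L-lRatings

void main()
{
    // C functions expect a zero-terminated string; toStringz provides one.
    const(char)* cstr = toStringz("*****");
    writeln(strlen(cstr));
}
```

One thing worth checking in the original test program as well: `"*".dup` yields a `char[]` that is not guaranteed to be zero-terminated, so passing `value1.ptr` to a C function may read past the end.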
Re: Linus with some good observations on garbage collection
Brad Roberts wrote: Also add to it that in many cases you're dealing with a threaded environment, so those refcounts have to be locked (either via mutexes, or more commonly just atomic) operations which are far more expensive than non-atomic. More so when there's actual contention for the refcounted resource. That is only a problem if the reference count of that resource changes at a very high frequency. The described problem also implies that the threads would not need any form of synchronization for the data (otherwise the reference count certainly would not be a bottleneck.) I cannot, at the moment, think of a real-world example where this would not imply bad design. Can you help me out? Michael Stover wrote: I'd like to say were proved rather than are believed, but I don't actually know where to go for such evidence. However, I do believe many scripting languages, such as python, eventually ditched the reference counting technique for generational, and Java has very fast GC, so I am inclined to believe those real-life solutions than Linus. Mike Well, the GC may be fast when compared to other GCs, but it has to be designed to run on general data whose reference structure can be arbitrary. Often, the objects/references have a somewhat specialized structure that a smart programmer can exploit, especially if the references point to private data. But that only matters if performance is of prime interest, and the gains may be not very big. But, as pointed out by Linus, the prime performance problem is _not_ the GC, but the mindset that comes with it. Most programmers that grew up in a managed environment tend to use very many new keywords all over their code, instead of allocating large chunks of memory at once. (Java/C#/etc encourage you to do this.) When they then try to write a C++ program, they do the same. The resulting memory bugs are then blamed on the lack of a GC (you can have GC in C/C++, but most of the people I have talked to do not know this.) 
They then happily change back to Java, that has a very fast GC. The important thing to note here is that the work required to deallocate all these many memory locations does not magically disappear, but it is delegated to the GC, which will mostly do it faster and more reliable than a programmer which has to do it manually. But the problem does not lie in the deallocations, it's in the allocations. Consider this analogy: Scenario 1: Many people like candy. Those are wrapped in colorful little pieces of paper. Every day, every person buys one piece of candy in the candy shop (new Candy()!) and on the way back home they throw away the wrapping paper somewhere on the street (reassign reference). Those are garbage. In the evening, some creepy guy comes to search the whole street for those small pieces of paper. Would you call that guy a garbage collector? He collects garbage, but in the real world garbage collectors work more like this: Scenario 2: Still, many people like candy. Every person buys a bag of candy in the candy shop once a year (new Candy[365]). When all the candy is eaten, they put all the garbage in one bag and put it to their front door (reassign reference to whole array). A very handsome guy collects all those bags. He is very much more efficient than the guy in example 1. (Arguably, memory usage is bigger in that particular example, but in computer programs, the allocating process can reuse the memory. The analogy breaks down here.) Note that I am not saying that garbage collection is bad. If the references form a very complicated structure, or if a reference to the containing object is not necessarily kept (iE. array slicing), it can be very useful as well as faster than manual memory management. Also, custom allocators can reduce the lots-of-small-allocations problem, but they have some overhead too. Advanced GCs may do this automatically, I don't know. The reason Java is garbage collected is _not_ performance, but primarily reliability. 
In big programming projects, it can be hard to know where a reference to an object may be kept, if there are circular references etc, as programs keep expanding without the programmers understanding the whole project in detail. GC also removes bugs related to memory management. I think true and reliable OO is not possible without a GC. The downside is, that many Java/... programmers don't concern themselves much with the internals of memory allocations, and are _very_ proud of it. This is also the reason I think it is a bad idea to deprecate D's 'delete'. -Timon
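The two candy scenarios can be put side by side in D. This is a minimal sketch; the `Candy` struct and both function names are made up for illustration. It only shows the difference in allocation pattern and makes no claim about measured speed.

```d
import std.stdio;

// Hypothetical payload type for the analogy.
struct Candy
{
    int flavor;
}

// Scenario 1: one GC allocation per piece of candy.
Candy*[] buyDaily(size_t days)
{
    auto pieces = new Candy*[days];
    foreach (i, ref p; pieces)
        p = new Candy(cast(int) i); // 'days' separate heap objects
    return pieces;
}

// Scenario 2: a single allocation holding the whole year's supply.
Candy[] buyYearly(size_t days)
{
    auto bag = new Candy[days]; // one contiguous block
    foreach (i, ref c; bag)
        c.flavor = cast(int) i;
    return bag;
}

void main()
{
    auto daily = buyDaily(365);
    auto yearly = buyYearly(365);
    // Same contents either way; only the number of allocations differs.
    foreach (i; 0 .. 365)
        assert(daily[i].flavor == yearly[i].flavor);
    writeln(daily.length, " small allocations versus 1 bulk allocation");
}
```

In the second version the collector has one block to track and free, and the data is contiguous, which is the point of the analogy.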
Re: Linus with some good observations on garbage collection
Timon Gehr: But, as pointed out by Linus, the prime performance problem is _not_ the GC, but the mindset that comes with it. Most programmers that grew up in a managed environment tend to use very many new keywords all over their code, instead of allocating large chunks of memory at once. (Java/C#/etc encourage you to do this.) In C99 (and Ada) you avoid the allocation of some dynamic arrays with new thanks to variable length arrays. This is also the reason I think it is a bad idea to deprecate D's 'delete'. D used to have scoped class instances, scoped classes, and delete, their replacements are not good enough yet. In CommonLisp you have hints for the GC, they are safe and they help you help speedup the work of the GC. Such hints probably need to be integrated with the type system, so they may need to be built-ins as scope/delete were. I am not seeing enough discussion about this. Bye, bearophile
Transients or scoped immutability
This post contains uncooked ideas. I'd like to create data structures:
- That once created are immutable, so there is no risk of writing to them, etc;
- On the stack too, avoiding slower heap allocations and avoiding copying them from the mutable to the immutable version;
- Without keeping a dead name for the mutable version of the data structure in the function's namespace;
- Avoiding calls to functions that may contain loops that DMD doesn't inline;
- Avoiding overly complex code for the programmer.

A syntax idea, a do{}transient(name1, name2, ...);:

    void foo(char[] data) {
        do {
            int[256] count;
            foreach (char c; data)
                count[c]++;
            int x = ...
            auto bar = map!(...)(...x...);
        } transient(const count, const bar);
        // Here x is not visible.
        // Here count and bar are visible but read-only.
    }

    void main() {
        foo("this is a string");
    }

A different syntax, that inverts the previous idea and uses a sub-scope (it's a bit like a with(){}, but its purpose is not to access fields of a struct):

    void foo(char[] data) {
        int[256] count;
        foreach (char c; data)
            count[c]++;
        int x = ...
        auto bar = map!(...)(...x...);
        // Here x, count, data and bar are visible and mutable.
        scope (const count, const bar, data) {
            // Here x is not visible.
            // Here count and bar are visible but read-only.
            // Here data is visible and mutable
        }
        // Here x, count, data and bar are visible and mutable.
    }

    void main() {
        foo("this is a string");
    }

I think the first idea is a bit less bug-prone, and it avoids too much indenting of the code. Probably there are ways to invent a better syntax. The idea of transients has been added to Clojure too: http://clojure.org/transients?responseToken=07a82a51e4651b10f3a8ee4be09fe1f9f

This idea is good and it will help, but it needs a function call: http://d.puremagic.com/issues/show_bug.cgi?id=5081

Bye, bearophile
Re: OOP, faster data layouts, compilers
On 4/22/2011 2:20 PM, bearophile wrote: Kai Meyer: The purpose of the original post was to indicate that some low level research shows that underlying data structures (as applied to video game development) can have an impact on the performance of the application, which D (I think) cares very much about. The idea of the original post was a bit more complex: how can we invent new/better ways to express semantics in D code that will not forbid future D compilers to perform a bit of changes in the layout of data structures to increase code performance? Complex transforms of the data layout seem too much complex for even a good compiler, but maybe simpler ones will be possible. And I think to do this the D code needs some more semantics. I was suggesting an annotation that forbids inbound pointers, that allows the compiler to move data around a little, but this is just a start. Bye, bearophile In many ways the biggest thing I use regularly in game development that I would lose by moving to D would be good built-in SIMD support. The PC compilers from MS and Intel both have intrinsic data types and instructions that cover all the operations from SSE1 up to AVX. The intrinsics are nice in that the job of register allocation and scheduling is given to the compiler and generally the code it outputs is good enough (though it needs to be watched at times). Unlike ASM, intrinsics can be inlined so your math library can provide a platform abstraction at that layer before building up to larger operations (like vectorized forms of sin, cos, etc) and algorithms (like frustum cull checks, k-dop polygon collision etc), which makes porting and reusing the algorithms to other platforms much much easier, as only the low level layer needs to be ported, and only outliers at the algorithm level need to be tweaked after you get it up and running. On the consoles there is AltiVec (VMX) which is very similar to SSE in many ways. 
The common ground is basically SSE1 tier operations : 128 bit values operating on 4x32 bit integer and 4x32 bit float support. 64 bit AMD/Intel makes SSE2 the minimum standard, and a systems language on those platforms should reflect that. Loading and storing is comparable across platforms with similar alignment restrictions or penalties for working with unaligned data. Packing/swizzle/shuffle/permuting are different but this is not a huge problem for most algorithms. The lack of fused multiply and add on the Intel side can be worked around or abstracted (i.e. always write code as if it existed, have the Intel version expand to multiple ops). And now my wish list: If you have worked with shader programming through HLSL or CG the expressiveness of doing the work in SIMD is very high. If I could write something that looked exactly like HLSL but it was integrated perfectly in a language like D or C++, it would be pretty huge to me. The amount of math you can have in a line or two in HLSL is mind boggling at times, yet extremely intuitive and rather easy to debug.
Re: OOP, faster data layouts, compilers
Sean Cavanaugh:

> In many ways the biggest thing I use regularly in game development that I would lose by moving to D would be good built-in SIMD support. The PC compilers from MS and Intel both have intrinsic data types and instructions that cover all the operations from SSE1 up to AVX. The intrinsics are nice in that the job of register allocation and scheduling is given to the compiler and generally the code it outputs is good enough (though it needs to be watched at times).

This is a topic quite different from the one I was talking about, but it's an interesting topic :-) SIMD intrinsics look ugly, they add a lot of noise to the code, and are very specific to one CPU or instruction set. You can't design a clean language with hundreds of those. Once 256 or 512 bit registers come, you need to add new intrinsics and change your code to use them. This is not so good.

D array operations are probably meant to become smarter, when you perform a:

    int[8] a, b, c;
    a = b + c;

A future good D compiler may use just two inlined instructions, or a little more. This will probably include shuffling and broadcasting properties too. Maybe this kind of code is not as efficient as handwritten assembly code (or C code that uses SIMD intrinsics) but it's adaptable to different CPUs, future ones too, it's much less noisy, and it seems safer. I think such optimizations are better left to the back-end, so a long time ago I asked the LLVM devs for it, for a future LDC: http://llvm.org/bugs/show_bug.cgi?id=6956

The presence of such well implemented vector ops will not forbid another D compiler to add true SIMD intrinsics too.

> Unlike ASM, intrinsics can be inlined so your math library can provide a

DMD may eventually need this feature of the LDC compiler: http://www.dsource.org/projects/ldc/wiki/InlineAsmExpressions

Bye, bearophile
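For reference, a small runnable sketch of the array operation discussed above. In the D compilers I know, element-wise operations are spelled with slice syntax (`a[] = b[] + c[]`). The function name `addVectors` is made up for this example, and whether the generated code actually uses SIMD instructions depends on the compiler and flags.

```d
import std.stdio;

// Element-wise sum of two fixed-size arrays using a D array operation.
int[8] addVectors(const int[8] b, const int[8] c)
{
    int[8] a;
    // The trailing [] marks this as an array operation over all elements;
    // no explicit loop and no CPU-specific intrinsic appears in the source.
    a[] = b[] + c[];
    return a;
}

void main()
{
    int[8] b = [1, 2, 3, 4, 5, 6, 7, 8];
    int[8] c = [10, 20, 30, 40, 50, 60, 70, 80];
    writeln(addVectors(b, c));
}
```

The source stays the same if a later compiler maps the operation to wider registers, which is the portability argument made in the post.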
Re: Linus with some good observations on garbage collection
== Quote from bearophile (bearophileh...@lycos.com)'s article Timon Gehr: But, as pointed out by Linus, the prime performance problem is _not_ the GC, but the mindset that comes with it. Most programmers that grew up in a managed environment tend to use very many new keywords all over their code, instead of allocating large chunks of memory at once. (Java/C#/etc encourage you to do this.) In C99 (and Ada) you avoid the allocation of some dynamic arrays with new thanks to variable length arrays. Variable length arrays are just sugary syntax for a call to alloca. This is also the reason I think it is a bad idea to deprecate D's 'delete'. D used to have scoped class instances, scoped classes, and delete, their replacements are not good enough yet. In CommonLisp you have hints for the GC, they are safe and they help you help speedup the work of the GC. Such hints probably need to be integrated with the type system, so they may need to be built-ins as scope/delete were. I am not seeing enough discussion about this. Bye, bearophile I've always felt that Vala's system is better thought out, which is incidentally based on a reference counting system. This makes destructors in Vala deterministic and can be used to implement an RAII pattern for resource management. To get around the common pitfalls of reference counting systems, they introduce two keywords which alter the relationship between the allocated object and the GC, 'weak' and 'unowned'. Rather than bore you with the gritty details here, see link: http://live.gnome.org/Vala/ReferenceHandling
Re: Linus with some good observations on garbage collection
Iain Buclaw:

> Variable length arrays are just sugary syntax for a call to alloca.

I have an enhancement request in Bugzilla on VLAs, with longer discussions. Just two comments:
- It seems alloca() can be implemented with two different semantics: to deallocate at the end of the function or to deallocate at the end of the scope. Usually alloca() deallocates at the end of the function, but that semantic confusion is dangerous. VLAs deallocate at the end of the scope, just like any other static array.
- To use alloca you need to use pointers, maybe even slices, it's not DRY, etc. So syntax sugar helps.

In the meantime I've changed my mind a little. Now for D I prefer something better than C99 VLAs. I'd like D-VLAs with the same syntax as C99 VLAs but with a safer semantics, closer to this one (but the alloca used here must deallocate at the end of the scope):

    enum size_t MAX_VLA_SIZE = 1024;
    static assert (is(typeof(size) == size_t));
    T* ptr = null;
    if ((size * T.sizeof) < MAX_VLA_SIZE)
        ptr = cast(T*)alloca(size * T.sizeof);
    T[] array = (ptr == null) ? new T[size] : ptr[0 .. size];
    array[] = T.init;

This has some advantages: when alloca returns null, or when the array is large, it uses the GC. This both avoids some stack overflows and reduces the risk of the stack memory becoming too cold.

> Rather than bore you with the gritty details here, see link: http://live.gnome.org/Vala/ReferenceHandling

This is interesting.

Bye, bearophile
Re: Linus with some good observations on garbage collection
> Now for D I prefer something better than C99 VLAs. I'd like D-VLAs with the same syntax as C99 VLAs but with a safer semantics,

Never mind, that semantics can't use that syntax, otherwise you have hidden heap allocations... The syntax has to change (because the D-VLA semantics seems OK to me).

Bye, bearophile
Re: OOP, faster data layouts, compilers
On 4/23/2011 4:22 AM, Andrew Wiley wrote: The reason Minecraft runs so well in Java, from my point of view, is that the authors resisted the Java urge to throw objects at the problem and instead put everything into large byte arrays and wrote methods to manipulate them. From that perspective, using Java would be about the same as using any language, which let them stick to what they knew without incurring a large performance penalty. FYI, Markus, the author, has been a figure in the Java game development community for years. He was the original client programmer for Wurm Online[1] (where the landscape is 'infinite' and tiled) and a frequent participant in the Java4k competition[2] (with Left4kDead[3] perhaps being his most popular). I think it's a safe assumption that the techniques he put to use in Minecraft were learned from his experiments with the Wurm landscape and with cramming Java games into 4kb. [1] http://www.wurmonline.com/ [2] http://www.java4k.com/index.php?action=home [3] http://www.mojang.com/notch/j4k/l4kd/
Re: Transients or scoped immutability
bearophile Wrote:

> This post contains uncooked ideas. I'd like to create data structures:
> - That once created are immutable, so there is no risk of writing to them, etc;
> - On the stack too, avoiding slower heap allocations and avoiding copying them from the mutable to the immutable version;
> - Without keeping a dead name for the mutable version of the data structure in the function's namespace;
> - Avoiding calls to functions that may contain loops that DMD doesn't inline;
> - Avoiding overly complex code for the programmer.

There was talk in the past of allowing a pure function to create and modify a class and return that class as immutable. This was a suggestion on how to create immutable classes. The details were never really hashed out, but maybe it is possible for any returned value (of a pure function) to be implicitly converted to immutable. To me it sounds really nice and clean, do you think it would work for you?
Re: Transients or scoped immutability
Jesse Phillips:

> To me it sounds really nice and clean, do you think it would work for you?

Maybe you have missed the last two lines of my post:

> This idea is good and it will help, but it needs a function call: http://d.puremagic.com/issues/show_bug.cgi?id=5081

I like that idea and I think it will be good to have in D. But I think it's not enough: to solve the problem I have shown it requires a non-trivial function signature, you need to instantiate the static array before the call point (that's not nice and asks for two names for the same array), to use ref both in input and output, etc.

Bye, bearophile
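A sketch of the pure-function idea discussed in this thread, with the caveats bearophile raises. It assumes a compiler that accepts the implicit conversion of a pure function's freshly built result to immutable (the D compilers I know of do, when the parameters cannot alias the result). The name `countChars` is made up. Note that this version allocates on the heap, so it does not meet the "on the stack" wish from the original post.

```d
import std.stdio;

// Pure, and the only parameter is immutable data, so the returned array
// cannot alias anything mutable that the caller still holds.
int[] countChars(string data) pure
{
    auto count = new int[256]; // mutable while being built
    foreach (char c; data)
        count[c]++;
    return count;
}

void main()
{
    // The mutable name 'count' inside the function is gone here;
    // the caller only ever sees the immutable view.
    immutable int[] count = countChars("this is a string");
    writeln(count['s']);
    // count['s'] = 0; // would not compile: count is immutable
}
```

This keeps the mutable phase confined to the function body, at the cost of the function call and the heap allocation mentioned in the thread.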
Re: Linus with some good observations on garbage collection
2011/4/22 Timon Gehr timon.g...@gmx.ch: That is only a problem if the reference count of that resource changes at a very high frequency. The described problem also implies that the threads would not need any form of synchronization for the data (otherwise the reference count certainly would not be a bottleneck.) Michael Stover wrote: I'd like to say were proved rather than are believed, but I don't actually know where to go for such evidence. However, I do believe many scripting languages, such as python, eventually ditched the reference counting technique for generational, and Java has very fast GC, so I am inclined to believe those real-life solutions than Linus. Well, the GC may be fast when compared to other GCs, but it has to be designed to run on general data whose reference structure can be arbitrary. Often, the objects/references have a somewhat specialized structure that a smart programmer can exploit, especially if the references point to private data. But that only matters if performance is of prime interest, and the gains may be not very big. All in all, I think the best approach is a pragmatic one, where different types of resources can be handled according to different schemes. I.E. default to GC-manage everything. After profiling, determining what resources are mostly used, and where, optimize allocation for those resources, preferably to scoped allocation, or if not possible, reference-counted. Premature optimization is a root of much evil, for instance, the malloc-paranoid might very well resort to abuse of struct:s, leading either to lots of manual pointers, or excessive memory copying. Incidentally, this was the main thing that attracted me to D. Be lazy/productive where performance doesn't matter much, and focus optimization on where it does.
Re: OOP, faster data layouts, compilers
On 4/22/2011 4:41 PM, bearophile wrote:

>> Sean Cavanaugh: In many ways the biggest thing I use regularly in game development that I would lose by moving to D would be good built-in SIMD support. The PC compilers from MS and Intel both have intrinsic data types and instructions that cover all the operations from SSE1 up to AVX. The intrinsics are nice in that the job of register allocation and scheduling is given to the compiler and generally the code it outputs is good enough (though it needs to be watched at times).
>
> This is a topic quite different from the one I was talking about, but it's an interesting topic :-) SIMD intrinsics look ugly, they add lot of noise to the code, and are very specific to one CPU, or instruction set. You can't design a clean language with hundreds of those. Once 256 or 512 bit registers come, you need to add new intrinsics and change your code to use them. This is not so good.

In C++ the intrinsics are easily wrapped by __forceinline global functions, to provide a platform abstraction against the intrinsics. Then, you can write class wrappers to provide the most common level of functionality, which boils down to a class to do vectorized math operators for + - * / and vectorized comparison functions ==, !=, >=, <=, > and <. From HLSL you have to borrow the 'any' and 'all' statements (along with variations for every permutation of the bitmask of the test result) to do conditional branching for the tests. This pretty much leaves swizzle/shuffle/permuting and outlying features (8, 16, 64 bit integers) in the realm of 'ugly'. From here you could build up portable SIMD transcendental functions (sin, cos, pow, log, etc), and other libraries (matrix multiplication, inversion, quaternions etc).

I would say in D this could be faked provided the language at a minimum understood what a 128 bit (SSE1 through 4.2) and a 256 bit (AVX) value was and how to efficiently move it via registers for function calls.
Kind of a 'make it at least work in the ABI, come back to a good implementation later' solution. There is some room to beat Microsoft here, as the code Visual Studio 2010 currently outputs for 64 bit environments cannot pass 128 bit SIMD values by register (forceinline functions are the only workaround), even though scalar 32 and 64 bit float values are passed by XMM register just fine.

The current hardware landscape dictates organizing your data in SIMD friendly manners. Naive OOP based code is going to de-reference too many pointers to get to scattered data. This makes the hardware prefetcher work too hard, and it wastes cache memory by only using a fraction of the RAM from the cache line, plus wasting 75-90% of the bandwidth and memory on the machine.

> D array operations are probably meant to become smarter, when you perform a: int[8] a, b, c; a = b + c;

Now the original topic pertains to data layouts, of which SIMD, the CPU cache, and efficient code all inter-relate. I would argue the above code is an idealistic example, as when writing SIMD code you almost always have to transpose or rotate one of the sets of data to work in parallel across the other one. What happens when this code has to branch? In SIMD land you have to test if any or all 4 lanes of SIMD data need to take it. And a lot of the time the best course of action is to compute the other code path in addition to the first one, AND the first result, AND-NOT the second one, and OR the results together to make valid output. I could maybe see a functional language doing ok at this. The only reasonable way to explain how common this is in optimized SIMD code is to compare it to HLSL's vectorized ternary operator (understanding that 'a' and 'b' can be fairly intricate chunks of code if you are clever):

    float4 a = {1,2,3,4};
    float4 b = {5,6,7,8};
    float4 c = {-1,0,1,2};
    float4 d = {0,0,0,0};
    float4 foo = (c > d) ? a : b;

results with foo = {5,6,3,4}

For a lot of algorithms the 'a' and 'b' paths have similar cost, so for SIMD it executes about 2x faster than the scalar case, although better than 2x gains are possible since using SIMD also naturally reduces or eliminates a ton of branching, which CPUs don't really like to do due to their long pipelines.

And as much as Intel likes to argue that a structure containing positions for a particle system should look like this because it makes their hardware benchmarks awesome, the following vertex layout is a failure:

    struct ParticleVertex
    {
        float[1000] XPos;
        float[1000] YPos;
        float[1000] ZPos;
    }

The GPU (or audio devices) does not consume it this way. The data is also not cache coherent if you are trying to read or write a single vertex out of the structure. A hybrid structure which is aware of the size of a SIMD register is the next logical choice:

    align(16) struct ParticleVertex
    {
        float[4] XPos;
        float[4] YPos;
        float[4] ZPos;
    }
    ParticleVertex[250] ParticleVertices; // struct is also
Re: OOP, faster data layouts, compilers
Sean Cavanaugh:

> In C++ the intrinsics are easily wrapped by __forceinline global functions, to provide a platform abstraction against the intrinsics.

When AVX becomes 512 bits wide, or you need to use a very different set of vector registers, your global functions need to change, so the code that calls them has to change too. This is acceptable for library code, but it's not good for D built-in operations. D built-in vector ops need to be cleaner, more general and longer-lasting, even if they may not fully replace SSE intrinsics.

> I would say in D this could be faked provided the language at a minimum understood what a 128 (SSE1 through 4.2) and 256 bit value (AVX) was and how to efficiently move it via registers for function calls.

Also think about what the D ABI will be 15-25 years from now. D design must look a bit further ahead too.

> Now the original topic pertains to data layouts,

It was about how to not preclude future D compilers from shuffling data around a bit by themselves :-)

> I would argue the above code is an idealistic example, as when writing SIMD code you almost always have to transpose or rotate one of the sets of data to work in parallel across the other one.

Right.

> float4 a = {1,2,3,4}; float4 b = {5,6,7,8}; float4 c = {-1,0,1,2}; float4 d = {0,0,0,0}; float4 foo = (c > d) ? a : b;

Recently I have asked for a D vector comparison operation too (the compiler is supposed to be able to split them into register-sized chunks for the comparisons). This is good for AVX instructions (a little problem here is that I think currently DMD allocates memory on the heap to instantiate those four little arrays):

    int[4] a = [1,2,3,4];
    int[4] b = [5,6,7,8];
    int[4] c = [-1,0,1,2];
    int[4] d = [0,0,0,0];
    int[4] foo = (c[] > d[]) ? a[] : b[];

> Things get real messy when you have multiple vertex attributes as decisions to keep them together or separate are conflicting and both choices make sense to different systems :)

It's not easy for future compilers to perform similar auto-vectorizations :-)

Bye and thank you for your answer, bearophile
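The per-lane select discussed in this thread (compute both paths, then combine them with a mask) can be emulated in scalar D. This is a hedged sketch: it uses `int` lanes rather than the `float4` of the HLSL example so that the bitwise operations stay simple, the function name `select` is made up, and a real SIMD version would apply the same AND / AND-NOT / OR steps to whole registers at once.

```d
import std.stdio;

// Scalar emulation of the vectorized ternary: (c > d) ? a : b, lane by lane.
int[4] select(const int[4] a, const int[4] b, const int[4] c, const int[4] d)
{
    int[4] result;
    foreach (i; 0 .. 4)
    {
        // All bits set when this lane's test passes, all bits clear otherwise.
        immutable int mask = (c[i] > d[i]) ? -1 : 0;
        // AND the first path with the mask, AND-NOT the second, OR them together.
        result[i] = (a[i] & mask) | (b[i] & ~mask);
    }
    return result;
}

void main()
{
    int[4] a = [1, 2, 3, 4];
    int[4] b = [5, 6, 7, 8];
    int[4] c = [-1, 0, 1, 2];
    int[4] d = [0, 0, 0, 0];
    writeln(select(a, b, c, d));
}
```

With the values from the thread, only the last two lanes pass the `c > d` test, so those lanes come from `a` and the first two from `b`.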
Re: Temporarily disable all purity for debug prints
On Fri, Apr 22, 2011 at 12:34 AM, dennis luehring dl.so...@gmx.net wrote: On 17.04.2011 22:45, Andrew Wiley wrote: On Sun, Apr 17, 2011 at 3:30 PM, dennis luehringdl.so...@gmx.net wrote: On 11.04.2011 23:27, bearophile wrote: From what I am seeing, in a D2 program if I have many (tens or more) pure functions that call to each other, and I want to add (or activate) a printf/writeln inside one (or few) of those functions to debug it, I may need to temporarily comment out the pure attribute of many functions (because printing can't be allowed in pure functions). As more and more D2 functions become pure in my code and in Phobos, something like a -disablepure compiler switch (and printf/writeln inside debug{}) may allow more handy debugging with prints (if the purity is well managed by the compiler then I think disabling the pure attributes doesn't change the program output). Bye, bearophile sounds a little bit like the need to see an private/protected part of an interface in unittest scenarios - just to be able to test it in a whitebox-testing without changing the attributes of the productive-code Isn't this already there because private makes things visible to all other code in the same module? ok - but what about protected? as a whitebox tester im not able(allowed) to change productive code,but i need to test through all the code (especially when doing code-coverage stuff) As far as I'm aware, all the visibility levels make things visible to code in the same module.
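The module-level visibility rule mentioned in the last reply can be illustrated with a short sketch. The `Account` class and `depositAndPeek` function are hypothetical; the point is only that a free function in the same module, which is neither a member nor a subclass, can reach both `private` and `protected` members, which is what whitebox tests placed in the same module rely on.

```d
import std.stdio;

// Hypothetical production class with restricted members.
class Account
{
    private int balance;

    protected void deposit(int amount)
    {
        balance += amount;
    }
}

// A free function in the same module: not a member, not a subclass.
int depositAndPeek(Account a, int amount)
{
    a.deposit(amount); // protected, but accessible within the module
    return a.balance;  // private, likewise
}

void main()
{
    writeln(depositAndPeek(new Account, 10));
}
```

A `unittest` block placed in the same module has the same access, so the attributes of the production code do not need to change for testing.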
Re: Linus with some good observations on garbage collection
Ulrik Mikaelsson wrote:

> All in all, I think the best approach is a pragmatic one, where different types of resources can be handled according to different schemes. I.e. default to GC-manage everything. After profiling, determining what resources are mostly used, and where, optimize allocation for those resources, preferably to scoped allocation, or if not possible, reference-counted. Premature optimization is a root of much evil, for instance, the malloc-paranoid might very well resort to abuse of structs, leading either to lots of manual pointers, or excessive memory copying. Incidentally, this was the main thing that attracted me to D. Be lazy/productive where performance doesn't matter much, and focus optimization on where it does.

That is very true. GC is almost always fast enough or even faster. And it is clearly the most convenient. And yes, identify bottlenecks first, optimize later. But I also think programs that have some concern for efficient memory allocation (with GC or without GC) tend to be better designed in general. This actually increases productivity. Plus, it reduces the need for complicated optimizations later on. This in turn increases maintainability.

-Timon
Re: Next Release
Well, then I'd better make sure that I get my most recent updates to std.datetime in soon. - Jonathan M Davis

Does your library take into account that there's no year 0?
Re: Web development howto?
On 22/04/2011 03:53, Jaime Barciela wrote:
Hello everyone, I'm going through TDPL and I just joined this list. I've been looking for guidance on how to do web applications in D but I haven't found anything. My background is not C/C++ but Java (and Delphi many years ago), so I have not only a new language but a new culture to get used to as well. Could somebody give me some pointers? Thanks, Jaime

The simplest way to make a web application with D is to use CGI/FastCGI etc. There are also at least two frameworks in development (that I know of), one of which is significantly more developed.

http://arsdnet.net/dcode/ - see web.d, cgi.d etc. This is the most mature (that I know of).
https://github.com/mrmonday/serenity - One I'm working on. It's due to undergo significant changes and is lacking a lot of basic functionality, so I'd avoid it, for now at least.

-- Robert http://octarineparrot.com/
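As a rough idea of what the plain CGI route looks like, here is a minimal sketch that uses only Phobos and neither of the frameworks listed above. The helper name `cgiResponse` is made up for illustration; the only assumptions are the standard CGI conventions that the web server passes the request method in the REQUEST_METHOD environment variable and expects headers, a blank line, and then the body on standard output.

```d
import std.process : environment;
import std.stdio : write;

// Hypothetical helper: builds a complete CGI response (headers, blank line, body).
string cgiResponse(string method)
{
    return "Content-Type: text/plain\r\n\r\n"
         ~ "Hello from D via CGI (" ~ method ~ ")\n";
}

void main()
{
    // Fall back to "GET" when the program is run outside a web server.
    write(cgiResponse(environment.get("REQUEST_METHOD", "GET")));
}
```

Compiled to an executable and placed wherever the web server looks for CGI programs, this should answer every request with a short plain-text page. Form parsing, cookies and the like are what libraries such as cgi.d are for.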
Re: Next Release
Well, then I'd better make sure that I get my most recent updates to std.datetime in soon. - Jonathan M Davis

Does your library take into account that there's no year 0?

Actually, for ISO 8601, which the library follows, there _is_ a year 0. Date, DateTime, and SysTime all have the function yearBC, which will give you the year as you would normally expect (1 B.C. being immediately prior to 1 A.D. with no year 0). But the ISO standard calls for a year 0, and I followed the standard (it's also way easier to deal with programmatically). So, other than the yearBC function, it treats 0 as the year prior to 1 A.D., and the years prior to 0 are negative. - Jonathan M Davis
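The year numbering described above can be checked with a short sketch. It assumes a current Phobos in which `Date`, its `yearBC` and `isAD` properties, and `dur` are available through `import std.datetime`; the particular dates are arbitrary examples.

```d
import std.datetime;

void main()
{
    // ISO 8601 year 0 is the year immediately before 1 A.D., i.e. 1 B.C.
    auto lastDayOfYearZero = Date(0, 12, 31);
    assert(!lastDayOfYearZero.isAD);
    assert(lastDayOfYearZero.yearBC == 1);

    // Years before 0 are negative; year -1 is 2 B.C.
    assert(Date(-1, 1, 1).yearBC == 2);

    // Arithmetic runs straight through year 0 with no special case.
    assert(lastDayOfYearZero + dur!"days"(1) == Date(1, 1, 1));
}
```

Having a year 0 keeps date arithmetic uniform, which is presumably part of why it is "way easier to deal with programmatically".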
Expression templates in D1
I have been trying to create some simple expression templates in D1 but I've run into trouble. Here is a reduced test case:

    class A
    {
        void opSub_r(T:int)(T a) { }
        void opSub(T)(T a) { }
    }

    void main(char[][] args)
    {
        A a;
        a - 1;
        a - a; // line 20
        1 - a;
    }

The error is:

    test.d(20): Error: template test.A.opSub_r(T : int) does not match any function template declaration
    test.d(20): Error: template test.A.opSub_r(T : int) cannot deduce template function from argument types !()(A)

My goal is to adjust the code such that the three expressions in the main function all compile. Both opSub functions must be templated (as far as I can tell) so that the expression templates can work properly. Here is my full code, incidentally: http://ideone.com/2vZdN

Any ideas for workarounds? Has anyone done expression templates in D1 and got them to work? And yes... the code above works fine in D2, but I want to try to get it to work in D1 for now.

Thanks, -SiegeLord
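For comparison only, here is a sketch of how the same three expressions can be written with D2's opBinary/opBinaryRight operator overloading. It does not answer the D1 question. The integer return values are placeholders added purely to show which overload the compiler picks, and `new A` is used so that the calls are made on a real object.

```d
class A
{
    // Handles `a - x` for any right-hand operand type.
    int opBinary(string op : "-", T)(T rhs) { return 1; }

    // Handles `x - a`, but only when the left-hand operand converts to int.
    int opBinaryRight(string op : "-", T : int)(T lhs) { return 2; }
}

void main()
{
    auto a = new A;
    assert(a - 1 == 1); // opBinary
    assert(a - a == 1); // opBinary; opBinaryRight is rejected because A is not an int
    assert(1 - a == 2); // opBinaryRight
}
```

In this form `a - a` resolves cleanly because the `T : int` constraint removes opBinaryRight from consideration when the left operand is an `A`, so only opBinary matches.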