Upcoming ACM Lecture on D next Tuesday at George Mason University

2011-04-22 Thread Walter Bright

http://events.insidenova.com/fairfax-va/events/show/180300086-dc-acm-the-d-programming-language-with-walter-bright

See all you D.C. area people there!


Re: OOP, faster data layouts, compilers

2011-04-22 Thread Paulo Pinto
Many thanks for the links, they provide very nice discussions.

Especially the link below, which you can follow from your first link:
http://c0de517e.blogspot.com/2011/04/2011-current-and-future-programming.html

But as far as game development is concerned, D2 might already be too late.

I know a bit about it, since I live a bit in that part of the universe.

Due to XNA (Windows and Xbox 360), Mono/Unity, and now WP7, many game studios
have started to move their tooling to C#. Some of them are nowadays even
using it for server-side code.

Java used to have a foothold there, especially due to J2ME game development,
with a small push thanks to Android, which decreased once Google made the NDK
available.

If one day Microsoft really sets C# free, the same way AT&T somehow did with
C and C++, then C# might actually be the next C++, at least as far as game
development is concerned.

And the dependency on a JIT environment is an implementation issue. The
Bartok compiler in Singularity compiles to native code, and Mono also
provides a similar option.

So who knows?

--
Paulo



bearophile bearophileh...@lycos.com wrote in message 
news:ioqdhe$2030$1...@digitalmars.com...
 Through Reddit I've found a set of wordy slides, Design for Performance, 
 on designing efficient games code:
 http://www.scribd.com/doc/53483851/Design-for-Performance
 http://www.reddit.com/r/programming/comments/guyb2/designing_code_for_performance/

 The slides touch on many small topics, like the need for prefetching, design
 for cache-aware code, etc. One of the main topics is how to better lay out
 data structures in memory for modern CPUs. They show how object-oriented
 style often leads to collections of little trees, for example arrays of
 object references (or struct pointers) that refer to objects that contain
 other references to sub-parts. Iterating over such data structures is not
 very efficient.

 The slides also discuss a little the difference between creating an array 
 of 2-item structs, or a struct that contains two arrays of single native 
 values. If the code needs to scan just one of those two fields, then the 
 struct that contains the two arrays is faster.
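
 The two layouts compared above can be sketched in D as follows. This is only
 an illustration: the names Particle, Particles, totalMassAoS and totalMassSoA
 are made up for this example and do not come from the slides, and no timing
 is claimed here.

```d
// Array-of-structs layout: x and mass of each particle are interleaved.
struct Particle
{
    float x;
    float mass;
}

// Struct-of-arrays layout: each field lives in its own contiguous array.
struct Particles
{
    float[] x;
    float[] mass;
}

// Scans only the mass field, but every Particle (x included) is pulled
// through the cache along the way.
float totalMassAoS(const Particle[] ps)
{
    float total = 0; // floats default to NaN in D, so start from 0 explicitly
    foreach (p; ps)
        total += p.mass;
    return total;
}

// Scans one densely packed array of floats and never touches x.
float totalMassSoA(const Particles ps)
{
    float total = 0;
    foreach (m; ps.mass)
        total += m;
    return total;
}

void main()
{
    auto aos = [Particle(0.0f, 1.5f), Particle(1.0f, 2.5f)];
    auto soa = Particles([0.0f, 1.0f], [1.5f, 2.5f]);

    // Both layouts hold the same data and give the same answer.
    assert(totalMassAoS(aos) == 4.0f);
    assert(totalMassSoA(soa) == 4.0f);
}
```

 Both functions compute the same sum; the difference is only in how much
 unrelated data sits between consecutive mass values in memory.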

 Similar topics were discussed better in Pitfalls of Object Oriented 
 Programming (2009):
 http://research.scee.net/files/presentations/gcapaustralia09/Pitfalls_of_Object_Oriented_Programming_GCAP_09.pdf

 In my opinion if D2 has some success then one of its significant usages 
 will be to write fast games, so the design/performance concerns expressed 
 in those two sets of slides need to be important for D design.

 D probably allows laying out data in memory as shown in those slides, but I'd
 like some help from the compiler too. I don't think compilers will soon be
 able to turn an immutable binary tree into an array to speed up its
 repeated scanning, but maybe there are ways to express semantics in the
 code that will allow future, smarter compilers to perform some of those
 memory layout optimizations, like transposing arrays. A possible idea is a
 @no_inbound_pointers annotation that forbids taking the address of the items
 and allows the compiler to modify the data layout a little.

 Bye,
 bearophile 




Re: Implementing std.log

2011-04-22 Thread Zz
I currently use the logger written by Masahiro Nakagawa and it has handled what 
I need.

You can get it from:
http://www.bitbucket.org/repeatedly/scrap/src/tip/logger.d

Zz

Robert Clipsham Wrote:

 Hey folks,
 
 I've just finished porting my web framework from D1/Tango to D2/Phobos, 
 and in the transition lost logging functionality. As I'll be writing a 
 logging library anyway, I wondered if there'd be interest in a std.log? 
 If so, is there a current logging library we would like it to be based 
 on, or should we design from scratch?
 
 I know there has been discussion about Google's 
 http://google-glog.googlecode.com/svn/trunk/doc/glog.html and another 
 candidate may be http://logging.apache.org/log4j/ . Do we want a 
 comprehensive logging library, or just the basics? (Possibly with some 
 method for extension if needed).
 
 -- 
 Robert
 http://octarineparrot.com/



Re: std.parallelism: VOTE IN THIS THREAD

2011-04-22 Thread Fawzi Mohamed

YES

It is a step in the right direction. I have some comments, but I will
put them in another thread.


Re: opDispatch, duck typing, and error messages

2011-04-22 Thread spir

On 04/22/2011 12:24 AM, Adam D. Ruppe wrote:

I just made an innocent little change to one of my programs, hit
compile, and got this vomit:

/home/me/d/dmd2/linux/bin/../../src/phobos/std/conv.d(97): Error: template
std.conv.toImpl(T,S) if (!implicitlyConverts!(S,T) && isSomeString!(T) &&
isInputRange!(Unqual!(S)) && isSomeChar!(ElementType!(S)))
toImpl(T,S) if (!implicitlyConverts!(S,T) && isSomeString!(T) &&
isInputRange!(Unqual!(S)) && isSomeChar!(ElementType!(S)))
matches more than one template declaration,
/home/me/d/dmd2/linux/bin/../../src/phobos/std/conv.d(185):toImpl(T,S)
if (isSomeString!(T) && !isSomeChar!(ElementType!(S)) &&
(isInputRange!(S) || isInputRange!(Unqual!(S)))) and
/home/me/d/dmd2/linux/bin/../../src/phobos/std/conv.d(289):toImpl(T,S)
if (is(S : Object) && isSomeString!(T))



Whoa... it took a bit to figure out what it was saying. The bottom
line: one of my classes matched both Object and isInputRange because
it offers an unrestricted opDispatch.

[...]

Things I think would help:

[...]

Or, there's a whole new approach:

e) Duck typing for ranges in to!() might be a bad idea. Again, remember,
a class might legitimately offer a range interface, so it would
trigger this message without opDispatch.


Maybe we could replace template constraints, especially the 'is' stuff, with
(structural) interfaces. The difference, in my view, is that a structural
interface is a compile-time/static feature, while duck typing is
run-time/dynamic.



If ranges are meant to be structs, maybe isInputRange should check
is(T == struct)? This doesn't sit right with me though. The real
problem is to!() - other range functions probably don't overload
on classes separately than ranges, so it won't matter there.


I think the best thing to do is simply to prefer Object over range.

toImpl(T) if (isInputRange!(T) && (!is(T : Object)))

Or something along those lines. Why? If the object has its own
toString/writeTo methods, it seems fairly obvious to me anyway that
to!string ought to simply call them, regardless of what else there is.
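
A minimal sketch of the "prefer Object over range" idea, using a stand-alone
function rather than Phobos itself. The names describe and Countdown are
hypothetical and exist only for this illustration; this is not how
std.conv.toImpl is actually written.

```d
import std.conv : to;
import std.range : isInputRange;

// Hypothetical class that is both an Object and an input range.
class Countdown
{
    int n = 3;
    @property bool empty() { return n == 0; }
    @property int front() { return n; }
    void popFront() { --n; }
    override string toString() { return "Countdown from " ~ n.to!string; }
}

// Any class: defer to its own toString.
string describe(T)(T value) if (is(T : Object))
{
    return value.toString();
}

// Ranges that are NOT classes: format the elements instead.
// The extra !is(T : Object) keeps the two constraints mutually exclusive.
string describe(T)(T value) if (isInputRange!T && !is(T : Object))
{
    string result = "[";
    foreach (element; value)
        result ~= " " ~ element.to!string;
    return result ~ " ]";
}

void main()
{
    static assert(isInputRange!Countdown); // the class really is a range
    assert(describe(new Countdown) == "Countdown from 3");
    assert(describe([1, 2]) == "[ 1 2 ]");
}
```

Without the !is(T : Object) clause, the call with a Countdown would match
both templates, which is the kind of ambiguity the error message above
reports.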


Sure. I hit and discussed a similar issue (maybe the same one, in fact). The
problem was with the constraints on the formatValue template, which:
(1) for structs, ignore a programmer-defined toString in favor of the standard
format for ranges;
(2) for classes, simply fail because of the conflict (double match).
There's a bug report (search for 'formatValue').

Denis
--
_
vita es estrany
spir.wikidot.com



Re: opDispatch, duck typing, and error messages

2011-04-22 Thread spir

On 04/22/2011 01:25 AM, Adam D. Ruppe wrote:

bearophile wrote:

  Maybe exceptions nature should be changed a little so they store
__FILE__ and __LINE__ on default (exceptions without this
information are kept on request, for optimization purposes).


I'd like that. Even with a stack trace, it's nice to have that
available right at the top.

A while ago, someone posted a stack tracer printer for Linux to
the newsgroup. Using that, this program:

void main() {
    throw new Exception("test");
}

  dmd test60 -debug -g backtrace.d

Prints:

object.Exception: test

./test60(_Dmain+0x30) [0x807a5e8]
./test60(extern (C) int rt.dmain2.main(int, char**) . void runMain()+0x1a) 
[0x807d566]
./test60(extern (C) int rt.dmain2.main(int, char**) . void tryExec(void
delegate())+0x24) [0x807d4c0]
./test60(extern (C) int rt.dmain2.main(int, char**) . void runAll()+0x32) 
[0x807d5aa]
./test60(extern (C) int rt.dmain2.main(int, char**) . void tryExec(void
delegate())+0x24) [0x807d4c0]
./test60(main+0x96) [0x807d466]
/lib/libc.so.6(__libc_start_main+0xe6) [0xf75a5b86]
./test60() [0x807a4e1]



No line or file info! I'd really like to have something there.
Though, actually, whether it's in the message or in the stack
trace doesn't really matter. As long as it's there somewhere.

Most of my custom exceptions use default params in their constructor
to add it. Perhaps the base Exception should too?
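
The default-parameter technique mentioned above can be sketched like this.
ConfigError is a hypothetical class name chosen for the example; the
three-argument call to the base constructor assumes a druntime whose
Exception accepts a message, file and line, as current releases do.

```d
// Hypothetical custom exception, named here only for illustration.
class ConfigError : Exception
{
    // __FILE__ and __LINE__ used as default arguments are evaluated at the
    // call site, so they record where the exception was constructed.
    this(string msg, string file = __FILE__, size_t line = __LINE__)
    {
        super(msg, file, line);
    }
}

void main()
{
    try
    {
        throw new ConfigError("missing key");
    }
    catch (ConfigError e)
    {
        // The location travels with the exception, even without a stack trace.
        assert(e.msg == "missing key");
        assert(e.file == __FILE__);
        assert(e.line > 0);
    }
}
```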


Also, the addresses could go; they are useless here.

Denis
--
_
vita es estrany
spir.wikidot.com



Re: opDispatch, duck typing, and error messages

2011-04-22 Thread spir

On 04/22/2011 03:53 AM, Robert Jacques wrote:

On Thu, 21 Apr 2011 18:24:55 -0400, Adam D. Ruppe destructiona...@gmail.com
wrote:
[snip]

Or, there's a whole new approach:

e) Duck typing for ranges in to!() might be a bad idea. Again, remember,
a class might legitimately offer a range interface, so it would
trigger this message without opDispatch.

If ranges are meant to be structs, maybe isInputRange should check
is(T == struct)? This doesn't sit right with me though. The real
problem is to!() - other range functions probably don't overload
on classes separately than ranges, so it won't matter there.


I think the best thing to do is simply to prefer Object over range.

toImpl(T) if (isInputRange!(T) && (!is(T : Object)))

Or something along those lines. Why? If the object has its own
toString/writeTo methods, it seems fairly obvious to me anyway that
to!string ought to simply call them, regardless of what else there is.


There's actually a bug report regarding the toString vs range semantics issue,
it's issue 5354 ( http://d.puremagic.com/issues/show_bug.cgi?id=5354 ). Also
note that classes (but not structs as of yet, see bug 5719) can provide their
own to!T conversions.

However, what you ran into deserves a new bug report, since to!string should
always be able to fall back to toString and it didn't.

Agreed. A programmer who defines toString *means* it to be used for conversion 
to string (esp. for write* funcs). Please support and vote for this bug ;-)


Denis
--
_
vita es estrany
spir.wikidot.com



Re: link from a dll to another function in another dll?

2011-04-22 Thread maarten van damme
That example was a bit incomplete; it was preceded by the following code:

import std.c.windows.windows;
import core.dll_helper;

pragma(lib, "kernel33.lib");

__gshared HINSTANCE g_hInst;

// Standard DLL entry point: forwards each notification to the D runtime helpers.
extern (Windows)
BOOL DllMain(HINSTANCE hInstance, ULONG ulReason, LPVOID pvReserved)
{
    switch (ulReason)
    {
        case DLL_PROCESS_ATTACH:
            g_hInst = hInstance;
            dll_process_attach( hInstance, true );
            break;

        case DLL_PROCESS_DETACH:
            dll_process_detach( hInstance, true );
            break;

        case DLL_THREAD_ATTACH:
            dll_thread_attach( true, true );
            break;

        case DLL_THREAD_DETACH:
            dll_thread_detach( true, true );
            break;

        default:
            // Ignore any other notification.
            break;
    }
    return true;
}


Re: OOP, faster data layouts, compilers

2011-04-22 Thread Kai Meyer

On 04/22/2011 02:55 AM, Paulo Pinto wrote:

[snip]


I don't think C# is the next C++; it's impossible for C# to be what 
C/C++ is. There is a purpose and a place for Interpreted languages like 
C# and Java, just like there is for C/C++. What language do you think 
the interpreters for Java and C# are written in? (Hint: It's not Java or 
C#.) I also don't think that the core of Unity (or any decent game 
engine) is written in an interpreted language either, which basically 
means the guts are likely written in either C or C++. The point being 
made is that Systems Programming Languages like C/C++ and D are picked 
for their execution speed, and Interpreted Languages are picked for 
their ease of programming (or development speed). Since D is picked for 
execution speed, we should seriously consider every opportunity to 
improve in that arena. The OP wasn't just for the game developers, but 
for game framework developers as well.


Re: OOP, faster data layouts, compilers

2011-04-22 Thread Daniel Gibson
On 22.04.2011 18:48, Kai Meyer wrote:
[snip]

IMHO D won't be successful for games as long as it only supports
Windows, Linux and OSX on PC(-like) hardware.
We'd need support for modern game consoles (Xbox 360, PS3, maybe Wii) and
for mobile devices (Android, iOS, maybe Win7 phones and other stuff).
This means good PPC support (maybe the PS3's Cell CPU would need special
support even though it understands PPC code? I don't know.), ARM support,
and support for the operating systems and SDKs used on those platforms.

Of course execution speed is very important as well, but D in its
current state is not *that* bad in this regard. Sure, the GC is a bit
slow, but in high-performance games you shouldn't use it (or even
malloc/free) all the time anyway; see
http://www.digitalmars.com/d/2.0/memory.html#realtime

Another point: I find Minecraft pretty impressive. It really changed my
view of games developed in Java.

Cheers,
- Daniel


Re: OOP, faster data layouts, compilers

2011-04-22 Thread Kai Meyer

On 04/22/2011 11:05 AM, Daniel Gibson wrote:

[snip]


Hah, Minecraft. Have you tried loading up a high resolution texture pack 
yet? There's a reason why it looks like 8-bit graphics. It's not Java 
that makes Minecraft awesome, imo :)


Re: OOP, faster data layouts, compilers

2011-04-22 Thread Daniel Gibson
On 22.04.2011 19:11, Kai Meyer wrote:
[snip]

No I haven't.
What I find impressive is this (almost infinitely) big world that is
completely changeable, i.e. you can build new stuff everywhere, you can
dig tunnels everywhere (ok, somewhere really deep there's a limit) and
the game still runs smoothly. Haven't seen something like that in any
game before.


Re: OOP, faster data layouts, compilers

2011-04-22 Thread Kai Meyer

On 04/22/2011 11:20 AM, Daniel Gibson wrote:

[snip]


The random world generator is amazing, but that's not speed. The polygon
count of the game is excruciatingly low because the client is smart
enough to only draw the faces of blocks that are visible. The distance from
the very bottom (bedrock) to the very top of the sky (as high as you can
build blocks) is 256 blocks. The game is full of low-level bit-stuffing
(like stacks of 64). The genius of the game is not in any special
features of Java; it's in the data structure and data generator, which
could be done much faster in other languages. But that raises the question:
why does it need to be faster? It is fast enough in the JVM (unless
you load up the high-resolution textures, in which case the game becomes
unbearably slow when viewing long distances).


The purpose of the original post was to indicate that some low level 
research shows that underlying data structures (as applied to video game 
development) can have an impact on the performance of the application, 
which D (I think) cares very much about.


Linus with some good observations on garbage collection

2011-04-22 Thread Walter Bright

http://gcc.gnu.org/ml/gcc/2002-08/msg00552.html


Re: Linus with some good observations on garbage collection

2011-04-22 Thread Andrej Mitrovic
Just a reminder: that post is 9 years old.


Re: Linus with some good observations on garbage collection

2011-04-22 Thread Alvaro

On 22/04/2011 19:36, Walter Bright wrote:

http://gcc.gnu.org/ml/gcc/2002-08/msg00552.html


I've always been surprised when discussions usually just bring garbage 
collection as the only alternative to explicit manual memory management. 
I imagined it as a garbage truck that has its own schedule and may let a 
lot of trash pile up before passing by. I always naively thought, why 
not just free immediately when an object gets no references?


Not an expert, so there may be reasons I don't see, but now that Linus
says something along those lines, I'll ask: why not? Isn't it much easier
to do refcount++ and refcount--, and free() immediately if refcount==0?
Memory would be available for other needs sooner, with no need for an
additional thread, no large amount of memory consumed before the advanced
garbage truck decides to come by, and no slight pauses when collecting trash
(maybe only in old implementations), and the implementation is much
simpler...


OK, I knew about that cyclic references problem. But Linus doesn't 
seem to see a big problem and solutions can be found with care...
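
The refcount++ / refcount-- / free() scheme described above can be sketched
in D with a struct that counts its copies. Counted, Box and the freed counter
are hypothetical names for this illustration only; the sketch is
single-threaded and does nothing about cyclic references.

```d
import core.stdc.stdlib : free, malloc;

struct Counted
{
    // Heap block shared by all copies: the payload plus its reference count.
    private static struct Box
    {
        int value;
        size_t refs;
    }

    private Box* box;

    // Hypothetical demo counter: how many boxes have been freed so far.
    static int freed;

    this(int value)
    {
        box = cast(Box*) malloc(Box.sizeof);
        assert(box !is null);
        box.value = value;
        box.refs = 1;
    }

    // Postblit: runs after a copy is made, i.e. the refcount++ step.
    this(this)
    {
        if (box)
            ++box.refs;
    }

    // Destructor: the refcount-- step, freeing as soon as the count hits 0.
    ~this()
    {
        if (box && --box.refs == 0)
        {
            free(box);
            ++freed;
        }
    }

    size_t refs() const { return box ? box.refs : 0; }
}

void main()
{
    {
        auto a = Counted(42);
        assert(a.refs == 1);
        {
            auto b = a;          // copy: count goes up
            assert(a.refs == 2);
        }                        // b destroyed: count goes down
        assert(a.refs == 1);
        assert(Counted.freed == 0);
    }                            // last reference gone: freed immediately
    assert(Counted.freed == 1);
}
```

Note that every copy and every destruction touches the count, which is the
per-assignment cost raised later in this thread.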


Re: Linus with some good observations on garbage collection

2011-04-22 Thread Michael Stover
This sort of reference counting with a cyclic dependency detector is how a lot of
scripting languages do it, or did it in the past. The problem was that lazy
generational GCs are believed to have better throughput in general.

I'd like to say "were proved" rather than "are believed", but I don't
actually know where to go for such evidence. However, I do believe many
scripting languages, such as Python, eventually ditched the reference
counting technique for generational collection, and Java has a very fast GC,
so I am more inclined to believe those real-life solutions than Linus.

Mike

On Fri, Apr 22, 2011 at 2:32 PM, Alvaro alvaro.seg...@gmail.com wrote:

 [snip]



Re: Linus with some good observations on garbage collection

2011-04-22 Thread Steven Schveighoffer

On Fri, 22 Apr 2011 14:32:06 -0400, Alvaro alvaro.seg...@gmail.com wrote:


On 22/04/2011 19:36, Walter Bright wrote:

http://gcc.gnu.org/ml/gcc/2002-08/msg00552.html


I've always been surprised when discussions usually just bring garbage  
collection as the only alternative to explicit manual memory management.  
I imagined it as a garbage truck that has its own schedule and may let a  
lot of trash pile up before passing by. I always naively thought, why  
not just free immediately when an object gets no references?


Because you then have to update potentially two reference counts every
time you assign a pointer. GCs save you from doing that.


I know way, way less than Torvalds, but my naive brain says GCs still win
because oftentimes a slightly noticeable drop in performance is worth
having code that doesn't corrupt memory. This may not be true for kernel
development, but then again, we aren't all developing kernels ;)


-Steve


Re: OOP, faster data layouts, compilers

2011-04-22 Thread bearophile
Kai Meyer:

 The purpose of the original post was to indicate that some low level 
 research shows that underlying data structures (as applied to video game 
 development) can have an impact on the performance of the application, 
 which D (I think) cares very much about.

The idea of the original post was a bit more complex: how can we invent
new/better ways to express semantics in D code that will let future D
compilers make small changes to the layout of data structures to
increase code performance? Complex transforms of the data layout seem too
complex for even a good compiler, but maybe simpler ones will be possible. And
I think that to do this the D code needs some more semantics. I was suggesting an
annotation that forbids inbound pointers, which allows the compiler to move data
around a little, but this is just a start.

Bye,
bearophile


Re: Linus with some good observations on garbage collection

2011-04-22 Thread Brad Roberts
Also add to it that in many cases you're dealing with a threaded environment,
so those refcounts have to be locked (either via mutexes or, more commonly,
just atomic operations), which is far more expensive than non-atomic updates.
More so when there's actual contention for the refcounted resource.
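
A minimal sketch of what "atomic refcounts" look like in D, using core.atomic.
The names refs, retain, release and worker are hypothetical and chosen only
for this illustration; the sketch shows the mechanism and makes no claim about
how much slower it is than a plain increment.

```d
import core.atomic : atomicLoad, atomicOp;
import core.thread : Thread;

// Hypothetical reference count shared between threads; one owner to start.
shared size_t refs = 1;

// Atomic read-modify-write: safe across threads, unlike a plain ++refs.
void retain()
{
    atomicOp!"+="(refs, 1);
}

// Returns true when the caller has just dropped the last reference.
bool release()
{
    return atomicOp!"-="(refs, 1) == 0;
}

// Each worker takes and drops a reference many times, contending on refs.
void worker()
{
    foreach (i; 0 .. 10_000)
    {
        retain();
        release();
    }
}

void main()
{
    auto threads = [new Thread(&worker), new Thread(&worker)];
    foreach (t; threads)
        t.start();
    foreach (t; threads)
        t.join();

    // No update was lost: only the original reference remains.
    assert(atomicLoad(refs) == 1);
    assert(release()); // dropping it reports that the object can be freed
}
```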

On 4/22/2011 11:53 AM, Michael Stover wrote:
 [snip]



Re: OOP, faster data layouts, compilers

2011-04-22 Thread Andrew Wiley
On Fri, Apr 22, 2011 at 12:31 PM, Kai Meyer k...@unixlords.com wrote:

 On 04/22/2011 11:20 AM, Daniel Gibson wrote:

 Am 22.04.2011 19:11, schrieb Kai Meyer:

 On 04/22/2011 11:05 AM, Daniel Gibson wrote:

 Am 22.04.2011 18:48, schrieb Kai Meyer:


 I don't think C# is the next C++; it's impossible for C# to be what
 C/C++ is. There is a purpose and a place for Interpreted languages like
 C# and Java, just like there is for C/C++. What language do you think
 the interpreters for Java and C# are written in? (Hint: It's not Java
 or
 C#.) I also don't think that the core of Unity (or any decent game
 engine) is written in an interpreted language either, which basically
 means the guts are likely written in either C or C++. The point being
 made is that Systems Programming Languages like C/C++ and D are picked
 for their execution speed, and Interpreted Languages are picked for
 their ease of programming (or development speed). Since D is picked for
 execution speed, we should seriously consider every opportunity to
 improve in that arena. The OP wasn't just for the game developers, but
 for game framework developers as well.


 IMHO D won't be successful for games as long as it only supports
 Windows, Linux and OSX on PC (-like) hardware.
 We'd need support for modern game consoles (XBOX360, PS3, maybe Wii) and
 for mobile devices (Android, iOS, maybe Win7 phones and other stuff).
 This means good PPC (maybe the PS3's Cell CPU would need special support
 even though it understands PPC code? I don't know.) and ARM support
 and support for the operating systems and SDKs used on those platforms.

 Of course execution speed is very important as well, but D in its
 current state is not *that* bad in this regard. Sure, the GC is a bit
 slow, but in high performance games you shouldn't use it (or even
 malloc/free) all the time, anyway, see
 http://www.digitalmars.com/d/2.0/memory.html#realtime

 Another point: I find Minecraft pretty impressive. It really changed my
 view upon Games developed in Java.

 Cheers,
 - Daniel


 Hah, Minecraft. Have you tried loading up a high resolution texture pack
 yet? There's a reason why it looks like 8-bit graphics. It's not Java
 that makes Minecraft awesome, imo :)


 No I haven't.
 What I find impressive is this (almost infinitely) big world that is
 completely changeable, i.e. you can build new stuff everywhere, you can
 dig tunnels everywhere (ok, somewhere really deep there's a limit) and
 the game still runs smoothly. Haven't seen something like that in any
 game before.


 The random world generator is amazing, but it's not speed. The polygon
 count of the game is excruciatingly low because the client is smart enough
 to only draw the faces of blocks that are visible. The very bottom (bedrock)
 and the very top of the sky (as high as you can build blocks) is 256 blocks
 tall. The game is full of low-level bit-stuffing (like stacks of 64). The
 genius of the game is not in any special features of Java, it's in the data
 structure and data generator, which can be done much faster in other
 languages. But it begs the question, why does it need to be faster? It is
 fast enough in the JVM (unless you load up the high resolution textures,
 in which case the game becomes unbearably slow when viewing long distances.)


Actually, the world is 128 blocks tall, and divided into 16x128x16 block
chunks.
To elaborate on the bit stuffing, at the end of the day, each block is 2.5
bytes (type, metadata, and some lighting info) with exceptions for things
like chests.

The reason Minecraft runs so well in Java, from my point of view, is that
the authors resisted the Java urge to throw objects at the problem and
instead put everything into large byte arrays and wrote methods to
manipulate them. From that perspective, using Java would be about the same
as using any language, which let them stick to what they knew without
incurring a large performance penalty.
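
As a rough illustration of that "large byte arrays plus accessor methods" style, here is a minimal D sketch. It is not Minecraft's actual code: the Chunk name, the index order, and the nibble packing are assumptions made for the example, and the lighting data mentioned above is left out.

```d
/// Hypothetical chunk layout; names and packing are illustrative only.
struct Chunk
{
    enum sizeX = 16;
    enum sizeY = 128;
    enum sizeZ = 16;
    enum blockCount = sizeX * sizeY * sizeZ;

    ubyte[blockCount] types;          // one byte per block
    ubyte[blockCount / 2] metadata;   // one nibble per block, two per byte

    /// Flatten (x, y, z) into a single array index; y varies fastest.
    static size_t index(size_t x, size_t y, size_t z)
    {
        return (x * sizeZ + z) * sizeY + y;
    }

    /// Read the 4-bit metadata value of block i.
    ubyte getMeta(size_t i) const
    {
        immutable ubyte b = metadata[i / 2];
        return cast(ubyte)((i & 1) ? (b >> 4) : (b & 0x0F));
    }

    /// Write the low 4 bits of v as the metadata of block i.
    void setMeta(size_t i, ubyte v)
    {
        ubyte b = metadata[i / 2];
        if (i & 1)
            b = cast(ubyte)((b & 0x0F) | ((v & 0x0F) << 4));
        else
            b = cast(ubyte)((b & 0xF0) | (v & 0x0F));
        metadata[i / 2] = b;
    }
}

unittest
{
    Chunk c;
    immutable i = Chunk.index(3, 70, 9);
    c.types[i] = 42;
    c.setMeta(i, 0xA);
    assert(c.types[i] == 42);
    assert(c.getMeta(i) == 0xA);
}
```

No per-block object is ever allocated here; a whole chunk is two flat arrays, which is why the choice of language matters less for this kind of code.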

However, it's also true that as soon as you try to use a 128x128 texture
pack, you very quickly become disillusioned with Minecraft's performance.


using dylib with dmd

2011-04-22 Thread frostmind
Greetings every one,

I desperately searched the net for how to compile a D2
program that uses dylib on mac os, but unfortunately no luck.

What I've been doing:

I have a C library compiled as (taken from http://developer.apple.com/):
gcc -dynamiclib -std=gnu99 Ratings.c -current_version 0.1 -compatibility_version 0.1 -fvisibility=default -o libRatings.A.g
 -- so I have libRatings.A.dylib that my testapp will use.

I've converted Ratings.h header of the lib to Ratings.d file, all functions 
enclosed in extern(C);

I created a test program: test_d_client.d

import Ratings;
import std.stdio;

void main() {
    writeln("Starting test");

    char[] value1 = "*".dup;
    addRating(value1.ptr);
    writeln("rating: %s", ratings());
}

And now, when I execute: dmd test_d_client.d I get following output:

Undefined symbols:
  _addRating, referenced from:
  __Dmain in test_d_client.o
  _ratings, referenced from:
  __Dmain in test_d_client.o
ld: symbol(s) not found
collect2: ld returned 1 exit status
--- errorlevel 1

What am I doing wrong here?


Re: using dylib with dmd

2011-04-22 Thread Jesse Phillips
frostmind Wrote:

 And now, when I execute: dmd test_d_client.d I get following output:
 
 Undefined symbols:
   _addRating, referenced from:
   __Dmain in test_d_client.o
   _ratings, referenced from:
   __Dmain in test_d_client.o
 ld: symbol(s) not found
 collect2: ld returned 1 exit status
 --- errorlevel 1
 
 What am I doing wrong here?

You need to tell the linker where your library is. You can just pass it to dmd 
on the commandline.


Re: using dylib with dmd

2011-04-22 Thread frostmind
Thank you for your response! Hopefully I've done it right.

Now when everything is within the same folder,
and I execute:
dmd test_d_client.d -L.

(so I'm telling to look for libs in current dir)
Response is now different:

ld: in ., can't map file, errno=22
collect2: ld returned 1 exit status
--- errorlevel 1

What else could be done here to resolve it?


Re: Linus with some good observations on garbage collection

2011-04-22 Thread Timon Gehr
Brad Roberts wrote:
 Also add to it that in many cases you're dealing with a threaded environment, 
 
so those refcounts have to be locked
 (either via mutexes, or more commonly just atomic) operations which are far 
 more
expensive than non-atomic.  More so
 when there's actual contention for the refcounted resource.

That is only a problem if the reference count of that resource changes at a very
high frequency. The described problem also implies that the threads would not 
need
any form of synchronization for the data (otherwise the reference count 
certainly
would not be a bottleneck.)

I cannot, at the moment, think of a real-world example where this would not 
imply
bad design. Can you help me out?
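
To make the cost under discussion concrete, here is a minimal D sketch of an atomically updated reference count. The Counted type is hypothetical (it is not a Phobos or druntime type); only core.atomic's atomicOp is a real library call. Every retain/release is an atomic read-modify-write, which is the overhead Brad describes, and it only hurts when the count itself changes very often.

```d
import core.atomic : atomicOp;

/// Hypothetical reference-counted handle, for illustration only.
struct Counted
{
    shared int refs = 1;   // starts out owned by its creator

    /// Atomic increment: safe when several threads share the handle.
    void retain()
    {
        atomicOp!"+="(refs, 1);
    }

    /// Atomic decrement. Returns true when the last reference is gone,
    /// i.e. when the caller should free the resource.
    bool release()
    {
        return atomicOp!"-="(refs, 1) == 0;
    }
}

unittest
{
    Counted c;
    c.retain();             // two owners
    assert(!c.release());   // one owner left
    assert(c.release());    // last owner: time to free
}
```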

Michael Stover wrote:
 I'd like to say "were proved" rather than "are believed", but I don't actually
know where to go for such evidence.
  However, I do believe many scripting languages, such as python, eventually
ditched the reference counting technique for
 generational, and Java has very fast GC, so I am inclined to believe those
real-life solutions than Linus.

 Mike

Well, the GC may be fast when compared to other GCs, but it has to be designed 
to
run on general data whose reference structure can be arbitrary. Often, the
objects/references have a somewhat specialized structure that a smart programmer
can exploit, especially if the references point to private data. But that only
matters if performance is of prime interest, and the gains may be not very big.

But, as pointed out by Linus, the prime performance problem is _not_ the GC, but
the mindset that comes with it. Most programmers that grew up in a managed
environment tend to use very many new keywords all over their code, instead of
allocating large chunks of memory at once. (Java/C#/etc encourage you to do 
this.)
When they then try to write a C++ program, they do the same. The resulting 
memory
bugs are then blamed on the lack of a GC (you can have GC in C/C++, but most of
the people I have talked to do not know this.) They then happily change back to
Java, that has a very fast GC.

The important thing to note here is that the work required to deallocate all 
these
many memory locations does not magically disappear, but it is delegated to the 
GC,
which will mostly do it faster and more reliably than a programmer who has to 
do
it manually. But the problem does not lie in the deallocations, it's in the
allocations.

Consider this analogy:

Scenario 1: Many people like candy. Those are wrapped in colorful little pieces 
of
paper. Every day, every person buys one piece of candy in the candy shop (new
Candy()!) and on the way back home they throw away the wrapping paper somewhere 
on
the street (reassign reference). Those are garbage. In the evening, some creepy
guy comes to search the whole street for those small pieces of paper. Would you
call that guy a garbage collector? He collects garbage, but in the real world
garbage collectors work more like this:

Scenario 2: Still, many people like candy. Every person buys a bag of candy in 
the
candy shop once a year (new Candy[365]). When all the candy is eaten, they put 
all
the garbage in one bag and put it to their front door (reassign reference to 
whole
array). A very handsome guy collects all those bags. He is very much more
efficient than the guy in example 1. (Arguably, memory usage is bigger in that
particular example, but in computer programs, the allocating process can reuse 
the
memory. The analogy breaks down here.)
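
In D terms, the two scenarios could look roughly like the sketch below. Candy, buyDaily and buyYearly are made-up names for the analogy; the point is only the number of separate heap allocations the GC later has to track.

```d
/// Hypothetical payload type for the analogy.
struct Candy
{
    int flavour;
}

/// Scenario 1: one heap allocation per candy (plus one for the shelf
/// of pointers). Each candy becomes garbage on its own.
Candy*[] buyDaily(size_t days)
{
    auto shelf = new Candy*[days];
    foreach (i, ref c; shelf)
    {
        c = new Candy;              // a separate GC allocation every day
        c.flavour = cast(int) i;
    }
    return shelf;
}

/// Scenario 2: a single allocation for the whole year. Dropping the one
/// reference to the array makes all of it garbage at once.
Candy[] buyYearly(size_t days)
{
    auto bag = new Candy[days];     // one GC allocation
    foreach (i, ref c; bag)
        c.flavour = cast(int) i;
    return bag;
}

unittest
{
    auto shelf = buyDaily(365);
    auto bag = buyYearly(365);
    assert(shelf.length == bag.length);
    assert(shelf[364].flavour == bag[364].flavour);
}
```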

Note that I am not saying that garbage collection is bad. If the references 
form a
very complicated structure, or if a reference to the containing object is not
necessarily kept (i.e. array slicing), it can be very useful as well as faster 
than
manual memory management. Also, custom allocators can reduce the
lots-of-small-allocations problem, but they have some overhead too. Advanced GCs
may do this automatically, I don't know.

The reason Java is garbage collected is _not_ performance, but primarily
reliability. In big programming projects, it can be hard to know where a 
reference
to an object may be kept, if there are circular references etc, as programs keep
expanding without the programmers understanding the whole project in detail. GC
also removes bugs related to memory management. I think true and reliable OO is
not possible without a GC. The downside is, that many Java/... programmers don't
concern themselves much with the internals of memory allocations, and are _very_
proud of it. This is also the reason I think it is a bad idea to deprecate D's
'delete'.


-Timon


Re: Linus with some good observations on garbage collection

2011-04-22 Thread bearophile
Timon Gehr:

 But, as pointed out by Linus, the prime performance problem is _not_ the GC, 
 but
 the mindset that comes with it. Most programmers that grew up in a managed
 environment tend to use very many new keywords all over their code, instead 
 of
 allocating large chunks of memory at once. (Java/C#/etc encourage you to do 
 this.)

In C99 (and Ada) you avoid the allocation of some dynamic arrays with new 
thanks to variable length arrays.


 This is also the reason I think it is a bad idea to deprecate D's 'delete'.

D used to have scoped class instances, scoped classes, and delete; their 
replacements are not good enough yet. In CommonLisp you have hints for the GC, 
they are safe and they help you speed up the work of the GC. Such hints 
probably need to be integrated with the type system, so they may need to be 
built-ins as scope/delete were. I am not seeing enough discussion about this.

Bye,
bearophile


Transients or scoped immutability

2011-04-22 Thread bearophile
This post contains uncooked ideas.

I'd like to create data structures:
- That once created are immutable, so there is no risk to write on them, etc;
- On the stack too, avoiding slower heap allocations and avoiding copying them 
from the mutable to the immutable version;
- Avoiding to keep in the function name space a dead name of the mutable 
version of the data structure;
- Avoiding calls to functions that may contain loops that DMD doesn't inline;
- Avoiding too much complex code for the programmer.

A syntax idea, a do{}transient(name1, name2, ...);:


void foo(char[] data) {
    do {
        int[256] count;
        foreach (char c; data)
            count[c]++;
        int x = ...
        auto bar = map!(...)(...x...);
    } transient(const count, const bar);
    // Here x is not visible.
    // Here count and bar are visible but read-only.
}
void main() {
    foo("this is a string");
}


A different syntax, that inverts the precedent idea and uses a sub-scope (it's 
a bit like a with(){}, but its purpose is not to access fields of a struct):


void foo(char[] data) {
    int[256] count;
    foreach (char c; data)
        count[c]++;
    int x = ...
    auto bar = map!(...)(...x...);
    // Here x, count, data and bar are visible and mutable.
    scope (const count, const bar, data) {
        // Here x is not visible.
        // Here count and bar are visible but read-only.
        // Here data is visible and mutable
    }
    // Here x, count, data and bar are visible and mutable.
}
void main() {
    foo("this is a string");
}


I think the first idea is a bit less bug-prone, and it avoids too much 
indenting of the code. Probably there are ways to invent a better syntax.

They have added the idea of transients in Clojure too:
http://clojure.org/transients?responseToken=07a82a51e4651b10f3a8ee4be09fe1f9f

This idea is good and it will help, but it needs a function call:
http://d.puremagic.com/issues/show_bug.cgi?id=5081

Bye,
bearophile


Re: OOP, faster data layouts, compilers

2011-04-22 Thread Sean Cavanaugh

On 4/22/2011 2:20 PM, bearophile wrote:

Kai Meyer:


The purpose of the original post was to indicate that some low level
research shows that underlying data structures (as applied to video game
development) can have an impact on the performance of the application,
which D (I think) cares very much about.


The idea of the original post was a bit more complex: how can we invent 
new/better ways to express semantics in D code that will not forbid future D 
compilers to perform a bit of changes in the layout of data structures to 
increase code performance? Complex transforms of the data layout seem too 
complex for even a good compiler, but maybe simpler ones will be possible. And 
I think to do this the D code needs some more semantics. I was suggesting an 
annotation that forbids inbound pointers, that allows the compiler to move data 
around a little, but this is just a start.

Bye,
bearophile



In many ways the biggest thing I use regularly in game development that 
I would lose by moving to D would be good built-in SIMD support.  The PC 
compilers from MS and Intel both have intrinsic data types and 
instructions that cover all the operations from SSE1 up to AVX.  The 
intrinsics are nice in that the job of register allocation and 
scheduling is given to the compiler and generally the code it outputs is 
good enough (though it needs to be watched at times).


Unlike ASM, intrinsics can be inlined so your math library can provide a 
platform abstraction at that layer before building up to larger 
operations (like vectorized forms of sin, cos, etc) and algorithms (like 
frustum cull checks, k-dop polygon collision etc), which makes porting 
and reusing the algorithms to other platforms much much easier, as only 
the low level layer needs to be ported, and only outliers at the 
algorithm level need to be tweaked after you get it up and running.


On the consoles there is AltiVec (VMX) which is very similar to SSE in 
many ways.  The common ground is basically SSE1 tier operations : 128 
bit values operating on 4x32 bit integer and 4x32 bit float support.  64 
bit AMD/Intel makes SSE2 the minimum standard, and a systems language on 
those platforms should reflect that.


Loading and storing is comparable across platforms with similar 
alignment restrictions or penalties for working with unaligned data. 
Packing/swizzle/shuffle/permuting are different but this is not a huge 
problem for most algorithms.  The lack of fused multiply and add on the 
Intel side can be worked around or abstracted (i.e. always write code as 
if it existed, have the Intel version expand to multiple ops).


And now my wish list:

If you have worked with shader programming through HLSL or CG the 
expressiveness of doing the work in SIMD is very high.  If I could write 
something that looked exactly like HLSL but it was integrated perfectly 
in a language like D or C++, it would be pretty huge to me.  The amount 
of math you can have in a line or two in HLSL is mind boggling at times, 
yet extremely intuitive and rather easy to debug.




Re: OOP, faster data layouts, compilers

2011-04-22 Thread bearophile
Sean Cavanaugh:

 In many ways the biggest thing I use regularly in game development that
 I would lose by moving to D would be good built-in SIMD support.  The PC
 compilers from MS and Intel both have intrinsic data types and
 instructions that cover all the operations from SSE1 up to AVX.  The
 intrinsics are nice in that the job of register allocation and
 scheduling is given to the compiler and generally the code it outputs is
 good enough (though it needs to be watched at times).

This is a topic quite different from the one I was talking about, but it's an 
interesting topic :-)

SIMD intrinsics look ugly, they add a lot of noise to the code, and are very 
specific to one CPU, or instruction set. You can't design a clean language with 
hundreds of those. Once 256 or 512 bit registers come, you need to add new 
intrinsics and change your code to use them. This is not so good.

D array operations are probably meant to become smarter, when you perform a:

int[8] a, b, c;
a[] = b[] + c[];

A future good D compiler may use just two inlined instructions, or little more. 
This will probably include shuffling and broadcasting properties too.

Maybe this kind of code is not as efficient as handwritten assembly code (or C 
code that uses SIMD intrinsics) but it's adaptable to different CPUs, future 
ones too, it's much less noisy, and it seems safer.

I think such optimizations are better left to the back-end, so a long time ago 
I asked the LLVM devs for it, for future LDC:
http://llvm.org/bugs/show_bug.cgi?id=6956

The presence of such well implemented vector ops will not forbid another D 
compiler to add true SIMD intrinsics too.


 Unlike ASM, intrinsics can be inlined so your math library can provide a

DMD may eventually need this feature of the LDC compiler:
http://www.dsource.org/projects/ldc/wiki/InlineAsmExpressions

Bye,
bearophile


Re: Linus with some good observations on garbage collection

2011-04-22 Thread Iain Buclaw
== Quote from bearophile (bearophileh...@lycos.com)'s article
 Timon Gehr:
  But, as pointed out by Linus, the prime performance problem is _not_ the 
  GC, but
  the mindset that comes with it. Most programmers that grew up in a managed
  environment tend to use very many new keywords all over their code, 
  instead of
  allocating large chunks of memory at once. (Java/C#/etc encourage you to do 
  this.)
 In C99 (and Ada) you avoid the allocation of some dynamic arrays with new 
 thanks
to variable length arrays.

Variable length arrays are just sugary syntax for a call to alloca.

  This is also the reason I think it is a bad idea to deprecate D's 'delete'.
 D used to have scoped class instances, scoped classes, and delete, their
replacements are not good enough yet. In CommonLisp you have hints for the GC,
they are safe and they help you help speedup the work of the GC. Such hints
probably need to be integrated with the type system, so they may need to be
built-ins as scope/delete were. I am not seeing enough discussion about this.
 Bye,
 bearophile


I've always felt that Vala's system is better thought out, which is incidentally
based on a reference counting system. This makes destructors in Vala 
deterministic
and can be used to implement an RAII pattern for resource management.
To get around the common pitfalls of reference counting systems, they introduce
two keywords which alter the relationship between the allocated object and the 
GC,
'weak' and 'unowned'.

Rather than bore you with the gritty details here, see link:
http://live.gnome.org/Vala/ReferenceHandling


Re: Linus with some good observations on garbage collection

2011-04-22 Thread bearophile
Iain Buclaw:

 Variable length arrays are just sugary syntax for a call to alloca.

I have an enhancement request in Bugzilla on VLA, with longer discussions. Just 
two comments:
- It seems alloca() can be implemented with two different semantics: to 
deallocate at the end of the function or to deallocate at the end of the scope. 
Usually alloca() deallocates at the end of the function, but that semantic 
confusion is dangerous. VLA deallocate at the end of the scope, just like any 
other static array.
- To use alloca you need to use pointers, maybe even slices, it's not DRY, etc. 
So syntax sugar helps.

In the meantime I've changed my mind a little. Now for D I prefer something better 
than C99 VLAs. I'd like D-VLAs with the same syntax as C99 VLAs but with a 
safer semantics, closer to this one (but the alloca used here must deallocate 
at the end of the scope):

enum size_t MAX_VLA_SIZE = 1024;
static assert (is(typeof(size) == size_t));
T* ptr = null;
if ((size * T.sizeof) < MAX_VLA_SIZE)
    ptr = cast(T*)alloca(size * T.sizeof);
T[] array = (ptr == null) ? new T[size] : ptr[0 .. size];
array[] = T.init;

This has some advantages: when alloca returns null, or when the array is large, 
it uses the GC. This both avoids some stack overflows and reduces the risk of 
the stack memory becoming too cold.


 Rather than bore you with the gritty details here, see link:
 http://live.gnome.org/Vala/ReferenceHandling

This is interesting.

Bye,
bearophile


Re: Linus with some good observations on garbage collection

2011-04-22 Thread bearophile
 Now for D I prefer something better than C99 VLAs. I'd like D-VLAs with the same 
 syntax as C99 VLAs but with a safer semantics,

Never mind, that semantics can't use that syntax, otherwise you have hidden 
heap allocations... The syntax has to change (because the D-VLA semantics seems 
OK to me).

Bye,
bearophile


Re: OOP, faster data layouts, compilers

2011-04-22 Thread Mike Parker

On 4/23/2011 4:22 AM, Andrew Wiley wrote:







The reason Minecraft runs so well in Java, from my point of view, is
that the authors resisted the Java urge to throw objects at the problem
and instead put everything into large byte arrays and wrote methods to
manipulate them. From that perspective, using Java would be about the
same as using any language, which let them stick to what they knew
without incurring a large performance penalty.



FYI, Markus, the author, has been a figure in the Java game development 
community for years. He was the original client programmer for Wurm 
Online[1] (where the landscape is 'infinite' and tiled) and a frequent 
participant in the Java4k competition[2] (with Left4kDead[3] perhaps 
being his most popular). I think it's a safe assumption that the 
techniques he put to use in Minecraft were learned from his experiments 
with the Wurm landscape and with cramming Java games into 4kb.


[1] http://www.wurmonline.com/
[2] http://www.java4k.com/index.php?action=home
[3] http://www.mojang.com/notch/j4k/l4kd/


Re: Transients or scoped immutability

2011-04-22 Thread Jesse Phillips
bearophile Wrote:

 This post contains uncooked ideas.
 
 I'd like to create data structures:
 - That once created are immutable, so there is no risk to write on them, etc;
 - On the stack too, avoiding slower heap allocations and avoiding copying 
 them from the mutable to the immutable version;
 - Avoiding to keep in the function name space a dead name of the mutable 
 version of the data structure;
 - Avoiding calls to functions that may contain loops that DMD doesn't inline;
 - Avoiding too much complex code for the programmer.

There was talk in the past of allowing a pure function create and modify a 
class and return that class as immutable. This was a suggestion on how to 
create immutable classes. The details were never really hashed out, but maybe 
it is possible for any returned value (of a pure function) to be implicitly 
converted to immutable.

To me it sounds really nice and clean, do you think it would work for you?
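
A minimal sketch of how that could look, assuming the compiler accepts the implicit conversion for results of strongly pure functions (the idea tracked in issue 5081). The histogram helper is a made-up name, and it takes a string rather than char[] so that the function has no mutable parameters.

```d
/// Builds a byte histogram using ordinary mutable code.
/// `histogram` is a hypothetical helper, not a library function.
int[] histogram(string data) pure
{
    auto count = new int[256];
    foreach (char c; data)
        count[c]++;
    return count;
}

unittest
{
    // No other reference to the freshly built array can exist, so the
    // mutable result is allowed to become immutable without a copy.
    immutable(int)[] count = histogram("this is a string");
    assert(count['s'] == 3);
    assert(count[' '] == 3);
}
```

The mutable name `count` only exists inside the function, so nothing is left in the caller's scope to write through. The array still lives on the GC heap, though, so this does not cover the on-the-stack part of the wish list.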


Re: Transients or scoped immutability

2011-04-22 Thread bearophile
Jesse Phillips:

 To me it sounds really nice and clean, do you think it would work for you?

Maybe you have missed the last two lines of my post:

 This idea is good and it will help, but it needs a function call:
 http://d.puremagic.com/issues/show_bug.cgi?id=5081

I like that idea and I think it will be good to have in D. But I think it's not 
enough, to solve the problem I have shown it requires a not simple function 
signature, you need to instantiate the static array before the call point 
(that's not nice and asks for two names for the same array), to use ref both in 
input and output, etc.

Bye,
bearophile


Re: Linus with some good observations on garbage collection

2011-04-22 Thread Ulrik Mikaelsson
2011/4/22 Timon Gehr timon.g...@gmx.ch:
 That is only a problem if the reference count of that resource changes at a 
 very
 high frequency. The described problem also implies that the threads would not 
 need
 any form of synchronization for the data (otherwise the reference count 
 certainly
 would not be a bottleneck.)

 Michael Stover wrote:
 I'd like to say "were proved" rather than "are believed", but I don't 
 actually
 know where to go for such evidence.
  However, I do believe many scripting languages, such as python, eventually
 ditched the reference counting technique for
 generational, and Java has very fast GC, so I am inclined to believe those
 real-life solutions than Linus.

 Well, the GC may be fast when compared to other GCs, but it has to be 
 designed to
 run on general data whose reference structure can be arbitrary. Often, the
 objects/references have a somewhat specialized structure that a smart 
 programmer
 can exploit, especially if the references point to private data. But that only
 matters if performance is of prime interest, and the gains may be not very 
 big.

All in all, I think the best approach is a pragmatic one, where
different types of resources can be handled according to different
schemes.

I.E. default to GC-manage everything. After profiling, determining
what resources are mostly used, and where, optimize allocation for
those resources, preferably to scoped allocation, or if not possible,
reference-counted.

Premature optimization is a root of much evil, for instance, the
malloc-paranoid might very well resort to abuse of struct:s, leading
either to lots of manual pointers, or excessive memory copying.

Incidentally, this was the main thing that attracted me to D. Be
lazy/productive where performance doesn't matter much, and focus
optimization on where it does.


Re: OOP, faster data layouts, compilers

2011-04-22 Thread Sean Cavanaugh

On 4/22/2011 4:41 PM, bearophile wrote:

Sean Cavanaugh:


In many ways the biggest thing I use regularly in game development that
I would lose by moving to D would be good built-in SIMD support.  The PC
compilers from MS and Intel both have intrinsic data types and
instructions that cover all the operations from SSE1 up to AVX.  The
intrinsics are nice in that the job of register allocation and
scheduling is given to the compiler and generally the code it outputs is
good enough (though it needs to be watched at times).


This is a topic quite different from the one I was talking about, but it's an 
interesting topic :-)

SIMD intrinsics look ugly, they add lot of noise to the code, and are very 
specific to one CPU, or instruction set. You can't design a clean language with 
hundreds of those. Once 256 or 512 bit registers come, you need to add new 
intrinsics and change your code to use them. This is not so good.


In C++ the intrinsics are easily wrapped by __forceinline global 
functions, to provide a platform abstraction against the intrinsics.


Then, you can write class wrappers to provide the most common level of 
functionality, which boils down to a class to do vectorized math 
operators for + - * / and vectorized comparison functions == != <= >= < 
and >.  From HLSL you have to borrow the 'any' and 'all' statements 
(along with variations for every permutation of the bitmask of the test 
result) to do conditional branching for the tests.  This pretty much 
leaves swizzle/shuffle/permuting and outlying features (8,16,64 bit 
integers) in the realm of 'ugly'.
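
A rough D sketch of such a wrapper, to show the shape of the idea. Float4, greater, any and all are hypothetical names, and a plain float[4] stands in for the SIMD register; a real version would sit on top of platform intrinsics instead.

```d
/// Hypothetical 4-lane vector wrapper backed by a plain float[4].
struct Float4
{
    float[4] v;

    /// Lane-wise + - * / implemented with D's built-in array operations.
    Float4 opBinary(string op)(Float4 rhs) const
        if (op == "+" || op == "-" || op == "*" || op == "/")
    {
        Float4 r;
        mixin("r.v[] = v[] " ~ op ~ " rhs.v[];");
        return r;
    }

    /// Lane-wise "greater than": one bool per lane.
    bool[4] greater(Float4 rhs) const
    {
        bool[4] m;
        foreach (i; 0 .. 4)
            m[i] = v[i] > rhs.v[i];
        return m;
    }
}

/// HLSL-style reductions over a lane mask, used for branching.
bool any(bool[4] m) { return m[0] || m[1] || m[2] || m[3]; }
bool all(bool[4] m) { return m[0] && m[1] && m[2] && m[3]; }

unittest
{
    auto a = Float4([1f, 2f, 3f, 4f]);
    auto b = Float4([4f, 3f, 2f, 1f]);
    assert((a + b).v == [5f, 5f, 5f, 5f]);

    auto m = a.greater(b);   // lanes 2 and 3 are greater
    assert(any(m) && !all(m));
}
```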


From here you could build up portable SIMD transcendental functions 
(sin, cos, pow, log, etc), and other libraries (matrix multiplication, 
inversion, quaternions etc).


I would say in D this could be faked provided the language at a minimum 
understood what a 128 (SSE1 through 4.2) and 256 bit value (AVX) was and 
how to efficiently move it via registers for function calls.  Kind of 
'make it at least work in the ABI, come back to a good implementation 
later' solution.  There is some room to beat Microsoft here, as the 
code Visual Studio 2010 outputs currently for 64 bit environments cannot 
pass 128 bit SIMD values by register (forceinline functions are the only 
workaround), even though scalar 32 and 64 bit float values are passed by 
XMM register just fine.


The current hardware landscape dictates organizing your data in SIMD 
friendly manners.  Naive OOP based code is going to de-reference too 
many pointers to get to scattered data.  This makes the hardware 
prefetcher work too hard, and it wastes cache memory by only using a 
fraction of the RAM from the cache line, plus wasting 75-90% of the 
bandwidth and memory on the machine.




D array operations are probably meant to become smarter, when you perform a:

int[8] a, b, c;
a[] = b[] + c[];



Now the original topic pertains to data layouts, of which SIMD, the CPU 
cache, and efficient code all inter-relate.  I would argue the above 
code is an idealistic example, as when writing SIMD code you almost 
always have to transpose or rotate one of the sets of data to work in 
parallel across the other one.  What happens when this code has to 
branch?  In SIMD land you have to test if any or all 4 lanes of SIMD 
data need to take it.  And a lot of time the best course of action is to 
compute the other code path in addition to the first one, AND the first 
result and NAND the second one and OR the results together to make valid 
output.  I could maybe see a functional language doing ok at this.   The 
only reasonable construct to be able to explain how common this is in 
optimized SIMD code, is to compare it to HLSL's vectorized ternary 
operator (and understanding that 'a' and 'b' can be fairly intricate 
chunks of code if you are clever):


float4 a = {1,2,3,4};
float4 b = {5,6,7,8};
float4 c = {-1,0,1,2};
float4 d = {0,0,0,0};
float4 foo = (c > d) ? a : b;

results with foo = {5,6,3,4}

For a lot of algorithms the 'a' and 'b' path have similar cost, so for 
SIMD it executes about 2x faster than the scalar case, although better 
than 2x gains are possible since using SIMD also naturally reduces or 
eliminates a ton of branching which CPUs don't really like to do due to 
their long pipelines.
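
The compute-both-then-select step can be sketched in D as below. This is only an illustration: integer lanes are used so that the bitwise AND / AND-NOT / OR steps are legal in plain code, and select and greaterMask are hypothetical helper names. Real SIMD code would do the same thing on float registers with the hardware's mask instructions.

```d
/// Per-lane comparison producing a mask: all bits set where x > y,
/// no bits set elsewhere.
int[4] greaterMask(int[4] x, int[4] y)
{
    int[4] m;
    foreach (i; 0 .. 4)
        m[i] = x[i] > y[i] ? -1 : 0;
    return m;
}

/// Branch-free select: takes a[i] where the mask is set and b[i]
/// otherwise. Both a and b are already fully computed.
int[4] select(int[4] mask, int[4] a, int[4] b)
{
    int[4] r;
    foreach (i; 0 .. 4)
        r[i] = (a[i] & mask[i]) | (b[i] & ~mask[i]);
    return r;
}

unittest
{
    int[4] a = [1, 2, 3, 4];
    int[4] b = [5, 6, 7, 8];
    int[4] c = [-1, 0, 1, 2];
    int[4] d = [0, 0, 0, 0];
    // Same inputs as the HLSL example: (c > d) ? a : b
    assert(select(greaterMask(c, d), a, b) == [5, 6, 3, 4]);
}
```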




And as much as Intel likes to argue that a structure containing 
positions for a particle system should look like this because it makes 
their hardware benchmarks awesome, the following vertex layout is a failure:


struct ParticleVertex
{
    float[1000] XPos;
    float[1000] YPos;
    float[1000] ZPos;
}

The GPU (or Audio devices) does not consume it this way. The data is 
also not cache coherent if you are trying to read or write a single 
vertex out of the structure.


A hybrid structure which is aware of the size of a SIMD register is the 
next logical choice:


align(16)
struct ParticleVertex
{
    float[4] XPos;
    float[4] YPos;
    float[4] ZPos;
}
ParticleVertex[250] ParticleVertices;

// struct is also 
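
For reference, a small D sketch of how a single particle is addressed in such a blocked layout. ParticleBlock, lanes and position are hypothetical names chosen for the example; the layout itself follows the hybrid struct above.

```d
enum lanes = 4;   // floats per 128-bit SIMD register

/// One group of four particles, stored component by component.
align(16) struct ParticleBlock
{
    float[lanes] x;
    float[lanes] y;
    float[lanes] z;
}

/// Read particle i out of the blocked layout.
float[3] position(const ParticleBlock[] blocks, size_t i)
{
    const blk = &blocks[i / lanes];   // which group of four
    immutable lane = i % lanes;       // which slot inside the group
    return [blk.x[lane], blk.y[lane], blk.z[lane]];
}

unittest
{
    auto blocks = new ParticleBlock[250];   // room for 1000 particles
    blocks[1].x[2] = 1;
    blocks[1].y[2] = 2;
    blocks[1].z[2] = 3;
    // Particle 6 lives in block 1, lane 2.
    assert(position(blocks, 6) == [1f, 2f, 3f]);
}
```

All three components of one particle sit within the same 48-byte block, while the x, y and z values of four neighbouring particles can each be loaded as one register.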

Re: OOP, faster data layouts, compilers

2011-04-22 Thread bearophile
Sean Cavanaugh:

 In C++ the intrinsics are easily wrapped by __forceinline global
 functions, to provide a platform abstraction against the intrinsics.

When AVX will become 512 bits wide, or you need to use a very different set of 
vector register, your global functions need to change, so the code that calls 
them too has to change. This is acceptable for library code, but it's not good 
for D built-ins operations. D built-in vector ops need to be more clean, 
general and long-lasting, even if they may not fully replace SSE intrinsics.


 I would say in D this could be faked provided the language at a minimum
 understood what a 128 (SSE1 through 4.2) and 256 bit value (AVX) was and
 how to efficiently move it via registers for function calls.

Also think about what the D ABI will be 15-25 years from now. D design must 
look a bit more forward too.


 Now the original topic pertains to data layouts,

It was about how to not preclude future D compilers from shuffling data around 
a bit by themselves :-)


 I would argue the above
 code is an idealistic example, as when writing SIMD code you almost
 always have to transpose or rotate one of the sets of data to work in
 parallel across the other one.

Right.


 float4 a = {1,2,3,4};
 float4 b = {5,6,7,8};
 float4 c = {-1,0,1,2};
 float4 d = {0,0,0,0};
 float4 foo = (c > d) ? a : b;

Recently I have asked for a D vector comparison operation too (the compiler is 
supposed to be able to split the arrays into register-sized chunks for the 
comparisons). This is good for AVX instructions. A small problem here is that I 
think DMD currently allocates memory on the heap to instantiate those four little arrays:

int[4] a = [1,2,3,4];
int[4] b = [5,6,7,8];
int[4] c = [-1,0,1,2];
int[4] d = [0,0,0,0];
int[4] foo = (c[] > d[]) ? a[] : b[];
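
The vector comparison syntax above is a proposal, not something the compiler accepts. The sketch below spells out the per-lane meaning in ordinary D, taking the comparison to be greater-than (an assumption, since the operator is not visible in the archived text). The function name selectGreater is hypothetical. It builds an all-ones or all-zeros mask per lane and blends a and b with it, which is the branch-free form a SIMD select uses.

```d
/// For each lane i, pick a[i] when c[i] > d[i], otherwise b[i].
/// Equivalent to: result[i] = c[i] > d[i] ? a[i] : b[i];
int[4] selectGreater(const int[4] c, const int[4] d,
                     const int[4] a, const int[4] b)
{
    int[4] result;
    foreach (i; 0 .. 4)
    {
        // true becomes 1, negated to -1 (all bits set); false stays 0
        immutable int mask = -cast(int)(c[i] > d[i]);
        result[i] = (a[i] & mask) | (b[i] & ~mask);
    }
    return result;
}

unittest
{
    int[4] a = [1, 2, 3, 4];
    int[4] b = [5, 6, 7, 8];
    int[4] c = [-1, 0, 1, 2];
    int[4] d = [0, 0, 0, 0];
    assert(selectGreater(c, d, a, b) == [5, 6, 3, 4]);
}
```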


 Things get real messy when you have multiple vertex attributes as
 decisions to keep them together or separate are conflicting and both
 choices make sense to different systems :)

It's not easy for future compilers to perform similar auto-vectorizations :-)

Bye and thank you for your answer,
bearophile


Re: Temporarily disable all purity for debug prints

2011-04-22 Thread Andrew Wiley
On Fri, Apr 22, 2011 at 12:34 AM, dennis luehring dl.so...@gmx.net wrote:

 On 17.04.2011 22:45, Andrew Wiley wrote:

 On Sun, Apr 17, 2011 at 3:30 PM, dennis luehring dl.so...@gmx.net wrote:

 On 11.04.2011 23:27, bearophile wrote:


 From what I am seeing, in a D2 program if I have many (tens or more) pure
 functions that call each other, and I want to add (or activate) a
 printf/writeln inside one (or a few) of those functions to debug it, I may
 need to temporarily comment out the pure attribute of many functions
 (because printing can't be allowed in pure functions).

 As more and more D2 functions become pure in my code and in Phobos,
 something like a -disablepure compiler switch (and printf/writeln inside
 debug{}) may allow more handy debugging with prints (if the purity is well
 managed by the compiler then I think disabling the pure attributes doesn't
 change the program output).

 Bye,
 bearophile


 sounds a little bit like the need to see a private/protected part of an
 interface in unittest scenarios - just to be able to do whitebox testing
 without changing the attributes of the production code


 Isn't this already there because private makes things visible to all
 other code in the same module?


 ok - but what about protected? as a whitebox tester I'm not able (allowed) to
 change production code, but I need to test through all the code (especially
 when doing code-coverage stuff)


As far as I'm aware, all the visibility levels make things visible to code
in the same module.
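
A short sketch of what that means in practice. Counter and bumpAndPeek are hypothetical names used only for illustration. Code in the same module as a class, including a unittest block, can reach both its private and its protected members without any change to the attributes.

```d
class Counter
{
    private int count;
    protected void bump(int n) { count += n; }
}

/// Lives in the same module as Counter, so both members are accessible here.
int bumpAndPeek(int n)
{
    auto c = new Counter;
    c.bump(n);       // protected member
    return c.count;  // private member
}

unittest
{
    assert(bumpAndPeek(5) == 5);
}
```

Moving bumpAndPeek into a different module would make both accesses compile errors, so this only helps when the tests live next to the code they exercise.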


Re: Linus with some good observations on garbage collection

2011-04-22 Thread Timon Gehr
Ulrik Mikaelsson wrote:
 All in all, I think the best approach is a pragmatic one, where
 different types of resources can be handled according to different
 schemes.

 I.E. default to GC-manage everything. After profiling, determining
 what resources are mostly used, and where, optimize allocation for
 those resources, preferably to scoped allocation, or if not possible,
 reference-counted.

 Premature optimization is a root of much evil, for instance, the
 malloc-paranoid might very well resort to abuse of struct:s, leading
 either to lots of manual pointers, or excessive memory copying.

 Incidentally, this was the main thing that attracted me to D. Be
 lazy/productive where performance doesn't matter much, and focus
 optimization on where it does.

That is very true. GC is almost always fast enough, or even faster, and it is
clearly the most convenient option. And yes, identify bottlenecks first, optimize later.
But I also think programs that show some concern for efficient memory allocation
(with or without GC) tend to be better designed in general. This actually
increases productivity. Plus, it reduces the need for complicated optimizations
later on, which in turn improves maintainability.

-Timon
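
The "GC by default, scoped allocation where profiling says it matters" approach from the quoted text could look roughly like this. It is a sketch only: Histogram and the two functions are invented for the example, and std.typecons.scoped is just one possible way to get scoped allocation in D2.

```d
import std.typecons : scoped;

class Histogram
{
    int[256] bins;
    void add(ubyte b) { ++bins[b]; }
}

/// Default version: the instance is allocated on the GC heap.
int countZerosGc(const(ubyte)[] data)
{
    auto h = new Histogram;
    foreach (b; data)
        h.add(b);
    return h.bins[0];
}

/// Same logic after profiling: the instance lives on the stack and is
/// destroyed when the function returns, so no GC allocation happens here.
int countZerosScoped(const(ubyte)[] data)
{
    auto h = scoped!Histogram();
    foreach (b; data)
        h.add(b);
    return h.bins[0];
}

unittest
{
    ubyte[] data = [0, 1, 0, 2];
    assert(countZerosGc(data) == 2);
    assert(countZerosScoped(data) == 2);
}
```

The two functions differ in one line, which is the point: the optimization can be applied late and locally without redesigning the class.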


Re: Next Release

2011-04-22 Thread Joel Christensen


Well, then I'd better make sure that I get my most recent updates to
std.datetime in soon.

- Jonathan M Davis


Does your library take into account that there's no year 0?


Re: Web development howto?

2011-04-22 Thread Robert Clipsham

On 22/04/2011 03:53, Jaime Barciela wrote:

Hello everyone,

I'm going through TDPL and I just joined this list.

I've been looking for guidance on how to do web applications in D but
I haven't found anything.

My background is not C/C++ but Java (and Delphi many years ago) so I
have not only a new language but a new culture to get used to as well.

Could somebody give me some pointers?

Thanks
Jaime


The simplest way to make a web application with D is to use CGI/FastCGI 
etc. There are also at least two frameworks in development (that I know 
of), one is significantly more developed.


http://arsdnet.net/dcode/ - see web.d, cgi.d etc, this is the most 
mature (that I know of).


https://github.com/mrmonday/serenity - One I'm working on. It's due to 
undergo significant changes and is lacking a lot of basic functionality, 
so I'd avoid it, for now at least.
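
For the plain CGI route, a minimal sketch might look like the following. It assumes a web server configured to run the compiled program as a CGI executable. The function names escapeHtml, buildResponse and handleRequest are hypothetical and are not taken from either framework above. Under CGI the server passes the query string in the QUERY_STRING environment variable and sends whatever the program writes to standard output back to the browser.

```d
import std.array : replace;
import std.process : environment;
import std.stdio : write;

/// Escape the characters that matter inside HTML text.
string escapeHtml(string s)
{
    return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
}

/// Build a complete CGI response: header, blank line, then the body.
string buildResponse(string query)
{
    return "Content-Type: text/html; charset=utf-8\r\n\r\n"
         ~ "<html><body><p>Query string: " ~ escapeHtml(query)
         ~ "</p></body></html>\n";
}

/// What the program's main() would call once per request.
void handleRequest()
{
    write(buildResponse(environment.get("QUERY_STRING", "")));
}

unittest
{
    assert(escapeHtml("a<b&c") == "a&lt;b&amp;c");
}
```

A real application would also need form decoding, cookies and so on, which is the kind of work the libraries listed above take care of.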


--
Robert
http://octarineparrot.com/


Re: Next Release

2011-04-22 Thread Jonathan M Davis
  Well, then I'd better make sure that I get my most recent updates to
  std.datetime in soon.
  
  - Jonathan M Davis
 
 Does your library take into account that there's no year 0?

Actually, for ISO 8601, which the library follows, there _is_ a year 0. Date, 
DateTime, and SysTime all have the function yearBC which will give you the 
year as you would normally expect (1 B.C. being immediately prior to 1 A.D. 
with no year 0). But the ISO standard calls for a year 0, and I followed the 
standard (it's also way easier to deal with programmatically). So, other than 
the yearBC function, it treats 0 as the year prior to 1 A.D., and the years 
prior to 0 are negative.

- Jonathan M Davis
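
A small sketch of the numbering described above, using Date from std.datetime as it exists in current Phobos. The helper isoYearToBC is a hypothetical function written for this example to show the arithmetic. ISO year 0 is 1 B.C., ISO year -1 is 2 B.C., and so on.

```d
import std.datetime : Date;

/// Convert an ISO 8601 year (0, -1, -2, ...) to the conventional B.C. number.
int isoYearToBC(int isoYear)
{
    assert(isoYear <= 0, "only years before 1 A.D. have a B.C. number");
    return 1 - isoYear;
}

unittest
{
    assert(Date(0, 1, 1).yearBC == 1);   // ISO year 0 is 1 B.C.
    assert(Date(-1, 1, 1).yearBC == 2);  // ISO year -1 is 2 B.C.
    assert(isoYearToBC(0) == 1);
    assert(isoYearToBC(-99) == 100);
}
```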


Expression templates in D1

2011-04-22 Thread SiegeLord
I have been trying to create some simple expression templates in D1 but I've 
run into trouble. Here is a reduced test case:

class A
{
void opSub_r(T:int)(T a)
{

}

void opSub(T)(T a)
{

}
}

void main(char[][] args)
{  
A a;
a - 1;
a - a; // line 20
1 - a;
}

The error is:

test.d(20): Error: template test.A.opSub_r(T : int) does not match any function 
template declaration
test.d(20): Error: template test.A.opSub_r(T : int) cannot deduce template 
function from argument types !()(A)

My goal is to adjust the code such that the three expressions in the main 
function all compile. Both opSub's must be templated functions (as far as I can 
tell) so that the expression templates can work properly. Here is my full code, 
incidentally:

http://ideone.com/2vZdN

Any ideas for workarounds? Has anyone done expression templates in D1 and got 
them to work?

And yes... the code above works fine in D2, but I want to try to get it to work 
in D1 for now.

Thanks,

-SiegeLord
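
For comparison only, here is a sketch of the same three expressions written with D2's opBinary / opBinaryRight. It is not a D1 workaround. The names Sub and eval, the value field and the constructor are additions for illustration and do not come from the linked code. The template constraint on opBinaryRight keeps it from being considered for "a - a", which is the case that fails in the D1 test case.

```d
/// Expression node recording a subtraction without computing it.
struct Sub(L, R)
{
    L lhs;
    R rhs;
}

class A
{
    int value;
    this(int v) { value = v; }

    /// Handles a - x for any right-hand type.
    auto opBinary(string op, T)(T rhs) if (op == "-")
    {
        return Sub!(A, T)(this, rhs);
    }

    /// Handles x - a only when x converts to int, so a - a never lands here.
    auto opBinaryRight(string op, T)(T lhs) if (op == "-" && is(T : int))
    {
        return Sub!(T, A)(lhs, this);
    }
}

/// Walk an expression tree and compute its value.
int eval(T)(T e)
{
    static if (is(T : int))
        return e;
    else static if (is(T == A))
        return e.value;
    else
        return eval(e.lhs) - eval(e.rhs);
}

unittest
{
    auto a = new A(10);
    assert(eval(a - 1) == 9);
    assert(eval(a - a) == 0);
    assert(eval(1 - a) == -9);
}
```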