Re: Stroustrup's slides about c++11 and c++14

2014-09-14 Thread Sean Cavanaugh via Digitalmars-d

On 9/13/2014 3:10 PM, eles wrote:

This presentation:

https://parasol.tamu.edu/people/bs/622-GP/C++14TAMU.pdf

He criticizes C99 VLA (slide 24) as being an abomination

But the surprise comes at the end (slide 57), where he also
criticizes... the static if as being a total abomination. Well, this
is D, I told myself.

Are those points valid?:

static if is a total abomination
• Unstructured, can do everything (just like goto)
• Complicates static analysis (AST-based tools get hard to write)
• Blocks the path for concepts
• Specifies how things are done (implementation)
• Is three slightly different “ifs” using a common syntax
• Redefines the meaning of common notation (such as { ... })



It's a downright cute and cuddly abomination to have, as opposed to the 
one that exists in the world of #define and its minions: 
#if/#ifdef/#elif/#if defined()/#else/#endif





Re: A Perspective on D from game industry

2014-06-18 Thread Sean Cavanaugh via Digitalmars-d

On 6/18/2014 1:05 AM, c0de517e wrote:

On Wednesday, 18 June 2014 at 03:28:48 UTC, Sean Cavanaugh wrote:


I had a nice sad 'ha ha' moment when I realized that msvc can't cope
with restrict on the pointers feeding into the simd intrinsics; you
have to cast it away.  So much for that perf :)


http://blogs.msdn.com/b/vcblog/archive/2013/07/12/introducing-vector-calling-convention.aspx



VectorCall is all about working around the original x64 ABI, which only lets you 
officially pass float and double scalars around in the xmm registers.   
Vectors require writing a bunch of helper forceinline functions that always 
operate on pointers or references, since pass by value lacked vectorcall, 
and on x86 pass by value for xmm types won't even compile.


Ultimately the code ends up calling something like _mm_store_ps or 
_mm_stream_ps etc.; those are the functions that take pointers, and you 
have to cast away volatile (and AFAIK restrict is ignored on them as well, 
but you don't have to cast it away).




Re: A Perspective on D from game industry

2014-06-17 Thread Sean Cavanaugh via Digitalmars-d

On 6/15/2014 4:34 PM, Joakim wrote:


He clarifies in the comments:

D is not 'high-performance' the same way as C and C++ are not. Systems
is not the same as high-performance. Fortran always has been more
'high-performance' than C/C++ as it doesn't have pointer aliasing (think
that C++ introduced restrict, which is the bread and butter of a HPC
language only in C++11, same for threading, still no vector types...)
for example. ISPC is a HPC language or Julia, Fortran, even Numpy if you
want, not D or C or C++
http://c0de517e.blogspot.in/2014/06/where-is-my-c-replacement.html?showComment=1402865174608#c415780017887651116



I had a nice sad 'ha ha' moment when I realized that msvc can't cope 
with restrict on the pointers feeding into the simd intrinsics; you have 
to cast it away.  So much for that perf :)


Re: Tail pad optimization, cache friendlyness and C++ interrop

2014-06-14 Thread Sean Cavanaugh via Digitalmars-d

On 6/11/2014 8:56 AM, Remo wrote:


This is pretty strange behavior.
At least on Windows I can not confirm this.
Visual Studio 2013, Intel Compiler and Clang for windows have the same
consistent behavior here.

private  do NOT affect struct size.

But there is a parameter in Visual Studio that will affect it called
Struct Member Alignment
http://msdn.microsoft.com/en-us/library/xh3e3fd0.aspx

1 Byte (/Zp1)  sizeof(S1)=5,  sizeof(S2)=6
2 Byte (/Zp2)  sizeof(S1)=6,  sizeof(S2)=8
4 Byte (/Zp4)  sizeof(S1)=8,  sizeof(S2)=12
8 Byte (/Zp8)  sizeof(S1)=8,  sizeof(S2)=12   this is the default.

Of course the same can be done with #pragma pack(?) .

There is also __declspec(align(?)) and in C++11  alignas(?) but its
behavior is different from #pragma pack(?) .




For inheritance I've seen gcc and clang allow the inheriting class's members 
to go into the tail padding of the previous class in the hierarchy; technically 
legal, but it always gives me scary thoughts before going to sleep at night.


MSVC pads classes up to the alignment size and inheritance starts 
from there; a bit wasteful, but far safer.


struct foo { int x; char y; };
struct bar : public foo { char z; };

sizeof(foo) == 8;
sizeof(bar) == 8;  // clang and gcc on linux
sizeof(bar) == 12; // msvc

The windows case is generally safer as memset(class, 0, sizeof(class)) 
won't clobber inherited members, but it's also not supposed to be legal 
to memset anything that fails is_pod etc. in C++.  However, since nearly 
all code I've worked with has constructors, is_pod is useless and people 
just memset struct-like objects without a care in the world, which can 
be fun when porting :)





Re: win64 as orphan?

2014-06-12 Thread Sean Cavanaugh via Digitalmars-d-learn

On 6/9/2014 11:42 AM, lurker wrote:

i agree with you, but you should have posted in announce, so that
adrei can use it for some marketing.
i too wait now for a long, long time to use it with win64. i am also
giving up - i guess it will stay a linux/apple show.
maybe, as a multiple os compiler, you can use lazarus or code typhon.
cheers.


On Monday, 9 June 2014 at 15:04:19 UTC, trail wrote:

will the sorry state of the win64 headers and programs like dfl
be fixed or is it time to leave the language to linux and move on
to something else?




Clang can parse windows.h these days; it might be worthwhile to use 
their toolchain to dump the various SDKs' windows.h into some kind of 
database, and write an exporter from that database to D.   I imagine 
there is some overlap here, in that other languages could use something 
like this to provide up-to-date Windows bindings (MinGW in particular, 
and anyone else making new languages).


I'm sure some hand-written additions would still be needed, but a huge 
amount of the API could probably be handled with something like that.




Re: what keeps a COM object alive?

2013-06-12 Thread Sean Cavanaugh

On 6/11/2013 10:38 PM, finalpatch wrote:

A typical COM server would create a new object (derived from IUnknown),
return it to the caller (potentially written in other languages).
Because the object pointer now resides outside of D's managed heap, does
that mean the object will be destroyed when the GC runs? A normal COM
object written in C++ relies on reference counting to manage its life
cycle but in D the ref counting seems not serving any purpose. The
AddRef()/Release() of std.c.windows.com.ComObject maintains a ref count,
but doesn't really use it for any thing. There's a comment in Release()
says let the GC reap it, but how does the GC know the object is okay
to destroy?


COM is by definition ref counted.  In D you generally need to store a 
COM pointer in a struct and maintain the refcount with that struct 
(increment on copy, decrement in the destructor, etc.).  It's not too hard, 
as 'alias this' can wrap the pointer's methods easily enough.
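
A minimal sketch of such a wrapper, assuming the IUnknown from 
std.c.windows.com (the struct name here is invented for illustration):

import std.c.windows.com : IUnknown;

struct ComPtr(T) if (is(T : IUnknown))
{
    T ptr;

    this(this)                       // increment on copy
    {
        if (ptr) ptr.AddRef();
    }

    ~this()                          // decrement in the destructor
    {
        if (ptr) { ptr.Release(); ptr = null; }
    }

    alias ptr this;                  // forward method calls to the wrapped interface
}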





Re: DMD under 64-bit Windows 7 HOWTO

2013-06-04 Thread Sean Cavanaugh

On 6/1/2013 11:08 PM, Sean Cavanaugh wrote:

On 6/1/2013 11:06 PM, Sean Cavanaugh wrote:

On 6/1/2013 8:57 PM, Adam Wilson wrote:


Ok, so how did you get VisualD to not use OPTLINK?



I have my project settings set to 'Combined compile and link'
(bottom-most option of the General part of the project settings).

dmd is invoking the linker specified in the sc.ini this way

(it's a small and monolithic executable)




The other configuration types there seemed to work for me as well, so
I'm having trouble even breaking it.


I remember having to add -m64 to the command line in the settings 
somewhere as well, but I'm writing this from work, from memory, before 
I forget again :)





Re: Slow performance compared to C++, ideas?

2013-06-04 Thread Sean Cavanaugh

On 6/4/2013 12:58 AM, Andrei Alexandrescu wrote:


Unless fresh arguments, facts, or perspectives come about, I am
personally not convinced, based on this thread so far, that we should
operate a language change.



The best you could do without a language change is to establish some 
good conventions for class design:


A) prefer writing UFCS functions
B) make it the norm for classes to segment their functions, a lot like 
the public/private setup of C++ classes:   have blocks for final, 
override, and final override.   Ideally there would be a virtual keyword 
there like the others.



At least in C++ land I can tell if a code library is useful to me by 
searching for how it uses virtuals, among other things like searching 
for catches and throws since I am stuck on platforms without exception 
handling and am working in a codebase that is designed without it.



I also find the idea that the linker could help a bit iffy, since once 
you start DLLEXPORT'ing your classes the dependency tree basically 
expands to your whole codebase, and it quickly cascades into being unable 
to eliminate most functions and global variables from the final image. 
You end up with fat binaries and it's hard to prune dead stuff; it 
has to be done by hand by walking the map files for iffy objects and 
tracking down their dependencies.  This is especially true if nearly 
everything is virtual.



I suppose in theory there is an alternative change which wouldn't 
require a new keyword: make methods virtual only if they are defined in 
an interface, and forbid them coming into existence at class scope.  But 
this would probably increase the code maintenance burden a bit too much.





Re: Slow performance compared to C++, ideas?

2013-06-04 Thread Sean Cavanaugh

On 6/4/2013 2:25 AM, Walter Bright wrote:


One possibility is to introduce virtual as a storage class that
overrides final. Hence, one could write a class like:

class C {
   final:
 void foo();
 void baz();
 virtual int abc();
 void def();
}

This would not break any existing code, and Manu would just need to get
into the habit of having final: as the first line in his classes.


The problem isn't going to be in your own code, it will be in using 
everyone else's.





Re: Slow performance compared to C++, ideas?

2013-06-04 Thread Sean Cavanaugh

On 6/4/2013 2:46 AM, Walter Bright wrote:

On 6/4/2013 12:32 AM, Sean Cavanaugh wrote:

The problem isn't going to be in your own code, it will be in using
everyone else's.


If you're forced to use someone else's code and are not allowed to
change it in any way, then you're always going to have problems with
badly written APIs.

Even Manu mentioned that he's got problems with C++ libraries because of
this, and C++ has non-virtual by default.



Changing third party libraries is a maintenance disaster, as they can 
rename files and make other changes that cause your local modifications 
to disappear into the ether after a merge.  We have a good number of 
customizations to wxWidgets here, and I had to carefully port them all 
up to the current wx codebase because the old one wasn't safe to use in 
64 bits on windows.


Also, final-izing a third party library is going to be a pretty 
dangerous thing to do and likely introduce some serious bugs along the way.





Re: DMD under 64-bit Windows 7 HOWTO

2013-06-01 Thread Sean Cavanaugh

On 5/25/2013 8:24 PM, Manu wrote:

I  might just add, that if you have Visual Studio installed (which I
presume many Windows dev's do), then you don't need to do ANYTHING.
DMD64 just works if VS is present.

I didn't do a single thing to get DMD-Win64 working. And it's working great.

You should make sure this is clear at the top of any wiki entry.

Perhaps a future push to convince Walter to port DMD-Win32 to
COFF/WinSDK aswell might be nice ;)
Win32 is still an important platform for many (most?) users.




under VS2012 I had to edit sc.ini to point directly to the linker:

LINKCMD64=C:\Program Files (x86)\Microsoft Visual Studio 
11.0\VC\bin\amd64\link.exe


and in visuald add the win8 sdk lib path to the lib directories:

C:\Program Files (x86)\Windows Kits\8.0\Lib\win8\um\x64

And from there it just worked

The other variables the stock sc.ini uses are only set if you run 
vsvars32.bat (and the newer batch files only enable one platform at a time 
instead of all of them, which can trip people up when trying to do 
everything from a single command line)





Re: DMD under 64-bit Windows 7 HOWTO

2013-06-01 Thread Sean Cavanaugh

On 6/1/2013 11:06 PM, Sean Cavanaugh wrote:

On 6/1/2013 8:57 PM, Adam Wilson wrote:


Ok, so how did you get VisualD to not use OPTLINK?



I have my project settings set to 'Combined compile and link'
(bottom-most option of the General part of the project settings).

dmd is invoking the linker specified in the sc.ini this way

(it's a small and monolithic executable)




The other configuration types there seemed to work for me as well, so 
I'm having trouble even breaking it.


Re: DMD under 64-bit Windows 7 HOWTO

2013-06-01 Thread Sean Cavanaugh

On 6/1/2013 8:57 PM, Adam Wilson wrote:


Ok, so how did you get VisualD to not use OPTLINK?



I have my project settings set to 'Combined compile and link' 
(bottom-most option of the General part of the project settings).


dmd is invoking the linker specified in the sc.ini this way

(it's a small and monolithic executable)




Re: Slow performance compared to C++, ideas?

2013-05-31 Thread Sean Cavanaugh

On 5/31/2013 4:42 AM, Manu wrote:


People already have to type 'override' in every derived class, and
they're happy to do that. Requiring to type 'virtual' in the base is
hardly an inconvenience by contrast. Actually, it's quite orthogonal.
D tends to prefer being explicit. Why bend the rules in this case,
especially considering the counterpart (override) is expected to be
explicit? Surely both being explicit is what people would expect?



Maybe the solution is to make everyone equally unhappy:

all (non-constructor) class methods must be either final, override, or 
virtual; if you leave one of these off, you get an error :)
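
In today's D that would look something like this minimal sketch (the 
explicit 'virtual' marker is the hypothetical part of the proposal; here 
it appears only as a comment):

class Shape
{
    final void id() { }          // explicitly non-overridable
    void draw() { }              // virtual by default today; under the proposal
                                 // this would have to say 'virtual' or be an error
}

class Circle : Shape
{
    override void draw() { }     // 'override' is already expected when overriding
}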





Re: DConf 2013 Day 1 Talk 6: Concurrent Garbage Collection for D by Leandro Lucarella

2013-05-27 Thread Sean Cavanaugh

On 5/24/2013 11:12 PM, Diggory wrote:

On 64-bit windows there is also the GetWriteWatch function which lets
you access the dirty flag in the page table = no page faults = super
efficient concurrent generational GC. Just a shame it doesn't exist on
32-bit systems for some reason.


There's all sorts of interesting stuff in 64 bit windows :)  The user 
mode thread scheduler is pretty cool.


On the flip side: 32 bit is in its twilight days, and I am reasonably 
confident the next game I work on will be the last one that even 
supports 32 bits.  Then I can finally use all the new 64 bit goodies :) 
  32 bits will be reserved for phones and tablets (and even then 
tablets will probably be making the switch pretty soon-ish)





Re: dmd 2.063 beta 5

2013-05-23 Thread Sean Cavanaugh

On 5/23/2013 11:17 PM, Walter Bright wrote:

On 5/23/2013 8:53 PM, Steven Schveighoffer wrote:

On Thu, 23 May 2013 23:38:32 -0400, Walter Bright <newshou...@digitalmars.com> wrote:


On 5/23/2013 7:38 PM, Steven Schveighoffer wrote:

This is one change where ALL code broken by this change
is fixable with a simple solution, and at some point, people will
have to deal
with this.


Yes, but it is not so urgent as RIGHT NOW YOU MUST FIX IT. Hence, the
warning.


If they aren't in the mood to change their code, they don't have to
upgrade to
the latest compiler.


Q: Why do we bother with the whole warning and deprecation thing anyway?

A: Because forcing people to conform to our schedule with no warning is
a bit presumptuous. Someone writing a library, for example, has to be up
to date, and anyone using that library, for another, has to deal with it
if it fails to compile. If your code is built from multiple blobs from
various places, it is unreasonable to expect them to all instantly and
simultaneously upgrade.



I would argue the language should be designed with the ability to easily 
add keywords without breaking things.  I would expect a simple 
implementation to be that each source file is marked with the version of 
the compiler it was written for, so the compiler could A) tell you that 
you are using what is now a keyword and B) compile the code correctly anyway.


The lack of this kind of feature is why C++11's new null type is named 
nullptr and not null, and why pure virtuals are declared with '= 0' 
instead of a keyword.  Arguably C++11's range-based for loops are also 
designed with this constraint (avoid adding a keyword), but they are not 
as ugly as = 0 pure virtuals :)









Re: D on next-gen consoles and for game development

2013-05-23 Thread Sean Cavanaugh

On 5/24/2013 12:25 AM, deadalnix wrote:

On Friday, 24 May 2013 at 00:44:14 UTC, Andrei Alexandrescu wrote:

Custom allocators will probably be very useful, but if there's one thing
STL has taught me, it's hard to use them effectively, and in practise,
nobody ever uses them.


Agreed.



To benefit from a custom allocator, you need to be under a very specific
use case. Generic allocator are pretty good in most cases.



Most general allocators choke on multi-threaded code, so a large part of 
customizing allocations is to get rid of lock contention.


While STL containers can have basic allocator templates assigned to 
them, if you really need performance you typically need to control all 
the different kinds of allocations a container does.


For example, a std::unordered_set allocates a ton of linked list nodes to 
keep iterators stable across inserts and removes, but the actual data payload 
is another separate allocation, as is some kind of root data structure 
to hold the hash table.  In STL land this is all allocated through a 
single allocator object, making it very difficult (nearly impossible in 
a clean way) to allocate the payload data with some kind of fixed size 
block allocator and allocate the metadata and linked list nodes with a 
different allocator.  Some people would complain this exposes 
implementation details of a class, but the class is a template; it 
should be able to be configured to work the way you need it to.



class tHashMapNodeDefaultAllocator
{
public:
static void* allocateMemory(size_t size, size_t alignment)
{
return mAlloc(size, alignment);
}
static void freeMemory(void* pointer) NOEXCEPT
{
mFree(pointer);
}
};


template <typename DefaultKeyType, typename DefaultValueType>
class tHashMapConfiguration
{
public:
typedef tHashClass<DefaultKeyType> HashClass;
typedef tEqualsClass<DefaultKeyType> EqualClass;
typedef tHashMapNodeDefaultAllocator NodeAllocator;
typedef tDynamicArrayConfiguration<tHashMapNode<DefaultKeyType, DefaultValueType>> NodeArrayConfiguration;
};


template <typename KeyType, typename ValueType, typename HashMapConfiguration = tHashMapConfiguration<KeyType, ValueType>>
class tHashMap
{
};


// the tHashMap also has an array inside, so there is a way to configure 
that too:



class tDynamicArrayDefaultAllocator
{
public:
static void* allocateMemory(size_t size, size_t alignment)
{
return mAlloc(size, alignment);
}
static void freeMemory(void* pointer) NOEXCEPT
{
mFree(pointer);
}
};


class tDynamicArrayDefaultStrategy
{
public:
static size_t nextAllocationSize(size_t currentSize, size_t objectSize, size_t numNewItemsRequested)
{
// return some size to grow the array by when the capacity is reached
return currentSize + numNewItemsRequested * 2;
}
};


template <typename DefaultObjectType>
class tDynamicArrayConfiguration
{
public:
typedef tDynamicArrayDefaultStrategy DynamicArrayStrategy;
typedef tDynamicArrayDefaultAllocator DynamicArrayAllocator;
};







Re: WindowProc in a class - function and pointer problem

2013-05-23 Thread Sean Cavanaugh

On 5/22/2013 8:49 PM, evilrat wrote:

On Wednesday, 22 May 2013 at 21:42:32 UTC, D-sturbed wrote:


Yes I'm in the multiple Window case, every window is wraped in a
class and has its own message handler. I know that Win, in its
callback system, often lets you retrieve a pointer to something, and I
haven't get it was possible in this case...(which is you seem to
describe). I will try this tomorrow.


you can't really make it without static wndproc. if you don't know how
to do it just go google some wndproc in a class tutors for c++, i can
even recommend you one(i had used my own port of this in D before i
switched to crossplatform lib for my needs) -
http://blogs.msdn.com/b/oldnewthing/archive/2005/04/22/410773.aspx



I had a partial port of WTL over to D which worked well enough for what 
I needed; the core of the WndProc handling is down below.   Each HWND is 
owned by the thread it was created on, so assigning it into a D 
associative array (which is thread local) can do the trick.  It involved 
a hack before any call to the Win32 CreateWindow/CreateWindowEx APIs 
to capture the association with the new HWND, and removing the entries 
when the windows get cleaned up.  There is probably a better way, but 
this worked well enough to get started, and should work with multiple 
windows reasonably well.







CWindowInterface g_CreatingWindow;
CWindowInterface[HWND] g_HWNDtoObject;


extern (Windows)
int CWindowProc(HWND hWnd, UINT uMsg, WPARAM wParam, LPARAM lParam)
{
int WindowProcReturnValue = 0;

/*
auto EventMsg = appender!string();
formattedWrite(EventMsg, "EventMsg (hWnd=%1$s) (uMsg=%2$s) (wParam=%3$s) (lParam=%4$s)\n",
    cast(DWORD)hWnd, cast(DWORD)uMsg, cast(DWORD)wParam, cast(DWORD)lParam);

OutputDebugString(toUTFz!(const(wchar)*)(EventMsg.data));
*/

if (g_CreatingWindow)
{
g_HWNDtoObject[hWnd] = g_CreatingWindow;
g_CreatingWindow = null;
}
auto Found = hWnd in g_HWNDtoObject;
auto Window = Found ? *Found : null;   // avoid a range error for HWNDs we never registered
if (Window is null)
{
WindowProcReturnValue = DefWindowProc(hWnd, uMsg, wParam, lParam);
}
else
{
WindowProcReturnValue = Window.WindowProc(hWnd, uMsg, wParam, 
lParam);

}
if (uMsg == WM_NCDESTROY)
{
g_HWNDtoObject.remove(hWnd);
}
return WindowProcReturnValue;
}



Re: Interface vs pure abstract class - speed.

2013-05-13 Thread Sean Cavanaugh

On 5/12/2013 12:31 PM, SundayMorningRunner wrote:

Hello, let's say I have the choice between using an abstract class or an
interface to declare a plan, a template for the descendants.

 From the dmd compiler point of view, should I use the abstract class
version (so I guess that for each method call, there will be a few MOV,
in order to extract the relative address from the vmt before a CALL) or
the interface version (are the CALL directly performed in this case).
Are interface faster ? (to get the address used by the CALL).

Thx.


If you actually need speed you would need to use something known as the 
curiously recurring template pattern, and avoid using virtual (as much as 
possible) altogether.
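
A minimal sketch of the pattern in D (the class names here are invented): 
the base is parameterized on the derived type, so the call resolves at 
compile time with no virtual dispatch.

class WidgetBase(Derived)
{
    final void draw()
    {
        // forward to the derived type's implementation; this is a downcast,
        // and in hot code you could cast through void* to skip the runtime check
        (cast(Derived) this).drawImpl();
    }
}

class Button : WidgetBase!Button
{
    final void drawImpl()
    {
        // draw a button
    }
}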




Re: 1 matches bool, 2 matches long

2013-04-29 Thread Sean Cavanaugh

On 4/29/2013 7:30 AM, deadalnix wrote:


(that is: ifzero(), infnonzero(), whilezero(), whilenonzero()).






int x = 3;
if (!!x)
{
// do something
}


It's not official, but this already works in the C-like languages as a 
way to 'promote to bool'.





Re: Recipe and best practice for accessing COM

2012-09-09 Thread Sean Cavanaugh

On 9/9/2012 7:30 AM, newToCOM wrote:

Still struggling..

test.d:
---
( ... )  /* Other imports */
import win32.directx.d2d1;

alias win32.directx.d2d1.IID IID;
IID IID_ID2D1Factory = { 0x06152247, 0x6F50, 0x465A, [0x92, 0x45, 0x11,
0x8B, 0xFD, 0x3B, 0x60, 0x07] };

extern (Windows)
int WinMain( ... )
{
 ( ... ) /* Boilerplate code */
}

int myWinMain( ... )
{
 ( ... ) /* Register class, Create window, show window, message loop */
}

LRESULT WndProc( ... )
{
 switch (message)
 {
 ( ... ) /* Default and case WM_DESTROY */

 case WM_CREATE:

 // arg2
 REFIID pIID_ID2D1Factory;
 pIID_ID2D1Factory = cast(REFIID)IID_ID2D1Factory;

 // arg3
 D2D1_FACTORY_OPTIONS factoryOptions;
 factoryOptions.debugLevel = D2D1_DEBUG_LEVEL.INFORMATION;
 const D2D1_FACTORY_OPTIONS* pfactoryOptions = &factoryOptions;

 // arg4
 void* pID2D1Factory;


 HRESULT hr = D2D1CreateFactory(
 D2D1_FACTORY_TYPE.SINGLE_THREADED,
 pIID_ID2D1Factory,
 pfactoryOptions,
 pID2D1Factory
 );

 writeln(hr);   // Prints 0
 writeln(pID2D1Factory);// Prints a non-null pointer

 // Trying to use the interface
 float dpiX, dpiY;
 pID2D1Factory.GetDesktopDpi(dpiX, dpiY);
 // !!! Error: undefined identifier 'GetDesktopDpi'
 // I thought I had a pointer to the interface and could call
 // its methods...

 return 0;

--

* Coffimplib was used to convert the d2d1.lib from Microsoft DirectX SDK
(June 2010). The converted d2d1.dll was placed in C:\dcode\dlls
* With test.d in C:\dcode, the file was compiled and linked in C:\dcode
with:
dmd test.d %cd%\dlls\d2d1.lib -I%cd%\dlls\ -version=DXSDK_JUNE_2010


- Why is the method identifier undefined?
- Are there any code examples showing the right use of the bindings?


In this example pID2D1Factory is a void*, so it will need a cast to 
the proper type with cast(ID2D1Factory) sometime after the create call.


Since this particular API takes an out void* (because it is capable of 
creating multiple unrelated types), it would need to look something like 
this:


HRESULT hr = D2D1CreateFactory(
D2D1_FACTORY_TYPE.SINGLE_THREADED,
pIID_ID2D1Factory,
pfactoryOptions,
pID2D1Factory);

if (SUCCEEDED(hr))
{
assert(pID2D1Factory !is null);
ID2D1Factory factory = cast(ID2D1Factory)pID2D1Factory;
factory.GetDesktopDpi(dpiX, dpiY);
}


I've been super busy at work so haven't had much time to respond to this 
thread.



Technically the API's fourth argument could be rewritten to be an 'out 
IUnknown', but then you are stuck calling QueryInterface, and back to 
having the exact same problem with the QueryInterface method.





Re: Recipe and best practice for accessing COM

2012-09-09 Thread Sean Cavanaugh

On 9/9/2012 7:57 AM, Sean Cavanaugh wrote:

On 9/9/2012 7:30 AM, newToCOM wrote:

I've been super busy at work so haven't had much time to respond to this
thread.



I also have a D version of something resembling ATL's CComPtr which I am 
finally happy enough with to share; I could post it when I get home 
later tonight.


The class is a good argument for keeping the rather esoteric opDot 
operator, since alias this is extremely dangerous for smart pointer 
structs in D.




Re: GC vs. Manual Memory Management Real World Comparison

2012-09-06 Thread Sean Cavanaugh

On 9/6/2012 4:30 AM, Peter Alexander wrote:


In addition to Walter's response, it is very rare for advanced compiler
optimisations to make 2x difference on any non-trivial code. Not
impossible, but it's definitely suspicious.




I love trying to explain to people our debug builds are too slow because 
they have instrumented too much of the code, and haven't disabled any of 
it.  A lot of people are pushed into debugging release builds as a 
result, which is pretty silly.


Now there are some pathological cases:
  non-inlined constructors can sometimes kill you in 3d vector math 
type libraries
  128 bit SIMD intrinsics with Microsoft's compiler in debug builds 
make horrifically slow code: each operation has its results written to 
memory and then reloaded for the next 'instruction'.  I believe it's 
two orders of magnitude slower (the extra instructions, plus pegging the 
read and write ports of the CPU, hurt quite a lot too).  These tend to be 
small, tight functions, so they can be optimized selectively even in 
debug builds . . .


Re: More on vectorized comparisons

2012-08-23 Thread Sean Cavanaugh

On 8/22/2012 7:19 PM, bearophile wrote:

Some time ago I have suggested to add support to vector comparisons in
D, because this is sometimes useful and in the modern SIMD units there
is hardware support for such operations:


I think that code is semantically equivalent to:

void main() {
 double[] a = [1.0, 1.0, -1.0, 1.0, 0.0, -1.0];
 double[] b = [10,   20,   30,  40,  50,   60];
 double[] c = [1, 2,3,   4,   5,6];
 foreach (i; 0 .. a.length)
 if (a[i] > 0)
 b[i] += c[i];
}


After that code b is:
[11, 22, 30, 44, 50, 60]


This means the contents of the 'then' branch of the vectorized
comparison is done only on items of b and c where the comparison has
given true.

This looks useful. Is it possible to implement this in D, and do you
like it?


Well, right now the binary operators == != <= >= < and > are required to 
return bool instead of allowing a user defined type, which prevents a 
lot of the sugar you would want to make the code nice to write.  Without 
the sugar the code ends up like this:


foreach(i; 0 .. a.length)
{
float4 mask = greaterThan(a[i], float4(0,0,0,0));
b[i] = select(mask, b[i] + c[i], b[i]);
}

in GPU shader land this expression is at least simpler to write:

foreach(i; 0 .. a.length)
{
b[i] = (b[i] > 0) ? (b[i] + c[i]) : b[i];
}


All of these implementations are equivalent and remove the branch from 
the code flow, which is pretty nice for the CPU pipeline.   In SIMD the 
comparisons generate masks into a register which you can use immediately.  
On modern (SSE4) CPUs the select is a single instruction; on older 
ones it takes three: (mask & A) | (~mask & B), but it's all better than a 
real branch.


If you have a large amount of code needing a branch, you can take the 
mask generated by the compare, extract it into a CPU register, and 
compare it for 0, nonzero, specific, or any bits set.  A float4 
comparison ends up generating 4 bits, so the code with a real branch 
looks like:


if (any(a[i] > 0))
{
// do stuff if any of a[i] are greater than zero
}
if (all(a[i] > 0))
{
// do stuff if all of a[i] are greater than zero
}
if ((getMask(a[i] > 0) & 0x7) == 0x7)
{
// do stuff if the first three elements are greater than zero
}




Re: NaNs Just Don't Get No Respect

2012-08-20 Thread Sean Cavanaugh

On 8/20/2012 12:41 AM, Nick Sabalausky wrote:

On Sun, 19 Aug 2012 01:21:03 -0500
Sean Cavanaugh <worksonmymach...@gmail.com> wrote:


Nobody knows how floats work, without being locked in a closet for at
least a week and told to change their doubles back into floats and
fix their code, since that's where 99% of precision problems actually
come from.



Sorry, I don't understand, where are you saying 99% of precision
problems come from?



Chaining algorithms together without regard to how the floats are 
propagated through the system.  Also, accumulating lots of values with 
addition and subtraction can be problematic, as can subtracting two 
nearly equal values, which leaves you with mostly error.


There is a pretty classic example in 3d graphics: the transform from 
object space to world space to projection space.  Even if these matrices 
are combined, your geometry can be transformed a very large distance 
from the world origin and lose a lot of precision in the process.  The 
way to fight this is to use camera instead of world space as much as 
possible.  If you don't do this all the vertices on all the meshes in 
your world snap to different values at varying rates when they move or 
the camera moves, causing one of the main forms of z-fighting.   Plus 
the further you get from your world origin the worse things get, which 
makes building something on the scale of an open-world game rather 
difficult.
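
A quick D illustration of the loss described above, using a hypothetical 
offset of 0.01 (float has roughly 7 significant digits, so far from the 
origin small offsets simply vanish):

import std.stdio;

void main()
{
    float nearOrigin = 1.0f;
    float farAway    = 1_000_000.0f;

    // near the origin the small offset survives
    writeln((nearOrigin + 0.01f) - nearOrigin);   // ~0.01

    // a million units out, 0.01 is below the float's resolution and is lost
    writeln((farAway + 0.01f) - farAway);         // 0
}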




Re: NaNs Just Don't Get No Respect

2012-08-19 Thread Sean Cavanaugh

On 8/19/2012 12:12 AM, dennis luehring wrote:

Am 19.08.2012 06:12, schrieb Jonathan M Davis:

On Friday, August 17, 2012 17:03:13 Walter Bright wrote:

Our discussion on this in the last few days inspired me to write a
blog post
about it:

http://www.reddit.com/r/programming/comments/yehz4/nans_just_dont_get_no_res

pect/

http://www.drdobbs.com/cpp/nans-just-dont-get-no-respect/240005723


FWIW, I'm very surprised by how negatively many programmers seem to
react to
NaN. I think that how D handles it is great.


i think many of these programmers seems to think that D is introducing
this NaN-Type ...



Nobody knows how floats work, without being locked in a closet for at 
least a week and told to change their doubles back into floats and fix 
their code, since that's where 99% of precision problems actually come from.




Re: First working Win64 program!

2012-08-12 Thread Sean Cavanaugh

On 8/12/2012 4:12 PM, Walter Bright wrote:

On 8/12/2012 1:38 AM, Alex Rønne Petersen wrote:

One question: Will the 32-bit tool chain also be able to use the MSVC
runtime
and linker eventually?


It's not the current plan. Frankly, I think 32 bits is rapidly becoming
irrelevant on the desktop.


Post windows 8 launch we should start seeing mainstream games shipping 
32 and 64 bit binaries together in the same box.  We already have moved 
off of 32 bit in house for our editors and tools.  The biggest hangup is 
Microsoft keeps shipping 32 bit OSes, and we still have to support XP at 
least through the end of the year.  With a little luck Win8 will be the 
last 32 bit one.




Re: First working Win64 program!

2012-08-12 Thread Sean Cavanaugh

On 8/12/2012 6:43 PM, torhu wrote:

On 12.08.2012 23:21, Sean Cavanaugh wrote:


Post windows 8 launch we should start seeing mainstream games shipping
32 and 64 bit binaries together in the same box. We already have moved
off of 32 bit in house for our editors and tools. The biggest hangup is
Microsoft keeps shipping 32 bit OSes, and we still have to support XP at
least through the end of the year. With a little luck Win8 will be the
last 32 bit one.



Can I ask, what are the reasons you want to move to 64 bits on the
Windows platform? Is it higher memory requirements or something else?
The game with the highest memory use I've got installed is AFAIK
Starcraft II, still at only about one GB. And as you know, 64 bit apps
can have lower performance than 32 bits in some cases. So I'm curious to
know what the reasons are in your case.


  32 bit Windows games are capped at around 1.3 GB due to WinXP 
support.  You can get closer to 1.7 GB of address space out of your 32 
bit apps when run under 64 bit Windows, but that's about it without 
playing with /3GB and LARGEADDRESSAWARE flags etc.  Games that push 1.3 GB 
or more run the risk of crashing due to both address space fragmentation 
and running out of memory from the heap.


  In XP, you also run the risk of crashing when alt-tabbing out of the 
game and back.  The video card's address space gets unmapped while you 
are away, and the app might have fragmented your nice 512 MB of 
contiguous address space while processing in the background, which causes 
the driver to fail to map the device back into your address space when 
you alt-tab back, crashing with a pretty much unrecoverable error.  Vista 
fixed this by mapping the resources into the address space when you lock 
them, instead of mapping a huge chunk of the video card whenever the app 
had an open and valid (not-lost) device context.


  Also, having the full address space opens up the ability to store 
game data in memory mapped files, which can greatly simplify loading 
data.  Halo was designed this way for the Xbox, though for the PC 
version we had to modify the code to handle loading to alternate 
addresses with patched fixups if some random dll had taken the preferred 
load address at startup.  Since it was done in 32 bits each level had 
the same load address, but in 64 bits you could give each environment 
its own address range, which makes it very nice for getting a new area 
loaded while you are playing in the previous one (run a separate thread 
to touch all the new pages etc).






Re: First working Win64 program!

2012-08-12 Thread Sean Cavanaugh

On 8/12/2012 8:15 PM, Andrej Mitrovic wrote:

On 8/13/12, Sean Cavanaugh <worksonmymach...@gmail.com> wrote:

we had to modify the code


Sure enough I've found your name:
http://www.microsoft.com/games/mgsgamecatalog/halopccredits.aspx

I noticed you before here but never realized you worked on Halo. It's
cool to see people of your caliber having interest in D! :)


I have a theory that game development accelerates the rate at which you 
learn to hate C++


Re: First working Win64 program!

2012-08-12 Thread Sean Cavanaugh

On 8/12/2012 8:22 PM, torhu wrote:


Ok, so using LARGEADDRESSAWARE doesn't improve the situation on XP 64?
What about on Vista 64?


On XP64 it would help some, but the video adapter is still mapped to a 
huge contiguous range due to the XP driver model.  Basically you get 1 
extra GB (2.3GB effective usable instead of 1.3).


Under 64 bit Vista/7 32 bit LAA apps get almost a full 4 GB to play 
with, and if they change their textures to default pool or use D3D10 or 
newer can get their texture data out of the app's address space as well, 
which is a huge percentage of a game's memory usage.


Strange fallout changing modules

2012-08-11 Thread Sean Cavanaugh
	While working on a project using COM I noticed that the win32 bindings 
project declares its own version of IUnknown and all the classes in the 
project derive from it.


	So I set out to see the feasibility of modifying the win32 module to 
use std.c.windows.com.IUnknown.  This was a bit of work, as there are 
duplicate definitions in the project for functions like CoInitialize, 
structs like GUID, aliases like IID, etc., but fairly simple.


	In the end I had a handful (approximately 10) files that needed fixing. 
The fix was simple: add 'private import std.c.windows.com' to the top of 
each file generating symbol collisions, and replace the duplicate 
symbols the compiler was complaining about with an alias to the 
equivalent object in std.c.windows.com.


	The code compiles great, but now the weird thing:  Before this change I 
had to compile 13 .d files out of the win32 module.  After the change I 
had to add 37 others, and also suffer numerous 'import lib not found' 
warnings because I am effectively compiling dead code that pragma-includes 
libs I don't need.


Can anyone explain this behavioral change?


	Also, the more I work on this the more I wonder whether D should 
disallow declaring any other IUnknown interface that is not an alias of 
std.c.windows.com.IUnknown (or wherever each platform's canonical 
version lives).  As it is, it's impossible to write a simple static assert 
in interface containers that emulate ATL's CComPtr; i.e. I have to write


static assert(is(T : std.c.windows.com.IUnknown) || is(T : win32.unknwn.IUnknown));


instead of:

static assert(is(T : std.c.windows.com.IUnknown));

and also hope nobody adds another one somewhere else.


Re: Strange fallout changing modules

2012-08-11 Thread Sean Cavanaugh

On 8/11/2012 1:50 AM, Sean Cavanaugh wrote:

While working on a project using COM I noticed that the win32 bindings
project declares its own version of IUnknown and all the classes in the
project derive from it.

So I set out to see the feasibility of modifying the win32 module to use
std.c.windows.com.IUnknown. This was a bit of work, as there are
duplicate definitions in the project for functions like CoInitialize,
structs like GUID, aliases like IID, etc., but fairly simple.

In the end I had a handful (approximately 10) files that needed fixing.
The fix was simple: add 'private import std.c.windows.com' to the top of
each file generating symbol collisions, and replace the duplicate
symbols the compiler was complaining about with an alias to the
equivalent object in std.c.windows.com.

The code compiles great, but now the weird thing: Before this change I
had to compile 13 .d files out of the win32 module. After the change I
had to add 37 others, and also suffer numerous 'import lib not found'
warnings because I am effectively compiling dead code that pragma-includes
libs I don't need.

Can anyone explain this behavioral change?


Also, the more I work on this the more I wonder whether D should
disallow declaring any other IUnknown interface that is not an alias of
std.c.windows.com.IUnknown (or wherever each platform's canonical
version lives). As it is, it's impossible to write a simple static assert
in interface containers that emulate ATL's CComPtr; i.e. I have to write

static assert(is(T : std.c.windows.com.IUnknown) || is(T :
win32.unknwn.IUnknown));

instead of:

static assert(is(T : std.c.windows.com.IUnknown));

and also hope nobody adds another one somewhere else.


I always leave out something important before hitting send:

The reason I had to add the additional modules was that the linker was 
complaining about missing ModuleInfoZ symbols for 5-10 modules (adding 
one sometimes added several new link errors).  I ended up having to add 
37 modules to get the project to link.


Re: vector Cross/Dot using core.simd?

2012-08-11 Thread Sean Cavanaugh

On 8/11/2012 8:23 PM, F i L wrote:

I'm trying to write a Cross and Dot function using core.simd.float4 and DMD

Does anyone know anything about SIMD operations that may be able to help
me translate these functions into a D equivalent? I would very much
appreciate your help.



Some reference:

C++ simd intrinsic for dot product (requires SSE 4.1, very modern)
_mm_dp_ps

C++ simd intrinsic for horizontal add (requires SSE3, also reasonably 
modern)

_mm_hadd_ps

If you are on SSE2 (which is the base spec for x64, and also the minimum 
CPU target we use at work for commercial game development), you are stuck 
doing shuffles and adds for the dot product, which effectively processes 
these operations as scalars.
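
For illustration, a minimal core.simd sketch of that SSE2-level situation 
(the function name is invented): the packed multiply is one vector 
operation, but without a horizontal add the reduction falls back to 
per-lane work; hand-written shuffles are the usual alternative.

import core.simd;

float dot(float4 a, float4 b)
{
    float4 m = a * b;                  // one packed multiply
    // no horizontal add below SSE3, so finish the reduction lane by lane
    return m.array[0] + m.array[1] + m.array[2] + m.array[3];
}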


Ideally one of the sides of the dot product is an array and you can 
vectorize the dot product itself (1 vector vs 4 others, or 4 vs 4).  This 
is common when setting up things like view frustum culling (a point tested 
against 6-8 extruded planes in an array).





Re: One against binary searches

2012-07-31 Thread Sean Cavanaugh

On 7/30/2012 12:28 PM, Don Clugston wrote:

On 30/07/12 17:40, bearophile wrote:

This author writes very detailed analyses of low-level computational
matters, that appear on Reddit. This blog post he suggests to introduce
offseted binary or quaternary search instead of binary search in
Phobos:

http://www.pvk.ca/Blog/2012/07/30/binary-search-is-a-pathological-case-for-caches/



Bye,
bearophile


Fantastic article, thanks!
The fact that physical addressing can influence L2 cache misses was
completely new to me.


Binary searches also confuse hardware branch prediction, because the branch 
goes the wrong way about 50% of the time.   Small linear lists are actually 
faster to search because of this (with the added bonus of cache coherency).


Re: Recipe and best practice for accessing COM

2012-07-25 Thread Sean Cavanaugh

On 7/24/2012 2:01 PM, newToCOM wrote:

I am trying to use COM to access Windows functionality and APIs.
I have read the interface documentation and some documentation at
MSDN. I have seen the included sample snippet for IHello and the
slides Modern COM programming in D, but it is still not clear
exactly what to do and what the best practice is.

So, my question is:
I have downloaded Windows SDK and coffimplib. How do I proceed to
access d2d1 and draw a rectangle?

A stepwise description of the recommended procedure is highly
appreciated!



I made D bindings for d3d11, directwrite, d2d1, and a few other DirectX 
APIs.


They are located at:

http://www.dsource.org/projects/bindings/wiki/DirectX


The bindings haven't had extensive testing, but the bits I've used seem 
to work correctly (mostly in d3d11).



The WindowsApi project is also necessary:

http://www.dsource.org/projects/bindings/wiki/WindowsApi



Re: Is this actually supposed to be legal?

2012-07-17 Thread Sean Cavanaugh

On 7/17/2012 12:23 PM, Jonathan M Davis wrote:

On Tuesday, July 17, 2012 14:48:32 David Nadlinger wrote:

On Tuesday, 17 July 2012 at 05:24:26 UTC, Jonathan M Davis wrote:

This code strikes me as being a bug:


class MyBase(T)
{}

class MySubA : MyBase!MySubA
{}

class MySubB : MyBase!MySubB
{}

void main()
{}



This pattern is actually quite common in C++ code, and referred
to as CRTP (curiously recurring template pattern). If you propose
to kill it, Andrei is going to get mad at you. ;)


Well, it certainly seems insane to me at first glance - particularly when you
take compile time reflection into account, since the derived classes'
definitions are now effectively recursive (so I suspect that the situation is
worse in D, since C++ doesn't have conditional compliation like D does). But
if it's supposed to be legal, I guess that it's suppose to be legal. I'd never
seen the idiom before, and it seemed _really_ off to me, which is why I brought
it up. But I'd have to study it in order to give an informed opinion on it.

- Jonathan M Davis



A 'proper' D port of this kind of design would be to use mixins instead of 
the template.  They both accomplish the same thing:


The template (or mixin) is written to call functions in the user 
defined type.  A simple example would be the C++ WTL library: a user 
defined control defines its own window style, but the template code is 
responsible for creating the window, and it accesses the style and class 
flags from the user defined type.


The advantage is the same in both: you avoid making the interface 
virtual, and you still get to use some generic code.



Re: LLVM IR influence on compiler debugging

2012-07-08 Thread Sean Cavanaugh

On 7/7/2012 11:05 PM, Andrei Alexandrescu wrote:


Compilation is a huge bottleneck for any major C++ code base, and adding
hardware (distributing compilation etc) is survival, but definitely
doesn't scale to make the problem negligible.

In contrast, programmers have considerable control about generating fast
code.



Our bottleneck with a large C++ codebase (an Unreal Engine based game) is 
linking.  Granted we have beefy workstations (HP Z800s with dual quad or 
hex core Xeons and hyperthreading), but a full build+link is 4-5 minutes, 
and a single change+link is over 2 minutes.


You can also speed up C++ compiling by merging a bunch of the .cpp files 
together (google 'unity C++ build'), though if you go too crazy you will 
learn that compilers eventually do explode when fed 5-10 megs of source 
code per translation unit, heh.


Re: align(16) struct member throws an exception with movdqa

2012-06-13 Thread Sean Cavanaugh

On 6/11/2012 7:15 AM, Trass3r wrote:

I think it has been fixed for the next version of DMD already. Any
idea why align isn't letting me use movdqa?


Cause align doesn't work the way you think it does.
In fact I still don't understand how it works at all.


The language align keyword can only reduce the alignment from the 
platform default (typically 8).  A serious flaw if you ask me . . . .
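
A small sketch of the part that does work, reducing alignment for packing 
(the struct here is invented):

struct Packet
{
align(1):           // pack the fields tightly
    ubyte tag;
    uint  value;    // no longer padded out to a 4-byte boundary
}

static assert(Packet.sizeof == 5);

// what you can't get this way is a guarantee that a Packet instance
// lands on, say, a 16-byte boundary, hence the complaint above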


Re: Memory reordering explained by example

2012-05-18 Thread Sean Cavanaugh

On 5/16/2012 5:59 AM, Gor Gyolchanyan wrote:

The problem is, that ancient processor architectures are used for modern
processors and software.
The correct solution to the concurrency problems would be a new
architecture, designed to naturally deal with concurrency.




We have them, they are called GPUs




Re: What library functionality would you most like to see in D?

2012-05-11 Thread Sean Cavanaugh

On 5/10/2012 12:49 PM, Sean Kelly wrote:

On Jul 30, 2011, at 10:27 PM, Jonathan M Davis wrote:


I think that it would be useful to query the community for what piece of
library functionality they don't currently have in D and would most like to
see. For instance, there is no official logging framework in D or any 3rd party
libraries which do it AFAIK. So, that could be one type of functionality that
you may like to see. Now, there is a prospective implementation for std.log
which shouldn't be all that far away from being reviewed, so listing that here
wouldn't be all that useful, since it's on its way. But what other major
functionality do you miss in D that other languages' that you use have
available in their libraries?



I can't say that these exist in other standard libraries either, but I want:

1. A high-performance sockets API.
2. A robust logging tool (ie. Boost.Log).


The whole Windows SDK (all the C functions, all the COM stuff).   The 
defaults in dmd are rather limited, and even the win32 bindings project 
on dsource.org has a lot of holes, being a port from MinGW that is only 
up to WinXP level for the most part.


I would imagine parsing the SDK headers and turning them into a language 
neutral database that could be annotated with language specific metadata 
would be rather useful, especially as other languages and toolchains 
could all benefit from being able to make their own bindings from it.


Re: Does D have too many features?

2012-05-08 Thread Sean Cavanaugh

On 5/8/2012 3:36 PM, foobar wrote:

On Tuesday, 8 May 2012 at 19:00:01 UTC, deadalnix wrote:


I think that goal is misunderstood. It is aimed at human being, not
compiler.

If one read D code that look like C, it should be able to understand
it easily. I is not supped to compile with 100% exact semantic.


Unfortunately that is not the case.
The stated argument is that compiling C code with a D compiler should
either compile exactly the same or produce a compilation error.


Thousands of my C/C++ floating point constants are broken by the CTFE 
change, since 'integer'-like float constants such as 31.f won't compile 
anymore; it's trying to do f(31) to them for me now . . .
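
A small sketch of what that looks like from the compiler's side (f here 
is just an illustrative function; without one in scope, 31.f is simply 
an error):

float f(int x) { return x * 2.0f; }

void main()
{
    auto a = 31.f;     // parsed as '31 . f', i.e. a UFCS call f(31)
    auto b = 31.0f;    // the spelling D actually wants for the float literal
}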






Re: Does D have too many features?

2012-05-08 Thread Sean Cavanaugh

On 5/8/2012 7:56 PM, Sean Cavanaugh wrote:

On 5/8/2012 3:36 PM, foobar wrote:

On Tuesday, 8 May 2012 at 19:00:01 UTC, deadalnix wrote:


I think that goal is misunderstood. It is aimed at human being, not
compiler.

If one read D code that look like C, it should be able to understand
it easily. I is not supped to compile with 100% exact semantic.


Unfortunately that is not the case.
The stated argument is that compiling C code with a D compiler should
either compile exactly the same or produce a compilation error.


Thousands of my C/C++ floating point constants are broken by the CTFE
change, since 'integer'-like float constants such as 31.f won't compile
anymore; it's trying to do f(31) to them for me now . . .





s/CTFE/UFCS


Re: Does D have too many features?

2012-05-08 Thread Sean Cavanaugh

On 5/8/2012 7:56 PM, Sean Cavanaugh wrote:

On 5/8/2012 3:36 PM, foobar wrote:

On Tuesday, 8 May 2012 at 19:00:01 UTC, deadalnix wrote:


I think that goal is misunderstood. It is aimed at human being, not
compiler.

If one read D code that look like C, it should be able to understand
it easily. I is not supped to compile with 100% exact semantic.


Unfortunately that is not the case.
The stated argument is that compiling C code with a D compiler should
either compile exactly the same or produce a compilation error.


Thousands of my C/C++ floating point constants are broken by the CTFE
change, since 'integer'-like float constants such as 31.f won't compile
anymore; it's trying to do f(31) to them for me now . . .





s/CTFE/UFCS


Re: Class methods in D?

2012-05-04 Thread Sean Cavanaugh

On 5/3/2012 1:32 PM, Simon wrote:

On 03/05/2012 18:21, Mehrdad wrote:

In Windows, you need to register a window class before you can
actually create an instance of it.


If you are mucking about on 'doze you might find my dubious port of the
ATL window classes relevant:

http://www.sstk.co.uk/atlWinD.php

That does all that tedious registering of windows classes etc.
I used a static class member IIRC.

I've ripped this off of MS so use at your own risk. ;)



Heh, I've got a miniature (probably 20-30% complete) version of the WTL 
ported to D here, but without any ATL aside from the parts of CWindow.


The WTL uses the curiously recurring template design which also works in 
D, so a window class is something like this in D:



class GameWindow : CWindowUserBase!(CWindow, GameWindow)
{
bool isFullscreen;
bool isResizing;
bool suppressRendering;
bool allowCapture;
wGameMessageLoop messageLoop;
GameScene gameScene;
RenderDevice renderDevice;
DeviceContext immediateContext;
SwapChain swapChain;
Tid renderingThread;

mixin DECLARE_WND_CLASS!(wWindowClass, CS_HREDRAW | CS_VREDRAW | 
CS_DBLCLKS, COLOR_WINDOWFRAME);


static DWORD GetWndStyle(DWORD InStyle)
{
return InStyle | WS_OVERLAPPEDWINDOW | WS_CLIPCHILDREN | 
WS_CLIPSIBLINGS;

}
static DWORD GetWndExStyle(DWORD InStyleEx)
{
return InStyleEx | WS_EX_APPWINDOW | WS_EX_WINDOWEDGE;
}
static string GetWndCaption()
{
return ;
}

/// lots of code deleted

mixin(HOOK_MSG_WM_DESTROY!(OnDestroy));
mixin(HOOK_MSG_WM_MOVE!(OnMove));
mixin(HOOK_MSG_WM_SIZE!(OnSize));


mixin REGISTER_MESSAGE_MAP!(
BIND_MSG_WM_DESTROY!(OnDestroy),
BIND_MSG_WM_MOVE!(OnMove),
BIND_MSG_WM_SIZE!(OnSize));

mixin MESSAGE_HANDLER!();
}




So the answer to the OP's question is, make the class stuff static and 
use mixins for the functions so the scope works out.


Re: Class methods in D?

2012-05-04 Thread Sean Cavanaugh

On 5/3/2012 1:41 PM, Mehrdad wrote:

On Thursday, 3 May 2012 at 18:32:18 UTC, Simon wrote:

On 03/05/2012 18:21, Mehrdad wrote:

In Windows, you need to register a window class before you can
actually create an instance of it.


If you are mucking about on 'doze you might find my dubious port of
the ATL window classes relevant:

http://www.sstk.co.uk/atlWinD.php

That does all that tedious registering of windows classes etc.
I used a static class member IIRC.

I've ripped this off of MS so use at your own risk. ;)


lol. thanks.




I could at least give out the incomplete WTL port to D I had been working on 
off and on over the last year, as it has an open license to start with 
(but as WTL includes ATL, some parts had to be built from scratch, like 
CWindow, which is a PITA).   Even without all the rest of the library the 
message cracker module is very useful on Win32 systems.




Re: pure functions/methods

2012-04-20 Thread Sean Cavanaugh

On 4/20/2012 3:06 AM, Namespace wrote:

The sense of pure functions isn't clear to me.
What is the advantage of pure functions / methods?
I inform the compiler with const that this method does not change the
current object, and therefore he can optimize (at least in C++) this
method. How and what optimized the compiler if i have pure or const
pure functions / methods?


Simplest explanation I can think of is:

a const function of a class can't modify its own class's data
a pure function can't modify any data, or call other functions that are 
not also pure (though there are exceptions)
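
A minimal sketch of the difference (the names are hypothetical):

class Counter
{
    int value;

    int peek() const        // const: may not modify this object's data
    {
        return value;
    }
}

pure int twice(int x)       // pure: no touching mutable global state,
{                           // and only other pure functions may be called
    return x * 2;
}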




Re: Windows: Throwing Exceptions from Fibers in D2.059: Access Violation

2012-04-19 Thread Sean Cavanaugh

On 4/19/2012 10:00 PM, Jameson Ernst wrote:

On Thursday, 19 April 2012 at 00:07:45 UTC, Sean Kelly wrote:

On Apr 18, 2012, at 4:06 PM, Andrew Lauritzen wrote:


I'm still interested in if anyone has any suggested workarounds or
experience using Win32 fibers in D2 as well.


The x32 Windows code should be pretty well tested. If this is using
the x64 code though, that's all quite new. I'll give this a try when I
find some time, but can't suggest a workaround offhand. It almost
sounds alignment-related, which could be tricky.


Been following D for a while now, and fibers right in the std lib are a
huge draw for me. I'm not an expert on them, but on the topic of x64
fibers, I have some exposure to them trying to contribute x64 windows
support to bsnes, which uses its own home-grown fiber/coroutine system.

Out of curiosity I took a look at the D fiber context code, and noticed
that the x64 windows version doesn't seem to save the XMM6-15 registers
(unless I missed it), which is something I forgot to do also. MSDN
indicates that they are nonvolatile, which could potentially cause
problems for FP heavy code on x64 windows.

Not sure if I should file a bug for this, as I haven't tried an x64
windows fiber in D yet to make sure it's actually a problem first.


Fibers seem like a last resort to me.  They are fairly difficult to make 
bulletproof due to the thread local storage issues and a few other 
problems with context switches.  Win7 scheduling and management of real 
threads is a lot better than in previous versions, and in x64 mode there 
is also the user mode scheduling (UMS) system and the library built on top 
of it (ConcRT), which gets you almost all of the benefits of fibers but 
you get to use 'real' threads, plus the added bonus of being able to 
switch to another thread when you stall on a page fault or block on a 
kernel object (something fibers can't do).


Re: The Downfall of Imperative Programming

2012-04-09 Thread Sean Cavanaugh

On 4/9/2012 3:28 PM, Mirko Pilger wrote:

i guess this might be of interest to some.

http://fpcomplete.com/the-downfall-of-imperative-programming/

http://www.reddit.com/r/programming/comments/s112h/the_downfall_of_imperative_programming_functional/




I would counter that a flow-based programming approach solves a lot of the 
same concurrency problems and doesn't tie you to a programming style for 
the actual code (functional vs declarative), as each module can be made 
to do whatever it wants or needs to do.


Re: unzip parallel, 3x faster than 7zip

2012-04-06 Thread Sean Cavanaugh

On 4/5/2012 6:53 PM, Jay Norwood wrote:






I'm curious why win7 is such a dog when removing directories. I see a
lot of disk read activity going on which seems to dominate the delete
time. This doesn't make any sense to me unless there is some file
caching being triggered on files being deleted. I don't see any virus
checker app being triggered ... it all seems to be system read activity.
Maybe I'll try non cached flags, write truncate to 0 length before
deleting and see if that results in faster execution when the files are
deleted...





If you delete a directory containing several hundred thousand 
directories (each with 4-5 files inside, don't ask), you can see Windows 
freeze for long periods (10+ seconds) until it is finished, which 
affects everything up to and including the audio mixing (it starts 
looping, etc.).




Re: Confused about github rebasing

2012-03-15 Thread Sean Cavanaugh

On 3/15/2012 3:56 PM, Alex Rønne Petersen wrote:

On 15-03-2012 21:53, Gour wrote:

On Thu, 15 Mar 2012 13:49:14 -0700
H. S. Teohhst...@quickfur.ath.cx wrote:


Another question. How to I repair my current history, which is all
messed up now?


By not using DVCS which allows you to rewrite history (hint: check
Fossil). ;)


It's perfectly useful in DVCS. Without it, you'd have a mess of a
history when you send your changes upstream. That's not really acceptable.



Why would you delete history?  Preserving it is pretty much the primary 
purpose of source control.




Re: How about colors and terminal graphics in std.format?

2012-03-12 Thread Sean Cavanaugh

On 3/12/2012 10:58 PM, Chad J wrote:

On 03/12/2012 10:37 PM, James Miller wrote:

I do want to be able to format things besides color with the color
formatting function. Maybe I can pick out the color format specifiers
first and then pass the rest to format. It'd be a shame to reimplement
format.



There are well over a hundred thousand Unicode code points designated for 
private (user-defined) use.  I had hooked some range of these into RGBA color 
codes for easy text rendering for D3D, as the function needs to parse the 
string to generate texture UVs for the glyphs, and might as well be 
setting some vertex attributes for color along the way, etc.




Re: Multiple return values...

2012-03-10 Thread Sean Cavanaugh

On 3/10/2012 4:37 AM, Manu wrote:


If I pass a structure TO a function by value, I know what happens, a
copy is written to the stack which the function expects to find there.


This is only true if the compiler is forced to use the ABI, i.e. when 
inlining is impossible or the type being passed is too complex.  For 
structs of PODs, most compilers do magical things, provided you don't 
actively work against the code gen (virtual methods, dllexports, too many 
separate .obj units in C++, etc.).




Re: Breaking backwards compatiblity

2012-03-10 Thread Sean Cavanaugh

On 3/10/2012 3:49 PM, bearophile wrote:

Walter:


I'm talking about the name change. It's far and away the most common thing I
have to edit when moving code from D1 => D2.


We need good/better ways to manage Change and make it faster and less painful, 
instead of refusing almost all change right now.
Things like more fine-graded deprecation abilities, smarter error messages in libraries 
that suggest how to fix the code, tools that update the code (py2to3 or the Go language 
tool to update the programs), things like the strange future built-in Python 
package, and so on.

Bye,
bearophile



I would think that if a language designed in a migration path that worked, it 
would allow things to be more fluid (on both the library and language side). 
 This is one thing I believe has really hurt C++: since the 'good' 
words like null couldn't be used, they had to settle for dumber names 
like nullptr_t.  From what I gather, D's way 'out' is abuse of @.


I would much rather the language be able to expand its turf at the 
expense of the existing codebases, as long as there was a way to 
_migrate the code cleanly_.


I envision something like this working:

In addition to the 'module mypackage.mymodule' statement at the top of 
each file, there should be a version marker of some sort recording which 
revision of D the code was last built against, i.e. a very high-level 
language revision like D1 or D2.


Newer compilers would maintain the previous front end's ability to parse 
these older files, purely for the sake of outputting TODO-like messages 
describing how to upgrade the codebase to the newer version.


A simple example right now: the way floating point literals are parsed is 
no longer compatible between D1 and D2.  The UFCS changes could silently 
break existing code in theory, and probably should be pointed out in some 
way before upgrading code from D1 to D2.


Re: Breaking backwards compatiblity

2012-03-10 Thread Sean Cavanaugh

On 3/10/2012 4:22 PM, Nick Sabalausky wrote:

H. S. Teohhst...@quickfur.ath.cx  wrote in message
news:mailman.437.1331414346.4860.digitalmar...@puremagic.com...


True. But I found Linux far more superior in terms of being usable on
very old hardware.


There have been exceptions to that: About 10-12 years ago, GNOME (or at
least Nautlus) and KDE were *insanely* bloated to the pount of making
Win2k/XP seem ultra-lean.




Both the KDE and Gnome UIs, and apps using those UIs, still feel very sluggish 
to me on a modern machine.  I don't get this feeling at all on Win7, 
even on a much slower machine (my laptop, for instance).




Re: Multiple return values...

2012-03-10 Thread Sean Cavanaugh

On 3/10/2012 8:08 PM, Mantis wrote:

Tuple!(float, float) callee() {
do something to achieve result in st0,st1
fst st0, st1 into stack
load stack values into EAX, EDX
ret
}

void caller() {
call callee()
push EAX, EDX into a stack
fld stack values into st0, st1
do something with st0, st1
}

As opposed to:

Tuple!(float, float) callee() {
do something to achieve result in st0,st1
ret
}

void caller() {
call callee()
do something with st0, st1
}

Is there something I miss here?


Yes, the fact the FPU stack is deprecated :)


Re: Arbitrary abbreviations in phobos considered ridiculous

2012-03-07 Thread Sean Cavanaugh

On 3/7/2012 8:20 PM, Kapps wrote:

On Wednesday, 7 March 2012 at 19:12:25 UTC, H. S. Teoh wrote:

Supporting stuff like 5.hours will introduce additional complications to
D's lexical structure, though. The lexer will have to understand it as
(int:5)(.)(ident:hours) rather than (float:5.)(ident:hours). And then if
you actually *wanted* a float, you'd have another ambiguity: 5..hours
could mean (float:5.)(.)(ident:hours) or (int:5)(..)(hours). And
5.0.hours just looks... weird.


T


Actually, Kenji's pull request for UFCS already takes care of this.
Things like 5. aren't allowed, nor is 5.f; a number has to follow the
decimal. So 5.f would invoke UFCS function f with the integer 5. I
believe this change is already merged, though the rest of the UFCS pull
request isn't unfortunately.


I can definitely confirm that porting C/C++ floating point literals to D 
is a huge pain in the ass because of this change, so it's definitely in, 
to the degree that it breaks my math and physics libs heavily :)
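
For anyone hitting the same thing, the breakage looks roughly like this 
(following the rule quoted above that a digit must follow the decimal point):

// float x = 1.f;   // valid C/C++; in D this parses as the UFCS call 1 .f
// double y = 5.;   // valid C/C++; rejected by D's lexer
float  x = 1.0f;    // D spelling
double y = 5.0;     // D spelling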


Re: 0 negative loop condition bug or misunderstanding on my part

2012-03-07 Thread Sean Cavanaugh

On 3/7/2012 12:57 PM, Jonathan M Davis wrote:

On Wednesday, March 07, 2012 11:01:05 Timon Gehr wrote:

On 03/07/2012 07:05 AM, ixid wrote:

Ah, thank you, so it's wrapping. That seems like a bad idea, what is the

I suspect that the reality of the matter is that if we disallowed implicit
conversions between signed and unsigned, a number of bugs would completely go
away, but others would creep in as a result, and the overal situation wouldn't
necessarily be any better, but I don't know. My initial reaction would be to
agree with you, but there are definitely cases where such an approach would get
annoying and bug-prone (due to the casting involved). But regardless, I really
don't think that you're going to convince Walter on this one, given what he's
said in the past.

- Jonathan M Davis


After writing enough container libraries and whatnot in C++, I always 
end up going to bed thinking life would be so much easier if implicit 
signed/unsigned conversions were not allowed.


Then I go to sleep, wake up, and realize how much more horrific the code 
would be in other places if that were actually true.


The best compromise would probably have been to make size_t signed when 
migrating to 64 bit memory addressing, and leave the signed/unsigned 
problems specifically associated with size_t values back in the past of 
32 bit and older systems.



On a related note, I would love to see something in std.somewhere (conv?) 
that provides a complete set of narrowing and signed/unsigned conversion 
functions.  The matrix is basically the following methods (a rough sketch 
follows the list):

1)  Reduce size (64-32-16-8 bits) but maintain signedness

2)  Flip the signedness of the value (but maintain bit width)

multiplied against three types of operations:

a)  truncating (i.e. keep only the low bits)
b)  saturating (values outside of range are clamped to the min or max of the 
narrower range)
c)  throwing (values outside of the narrow range throw a range exception)
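
Purely illustrative sketch of the "reduce size but keep signedness" row (the 
names are made up; std.conv.to already provides the throwing flavour by 
raising ConvOverflowException):

import std.traits : isIntegral, isSigned;

To truncateTo(To, From)(From v)
    if (isIntegral!To && isIntegral!From &&
        isSigned!To == isSigned!From && To.sizeof < From.sizeof)
{
    return cast(To) v;                   // keep only the low bits
}

To saturateTo(To, From)(From v)
    if (isIntegral!To && isIntegral!From &&
        isSigned!To == isSigned!From && To.sizeof < From.sizeof)
{
    if (v < To.min) return To.min;       // clamp to the narrow range
    if (v > To.max) return To.max;
    return cast(To) v;
}

unittest
{
    assert(truncateTo!ubyte(0x1FFu) == 0xFF);
    assert(saturateTo!byte(300) == byte.max);
    assert(saturateTo!byte(-300) == byte.min);
    // throwing flavour: std.conv.to!ubyte(300) throws ConvOverflowException
}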



Re: Extend vector ops to boolean operators?

2012-03-06 Thread Sean Cavanaugh

On 3/6/2012 2:30 PM, H. S. Teoh wrote:

It'd be really cool if I could do this:

void func(int[] vector, int[] bounds) {
assert(vector[] >= 0 && vector[] < bounds[]);
...
}

Is there any reason why we shouldn't implement this?


T



This same problem exists for making proper syntactical sugar for simd 
comparison functions.


!= ==          (opEquals is required to return a bool)

and

< <= > >=      (opCmp is required to return an int)


Granted, it's possible to live without the sugar, but the code looks more 
like asm, and reading the code takes longer without the operators in it.




Re: John Carmack applauds D's pure attribute

2012-02-25 Thread Sean Cavanaugh

On 2/25/2012 4:08 PM, Paulo Pinto wrote:

Am 25.02.2012 21:26, schrieb Peter Alexander:

On Saturday, 25 February 2012 at 20:13:42 UTC, so wrote:

On Saturday, 25 February 2012 at 18:47:12 UTC, Nick Sabalausky wrote:


Interesting. I wish he'd elaborate on why it's not an option for his
daily
work.


Not the design but the implementation, memory management would be the
first.


Memory management is not a problem. You can manage memory just as easily
in D as you can in C or C++. Just don't use global new, which they'll
already be doing.


I couldn't agree more.

The GC issue comes around often, but I personally think that the main
issue is that the GC needs to be optimized, not that manual memory
management is required.

Most standard compiler malloc()/free() implementations are actually
slower than most advanced GC algorithms.



Games do basically everything in 33.3 or 16.6 ms intervals (30 or 60 fps 
respectively).  20fps and lower is doable but the input gets extra-laggy 
very easily, and it is visually choppy.


Ideally GC needs to run in a real-time manner, say periodically every 10 
or 20 seconds and taking at most 10ms.  Continuously would be better, 
something like 1-2ms of overhead spread out over 16 or 32 ms.  Also, a 
periodic GC that freezes everything needs to run at a 
predictable/controllable time, so you can do things like skip AI updates 
for that frame and keep the frame from being 48ms or worse.


These time constraints are going to limit the heap size of a GC heap to 
the slower of speed of memory/computation, until the GC can be made into 
some variety of a real-time collector.  This is less of a problem for 
games, because you can always allocate non-gc memory with malloc/free or 
store your textures and meshes exclusively in video memory as d3d/opengl 
resources.
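
For example, a scratch buffer that never touches the GC heap can be as 
simple as this rough sketch (names are illustrative, error handling omitted):

import core.stdc.stdlib : malloc, free;

struct FrameScratch
{
    ubyte* data;
    size_t size;

    static FrameScratch create(size_t bytes)
    {
        // lives outside the GC heap, so collections never scan or pause on it;
        // if it ever stores pointers to GC data, register it with
        // core.memory.GC.addRange
        auto p = cast(ubyte*) malloc(bytes);
        assert(p !is null);
        return FrameScratch(p, bytes);
    }

    void release()
    {
        free(data);
        data = null;
        size = 0;
    }
}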


The fact that malloc/free and the overhead of refcounting take longer is 
largely meaningless, because the cost is spread out.  If the perf of 
malloc/free is a problem you can always make more heaps, as the main 
cost is usually lock contention.


The STL containers are pretty much unusable due to how much memory they 
waste, how many allocations they require, and the inability to replace 
their allocators in any meaningful way that allows you to used fixed 
size block allocators.  Hashes for instance require multiple different 
kinds of allocations but they are forced to all go through the same 
allocator.  Also, the STL containers tend to allocate huge amounts of 
slack that is hard to get rid of.


Type traits and algorithms are about the only usable parts of the STL.



Re: Inheritance of purity

2012-02-24 Thread Sean Cavanaugh

On 2/24/2012 10:29 AM, H. S. Teoh wrote:

On Fri, Feb 24, 2012 at 02:43:01AM -0800, Walter Bright wrote:
[...]

Like the switch from command line to GUI, perhaps there are some that
are ready to switch from text files to some visually graphy thingy for
source code. But D ain't such a language. I don't know what such a
language would look like. I've never thought much about it before,
though I heard there was a toy language for kids that you programmed
by moving boxes around on the screen.


That sounds like an awesome concept for a programming game. :-)


T



Graphics Shaders can be developed with a UI of this nature.

Just google around for the UDK material editor for an example.  As 
advanced as it is, it will look crude in a few more years too.


Re: Linking with d3d11.dll/lib

2012-02-24 Thread Sean Cavanaugh

On 2/23/2012 5:03 PM, John Burton wrote:

I'm trying to use the d3d11 bindings in
http://www.dsource.org/projects/bindings/wiki/DirectX to call
direct3d11 functions from my D program.

I've managed to get the code to compiler but when it links I get
this error -

Error 42 Symbol Undefined _D3D11CreateDeviceAndSwapChain@48

I have copied the D3D11.lib from the D3D SDK into my project and
then run COFFIMPLIB d3d11.lib -f to convert it and included the
library in my project settings.

Is there something else I need to do to make this link? Any help
appreciated.


I modified my test app to use D3D11CreateDeviceAndSwapChain instead of 
D3D11CreateDevice and it links and runs against either function.  The 
only trouble I had was when I went out of my way to delete the import 
libraries (d3d11.lib) in order to see what the link error looks like.


I am currently using dmd2 2.057 and building my project with Visual 
Studio + VisualD plugin (0.30).


Re: Linking with d3d11.dll/lib

2012-02-23 Thread Sean Cavanaugh

On 2/23/2012 5:03 PM, John Burton wrote:

I'm trying to use the d3d11 bindings in
http://www.dsource.org/projects/bindings/wiki/DirectX to call
direct3d11 functions from my D program.

I've managed to get the code to compiler but when it links I get
this error -

Error 42 Symbol Undefined _D3D11CreateDeviceAndSwapChain@48

I have copied the D3D11.lib from the D3D SDK into my project and
then run COFFIMPLIB d3d11.lib -f to convert it and included the
library in my project settings.

Is there something else I need to do to make this link? Any help
appreciated.



It's been quite a while since I've had time to work on my side project 
(which is what prompted the d3d11 module to get written).


I can look into this later when I get home, as well as strip down my 
test app and publish it somewhere.


The app is pretty simple - it creates a window, a d3d11 device, a vertex 
and pixel shader, a vertex and index buffer, and draws a triangle.


It would be more but I had been working on a stripped down version of 
the an ATL/WTL like wrapper for D in the meantime, in order to handle 
HWND objects and generate message maps with mixins.


Re: The Right Approach to Exceptions

2012-02-18 Thread Sean Cavanaugh

On 2/18/2012 7:56 PM, Zach wrote:

On Sunday, 19 February 2012 at 01:29:40 UTC, Nick Sabalausky wrote:

Another one for the file of Crazy shit Andrei says ;)

From experience, I (and clearly many others here) find a sparse, flat
exception hierarchy to be problematic and limiting. But even with a
rich detailed exception hierarchy, those (ie, Andrei) who want to
limit themselves to catching course-grained exceptions can do so,
thanks to the nature of subtyping. So why are we even discussing this?


How about we revisit ancient design decisions which once held true...
but no longer is the case due to our god-language being more expressive?

In my experience an exception hierarchy is never good enough, it
suffers from the same problems as most frameworks also do... they
simplify/destroy too much info of from the original error.

ex why don't we throw a closure? Of course we could go crazy with mixins
and CTFE,CTTI,RTTI aswell... imho the goal should not be to do as good
as java, the goal is progress! Java should copy our design if
anything... we could have a very rich exception structure... without the
need for a hierarchy.

try(name) // try extended to support full closure syntax
{
DIR* dir = opendir(toStringz(name));

if(dir==0  errno==ENOTDIR)
throw; // throws the entire try block as a closure
}



My C++ experience revolves around some rather large codebases that do 
not have exceptions enabled at all.  The more I look into what it 
would take to start using them, the more I see some pretty huge downsides:


The exact nature of the error for a lot of exceptions, for things like file 
I/O, is very platform specific.  Throwing these objects effectively bubbles 
highly platform specific data up to the caller, which infects the 
caller with the need to deal with it (POSIX codes vs 
GetLastError vs various return values from the functions that failed, 
etc.).  This is not a good thing for keeping a codebase isolated from 
platform differences.


There has been a huge shift toward needing to make everything I work 
on threaded at a fine-grained level (game simulation, graphics 
rendering, various physics based systems and simulations, networking, 
etc.).  The way exceptions work (unwind the stack until someone 'cares') 
is not a good model in this environment.  Ideally everything is on its 
own thread, or exists as a job in a job queue, or exists as some kind of 
node in a flow based system.  In these environments there is no caller 
on the _stack_ to unwind to.  You have to package up another async 
message with the failure and handle it somewhere else.  In many ways 
this is superior, as it solves the problem exceptions were created to 
solve: route errors to the proper code that 'cares'.
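
In D terms, the "package the failure up as a message" approach can be 
sketched with std.concurrency (purely illustrative):

import std.concurrency : spawn, send, receive, ownerTid;
import std.stdio : writeln;

// The worker reports failures as data instead of letting an exception
// unwind into a caller that is no longer on its stack.
void worker()
{
    try
    {
        throw new Exception("simulated I/O failure");
    }
    catch (Exception e)
    {
        send(ownerTid, e.msg);   // e.msg is immutable, so it can be sent
    }
}

void main()
{
    spawn(&worker);
    receive((string err) { writeln("worker reported: ", err); });
}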


In the von Neumann model this has been made difficult by the stack 
itself.  Thinking of exceptions as they are currently implemented in 
Java, C++, D, etc. automatically and artificially constrains how they 
need to work.  Exceptions as they are now exist as a way to jump up the 
stack some arbitrary amount (until a handler is found).  The real 
problem that needs solving is 'route errors to the most appropriate place 
available' with the constraint of 'keep the program in a working state'. 
I would prefer a system that works equally well in both kinds of 
environments, as we are already mixing the styles, and switching code 
from one to the other requires a large amount of refactoring due to the 
differences in error handling.






Re: The Right Approach to Exceptions

2012-02-18 Thread Sean Cavanaugh

On 2/18/2012 11:07 PM, Walter Bright wrote:

On 2/18/2012 8:08 PM, bearophile wrote:

To improve this discussion a small benchmark is useful to see how much
bloat this actually causes.


It'll increase with reflection and perfect garbage collection.


Are these coming?  :)


Re: std.simd module

2012-02-04 Thread Sean Cavanaugh

On 2/4/2012 7:37 PM, Martin Nowak wrote:

Am 05.02.2012, 02:13 Uhr, schrieb Manu turkey...@gmail.com:


On 5 February 2012 03:08, Martin Nowak d...@dawgfoto.de wrote:


Let me restate the main point.
Your approach to a higher level module wraps intrinsics with named
functions.
There is little gain in making simd(AND, f, f2) to and(f, f2) when
you can
easily take this to the level GLSL achieves.



What is missing to reach that level in your opinion? I think I basically
offer that (with some more work)
It's not clear to me what you object to...
I'm not prohibiting the operators, just adding the explicit functions,
which may be more efficient in certain cases (they receive the version).

Also the 'gains' of wrapping an intrinsic in an almost identical function
are, portability, and potential optimisation for hardware versioning. I'm
specifically trying to build something that's barely above the intrinsics
here, although a lot of the more arcane intrinsics are being collated
into
their typically useful functionality.

Are you just focused on the primitive math ops, or something broader?


GLSL achieves very clear and simple to write construction and conversion
of values.

I think wrapping the core.simd vector types in an alias this struct
makes it a snap
to define conversion through constructors and swizzling through
properties/opDispatch.
Then you can overload operands to do the implementation specific stuff
and add named methods
for the rest.



The GLSL or HLSL syntax is fairly nice, but it relies on a few hardware 
advantages that are harder to get at with PC SIMD:


The hardware that runs HLSL can natively operate on data types 
'smaller' than the register, either handling them directly or by turning 
the instructions into a mass of scalar ops that are then run in parallel 
as well as possible.  In SIMD land on CPUs the design is much more 
rigid: we are effectively stuck using float and float4 data types and 
emulating float2 and float3.  For a very long time there was not even 
a dot product instruction, as from Intel's point of view your data is 
transposed incorrectly if you need to do one (plus they would have to handle 
dot2, dot3, dot4, etc.).


The cost of this emulation of float2 and float3 types is that we have to 
put 'some data' in the unused slots of the SIMD register on swizzle 
operations, which will usually lead to the SIMD instructions generating 
INF's and NANs in that slot and hurting performance.


The other major problem with the shader swizzle syntax is that it 
'doesn't scale'.  If you are using a 128 bit register holding 8 shorts or 16 
bytes, what are the letters here?  Shaders assume 4 is the limit, so you 
have either xyzw or rgba.  Then there are platform considerations (e.g. 
you can't swizzle 8 bit data on SSE, you have to use a series of 
pack/unpack and shuffles, whereas VMX can do it easily).


That said, shader swizzle syntax is very nice; it can certainly reduce 
the amount of code you write by a huge factor (though the codegen is 
another matter).  Even silly tricks with swizzling literals in HLSL are 
useful, like the following code to sum up some numbers and test the result:


if (dot(a, 1.f.xxx) > 0)
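
On the D side, the opDispatch idea quoted above can be sketched roughly like 
this (illustrative only; it wraps a plain float[4] rather than a core.simd 
type and only handles read swizzles):

import std.string : indexOf;

struct Vec4
{
    float[4] v;

    // v.zyx, v.xxww, ... any 1-4 letter combination of x/y/z/w
    auto opDispatch(string s)() const
        if (s.length >= 1 && s.length <= 4)
    {
        float[s.length] r;
        foreach (i, c; s)
        {
            auto k = "xyzw".indexOf(c);
            assert(k >= 0, "swizzle letters must be x, y, z or w");
            r[i] = v[k];
        }
        return r;
    }
}

unittest
{
    auto a = Vec4([1.0f, 2.0f, 3.0f, 4.0f]);
    assert(a.zyx == [3.0f, 2.0f, 1.0f]);
}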




Re: std.simd module

2012-02-04 Thread Sean Cavanaugh


Looks good so far:

  it could use float[2] code wherever there is float[3] code 
(magnitude2 etc)


  any/all should have template overloads to let you specify exactly 
which channels match, and simple hardcoded ones for the common cases 
(any1, any2, any3, any4, aka the default 'any')


  I have implementations of floor/ceil/round(to-even) that work on 
pre-SSE4 hardware for floats and doubles that I can give out; they are fairly 
simple, as well as the main transcendentals (pow, exp, log, sin, cos, 
tan, asin, acos, atan).  sinh and cosh are the only major ones I left out.


I just need a place or address to post or mail the code.

  D should be able to handle names and overloading better, though 
giving everything unique names was the design choice I made for my 
library, primarily to make the code searchable and potentially portable 
to C (aside from the heavy use of const references as argument types).




On 2/4/2012 1:57 PM, Manu wrote:

So I've been trying to collate a sensible framework for a standard
cross-platform simd module since Walter added the SIMD stuff.
I'm sure everyone will have a million opinions on this, so I've drawn my
approach up to a point where it properly conveys the intent, and I've
proven the code gen works, and is good. Now I figure I should get
everyone to shoot it down before I commit to the tedious work filling in
all the remaining blanks.

(Note: I've only written code against GDC as yet, since DMD's SSE only
supports x64, and x64 is not supported in Windows)
https://github.com/TurkeyMan/phobos/blob/master/std/simd.d

The code might surprise a lot of people... so I'll give a few words
about the approach.

The key goal here is to provide the lowest level USEFUL set of
functions, all the basic functions that people actually use in their
algorithms, without requiring them to understand the quirks of various
platforms vector hardware.
Different SIMD hardware tends to have very different shuffling,
load/store, component addressing, support for more/less of the primitive
maths operations, etc.
This library, which is the lowest level library I expect programmers
would ever want to use in their apps, should provide that API at the
lowest useful level.

First criticism I expect is for many to insist on a class-style vector
library, which I personally think has no place as a low level, portable API.
Everyone has a different idea of what the perfect vector lib should look
like, and it tends to change significantly with respect to its application.

I feel this flat API is easier to implement, maintain, and understand,
and I expect the most common use of this lib will be in the back end of
peoples own vector/matrix/linear algebra libs that suit their apps.

My key concern is with my function names... should I be worried about
name collisions in such a low level lib? I already shadow a lot of
standard float functions...
I prefer them abbreviated in this (fairly standard) way, keeps lines of
code short and compact. It should be particularly familiar to anyone who
has written shaders and such.

Opinions? Shall I continue as planned?




Re: indent style for D

2012-01-29 Thread Sean Cavanaugh

On 1/29/2012 5:36 AM, Trass3r wrote:

http://www.d-programming-language.org/dstyle.html in regard to
indent-style, can someone shed some light what is recommended practice
for it within D community?


Everyone thinks his way is the best.


That's because it is :)



Curly braces on the same line as conditionals are a refactoring landmine IMO.

I've never seen an editor that would enforce tabs only as leading characters on 
a line, and until all of them do, spaces are the only thing 
that makes sense to me, since they are never 'wrong'.


The codebase I use at work is full of tabs and I can tolerate them, but 
not knowing how many times to hit backspace on some chunk of code 
containing whitespace in the middle of it is really annoying.  Yes there 
is undo but it starts infringing on my flow (replacing zen with anger, 
the emperor would be pleased . . .)




Re: dmd2

2012-01-29 Thread Sean Cavanaugh

On 1/29/2012 10:24 AM, Chad J wrote:

Hey guys,

I know this is a bit late given the deprecation of D1 and all, but why
did we name the D2 compiler dmd instead of dmd2?

It's rather annoyed me when trying to work with multiple D projects of
mixed kind in the same environment. Using the same compiler name for two
different programming languages seems like a Bad Idea.

- Chad



On an unrelated note, it looks like D would have to get to 14 or 15 
before typing 'D14' into a Google search by itself would likely produce a #1 
hit without additional keywords :)




Re: indent style for D

2012-01-29 Thread Sean Cavanaugh

On 1/29/2012 5:03 PM, Iain Buclaw wrote:

On 29 January 2012 14:17, bearophilebearophileh...@lycos.com  wrote:

Denis Shelomovskij:


Am I mistaken? If no, am I missing some major spaces advantages? If no,
lets use tabs.


D2 style guide should *require* D2 to be edited using a mono-spaced font, and the D2 
front-end should enforce this with a -ms compiler switch.

Bye,
bearophile



I think D should go even further than that and drop those horrid curly
braces, handling all code blocks by indentation.




There are huge swaths of unused Unicode values, including a rather 
large chunk reserved for custom user implementations.


Clearly we need to redesign the language to use custom symbols that make 
sense, and have a unique set of programmer symbols instead of the archaic 
typesetting symbols we use now.


We can even solve the space-vs-tab problem once and for all by adding a 
single whitespace key and removing the obsolete buttons (tab, spacebar, 
and enter) to make room for all the new ones.




Re: dmd2

2012-01-29 Thread Sean Cavanaugh

On 1/29/2012 3:37 PM, Timon Gehr wrote:

On 01/29/2012 06:55 PM, Sean Cavanaugh wrote:

On 1/29/2012 10:24 AM, Chad J wrote:

Hey guys,

I know this is a bit late given the deprecation of D1 and all, but why
did we name the D2 compiler dmd instead of dmd2?

It's rather annoyed me when trying to work with multiple D projects of
mixed kind in the same environment. Using the same compiler name for two
different programming languages seems like a Bad Idea.

- Chad



On an unrelated note it looks like D would have to get to 14 or 15
before typing in D14 on google search all by itself would likely be a #1
hit without additional keywords :)



???

http://www.google.com/search?q=d



D Magazine is #1 for me, which is all about things to do in Dallas.  The 
Digital Mars page is #17 on page 2; the Wikipedia article on D is #4.


Re: Biggest Issue with D - Definition and Versioning

2012-01-17 Thread Sean Cavanaugh
Hmm, my experiences are similar for 90% of companies, though I have seen 
some exceptions (Perforce is receptive to feedback and bugs, and certain 
divisions of Microsoft are communicative, but not MSConnect).   The 
common denominator for communication looks pretty simple to me:


If there is anyone between you and the developer on the other end, it is 
doomed to be a black hole.   MSConnect is a form-letter 'your bug is a 
duplicate, we can't reproduce it, and please test it in our new version 
for us so we can fail to fix it even though we know about it', but 
working directly with divisions in Microsoft is very much possible (e.g. 
console support for programmers working on the XBOX is stellar, 
completely the opposite of the experience of reporting bugs to MSConnect).



On 1/16/2012 10:32 PM, Walter Bright wrote:

On 1/16/2012 12:00 AM, Gour wrote:

Recently I was evaluating one CMS written in one popular Python
framework and after reporting bug which makes it unusable for even
simple page layout, hearing nothing from the developer and then seeing
it's fixed after more than two months, it was not difficult to abandon
idea to base our sites on such a product.



Your other ideas are well considered, but I have to take issue with this
one.

I have submitted many, many bug reports over the decades to Major
Software Vendors with well supported software products.

How many times have I gotten anything other than a robo-reply?

zero

When my company has paid $ in the 5 figures for premium tech service,
what is the response to bug reports?

nothing

-- or --

that's not a bug

-- or --

you're a unique snowflake and nobody else has that problem
so we won't fix it

How many times has a bug I reported ever been fixed, even waiting a year
for the next update?

zero

I take that back. One time I got so mad about this I contacted the CEO
of the Major Software Vendor (I knew him personally) and he got out a
crowbar, went to see the dev team, and (allegedly) thwacked a few of
them. The bug still never got fixed, but I got an acknowledgment.

This has obviously never impeded anyone from using their software tools.

It's also why:

1. I never bother filing bug reports to Major Software Vendors anymore.

2. With Digital Mars products, anyone can file a bug report with
Bugzilla without needing me to acknowledge or filter it.

3. Anyone can read and comment on those bug reports.

4. I think we've had great success using Github and allowing anyone to
fork  fix  publish.

I know our response to bug reports is far from perfect, but at least we
aren't hiding under a rock.

It's also true that if a company wanted to bet the farm on D, and were
willing to put some money behind it, their concerns would get priority,
as full time professional developers could get hired to do it.




Re: SIMD support...

2012-01-17 Thread Sean Cavanaugh

On 1/16/2012 7:21 PM, Danni Coy wrote:

(dual quaternions? Are they
used in games?)

yes



While the GPU tends to do this particular step of the work, the answer 
in general is 'definitely'.  One of the most immediate applications of 
dual quats was to improve the image quality of joints on characters that 
twist and rotate at the same time (shoulder blades, wrists, etc.), at a 
minimal increase in computational cost (in some cases none at all) over 
older methods.


http://isg.cs.tcd.ie/projects/DualQuaternions/



Re: start on SIMD documentation

2012-01-14 Thread Sean Cavanaugh
What about the 256 bit types that are already present in the AVX instruction 
set?


I've written several C++ based SIMD math libraries (for SSE2 up 
through AVX, and for the PPC VMX instruction sets that you find on game 
consoles).


The variable type naming is probably the most annoying thing to work out.

For HLSL they use float, float1, float2, float3, float4 and int, uint 
and double versions, and this convention works out quite well until you 
start having to deal with smaller integer types or FP16 half floats.


However, on the CPU side of things there are signed and unsigned 8, 16, 
32, 64 and 128 bit values.  It gets even more complicated in that not 
all the math operations or comparisons are supported on the non-32 bit 
types.  The hardware is really designed for you to unpack the 
smaller types to 32 bits, do the work, and pack the results back down, and the 
64 bit integer support is also a bit spotty (especially wrt multiply and divide).


On 1/13/2012 2:57 PM, bearophile wrote:

Walter:


What's our vector, Victor?
http://www.youtube.com/watch?v=fVq4_HhBK8Y


Thank you Walter :-)



If int4 is out, I'd prefer something like vint4. Something short.


Current names:

void16
double2
float4
byte16
ubyte16
short8
ushort8
int4
uint4
long2

Your suggestion:

vvoid16
vdouble2
vfloat4
vbyte16
vubyte16
vshort8
vushort8
vint4
vuint4
vlong2


My suggestion:

void16v
double2v
float4v
byte16v
ubyte16v
short8v
ushort8v
int4v
uint4v
long2v

Bye,
bearophile




Re: SIMD support...

2012-01-14 Thread Sean Cavanaugh

MS has three types, __m128, __m128i and __m128d  (float, int, double)

Six if you count AVX's 256 forms.

On 1/7/2012 6:54 PM, Peter Alexander wrote:

On 7/01/12 9:28 PM, Andrei Alexandrescu wrote:
I agree with Manu that we should just have a single type like __m128 in
MSVC. The other types and their conversions should be solvable in a
library with something like strong typedefs.



Re: SIMD support...

2012-01-14 Thread Sean Cavanaugh

On 1/6/2012 9:44 AM, Manu wrote:

On 6 January 2012 17:01, Russel Winder rus...@russel.org.uk
mailto:rus...@russel.org.uk wrote:
As said, I think these questions are way outside the scope of SIMD
vector libraries ;)
Although this is a fundamental piece of the puzzle, since GPGPU is no
use without SIMD type expression... but I think everything we've
discussed here so far will map perfectly to GPGPU.


I don't think you are in any danger, as the GPGPU instructions are more 
flexible than their CPU SIMD counterparts; GPU hardware natively handles 
float2 and float3 extremely well.  GPUs have VLIW instructions that can 
effectively add a huge number of modifiers to their 
instructions (things like built-in 0..1 saturates on argument _reads_, 
arbitrary swizzle on read and write, write masks that 
leave partial data untouched, etc., all in one clock).


The CPU SIMD stuff is simplistic by comparison.  A good bang for the 
buck would be to have some basic set of operators (* / + - == != < <= > >= 
and especially ?: (the ternary operator)), and versions of 'any' and 
'all' from HLSL for dynamic branching, that work at the very least 
for integer, float, and double types.


Bit shifting is useful (especially since manipulating floats for transcendental 
functions or working with FP16 half types requires a lot of it), but should 
be restricted to integer types.  Having dedicated signed and unsigned 
right shifts would be pretty nice too (since about 95% of my right shifts 
end up needing to be of the zero-extended variety even when I had to 
cast to 'vector integers').




Re: SIMD support...

2012-01-14 Thread Sean Cavanaugh

On 1/6/2012 7:58 PM, Manu wrote:

On 7 January 2012 03:46, Vladimir Panteleev vladi...@thecybershadow.net
mailto:vladi...@thecybershadow.net wrote:

I've never seen a memcpy on any console system I've ever worked on that
takes advantage if its large registers... writing a fast memcpy is
usually one of the first things we do when we get a new platform ;)


Plus memcpy is optimized for reading and writing to cached virtual 
memory, so you need several others to write to write-combined or 
uncached memory efficiently and whatnot.


Re: SIMD support...

2012-01-14 Thread Sean Cavanaugh

On 1/15/2012 12:09 AM, Walter Bright wrote:

On 1/14/2012 9:58 PM, Sean Cavanaugh wrote:

MS has three types, __m128, __m128i and __m128d (float, int, double)

Six if you count AVX's 256 forms.

On 1/7/2012 6:54 PM, Peter Alexander wrote:

On 7/01/12 9:28 PM, Andrei Alexandrescu wrote:
I agree with Manu that we should just have a single type like __m128 in
MSVC. The other types and their conversions should be solvable in a
library with something like strong typedefs.



The trouble with MS's scheme, is given the following:

__m128i v;
v += 2;

Can't tell what to do. With D,

int4 v;
v += 2;

it's clear (add 2 to each of the 4 ints).


Working with their intrinsics in their raw form for real code is pure 
insanity :)  You need to wrap it all with a good math library (even if 
90% of the library is the intrinsics wrapped into __forceinlined 
functions), so you can start having sensible operator overloads, and so 
you can write code that is readable.



if (any4(a > b))
{
  // do stuff
}


is way way way better than (pseudocode)

if (_mm_movemask_ps(_mm_cmpgt_ps(a, b)) != 0)
{
}



and (if the ternary operator were overridable in C++)

float4 foo = (a > b) ? c : d;

would be better than

float4 mask = _mm_cmpgt_ps(a, b);
float4 foo = _mm_or_ps(_mm_and_ps(mask, c), _mm_andnot_ps(mask, d));



Re: SIMD support...

2012-01-14 Thread Sean Cavanaugh

On 1/13/2012 7:38 AM, Manu wrote:

On 13 January 2012 08:34, Norbert Nemec norb...@nemec-online.de
mailto:norb...@nemec-online.de wrote:


This has already been concluded some days back, the language has a quite
of types, just like GCC.


So I would definitely like to help out on the SIMD stuff in some way, as 
I have a lot of experience using SIMD math to speed up the games I work 
on.  I've got a vectorized set of transcendental functions (currently in the 
form of MSVC++ intrinsics) for float and double that would be a good 
start if anyone is interested.  Beyond that I just want to help 'make it 
right', because it's a topic I care a lot about, and it is my personal biggest 
gripe with the language at the moment.


I also have experience with VMX; as the two are not exactly the same, it 
would definitely help to avoid making the code too Intel-centric (though 
typically VMX is the more flexible design, as it can do dynamic 
shuffling based on the contents of the vector registers, etc.).


Re: dmd 2.057 release

2012-01-03 Thread Sean Cavanaugh

On 1/3/2012 1:25 PM, Walter Bright wrote:

On 1/3/2012 10:55 AM, Alex Rønne Petersen wrote:

On 03-01-2012 19:47, Walter Bright wrote:

On 1/3/2012 6:49 AM, Alex Rønne Petersen wrote:

Perhaps some kind of experimental releases would be better. It could
help
getting new features out to the community (and thus tested) faster.


We call them betas g.

But anyone can pull the latest from github and use it, many do.


That's not very practical for most users. Some kind of
ready-to-download builds
would be much better. As others suggested, the auto-tester publishing
builds for
download would be ideal.


Using a nightly build is not very practical for most users, either,
probably the same group.


Well, there is always the Google (and Mozilla) route of force-feeding the 
latest binaries to everyone :)


Re: dmd and C++11

2012-01-01 Thread Sean Cavanaugh

On 12/29/2011 10:16 AM, Trass3r wrote:

On Thursday, 29 December 2011 at 16:00:47 UTC, Vladimir Panteleev wrote:

On Thursday, 29 December 2011 at 15:58:55 UTC, Trass3r wrote:

What's the stance on using C++11 features in the dmd source code in
the future?


Well, how many C++11 features does DMC support?


*sigh*
Totally forgot about that. Can't we finally get rid of that crappy
toolchain :'(
btw, wasn't there a patch to make dmd compile with VisualStudio cl?


Visual Studio's support of C++11 is pretty weak (or maybe more accurately 
'spotty') in VS2010, and this will be virtually unchanged in the next 
version.  On the plus side, what is there (rvalue references, auto) is 
nice to have.  On the downside, what is missing is really sad (variadic 
templates, range-based for, the full set of C++11 type traits, the new Unicode 
string types and literals, initializer lists, and a few others I can't 
remember easily).




Re: Carmack about static analysis

2011-12-27 Thread Sean Cavanaugh

On 12/25/2011 10:23 PM, Andrei Alexandrescu wrote:


As a first step, we must make all allocations except stack type-aware,
and leave only the stack to be imprecise.



Couldn't GC'ing the stack be handled in a similar style to how the 
Windows x64 ABI functions with respect to exception handling?


The unwind data lives outside the normal program flow to reduce 
runtime overhead as much as possible, at least until an exception is 
thrown, at which point the exception handlers traverse this data 
structure to unwind the stack from wherever you are in your function 
(and its caller, and so on).


I would imagine a GC system could do something very similar.  There are 
a few MSDN blogs with a bunch of useful links out to more detailed 
information:


http://blogs.msdn.com/b/freik/archive/2006/01/04/509372.aspx

http://blogs.msdn.com/b/freik/archive/2005/03/17/398200.aspx

http://msdn.microsoft.com/en-us/library/1eyas8tf.aspx



Re: is d-runtime non-gc safe?

2011-12-07 Thread Sean Cavanaugh

On 12/5/2011 3:22 PM, Norbert Nemec wrote:

On 05.12.2011 21:40, Tobias Pankrath wrote:

Right - thanks for the hint!

That would leave the following rules for real-time audio code in D:

[snip]


What's about message passing? Is message passing hard real time ready?


The issue actually came up for me a few weeks ago.

In principle, all you need are non-blocking queues to send messages to,
from or even between RT threads. With the atomic operations in D, such a
queue is fairly straightforward to implement.

Unfortunately, I could not find any OS support for non-blocking message
passing. More specifically, there does not seem to be any way to wake a
sleeping thread by sending a message from a RT thread. The only option
is periodically querying the message queue. Not optimal, but probably
sufficient for most purposes.



In Windows land you would use MsgWaitForMultipleObjects on the sleepy 
thread and wake it up with a kernel object of some kind (an Event, most 
likely).
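
A bare-bones sketch of that pattern in D (Windows-only, error handling 
omitted, and WaitForSingleObject stands in for MsgWaitForMultipleObjects 
since this thread pumps no messages):

version (Windows)
{
    import core.sys.windows.windows;
    import core.thread : Thread;

    __gshared HANDLE wakeEvent;

    void sleeper()
    {
        // blocks until another thread signals the event
        WaitForSingleObject(wakeEvent, INFINITE);
    }

    void main()
    {
        wakeEvent = CreateEventW(null, FALSE, FALSE, null);
        auto t = new Thread(&sleeper);
        t.start();

        SetEvent(wakeEvent);     // wakes the sleeper (e.g. from an RT thread)
        t.join();
        CloseHandle(wakeEvent);
    }
}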


Re: std.stdio overhaul by Steve Schveighoffer

2011-09-08 Thread Sean Cavanaugh

On 9/7/2011 2:19 AM, Jacob Carlborg wrote:

On 2011-09-06 19:05, Daniel Murphy wrote:

Andrei Alexandrescuseewebsiteforem...@erdani.org wrote in message
news:j45isu$2t3h$1...@digitalmars.com...


Yah, I also think the documentation makes it easy to clarify which
module
is the preferred one.

I think there's a lot of merit to simply appending a '2' to the module
name. There only place where the '2' occurs is in the name of the
module,
and there aren't many modules we need to replace like that.


I still can never remember if I'm supposed to be using std.regex or
std.regexp.
When the new one is finished are we going to have 3?

It's definately benificial to avoid breaking code, but I really disagree
that phobos has reached that point yet. The breaking changes need to
stop,
but stopping prematurely will leave phobos permanently disfigured.


I agree.




In COM-based land for D3D, there is just a number tacked onto the 
interface name.  We are up to version 11 (e.g. ID3D11Device).  Once you 
are used to it, it works well and is definitely nicer than calling everything 
New or FunctionEx and being left wondering what to do when you rev the 
interface again.  Once you solve making three versions of an interface work 
cleanly, it should be a good system.


Making all the modules versioned in some way would probably be ideal. 
The way Linux shared libraries are linked could be used as a model: just 
make the 'friendly', unversioned module name an alias of some sort for the 
latest version of the library.  Any code needing an older version can 
specify it explicitly.  An approach like this would need to be done 
within D, as symbolic links are a problem for some platforms (though at 
least they are possible on Windows these days).
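
In today's D, the aliasing part can already be approximated with a 
forwarding module (the module names below are made up for illustration):

// file mylib/container.d -- the friendly, unversioned name
module mylib.container;

// re-export whatever the current version is
public import mylib.container2;

// code that needs the old behaviour imports mylib.container1 explicitly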




Re: Programming Windows D Examples are now Online!

2011-07-09 Thread Sean Cavanaugh

On 6/21/2011 1:08 AM, Brad Anderson wrote:

On Mon, Jun 20, 2011 at 10:14 PM, Andrej Mitrovic
andrej.mitrov...@gmail.com mailto:andrej.mitrov...@gmail.com wrote:

This is a translation project of Charles Petzold's Programming Windows
(5th edition) book code samples.

Currently over 120 code samples have been translated into D, with only
a few modules remaining.

Everything else you need to know is in the Readme file:
https://github.com/AndrejMitrovic/DWindowsProgramming

The examples were tested on fresh installs of XP and Win7 with the
only dependency being DMD v2.053 and an NT operating system. I hope
everyone will be able to build these examples without too much
trouble. *crosses fingers*



Awesome.  If only I hadn't sold my copy a month ago :(.  It's great that
Petzold was OK with you doing this.  He's a great author. Code is
probably the most interesting technical book I've ever read (sorry, Andrei).


I had ported a subset of the WTL functionality over to D for a personal 
project.  This primarily consists of a D port of atlcrack.h 
for creating a message map for an HWND wrapper class, and some template 
mixins to hook up the message map to a class.


WTL uses the curiously recurring template pattern, and D can do the 
same thing, but it is also able to use mixins, which makes things a lot more 
flexible.


The message handlers and bindings are the bulk of the annoying work, so while 
my 'DWindow' class was only filled in just enough to function for my 
app, filling out Win32 API calls as needed is pretty trivial compared 
to making all the message handlers typesafe and extracting their arguments 
properly.


I am not working on this at the moment, but it would be a good starting 
point, especially if you are working with GDI directly.  I am also still 
quite new to D, so style points or alternative techniques available in the 
language might have been overlooked.



Message cracking and handler binding helpers (dwindowmessages.d):

http://codepad.org/0ApNSvas

Simple 'DWindow' and MSG class and aliases (dwindow.d)

http://codepad.org/Q4Flyanw

A 'GameWindow' class with a bunch of message handlers hooked up 
correctly, and the message pump.


http://codepad.org/JJpfqE7a


win32fix.d, stuff that should be in the win32 bindings on the bindings 
web site but isn't  (I was using rawinput for the mouse which is rather 
esoteric to most programmers).  They are also missing from the Microsoft 
C/C++ headers, they are that obscure :)



http://codepad.org/xA18pGEX




Re: The issue with libraries that use Windows API prototypes

2011-07-09 Thread Sean Cavanaugh

On 7/9/2011 2:42 PM, Andrej Mitrovic wrote:

Btw, the issue with those conflicting functions could be resolved by
careful uses of selective imports:

import win32.wingdi;
// This overwrites win32\wingdi : wglMakeCurrent, wglDeleteContext,
wglCreateContext;
import derelict.opengl.wgl : wglMakeCurrent, wglDeleteContext, wglCreateContext;
// This overwrites win32.basetsd or wherever HGLRC is
import derelict.util.wintypes : HGLRC;

But again, that doesn't scale as it gets tedious to have to do this in
every module..


The Win32 Bindings project on dsource.org is far more complete 
(even if it is missing some Vista and a ton of Win7 functions and 
enumerations).  These apparently are a port of the mingw headers, but I 
can't help but wonder if it would be possible to split the difference 
with the mingw developers and have the bindings live in a language 
neutral database with a code generator for C/C++ headers and D 
'headers'.  I would also expect this to be of interest to Mono, to 
generate a complete and correct set of PInvoke bindings, and to any 
other language that wants to be able to talk natively to Win32.


The other major weak link I've run across is the usage of COM and 
IUnknown.  They are defined in a few places, which is problematic 
if you are trying to write COM interfaces that are provided by C/C++ 
DLLs (in my case, the D3D11 bindings).


import std.c.windows.com;
import win32.unknwn;

and a D template version of CComPtr from atl has to do two checks to see 
if it is a valid IUnknown object:


struct ComPtr(T)
{
public:
    static assert(is(T : std.c.windows.com.IUnknown)
               || is(T : win32.unknwn.IUnknown));

    T p;
    alias p this;
    ...
}


This is a pretty huge problem, since it's a giant violation of DRY (Don't 
Repeat Yourself).  There should only be one way of doing some things.





Re: reddit discussion about Go turns to D again

2011-05-15 Thread Sean Cavanaugh

On 5/15/2011 11:04 AM, dsimcha wrote:

On 5/15/2011 11:41 AM, Robert Clipsham wrote:

Automatically using a parallel algorithm if it's likely to improve
speed? Awesome. I assume that std.parallelism sets up a thread pool upon
program start so that you don't have the overhead of spawning threads
when you use a parallel algorithm for the first time?



No, it does so lazily. It seemed silly to me to do this eagerly when it
might never be used. If you want to make it eager all you have to do is
reference the taskPool property in the first line of main().



I haven't looked at the library in depth, but after taking a peek I'm 
left wondering how to configure the stack size.  My concern is what to 
do if the parallel tasks are running out of stack, or (more likely) are 
given way too much stack because they need to 'handle anything'.




Re: reddit discussion about Go turns to D again

2011-05-15 Thread Sean Cavanaugh

On 5/15/2011 11:49 AM, dsimcha wrote:

On 5/15/2011 12:21 PM, Sean Cavanaugh wrote:


I haven't looked at the library in depth, but after taking a peek I'm
left wondering how to configure the stack size. My concern is what to do
if the parallel tasks are running out of stack, or (more likely) are
given way too much stack because they need to 'handle anything'.



I never thought to make this configurable because I've personally never
needed to configure it. You mean the stack sizes of the worker threads?
I just use whatever the default in core.thread is. This probably errs on
the side of too big, but usually you're only going to have as many
worker threads as you have cores, so it's not much waste in practice.

If you really need this to be configurable, I'll add it for the next
release.



I'm used to working with embedded environments, so it's just something I 
notice right away when looking at threaded libraries.  We generally have 
to tune all the stacks to the bare minimum to get memory back, since it's 
extremely noticeable when running on a system without virtual memory.  A 
surprising number of tasks can be made to run safely on 1 or 2 pages of stack.


Looking into core.thread, the default behavior (at least on Windows) is 
to use the same size as the main thread's stack, so basically whatever 
is linked in as the startup stack is used.  It is a safe default, as 
threads can handle anything the main thread can and vice versa, but it is 
usually pretty wasteful for real-world work tasks.


A single thread pool by itself is never really the problem.  Pretty much 
every piece of middleware that does threading makes its own threads and 
thread pools, and it can add up pretty fast.  Even a relatively simple 
application can end up with something like 20 to 50 threads if any of 
the libraries are threaded (audio, physics, networking, background I/O, etc.).


Anyway, if you have lots of threads for various reasons, you can quickly 
have 50-100MB or more of your address space eaten up by stacks.  If 
this happens to you on, say, an XBOX 360, that's 10-20% of the RAM on the 
system, and tuning the stacks is definitely not a waste of time, but it 
has to be possible to do it :)




As an unrelated note, I got the magic VC exception that names threads in 
the Visual Studio debugger to work in D pretty easily; if anyone wants 
it, I've linked it :)


http://snipt.org/xHok
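
For reference, the trick is the MSVC-documented "thread naming" exception 
(code 0x406D1388); a rough D sketch of it looks like this (the helper name is 
made up, and the exception is only raised when a debugger is attached so 
nothing has to catch it otherwise):

version (Windows)
{
    import core.sys.windows.windows;

    struct THREADNAME_INFO
    {
        DWORD  dwType;     // must be 0x1000
        LPCSTR szName;     // pointer to the name
        DWORD  dwThreadID; // thread id, or -1 for the calling thread
        DWORD  dwFlags;    // reserved, must be zero
    }

    void setDebuggerThreadName(const(char)* name, DWORD threadId = DWORD.max)
    {
        if (!IsDebuggerPresent())
            return;                       // nobody is listening for it
        auto info = THREADNAME_INFO(0x1000, name, threadId, 0);
        RaiseException(0x406D1388, 0,
                       cast(DWORD)(info.sizeof / ULONG_PTR.sizeof),
                       cast(ULONG_PTR*) &info);
    }
}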







Re: reddit discussion about Go turns to D again

2011-05-15 Thread Sean Cavanaugh

On 5/15/2011 12:45 PM, dsimcha wrote:


Fair enough. So I guess stackSize should just be a c'tor parameter and
there should be a global for the default pool, kind of like
defaultPoolThreads? Task.executeInNewThread() would also take a stack
size. Definitely do-able, but I'm leery of cluttering the API with a
feature that's probably going to be used so infrequently.


I would say sleep on it at least.  At least with source libraries we 
have the option of hacking the library or forking it (and ripping it out 
of std if need be for nefarious purposes :)




private module stuff

2011-05-08 Thread Sean Cavanaugh
	So I was learning how to make a module of mine very strict with private 
parts, and was surprised I could only do this with global variables and 
functions.   Enums, structs, and classes are fully visible outside the 
module regardless of being wrapped in a private{} or prefixed with 
private.  Am I expecting too much here?




rough code:


module mymoduletree.mymodule;

private
{
    struct foo
    {
        int x;
        int y;
    }

    int somevar = 4;
}


.


module someothertree.othermodule;

import mymoduletree.mymodule;

int bar(int arg)
{
    foo var;          // why can i use this type here??
    var.x = arg;
    var.y = somevar;  // this generates an error (access of somevar is
                      // private and not allowed)

    return var.y;
}





Re: private module stuff

2011-05-08 Thread Sean Cavanaugh

On 5/8/2011 4:05 AM, Jonathan M Davis wrote:

Sean Cavanaugh:

So I was learning how to make a module of mine very strict with private

parts, and was surprised I could only do this with global variables and
functions.   Enums, structs, and classes are fully visible outside the
module regardless of being wrapped in a private{} or prefixed with
private.  Am I expecting too much here?


You are expecting the right thing. If you are right, then it's a bug that
eventually needs to be fixed. Take a look in Bugzilla, there are several
already reported import/module-related bugs.


They're private _access_ but still visible. I believe that that is the correct
behavior and not a bug at all. I believe that it's necessary for stuff like
where various functions in std.algorithm return auto and return a private
struct which you cannot construct yourself. Code which uses that struct needs
to know about it so that it can use it properly, but since it's private, it
can't declare it directly. It works because the types are appropriately
generic (ranges in this case). Regardless, I believe that the current behavior
with private is intended.

- Jonathan M Davis



The more I play with private/protected/package, the more confused I am by it.

For the most part the rules are:
  Functions and variables have protection
  Types (enum, struct, class) do not
  this and ~this are special, are not considered functions, and are always public
  struct and class members always default to public



If you search phobos you will find occurrences of 'private struct' and 
'private class', so even the people writing libraries are expecting 
something to be happening that isn't.  For example:



//in std.parallelism:
private struct AbstractTask {
    mixin BaseMixin!(TaskStatus.notStarted);

    void job() {
        runTask(this);
    }
}


//and in std.demangle:
private class MangleException : Exception
{
    this()
    {
        super("MangleException");
    }
}


and in my code I can compile the following without compile-time errors:


import std.parallelism;
import std.demangle;

int main()
{
    MangleException bar = new MangleException();

    AbstractTask foo;
    foo.job();

    return 0;
}




With the language the way it is now, it is nonsensical to have the 
attributes public/protected/package/private/export precede the keyword 
struct, class, or enum.


Re: Beta List

2011-05-07 Thread Sean Cavanaugh

On 5/7/2011 12:24 AM, Andrej Mitrovic wrote:

Here's a quick weblink:
http://news.gmane.org/gmane.comp.lang.d.dmd.beta/cutoff=624


It works on my machine :-D




Re: Difference between stack-allocated class and struct

2011-05-03 Thread Sean Cavanaugh
Here is my prototype COM compile-time reflection based wrapper mixin 
(which I have abandoned in favor of alias this, since it covers 95% of my 
use cases even though it isn't perfectly safe).  I am new at D so you 
have been warned, though this part of the language seems pretty 
straightforward enough.  It is possible to track down the argument 
names, but it takes some rather intricate parsing to do correctly (and the 
example someone linked me in another thread of mine choked on const 
types due to parsing bugs).



http://snipt.org/xsu


The wrapped interface also needs to be a mixin so it can be created in 
the correct module, with visibility to all the types passed to the 
interface's methods.



So something like the following is going to fail miserably unless ComPtr 
is also made into a mixin and instantiated in the correct module.



struct ComPtr(T)
{
public:
    T m_Pointer;
    mixin(WrapComInterface!(T)(m_Pointer));
}




Re: How about a Hash template?

2011-04-30 Thread Sean Cavanaugh

On 4/29/2011 6:19 PM, Alexander wrote:

On 29.04.2011 21:58, Andrei Alexandrescu wrote:


You need to replace the assert and compile with -O -release -inline. My results:


[snip]

Still, straight comparison wins - 2x faster ;)

/Alexander



When understanding the CPU platform you are on, one of the benchmarks 
you can do is to measure how many linear integer comparisons you can do 
vs a binary search of the same size, and graph it out.


There is a crossover point where the binary search will be faster, but 
with modern CPUs the number of linear items you can search increases 
every year.


The linear search also makes extremely good use of the cache and 
hardware prefetching, and the branches (as well as the loop itself) will 
be predicted correctly until the terminating condition is found, whereas 
the binary search is mispredicted about 50% of the time.  The last time I 
measured it, the crossover point was around 60 integer values, and it 
wouldn't surprise me at all if it's over 100 on newer chipsets (Sandy 
Bridge, Bulldozer etc).
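
As a rough sketch (mine, not from the original thread), the two searches 
being compared would look something like this in D; time each one over 
varying array sizes to find the crossover point on your own hardware:

import std.stdio;

// first index whose value is >= needle, by linear scan
size_t linearFind(const(int)[] haystack, int needle)
{
    foreach (i, v; haystack)
        if (v >= needle)
            return i;
    return haystack.length;
}

// same contract, by binary search (lower bound)
size_t binaryFind(const(int)[] haystack, int needle)
{
    size_t lo = 0, hi = haystack.length;
    while (lo < hi)
    {
        const size_t mid = lo + (hi - lo) / 2;
        if (haystack[mid] < needle)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

void main()
{
    auto data = new int[64];
    foreach (i, ref v; data)
        v = cast(int)(i * 2);
    writeln(linearFind(data, 50), " ", binaryFind(data, 50)); // 25 25
}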





Re: Is int faster than byte/short?

2011-04-30 Thread Sean Cavanaugh

On 4/30/2011 10:34 AM, Mariusz Gliwiński wrote:

Hello,
I'm trying to learn high-performance real-time programming.

One of my wonderings are:
Should i use int/uint for all standard arithmetic operations or
int/short/byte (depends on actual case)?
I believe this question has following subquestions:
* Arithmetic computations performance
* Memory access time

My actual compiler is DMD, but I'm interested in GDC as well.

Lastly, one more question would be:
Could someone recommend any books/resources for this kind of
informations and tips that could be applied to D? I'd like to defer my
own experiments with generated assembly and profiling, but i suppose
people already published general rules that i could apply for my
programming.

Thanks,
Mariusz Gliwiński


My experience with this pattern of thinking is to use the largest data 
type that makes sense, unless you have a profiler saying you need to do 
something different.  However, if you get being obsessive-compulsive 
about having 'the perfectly sized integer types' for the code, it is 
possible to fall into the trap of over-using unsigned types 'because the 
value can never be negative'.  Unsigned 8 and 16 bit values usually have 
a good reason to be unsigned, but when you start getting to 32 and 64 
bit values it makes a lot less sense most of the time.
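
A hedged aside (mine, not part of the original reply): the classic way a 
32 bit unsigned type bites you is that the difference of two values that 
'can never be negative' silently wraps instead of going negative:

import std.stdio;

void main()
{
    uint bytesFree   = 100;
    uint bytesNeeded = 200;
    // with uint this wraps to a huge number; with int it would be -100
    auto deficit = bytesFree - bytesNeeded;
    writeln(deficit); // 4294967196
}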


When working with non-x86 platforms, other problems are usually much more 
severe: more expensive thread synchronization primitives, lack of 
efficient variable bit-shifting (a run-time determined number of bits 
shifted), non-existent branch prediction, or floating point code silently 
being promoted to emulated double precision on hardware that can only do 
single precision floating point, etc.




Quirks of 'alias foo this'

2011-04-25 Thread Sean Cavanaugh
So my research into making a nice, friendly-to-use COM interface wrapper 
for D has had a few odd turns, and I am wondering if there is an answer 
to making the implementation nicer.


I discovered the 'alias foo this' syntax to let structs nearly 
seamlessly impersonate a member variable.  This has turned out to solve 
most of my original need to wrap the functions, but it is imperfect.


The main problem I am having with 'alias foo this' is that I can't find a 
way to catch code reading the aliased variable, in cases of assignment or 
implicit conversion to the foo type.  I can catch writes just fine with 
opAssign, but finding a way to overload the reads has me stumped.


I did some experiments with wrapping the methods with some mixin 
templates, but using 'alias foo this' is about 100% more useful and 
intuitive, and 99.9% less code to write :)



Examples (The fully ComPtr code is down further):


// initializes to null by default
ComPtr!(ID3D11Device) device;
ComPtr!(ID3D11Device) otherdevice;

// The 'device' argument to D3D11CreateDevice is implemented as
// 'out ID3D11Device', and uses the 'alias p this' feature to
// auto-magically write directly into device.p;  Ideally
// I could hook this and either call SafeRelease here or assert
// that the p variable is null before being written to.
// This also represents a case where you can write to the
// struct without detecting it.
HRESULT rslt = D3D11CreateDevice(
    null,
    D3D11_DRIVER_TYPE.HARDWARE,
    null,
    0 | D3D11_CREATE_DEVICE.DEBUG,
    null,
    0,
    D3D11_SDK_VERSION,
    device,
    featureLevel,
    null);


// post-blit case, works
otherdevice = device;   


// gives me a copy of 'p' due to 'alias p this'
ID3D11Device rawdevice = device;


// assignment back the other direction is caught by opAssign
// this is also the code path used if there are multiple COM
// interfaces in the hierarchy (IUnknown->ID3D11Resource->ID3D11Texture)
// and post-blit isn't used because the types are different.
device = rawdevice;






My current version of ComPtr:



struct ComPtr(T)
{
public:
    static assert(is(T : std.c.windows.com.IUnknown) ||
                  is(T : win32.unknwn.IUnknown));

    T p;
    alias p this;

private:
    this(T inPtr)
    {
        p = inPtr;
    }

public:
    this(this)
    {
        if (p !is null)
        {
            p.AddRef();
        }
    }
    ~this()
    {
        SafeRelease();
    }

    // Attach and Detach set/unset the pointer without messing with the
    // refcount (unlike opAssign assignment)

    void Attach(T other)
    {
        SafeRelease();
        p = other;
    }
    T Detach()
    {
        T rval = p;
        p = null;
        return rval;
    }

    const bool HasData()
    {
        return (p !is null);
    }

    void opAssign(T other)
    {
        if (other !is null)
        {
            other.AddRef();
            SafeRelease();
            p = other;
        }
        else
        {
            SafeRelease();
        }
    }

    void SafeRelease()
    {
        if (p !is null)
        {
            p.Release();
            p = null;
        }
    }
}



Re: Implementing std.log

2011-04-24 Thread Sean Cavanaugh

On 4/20/2011 11:09 AM, Robert Clipsham wrote:

Hey folks,

I've just finished porting my web framework from D1/Tango to D2/Phobos,
and in the transition lost logging functionality. As I'll be writing a
logging library anyway, I wondered if there'd be interest in a std.log?
If so, is there a current logging library we would like it to be based
on, or should we design from scratch?

I know there has been discussion about Google's
http://google-glog.googlecode.com/svn/trunk/doc/glog.html and another
candidate may be http://logging.apache.org/log4j/ . Do we want a
comprehensive logging library, or just the basics? (Possibly with some
method for extension if needed).




I just wanted to mention Pantheios as a C++ logging system to take a look 
at as well; I didn't see it mentioned in this thread, and it seems to 
have all the major requirements for frontend/backend chaining and 
whatnot that people have brought up.  The code is on SourceForge to boot.




Re: GDC2, LDC2 Status Updates?

2011-04-24 Thread Sean Cavanaugh

On 4/24/2011 3:51 PM, Iain Buclaw wrote:


As some anecdote goes, bugs will be found once you stop looking.


Or when you want to show your app to someone else :)  I suspect this 
increases geometrically with the number of people watching and how many 
times you tell other people how cool it will be.





Re: OOP, faster data layouts, compilers

2011-04-22 Thread Sean Cavanaugh

On 4/22/2011 2:20 PM, bearophile wrote:

Kai Meyer:


The purpose of the original post was to indicate that some low level
research shows that underlying data structures (as applied to video game
development) can have an impact on the performance of the application,
which D (I think) cares very much about.


The idea of the original post was a bit more complex: how can we invent 
new/better ways to express semantics in D code that will not forbid future D 
compilers to perform a bit of changes in the layout of data structures to 
increase code performance? Complex transforms of the data layout seem too much 
complex for even a good compiler, but maybe simpler ones will be possible. And 
I think to do this the D code needs some more semantics. I was suggesting an 
annotation that forbids inbound pointers, that allows the compiler to move data 
around a little, but this is just a start.

Bye,
bearophile



In many ways the biggest thing I use regularly in game development that 
I would lose by moving to D would be good built-in SIMD support.  The PC 
compilers from MS and Intel both have intrinsic data types and 
instructions that cover all the operations from SSE1 up to AVX.  The 
intrinsics are nice in that the job of register allocation and 
scheduling is given to the compiler and generally the code it outputs is 
good enough (though it needs to be watched at times).


Unlike ASM, intrinsics can be inlined, so your math library can provide a 
platform abstraction at that layer before building up to larger 
operations (like vectorized forms of sin, cos, etc) and algorithms (like 
frustum cull checks, k-dop polygon collision, etc).  This makes porting 
and reusing the algorithms on other platforms much easier, as only the 
low level layer needs to be ported, and only outliers at the algorithm 
level need to be tweaked after you get it up and running.


On the consoles there is AltiVec (VMX) which is very similar to SSE in 
many ways.  The common ground is basically SSE1-tier operations: 128 
bit values with 4x32 bit integer and 4x32 bit float support.  64 
bit AMD/Intel makes SSE2 the minimum standard, and a systems language on 
those platforms should reflect that.


Loading and storing is comparable across platforms with similar 
alignment restrictions or penalties for working with unaligned data. 
Packing/swizzle/shuffle/permuting are different but this is not a huge 
problem for most algorithms.  The lack of fused multiply and add on the 
Intel side can be worked around or abstracted (i.e. always write code as 
if it existed, have the Intel version expand to multiple ops).
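
(A minimal sketch of my own, not from the post, of what that abstraction 
can look like in D; Float4 and madd are made-up names, and the scalar 
body stands in for whatever intrinsic each target really uses.)

struct Float4 { float[4] v; }

// call sites always use madd(); only this helper changes per platform.
// On VMX this body would be a single vmaddfp; on pre-FMA SSE it expands
// to a multiply followed by an add.
Float4 madd(Float4 a, Float4 b, Float4 c)
{
    Float4 r;
    foreach (i; 0 .. 4)
        r.v[i] = a.v[i] * b.v[i] + c.v[i];
    return r;
}

void main()
{
    import std.stdio : writeln;
    auto x = madd(Float4([1f, 2f, 3f, 4f]),
                  Float4([2f, 2f, 2f, 2f]),
                  Float4([1f, 1f, 1f, 1f]));
    writeln(x.v); // [3, 5, 7, 9]
}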


And now my wish list:

If you have worked with shader programming through HLSL or CG the 
expressiveness of doing the work in SIMD is very high.  If I could write 
something that looked exactly like HLSL but it was integrated perfectly 
in a language like D or C++, it would be pretty huge to me.  The amount 
of math you can have in a line or two in HLSL is mind boggling at times, 
yet extremely intuitive and rather easy to debug.




Re: OOP, faster data layouts, compilers

2011-04-22 Thread Sean Cavanaugh

On 4/22/2011 4:41 PM, bearophile wrote:

Sean Cavanaugh:


In many ways the biggest thing I use regularly in game development that
I would lose by moving to D would be good built-in SIMD support.  The PC
compilers from MS and Intel both have intrinsic data types and
instructions that cover all the operations from SSE1 up to AVX.  The
intrinsics are nice in that the job of register allocation and
scheduling is given to the compiler and generally the code it outputs is
good enough (though it needs to be watched at times).


This is a topic quite different from the one I was talking about, but it's an 
interesting topic :-)

SIMD intrinsics look ugly, they add lot of noise to the code, and are very 
specific to one CPU, or instruction set. You can't design a clean language with 
hundreds of those. Once 256 or 512 bit registers come, you need to add new 
intrinsics and change your code to use them. This is not so good.


In C++ the intrinsics are easily wrapped by __forceinline global 
functions to provide a platform abstraction over the intrinsics.


Then, you can write class wrappers to provide the most common level of 
functionality, which boils down to a class to do vectorized math 
operators for + - * / and vectorized comparison functions == != <= >= < 
and >.  From HLSL you have to borrow the 'any' and 'all' statements 
(along with variations for every permutation of the bitmask of the test 
result) to do conditional branching for the tests.  This pretty much 
leaves swizzle/shuffle/permuting and outlying features (8,16,64 bit 
integers) in the realm of 'ugly'.


From here you could build up portable SIMD transcendental functions 
(sin, cos, pow, log, etc), and other libraries (matrix multiplication, 
inversion, quaternions etc).


I would say in D this could be faked provided the language at a minimum 
understood what a 128 bit (SSE1 through 4.2) and a 256 bit (AVX) value was and 
how to efficiently move it via registers for function calls.  Kind of 
'make it at least work in the ABI, come back to a good implementation 
later' solution.  There is some room to beat Microsoft here, as the code 
Visual Studio 2010 currently outputs for 64 bit environments cannot 
pass 128 bit SIMD values by register (forceinline functions are the only 
workaround), even though scalar 32 and 64 bit float values are passed by 
XMM register just fine.


The current hardware landscape dictates organizing your data in SIMD 
friendly manners.  Naive OOP based code is going to de-reference too 
many pointers to get to scattered data.  This makes the hardware 
prefetcher work too hard, and it wastes cache memory by only using a 
fraction of the RAM from the cache line, plus wasting 75-90% of the 
bandwidth and memory on the machine.




D array operations are probably meant to become smarter, when you perform a:

int[8] a, b, c;
a = b + c;



Now the original topic pertains to data layouts, of which SIMD, the CPU 
cache, and efficient code all inter-relate.  I would argue the above 
code is an idealistic example, as when writing SIMD code you almost 
always have to transpose or rotate one of the sets of data to work in 
parallel across the other one.  What happens when this code has to 
branch?  In SIMD land you have to test if any or all 4 lanes of SIMD 
data need to take it.  And a lot of the time the best course of action is 
to compute the other code path in addition to the first one, AND the first 
result and NAND the second one, and OR the results together to make valid 
output.  I could maybe see a functional language doing ok at this.  The 
only reasonable way to explain how common this is in optimized SIMD code 
is to compare it to HLSL's vectorized ternary 
operator (and understanding that 'a' and 'b' can be fairly intricate 
chunks of code if you are clever):


float4 a = {1,2,3,4};
float4 b = {5,6,7,8};
float4 c = {-1,0,1,2};
float4 d = {0,0,0,0};
float4 foo = (c > d) ? a : b;

results with foo = {5,6,3,4}

For a lot of algorithms the 'a' and 'b' paths have similar cost, so for 
SIMD it executes about 2x faster than the scalar case, although better 
than 2x gains are possible since using SIMD also naturally reduces or 
eliminates a ton of branching which CPUs don't really like to do due to 
their long pipelines.
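
(An illustrative sketch of my own, not from the post: the AND/NAND/OR 
select described above, emulated lane by lane on plain float[4] arrays. 
Real SIMD code would do the same thing with compare/and/andnot/or 
instructions on a single 128 bit register.)

import std.stdio;

float[4] select(const float[4] c, const float[4] d,
                const float[4] a, const float[4] b)
{
    float[4] result;
    foreach (i; 0 .. 4)
    {
        // the compare produces an all-ones or all-zeros lane mask
        const uint mask = (c[i] > d[i]) ? 0xFFFFFFFF : 0;
        const uint ai = *cast(const uint*) &a[i];
        const uint bi = *cast(const uint*) &b[i];
        const uint ri = (ai & mask) | (bi & ~mask); // AND / NAND / OR
        result[i] = *cast(const float*) &ri;
    }
    return result;
}

void main()
{
    float[4] a = [1, 2, 3, 4];
    float[4] b = [5, 6, 7, 8];
    float[4] c = [-1, 0, 1, 2];
    float[4] d = [0, 0, 0, 0];
    writeln(select(c, d, a, b)); // [5, 6, 3, 4]
}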




And as much as Intel likes to argue that a structure containing 
positions for a particle system should look like this because it makes 
their hardware benchmarks awesome, the following vertex layout is a failure:


struct ParticleVertex
{
    float[1000] XPos;
    float[1000] YPos;
    float[1000] ZPos;
}

The GPU (or Audio devices) does not consume it this way. The data is 
also not cache coherent if you are trying to read or write a single 
vertex out of the structure.


A hybrid structure which is aware of the size of a SIMD register is the 
next logical choice:


align(16)
struct ParticleVertex
{
    float[4] XPos;
    float[4] YPos;
    float[4] ZPos;
}
ParticleVertex[250] ParticleVertices;

// struct is also

Getting function argument names?

2011-04-20 Thread Sean Cavanaugh
I am working on a template class to provide function wrappers for COM 
based objects, so the calling code doesn't have to dereference the 
underlying pointer.  In C++ we get this behavior for 'free' by 
overloading operator->.  In D I can create a fairly funky mixin template 
to inspect the contained class and auto-generate a bunch of wrapper 
functions.


It's easy enough to get the function list, by looping on 
__traits(getVirtualFunctions) for a class type, and feeding the results 
into MemberFunctionsTuple!().  This needs to be followed up by repeating 
the process for each base class specified, via BaseClassesTuple!().



So the real question:

Is there any way to get to the friendly names for function arguments 
discovered through the traits system?  MemberFunctionsTuple!() only 
returns the types as far as I can tell, which technically is enough for 
me to automatically name them something boring like ArgA, ArgB, ArgC 
procedurally, but that is not a very nice thing to look at.
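
(Below is a rough sketch of my own, not from the post, of the kind of 
enumeration being described; the IBase/IThing interfaces are just 
stand-ins.  It shows that the parameter *types* come out fine, while the 
friendly parameter names are exactly the part that is not exposed.)

import std.stdio;
import std.traits : MemberFunctionsTuple, ParameterTypeTuple;

interface IBase          { int  AddRef(); }
interface IThing : IBase { void DoWork(int count, float scale); }

void describe(T)()
{
    foreach (name; __traits(allMembers, T))
    {
        foreach (overload; MemberFunctionsTuple!(T, name))
        {
            // types are available; argument names are not
            writeln(name, ParameterTypeTuple!overload.stringof);
        }
    }
}

void main()
{
    describe!IThing(); // DoWork(int, float) and AddRef()
}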



