On 22.08.2012 12:23, Józef Kucia wrote:
On Tue, Aug 21, 2012 at 10:52 PM, gurketsky <gurket...@googlemail.com> wrote:
I'd just like to present the state of the ID3DXConstantTable
implementation, so that no work is done twice. This is addressed
specifically to Józef. I'm not sure what the plan is here. Two
problems have come up, and I have not had time to sort them
out yet.
Thanks for notifying me. I was about to write some tests for structures
in constant tables. I see that you've already written such tests.
Cheers,
Józef Kucia
Well, I just had a closer look again. My speed test triggered a problem,
so the results were not really comparable. It looks like native only
allocates handles when it really needs them, so I'm not sure whether we
want to take the same approach or whether it is fine to allocate them all
up front, as I have done. After I fixed the test, the native speed
advantage disappeared. If that's fine with you, you could reuse the code,
or I could clean it up a bit and send it. I haven't improved and sent it
yet, because the choice between solution 1. and 2. isn't settled for me.
I'm a little bit against version 1., because I think speed might be an
issue, and no one has given a technical argument against 2. I don't think
an extra handle list for handles with "small" values is the way to go,
because those values may also fall in the range where strings could be in
memory. The extra handle list would be 3., but I think it needs a lot more
memory just to add a "layer" for checking the handles. Thus 2. is a lot
better than 3. (which I haven't explained in detail).
To tackle the problem:
1. The handle and table mixing could be worked around by using a global
list for all tables and searching the handles in there. That should be
easy to add. The problem I see with that is speed: when we have a lot
of handles, searching the list will be slow. Still, I'm fine with that
solution and would add it. It will show whether it really is
that slow...
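
To make option 1 concrete, here is a minimal sketch of a process-wide handle list with a linear lookup. It is not taken from the actual patch: the reduced struct ctab_constant, the global array, and the names register_handle and lookup_handle are all hypothetical and only illustrate where the O(n) search cost would come from.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, reduced stand-in for the real constant structure. */
struct ctab_constant { const char *name; };

/* One list shared by all constant tables (assumption for this sketch). */
static struct ctab_constant **handle_list;
static size_t handle_count, handle_capacity;

/* Records a constant so its handle can be recognized later.
 * Returns 1 on success, 0 if the list could not be grown. */
int register_handle(struct ctab_constant *constant)
{
    assert(constant != NULL);
    if (handle_count == handle_capacity)
    {
        size_t new_capacity = handle_capacity ? handle_capacity * 2 : 16;
        struct ctab_constant **new_list =
                realloc(handle_list, new_capacity * sizeof(*new_list));
        if (!new_list) return 0;
        handle_list = new_list;
        handle_capacity = new_capacity;
    }
    handle_list[handle_count++] = constant;
    return 1;
}

/* Linear search over every registered handle: O(n) per call.
 * Returns NULL when the value is not a known handle, in which case
 * the caller would treat it as a name string. */
struct ctab_constant *lookup_handle(const void *handle)
{
    size_t i;

    for (i = 0; i < handle_count; ++i)
        if ((const void *)handle_list[i] == handle) return handle_list[i];
    return NULL;
}
```

Every handle check walks the whole list, so the cost grows with the total number of constants across all tables. That is the speed concern raised above.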
2. But the other solution isn't dead for me yet. I had another look at
the D3DXHANDLE usage and at the question of what
D3DXCONSTTABLE_LARGEADDRESSAWARE is actually used for. It was said to be
bad and broken, but I haven't seen why. I can see that it is ugly, but I
couldn't find a technical reason not to do it. What specifically is the
problem?
The argument is that D3DXHANDLEs are distinguished from strings by
the highest bit (bit #31). Thus, with LARGEADDRESSAWARE, the use
of strings as D3DXHANDLEs is not allowed anymore (see
http://msdn.microsoft.com/en-us/library/windows/desktop/bb943959%28v=vs.85%29.aspx).
This would also speed up the detection of a handle dramatically. It
doesn't check for validity, but native doesn't do that either: if you
pass a garbled handle, it will crash.
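    D3DXHANDLE handle_from_constant(struct ctab_constant *constant)
    {
        /* With LARGEADDRESSAWARE strings are not allowed as handles,
         * so the pointer can be used unchanged. */
        if (largeadressaware && constant) return (D3DXHANDLE)constant;
        /* Otherwise set bit 31 to mark the value as a handle. */
        if (constant) return (D3DXHANDLE)((UINT_PTR)constant | 0x80000000);
        return NULL;
    }

    struct ctab_constant *is_valid_constant(struct ID3DXConstantTableImpl *table,
            D3DXHANDLE handle)
    {
        if (largeadressaware) return (struct ctab_constant *)handle;
        /* Bit 31 set: a tagged handle, so strip the tag again. This mask
         * only preserves addresses that fit in the lower 31 bits. */
        if ((UINT_PTR)handle >> 31)
            return (struct ctab_constant *)((UINT_PTR)handle & 0x7fffffff);
        /* Bit 31 clear: treat the value as a name string. */
        return get_constant_by_name(table, NULL, handle);
    }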
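
The tagging used by the two helpers above can be tried in isolation. The following is only a sketch of the bit arithmetic, modelling 32-bit addresses as uint32_t values so that it behaves the same on any host. HANDLE_TAG, tag_address, is_tagged_handle and untag_handle are hypothetical names and not part of d3dx9.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 31 marks a value as a handle rather than a string pointer. */
#define HANDLE_TAG UINT32_C(0x80000000)

/* Turns a 31-bit address into a tagged handle value. */
uint32_t tag_address(uint32_t address)
{
    /* Only valid while the address itself lies below 2 GB. */
    assert((address & HANDLE_TAG) == 0);
    return address | HANDLE_TAG;
}

/* Returns 1 if bit 31 is set, i.e. the value is a tagged handle. */
int is_tagged_handle(uint32_t value)
{
    return (value >> 31) != 0;
}

/* Strips the tag and recovers the original 31-bit address. */
uint32_t untag_handle(uint32_t handle)
{
    return handle & UINT32_C(0x7fffffff);
}
```

The round trip only works while every real address has bit 31 clear, which is why the scheme depends on the process being limited to the lower 2 GB.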
According to http://en.wikipedia.org/wiki/Virtual_address_space:
32-bit on 32-bit without LARGEADDRESSAWARE: 2 GB (default for 32-bit)
32-bit on 64-bit without LARGEADDRESSAWARE: 2 GB (default for 32-bit)
64-bit on 64-bit without LARGEADDRESSAWARE: 2 GB
32-bit on 32-bit with LARGEADDRESSAWARE: 3 GB
32-bit on 64-bit with LARGEADDRESSAWARE: 4 GB
64-bit on 64-bit with LARGEADDRESSAWARE: 8 TB (default for 64-bit)
So in cases where the exe is linked with LARGEADDRESSAWARE, d3dx9 would
have to be used with D3DXCONSTTABLE_LARGEADDRESSAWARE. That way it's the
same for 32-bit and 64-bit operating systems. The only problem I see is
that nowhere is it stated that the 2 GB will always be the lowest 2 GB.
But in my test runs, allocating memory always gave me addresses that fit
in the lower 31 bits. So either I've been very unlucky in never getting a
higher address, or this is simply the way it works on Windows. Does anyone
have a technical argument against this solution?
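
A check along these lines could look like the sketch below. It is not the test I actually ran: it uses plain malloc rather than any Windows-specific allocator, and the names fits_in_low_2gb and count_high_allocations are hypothetical. It only counts how many allocations land at or above the 2 GB line.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 if the address fits in 31 bits (bit 31 and above clear). */
int fits_in_low_2gb(const void *address)
{
    return ((uintptr_t)address >> 31) == 0;
}

/* Allocates `count` blocks of `size` bytes, keeps them alive together,
 * and returns how many of them ended up at or above 2 GB. */
size_t count_high_allocations(size_t count, size_t size)
{
    size_t i, high = 0;
    void **blocks;

    if (!count) return 0;
    blocks = calloc(count, sizeof(*blocks));
    assert(blocks != NULL);
    for (i = 0; i < count; ++i)
    {
        blocks[i] = malloc(size);
        if (blocks[i] && !fits_in_low_2gb(blocks[i])) ++high;
    }
    for (i = 0; i < count; ++i) free(blocks[i]);
    free(blocks);
    return high;
}
```

A result of zero in such a run does not prove that high addresses can never occur. It only shows that none were seen for that allocation pattern.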
I hope this helps you to make the correct decisions.
Cheers
Rico