Hello Thilo,

I just saw your commit 2077 and I think using Hunk_AllocateTempMemory for
network queues will not work. Hunk_FreeTempMemory will only free memory if
you free the memory in LIFO order, otherwise it will just keep the memory
allocated until Hunk_ClearTempMemory is called.
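
To illustrate the LIFO point, here is a minimal sketch of a stack-style temp
allocator. It is a toy model and not the ioquake3 code: temp_pool_t,
temp_alloc, temp_free and temp_clear are made-up names, and the real Hunk_*
functions differ in detail. It only shows why an out-of-order free releases
nothing.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_WORDS 256

/* Hypothetical toy pool: a stack of blocks, each preceded by a one-word
   header that stores the block size in words (header included). */
typedef struct {
    size_t words[POOL_WORDS]; /* backing storage */
    size_t top;               /* index of the first unused word */
} temp_pool_t;

static void *temp_alloc(temp_pool_t *p, size_t bytes)
{
    /* one header word plus the payload rounded up to whole words */
    size_t n = 1 + (bytes + sizeof(size_t) - 1) / sizeof(size_t);
    size_t *hdr;

    if (n > POOL_WORDS - p->top)
        return NULL;          /* pool exhausted */
    hdr = &p->words[p->top];
    *hdr = n;
    p->top += n;
    return hdr + 1;           /* payload starts right after the header */
}

/* Returns 1 if the memory was really released, 0 if the block was not the
   topmost one and therefore stays reserved until temp_clear(). */
static int temp_free(temp_pool_t *p, void *ptr)
{
    size_t *hdr = (size_t *)ptr - 1;

    assert(hdr >= p->words && hdr < p->words + p->top);
    if (hdr + *hdr == p->words + p->top) {
        p->top -= *hdr;       /* block is on top of the stack: pop it */
        return 1;
    }
    return 0;                 /* not the final block: nothing is freed */
}

/* Drops every temp block at once, whether or not it is still in use. */
static void temp_clear(temp_pool_t *p)
{
    p->top = 0;
}
```

If you allocate a, then b, and free a first, temp_free returns 0 and the
pool stays exactly as full as before. A network queue frees its oldest
packet first, so it hits this case on every packet.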

Hunk_ClearTempMemory can be called any time from FS_FreeFile when the number
of open files is zero, so you cannot be sure that your memory will stay
valid over a few frames. For example, the QVM could call RegisterShader, the
renderer then opens and closes a jpg file, and all temp memory is gone!

Hunk_AllocateTempMemory/Hunk_FreeTempMemory are great to allocate some work
memory in a function that you're going to free when you leave the function
(like alloca), but they're no replacement for malloc/free.
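
For data that has to survive across frames, a plain heap-backed FIFO is the
kind of thing I have in mind. The following is only a sketch under my own
assumptions: packet_t, packet_queue_t, queue_push and queue_pop are
hypothetical names, not code from the tree, and the engine might well prefer
its own zone allocator over malloc.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical queued packet: header plus payload in one heap block. */
typedef struct packet_s {
    struct packet_s *next;
    size_t length;
    unsigned char data[];     /* C99 flexible array member */
} packet_t;

typedef struct {
    packet_t *head;           /* oldest packet, sent first */
    packet_t *tail;           /* newest packet */
} packet_queue_t;

/* Copies the payload into a new heap block and appends it.
   Returns 1 on success, 0 if malloc failed. */
static int queue_push(packet_queue_t *q, const void *data, size_t length)
{
    packet_t *pkt;

    assert(q != NULL && data != NULL);
    pkt = malloc(sizeof(*pkt) + length);
    if (pkt == NULL)
        return 0;
    pkt->next = NULL;
    pkt->length = length;
    memcpy(pkt->data, data, length);
    if (q->tail != NULL)
        q->tail->next = pkt;
    else
        q->head = pkt;
    q->tail = pkt;
    return 1;
}

/* Removes the oldest packet, copies at most cap bytes of it to out and
   frees it. Returns the number of bytes copied, 0 if the queue is empty. */
static size_t queue_pop(packet_queue_t *q, void *out, size_t cap)
{
    packet_t *pkt;
    size_t n;

    assert(q != NULL && out != NULL);
    pkt = q->head;
    if (pkt == NULL)
        return 0;
    q->head = pkt->next;
    if (q->head == NULL)
        q->tail = NULL;
    n = pkt->length < cap ? pkt->length : cap;
    memcpy(out, pkt->data, n);
    free(pkt);                /* freed in FIFO order, which malloc handles fine */
    return n;
}
```

The blocks are released oldest-first, the exact opposite of the order a
stack allocator wants, and nothing else in the engine can clear them behind
your back.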


Also regarding rev 2052, I think that changing the default rounding mode
from round-to-nearest to round-to-zero should be avoided, as it is less
precise and round-to-nearest is in fact what all libraries expect. The only
function which has round-to-zero semantics by the spec is ftol, and the
introduction of Q_ftol indicates to me that speed was more important to id
than standards compliance. Q_ftol is only used in places where precision is
not important (e.g. converting colors from float to byte), but the default
rounding mode affects every fp operation, so it could influence the game
physics etc.
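
To make the difference concrete, here is a small sketch (x86 with SSE only)
that converts the same float under both MXCSR rounding modes. The helper
name convert_with_mode is made up for this mail; the intrinsics and macros
come from xmmintrin.h.

```c
#include <assert.h>
#include <xmmintrin.h>  /* SSE: _mm_cvtss_si32, _MM_SET_ROUNDING_MODE */

/* Hypothetical helper: converts x to int under the given MXCSR rounding
   mode (_MM_ROUND_NEAREST or _MM_ROUND_TOWARD_ZERO), then restores the
   mode that was active before the call. */
static int convert_with_mode(float x, unsigned int mode)
{
    unsigned int saved = _MM_GET_ROUNDING_MODE();
    volatile float in = x;  /* volatile keeps the compiler from folding */
    volatile int out;       /* the conversion at compile time           */

    assert(mode == _MM_ROUND_NEAREST || mode == _MM_ROUND_TOWARD_ZERO);
    _MM_SET_ROUNDING_MODE(mode);          /* first MXCSR write */
    out = _mm_cvtss_si32(_mm_set_ss(in)); /* cvtss2si uses the MXCSR mode */
    _MM_SET_ROUNDING_MODE(saved);         /* second MXCSR write to restore */
    return out;
}
```

With this, 2.7f converts to 3 under round-to-nearest and to 2 under
round-to-zero. Note that the helper needs two control register writes per
call, which is the cost I would like to avoid in the first place.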

This is especially bad as you have to switch the mxcsr back to
round-to-nearest for the snapvector function. Control register changes are
very expensive and should be avoided.

Speaking of snapvector, I also would avoid the maskmovdqu instruction as it
writes directly to memory, bypassing the cache. When you re-read the vector
later you will always encounter a cache miss, so I think it would be better
to write the result back with movups, preserving the original 4th value:

qsnapvectorsse PROC
  movaps xmm1, ssemask  ; initialize the mask register
  movups xmm0, [rcx]    ; here is stored our vector. Read 4 values in one go
  movaps xmm2, xmm0     ; keep a copy of the original data
  andps xmm0, xmm1      ; set the fourth value to zero in xmm0
  andnps xmm1, xmm2     ; xmm1 = NOT mask AND original: only the 4th value is left
  cvtps2dq xmm0, xmm0   ; convert 4 single fp to int (uses the MXCSR rounding mode)
  cvtdq2ps xmm0, xmm0   ; convert 4 int to single fp
  orps xmm0, xmm1       ; combine all 4 values again
  movups [rcx], xmm0    ; write 3 rounded and 1 unchanged values back to memory
  ret
qsnapvectorsse ENDP

(assuming the global rounding mode is round-to-nearest, of course).
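
For compilers without MASM, roughly the same sequence can be written with
SSE2 intrinsics. This is a sketch and not tested against the engine:
snap_vector_sse is a made-up name, and I am assuming ssemask has all bits
set in the first three lanes and zero in the fourth.

```c
#include <assert.h>
#include <stddef.h>
#include <emmintrin.h>  /* SSE2: _mm_cvtps_epi32, _mm_cvtepi32_ps */

/* Hypothetical C counterpart of qsnapvectorsse. Rounds vec[0..2] to whole
   numbers using the current MXCSR rounding mode and leaves vec[3] alone.
   Like the assembly, it reads and writes four floats, so vec must point
   to at least four of them. */
static void snap_vector_sse(float *vec)
{
    /* _mm_set_epi32 takes the highest lane first: lane 3 = 0, lanes 0..2 = ~0 */
    const __m128 mask = _mm_castsi128_ps(_mm_set_epi32(0, -1, -1, -1));
    __m128 v, keep, xyz;

    assert(vec != NULL);
    v    = _mm_loadu_ps(vec);                     /* movups: unaligned load    */
    keep = _mm_andnot_ps(mask, v);                /* andnps: original 4th only */
    xyz  = _mm_and_ps(v, mask);                   /* andps: zero the 4th lane  */
    xyz  = _mm_cvtepi32_ps(_mm_cvtps_epi32(xyz)); /* cvtps2dq, then cvtdq2ps   */
    _mm_storeu_ps(vec, _mm_or_ps(xyz, keep));     /* orps + movups: write back */
}
```

The store goes through the cache like any other write, so a later read of
the vector should not miss because of it.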