Hi,

A long time ago I implemented a Gammatone filterbank using CUDA on NVIDIA GPUs. I had to use the double type due to filter stability problems at low frequencies (below ~200 Hz). However, at that time I did not know that CUDA offers quite fine-grained control over rounding behavior. Currently CUDA (I do not know how it is with OpenCL) provides the following features:

1. AFAIK, GPU kernels (= functions you execute on the GPU) do not throw any exceptions for misbehaving floating point representations. So your code would rather produce garbage than crash, unless it fails for some other reason (e.g. wrong memory handling, race conditions).

2. You can decide with compiler flags whether to generate code that allows more precision or faster execution. E.g. the -ftz flag drives the handling of denormalized floats: --ftz=true flushes denormal values to zero, --ftz=false preserves denormal values.

3. Inside GPU kernels you can use special mathematical intrinsics, and most of them come with multiple suffixes. E.g. __fadd_rn() does a floating point addition with rounding to the nearest even value, while __fadd_rz() rounds the result towards zero. (See the sketch below.)

4. You can program it using the ANSI C language (if you like to suffer ;) ) or a somewhat limited C++ (but not as limited as e.g. embedded C++; the newest CUDA allows even C++20, again with some restrictions).

So a GPU can be treated as thousands of multithreaded DSPs running at once. Of course, CUDA programming is definitely not trivial, but IMHO it answers the question posted in the subject of this thread positively.
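To make point 3 concrete, here is a minimal sketch (not my original filterbank; names and values are purely illustrative) that runs the same addition through the two rounding-mode intrinsics:

    // Minimal CUDA sketch: one addition under two explicit rounding modes.
    // Build with nvcc; add --ftz=true or --ftz=false to steer denormals.
    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void rounding_demo(const float *a, const float *b,
                                  float *rn, float *rz)
    {
        // One thread suffices for the demo; real kernels index by thread id.
        rn[0] = __fadd_rn(a[0], b[0]);  // round to nearest even
        rz[0] = __fadd_rz(a[0], b[0]);  // round towards zero
    }

    int main()
    {
        float *a, *b, *rn, *rz;
        cudaMallocManaged(&a,  sizeof(float));
        cudaMallocManaged(&b,  sizeof(float));
        cudaMallocManaged(&rn, sizeof(float));
        cudaMallocManaged(&rz, sizeof(float));
        *a = 1.0f; *b = 1e-7f;           // exact sum lies between two floats
        rounding_demo<<<1, 1>>>(a, b, rn, rz);
        cudaDeviceSynchronize();
        std::printf("rn=%.9g rz=%.9g\n", *rn, *rz);  // 1.00000012 vs 1
        return 0;
    }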

Cheers,

Piotr H.


On 27.04.2023 06:23, Ethan Duni wrote:
The only place I’ve seen arbitrary precision arithmetic used in audio is in 
coding. If you’re doing arithmetic coding or FPC or similar, you need to do 
arithmetic operations on words of large, potentially variable size (hundreds of 
bits, say). This is still fixed point, and is not really in the audio 
processing part - it’s about converting between vectors of quantized values and 
a packed bitstream representation.

I guess for general audio DSP it doesn’t work if you have any feedback paths. 
Because the precision will grow without bound, and so your process will slow to 
a crawl and then run out of memory and crash. Even a simple one pole smoother 
will explode in a matter of seconds.
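A minimal sketch of that blow-up (toy numbers, not Ethan's): run y[n] = (y[n-1] + x[n]) / 2 exactly, i.e. a smoothing coefficient of 1/2 with a unit-step input. The exact state after n samples is (2^n - 1)/2^n, so the representation gains one bit per sample, forever:

    // Exact one-pole smoother over rationals: y = num / 2^shift.
    // The denominator doubles every sample, so at 48 kHz the state
    // grows by 48000 bits per second and never stops.
    #include <cstdio>

    int main()
    {
        unsigned long long num = 0;  // numerator, kept exactly
        unsigned shift = 0;          // denominator is 2^shift
        for (int n = 1; n <= 64; ++n) {
            num += 1ULL << shift;    // y + x[n] over the common denominator
            ++shift;                 // dividing by 2 exactly doubles the denominator
            if (shift == 64) {
                std::printf("64-bit state exhausted after %d samples\n", n);
                break;
            }
        }
        return 0;
    }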

So you are forced to limit the precision to some reasonable value, in which 
case why not just use double and be done with it.

Ethan D


On Apr 26, 2023, at 1:41 AM, Andy Farnell <padawa...@obiwannabe.co.uk> wrote:

I'm also very much enjoying this thread and realise that the focus is
on microprocessor ALU behaviour.

But to widen it a little, let's remember that compilers and languages
are in many senses inseparable from silicon in modern computing.

Returning to the spirit of the OP - that surely in 2023 we shouldn't
have to worry about denormals - why aren't there coding workflows that
make the problem disappear? Our position is roughly: floats are easy
to use, but there's no getting around problems at the small end of
the precision range, whereas fixed-point arithmetic needs more careful
code and fails hideously at the large end.
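As a hedged illustration of that asymmetry (an editor's example, not part of Andy's post): the float version of an add is one line, while fixed point must guard the large end by hand or it wraps:

    // Q1.31 saturating add: the "more careful code" fixed point demands.
    #include <cstdint>
    #include <cstdio>

    static int32_t sat_add_q31(int32_t a, int32_t b)
    {
        int64_t s = (int64_t)a + b;           // widen so overflow is visible
        if (s > INT32_MAX) return INT32_MAX;  // clamp the large end
        if (s < INT32_MIN) return INT32_MIN;
        return (int32_t)s;
    }

    int main()
    {
        int32_t a = INT32_MAX, b = 1;         // just past full scale
        int32_t wrapped = (int32_t)((uint32_t)a + (uint32_t)b);
        std::printf("wrapped: %d  saturated: %d\n", wrapped, sat_add_q31(a, b));
        return 0;
    }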

Given the premise that we don't have to worry about efficiency any
longer, what's missing in my mind is a third option:
arbitrary-precision arithmetic, as is common to uLISP, Scheme etc.

Audio signals, as per reverbs, filters and exponentially decaying
envelopes, tend to zero, which is where the denormal problem pops up,
and it remains for all time until interrupted. Other than an accidental
divide by <bignum>, most signals are transiently out of range for only
a very short time.
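A minimal sketch of that failure mode, with an assumed decay coefficient of 0.999 (the workaround in the last comment is a common trick, not Andy's prescription):

    // An exponentially decaying envelope parks in the denormal range after
    // roughly 87000 samples (~1.8 s at 48 kHz) and stays there; on many x86
    // CPUs every one of those iterations then takes a microcoded slow path.
    #include <cmath>
    #include <cstdio>

    int main()
    {
        float y = 1.0f;
        long n = 0;
        while (y != 0.0f && std::fpclassify(y) != FP_SUBNORMAL) {
            y *= 0.999f;                 // plain one-pole decay
            ++n;
        }
        std::printf("denormal after %ld samples (~%.1f s at 48 kHz)\n",
                    n, n / 48000.0);
        // One common fix: bias the state with a constant far below audibility,
        // e.g. y = 0.999f * y + 1e-20f;  (1e-20f is itself a normal float)
        return 0;
    }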

From a computing point-of-view, having absorbed the initial overhead
of variable-length bignum arithmetic, might audio DSP not benefit
greatly, given that it would seem to prefer the statistically rare and
occasional very large values over the more commonplace runs
approaching zero?

Anyone know of successful arbitrary-precision work in audio DSP or is
the whole conceit misguided for reasons I have missed?

cheers,
a.





On Mon, Apr 24, 2023 at 08:51:35PM +0300, Sampo Syreeni wrote:
On 2023-04-08, robert bristow-johnson wrote:

> Listen, people here that know me from the 1990s, know that I was a staunch
> fixed-point advocate.
And people don't know me. For a reason: I'm not a practitioner, but a
theoretical amateur in the field. You have every reason to chastise me.
However, about floats, fixpoint and denormalisation...

It's just easier and mathematically simpler to work in fixpoint. Earlier you
just couldn't have the range for audio work you needed, so there had to be
floats, A/μ-laws, and whatnot. But now you don't need them. You definitely
don't need them when discussing 64-bit arithmetic; as I said, even the usual
32-bit C float gives you a linear range of 24 bits, signed, which is more
than enough.
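A quick sketch checking that claim: every signed 24-bit integer round-trips through a 32-bit float exactly, so within that range the float grid is as linear as fixed point:

    // Verify: all integers in [-2^23, 2^23] survive a round trip through
    // float unchanged (float carries a 24-bit significand).
    #include <cstdio>

    int main()
    {
        long bad = 0;
        for (long i = -(1L << 23); i <= (1L << 23); ++i)
            if ((long)(float)i != i)
                ++bad;
        std::printf("round-trip failures in [-2^23, 2^23]: %ld\n", bad);  // 0
        return 0;
    }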

I do understand where floats and denormals come from. They're half about
numerical analysis, and half about ease of software development. If you want
to deal with quantities which range wildly over orders of magnitude, and
whose inexactitude is relative, not absolute, you'll want floats. There's
really no substitute for floats there. Then, if you want your numerical
algorithm not to underflow, in many cases you'd want your floats to
denormalise — which is to say, suddenly behave linearly, the way fixpoint
does, rather than exponentially, the way floats otherwise do.
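A small sketch of that sudden linearity: above FLT_MIN the gap between neighbouring floats scales with magnitude, but at and below it the gap freezes at 2^-149, a uniform fixed-point-like grid:

    // Print the distance to the next float up at a few magnitudes. The step
    // shrinks proportionally down to FLT_MIN, then stays constant (2^-149)
    // throughout the denormal range.
    #include <cfloat>
    #include <cmath>
    #include <cstdio>

    int main()
    {
        const float pts[] = { 1.0f, FLT_MIN * 16, FLT_MIN,
                              FLT_MIN / 4, FLT_MIN / 64 };
        for (float x : pts) {
            float gap = std::nextafterf(x, 2.0f * x + 1.0f) - x;
            std::printf("near %.8e the grid step is %.8e\n", x, gap);
        }
        return 0;
    }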

Since we're talking on a sound-minded group, I perhaps should remind you of
"the gain structure". How analogue studios controlled their noise.

I believe the choice of digital gauge is very much the same as that one. If
you do it wrong, on the analogue side you'll be left with unbearable noise.
On the digital side, you'll be left with digital rounding noise. But if you
control your gain structure right, especially within nowadays rather wide
24-bit architecture, you really don't even have to think about it too much.
It mostly just works out.

(I'd actually say, calibrate your studio absolutely, like them movie people
do. Done so, a 24-bit linear stream goes below perceptual thresholds when
quietest, and exceeds the threshold of pain at max. So it linearly covers
the whole range, and the representation can be worked with as wholly linear
— except that nobody currently does so. Not even the movie people; even they
mix in relative amplitude and only then set the final absolute calibration.
I think that's thoroughly stupid; the one little wing of our important audio
work which actually *has* a set amplitude reference, chooses not to utilize
it.)

> If given an assignment of developing an audio processing system using
> fixed-point math, I will not shrink away from the challenge, but **if**
> the project is "Hey we got this 64-bit ARM with FPU in it and gobs of
> memory", I don't want my code to be checking for saturation and "minding
> the gain structure". Fuck no.
You don't have to mind saturation if your gain structure is well thought
out; if you know the limits and averages of your input signals, and scale
your sums appropriately. It's not even hard. Even for me, as a rank amateur.
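A minimal sketch of that kind of budgeting, with assumed numbers: mixing at most N full-scale channels needs ceil(log2 N) bits of headroom, which can be decided once, up front:

    // Mix 8 worst-case (full scale) Q1.31 inputs. Pre-shifting each term by
    // 3 bits (= ceil(log2 8)) guarantees the accumulator can never overflow.
    #include <cstdint>
    #include <cstdio>

    int main()
    {
        const int N = 8, headroom = 3;       // known channel count, log2(N) bits
        int32_t acc = 0;
        for (int i = 0; i < N; ++i) {
            int32_t x = INT32_MAX;           // worst case: every input at +1.0
            acc += x >> headroom;            // each term <= 2^31/8: safe sum
        }
        std::printf("worst-case mix: %d (INT32_MAX = %d)\n", acc, INT32_MAX);
        return 0;
    }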

What's really hard is controlling nonlinearity in your signal processing
algorithm. What's doubly hard is controlling the semi-logarithmic tendency
floats obviously have, *and* at the same time the linear ramp denormals
produce. Floats and denormals *do* make it easier for the average chump to
churn out his average numerical code, but once we go into numerical
analysis, signal processing, for real, as this list is ostensibly about,
that mix of linear and semi-logarithmic is a horror. You don't want to deal
with it; if you don't think it's a horror, then you haven't tried to deal
with it to begin with.

My favourite example here is dithering. There's no known closed form
solution for how to do any of it given denormals. You can do some of it
using floats, but only after approximating them via a true exponential.
Denormals are impossible to analyze together with floats; you can't easily
do mathematics in the linear and the multiplicative domains at the same
time. Especially when you cut off the regime arbitrarily, at your lowest
float bit depth; that's yet another arbitrary nonlinearity in your
analysis, right there.

> But, if my tool is a 64-bit processor that can do 64x64 to 64-bit result
> in the same nanosecond instruction cycle as anything else (like 32-bit
> fixed-point processing), why would I toss that headroom and legroom away?

But it can't, basically because of carry propagation in digital circuitry.
It is impossible to do sums or products in O(1) circuit depth: the carry
has to ripple, or be looked ahead, across the whole word. This means that
32-bit arithmetic will always, in the end, be more efficient than its
64-bit bigger brother. That is, when it applies; if you really need 64
bits, then the hardware often helps you. But if the underlying algorithm
can be parallelized onto two 32-bit ALUs, the narrower bitwidth will
necessarily win out.
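A sketch of that serial dependency: a 64-bit add expressed as two 32-bit adds, where the high half cannot be computed until the low half's carry is known:

    // Split a 64-bit add into two 32-bit adds. The carry out of the low
    // half feeds the high half -- the propagation that keeps wide adders
    // from being O(1) deep.
    #include <cstdint>
    #include <cstdio>

    int main()
    {
        uint64_t a = 0x00000001FFFFFFFFULL, b = 1;
        uint32_t lo = (uint32_t)a + (uint32_t)b;       // low halves first
        uint32_t carry = lo < (uint32_t)a;             // did the low add wrap?
        uint32_t hi = (uint32_t)(a >> 32)
                    + (uint32_t)(b >> 32) + carry;     // high half waits on carry
        uint64_t sum = ((uint64_t)hi << 32) | lo;
        std::printf("%llx == %llx\n",
                    (unsigned long long)sum,
                    (unsigned long long)(a + b));      // both print 200000000
        return 0;
    }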


> It's only when a **final** sample value is getting output, that I should
> need to worry about gain, saturation, quantization, and noise-shaping. I
> shouldn't have to worry about it anywhere else. Not if I'm using a 64-bit
> ARM.

I beg to differ. My ideal is that whatever you bring into your audio
processing chain is absolutely referenced. It has an absolute decibel level
attached to it. Like them movie people attempt to do it.

If you work like that, and your gain structure follows, there's no
saturation or quantization or anything like that, anywhere, ever. You can
and *will* work within the 24-bit linear range of even a 32-bit float,
because 1) going lower than 1-bit would be unhearable, and 2) going to the
full 24 bits would literally split your ears. Then between those wide
limits, you have full linearity, which helps you produce better and more
stable algorithms.
--
Sampo Syreeni, aka decoy 
de...@iki.fi, http://decoy.iki.fi/front
+358-40-3751464, 025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
