I run into denormal issues all the time with 32-bit floats. What I do is 
introduce an epsilon “noise floor” below which I alias the signal to 0. 
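In code, a minimal sketch of that noise-floor idea (the threshold value here is 
illustrative, not a universal constant):

```python
# Hedged sketch: flush any sample whose magnitude falls below an
# epsilon "noise floor" to exactly 0.0, so feedback paths never
# keep circulating denormal (subnormal) values.
DENORM_FLOOR = 1.0e-20  # illustrative; well above float32 subnormals (~1e-38)

def flush_denormals(x: float) -> float:
    """Alias values below the noise floor to zero."""
    return 0.0 if abs(x) < DENORM_FLOOR else x
```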

I’m told that this approach amounts to a kind of dithering, because the input 
values are more or less randomly distributed.



> On May 10, 2023, at 7:42 PM, Sampo Syreeni <de...@iki.fi> wrote:
> 
> On 2023-04-24, robert bristow-johnson wrote:
> 
>>> It's just easier and mathematically simpler to work in fixpoint.
>> 
>> Whoa! That's very interesting! Seems to me that the common sentiment was to 
>> the contrary. With floating point you don't have to worry about scaling and 
>> trading off headroom with quantization noise floor.
> 
> This is two-pronged. If you "just want it to work, as if it was reals", then 
> floats are easier. Indeed with denormals included too. But if you *really* 
> want it to work *exactly* to the limit, and you know what you're doing, the 
> linearity of fixpoint saves you a lot of trouble. In, say, the numerical 
> stability analysis of your filters, and in the latency and bother possibly 
> coming from underflow exceptions.
> 
> When analysed to the hilt to begin with, fixpoint is just far simpler. It's 
> much more regular than floating point. When properly dithered, it's more or 
> less linear, which floating point is not, and can't be made so in any known 
> way. You can shove the conventional LTI theory at fixpoint even in filter 
> topology, whereas with floats, especially with denormals, you cannot.
> 
> The basic example of this is a slowly, exponentially decaying reverb tail. 
> Something like that is a numerical nightmare in float arithmetic.
> Your sound will inevitably decay into the denormalised range, as the typical 
> case arising from zero input. Then you'll have to take in louder sounds, so 
> you're suddenly forced to sum denormals with whatever is louder. Fuck. This is 
> then the typical thing in music, with any kind of decent dynamic range, and 
> pauses.
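The decay scenario is easy to make concrete with a toy one-pole feedback run at 
float32 precision (the gain value is an assumption for illustration; `f32` just 
rounds through IEEE-754 binary32):

```python
import struct

def f32(x: float) -> float:
    """Round a Python float to the nearest IEEE-754 binary32 value."""
    return struct.unpack('<f', struct.pack('<f', x))[0]

F32_TINY = 2.0 ** -126  # smallest *normal* float32

# y[n] = g * y[n-1] with zero input: an exponentially decaying,
# reverb-like tail. Computed at float32 precision, the state falls
# through the normal range and lands in the subnormal (denormal)
# range, where many CPUs take a large speed hit.
g = 0.999
y = 1.0
while y >= F32_TINY:
    y = f32(g * y)
# y is now nonzero but subnormal: 0.0 < y < F32_TINY
```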
> 
> In fixpoint of sufficient width, you just dither everything on input, mind 
> your gain structure, and let your filters decay down into the noise floor. In 
> the stochastically linear fashion that theory guarantees.
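As a sketch of "dither everything on input" (assuming the classic additive TPDF 
scheme; the step size is a parameter you'd set to one LSB of your fixed-point 
format):

```python
import random

def quantize_tpdf(x: float, step: float) -> float:
    """Round x to a fixed-point grid of the given step, with additive
    TPDF dither: the difference of two independent uniforms gives the
    triangular PDF that decorrelates the error from the signal."""
    dither = (random.random() - random.random()) * step  # in (-step, +step)
    return round((x + dither) / step) * step
```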
> 
> Obviously it's much more difficult than this in the end. Using the easiest 
> additive TPDF dither we now typically use, you'll be adding noise at every 
> step of the way. It adds up, so intermediate representations in fixpoint 
> might need *lots* more precision than 24 bits. Doing complex filter 
> topologies, you'd theoretically need to add noise at every step of the way, if you 
> can't prove every step of the way scales the noise down. Which you usually 
> can't do, or won't have the knowhow to show. Also, if you do subtractive 
> dither — the ideal, and my favourite — no general theory exists of how to use 
> it within entire processing topologies.
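The "it adds up" point can be made quantitative under the standard 
additive-TPDF noise model (the variances below are the textbook figures; the 
stage counts are illustrative):

```python
import math

def accumulated_noise_rms(step: float, n_stages: int) -> float:
    """Each TPDF-dithered requantization contributes error variance
    step^2/4 (step^2/12 from the quantizer plus step^2/6 from the
    dither); n independent stages sum their variances."""
    return math.sqrt(n_stages * step**2 / 4.0)
```

For example, sixteen requantizations at a 24-bit step raise the RMS noise floor 
by a factor of four over a single stage, i.e. 12 dB: quickly eating into those 
24 bits.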
> 
> So maybe you'd have to go even towards the 64-bit range. Really wide. 
> Especially since audio editing software and editing practice has been going 
> towards stupendously many simultaneously sounding little clips of sound, 
> summed together. There the background noise compounds not only from 
> individual sources, but from all of the processing applied to them. If you do 
> the math in absolute amplitude like I like to do, you can see it really does 
> compound, about inverse-quadratically in the number of sources (also: 
> overlaying edits), and quadratically in the number of connections in FIR 
> filters and their internal connections. In IIR work, much faster even, and 
> there you can't even linearize too well via dithering, so that your filter 
> topology easily ends up nonlinear. Cf. 
> https://timbreluces.com/assets/sacd.pdf , where the analysis generalises 
> from just delta-sigma ADCs to everything 
> happening within your digital filter.
> 
> Then it's worse when doing floats. Because they're semi-logarithmic. You 
> can't do optimal dithering with them. Especially you can't do it with them 
> through any network of basic LTI signal processing operations. So you'll be 
> left with an untenable mess of nonlinearity.
> 
> The tome I learnt my DSP-fu from was Alan V. Oppenheim's "Digital Signal 
> Processing", derived from his dissertation. In that there were plenty of 
> interesting and useful ideas, ranging from the unification of continuous and 
> discrete time Fourier theory, to finally even homomorphic signal processing 
> as a novel idea. The third quarter of the treatise also gives a principled 
> treatment of certain nonlinear aspects of DSP, such as limit cycles and dead 
> bands.
> 
> That's the stuff that matters, here. How linear the notionally linear digital 
> circuits we build actually are. What can be done to linearize them further. 
> How compositional, and compositionally linear, they can really be.
> 
> Because just as an example, consider a least-significant-bit worth of 
> positive bias coming from a 16-bit ADC, into a signal chain handling basic 
> 32-bit floats. Unless every stage of your filter topology is mathematically 
> guaranteed to attenuate DC fast enough, that bias/DC will propagate into the 
> next stage of the filter, and in a recirculating IIR topology, might break 
> numerical stability. Sooner or later it would cross from just affecting the 
> mantissa to tipping over into the next value of the exponent, in the 
> float representation. Which is then highly nonlinear.
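To make the bias propagation concrete, a sketch with an assumed one-pole 
lowpass (the pole value is illustrative): at DC the stage has unity gain, so 
the 1-LSB offset passes through undiminished into whatever follows.

```python
# One LSB of positive bias from a 16-bit ADC, at +/-1.0 full scale.
LSB_16 = 1.0 / 2**15

# One-pole lowpass y[n] = a*y[n-1] + (1-a)*x[n], unity gain at DC.
a = 0.995  # illustrative pole
y = 0.0
for _ in range(10_000):
    y = a * y + (1.0 - a) * LSB_16
# The state converges to the full bias: DC is not attenuated, and
# the offset propagates unchanged into the next stage.
```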
> 
> Then, when that happens, you can often hear the transition. It's typically 
> low level, but it can still be heard. It sounds like an aliasing transition, 
> with *all* of the digitally, aliasingly induced "metallic" harmonics being 
> induced at the same time, transiently "for no apparent reason".
> 
> This mostly doesn't happen in fixpoint when you know what you're doing. 
> Because even 24/44 is more or less linear and so analyzable in the classical 
> LTI framework. Because of the quadratic scaling of noise, and how we do gain 
> structure in the studio, we can even sum lots of sound sources and edited 
> clips over each other at 32-bit fix, without building up noise beyond the 
> hearing threshold of a human.
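A rough version of that accounting, assuming uncorrelated per-source noise 
floors (the figures in the example are illustrative):

```python
import math

def summed_noise_floor_db(per_source_db: float, n_sources: int) -> float:
    """Uncorrelated noise powers add, so summing n sources raises the
    combined floor by 10*log10(n) dB over a single source."""
    return per_source_db + 10.0 * math.log10(n_sources)
```

E.g. a hundred clips each with a floor around -144 dBFS (roughly the dithered 
24-bit floor) sum to about -124 dBFS, still far below the much deeper floor a 
32-bit fixed-point bus leaves room for.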
> 
> But building up truly and provably silent digital filters... That takes real 
> effort. It's something most of us, and especially I, really struggle with. And that 
> problem won't be solved with wider floats or fixeds, as if you could just 
> code away and leave your numerical accuracy to the gods. No, no-no, if you 
> actually want to code properly, it takes hard math. It really does, even 
> beyond my capability.
> 
>> But you should not *have* to scale your sums with floating point anyway.
> 
> But you do: floats are a semi-logarithmic representation of the real line. 
> That's what makes floats so horrific to begin with. They aren't really 
> suitable for LTI processing, but, if anything, for something like astronomy, 
> where we deal with widely differing degrees of scale. Things nonlinear, unlike 
> how we deal with linear wave phenomena such as sound, and how linearly we as 
> people tend to perceive them.
> 
> (And sorry, I might be responding to myself. If so, you ought to be 
> chevroning my post as well...well. Top-posting in particular is difficult to 
> answer to, in a principled fashion. Every tail of another's post ought to be 
> cut short. <3 )
> -- 
> Sampo Syreeni, aka decoy - de...@iki.fi, 
> http://decoy.iki.fi/front, 
>  +358-40-3751464, 025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
