Sometimes rewording a problem helps you see a way forward, so I'll
offer this:

The resync problem you express in (2) can be reimagined as a
code-cracking challenge. So let's imagine a "Bad Cryptographers"
scenario:

Alice is communicating with Bob using a steganographic overlay on a
signal sent in the clear (it just happens, from your viewpoint, that
this signal has ideal statistical properties for dithering). For Bob,
decrypting the signal is synonymous with removing it, and of course
Bob knows the key in advance.

As you know, PRNGs are not at all random, and cryptographers struggle
to find seeds producing long, well-distributed cycles. If an adversary
knew the seed parameters, then searching the space would be possible,
although time-consuming in practice.

What if Alice and Bob were spectacularly bad cryptographers, and
arranged that their "messages" were drawn from a small set of repeated
sub-codes, and also published a list of those? In other words, they're
making the cracking time as short as possible. There is only one
discoverable private key and no forward secrecy, so finding the key of
one sub-code means finding them all.

Now maybe your problem looks like this (unless I have gravely
misunderstood it): you want to find a set of known codes, with the
best statistical properties for signal dithering, which can be
expressed generatively (perhaps as a set of keys for generators of
keys for generators... chained thusly) and which can be compactly sent
or pre-known by the receiver. Searching (resyncing) is then measured
in milliseconds.
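
To pin down that "milliseconds" claim, here is a minimal sketch of the
search, assuming a toy generator and a published list of sub-code
seeds (all names and numbers here are mine, purely illustrative):

    import random

    # Hypothetical pre-shared "sub-code" list: a few published seeds.
    PUBLISHED_SEEDS = [0xBEEF, 0xCAFE, 0xF00D, 0xD00D]

    def dither_bits(seed, n):
        """Deterministic bit stream for a given seed (stand-in PRNG)."""
        rng = random.Random(seed)
        return [rng.getrandbits(1) for _ in range(n)]

    def resync(observed, max_offset=4096):
        """Find the (seed, offset) whose stream reproduces the window."""
        k = len(observed)
        for seed in PUBLISHED_SEEDS:
            stream = dither_bits(seed, max_offset + k)
            for off in range(max_offset):
                if stream[off:off + k] == observed:
                    return seed, off
        return None

    # A receiver that fell out of sync sees a 64-bit window from
    # somewhere mid-stream, and recovers both sub-code and phase.
    truth = dither_bits(0xCAFE, 5000)
    print(resync(truth[1234:1234 + 64]))    # -> (51966, 1234)

A 64-bit window makes a false match against the wrong sub-code
astronomically unlikely, which is what makes the published list safe
to search exhaustively.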

So, I guess this problem is an old one, and the application area I
imagine you could look at is frequency-hopping or spread-spectrum
crypto, where you _want_ to be able to relock onto a sequence
using a bunch of pre-shared keys.

best,
Andy

On Sat, Nov 20, 2021 at 01:22:57AM +0200, Sampo Syreeni wrote:
> Hi. As many of you may know, there are two different means of dithering a
> signal: subtractive and additive. The first requires you to know the precise
> dither signal put in at the A/D stage when doing the D/A, which makes it
> difficult to apply. As such, additive dithering is nowadays used, because it
> works single-ended, and is good enough for most work.
> 
> I would like to combine the advantages of each approach, but hit a snag some
> years back. I'd like to return to the subject, and solicit some help.
> 
> Additive dither works for audio applications because it can be shown that
> adding a rectangularly distributed, 1 LSB peak-to-peak, zero-centered noise
> signal before a quantizer decouples the mean of the quantization error from
> the utility signal, and adding each successive one decouples an extra
> statistical moment. Crucially, two such noise signals added lead to
> decoupling not only of the long-term average, but also of the variance,
> meaning power. As such, at the lowest levels, audible noise modulation
> (coupling of the error signal to average power) goes away at precisely two
> such signals added.
> 
> (Adding independent white noises of given statistical distributions to each
> other leads to the resulting noise having a distribution which is the
> convolution of the originals'. As such, adding two rectangularly distributed
> noises leads to a triangularly distributed outcome: the TPDF dither we so
> often use when rounding down from higher word widths. The "good enough" one.)
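
(A minimal numpy sketch of that noise-modulation claim, sweeping a DC
level across one quantizer cell; parameters and names are mine:

    import numpy as np

    rng = np.random.default_rng(1)

    def quantize(x, dither):
        return np.round(x + dither)        # 1 LSB quantizer

    # Sweep a DC level across one cell and watch the total error power.
    levels = np.linspace(0.0, 1.0, 11)
    N = 200_000
    for name, gen in [
        ("RPDF", lambda: rng.uniform(-0.5, 0.5, N)),
        ("TPDF", lambda: rng.uniform(-0.5, 0.5, N)
                       + rng.uniform(-0.5, 0.5, N)),   # two, convolved
    ]:
        var = [np.var(quantize(v, gen()) - v) for v in levels]
        print(name, "error-power spread:", max(var) - min(var))

RPDF shows the error power swinging with the signal level, i.e. noise
modulation; TPDF comes out essentially flat, the second moment
decoupled, just as described above.)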
> 
> Things are much better on the subtractive side: if you know the dither
> signal from the start, adding and then subtracting a single independent,
> rectangularly distributed (RPDF) noise channel will decouple *all* of the
> statistical moments from the error signal at the same time. It's as good as
> it gets, from the get-go. It's also easy to simulate, so go try it: 6 dB
> extra on the peak voltage, 3 dB extra on average power, and whatever the
> units are on the higher moments; it's *very* much audible when you get there.
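
(The subtractive round trip is a three-liner to simulate, assuming
both ends share the generator seed; a sketch with my own toy signal:

    import numpy as np

    rng = np.random.default_rng(2)          # stands in for the shared PRNG
    N = 200_000
    x = 0.37 * np.sin(np.arange(N) * 0.01)  # arbitrary low-level signal

    d = rng.uniform(-0.5, 0.5, N)           # one RPDF channel, known to both
    y = np.round(x + d)                     # encoder: add, quantize to 1 LSB
    x_hat = y - d                           # decoder: subtract the same d

    err = x_hat - x
    print(np.mean(err), np.var(err))        # mean ~0, variance ~1/12
    print(np.corrcoef(err, x)[0, 1])        # error uncorrelated with signal

The residual is one quantizer's worth of uniform, signal-independent
error, with every moment decoupled at once.)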
> 
> The trouble is, then, how to transmit the subtractive dithering signal to
> the receiver side. Obviously, you can't use a full side channel. But in the
> digital domain you *can* use pseudo-random number generators, ones known to
> have nice statistical and computational properties. The side channel to
> indicate a particular one is lean, and with the cryptographically secure
> ones only needs something like 128 bits for the output to be
> indistinguishable from white noise. So, when that side channel is available,
> why not tag an (audio?) signal as having been dithered with the optimum
> subtractive dither?
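
(One way to realize that 128-bit side channel, sketched here with
SHAKE-256 as the extendable-output function; the choice of XOF and the
key are mine, and a real implementation would expand the stream
incrementally rather than in one call:

    import hashlib
    import numpy as np

    def dither_from_seed(seed: bytes, n: int) -> np.ndarray:
        """Expand a short seed into n RPDF samples in [-0.5, 0.5)."""
        raw = hashlib.shake_256(seed).digest(4 * n)   # any output length
        u32 = np.frombuffer(raw, dtype=np.uint32)
        return u32.astype(np.float64) / 2**32 - 0.5

    key = bytes.fromhex("000102030405060708090a0b0c0d0e0f")  # 128 bits
    d = dither_from_seed(key, 48000)        # one second's worth at 48 kHz
    print(d[:4])

Anyone holding the 128-bit tag regenerates the identical dither;
anyone without it sees only white noise.)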
> 
> Which in fact isn't, in this case, the RPDF kind, but the TPDF one. Because
> absent the metadata -- often the case with running digital from disparate
> sources -- you'd be left with just an additive RPDF dither, which is not
> enough to make the power-coupling of the noise signal go away. So instead,
> you should do subtractive TPDF dithering, so that the overall signal stays
> well-enough dithered to the second degree (and to the ear) even if you don't
> know enough to do the subtraction. That's easy fail-over. (If you do get to
> do it, it's free from statistical error to all degrees, again.) And still,
> if you can do subtractive, it makes *very* little difference whether the
> subtractee is RPDF or TPDF; everything still cancels out. (The only problem
> being the sqrt(2) difference at the edges of the A/D/A range, which ought
> to have some margin anyway. Hitting the rails, and all that, is not much of
> a problem in multibit converters, which we are talking about here.)
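
(The fail-over in miniature, same toy setup as the sketch above:

    import numpy as np

    rng = np.random.default_rng(3)
    N = 200_000
    x = 0.37 * np.sin(np.arange(N) * 0.01)

    # Encoder: subtractive TPDF, i.e. two shared RPDF channels summed.
    d = rng.uniform(-0.5, 0.5, N) + rng.uniform(-0.5, 0.5, N)
    y = np.round(x + d)

    # Receiver WITH the seed: everything cancels except Q's own error.
    print(np.var(y - d - x))                # ~1/12

    # Receiver WITHOUT the seed: the stream still behaves as additively
    # TPDF-dithered audio. Noisier, but no audible noise modulation.
    print(np.var(y - x))                    # ~1/4, i.e. 1/12 + 2/12

That is the graceful degradation argued for above: subtract if you
can, and if you can't, you are left holding ordinary TPDF-dithered
audio.)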
> 
> So in NASA kind of work, this suggests a simple setup: an agreed-upon
> configuration of a pseudo-random number generator which feeds a dither
> D/A, or maybe some side data to give its seed. This is particularly easy to
> implement if rounding down in the digital domain -- such as in the audio
> work we do, and which inspired this post -- but of course, it's the same
> going from, say, 22-bit sensor values down to 8-bit telemetry in a space
> probe. Also, it *can* be done, if only so-so, even for analog signals
> before the first round of digitization, and it has been done.
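
(The space-probe case in integer arithmetic, a sketch with made-up
widths and seed; range margin at the rails is ignored, per the note
above:

    import numpy as np

    rng = np.random.default_rng(0xC0FFEE)   # the agreed-upon generator
    SHIFT = 14                              # 22-bit words -> 8-bit telemetry
    HALF = 1 << (SHIFT - 1)

    x22 = rng.integers(0, 1 << 22, 10_000)

    # Zero-mean TPDF dither, 2 coarse LSBs peak-to-peak, all integer.
    d = rng.integers(-HALF, HALF, x22.size) \
      + rng.integers(-HALF, HALF, x22.size)

    y8 = (x22 + d + HALF) >> SHIFT          # dithered round-to-nearest
    x_hat = (y8 << SHIFT) - d               # ground station subtracts d

    err = x_hat - x22
    print(err.mean() / (1 << SHIFT))        # ~0, in units of one 8-bit step
    print(err.std() / (1 << SHIFT))         # ~1/sqrt(12) of one step

Only the seed, not the dither itself, ever has to cross the link.)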
> 
> Next in my line of thought is, how about if we *can't* confer that metadata
> reliably? And herein is the snag I'd like some help with...
> 
> How do you reliably detect, from streaming audio (signal) content, that
> this sort of coding (which subtractive dither very much is, changing the
> signal as it does) has been applied? Without out-of-band metadata? Well,
> the metadata then has to be in-band by definition, so it will have to
> disturb the signal somewhat. It also has to be detectable from the encoded
> signal with reasonable effort in both software and hardware, yet not skew
> the signal statistically, even if it has to leave *some* recognizable mark.
> 
> To date my best architecture works as follows:
> 
> 1) I XOR all of the bits of a sample together, on the theory that there is
> going to be some randomizing noise in there, and so some entropy and
> independence in the resulting bit per sample. (See the sketch after this
> list.)
> 
>   1a) This is useful because you can do the operation over bit-stuffed
>   words, the optimum way of embedding stuff into words of wider width.
>   If the less significant bits are set to zero, all of the entropy of
>   the higher-order ones will work all the same. That is especially
>   significant in audio work, because it's always 8-18-24 bits in
>   interchange, and often passed over channels which do not declare the
>   accuracy. In the digital domain, those extra zeros cease to have any
>   meaning, by design.
> 
> 2) This is because I want, first, to synchronize against what my
>    subtractive coder is doing if bits are lost, and in the longer run,
>    to be able to correlate against running data even at the analog
>    level. This is, after all, about synchronous decoding, and its usual
>    gain.
> 
>   2a) If I'm given a certain running stream of bits, with bit errors and
>       then synchronization errors in it, I want to be able to
>       resynchronize. The typical way to do this would be to run a
>       continuous convolution, detecting a signature. But that's *very*
>       inefficient. Instead I do it stochastically, flagging a small stream
>       of bits from my above XOR circuit, in a running shift register.
>       Then I compare it (after some processing) with a fixed, hashed
>       value. That is very efficient in hardware, and I already know it
>       to be largely doable by precomputation even in 8-bit software
>       (cf. how CRC table calculation goes; only linear polynomial stuff
>       is used here, so it can be accelerated in hardware via the crypto
>       primitives as well). A toy version of this two-stage latch is
>       sketched after the list.
> 
>   2b) I run the bit stream over an inverse comma-code decoder as well.
>       This is a poor man's version of a matched filtering operation,
>       because a decode of such a code essentially stops there, and only
>       there, where there isn't an overlapping match. I use a short match
>       like that as my first latch, so that the second, heavier
>       comparison doesn't have to be engaged as often.
> 
> 3) So the idea is that every now and then, if my machinery has been told
>    it's to do subtractive dithering, it'll do what the metadata tells it
>    to: either go along as it has from the beginning, or, on sometimes
>    falling out of sync, search for the next pseudo-stochastic synch
>    point and resynch there, digitally.
> 
>   3a) If it loses synch, it'd be able to tell within which interval that
>       happened, even if the interval is kind of unpredictable by design.
> 
>   3b) The machinery would always leave behind, in its subtractive
>       dithering, a well-defined, well-spread-out trace, which would be
>       detectable in more intensive analysis. And so, the signature could
>       once again be deterministically subtracted out, to arrive at an
>       almost pristine signal.
> 
>   3c) The coding theory principle here is that if you only reserve one
>       signal out of the billiards (10^15s) available within even a second
>       of speech-band communication, or a 2400 baud modem thingy, it's one
>       over a billiard. Reserving that sort of minimum bandwidth, with
>       proper coding, is unnoticeable. Which is what I'm doing here, with
>       the continuous, deterministic noise introduced for dithering
>       purposes. What I'm introducing here is a couple of bits of
>       information, in the form of on-and-off low-level noise.
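
Before the plea below: a toy end-to-end of the detector in 1) and 2a),
as promised, with made-up window width, mark, and hash; the "comma"
first latch of 2b) is mimicked here by a cheap prefix test:

    import hashlib

    WINDOW = 64
    MARK = 0xF0F0F0F0F0F0F0F0    # hypothetical sync pattern the coder leaves
    PREFIX = MARK >> 48          # cheap 16-bit pre-test, the first latch
    SIG = hashlib.sha256(MARK.to_bytes(8, "big")).digest()

    def parity(word):
        """XOR all bits of a sample together (zero-stuffed LSBs are free)."""
        word ^= word >> 16
        word ^= word >> 8
        word ^= word >> 4
        word ^= word >> 2
        word ^= word >> 1
        return word & 1

    def detect(samples):
        """Yield offsets where the parity window matches the signature."""
        reg = 0
        for i, s in enumerate(samples):
            reg = ((reg << 1) | parity(s)) & (2**WINDOW - 1)
            if reg >> 48 != PREFIX:    # almost every sample fails here, fast
                continue
            if hashlib.sha256(reg.to_bytes(8, "big")).digest() == SIG:
                yield i                # heavier comparison: confirmed synch

    # Build a stream whose per-sample parities spell out MARK at offset
    # 1000, and find it blind. (parity(b) == b for bare 0/1 words.)
    stream = [0] * 2000
    for k in range(WINDOW):
        stream[1000 - WINDOW + 1 + k] = (MARK >> (WINDOW - 1 - k)) & 1
    print(list(detect(stream)))        # -> [1000]

Shifted partial overlaps of MARK do pass the prefix test now and then,
and are then rejected by the hash, which is exactly the two-stage
economy described in 2a) and 2b).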
> 
> So finally... Help me! I haven't been able to make this work quite like
> the theory above goes. When I use the most performant algorithms and
> circuits I know of for the above, it always comes back to a linear
> feedback shift register network, and even nonlinearity doesn't naively
> help. This project of mine is basically stalled because of that. So,
> please help me 1) latch on right, for occasional acquisition without
> synchrony or metadata, 2) do low computational complexity and high
> statistical homogeneity serial random number generation, and 3) extend
> the approach towards the analog domain, e.g. in space communication.
> -- 
> Sampo Syreeni, aka decoy - [email protected], http://decoy.iki.fi/front
> +358-40-3751464, 025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
