On 6/25/14 1:40 PM, Ethan Duni wrote:
which is the point. so these iterations don't have to happen *during* runtime.
At the cost of additional memory overhead, sure. Pretty much any algorithm
can be refactored in various ways to trade off runtime MIPS, memory
overhead, etc. Which implementational choice is best depends on the context
- how precious CPU cycles are vs how precious memory is, etc. In some
cases, trading MIPS for memory is desirable, in others it's not.
similarly to "why do real-time additive synthesis when computing wavetables in
advance will result in the same waveform?"
The answer is "because memory is more precious than MIPS in your specific
implementational context."
well, in the year 2014, let's consider that relative cost. how
expensive is a 1/2 MB in a computer with 8 or more GB? unlike MIPS,
which increase linearly with the number of simultaneous voices and such,
a large lookup table can be used in common with every voice. (without
naming names, i am aware of a specific manufacturer who is licensing
their synth and sample-playback and amp emulation code to another
unnamed company for cloud use. and because the first-mentioned
manufacturer would not restructure their code, the latter company must,
in their servers, *duplicate* *everything*, including large tables of
constant values, for each and every instantiation of this library. so i
would like to assume that we're smart enough to avoid that.)
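
(just to picture the arrangement i mean - a rough numpy sketch, with the
table size, contents, and names all made up for illustration: one shared
read-only table, and each voice only pays for its own phase.)

import numpy as np

# one big table, computed once, shared read-only by every voice.
# size and contents here are made up, just for illustration.
TABLE_SIZE = 1 << 19                  # 512K entries, ~2 MB as float32
TABLE = np.sin(2*np.pi*np.arange(TABLE_SIZE)/TABLE_SIZE).astype(np.float32)

def voice_sample(phase):
    # one output sample for one voice: linear interp into the shared table
    x = phase * TABLE_SIZE
    i = int(x) % TABLE_SIZE
    frac = x - int(x)
    return (1.0 - frac)*TABLE[i] + frac*TABLE[(i + 1) % TABLE_SIZE]

# each voice carries only its own phase; the table memory is paid once.
phases = [0.0, 0.25, 0.5]             # three voices, arbitrary phases
print(sum(voice_sample(p) for p in phases))
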
both speed and memory have been increasing according to, roughly,
Moore's Law. but, because of the speed of propagation of electrons, i
am seeing a wall approaching on speed (not considering parallel or
multiprocessing which is a complication that hurts my old geezer brain
to think about - just the scheduling is a fright), but with 64-bit words
(and 64-bit addresses), i am not seeing a similar wall on memory.
but i am fully aware of the tradeoff. at the other extreme, there was
another company i used to work for that had a reasonably cheap
ARM-like CPU in their synth and they seemed to believe that table lookup
is the only way to do anything. they (unnecessarily, in my opinion)
wasted many megabytes on huge tables to do what could have been easily
done with a 4 or 5 term power series. so i have been arguing both sides
here, in case you think that i'm some sorta neophyte computer guy that
doesn't understand the cost of memory.
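
to make that concrete, this is the kind of thing i mean: a 5-term odd
power series (plain Taylor coefficients here; a proper minimax fit of the
same order does a bit better) covering what they were burning megabytes
of table on.

import numpy as np

def sin_poly(x):
    # 5-term odd power series for sin(x), good on [-pi/2, pi/2].
    # plain Taylor coefficients; a minimax (Chebyshev) fit of the same
    # order buys another bit or two.
    x2 = x*x
    return x*(1.0 + x2*(-1.0/6 + x2*(1.0/120 + x2*(-1.0/5040 + x2*(1.0/362880)))))

x = np.linspace(-np.pi/2, np.pi/2, 10001)
err = np.max(np.abs(sin_poly(x) - np.sin(x)))
print("max error: %.3g (%.1f dB)" % (err, 20*np.log10(err)))
# about 3.5e-6, roughly -109 dB -- no megabytes of table needed
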
if it's a Freescale 56K chip in a stompbox, your point is well-taken.
so how are the tube curves obtained and represented in memory in the first
place?
In the present context, my understanding is that they are modeled as affine
combinations of functions like tanh(), which then are folded into the
implementation. So they're baked into the instruction code, and not sitting
in any discrete place in memory at all. Am I missing your point?
no, you're addressing it. (but if i were a different participant, i
might answer your question "Yes" or "Ahh, yes" and say later i meant
"no".) it leads to the next question of: how is "tanh()" (or whatever
family of curves) computed with a CPU that knows how to move and add and
subtract and multiply and divide floating-point and integer numbers (and
other basic bit manipulation operations)?
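
(and to partly answer my own rhetorical question: underneath, it's some
arrangement of range handling plus a polynomial or rational
approximation. here is a toy sketch, *not* what any particular libm
actually does:)

import numpy as np

def tanh_pade(x):
    # toy tanh(): a low-order Pade approximation (truncated continued
    # fraction), just to show that "tanh()" bottoms out in a handful of
    # adds, multiplies, and one divide.  real library code typically adds
    # range handling and a higher-order fit, but it is the same kind of
    # arithmetic.
    x2 = x*x
    return np.clip(x*(15.0 + x2)/(15.0 + 6.0*x2), -1.0, 1.0)

x = np.linspace(-1.0, 1.0, 1001)
print("max error on [-1,1]: %.1e" % np.max(np.abs(tanh_pade(x) - np.tanh(x))))
# a few times 1e-4 at this toy order; more terms buys more digits
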
what's the point in straining to get a perfect representation of approximated
data to begin with?
My point was only that the table approach adds an additional layer of
approximation, which must be accounted for.
but, in the engineering lab (let's say it's a computer with MATLAB or
whatever numerical computational platform of your heart's desire), you
*still* have your original tube curve data that was empirically
obtained. so the *net* accounting is whatever the end result you get
from the non-iterative function evaluation to the original data.
in my stripped-down representation of f() and g() in the previous post,
how do we know that, once we start fitting curves to f(), that it will
have significantly greater error than fitting curves to g()? maybe we
can tweak the coefficients to f() and, instead of aiming to fit g()
(which was fit to the original empirical data), we can fit to the
original empirical data?
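
schematically, with made-up stand-in "measurements" (this is not real
tube data, just the accounting i mean: fitting f() straight to the data
can only do as well or better, in the same model class, as fitting f()
to an intermediate g() that was itself fit to the data):

import numpy as np
from scipy.optimize import curve_fit

# stand-in for the empirically measured curve (made-up numbers, not real
# tube data): a smooth saturating function sampled on a grid.
x = np.linspace(-3, 3, 201)
y_meas = np.tanh(1.3*x) + 0.3*np.tanh(0.4*x)

# route 1: fit an intermediate model g() (a single scaled tanh) to the data,
# then fit a polynomial f() to g().  g()'s error vs the data is baked in.
g = lambda x, a, b: a*np.tanh(b*x)
(a, b), _ = curve_fit(g, x, y_meas, p0=[1.0, 1.0])
f_via_g = np.polyfit(x, g(x, a, b), 7)

# route 2: fit the same-order polynomial f() directly to the measured data.
f_direct = np.polyfit(x, y_meas, 7)

rms = lambda e: np.sqrt(np.mean(e**2))
print("net error vs data, f fitted to g():     ", rms(np.polyval(f_via_g, x) - y_meas))
print("net error vs data, f fitted to the data:", rms(np.polyval(f_direct, x) - y_meas))
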
And which is going to require runtime MIPS. Let's note that a single
interpolation between two table points is about as much MIPS as a single
Newton step (and in high dimensions, that is the *simplest* interpolation
available - it's easy to think of approaches that use more than two points,
by varying different combinations of the independent variables). So if
you're comparing to a non-table iterative solution that uses, on average, 2
Newton steps, then the table based approach is only saving about 1 Newton
step worth of MIPS on average, in the simplest case. That's not a lot, and
so doesn't leave you much room to invent more sophisticated table
techniques to reduce memory footprint while still reducing MIPS.
understood. so, then again i'll ask, can we put a limit on the number
of iterations? if so, can we roll it out into "linear" code? (and if
you don't like that term, let's call it "non-iterated" code; the salient
feature is that it starts at the top, does a few instructions, and exits
at the bottom.)
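
something like this toy (the implicit equation here is a stand-in, not
anybody's actual tube model): exactly two Newton steps rolled out as
straight-line code, warm-started from the previous output sample.

import math

def stage_sample(x, a, y_prev):
    # "non-iterated" solve of the implicit equation y = tanh(x - a*y)
    # (a toy stand-in for one stage with local feedback): exactly two
    # Newton steps, no loop, no convergence test, fixed cost per sample.
    y = y_prev                                  # warm start
    t = math.tanh(x - a*y)                      # Newton step 1
    y -= (y - t) / (1.0 + a*(1.0 - t*t))
    t = math.tanh(x - a*y)                      # Newton step 2
    y -= (y - t) / (1.0 + a*(1.0 - t*t))
    return y

# sanity check against brute-force fixed-point iteration (made-up a and x):
a, x = 0.5, 1.2
y_ref = 0.0
for _ in range(200):
    y_ref = math.tanh(x - a*y_ref)
print(stage_sample(x, a, 0.9*y_ref), y_ref)     # agree to several digits
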
also, exaggerating the table size to 10^36 or whatever is, how shall i say,
pedantic.
Pardon me, but that is the requirement for the naive table approach you
have specified,
i didn't even specify the number of points for a single-dim table. no
one wrote a word about how many dimensions would be needed. i didn't
say that each and every 32-bit floating-point number had to have its own
unique entry in the table. and we didn't say anything specific about
interpolation. people bitch about linear interpolation and, for a small
table, i will join them. but in other applications (like really,
really, clean sample-rate conversion) i've seen linear interpolation
used on tables as small as 8K or 16K words and result in -120 dB error
due to the interpolation (FYI, the tables were sinc-like functions, but
not precisely the same as sinc()). linear interpolation applied to a
table with 512K entries of any reasonably well-behaved function is
pretty awful goddamn smooth.
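
and that claim is cheap to check. the test function and the table sizes
here are just picked for illustration:

import numpy as np

def lin_interp_err_db(func, x_lo, x_hi, n_entries):
    # max error, in dB, of linear interpolation into an n_entries-point
    # table of func over [x_lo, x_hi], measured on a much finer grid.
    xt = np.linspace(x_lo, x_hi, n_entries)
    xf = np.linspace(x_lo, x_hi, 20*n_entries)   # fine test grid
    yi = np.interp(xf, xt, func(xt))              # linear interpolation
    return 20*np.log10(np.max(np.abs(yi - func(xf))))

for n in (1 << 10, 1 << 13, 1 << 19):             # 1K, 8K, 512K entries
    print("%7d entries: %6.1f dB" % (n, lin_interp_err_db(np.tanh, -5.0, 5.0, n)))
# the error drops about 12 dB per doubling of the table; by 512K entries
# it's way below anything you'd ever hear
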
for the specific examples discussed in this thread, and for
a modest sampling grid resolution. If that number seems preposterous,
perhaps consider the curse of dimensionality. Things that work great in 2
or 3 dimensions can become astronomically complex in 10+ dimensions.
well, for modeling the mapping function from grid to plate, with a load
line of known slope (but the V_B+ might sag, so that's a variable) and
with a possible reactive component in feedback, how many dimensions are
we gonna end up with? not the whole damn amp; *one* stage of the amp.
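
(just to keep the arithmetic in front of us: table bytes versus number of
dimensions and points per dimension, float32 entries, numbers picked
arbitrarily.)

# table memory vs number of table dimensions, 4 bytes (float32) per entry.
# grid resolutions are arbitrary, just to see where the wall is.
for dims in (1, 2, 3, 4, 6):
    for pts in (64, 256, 1024):
        gb = 4.0 * pts**dims / 2**30
        print("%d dims x %4d pts/dim: %12.3g GB" % (dims, pts, gb))
# 2 or 3 dims at modest resolution is nothing on a 2014 PC;
# by 6 dims even 64 pts/dim is already about 256 GB, and it only gets worse
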
Sampling grids are the textbook example of this phenomenon.
as if there is no way to interpolate between entries.
I was explicit that I assumed linear interpolation between entries in that
estimate.
sorry, i hadn't noticed:
On 6/24/14 7:27 PM, Ethan Duni wrote:
It's been given to you several times already in this thread. It has three
parts:
1) There are contexts where memory is much more precious than MIPS. So
eliminating the (runtime) MIPS associated with the iteration in favor of a
big table can be the exact wrong way to go, in those settings.
2) The memory requirements for such a table blow up as you add more coupled
non-linearities to the system. If we need, say, 100 sample points in the
grid for each parameter of a non-linearity, and each non-linearity has 3
parameters, and we have, say, 6 coupled non-linearities (which is still a
fairly small, simple system in analogue circuit terms), then we're talking
about 100^18 = 10^36 grid points. If we use 32-bit parameters, that works
out to the equivalent RAM of about 1 trillion billion PCs with 4GB memory.
That is dramatically more memory than mankind is ever going to produce,
total, before the end of time, absent some breakthrough in quantum
computation that enables astronomical improvement in memory density and
access. And that's for a relatively simple simulation, and without any
crazy resolution on the sampling grids. Absent some clever ideas to put a
lid on the growth of the memory requirements, this "fill tables ahead of
time" thing seems like a valid concept that is a total non-starter for
actual implementation.
3) The non-linearities could further depend on runtime-changeable user
parameters, which blows up the memory requirements even further.
<returning>
That's where the "additional layer of approximation" stuff just
above came from. Without interpolation, the grid size would need to go up
by probably an order of magnitude, and the total number of points would
become even more ludicrous. I was, in fact, attempting to be conservative
with the grid resolution there, by assuming interpolation, exactly to
illustrate how intractable the memory requirements get.
or represent the approximated data by another form?
By all means. As I mentioned before, you're going to *have to* do something
like that just to keep the memory requirements anywhere near sane. So if
you've got some suggestion on how to do this, I'm sure the list would like
to hear it. But as mentioned above, I'm skeptical that one can invent a way
of dramatically reducing the table memory requirements that doesn't exhibit
more complexity than the iterative solution you're trying to avoid in the
first place.
all's i am saying is to explore representing and implementing f() directly in
terms of *actual* independent arguments. not just for the simplification of
the runtime algorithm, but also to explore the nature of the *actual* mapping
(so maybe you have some idea of how bad the aliasing might be) which is
obscured when iterating on g().
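
e.g. (same toy implicit stage as in the earlier sketch, solved offline
where nobody is counting MIPS): once f() is explicit, you can look
directly at how fast its harmonics fall off, which is exactly the
aliasing question.

import numpy as np

a = 0.5
xg = np.linspace(-4.0, 4.0, 4001)
yg = np.zeros_like(xg)
for _ in range(200):                      # offline solve of y = tanh(x - a*y)
    yg = np.tanh(xg - a*yg)

N = 8192
x = 3.6*np.sin(2*np.pi*np.arange(N)/N)    # one exact cycle of a big sine into f()
y = np.interp(x, xg, yg)                  # f() evaluated via the pre-solved grid
spec = np.abs(np.fft.rfft(y))/(N/2)
for k in (1, 3, 5, 7, 9, 11, 13):
    print("harmonic %2d: %6.1f dBc" % (k, 20*np.log10(spec[k]/spec[1])))
# how fast those upper harmonics roll off is what tells you how much
# oversampling (or other band-limiting cleverness) the realtime version needs
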
Sure, I've agreed that this is an interesting idea several times now. But I
remain unclear on what a big table is supposed to buy us here. How does one
go from a big multidimensional table to an estimate of oversampling
requirements? Is there some analysis the table admits, that simply running
the iterative system on, say, various test signals, does not?
If we want to estimate oversampling requirements, wouldn't we be better off
just replacing the tanh() models with, say, (large) finite order polynomial
models at the outset?
HEY!!! now someone's getting it!!!
but we don't know how "(large)" until we get specific with the data.
and, what i am saying is to include the load line and other circuit
specifics (like any cathode resistance) in the model and try to fit to
that directly, rather than just fit tanh() (or whatever basis) functions
to tube curves and implement the load line and other stuff in the
real-time code.
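
and the "how large" question answers itself once the solved map is in
front of you: sweep the order and watch the error (same toy map again,
standing in for the real measured curves with the load line folded in).

import numpy as np

a = 0.5
xg = np.linspace(-4.0, 4.0, 4001)
yg = np.zeros_like(xg)
for _ in range(200):
    yg = np.tanh(xg - a*yg)               # pre-solve the toy stage map offline

for order in (3, 5, 7, 9, 11, 15, 21):
    c = np.polynomial.Chebyshev.fit(xg, yg, order)   # well-conditioned poly fit
    err = np.max(np.abs(c(xg) - yg))
    print("order %2d: max fit error %8.1e" % (order, err))
# the order where that error drops below your noise floor is the answer,
# and it depends on the drive range and on the actual measured curves
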
Then we can directly solve the fixed-point equations
what "fixed-point equations"? i am not precluding either floating-point
nor fixed-point implementation. why do you keep bringing that up?
whether this is implemented as floating or fixed point is a separate
(and later question). i would probably first simulate with
floating-point and when that starts working to our satisfaction, then
start thinking about how to do this in a stomp box with a fixed-point DSP.
and get closed-form expressions for the nonlinearities as polynomials, do
the usual analysis on that, etc. I guess that must not work, or nobody
would be bothering with iterative solutions to the tanh models in the first
place...
so, because two manufacturers on this list don't seem to do that, nor
even to understand the point i was making, that changes the mathematics?
does it somehow become mathematically implausible or infeasible because
someone says "nay"? and *i'm* the guy that can't seem to imagine or
learn anything new?
Or maybe the required model order is just really, really high,
which would maybe be okay for this kind of offline analysis, but too
expensive for realtime implementation?
maybe, but if we dismiss it outright, we'll never know.
--
r b-j r...@audioimagination.com
"Imagination is more important than knowledge."
--
dupswapdrop -- the music-dsp mailing list and website:
subscription info, FAQ, source code archive, list archive, book reviews, dsp links
http://music.columbia.edu/cmc/music-dsp
http://music.columbia.edu/mailman/listinfo/music-dsp