Hi Robert,
This is a nice writeup. A couple questions for now:
1. I assume this logic would be followed by some sort of digital filter to
remove the unwanted Nyquist images. Have you thought about how much
suppression you might be able to achieve, and at what cost in FPGA resources
and phase distortion?
2. Do you have an idea of the latency of the signal chain? Say I wanted to do
a phase lock by feeding new p1 values into the RTIO. What sort of bandwidth
could I achieve?
Thanks,
Dave
-----Original Message-----
From: Robert Jördens [mailto:r...@m-labs.hk]
Sent: Sunday, July 31, 2016 5:32 AM
To: artiq@lists.m-labs.hk; Jonathan Mizrahi ; Sébastien
Bourdeauducq ; Joe Britton ;
Slichter, Daniel H. (Fed) ; Leibrandt, David R. (Fed)
; Allcock, David T. (IntlAssoc)
; Ken Brown
Subject: Re: DSP gateware
Hello,
to fuel the discussion and planning of the smart arbitrary waveform generator
requirements for the different applications, I did another extended design
study for the proposed ARTIQ/Sayma DSP gateware and signal flow, looking at
actual signal quality, resource usage and possible parametrizations.
This time, take the following parametrization of a channel's output o:
z = (a1*exp(i*(f1*t+p1)) + a2*exp(i*(f2*t+p2))) * exp(i*(f0*t+p0))
o = u + b*Re(z) + c*Im(z_buddy)
* u and a are 16 bit cubic spline interpolators
* p are 16 bit constant (non-) interpolators
* f are 48 bit linear interpolators
* z_buddy refers to the (complex, IQ) z data coming from each channel's "buddy"
channel, ignore it for now
* b, c are switches (with values 0 or 1) that allow a bunch of different
configurations, ignore them for now
* all spline interpolators (u, a, f, p) sample at 200 MHz
* the f1/p1 and f2/p2 oscillators sample at 200 MHz and their data is fed to
the f0/p0 oscillator without interpolation
* the f0/p0 oscillator samples at 8*200 MHz = 1.6 GHz
* data width is at least 16 bit everywhere
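As a rough illustration of the parametrization above, here is a minimal floating-point sketch in Python. It is not the gateware: it ignores the fixed-point widths, the spline interpolators and the different sample rates. The function names are made up for this example, and it assumes frequencies in Hz and phases in turns (hence the explicit 2*pi), which the formula above leaves open.

```python
import cmath
import math


def osc(f, p, t):
    """Complex oscillator exp(i*2*pi*(f*t + p)); f in Hz, p in turns (assumed units)."""
    return cmath.exp(2j * math.pi * (f * t + p))


def channel_z(t, a1, f1, p1, a2, f2, p2, f0, p0):
    """z = (a1*exp(i*(f1*t+p1)) + a2*exp(i*(f2*t+p2))) * exp(i*(f0*t+p0))."""
    return (a1 * osc(f1, p1, t) + a2 * osc(f2, p2, t)) * osc(f0, p0, t)


def channel_output(t, u, a1, f1, p1, a2, f2, p2, f0, p0,
                   b=1, c=0, z_buddy=0j):
    """o = u + b*Re(z) + c*Im(z_buddy); b and c are the 0/1 switches."""
    z = channel_z(t, a1, f1, p1, a2, f2, p2, f0, p0)
    return u + b * z.real + c * z_buddy.imag


if __name__ == "__main__":
    # One output sample with both tones enabled and no offset or buddy data.
    print(channel_output(1.25e-9, 0.0, 0.4, 5e6, 0.0, 0.4, 81e6, 0.0, 157e6, 0.0))
```

With a2 = 0 the output reduces to a single cosine at f0 + f1, which is a quick sanity check on the model.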
This setup can -- for example -- generate a two-tone signal at 162 MHz and 238
MHz by setting f0=157 MHz, f1=5 MHz, f2=81 MHz. The attached plot has the data
and the spectrum from a bit-accurate simulation of the full FPGA gateware.
Units are "natural" (sample rate=1, full scale=1): the relevant tones are close
to 0.1 and 0.15 sample rate. Output amplitude is below clipping.
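The frequency bookkeeping for this example can be checked with a few lines of Python. This is only a sketch of the arithmetic: the output tones sit at f0+f1 and f0+f2, and dividing by the 1.6 GHz f0/p0 sample rate gives the "natural" units used in the plot. The helper name is hypothetical.

```python
F_SAMPLE = 1.6e9  # f0/p0 oscillator rate quoted in the mail: 8 * 200 MHz


def output_tones(f0, f1, f2, f_sample=F_SAMPLE):
    """Return the two output tones in Hz and as fractions of the sample rate."""
    tones_hz = (f0 + f1, f0 + f2)
    tones_natural = tuple(f / f_sample for f in tones_hz)
    return tones_hz, tones_natural


if __name__ == "__main__":
    tones_hz, tones_natural = output_tones(157e6, 5e6, 81e6)
    for hz, nat in zip(tones_hz, tones_natural):
        print(f"{hz / 1e6:.0f} MHz = {nat:.5f} of the sample rate")
```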
This is a bit-accurate representation of the data that would be sent to the
DAC. The actual analog output would differ only by the DAC's interpolation, its
analog output transfer function, and DAC noise.
Don't be confused by the way the samples look: this is only due to the
un-interpolated data from the f1/f2 oscillators. Same goes for the Nyquist
images all around. A very rough and conservative estimate for wideband SNR is >
85 dB not counting the images. There are still a lot of things that can be
tweaked; this demo is not supposed to show the optimum.
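To see where the images come from, here is a small floating-point sketch (not the bit-accurate simulation) of what happens when a 200 MHz oscillator's samples are repeated 8 times to reach 1.6 GHz without interpolation. It assumes a plain sample-and-hold and looks at the f1 = 5 MHz tone before the f0 mixer; the helper names are made up for this example.

```python
import cmath
import math

F_SLOW = 200e6  # f1/f2 oscillator rate (from the mail)
HOLD = 8        # each slow sample is repeated 8 times -> 1.6 GHz
N_SLOW = 40     # 40 slow samples = one full period of a 5 MHz tone


def held_tone(f_tone):
    """Complex tone sampled at F_SLOW, each sample held HOLD times."""
    slow = [cmath.exp(2j * math.pi * f_tone * n / F_SLOW) for n in range(N_SLOW)]
    return [s for s in slow for _ in range(HOLD)]


def dft_bin(x, k):
    """Single normalized DFT bin k of the sequence x."""
    n_total = len(x)
    return sum(v * cmath.exp(-2j * math.pi * k * n / n_total)
               for n, v in enumerate(x)) / n_total


x = held_tone(5e6)
bin_hz = F_SLOW * HOLD / len(x)                     # 5 MHz per bin
main = abs(dft_bin(x, round(5e6 / bin_hz)))         # wanted tone at 5 MHz
image = abs(dft_bin(x, round(205e6 / bin_hz)))      # first image at 200 + 5 MHz
empty = abs(dft_bin(x, round(100e6 / bin_hz)))      # a bin with no tone or image
image_db = 20 * math.log10(image / main)

if __name__ == "__main__":
    print(f"first image relative to the tone: {image_db:.1f} dB")
```

In this simplified model the energy appears only at 5 MHz + k*200 MHz, and the first image sits roughly 32 dB below the wanted tone. The images of the 81 MHz tone are stronger because it is closer to the 200 MHz rate.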
* 200 MHz is a bit under the maximum achievable speed for this logic on a
-2 speed grade Kintex-7.
* 1.6 GHz * 4 channels is more than we can push to a DAC. The design can
obviously also run at 1 GHz (f1,f2 at 125 MHz, f0 at 1 GHz) which would just
about fill eight JESD204B pipes.
* The design can also be built for 800 MHz with significantly lower resource
usage (then running the f1,f2 NCOs at 200 MHz, f0 at 4*200 MHz = 800 MHz). This
would free a lot of room on the FPGA, fit the JESD pipes, and would still be
able to comfortably generate the signal above.
* DAC interpolation could be 2x if desired to get to 2 GHz or 1.6 GHz DAC
sample rate depending on the choice of scenario.
* Eight channels of this 1.6 GHz design occupy about 62% of the LUTs of a
xc7k325t (without _any_ other logic like everything related to the
transceivers, ARTIQ, DRTIO, FIFOs...).
* Wrapping it in a minimal ARTIQ system brings the LUT resource usage to about
72%.
* On a xcku040 the utilization estimate (same gateware as for the 62% xc7k325t
system) is below 51% (I can't get a good number because of a Xilinx Vivado bug).
* Take the LUT usage percentages with a grain of salt. They don't react kindly
to extrapolation.
* Interpolation schemes for the f1/p1, f2/p2 oscillator data before it reaches
the f0 oscillator might be interesting to look at.
* Spline knot behavior (ramping, switching, synchronization, latency matching,
interpolation) for frequency, phase, amplitude is as expected (see e.g. the
pdq2 documentation).
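The sample-rate scenarios in the list above can be tabulated with a short Python sketch. It only reproduces the arithmetic quoted in the bullets (slow NCO rate times upsampling factor, optionally times 2x DAC interpolation) and checks that the 162 MHz and 238 MHz example tones stay below half the f0 rate. The names and the table layout are assumptions made for this example.

```python
# (slow NCO rate in Hz, f0 upsampling factor, DAC interpolation) per scenario.
# The 2x DAC interpolation is applied only where the mail mentions it.
SCENARIOS = {
    "1.6 GHz": (200e6, 8, 1),
    "1 GHz": (125e6, 8, 2),
    "800 MHz": (200e6, 4, 2),
}

EXAMPLE_TONES = (162e6, 238e6)  # the two-tone example from the mail


def rates(slow_rate, factor, dac_interpolation=1):
    """Return (f0 oscillator rate, DAC sample rate) in Hz."""
    f0_rate = slow_rate * factor
    return f0_rate, f0_rate * dac_interpolation


def fits_first_nyquist(tone_hz, f0_rate):
    """True if the tone lies below half the f0 oscillator rate."""
    return tone_hz < f0_rate / 2


if __name__ == "__main__":
    for name, (slow, factor, interp) in SCENARIOS.items():
        f0_rate, dac_rate = rates(slow, factor, interp)
        ok = all(fits_first_nyquist(f, f0_rate) for f in EXAMPLE_TONES)
        print(f"{name}: f0 at {f0_rate / 1e6:.0f} MHz, "
              f"DAC at {dac_rate / 1e6:.0f} MHz, example tones fit: {ok}")
```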
This demonstrates that we can actually get very good high-data-rate two-tone
signals for eight channels out of gateware that fits on currently available
development boards. The parametrization is intuitive and extremely flexible
(you can e.g. rewire it at run-time to exploit and feed the full IQ datapaths
of the DACs giving you twice the bandwidth on half the channels and all the
other features in the DAC and downstream). Any set of spline interpolators can
receive new knot data at the same time from their RTIO FIFOs: there is no
contention. The design works just as well for driv