Re: [ARTIQ] DSP gateware
Thank you for the detailed study, Robert.

> This setup can -- for example -- generate a two-tone signal at 162 MHz
> and 238 MHz by setting f0=157 MHz, f1=5 MHz, f2=81 MHz. The attached
> plot has the data and the spectrum from a bit-accurate simulation of
> the full FPGA gateware. Units are "natural" (sample rate=1, full
> scale=1): the relevant tones are close to 0.1 and 0.15 sample rate.
> Output amplitude is below clipping.

Thank you for the specific example.

> * 200 MHz is a bit under maximum achievable speed for this logic on a
> -2 speed grade Kintex 7.

Can a -1 speed grade UltraScale handle generation at the 1 Gb/s data rate?

> * 1.6 GHz * 4 channels is more than we can push to a DAC. The design
> can obviously also run at 1 GHz (f1,f2 at 125 MHz, f0 at 1 GHz) which
> would just about fill eight JESD204B pipes.

That is, each DAC requires 2 parallel JESD channels at 10 Gb/s.

> * The design can also be built for 800 MHz with significantly lower
> resource usage (then running the f1,f2 NCOs at 200 MHz, f0 at 4*200
> MHz = 800 MHz). This would free a lot of room on the FPGA, fit the
> JESD pipes, and would still be able to comfortably generate the signal
> above.
>
> This demonstrates that we can actually get very good high-data-rate
> two-tone signals for eight channels out of gateware that fits on
> currently available development boards.

Splendid! This leaves room for future features like PID.

-Joe
___
ARTIQ mailing list
https://ssl.serverraum.org/lists/listinfo/artiq
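The lane counts in this exchange can be cross-checked with a little arithmetic. The sketch below is only a back-of-the-envelope estimate: it assumes 16 bit samples, the 8b/10b line coding used by JESD204B, no further framing overhead, and the 10 Gb/s lane rate quoted above. The helper name `lanes_needed` is made up for illustration.

```python
import math

def lanes_needed(sample_rate_hz, bits_per_sample, channels, lane_rate_bps=10e9):
    """Return (line_rate_bps, lane_count) for a group of DAC channels.

    Assumes 8b/10b coding (10 line bits per 8 payload bits) and no other
    overhead; the default lane rate is the 10 Gb/s figure from the thread.
    """
    payload = sample_rate_hz * bits_per_sample * channels
    line_rate = payload * 10 / 8
    return line_rate, math.ceil(line_rate / lane_rate_bps)

one_channel = lanes_needed(1e9, 16, 1)           # one output at 1 GS/s
four_channels = lanes_needed(1e9, 16, 4)         # four outputs at 1 GS/s
four_channels_fast = lanes_needed(1.6e9, 16, 4)  # four outputs at 1.6 GS/s

print(one_channel, four_channels, four_channels_fast)
```

Under these assumptions one 1 GS/s output needs two lanes, four such outputs fill eight lanes, and four outputs at 1.6 GS/s would need more than eight, which matches the statements quoted above.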
Re: [ARTIQ] DSP gateware
Hi Dave,

On Mon, Aug 1, 2016 at 5:15 PM, Leibrandt, David R. (Fed) wrote:
> 1. I assume this logic would be followed by some sort of digital filter to
> remove the unwanted Nyquist images. Have you thought about how good of
> suppression you might be able to achieve, and at what FPGA resource and phase
> distortion cost?

That anti-aliasing filter would be a better interpolator between the summing of the f1/f2 oscillators and that data being fed into the f0 oscillator. That filter would suppress the images. Currently there is just a zeroth-order interpolator. I have played with designing a higher-order interpolator: for a CIC the math will survive up to second order, but most likely not for third and higher. Same for FIR.

> 2. Do you have an idea of the latency of the signal chain? Say I wanted to
> do a phase lock by feeding new p1 values into the RTIO. What sort of
> bandwidth could I achieve?

The p1 latency is about 37 cycles at 5 ns/cycle (about 185 ns): a few miscellaneous cycles here and there plus two CORDICs' worth of latency, each 16 bits + 3 guard bits. Currently I have the latencies of all components matched so that RTIO events on the different spline interpolators automatically arrive in the data stream time-aligned.

For local feedback in e.g. PID loops I would inject the feedback signal so that there is minimal latency. For feedback on the p0 term that would be around 20 cycles, also at 5 ns/cycle (about 100 ns). The u term can probably be as fast as one or two cycles, with the overall loop latency then limited by other things.

Robert.
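To put the cycle counts into perspective for the phase-lock question, here is a rough sketch that converts them to time and to a delay-limited bandwidth. It is not a statement about the actual loop: it assumes the gateware latency is the only delay in the loop and that this delay may cost 45 degrees of phase margin. Both are assumptions made for illustration, as is the helper name.

```python
CYCLE_S = 5e-9  # 5 ns per cycle, as quoted in the message above

def delay_limited_bandwidth(cycles, phase_budget_deg=45.0, cycle_s=CYCLE_S):
    """Return (delay_s, frequency_hz) for a pure delay of `cycles` cycles.

    A pure delay tau adds a phase lag of 360 * f * tau degrees, so the
    frequency at which it uses up the phase budget is budget / (360 * tau).
    The 45 degree budget is an assumed design choice, not from the thread.
    """
    delay = cycles * cycle_s
    return delay, phase_budget_deg / (360.0 * delay)

p1_delay, p1_bw = delay_limited_bandwidth(37)  # p1 path, about 37 cycles
p0_delay, p0_bw = delay_limited_bandwidth(20)  # p0 path, about 20 cycles

print(f"p1: {p1_delay * 1e9:.0f} ns, {p1_bw / 1e3:.0f} kHz")
print(f"p0: {p0_delay * 1e9:.0f} ns, {p0_bw / 1e3:.0f} kHz")
```

Any RTIO round trip, ADC, or external loop filter delay would add to the delay used here and lower the estimate accordingly.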
Re: [ARTIQ] DSP gateware
On Mon, Aug 1, 2016 at 3:59 PM, Jonathan Mizrahi wrote:
> I have one question, just out of curiosity: What is the motivation of
> linking two "buddy" channels in the way you described, with the b and c
> flags to turn these on and off? What application uses this feature?

A pair of buddy channels gives you all the features of two of those signal generators in an IQ signal stream (thus also twice the bandwidth and 3 dB more SNR -- if I am not wrong). That means four tones with two full-bandwidth oscillators.

The IQ stream is something that can be naturally fed into the DAC in question. Take a look at its block diagram in the datasheet. Only an IQ stream can be coarse modulated, shifted in frequency by the DAC's NCO, sinc-shaped, and easily fed to e.g. an analog IQ mixer to get you to another frequency window.

I just didn't call that pair of buddy channels an "IQ pair" because when coupled, each channel's signal generators feed both I and Q.

Robert.
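One way to see why each channel's generators feed both I and Q is to evaluate o = u + b*Re(z) + c*Im(z_buddy) for both channels with b = c = 1. The sketch below does that for one sample. Reading channel 0 as I and channel 1 as Q is my interpretation, and the tone parameters are arbitrary illustrative values.

```python
import cmath
import math

def tone(a, f, p, t):
    """Complex oscillator sample a*exp(i*(2*pi*f*t + p))."""
    return a * cmath.exp(1j * (2 * math.pi * f * t + p))

def output(u, b, c, z, z_buddy):
    """Channel output o = u + b*Re(z) + c*Im(z_buddy)."""
    return u + b * z.real + c * z_buddy.imag

# Arbitrary illustrative values in natural units (sample rate = 1).
t = 7
z0 = tone(0.3, 0.10, 0.2, t)    # channel 0 generator
z1 = tone(0.2, -0.15, 1.0, t)   # channel 1 (buddy) generator

# Both channels coupled: b = c = 1, no DC offset.
i_out = output(0.0, 1, 1, z0, z1)
q_out = output(0.0, 1, 1, z1, z0)

# Read as I + iQ, the pair carries z0 + i*conj(z1).
w = z0 + 1j * z1.conjugate()
print(abs(complex(i_out, q_out) - w))
```

The printed difference is at rounding level: with both switches on, the pair carries z0 plus a mirrored, rotated copy of z1 in a single IQ stream.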
Re: [ARTIQ] DSP gateware
Hi Robert,

This is a nice writeup. A couple questions for now:

1. I assume this logic would be followed by some sort of digital filter to remove the unwanted Nyquist images. Have you thought about how good of suppression you might be able to achieve, and at what FPGA resource and phase distortion cost?

2. Do you have an idea of the latency of the signal chain? Say I wanted to do a phase lock by feeding new p1 values into the RTIO. What sort of bandwidth could I achieve?

Thanks,
Dave

-Original Message-
From: Robert Jördens [mailto:r...@m-labs.hk]
Sent: Sunday, July 31, 2016 5:32 AM
To: artiq@lists.m-labs.hk; Jonathan Mizrahi; Sébastien Bourdeauducq; Joe Britton; Slichter, Daniel H. (Fed); Leibrandt, David R. (Fed); Allcock, David T. (IntlAssoc); Ken Brown
Subject: Re: DSP gateware
Re: [ARTIQ] DSP gateware
Hello,

to fuel the discussion and planning of the smart arbitrary waveform generator requirements for the different applications, I did another extended design study for the proposed ARTIQ/Sayma DSP gateware and signal flow, looking at actual signal quality, resource usage and possible parametrizations.

This time, take the following parametrization of a channel's output o:

z = (a1*exp(i*(f1*t+p1)) + a2*exp(i*(f2*t+p2))) * exp(i*(f0*t+p0))
o = u + b*Re(z) + c*Im(z_buddy)

* u and a are 16 bit cubic spline interpolators
* p are 16 bit constant (non-) interpolators
* f are 48 bit linear interpolators
* z_buddy refers to the (complex, IQ) z data coming from each channel's "buddy" channel, ignore it for now
* b, c are switches (with values 0 or 1) that allow a bunch of different configurations, ignore them for now
* all spline interpolators (u, a, f, p) sample at 200 MHz
* the f1/p1 and f2/p2 oscillators sample at 200 MHz and their data is fed to the f0/p0 oscillator without interpolation
* the f0/p0 oscillator samples at 8*200 MHz = 1.6 GHz
* data width is at least 16 bit everywhere

This setup can -- for example -- generate a two-tone signal at 162 MHz and 238 MHz by setting f0=157 MHz, f1=5 MHz, f2=81 MHz. The attached plot has the data and the spectrum from a bit-accurate simulation of the full FPGA gateware. Units are "natural" (sample rate=1, full scale=1): the relevant tones are close to 0.1 and 0.15 sample rate. Output amplitude is below clipping.

This is a bit-accurate representation of the data that would be sent to the DAC. Actual analog output would only differ by the DAC's interpolation, its analog output transfer function, and DAC noise. Don't be confused by the way the samples look: this is only due to the un-interpolated data from the f1/f2 oscillators. Same goes for the Nyquist images all around. A very rough and conservative estimate for wideband SNR is > 85 dB, not counting the images.

There are a lot of things that can be tweaked still; this demo is not supposed to show the optimum.

* 200 MHz is a bit under maximum achievable speed for this logic on a -2 speed grade Kintex 7.
* 1.6 GHz * 4 channels is more than we can push to a DAC. The design can obviously also run at 1 GHz (f1,f2 at 125 MHz, f0 at 1 GHz) which would just about fill eight JESD204B pipes.
* The design can also be built for 800 MHz with significantly lower resource usage (then running the f1,f2 NCOs at 200 MHz, f0 at 4*200 MHz = 800 MHz). This would free a lot of room on the FPGA, fit the JESD pipes, and would still be able to comfortably generate the signal above.
* DAC interpolation could be 2x if desired to get to 2 GHz or 1.6 GHz DAC sample rate depending on the choice of scenario.
* Eight channels of this 1.6 GHz design occupy about 62% of the LUTs of a xc7k325t (without _any_ other logic like everything related to the transceivers, ARTIQ, DRTIO, FIFOs...).
* Wrapping it in a minimal ARTIQ system brings the LUT resource usage to about 72%.
* On a xcku040 the utilization estimate (same gateware as for the 62% xc7k325t system) is below 51% (can't get a good number because of a Xilinx-Vivado bug).
* Take the LUT usage percentages with a grain of salt. They don't react kindly to extrapolation.
* Interpolation schemes for the f1/p1, f2/p2 oscillator data before it reaches the f0 oscillator might be interesting to look at.
* Spline knot behavior (ramping, switching, synchronization, latency matching, interpolation) for frequency, phase, amplitude is as expected (see e.g. the pdq2 documentation).

This demonstrates that we can actually get very good high-data-rate two-tone signals for eight channels out of gateware that fits on currently available development boards. The parametrization is intuitive and extremely flexible (you can e.g. rewire it at run-time to exploit and feed the full IQ datapaths of the DACs giving you twice the bandwidth on half the channels and all the other features in the DAC and downstream). Any set of spline interpolators can receive new knot data at the same time from their RTIO FIFOs: there is no contention. The design works just as well for driving electrodes (the u spline and maybe one of the oscillators to prod an ion). It is broadband (the f0/p0 oscillator covers the entire data bandwidth).

The gateware as-is could also feed two IQ pairs at 1 GHz, giving you full and instant broadband access with each pair to 1 GHz IQ baseband in the first Nyquist zone, which you can up-convert in analog RF to wherever you want. Or you can rethink it and feed it two IQ pairs at 600 MHz (4*150 MHz), use 4x interpolation and cover 2.4 GHz IQ baseband with each pair using the DAC's fine or coarse modulation schemes.

If there are questions about this, I'd be happy to answer them. We'd also be happy to generate a quote for an implementation and/or a hardware demonstrator system.

Regards,

--
Robert Jördens.

phaser_2fd7bfd.pdf
Description: Adobe PDF document
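For readers who want to play with the two-tone example, the following floating-point sketch mimics the signal flow: the f1/f2 oscillators are held for 8 output samples (zeroth-order interpolation) and then multiplied by the full-rate f0 oscillator. It is not the bit-accurate gateware simulation from the attachment. The amplitudes (0.4 each), zero phases, u = 0, b = 1, c = 0 and the function names are assumptions chosen for illustration, and frequencies are in Hz with an explicit 2*pi instead of natural units.

```python
import cmath
import math

FS = 1.6e9   # f0/p0 oscillator and output sample rate
HOLD = 8     # f1/f2 oscillators run at FS / 8 = 200 MHz
N = 1600     # 1 us of data, so single-bin DFTs land on a 1 MHz grid

def channel_samples(f0, f1, f2, a1, a2, n=N):
    """o = Re(z) for u = 0, b = 1, c = 0 and all phases zero."""
    out = []
    for k in range(n):
        t_fast = k / FS
        t_slow = (k // HOLD) * HOLD / FS  # zeroth-order hold of slow oscillators
        slow = (a1 * cmath.exp(2j * math.pi * f1 * t_slow)
                + a2 * cmath.exp(2j * math.pi * f2 * t_slow))
        z = slow * cmath.exp(2j * math.pi * f0 * t_fast)
        out.append(z.real)
    return out

def amplitude_at(samples, f):
    """Single-bin DFT, scaled so a real tone of amplitude A reads A."""
    acc = sum(s * cmath.exp(-2j * math.pi * f * k / FS)
              for k, s in enumerate(samples))
    return 2 * abs(acc) / len(samples)

o = channel_samples(f0=157e6, f1=5e6, f2=81e6, a1=0.4, a2=0.4)
for f_mhz in (162, 238, 38, 100):
    print(f_mhz, "MHz:", round(amplitude_at(o, f_mhz * 1e6), 3))
```

The two wanted tones show up at 162 MHz and 238 MHz, a hold image appears at 38 MHz (81 MHz - 200 MHz + 157 MHz), and a bin such as 100 MHz is empty. The 238 MHz tone comes out weaker than the 162 MHz one because the hold attenuates the 81 MHz oscillator more than the 5 MHz one, which is one reason a better interpolator ahead of the f0 oscillator is worth looking at.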
Re: [ARTIQ] DSP gateware
On Fri, Apr 1, 2016 at 1:09 AM, Slichter, Daniel H. (Fed) wrote:
>> And a 16 ns pulse would be just about 20 samples. Why would you want to
>> describe that using ~4 spline knots each being maybe 16 times 16 bits in
>> data.
>> If you need the full bandwidth, the idea of compression using splines is not
>> very helpful. In that case you would need to design in a little "real" AWG
>> player that plays snippets from a wide BRAM.
>
> Sure, that is a better solution for these kinds of things. I am just saying
> that unless we have some suitable feature like this, no superconducting
> people will be interested in the system. So we should design things in such
> a way that this is a possibility, to maximize the target audience.

Sure. If people want "real" sample-based AWG, it should go into the specification. I was just pointing out that trying to do it with spline interpolators is not particularly bright.
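The data-volume argument in the quoted text can be made concrete with the rough figures given there. This is only the arithmetic behind the remark, using the 1.25 GS/s rate from the earlier proposal; the knot size is the "maybe 16 times 16 bits" estimate, not a measured value.

```python
SAMPLE_RATE = 1.25e9    # 1.25 GS/s output rate from the proposal
PULSE_S = 16e-9         # 16 ns pulse
BITS_PER_SAMPLE = 16

samples = round(PULSE_S * SAMPLE_RATE)   # raw samples in the pulse
raw_bits = samples * BITS_PER_SAMPLE     # cost of storing plain samples

KNOTS = 4                   # "~4 spline knots"
BITS_PER_KNOT = 16 * 16     # "maybe 16 times 16 bits" per knot
spline_bits = KNOTS * BITS_PER_KNOT      # cost of the spline description

print(samples, raw_bits, spline_bits, spline_bits / raw_bits)
```

With these numbers the spline description of such a short pulse is about three times larger than the raw samples, which is the point being made: splines only compress when the waveform is slow compared to the sample rate.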
Re: [ARTIQ] DSP gateware
> to allow for FPGA selection and to rush the funding I have done a design
> study and implemented a basic DSP output channel for the ARTIQ DSP
> hardware. A 1.25 GS/s, 16 bit, "smart" channel pair would do
>
> o0 = u0 + i0 * a0 * cos(f0 * t + p0) + q1 * a1 * sin(f1 * t + p1)
> o1 = u1 + q0 * a0 * sin(f0 * t + p0) + i1 * a1 * cos(f1 * t + p1)
>
> * u and a are 16 bit cubic spline interpolators
> * p are 16 bit constant (non-) interpolators
> * f are 48 bit linear interpolators
> * i and q are switches (0 or 1) that allow many different configurations,
> among them single tone independent, two-tone, single tone iq, and
> two-tone iq, all with independent dc offsets
> * the interpolators interpolate at 1/8 output rate, the DUCs output at full
> rate (effectively).
> * all designed for 16 bit spline knot duration resolution and scalable spline
> interpolation clock

This looks like a good general purpose method for defining signals, which accommodates most possible use cases in a clean and concise way. A few potential comments:

- For superconducting qubit applications, it would be necessary to update the u and a interpolators at the full output rate (~1.25 GSPS), since pulses are often only 10-20 ns long and require nontrivial shaping over those periods of time (sometimes 2 envelope oscillations up and down, see e.g. http://arxiv.org/pdf/1405.0450v2.pdf on "wah-wah" pulses, which are commonly used). In general, having u and a only updated at 1/8 clock rate will give rise to spurs at 1/4 of the Nyquist frequency and harmonics, which is undesirable for any application. Perhaps I am misunderstanding what you mean by the interpolators running at 1/8 output rate.

> This uses about 28 kLUT, 14% of a xc7k325t. The timing, parsing, serial link,
> rtlink, drtio, jesd phy, gearbox, monitoring, digital servo, adc logic will
> probably add another 10-20 kLUT per channel pair but this is the dominant
> chunk.
>
> This looks good for the xc7a200t or a xc7k325t as the building block and 4
> channels (two smart channel pairs).

Will changing the update rate for the spline interpolators make things much larger? I assume they would have to be physically parallelized.
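To make the role of the i and q switches in the quoted parametrization concrete, here is a small floating-point sketch of one channel pair. It evaluates the two formulas as written, with frequencies in Hz and an explicit 2*pi; the function name, argument layout and numeric values are made up for illustration.

```python
import math

def channel_pair(t, u, a, f, p, i, q):
    """Return (o0, o1) for the quoted formulas.

    Every argument after t is a pair indexed 0 and 1; i and q hold the
    0/1 switches. Frequencies are in Hz, phases in radians.
    """
    ph0 = 2 * math.pi * f[0] * t + p[0]
    ph1 = 2 * math.pi * f[1] * t + p[1]
    o0 = u[0] + i[0] * a[0] * math.cos(ph0) + q[1] * a[1] * math.sin(ph1)
    o1 = u[1] + q[0] * a[0] * math.sin(ph0) + i[1] * a[1] * math.cos(ph1)
    return o0, o1

# Arbitrary illustrative settings.
common = dict(t=3e-9, u=(0.0, 0.0), a=(0.5, 0.25),
              f=(100e6, 30e6), p=(0.1, 0.7))

independent = channel_pair(i=(1, 1), q=(0, 0), **common)  # one tone per output
single_iq = channel_pair(i=(1, 0), q=(1, 0), **common)    # tone 0 as I/Q on o0/o1
two_tone_iq = channel_pair(i=(1, 1), q=(1, 1), **common)  # both tones on both outputs

print(independent, single_iq, two_tone_iq)
```

In the single-tone IQ setting the two outputs are the cosine and sine of the same oscillator, so o0**2 + o1**2 equals a0**2 at every sample.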
Re: [ARTIQ] DSP gateware
Yes, but for such speed you don't need to match better than several mm.

Greg

On 31 March 2016 at 13:51, Robert Jördens wrote:
> On Thu, Mar 31, 2016 at 8:51 AM, Florent Kermarrec wrote:
> > When choosing between Artix7 or Kintex7 you also have to consider that
> > Artix7 only have HR IOs which mean they don't have ODELAYE2 primitives and
> > we are currently using them in the actual DDR PHY for leveling.
> >
> > Also when choosing XC7A200T you will stuck to this FPGA on your board
> > because the package is different from others Artix7. With Kintex7, in the
> > slices range you are targeting, you will have more flexibility:
> > - FBG676 (8 transceivers): from XC7K70T to XC7K410T
> > - FFG676 (8 transceivers): from XC7K160T to XC7K410T
> > - FFG900 (16 transceivers): from XC7K325T to XC7K410T
>
> ACK. Good to know about the IODELAY in Artix.
> I guess the alignment on e.g. http://www.ohwr.org/projects/afc/wiki is
> done by trace length matching then, right?
>
> Robert.
Re: [ARTIQ] DSP gateware
Hello,

When choosing between Artix7 and Kintex7 you also have to consider that Artix7 only has HR I/Os, which means it has no ODELAYE2 primitives, and we are currently using those in the DDR PHY for leveling.

Also, when choosing the XC7A200T you will be stuck with this FPGA on your board because its package is different from the other Artix7 parts. With Kintex7, in the slice range you are targeting, you will have more flexibility:
- FBG676 (8 transceivers): from XC7K70T to XC7K410T
- FFG676 (8 transceivers): from XC7K160T to XC7K410T
- FFG900 (16 transceivers): from XC7K325T to XC7K410T

Florent

2016-03-30 21:49 GMT+02:00 Robert Jördens:
> Hello,
>
> to allow for FPGA selection and to rush the funding I have done a
> design study and implemented a basic DSP output channel for the ARTIQ
> DSP hardware. A 1.25 GS/s, 16 bit, "smart" channel pair would do
>
> o0 = u0 + i0 * a0 * cos(f0 * t + p0) + q1 * a1 * sin(f1 * t + p1)
> o1 = u1 + q0 * a0 * sin(f0 * t + p0) + i1 * a1 * cos(f1 * t + p1)
>
> * u and a are 16 bit cubic spline interpolators
> * p are 16 bit constant (non-) interpolators
> * f are 48 bit linear interpolators
> * i and q are switches (0 or 1) that allow many different
> configurations, among them single tone independent, two-tone, single
> tone iq, and two-tone iq, all with independent dc offsets
> * the interpolators interpolate at 1/8 output rate, the DUCs output at
> full rate (effectively).
> * all designed for 16 bit spline knot duration resolution and scalable
> spline interpolation clock
>
> This uses about 28 kLUT, 14% of a xc7k325t. The timing, parsing,
> serial link, rtlink, drtio, jesd phy, gearbox, monitoring, digital
> servo, adc logic will probably add another 10-20 kLUT per channel pair
> but this is the dominant chunk.
>
> This looks good for the xc7a200t or a xc7k325t as the building block
> and 4 channels (two smart channel pairs).
>
> I haven't implemented, benchmarked, or tested the latest X suggestions
> and design tweaks from article Y in journal Z.
>
> Robert.
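As a rough aid for the FPGA choice, the quoted LUT figures can be turned into a utilization range for two smart channel pairs. The device LUT totals below are taken from the Xilinx 7 series product tables and are not stated anywhere in the thread; as noted elsewhere in this discussion, such numbers do not extrapolate well, so treat the output as an order-of-magnitude check only.

```python
# 6-input LUT counts per device (assumed from the Xilinx 7 series
# product tables, not from the thread).
LUTS = {"xc7k325t": 203_800, "xc7a200t": 134_600}

DSP_KLUT = 28           # DSP logic per smart channel pair, from the message
EXTRA_KLUT = (10, 20)   # estimated additional logic per channel pair
PAIRS = 2               # 4 channels = two smart channel pairs

def utilization(device):
    """Return (low, high) LUT utilization fractions for PAIRS channel pairs."""
    total = LUTS[device]
    low = PAIRS * (DSP_KLUT + EXTRA_KLUT[0]) * 1000 / total
    high = PAIRS * (DSP_KLUT + EXTRA_KLUT[1]) * 1000 / total
    return low, high

for device in LUTS:
    low, high = utilization(device)
    print(f"{device}: {low:.0%} to {high:.0%}")
```

The same table reproduces the quoted "14% of a xc7k325t" for a single 28 kLUT channel pair, which is a useful sanity check on the assumed device sizes.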