Hi Jack,
thanks for your help.

>> You could always add your own constraint file using the "UCF" block.

I have to admit, I’m not aware of a UCF block. Is that a Simulink block? We are using Simulink R2013a, and I couldn’t find it in the browser.

Thanks, cheers
Guenter

From: Jack Hickish [mailto:jackhick...@gmail.com]
Sent: Thursday, 22 September 2016 08:00
To: Guenter Knittel; casper list
Subject: Re: [casper] FFT speed optimizations

On Wed, 21 Sep 2016 at 04:47 Guenter Knittel <gknit...@mpifr-bonn.mpg.de> wrote:

Hi all,

I’m still busy trying to speed-optimize our DSP pipeline, and it’s amazing how sophisticated the tools are in picking ever new signals to let them miss timing. But a few of them occur frequently, often inside fft_wideband_real, and so I’m hoping that somebody can give me a hint.

- The first is the shift-signal (or signals) that go into fft_direct and then into the butterfly[m_n] blocks. These signals have a high fan-out, which is accounted for in butterfly0_0 and butterfly1_[0:1] by using a register tree, but not so in the later stages. But why? The fan-out is just the same. So these signals miss the deadline. That doesn’t make any sense at all, since normally one would define a multi-cycle path and the problem would be solved while using only one single register. So I looked it up on the web but couldn’t find a way of defining slow signals in the Simulink model.

You could always add your own constraint file using the "UCF" block. This would allow you to ignore timing on the paths from the fftshift software register (I assume this is what you're using as a driver) to the multiplexers they control. In most designs the fftshift input is basically static anyway.

Good luck!
Jack

- The second problem can also be found often and occurs for example in fft_wideband_real/fft_direct/butterfly3_x/twiddle between coeff_gen and bus_mult. From the MUX to the (complex) multiplier I count 4 pipeline stages of delay, but in the device diagram in PlanAhead I can only see two. On the other hand, the multipliers appear to use more registers than is indicated in the Simulink diagram. Can it be that the tools have moved some delay stages into the macrocell to save resources, but defeating their purpose? If so, can I do something about it?

Hope I’m making sense, any help appreciated.

Cheers
Guenter
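
For reference, the timing-ignore idea from Jack's reply might look roughly like the UCF fragment below. This is only a sketch under assumptions: the instance pattern `*fft_shift*`, the group name `TNM_fft_shift`, and the constraint names `TS_fft_shift_tig` and `TS_sys_clk` are hypothetical placeholders, not names taken from this design. The real names have to be looked up in the synthesized netlist.

```
# Hypothetical names throughout; replace with those found in your netlist.
# Group the registers that drive the FFT shift signals.
INST "*fft_shift*" TNM = "TNM_fft_shift";

# Ignore timing on paths from that group to all flip-flops (FFS is a
# predefined UCF group), on the assumption that the shift input is
# effectively static.
TIMESPEC "TS_fft_shift_tig" = FROM "TNM_fft_shift" TO "FFS" TIG;

# Alternative: relax the paths to two clock periods instead (multi-cycle).
# Assumes a period constraint named TS_sys_clk already exists.
# TIMESPEC "TS_fft_shift_mc" = FROM "TNM_fft_shift" TO "FFS" TS_sys_clk * 2;
```

TIG removes the paths from timing analysis entirely, so it is only appropriate if the shift value really does not change while data is flowing. The commented multi-cycle form keeps a (relaxed) check in place.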