Hi, Richard,

On Nov 3, 2014, at 11:47 AM, Richard Black wrote:

> So, it's been a little while now, but not much has changed yet. We've gotten 
> Chipscope working, and, so far, there aren't any red flags with the FPGA 
> firmware 10-GbE control signals.

That's good to know, although maybe in some way it would have been nice if you 
had found some red flags.

> We also confirmed that the bitstream we are using is in fact 
> roach2_fengine_2013_Oct_14_1756.bof.gz, so that is unfortunately not the 
> problem.

At least you are using a known good BOF file, so that eliminates a source of 
potential errors.

> I also took a look at the ROACH2 PPC setup: we pulled from the .git 
> repository on February 12, 2014 (commit number = 
> e14df9016c3b7ccba62cc6d0cae05405f4929c94). There haven't been any changes to 
> that repository since August 2013, so unless the SKA-SA ROACH-2s are using a 
> pull from before then, I don't think that is our issue.

We use our own homegrown NFS root filesystem for the ROACH2s, so I can't 
comment on the status of the one you refer to 
(https://github.com/ska-sa/roach2_nfs_uboot.git).  I am more interested in the 
U-Boot version you have (see https://github.com/ska-sa/roach2_uboot.git) and 
which version of the ROACH2 CPLD image you are using (not sure where to get 
this).  I think these are unlikely to be problematic, but we've already checked 
all the likely problems.

> We also tried out Jason Manley's suggestion of delaying the enabling of the 
> 10-GbE cores to ensure that the sync pulse propagated through the entire 
> system before buffering up data, but the problem persisted.

Do you have an external 1 PPS sync pulse connected or have you tried the latest 
rb-papergpu software that supports a software-generated "sync"?  The 
paper_feng_init.rb script already disables the data flow to the 10 GbE cores 
until the sync pulse has propagated through and the cores have been taken out 
of reset.

Does the latest rb-papergpu code show that the ADC clocks (MMCMs) are locked?  
Does it estimate the clock frequency correctly?  Does adc16_dump_chans.rb show 
samples that correspond correctly to the analog inputs (e.g. a CW tone)?

> Just to rule it out, I double-checked (or more accurately triple-checked) the 
> U72 part, and, sure enough, it is the correct oscillator, model number 
> EEG-2121.

Does it have the "L" suffix on the "100.000L" frequency part of the chip 
markings?

On a related note, as I sent off-list to you and Peter earlier today:  The fact 
that Peter can send small packets at 200 MHz without overflow, but large 
packets give overflow is very interesting and puzzling.  I assume that the 
smaller packets are just fewer channels of the same length spectrum and that 
the number of packets per second remains the same (I think we discussed this 
previously).  In that case, the small packets reduce the data rate, which 
suggests that the 156.25 MHz "xaui_ref_clk" clock is maybe not really 156.25 
MHz but something somewhat slower.  This clock is driven by the oscillator at 
U56 and the clock splitter at U54 (see attached schematic snippet).  Can you 
please inspect those parts on your board(s)?  I will be able to inspect a 
ROACH2 this afternoon and report what I have on a known working system.
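
To make the arithmetic behind that suspicion concrete, here is a rough Python sketch. It assumes the usable 10 GbE throughput scales linearly with xaui_ref_clk (10 Gb/s at the nominal 156.25 MHz) and ignores Ethernet/IP/UDP header overhead. The packet rate, the payload sizes, and the "slow" 125 MHz clock are made-up illustrative values, not measurements from any ROACH2, and the function names are hypothetical.

```python
NOMINAL_REF_CLK_HZ = 156.25e6   # nominal xaui_ref_clk frequency
NOMINAL_CAPACITY_BPS = 10e9     # nominal 10 GbE payload rate at that clock


def link_capacity_bps(ref_clk_hz):
    """Assumption: link capacity scales linearly with the reference clock."""
    return NOMINAL_CAPACITY_BPS * (ref_clk_hz / NOMINAL_REF_CLK_HZ)


def offered_rate_bps(packets_per_sec, payload_bytes):
    """Payload bits per second; header overhead is deliberately ignored."""
    return packets_per_sec * payload_bytes * 8


def overflows(packets_per_sec, payload_bytes, ref_clk_hz):
    """True if the offered payload rate exceeds the assumed link capacity."""
    return offered_rate_bps(packets_per_sec, payload_bytes) > link_capacity_bps(ref_clk_hz)


if __name__ == "__main__":
    packets_per_sec = 140_000              # hypothetical, same for both sizes
    for payload_bytes in (4096, 8192):     # hypothetical small and large packets
        for ref_clk_hz in (156.25e6, 125e6):   # nominal vs. hypothetical slow clock
            print(payload_bytes, ref_clk_hz,
                  overflows(packets_per_sec, payload_bytes, ref_clk_hz))
```

With these illustrative numbers, only the large packets on the slow clock exceed the link capacity, which is the pattern that would match small packets working and large packets overflowing.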

On one of our ROACH2s U56 is labeled like this:

EEG-2121
156.250L
OGPN1Z5C

Again, note the "L" suffix.  I think that signifies "LVDS", which is what is 
expected/required for the ROACH2.  That's very important.  I am not 100% sure 
about my transcription of the third line; it could have typos.

> There is another possibility, albeit an unlikely problem: we currently have 
> the ROACH-2 board booting off another PC (i.e. not the same PC that the ruby 
> control scripts are running on). I can't imagine that this is the problem, 
> but I'm planning on trying to consolidate the NFS and ruby scripts onto a 
> single PC to rule it out.

The scripts communicate with the ROACH2 over the network via KATCP.  There is 
no requirement that the scripts be running on the same server that is providing 
the NFS root filesystem to the ROACH2s.

> So I suppose at this point, my questions are:
> 
> (1) What version of the roach2_nfs_uboot .git repository are SKA-SA using?

I don't know.

> (2) Is SKA-SA using the same PCs for ROACH-2 net boots and file systems as 
> the ruby control scripts?

I doubt SKA-SA is using ruby, but as stated above the ruby scripts can be run 
on any system that can reach the ROACH2 via KATCP.

> (3) Are there any additional steps that need to be taken when installing the 
> Quad SFP+ mezzanine cards onto the ROACH-2 board? Are there potentially some 
> drivers or configuration steps that are needed to make sure they function 
> properly? As I recall, when we got the boards, we didn't do anything special 
> with the cards outside of simply plugging them in.

Just plugging them in is all that is necessary.  There is a slight complication 
in that the standoffs might not be exactly the right height and some washers 
need to be added to keep the mezzanine card parallel to the main board so that 
the mezzanine connector mates securely.  It's also important to make sure the 
connectors are properly seated vis a vis the shield, which I am told can be a 
little "flappy".

Hope this helps,
Dave
