Hi all.

We were having problems with multiple sequential progdev calls failing on
our ROACH-2 systems.  We were testing multiple bof files in a loop, and
the ROACH would crash completely with a kernel panic and then reboot
itself.

After a great deal of concentrated debugging effort this afternoon by
Jack, David, Justin, Ryan, Arindam, Randy, and me, the cause of the
crashing upon multiple progdev calls was found.  It turned out to have
nothing to do with programming the chip; rather, it was a problem with
memory allocation by the operating system.  Jack found that the problem
could also be triggered by allocating a huge array in Python, which uses
lots of memory.

The problem was caused by the kernel thinking that the ROACH has 768 MB of
memory on board, when in fact it has only 512 MB.  The fix is to pass the
real amount of memory to the kernel in the bootargs.  The systems have
been mostly working for a long time (years!), so you may want to check
that your systems in fact know how much memory they have.  If you start up
top you can see what the kernel thinks, or you can look in /proc/meminfo.
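
If you want to script the check rather than eyeball top, here is a minimal
sketch in Python.  It only reads the MemTotal line of /proc/meminfo and
compares it against the 512 MB mentioned above.  The function names and the
EXPECTED_MB constant are made up for illustration, so adjust them for your
own boards.

```python
import re

# Physical memory fitted on the board, per the figure quoted in this email.
# Hypothetical constant: change it if your hardware differs.
EXPECTED_MB = 512


def mem_total_mb(meminfo_text):
    """Return MemTotal from /proc/meminfo-style text, in MB.

    /proc/meminfo reports the value in kB, so divide by 1024.
    """
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("no MemTotal line found")
    return int(match.group(1)) / 1024.0


def kernel_overestimates(meminfo_text, expected_mb=EXPECTED_MB):
    """True if the kernel reports more memory than is physically fitted."""
    return mem_total_mb(meminfo_text) > expected_mb


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        text = f.read()
    print("MemTotal: %.0f MB" % mem_total_mb(text))
    if kernel_overestimates(text):
        print("Kernel sees more than %d MB; check the bootargs." % EXPECTED_MB)
```

MemTotal is normally a little below the physical size because the kernel
reserves some memory for itself, so a value above 512 MB is the thing to
look for.  On Linux the usual way to cap memory from the command line is the
mem= kernel parameter (for example mem=512M), but the exact bootargs line
depends on your boot setup and is not spelled out here.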

John




