> My understanding is that it comes with a language of its own. My
> impression is that it is an icon based language. Kind of like connect the
> blocks into a flow chart of some sort and hit "GO".

I'll try to give a cursory explanation of what you can do with FPGAs.
Bear in mind, I haven't done FPGA work in nearly 5 years, but not much
should have changed.

I'm sure we all understand boolean logic...and, or, not, etc. (as well as
their counterparts nand, nor, xor, etc).

You can do all sorts of wonderful logic designs using handfuls of "74"
chips, the basic building blocks of logic design.  You can even design
memory cells using latches and the sort.
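To make that concrete, here's a toy sketch in Python (not how you'd
actually describe hardware, just the idea): every one of those gates can
be built from NAND alone, which is why a handful of 7400s goes such a
long way.

```python
# Toy sketch: building the usual gates out of NAND alone,
# the same way you'd wire up 74-series chips (a 7400 is four NANDs).
def nand(a, b):
    return 0 if (a and b) else 1

def inv(a):          # NOT from a single NAND with tied inputs
    return nand(a, a)

def and_(a, b):      # AND = NAND followed by NOT
    return inv(nand(a, b))

def or_(a, b):       # OR via De Morgan: a+b = !(!a * !b)
    return nand(inv(a), inv(b))

def xor_(a, b):      # XOR from four NANDs
    t = nand(a, b)
    return nand(nand(a, t), nand(b, t))
```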

An FPGA is essentially a WHOLE bunch of logic gates on a single chip, and
furthermore, they are programmable.  You can talk about FPGAs in terms of
the number of gates and the number of pads (I/O connections, for instance)
and such.  These gates are grouped into CLBs (configurable logic blocks).
On a chip with 1M gates, you maybe have around 27-28 thousand of these
CLBs.  Each CLB can be configured as basically any kind of logic gate
you want.  Configure it as a nand gate, or an xor, or whatever.
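The way a CLB pulls off that "any gate you want" trick, as I understand
it, is that at its core it's a small lookup table (LUT): a few stored
bits indexed by the inputs.  Load different bits, get a different gate.
A quick sketch:

```python
# A 2-input LUT is just 4 stored bits, indexed by the inputs.
# "Programming" the block means loading a different truth table
# into the same hardware.
def make_lut2(truth_table):          # truth_table[(a << 1) | b]
    def gate(a, b):
        return truth_table[(a << 1) | b]
    return gate

nand = make_lut2([1, 1, 1, 0])       # configure it as NAND...
xor  = make_lut2([0, 1, 1, 0])       # ...or as XOR, same block
```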

Besides the CLBs and I/O stuff, you also need to worry about routing.  On
an FPGA, there are all these CLBs, but there are limited ways to route
signals between them.

One problem I used to have (on occasion) was that my design would work great
on a simulator, but when it came time to actually route the design, I'd find
that I'd run out of routes.  I might have to "move" certain functions to
other parts of the chip that had more interconnects available.  Other
designs like PLDs have n-way routing, so each block can connect to any
other one...that simplifies the routing, but there are generally fewer
blocks available.  Hmmm...

What you can do is configure some of the gates as memory cells (J-K or S-R
latches), or use external RAM (which I did in my design...we had Xilinx
chips with very few blocks) to hold data.  Then configure the rest of the
blocks as your logic.
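The memory-cell part is just feedback: cross-couple two gates and the
outputs depend on their own previous values.  Here's a toy S-R latch
out of two NOR gates (simulated by iterating until the pair settles):

```python
# Toy S-R latch from two cross-coupled NOR gates.  The feedback loop
# is exactly what makes it remember a bit.
def nor(a, b):
    return 0 if (a or b) else 1

def sr_latch_step(s, r, q, qn):
    # let the cross-coupled pair settle (two passes suffice here)
    for _ in range(2):
        q, qn = nor(r, qn), nor(s, q)
    return q, qn

q, qn = 0, 1
q, qn = sr_latch_step(1, 0, q, qn)   # set:  q goes to 1
q, qn = sr_latch_step(0, 0, q, qn)   # hold: q stays 1 with no inputs
```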

Another part of the routing process is determining the load on "bus"
signals, like the system clock for instance.  Too many devices running off
the clock means your waveform will distort and you'll wind up with a bad
signal.

This routing part can take a while...be prepared to go grab a snack or two
while it churns out the design.  Programming of the device is done serially,
but it's generally not too much of a wait for that part.  Then when you
first run it, non-simulated, you get to find out if you screwed up any
timings (registers loading in the wrong order, bus collisions, etc.).  VERY
fun stuff! :-)

When doing FPGA design, you need to work out your timing signals and
design your own registers.  It's generally a good idea to build a state
machine for the timings (kind of the master control), and be sure to
Gray-code everything in the timings to prevent race conditions, etc.
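The point of the Gray coding is that consecutive states differ in
exactly one bit, so the state register can't glitch through a bogus
intermediate value while several bits are changing at once.  The
standard binary-to-Gray conversion is a one-liner:

```python
# Gray code: consecutive values differ in exactly one bit, so a
# state register never passes through a bogus intermediate state.
def to_gray(n):
    return n ^ (n >> 1)

states = [to_gray(i) for i in range(8)]   # 0,1,3,2,6,7,5,4
# sanity check: each transition flips exactly one bit
assert all(bin(a ^ b).count("1") == 1
           for a, b in zip(states, states[1:]))
```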

It really isn't easy, but you can do AMAZING things.  If you wanted to build
a 256 bit multiplier, well, it could be done.  The real trick is to KNOW how
to optimize logic tables and design.  Any Joe Schmoe can build an n-bit
multiplier, but getting it done optimally is something else entirely.
That's the tricky part, and there are all sorts of cool tricks (the same
ones that programmers would know about in their pseudo-code) for getting it
done.  The fun part is that since you're designing the hardware, you can put
in your own clever "cheats" in hardware.  For a ROR, for instance, instead
of actually reading in each bit and moving it over to the right one, and
using a temporary bit to hold the extra along the way, you could just copy
the byte to another register with a 1-bit offset, then rename the registers,
making it faster (that's just a dumb example, but you get the idea).
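In software terms, the whole "rotate" is just a rewiring of the bits,
which is why it's nearly free in hardware.  A sketch of an 8-bit ROR:

```python
# Rotate-right on an 8-bit value: in software you might loop bit by
# bit, but in hardware it's just how you wire the bits to the next
# register -- no per-bit work at all.
def ror8(x, n):
    n %= 8
    return ((x >> n) | (x << (8 - n))) & 0xFF
```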

Doing something like an add in hardware is quite easy.  I'm still not
entirely sure how a "real" CPU does multiplies and divides so DARN FAST in
the hardware though...  I mean, I could do a multiply by simply doing
multiple adds, but that would be pretty slow.
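A step up from repeated addition is shift-and-add: one add per *set*
bit of the multiplier instead of one per unit of its value.  As I
recall, real CPUs go much further still, generating all the partial
products at once and summing them with parallel adder trees, but the
shift-and-add version shows the basic trick:

```python
# Shift-and-add multiply: one addition per set bit of b, instead of
# b repeated additions.  Each set bit of b contributes a shifted
# copy of a (a partial product) to the result.
def mul(a, b):
    result = 0
    while b:
        if b & 1:            # this partial product is non-zero
            result += a
        a <<= 1              # line a up with the next bit of b
        b >>= 1
    return result
```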

Anyone who has ever delved into advanced microprocessor designs probably
knows what I'm talking about.  Intel uses some pretty clever stuff to get
extra speed from their design, at the price of using more silicon.

It took me a good 2 months to design a 4 bit CPU (with 3 registers, A, B and
an accumulator) that could do only 8 instructions like add, sub, jmp, etc.
Sure, I was learning at the time, but it's complicated stuff!  (PS, I
cheated on some parts by using the bus as a "temporary" register.  It really
sped up a few parts without adding another register.  My prof. thought it
was clever, though he wasn't terribly crazy about it...I had to rearrange
some timings to avoid bus collisions and at that point, the examples he gave
in class no longer applied to my design.  Still, I'm bragging here, but my
design was faster than the others.)

I would still like to see just how hard it would be to design an n-bit
multiply with add or something in an FPGA...just send the data from the
computer to the FPGA and get the result back, maybe using the PCI bus.  I
sure wish I had my computers back so I could go over my old designs and
"refresh" my brain cells! :-)  The beauty is that once you have a design
figured out for a few bits, it's not terribly more complicated to widen
the data path.  And if it were a dedicated device that only did one
instruction, but did it fast, that would simplify the state machine design
greatly!

Aaron

________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
