>> I wish I knew of a mailing list where I could get a definitive answer
>> on "modern problems with async circuits", or an update on the kind of
>> techniques the new AI chips were using to keep their power consumption
>> so low. I'll keep googling.
> 
> I’d be interested in knowing this as well. This gives some examples of async 
> circuits: 
> https://web.stanford.edu/class/archive/ee/ee371/ee371.1066/lectures/lect_12.pdf
> 
> Page 43, “Bottom Line” mentions that asynchronous design has “some delay 
> matching / overhead issues”. Apparently delay matching means getting the 
> signal outputs on two separate paths to arrive at the same time(?) Presumably 
> overhead refers to the 2x space on the die previously mentioned, for 
> completion detection. Pages 23-25 on “data-bundling constraints” might also 
> highlight some other challenges. Some more current material would be 
> interesting though...

The area overhead is at least partly mitigated by the major advantage of not 
having to distribute and gate a coherent clock signal across the entire chip.  
I half-remember seeing a quote that distributing the clock represents about 30% 
of the area and/or power consumption of a modern deep-sub-micron design.  This 
is area and power that is not directly contributing to functionality.

Generally there are two major styles of asynchronous logic:

1: Standard combinational logic stages accompanied by self-timing circuits with 
a matched delay, generally known as "bundled data".  This style has little 
overhead (probably less than the clock distribution it replaces) but requires 
local timing closure (the timing circuit must have strictly *more* delay than 
the logic it accompanies) to assure correct functionality.  I suspect that 
achieving local timing closure is easier than the global timing closure 
required by conventional synchronous logic.

2: Dual-rail QDI (quasi-delay-insensitive) logic, in which completion is 
explicitly signalled by the arrival of a result.  This almost completely 
eliminates timing closure from the logic correctness equation, but the area 
overhead can be substantial.  Achieving maximum performance in this style can 
also be challenging, but suitable approaches do exist, e.g.:

        https://brej.org/papers/mapld.pdf
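
To make "completion is explicitly signalled by the arrival of a result" a bit 
more concrete, here is a toy Python model of dual-rail encoding.  It is only a 
sketch and is not taken from the paper above; the names (NULL, encode, 
complete, decode) are made up for illustration.  Each bit uses two wires, and 
a word counts as complete once every pair has exactly one wire high.

```python
# Toy model of dual-rail encoding. All names are illustrative assumptions.
NULL = (0, 0)  # spacer state: no data has arrived on this wire pair yet


def encode(bit):
    """Encode one bit as a (true_rail, false_rail) pair."""
    return (1, 0) if bit else (0, 1)


def is_valid(pair):
    """A pair carries data when exactly one of its two rails is high."""
    return pair in ((1, 0), (0, 1))


def complete(word):
    """Completion detection: every bit position holds valid data."""
    return all(is_valid(pair) for pair in word)


def decode(word):
    """Read the bits back; only meaningful once the word is complete."""
    if not complete(word):
        raise ValueError("word is not complete yet")
    return [pair[0] for pair in word]


# Bits may arrive in any order and at any time; the receiver simply waits.
word = [NULL, NULL, NULL]
word[0] = encode(1)
word[2] = encode(0)
still_waiting = not complete(word)  # middle bit has not arrived
word[1] = encode(1)
result = decode(word)
```

The doubled wiring plus the logic behind complete() is where the area overhead 
comes from, and the absence of any delay assumption in complete() is why timing 
closure mostly drops out of the correctness argument.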

Both styles can inherently adapt timings to thermal and voltage conditions 
within a design range without much explicit provisioning, and typically have 
much cleaner power load and EMI characteristics than synchronous logic.  But as 
you can see from the above, the downsides typically associated with async logic 
tend to apply to one or the other of the styles, not to both at once.
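
A very rough way to see the adaptation, using the bundled-data style as the 
example: the matched delay sits on the same die as the logic, so it slows down 
and speeds up with it.  The sketch below is a toy model under loud assumptions: 
the delay figures are hypothetical, and the matched delay is assumed to track 
the logic delay perfectly, which real delay lines only approximate.  A fixed 
clock period is included purely for contrast.

```python
# Toy model: hypothetical numbers, idealised tracking between logic and delay.
NOMINAL_LOGIC_NS = 1.0    # assumed logic delay at nominal conditions
NOMINAL_MATCHED_NS = 1.2  # assumed matched delay, with 20% margin
FIXED_CLOCK_NS = 1.2      # assumed fixed clock period, for contrast only


def stage_status(scale):
    """Return (bundled_ok, fixed_clock_ok) when all on-die delays are
    multiplied by `scale` (e.g. > 1 for low voltage or high temperature)."""
    logic = NOMINAL_LOGIC_NS * scale
    matched = NOMINAL_MATCHED_NS * scale  # tracks the logic on the same die
    bundled_ok = matched > logic          # local constraint from style 1
    fixed_clock_ok = FIXED_CLOCK_NS >= logic
    return bundled_ok, fixed_clock_ok


for scale in (0.8, 1.0, 1.5):
    print(scale, stage_status(scale))
```

In this model the bundled-data constraint holds at every scale factor, since 
both sides of the comparison move together, while the fixed clock stops being 
long enough once the logic has slowed past its period.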

 - Jonathan Morton

_______________________________________________
Bloat mailing list
Bloat@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/bloat
