On Wednesday, 16 September 2015 at 08:38:25 UTC, deadalnix wrote:
The energy comparison is bullshit. As long as you haven't loaded the data, you don't know how wide the values are. That means you either have to be pessimistic and load for the worst-case scenario, or do two round trips to memory.

That really depends on the memory layout and the algorithm. A likely implementation would be a co-processor that takes a unum stream and pipes it through a network of cores (a tile-based co-processor). The internal buses between cores are very fast, and with 256+ cores you get tremendous throughput. But you need a good compiler, good libraries, and software support.
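
To make that concrete, here is a rough sketch in C of how a streaming decoder can sidestep the worst-case-load problem. The stream layout (a 1-byte size tag before each value, 16-byte maximum payload) is my own assumption for illustration, not Gustafson's format: the decoder reads the tag first and then fetches exactly the bits the next value needs, instead of loading for the worst case.

#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stream layout: each unum is preceded by a 1-byte
   length tag giving its payload size in bytes (1..16). */
typedef struct {
    uint8_t len;        /* payload size in bytes */
    uint8_t bits[16];   /* variable-width payload, worst case 16 bytes */
} unum_rec;

/* Walk the stream, copying each record out; returns the number of values read. */
size_t decode_stream(const uint8_t *stream, size_t nbytes,
                     unum_rec *out, size_t max_out)
{
    size_t pos = 0, n = 0;
    while (pos < nbytes && n < max_out) {
        uint8_t len = stream[pos++];          /* read the size tag first */
        if (len == 0 || len > 16 || pos + len > nbytes)
            break;                            /* malformed or truncated stream */
        out[n].len = len;
        memcpy(out[n].bits, stream + pos, len);
        pos += len;                           /* advance by the actual width */
        n++;
    }
    return n;
}

On a tile-based co-processor the same idea would apply per core: the tag tells each stage how many bits to forward, so the inter-core buses only carry the bits that are actually needed.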

The hardware is likely to be slower as you'll need way more wiring than for regular floats, and wire is not only cost, but also time.

You need more transistors per ALU, but being slower does not matter if the algorithm needs bounded accuracy or if it converges more quickly with unums. The key challenge for him is to create a market, meaning getting the semantics into scientific software and getting initial workable implementations out to scientists.
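
As a rough illustration of the bounded-accuracy point (ordinary double-precision intervals standing in for unum bounds; this is only an analogy, not unum arithmetic): the loop below stops the moment the enclosing interval is tight enough, because the error is carried explicitly rather than estimated after the fact.

#include <stdio.h>

/* Stand-in for a unum-style bound: a closed interval known to contain
   the true result.  Bisection on f(x) = x*x - 2 encloses sqrt(2). */
typedef struct { double lo, hi; } interval;

static double f(double x) { return x * x - 2.0; }

int main(void)
{
    interval r = { 1.0, 2.0 };          /* initial bound containing sqrt(2) */
    double tol = 1e-6;                  /* required width of the final bound */
    int steps = 0;

    while (r.hi - r.lo > tol) {         /* stop as soon as the bound is tight enough */
        double mid = 0.5 * (r.lo + r.hi);
        if (f(mid) < 0.0) r.lo = mid;   /* root lies in the upper half */
        else              r.hi = mid;   /* root lies in the lower half */
        steps++;
    }
    printf("sqrt(2) is in [%.8f, %.8f] after %d steps\n", r.lo, r.hi, steps);
    return 0;
}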

If there is market demand, then there will be products. But you need to create the market first. Hence he wrote an easy-to-read book on the topic and supports people who want to implement it.
