Re: [bitcoin-dev] Libre/Open blockchain / cryptographic ASICs

2021-02-13 Thread ZmnSCPxj via bitcoin-dev
Good morning Luke,

> > Another point to ponder is test modes.
> > In mass production you need test modes.
>
> > (Sure, an attacker can try targeted ESD at the `TESTMODE` flip-flop 
> > repeatedly, but this risks also flipping other scan flip-flops that contain 
> > the data that is being extracted, so this might be sufficient protection in 
> > practice.)
>
> if however the ASIC can be flipped into TESTMODE and yet it carries on
> otherwise working, an algorithm can be re-run and the exposed data
> will be clean.

But in most testmodes I have seen (and designed) all clocks are driven 
externally from a different pin (usually the serial interface) when in testmode.
If the CPU clock is now controlled by the attacker, how do you run any kind of 
algorithm?

(This could be an artifact of how my old design company designed testmodes, 
YMMV.)

Really the concern here is that if testmode is entered while the CPU has key 
material loaded into registers or caches, then it is possible, if those 
registers/caches are in the scan chain, to exfiltrate data.
It does not matter if the chip is now in a mode that cannot execute algorithms: 
if it was doing any kind of computation involving privkeys (including say 
deriving its public key so that PC-side hardware can get the `xpub`) then key 
material may be in scan chain registers, the clock is now controlled by the 
attacker, and possibly scan mode as well (which disables combinational 
circuitry, thus none of your algorithms can run).

>
> > If you are really going to open-source the hardware design then the layout
> > is also open and attackers can probably target specific chip area for ESD
> > pulse to try a flip-flop upset, so you need to be extra careful.
>
> this is extremely valuable advice. in the followup [1] you describe a
> gating method: this we have already deployed on a couple of places in
> case the Libre Cell Library (also being developed at the same time by
> Staf Verhaegen of Chips4Makers) causes errors: we do not want, for
> example, an error in a Cell Library to cause a permanent HI which
> locks us from being able to perform testing of other areas of the
> ASIC.
>
> the idea of being able to actually randomly flip bits inside an ASIC
> from outside is both hilarious and entirely news to me, yet it sounds
> to be exactly the kind of thing that would allow an attacker to
> compromise a hardware wallet. potentially destructively, mind, but
> compromise all the same.

Certainly outside of the old company design philosophy I have seen many 
experts strongly protest against a design philosophy which assumes that any 
flip-flop could randomly switch.

Yet the design philosophy within the old company always had this assumption, 
supposedly (according to in-company lore) because previous engineers had 
actually found the hard way that random bitflips did occur, and for e.g. 
automobile chips the risk was too great to not have strong mitigations:

* State machines had to force unused states into known states.
  For example a state machine with 3 states needs 2 bits of state, but 2 bits
  of state is actually 4 states, so there is a 4th unused state.
  * Not all state machines needed this rule but during planning we had to
    identify state machines that needed this rule, and often we just targeted
    having 2^n states just to ensure that there were no unused states.
  * I even suggested the use of ECC encoding for important state machines and
    it was something being investigated at the time I left.
* State machines that otherwise did not need the above rule were strongly
  encouraged to clear state at display frame vsync.
  This ensured that any unexpected states they had would only last up to one
  display frame, which was considered acceptable.
* Flip-flops that held settings were periodically reloaded at each display
  frame vsync from a flash memory (which apparently was a lot more immune to
  bitflips).
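
The first rule in the list above can be sketched as a small behavioural model. This is only an illustration in Python, not RTL from the design being described: the state names `IDLE`, `LOAD`, `RUN` and the `start` input are hypothetical, and the only point is that the fourth 2-bit encoding has a defined way out.

```python
# Hypothetical 3-state machine encoded in 2 bits; encoding 0b11 is unused.
IDLE, LOAD, RUN = 0b00, 0b01, 0b10
UNUSED = 0b11
DEFINED = (IDLE, LOAD, RUN)


def next_state(state: int, start: bool) -> int:
    """Return the next 2-bit state; the unused encoding is forced to IDLE."""
    if state == IDLE:
        return LOAD if start else IDLE
    if state == LOAD:
        return RUN
    if state == RUN:
        return IDLE
    # state == UNUSED: only reachable through a bit upset, so recover
    # explicitly instead of leaving the behaviour undefined.
    return IDLE


def cycles_to_recover(state: int) -> int:
    """Count clock cycles until the machine is back in a defined state."""
    cycles = 0
    while state not in DEFINED:
        state = next_state(state, start=False)
        cycles += 1
    return cycles
```

With this transition function a bit upset into `UNUSED` lasts one clock cycle, since `cycles_to_recover(UNUSED)` is 1.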

It could be an artifact as well that the company had its own in-house foundry 
rather than delegate out to TSMC or whatnot --- maybe the technology we had was 
just suckier than state-of-the-art so bitflips were more common.

The reason why this stuck to mind is because at one time we had a DS test where 
shooting the ESD gun could sometimes cause the chip to fail (blank display) 
until reset, when the expectation was that at most it would flicker for one 
display frame.
And afterwards we had to go through the entire RTL looking for which state 
machine or settings register was the culprit.
I even wrote a little Verilog-PLI plugin that would inject deterministically 
random data into flip-flops in the model to try to catch it.
Eventually we found a bunch of possible root causes, and on the next DS 
iteration testing we had fun shooting the chip with the ESD gun over and over 
again and sighing in relief that the display was not failing for more than one 
frame.

The chip was a display driver for automotive, apparently at the time cars were 
starting to transition to 

Re: [bitcoin-dev] Libre/Open blockchain / cryptographic ASICs

2021-02-13 Thread Luke Kenneth Casson Leighton via bitcoin-dev
(cc'ing over to libre-soc-dev)
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2021-February/018392.html

On Thu, Feb 11, 2021 at 8:21 AM ZmnSCPxj  wrote:

> > i was stunned to learn that in a 28nm ASIC, 50% of it is repeater-buffers!
>
> Well, that surprises me as well.
> [...]
> So I suppose at some point something like that would occur and I should not 
> actually be surprised.
> (Maybe I am more surprised that it reached that level at that technology 
> size, I would have thought 33% at 7nm.)

it's about line-drive strength: lower geometries are even *less* able
to line-drive long distances.

> Another point to ponder is test modes.
> In mass production you **need** test modes.

> (Sure, an attacker can try targeted ESD at the `TESTMODE` flip-flop 
> repeatedly, but this risks also flipping other scan flip-flops that contain 
> the data that is being extracted, so this might be sufficient protection in 
> practice.)

if however the ASIC can be flipped into TESTMODE and yet it carries on
otherwise working, an algorithm can be re-run and the exposed data
will be clean.

> If you are really going to open-source the hardware design then the layout
> is also open and attackers can probably target specific chip area for ESD
> pulse to try a flip-flop upset, so you need to be extra careful.

this is extremely valuable advice.  in the followup [1] you describe a
gating method: this we have already deployed on a couple of places in
case the Libre Cell Library (also being developed at the same time by
Staf Verhaegen of Chips4Makers) causes errors: we do not want, for
example, an error in a Cell Library to cause a permanent HI which
locks us from being able to perform testing of other areas of the
ASIC.

the idea of being able to actually randomly flip bits inside an ASIC
from outside is both hilarious and entirely news to me, yet it sounds
to be exactly the kind of thing that would allow an attacker to
compromise a hardware wallet.  potentially destructively, mind, but
compromise all the same.

beyond even what the trezor team discovered [2] it makes it even more
important that wallet ASICs be Libre/Open.

l.

[1] 
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2021-February/018412.html
[2] 
https://blog.trezor.io/introducing-tropic-square-why-transparency-matters-a895dab12dd3
___
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev


Re: [bitcoin-dev] Libre/Open blockchain / cryptographic ASICs

2021-02-13 Thread Luke Kenneth Casson Leighton via bitcoin-dev
On Sat, Feb 13, 2021 at 3:01 PM Bryan Bishop  wrote:

> I don't see what you're talking about? None of your February emails
> were sent to ozlabs according to the archives there. Threads for the
> bitcoin-dev mailing list are stored here:
> https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2021-February/thread.html

... i am very confused, and also did not mean to send this to the list
at all!  with many apologies for taking up peoples' time here.

l.


Re: [bitcoin-dev] Libre/Open blockchain / cryptographic ASICs

2021-02-13 Thread Bryan Bishop via bitcoin-dev
On Sat, Feb 13, 2021 at 4:18 AM Luke Kenneth Casson Leighton 
wrote:

> ... actually i don't see them in the bounces.  what's happening there?
>
> On Saturday, February 13, 2021, Luke Kenneth Casson Leighton <
> l...@lkcl.net> wrote:
> > On Sat, Feb 13, 2021 at 6:10 AM ZmnSCPxj 
> wrote:
> >> Good morning Luke,
> >
> > morning - can i ask you a favour because moderated (off-topic)
> > messages are being forwarded
> > https://lists.ozlabs.org/pipermail/bitcoin-dev-moderation/
> >
> > could you send these instead to libre-soc-...@lists.libre-soc.org?
>

I don't see what you're talking about? None of your February emails were
sent to ozlabs according to the archives there. Threads for the bitcoin-dev
mailing list are stored here:
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2021-February/thread.html

- Bryan
https://twitter.com/kanzure


Re: [bitcoin-dev] Libre/Open blockchain / cryptographic ASICs

2021-02-13 Thread Luke Kenneth Casson Leighton via bitcoin-dev
On Sat, Feb 13, 2021 at 6:10 AM ZmnSCPxj  wrote:
>
> Good morning Luke,

morning - can i ask you a favour because moderated (off-topic)
messages are being forwarded
https://lists.ozlabs.org/pipermail/bitcoin-dev-moderation/

could you send these instead to libre-soc-...@lists.libre-soc.org?

many thanks,

l.

> Another thing we can do with scan mode would be something like the below 
> masking:
>
> input CLK, RESET_N;
> input TESTMODE;
> input SCANOUT_INTERNAL;
> output SCANOUT_PAD;
>
> reg gating;
> wire n_gating = gating && TESTMODE;
> always_ff @(posedge CLK, negedge RESET_N) begin
>   if (!RESET_N)   gating <= 1'b1; /*RESET-HIGH*/
>   else            gating <= n_gating; end
>
> assign SCANOUT_PAD = SCANOUT_INTERNAL && gating;
>
> The `gating` means that after reset, if we are not in test mode, `gating` 
> becomes 0 permanently and prevents any scan data from being extracted.
> Assuming scan is not used in normal operation (it should not) then 
> inadvertent ESD noise on the `gating` flip-flop would not have an effect.
>
> Output being combinational should be fine as the output is "just" an AND 
> gate, as long as `gating` does not transition from 0->1 (impossible in normal 
> operation, only at reset condition) then glitching is impossible, and when 
> scan is running then `TESTMODE` should not be exited which means `gating` 
> should remain high as well, thus output is still glitch-free.
>
> Since the flip-flop resets to 1, and in some technologies I have seen a 
> reset-to-0 FF is slightly smaller than a reset-to-1 FF, it might do good to 
> invert the sense of `gating` instead, and use a NOR gate at the output (which 
> might also be smaller than an AND gate, look it up in the technology you are 
> targeting).
> On the other hand the above is a tiny circuit already and it is unlikely you 
> need more than one of it (well for large enough ICs you might want more than 
> one scan chain but still, even the largest ICs we handled never had more than 
> 8 scan chains, usually just 4 to 6) so overoptimizing this is not necessary.
>
>
> Regards,
> ZmnSCPxj


Re: [bitcoin-dev] Libre/Open blockchain / cryptographic ASICs

2021-02-12 Thread ZmnSCPxj via bitcoin-dev
Good morning Luke,

Another thing we can do with scan mode would be something like the below 
masking:

input CLK, RESET_N;
input TESTMODE;
input SCANOUT_INTERNAL;
output SCANOUT_PAD;

reg gating;
wire n_gating = gating && TESTMODE;
always_ff @(posedge CLK, negedge RESET_N) begin
  if (!RESET_N)   gating <= 1'b1; /*RESET-HIGH*/
  else            gating <= n_gating; end

assign SCANOUT_PAD = SCANOUT_INTERNAL && gating;

The `gating` means that after reset, if we are not in test mode, `gating` 
becomes 0 permanently and prevents any scan data from being extracted.
Assuming scan is not used in normal operation (it should not) then inadvertent 
ESD noise on the `gating` flip-flop would not have an effect.
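
A cycle-level sketch of that behaviour, written in Python purely as an illustration (it is not a replacement for simulating the actual SystemVerilog). The class name `ScanGate` and its method names are hypothetical; the two expressions mirror `n_gating` and `SCANOUT_PAD` from the fragment above.

```python
class ScanGate:
    """Cycle-level model of the `gating` flip-flop and the output AND gate."""

    def __init__(self) -> None:
        self.gating = 1  # reset value (RESET-HIGH)

    def reset(self) -> None:
        """Asynchronous reset: `gating` goes back to 1."""
        self.gating = 1

    def clock(self, testmode: int) -> None:
        """Rising clock edge: gating <= gating && TESTMODE."""
        self.gating = self.gating & testmode

    def scanout_pad(self, scanout_internal: int) -> int:
        """Combinational output: SCANOUT_PAD = SCANOUT_INTERNAL && gating."""
        return scanout_internal & self.gating


gate = ScanGate()
gate.clock(testmode=0)          # first edge outside test mode
blocked = gate.scanout_pad(1)   # 0: scan data no longer reaches the pad
gate.gating = 1                 # model an ESD upset on the flip-flop
gate.clock(testmode=0)          # the next edge clears it again
after_upset = gate.scanout_pad(1)
```

In this model, once a clock edge arrives with `TESTMODE` low, only a reset can bring `gating` back to 1; an upset that sets it lasts until the next clock edge.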

Output being combinational should be fine as the output is "just" an AND gate, 
as long as `gating` does not transition from 0->1 (impossible in normal 
operation, only at reset condition) then glitching is impossible, and when scan 
is running then `TESTMODE` should not be exited which means `gating` should 
remain high as well, thus output is still glitch-free.

Since the flip-flop resets to 1, and in some technologies I have seen a 
reset-to-0 FF is slightly smaller than a reset-to-1 FF, it might do good to 
invert the sense of `gating` instead, and use a NOR gate at the output (which 
might also be smaller than an AND gate, look it up in the technology you are 
targeting).
On the other hand the above is a tiny circuit already and it is unlikely you 
need more than one of it (well for large enough ICs you might want more than 
one scan chain but still, even the largest ICs we handled never had more than 8 
scan chains, usually just 4 to 6) so overoptimizing this is not necessary.


Regards,
ZmnSCPxj


Re: [bitcoin-dev] Libre/Open blockchain / cryptographic ASICs

2021-02-11 Thread ZmnSCPxj via bitcoin-dev

Good morning Luke,

> > (to be fair, there were tools to force you to improve coverage by injecting 
> > faults to your RTL, e.g. it would virtually flip an `&&` to an `||` and if 
> > none of your tests signaled an error it would complain that your test 
> > coverage sucked.)
>
> nice!

It should be possible for a tool to be developed to parse a Verilog RTL design, 
then generate a new version of it with one change.
Then you could add some automation to run a set of testcases around mutated 
variants of the design.
For example, it could create a "wrapper" module that connects to an unmutated 
differently-named version of the design, and various mutated versions, wire all 
their inputs together, then compare outputs.
If the testcase could trigger an output of a mutated version to be different 
from the reference version, then we would consider that mutation covered by 
that testcase.
Possibly that could be done with Verilog-2001 file writing code in the wrapper 
module to dump out which mutations were covered, then a summary program could 
just read in the generated file.
Or Verilog plugins could be used as well (Icarus supports this, that is how it 
implements all `$` functions).
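
The wrapper idea can be sketched outside Verilog as well. The following is a hedged Python illustration, assuming a toy combinational design `(a && b) || c`; the function `reference`, the `MUTANTS` table and `covered_mutants` are all hypothetical names, and a real tool would generate the mutants from the parsed RTL rather than list them by hand.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

Design = Callable[[int, int, int], int]


def reference(a: int, b: int, c: int) -> int:
    """Unmutated design: out = (a && b) || c."""
    return (a & b) | c


# Hypothetical single-change mutants of the reference design.
MUTANTS: Dict[str, Design] = {
    "and_to_or": lambda a, b, c: (a | b) | c,
    "or_to_and": lambda a, b, c: (a & b) & c,
}


def covered_mutants(testcases: List[Tuple[int, int, int]]) -> Dict[str, bool]:
    """Drive every design with the same inputs; report which mutants diverge."""
    return {
        name: any(mutant(*t) != reference(*t) for t in testcases)
        for name, mutant in MUTANTS.items()
    }


weak = covered_mutants([(1, 1, 0), (0, 0, 0)])
full = covered_mutants(list(product((0, 1), repeat=3)))
```

Here the two-vector testcase leaves the `and_to_or` mutant uncovered, while the exhaustive set of eight input vectors covers both.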

A drawback is that just because an output is different does not mean the 
testcase actually ***checks*** that output.
If the testcase does not detect the diverging output, then it still does not 
properly cover that mutation.

The point of this is to check coverage of the tests.
Not sure how well this works with formal validation.



> > Synthesis in particular is a black box and each vendor keeps their 
> > particular implementations and tricks secret.
>
> sigh.  i think that's partly because they have to insert diodes, and buffers, 
> and generally mess with the netlist.
>
> i was stunned to learn that in a 28nm ASIC, 50% of it is repeater-buffers!

Well, that surprises me as well.

On the other hand, smaller technologies consistently have lower raw output 
current driving capability due to the smaller size, and as trace width goes 
down and frequency goes up they stop acting like ideal 0-impedance traces and 
start acting more like transmission lines.
So I suppose at some point something like that would occur and I should not 
actually be surprised.
(Maybe I am more surprised that it reached that level at that technology size, 
I would have thought 33% at 7nm.)

In the modules where we were doing manual netlist+layout, we used inverting 
buffers instead (slightly smaller than non-inverting buffers, in most 
technologies a non-inverting buffer is just an inverter followed by an 
inverting buffer).
This was an advantage of manual design, since it looks like synthesis tools are 
not willing to invert the contents of intermediate flip-flops even if it could 
give theoretical speed+size advantage to use an inverting buffer rather than a 
non-inverting one (it looks like synthesis optimization starts at the output of 
flip-flops and ends at their input, so a manual designer could achieve slightly 
better performance if they were willing to invert an intermediate flip-flop).
Another was that inverting latches were smaller in the technology we were using 
than non-inverting latches, so it was perfectly natural for us to use an 
inverting latch and an inverting buffer on those parts where we needed higher 
fan-out (it was equivalent to a "custom" latch that had higher-than-normal 
driving capability).

Scan chain test generation was impossible though, as those require flip-flops, 
not latches.
Fortunately this was "just" deserialization of high-frequency low-width data 
with no transformation of the data (that was done after the deserialization, at 
lower clock speeds but higher data width, in pure RTL so flip-flops), so it was 
judged acceptable that it would not be covered by scan chain, since scan chain 
is primarily for testing combinational logic between flip-flops.
So we just had flip-flops at the input, and flip-flops at the output, and 
forced all latches to pass-through mode, during scan mode.
We just needed to have enough coverage to uncover stuck-at faults (which was 
still a pain, since additional test vectors slow down manufacturing so we had 
to reduce the test vectors to the minimum possible) in non-scan-mode testing.

Man, making ASICs was tough.


>
> plus, they make an awful lot of money, it is good business.
>
> > Pointing some funding at the open-source Icarus Verilog might also fit, as 
> > it lost its ability to do synthesis more than a decade ago due to inability 
> > to maintain.
>
> ah i didn't know it could do synthesis at all! i thought it was simulation 
> only.

Icarus was the only open-source synthesis tool I could find back then, and it 
dropped synthesis capability fairly early due to maintenance burden (I never 
managed to get the old version with synthesis compiled and never managed actual 
synthesis on it, so my knowledge of it is theoretical).


There is an argument that open-source software is not truly 

Re: [bitcoin-dev] Libre/Open blockchain / cryptographic ASICs

2021-02-03 Thread Luke Kenneth Casson Leighton via bitcoin-dev
On Wednesday, February 3, 2021, ZmnSCPxj  wrote:
> Good morning again Luke,

:)

> If you mean miner power usage, then power efficiency will not reduce
> energy consumption.


> Thus, any rational miner will just pack more miners in the same number of
> watts rather than reduce their watt consumption.

yes, of course.  the same non-consumer-computing-intuitive logic applies to
purchasing decisions for beowulf clusters.


> Thus, increasing power efficiency for mining does not reduce the amount
> of actual energy that will be consumed by Bitcoin mining.

arse.

and if everybody does that, then no matter the performance/watt nobody
"wins".  in fact a case could be made that everybody "loses".

my biggest concern here is that the inherent "arms race" results in very
few players being able to create bitcoin mining ASICs *at all*.

i mentioned earlier that geometry costs are on an exponential scale.  3nm must
be somewhere around USD 16 million for production masks.

if there are only a few players that leaves the entirety of bitcoin open to
hardware backdoors.

l.






-- 
---
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68


Re: [bitcoin-dev] Libre/Open blockchain / cryptographic ASICs

2021-02-03 Thread Luke Kenneth Casson Leighton via bitcoin-dev
(hi folks do cc me, i am subscribed digest, thank you for doing that,
ZmnSCPxj)

On Wednesday, February 3, 2021, ZmnSCPxj  wrote:
> Good morning Luke,
>
> I happen to have experience designing digital ASICs, mostly pipelined
> data processing.
> However my experience is limited to larger geometries and in
> SystemVerilog.

larger geometries for a hardware wallet ASIC is ok (as long as it is not
retail based and trying to run e.g. RSA, taking so long to complete that
the retail customer walks out)

> On the technical side, as I understand it (I have been out of that
> industry for 4 years now, so my knowledge may be obsolete)

not at all! still very valuable

> as you approach lower geometries, you also start approaching analog
> design.

yyeah i could intuitively tell/guess there might be something like this
which would throw a spanner in the works, it is why the grant request i put
in specifically excluded data-dependent constant time analysis and also
power analysis.


> In our case we were already manually laying out gates and flip-flops (or
> replacing flip-flops with level-triggered latches and being extra careful
> with clocks) to squeeze performance (and area) ...

ya-howw :)


> Many of the formal correctness proofs were really about the formal
> equivalence of the netlist to the RTL; the correctness of the RTL was
> "proved" by simulation testing.

thanks to SymbiYosys we are using formal proofs much more extensively, as
effectively a 100% coverage replacement for unit tests.

an example is popcount.  we did two versions.  one is a recursive tree
algorithm, almost impossible to read and understand what the hell it does.

the other is a total braindead 1-liner "x = x + input[i]", rubbish
performance though.

running a formal proof on these gave us 100% confidence that the complex
optimised version does the damn job.
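
a rough illustration of the two popcount styles, in Python rather than HDL. this is not the Libre-SOC code: the function names are hypothetical, and the exhaustive comparison over a small width is only a stand-in for a formal proof, which works on the hardware description with a solver instead of enumerating inputs.

```python
def popcount_naive(x: int, width: int) -> int:
    """The one-liner style: add up the input bits one at a time."""
    total = 0
    for i in range(width):
        total = total + ((x >> i) & 1)
    return total


def popcount_tree(x: int, width: int) -> int:
    """Recursive tree: split the word in half and add the partial counts."""
    if width == 1:
        return x & 1
    half = width // 2
    low = x & ((1 << half) - 1)   # lower `half` bits
    high = x >> half              # remaining `width - half` bits
    return popcount_tree(low, half) + popcount_tree(high, width - half)


def equivalent(width: int) -> bool:
    """Exhaustively compare both versions over every `width`-bit input."""
    return all(
        popcount_naive(x, width) == popcount_tree(x, width)
        for x in range(1 << width)
    )
```

calling `equivalent(8)` compares all 256 inputs; enumeration stops being practical long before 64 bits, which is where a solver-based proof earns its keep.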


yes we still do unit tests, these are more "demo code".

now, the caveat is that you have to have a model of the "dut" (device under
test) against which to compare, and if the dut is ridiculously complex then
the formal model variant, which has to do the same job, ends up equally as
complex (or effectively a duplicate of the dut) and the exercise is a bit
of a waste of time...

...*unless*... there happens to be other implementations out there.  then
the proof can be run against those and everybody wins through collaboration.



now, here's why i put in the NLnet Grant request to explore going back to
the mathematics of crypto-primitives.

many ISAs e.g. intel AVX2 have added GFMULT8 etc etc because that does
S-Boxes for Rijndael.  they have gone mad by analysing algorithms trying to
fit them to standard ISAs.

nobody does Rijndael S-Boxes any way other than 256-entry lookup tables
because no standard ISA has general-purpose Galois Field Multiply.

consequently implementations in assembler get completely divorced from the
original mathematics on which the cryptographic algorithm was based.
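
for reference, the mathematics behind the Rijndael S-Box fits in a few lines when a Galois Field multiply is available. the sketch below is plain Python written for readability, not speed and not constant-time; the function names are hypothetical. it uses the reduction polynomial and affine constant from the AES specification (FIPS-197), and the inverse is computed as a^254.

```python
AES_POLY = 0x11B  # x^8 + x^4 + x^3 + x + 1, the Rijndael reduction polynomial


def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes as polynomials over GF(2), reduced by AES_POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= AES_POLY
        b >>= 1
    return result


def gf_inv(a: int) -> int:
    """Multiplicative inverse in GF(2^8) as a^254; 0 maps to 0."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def rotl8(x: int, n: int) -> int:
    """Rotate an 8-bit value left by n bits."""
    return ((x << n) | (x >> (8 - n))) & 0xFF


def sbox(a: int) -> int:
    """Rijndael S-Box: field inverse followed by the affine transform."""
    inv = gf_inv(a)
    return inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63
```

`gf_mul(0x57, 0x83)` gives `0xC1` and `sbox(0x53)` gives `0xED`, matching the worked examples in FIPS-197.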

the approach i would like to take is, "hang on a minute: how far would you
get if you actually added *general-purpose* instructions that *directly*
provided the underlying mathematical principles, and then wrapped a
Vector-Matrix Engine around them?".

would this drastically simplify algorithms to the point where *READABLE* c
code compiles directly to opcodes that run screamingly fast, outperforming
hand-optimised SIMD code using standard ISAs?

then, given the Formal Correctness approach above, can we verify that the
mathematically-related opcodes do the job?


> (to be fair, there were tools to force you to improve coverage by
> injecting faults to your RTL, e.g. it would virtually flip an `&&` to an
> `||` and if none of your tests signaled an error it would complain that
> your test coverage sucked.)

nice!

> Things might have changed.

nah.  this is such a complex area, run by few incumbent players, that
innovation is rare.  not least, innovation is different and cannot be
trusted by the Foundries!


> A good RTL would embed SystemVerilog Assertions or PSL Assertions as well.
> Some formal verification tools can understand a subset of SystemVerilog
> Assertions / PSL assertions and validate that your RTL conformed to the
> assertions, which would probably help cut down on the need for RTL
> simulation.

interesting.

> Overall, my understanding is that smaller geometries are needed only if
> you want to target a really high performance / unit cost and performance /
> energy consumption ratios.
> That is, you would target smaller geometries for mining.

yes.

> If you need a secure tr\*sted computing module that does not need to be
> fast or cheap, just very accurate to the required specification, the larger
> geometries should be fine and you would be able to live almost entirely in
> RTL-land without diving into netlist and layout specifications.

hardware wallet ASICs.

i concur.

> A wrinkle here is that licenses for tools from tr\*sted vendors like
> Synopsys or Cadence are ***expensive***.

yes they are :)  we are currently 

Re: [bitcoin-dev] Libre/Open blockchain / cryptographic ASICs

2021-02-02 Thread ZmnSCPxj via bitcoin-dev
Good morning again Luke,



> [my personal favourite is a focus on power-efficiency: battery-operated 
> hand-held devices at or below 3.5 watts (thus not requiring thermal pipes or 
> fans - which tend to break). i have to admit i am a little alarmed at the 
> world-wide energy consumption of bitcoin: personally i would very much prefer 
> to be involved in eco-conscious blockchain and crypto-currency products].

If you mean miner power usage, then power efficiency will not reduce energy 
consumption.

Suppose you are a miner.
Suppose you have access to 1 watt of energy at a particular fixed cost of 1 BTC 
per watt, and you have a current hardware that gives 1 Exahash for 1 watt of 
energy usage.
Suppose this 1 Exahash earns 2 BTC (and that is why you mine: after paying 1 
BTC for electricity, you net 1 BTC).

Now suppose there is a new technology where a hardware can give 1 Exahash for 
only 0.5 watt of energy usage.
Your choices are:

* Buy only one unit, get 1 Exahash for 0.5 watt, thus getting 2.0 BTC while 
only paying 0.5 BTC in electricity fees for a net of 1.5 BTC.
* Buy two units, get 2 Exahash for 1.0 watt, thus getting 4.0 BTC while only 
paying 1.0 BTC in electricity fees for a net of 3.0 BTC.

What do you think your better choice is?

That assumes that difficulty adjustments do not occur.
If difficulty adjustments are put into consideration, then if everyone *else* 
does the second choice, global mining hashrate doubles and the difficulty 
adjustment matches, and if you took the first choice, you would end up earning 
far less than 2.0 BTC after the difficulty adjustment.
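
The arithmetic above can be written out as a short Python sketch. The helper `net_btc` and its parameter names are hypothetical, and the difficulty adjustment is modelled under the simplifying assumption that the payout per Exahash is inversely proportional to global hashrate, so doubling the hashrate halves it from 2.0 BTC to 1.0 BTC.

```python
def net_btc(units: int, exahash_per_unit: float, watts_per_unit: float,
            btc_per_exahash: float, btc_per_watt: float = 1.0) -> float:
    """Revenue minus electricity cost for a miner running `units` machines."""
    revenue = units * exahash_per_unit * btc_per_exahash
    cost = units * watts_per_unit * btc_per_watt
    return revenue - cost


# Before any difficulty adjustment (1 Exahash earns 2 BTC).
old_hardware = net_btc(1, 1.0, 1.0, btc_per_exahash=2.0)    # 1.0
one_new_unit = net_btc(1, 1.0, 0.5, btc_per_exahash=2.0)    # 1.5
two_new_units = net_btc(2, 1.0, 0.5, btc_per_exahash=2.0)   # 3.0

# After everyone else doubles their hashrate (1 Exahash now earns 1 BTC).
one_unit_adjusted = net_btc(1, 1.0, 0.5, btc_per_exahash=1.0)   # 0.5
two_units_adjusted = net_btc(2, 1.0, 0.5, btc_per_exahash=1.0)  # 1.0
```

Under that assumption, the miner who bought two units ends up with the same 1.0 BTC net as before the new hardware existed, while the miner who bought only one unit drops to 0.5 BTC.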

Thus, any rational miner will just pack more miners in the same number of watts 
rather than reduce their watt consumption.
There may be physical limits involved (only so many miners you can put in an 
amount of space, or whatever other limits) but absent those, a rational miner 
will not reduce their energy expenditure with higher-efficiency units, they 
will buy more units.

Thus, increasing power efficiency for mining does not reduce the amount of 
actual energy that will be consumed by Bitcoin mining.

If you are not referring to mining energy, then I think a computer running 
BitTorrent software 24/7 would consume about the same amount of energy as a 
fullnode running Bitcoin software 24/7, and I do not think the energy consumed 
thus is actually particularly high relative to a lot of other things.

Regards,
ZmnSCPxj


Re: [bitcoin-dev] Libre/Open blockchain / cryptographic ASICs

2021-02-02 Thread ZmnSCPxj via bitcoin-dev
Good morning Luke,

I happen to have experience designing digital ASICs, mostly pipelined data 
processing.
However my experience is limited to larger geometries and in SystemVerilog.

On the technical side, as I understand it (I have been out of that industry for 
4 years now, so my knowledge may be obsolete) as you approach lower geometries, 
you also start approaching analog design.
In our case we were already manually laying out gates and flip-flops (or 
replacing flip-flops with level-triggered latches and being extra careful with 
clocks) to squeeze performance (and area) for some of the more boring parts 
(i.e. just deserialization of data from a high-frequency low bus width to a 
lower-frequency wide bus width).

Formal correctness proofs are nice, but we were impeded from using those 
because of the need to manually lay out devices, meaning the netlist did not 
correspond exactly to an RTL that formal correctness could understand.
Though to be fair most of the circuit was standard RTL->synthesized netlist and 
formal correctness proofs worked perfectly well for those.
Many of the formal correctness proofs were really about the formal equivalence 
of the netlist to the RTL; the correctness of the RTL was "proved" by 
simulation testing.
(to be fair, there were tools to force you to improve coverage by injecting 
faults to your RTL, e.g. it would virtually flip an `&&` to an `||` and if none 
of your tests signaled an error it would complain that your test coverage 
sucked.)
Things might have changed.

A good RTL would embed SystemVerilog Assertions or PSL Assertions as well.
Some formal verification tools can understand a subset of SystemVerilog 
Assertions / PSL assertions and validate that your RTL conformed to the 
assertions, which would probably help cut down on the need for RTL simulation.

Overall, my understanding is that smaller geometries are needed only if you 
want to target a really high performance / unit cost and performance / energy 
consumption ratios.
That is, you would target smaller geometries for mining.

If you need a secure tr\*sted computing module that does not need to be fast or 
cheap, just very accurate to the required specification, the larger geometries 
should be fine and you would be able to live almost entirely in RTL-land 
without diving into netlist and layout specifications.

A wrinkle here is that licenses for tools from tr\*sted vendors like Synopsys 
or Cadence are ***expensive***.
What is more, you should really buy two sets of licenses, e.g. do logic 
synthesis with Synopsys and then formal verification with Cadence, because you 
do not want to fully tr\*st just one vendor.
Synthesis in particular is a black box and each vendor keeps their particular 
implementations and tricks secret.

Pointing some funding at the open-source Icarus Verilog might also fit, as it 
lost its ability to do synthesis more than a decade ago due to inability to 
maintain.
Icarus Verilog only supports Verilog-2001 and only has very very partial 
support for SystemVerilog (though to be fair, there is little that 
SystemVerilog adds that can be used in RTL --- `always_comb` and `always_ff` 
come to mind, as well as assertions, and I think recent Icarus has started 
experimental support for those `always` variants).
Note as well that I heard (at the time when I was in the industry) that some 
foundries will not even accept a netlist unless it was created by a synthesis 
tool from one of the major vendors (Synopsys, Cadence, Mentor Graphics, maybe 
more I have forgotten since).

Regards,
ZmnSCPxj

> folks, hi, please do cc me as i am subscribed "digest", apologies for the 
> inconvenience.
>
> i've been speaking on and off with kanzure, asking his advice about a libre / 
> transparently-developed ASIC / SoC, for some time, since meeting a very 
> interesting person at the Barcelona RISC-V Workshop in 2018.
>
> this person pointed out that FIPS-approved algorithms, implemented in 
> FIPS-approved crypto-chips used in hardware wallets to protect billions to 
> trillions in cryptocurrency assets world-wide are basically asking for 
> trouble.  i heard 3rd-hand that the constants used in the original bitcoin 
> protocol were very deliberately changed from those approved by FIPS and the 
> NSA for exactly the reasons that drive people to question whether it is a 
> good idea to trust closed and secretive crypto-chips, no matter how 
> well-intentioned the company that manufactures them.  the person i met was 
> there to "sound out" interested parties willing to help with such a venture, 
> even to the extent of actually buying a Foundry, in order to guarantee that 
> the crypto-chip they would like to see made had not been tampered with at any 
> point during manufacturing.
>
> at FOSDEM2019 i was also approached by a team that also wanted to do a basic 
> "embedded" processor, entirely libre-licensed, only in 350nm or 180nm, with 
> just enough horsepower to do digital signing and so