Re: Allegations regarding OpenBSD IPSEC

2010-12-23 Thread Kurt Knochner
2010/12/23 Clint Pachl pa...@ecentryx.com:
> The last time I installed FreeBSD about 5 years ago, it asked me to pound on
> the keyboard for like 60 seconds during installation (or at first boot,
> can't remember) in order to build up some randomness. I wonder what kind
> of entropy that provided?

run it through a hash function and it's a good value. Patch that value
into the kernel and it's available from the start of the kernel. Then
use that value as a key for a HMAC, to hash time values (and other
entropy). Do all that and you have a good seed for a PRNG.
Unpredictable, different every time, different on all systems.

Regards
Kurt Knochner

http://knochner.com



Re: Allegations regarding OpenBSD IPSEC

2010-12-22 Thread Kurt Knochner
2010/12/22 Theo de Raadt dera...@cvs.openbsd.org:
> Go ahead, do a FIPS check on it.  You will be doing a FIPS check on
> 4096 bytes here, then a gap of unknown length, then 4096 bytes here,
> then a gap of unknown length, then 4096 bytes here, then a gap of
> unknown length,

that's true, if one uses just /dev/arandom (as other consumers will
call arc4random() in the background as well). However, if one changes
the code of arc4random() and arc4random_buf() to emit all generated
random values, we will get the whole sequence, from the very first
byte, no matter which consumer requests data. Reading from
/dev/arandom will then generate the required amount of random values
for the statistical tests, while we can still record all values.

I'll see if I'll be able to do that, just for the sake of learning
something about the internals of openbsd.

Do you have a hint how I could emit the random values from arc4random
in a clever way? I thought of using an internal buffer and accessing
that through sysctl or another device, e.g. /dev/randstream. The latter
looks more complicated, but will certainly teach me more about openbsd
internals.
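
A minimal userland sketch of the internal-buffer idea, assuming a fixed-size ring buffer. The names rnd_trace, trace_record and trace_read are hypothetical and not part of the OpenBSD sources; in a kernel, the record call would presumably sit where arc4random() and arc4random_buf() hand out bytes, under the generator's lock:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TRACE_SIZE 4096   /* hypothetical capacity */

/* Hypothetical trace buffer: keeps the most recent TRACE_SIZE output bytes. */
struct rnd_trace {
    uint8_t data[TRACE_SIZE];
    size_t  head;    /* index where the next byte will be written */
    size_t  total;   /* bytes recorded so far (may exceed TRACE_SIZE) */
};

/* Record a copy of the bytes a generator is about to hand to a consumer. */
static void
trace_record(struct rnd_trace *t, const void *src, size_t n)
{
    const uint8_t *p = src;
    size_t i;

    assert(t != NULL && (src != NULL || n == 0));
    for (i = 0; i < n; i++) {
        t->data[t->head] = p[i];
        t->head = (t->head + 1) % TRACE_SIZE;
    }
    t->total += n;
}

/* Copy out up to n of the most recent bytes, oldest first; returns the count. */
static size_t
trace_read(const struct rnd_trace *t, void *dst, size_t n)
{
    size_t avail = t->total < TRACE_SIZE ? t->total : TRACE_SIZE;
    size_t start, i;
    uint8_t *out = dst;

    if (n > avail)
        n = avail;
    /* The oldest of the n most recent bytes sits n positions behind head. */
    start = (t->head + TRACE_SIZE - n) % TRACE_SIZE;
    for (i = 0; i < n; i++)
        out[i] = t->data[(start + i) % TRACE_SIZE];
    return n;
}
```

A ring buffer only keeps the last TRACE_SIZE bytes, so whatever reads it (a sysctl handler or a device read routine) would have to drain it often enough to see the whole sequence from the very first byte.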

Regards
Kurt Knochner

http://knochner.com/



Re: Allegations regarding OpenBSD IPSEC

2010-12-21 Thread Kurt Knochner
Hi,

upfront: sorry for double posting!! Some people told me that I should send
my findings directly to the list instead of a link. Sorry if I violated the
netiquette on the list!

So, here we go again (text from the forum where I posted it).

regarding the allegations about a backdoor being planted into OpenBSD, I
did a code review myself and I believe that I've found two bugs in the PRNG
code. I'm NOT saying that this is the backdoor or even part of the backdoor.
I'm not even saying that these two bugs create a weakness in the PRNG
itself; however, the two bugs just don't look good and possibly need more
investigation!!

Here we go...

OpenBSD uses arc4random() and arc4random_buf() all over the code to generate
random numbers. This code is defined in src/sys/dev/rnd.c.

Within arc4random() and arc4random_buf() the code flow is like this:

arc4random -> arc4maybeinit -> arc4_stir

arc4_stir() will be called at least every 10 minutes, as a timer is set
within arc4maybeinit() that resets the variable 'arc4random_initialized'
(see below).

 static void
 arc4maybeinit(void)
 {

     if (!arc4random_initialized) {
 #ifdef DIAGNOSTIC
         if (!rnd_attached)
             panic("arc4maybeinit: premature");
 #endif
         arc4random_initialized++;
         arc4_stir();
         /* 10 minutes, per dm@'s suggestion */
         timeout_add_sec(&arc4_timeout, 10 * 60);
     }
 }
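
To make the flow concrete, here is a small userland model of it. It is only a sketch: prng_model, model_timeout and model_maybeinit are made-up names, and it assumes (as described above) that the timeout handler does nothing except clear the flag:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userland model only; these are not kernel symbols. */
struct prng_model {
    bool     initialized;  /* plays the role of arc4random_initialized */
    unsigned nstirs;       /* how many times the generator was (re)keyed */
};

/* Stand-in for the timeout handler: it only clears the flag. */
static void
model_timeout(struct prng_model *m)
{
    assert(m != NULL);
    m->initialized = false;
}

/* Stand-in for arc4maybeinit(): stir only when the flag is clear. */
static void
model_maybeinit(struct prng_model *m)
{
    assert(m != NULL);
    if (!m->initialized) {
        m->initialized = true;
        m->nstirs++;   /* the real code would stir and re-arm the timer here */
    }
}
```

In this model the re-stir happens on the first request after the timer has fired, not at the moment the timer fires.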

Now, let's have a look at arc4_stir().

 static void
 arc4_stir(void)
 {
     u_int8_t buf[256];
     int len;

     nanotime((struct timespec *) buf);
     len = sizeof(buf) - sizeof(struct timespec);
     get_random_bytes(buf + sizeof (struct timespec), len);
     len += sizeof(struct timespec);

     mtx_enter(&rndlock);
     if (rndstats.arc4_nstirs > 0)
         rc4_crypt(&arc4random_state, buf, buf, sizeof(buf));

     rc4_keysetup(&arc4random_state, buf, sizeof(buf));
     arc4random_count = 0;
     rndstats.arc4_stirs += len;
     rndstats.arc4_nstirs++;

     /*
      * Throw away the first N words of output, as suggested in the
      * paper "Weaknesses in the Key Scheduling Algorithm of RC4"
      * by Fluhrer, Mantin, and Shamir.  (N = 256 in our case.)
      */
     rc4_skip(&arc4random_state, 256 * 4);
     mtx_leave(&rndlock);
 }

This initializes the RC4 context with random data gathered from system
entropy, mainly via get_random_bytes().

== Bug #1

HOWEVER: Have a look at the buffer that's being used as a seed for the RC4
key setup. It's being filled with the random data, BUT at the beginning it
will be filled with just the value of nanotime().

nanotime((struct timespec *) buf);
len = sizeof(buf) - sizeof(struct timespec);
get_random_bytes(buf + sizeof (struct timespec), len);
len += sizeof(struct timespec);


So, there is a lot of effort in get_random_bytes() to get real random data
for the buffer and then the value of nanotime() is prepended to the buffer?
That does not look right. Please consider: this buffer will be used as key
for rc4_keysetup() and thus it should contain unrelated and unpredictable
data.

== Bug #2

The function rc4_crypt() gets called as soon as rndstats.arc4_nstirs > 0.
This will be the case whenever arc4_stir gets called the second time (by
the timer reset - see above).

if (rndstats.arc4_nstirs > 0)
    rc4_crypt(&arc4random_state, buf, buf, sizeof(buf));

rc4_keysetup(&arc4random_state, buf, sizeof(buf));
arc4random_count = 0;
rndstats.arc4_stirs += len;
rndstats.arc4_nstirs++;

HOWEVER, right after the call of rc4_crypt(), we call rc4_keysetup() with
the same 'arc4random_state'. This makes the call to rc4_crypt() useless, as
the data structure will be overwritten again with the init data of the RC4
function.

AGAIN: I'm not saying that this is part of the backdoor nor that it weakens
the PRNG. HOWEVER, this does not look right and leaves me with a bad
feeling!

I think we will need some investigation into the effect these two bugs have
on PRNG quality.

Regards
Kurt Knochner

http://knochner.com/



Re: Allegations regarding OpenBSD IPSEC

2010-12-21 Thread Kurt Knochner
2010/12/21 Theo de Raadt dera...@cvs.openbsd.org

> > regarding the allegations about a backdoor being planted into OpenBSD, I
> > did a code review myself [...]

> By the way...

> It is unfortunate that it required an allegation of this sort for
> people to get to the point where they stop blindly trusting and
> instead go audit the code

without a 'hint' (true or fake), where would you start auditing the
code? It's just too much.

Now, as I have started with it, I will continue to do so, at least
with the crypto code and PRNG code. However, don't get me wrong. I'm
neither a cryptographer nor have I ever touched the openbsd code
before. I did some patching for BSDI BSD/OS (ages ago), but that's it
with my *bsd code contact.

> But looked at from the half-glass-full side, it is refreshing to see
> people trying!

:-)

BTW: iTWire mentions, that two bugs have been found in the crypto
code. Where can I find details on those bugs?

http://www.itwire.com/opinion-and-analysis/open-sauce/43995-openbsd-backdoor-claims-code-audit-begins

Regards
Kurt Knochner

http://knochner.com/



Re: Allegations regarding OpenBSD IPSEC

2010-12-21 Thread Kurt Knochner
2010/12/21 Kurt Knochner cdowl...@googlemail.com:
> 2.) don't forget to check if sizeof(ts) <= sizeof(buf), otherwise you
> will create a buffer overrun.

O.K. this one is not THAT critical, as buf is defined locally as
u_int8_t buf[256]; However, I tend to make those superfluous checks in
my code, just to make sure later changes won't break my logic ;-))
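
One way to keep such a check without any runtime cost is a compile-time assertion. This is only a sketch, assuming a C11 compiler; SEED_BUF_LEN and seed_random_len are names made up for the example:

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define SEED_BUF_LEN 256  /* matches u_int8_t buf[256] in the quoted code */

/* Compile-time guard: the build fails if the timestamp cannot fit. */
_Static_assert(sizeof(struct timespec) <= SEED_BUF_LEN,
    "struct timespec must fit into the seed buffer");

/* Bytes left for random data after the timestamp prefix. */
static size_t
seed_random_len(void)
{
    return SEED_BUF_LEN - sizeof(struct timespec);
}
```

If somebody later shrinks the buffer below the size of the timestamp, the build breaks instead of the subtraction silently wrapping around.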

Regards
Kurt Knochner

http://knochner.com/



Re: Allegations regarding OpenBSD IPSEC

2010-12-21 Thread Kurt Knochner
2010/12/21 Otto Moerbeek o...@drijf.net:
> Yes, predictable, but different for each call.

hm... predictable is not a good term in the domain of a PRNG.

However the time value will not be used by itself. It is part of an
encrypt operation with itself + buf and a previous RC4 state, at least
after the second call to arc4_stir.
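
To illustrate the dependency on the previous state, here is a userland sketch using a textbook RC4. The struct and function names (rc4, rc4_setup, rc4_xor, restir) are my own and only model the rc4_crypt()/rc4_keysetup() sequence quoted from arc4_stir; this is not the kernel code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct rc4 { uint8_t s[256]; uint8_t i, j; };

/* Standard RC4 key schedule. */
static void
rc4_setup(struct rc4 *c, const uint8_t *key, size_t klen)
{
    unsigned n;
    uint8_t j = 0, t;

    assert(c != NULL && key != NULL && klen > 0);
    for (n = 0; n < 256; n++)
        c->s[n] = (uint8_t)n;
    for (n = 0; n < 256; n++) {
        j = (uint8_t)(j + c->s[n] + key[n % klen]);
        t = c->s[n]; c->s[n] = c->s[j]; c->s[j] = t;
    }
    c->i = c->j = 0;
}

/* Standard RC4 keystream XOR; src and dst may be the same buffer. */
static void
rc4_xor(struct rc4 *c, const uint8_t *src, uint8_t *dst, size_t len)
{
    uint8_t t;
    size_t n;

    for (n = 0; n < len; n++) {
        c->i = (uint8_t)(c->i + 1);
        c->j = (uint8_t)(c->j + c->s[c->i]);
        t = c->s[c->i]; c->s[c->i] = c->s[c->j]; c->s[c->j] = t;
        dst[n] = src[n] ^ c->s[(uint8_t)(c->s[c->i] + c->s[c->j])];
    }
}

/* Model of a re-stir: encrypt the fresh material with the old state, then rekey. */
static void
restir(struct rc4 *c, uint8_t *buf, size_t len)
{
    rc4_xor(c, buf, buf, len);   /* buf now depends on the previous state */
    rc4_setup(c, buf, len);      /* state is replaced, keyed by that buf */
}
```

In this model, two generators with different previous states that are fed identical fresh material end up with different keys, so the time value never reaches the key schedule in plain form after the first stir.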

So, maybe this has no meaning at all. However I would recommend to
check this very thoroughly before changing any line of that code.
Maybe you'll add a weakness by removing the time value.

I would recommend doing the following, and I'm trying to do it myself
during the next few days.

1.) Rewrite arc4random() and arc4random_buf() to store all random
values from boot time until the establishment of a few IPSEC tunnels.

2.) Repeat that procedure a few times, i.e. reboot, ipsec, store,
reboot, ipsec, store, etc.

3.) Take all those pseudo random value sequences and feed them into
the NIST test suite for random values (chi-square, diehard, etc.)

4.) Repeat those steps after the removal of the time value from the code.

5.) Try to interpret the outcome of the NIST tests. Maybe other people
(real cryptographers) should help with this last step.
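
As a rough idea of what the simplest of these tests computes, here is a sketch of a chi-square statistic over byte frequencies. It is not the NIST suite, just a single textbook test written for illustration; chi_square_bytes is a made-up name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Pearson chi-square statistic of the byte frequencies in data,
 * measured against a uniform distribution over 256 values
 * (255 degrees of freedom).
 */
static double
chi_square_bytes(const uint8_t *data, size_t len)
{
    size_t counts[256] = {0};
    double expected, chi2 = 0.0, d;
    size_t i;

    assert(data != NULL && len > 0);
    for (i = 0; i < len; i++)
        counts[data[i]]++;
    expected = (double)len / 256.0;
    for (i = 0; i < 256; i++) {
        d = (double)counts[i] - expected;
        chi2 += d * d / expected;
    }
    return chi2;
}
```

For uniformly distributed bytes the statistic should scatter around 255; values far above that point to a skewed distribution. Note that a plain repeating counter 0..255 scores exactly 0, so a good-looking number here says nothing about predictability.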

Regards
Kurt Knochner

http://knochner.com/



Re: Allegations regarding OpenBSD IPSEC

2010-12-21 Thread Kurt Knochner
2010/12/21 Ted Unangst ted.unan...@gmail.com:
> On Tue, Dec 21, 2010 at 2:54 PM, Kurt Knochner cdowl...@googlemail.com wrote:
>> 2.) Repeat that procedure a few times, i.e. reboot, ipsec, store,
>> reboot, ipsec, store, etc.
>>
>> 3.) Take all those pseudo random value sequences and feed them into
>> the NIST test suite for random values (chi-square, diehard, etc.)

> You are going to need a buttload of samples for your tests to have any
> significance, and even then, all you've proven is that the numbers
> have a good distribution, not that they are unpredictable.

yes, that's true. However, it's just a starting point. Do we currently
know that they have a good distribution? Is there any documented test
for the quality of the PRNG?

Regards
Kurt Knochner

http://knochner.com



Re: Allegations regarding OpenBSD IPSEC

2010-12-21 Thread Kurt Knochner
2010/12/22 Theo de Raadt dera...@cvs.openbsd.org:
>> Is there any documented test for the quality of the PRNG?

> Are you talking about our use of MD5, or our use of RC4?

RC4.

> If you are talking about our RC4, then there is; I will put it this
> way: If our use of RC4 in this exactly-how-a-stream-cipher-works way
> is bad, then every other use on this planet of stream ciphers is bad,
> and very broken.  We are relying on the base concept.

I was just asking if the implementation of the RC4 based PRNG is done
correctly and if there has been a test of the quality of the PRNG
output. It just looked strange to me to seed the algorithm of the
PRNG with a plain time value, though it's just a few bytes at the
beginning of a larger block of data. So, if you believe the
implementation of the PRNG is correct, there is no need to further
analyze this issue.

> The idea is that you can initialize a stream cipher with near-crap and
> it will work OK for the way we are using it.

Right.

> If the MD5 stuff we generate is crap, we are still probably more than
> OK compared to everyone because we are going further, and doing the
> slice/dice everyone-shares on the RC4 output.

I did not say, that anything you generate is crap.



Re: Allegations regarding OpenBSD IPSEC

2010-12-21 Thread Kurt Knochner
2010/12/22 Theo de Raadt dera...@cvs.openbsd.org:
> 12 to 16 bytes of kind-of-known but not really known data are mixed with
> 256 - (12 to 16) bytes of data to form the initial state of RC4, which is
> then filtered by dropping the first 256 or 256*4 bytes of data as written
> in the best paper that exists today.

> Is it relevant?

It's up to you to make that decision. You know the code better than
anybody else.



Re: Allegations regarding OpenBSD IPSEC

2010-12-21 Thread Kurt Knochner
2010/12/22 Theo de Raadt dera...@cvs.openbsd.org:
>> so, the current state of the PRNG will be preserved during reboots.

> That statement is false.

you're right. As you posted in the other thread, the output of the
PRNG is saved during shutdown and that file is loaded as entropy data
during startup.

> No.  You misread the code.

I understood the code, just my description of the process was not
correct (detailed enough).

>> However, arc4_stir will still be called once after every reboot.
>> During its first call, the value of nanotime() will be placed at the
>> beginning of buf, which is then being used to init the rc4 context.

> What else do you think we should use?

I don't know. I just wanted to discuss a possible issue. That's all...

> Where do we invent entropy from when the kernel has only
> been running for 0.01 of a second?

O.K. where do you need random bytes during that state of the kernel?
All locations where arc4random* is called in the kernel are these:

src/sys/dev/ic/if_wi.c: sc->wi_icv = arc4random();
src/sys/dev/ic/if_wi_hostap.c: arc4random();
src/sys/dev/ic/rt2860.c: uint32_t val = arc4random();
src/sys/dev/softraid_crypto.c: arc4random_buf(sd->mds.mdd_crypto.scr_key,
src/sys/dev/softraid_crypto.c: arc4random_buf(sd->mds.mdd_crypto.scr_maskkey,
src/sys/dev/usb/if_uath.c: iv = (ic->ic_iv != 0) ? ic->ic_iv : arc4random();
src/sys/dev/usb/ehci.c: /* XXX prevent panics at boot by not using arc4random */
src/sys/dev/usb/ehci.c: islot = EHCI_IQHIDX(lev, arc4random());
src/sys/dev/pci/ubsec.c: arc4random_buf(ses->ses_iv, sizeof(ses->ses_iv));
src/sys/dev/pci/safe.c: arc4random_buf(ses->ses_iv, sizeof(ses->ses_iv));
src/sys/dev/pci/noct.c: arc4random_buf(iv, sizeof(iv));
src/sys/dev/pci/if_iwi.c: arc4random_buf(data, sizeof data);
src/sys/dev/pci/if_ix.c: arc4random_buf(random, sizeof(random));
src/sys/dev/pci/hifn7751.c: arc4random_buf(ses->hs_iv,
src/sys/dev/softraid.c: arc4random_buf(uuid->sui_id, sizeof(uuid->sui_id));

Those in dev/pci are about initializing hardware encryption devices.

The rest of the calls (to the level I checked), will need at least the
root filesystem to load some config data and then init some stuff
(i.e. WEP key generation, etc.).

So, until the filesystem is mounted, there is no need for arc4random()
in the kernel. After the filesystem has been mounted, the entropy data
will be loaded from the file. If this is true, where is the need for
the time value in arc4_stir()??

Maybe I'm wrong. If so, please direct me to the code that needs
arc4random() before the filesystem has been mounted, maybe EXCEPT the
hardware crypto devices. Most certainly those drivers don't need
arc4random during kernel init as well.

>> So, at the first glance it looks like using the value of nanotime() in
>> At least I would XOR the value of nanotime() to buf,
>> instead of just prepending it. MD5 and the like does not seem to be
>> necessary, as buf will always contain some good random data.

> XOR it?  Why?

To fold the plain time value into some other random data returned by
get_random_bytes. If it's a bad idea to stir or fold data that
way, why is MD5 used in extract_entropy() to achieve the same goal?

> Please provide a citation regarding the benefit of XOR'ing feed data
> before passing it into MD5 for the purpose of PRNG folding.

I did not say that. I said, that XORing the time value with the data
of get_random_bytes() is probably sufficient and that MD5 would not be
required.
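
What I mean by XORing is roughly the following. This is a sketch with a hypothetical helper xor_fold, not a patch against rnd.c, and it makes no claim about whether the result is cryptographically better than prepending:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * XOR src into dst, wrapping around dst if src is longer.
 * Hypothetical helper for illustration only.
 */
static void
xor_fold(uint8_t *dst, size_t dlen, const uint8_t *src, size_t slen)
{
    size_t i;

    assert(dst != NULL && src != NULL && dlen > 0);
    for (i = 0; i < slen; i++)
        dst[i % dlen] ^= src[i];
}
```

With buf completely filled by get_random_bytes() and ts holding the time value, the call would look like xor_fold(buf, sizeof(buf), (const uint8_t *)&ts, sizeof(ts)). Because the index wraps, the helper cannot write past the end of buf even if the time structure were larger than the buffer.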



Re: Allegations regarding OpenBSD IPSEC

2010-12-21 Thread Kurt Knochner
2010/12/22 Theo de Raadt dera...@cvs.openbsd.org:
>>> Where do we invent entropy from when the kernel has only
>>> been running for 0.01 of a second?

>> O.K. where do you need random bytes during that state of the kernel?
>> All locations where arc4random* is called in the kernel are these:

>> [list of 16]

> Unfortunately it looks like you missed a hundred or more.

Damn, you're right. It seems my grep pattern was initialized in the
wrong way (maybe not enough entropy from the user) :-))

> No, there is much more than that.  Processes get started and
> initialize their libc-based prng's, as well as other state, including
> address space randomization, stack biasing, etc etc.

After adjusting my grep pattern, I found several more locations. A lot
of those need the filesystem. However, at least one (and surely many
more) is indeed calling arc4random while there is no filesystem
mounted.

So, just forget my theory!

>> So, until the filesystem is mounted, there is no need for arc4random()
>> in the kernel.

> Totally false.

True (that it's false).

So, I guess the discussion about the use of nanotime() is finished, as
there is common agreement that it has no influence on the PRNG,
right?