Re: RFC: initialize /dev/urandom, is it necessary? Can we do it in a better way?

2023-09-18 Thread Roberto A. Foglietta
On Tue, 19 Sept 2023 at 03:25, Michael Conrad 
wrote:

> On 9/18/23 06:14, Guillermo Rodriguez Garcia wrote:
>
> everything is compressed with gzip -7. This is the worst scenario.
>> However, even in the worst scenario due to gzip one single bit of
>> difference in the input generates a completely different compressed
>> output:
>>
>
> Compression (or any other deterministic manipulation of data) does not add
> any entropy (or "unpredictability") since the processing is 100%
> reproducible.
> In terms of entropy the output of the function is as good (or as bad) as
> the amount of entropy in the initial seed.
>
Hi Michael,

> Even aside from that, using gzip as some sort of hash function is not
> going to be anywhere near as good as using an actual hash function, like
> sha256, sha1 or even md5.
>

PREMISE

Hash functions and compression algorithms are two completely different
kinds of mathematical tools. The most obvious differences are:

1. a hash produces an output of fixed size whatever the size of the
input, while the size of a compressed output varies with the input
2. compression algorithms (f) have a counterpart (f⁻¹) that reverses the
process, while hashes do not
3. because of point #2, a compression algorithm is a one-to-one
(bi-univocal) function: the same input gives the same output, and the same
output leads back to the same input
4. hash functions are never injective: the same input gives the same
output, but the same output can come from different inputs (collisions).
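
A minimal sketch of point 1, assuming sha256sum, gzip and head -c are
available. The helper name sizes is made up for illustration. It prints the
digest length in hex characters and the compressed length in bytes for an
input of LEN zero bytes:

```sh
#!/bin/sh
# sizes LEN: print "<hex digest length> <gzip output length>" for LEN zero bytes.
# 'sizes' is a hypothetical helper name, not something from the thread.
sizes() {
    len=$1
    hash_len=$(head -c "$len" /dev/zero | sha256sum | cut -d' ' -f1 | tr -d '\n' | wc -c)
    gz_len=$(head -c "$len" /dev/zero | gzip -n -c | wc -c)
    # $(( )) strips any padding that some wc implementations add
    echo "$((hash_len)) $((gz_len))"
}

for len in 10 1000 100000; do
    echo "input=$len -> $(sizes "$len")"
done
```

The first number stays the same for every input size, the second one does
not.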

Unless a system is provided with a dedicated hardware component that
produces entropy at a constant rate (white noise, preferably), the main
problem is to create it from a few reasonably good random inputs.

As you can imagine, we could start a debate about the definition of
"reasonably good random inputs" or "entropy". Alternatively, we can
accept that those definitions, restricted to our specific sector,
simply mean unpredictable data: unpredictable by an attacker or, even
better, by the root admin of the system. Timing at nanosecond granularity
cannot be predicted by an attacker, and even a system administrator would
have a really hard time doing so without sophisticated external hardware
instruments. Unfortunately, not all systems are able to provide
nanosecond timing, and the main reason for this limit is the clock
frequency: nanosecond granularity (10⁻⁹ s) requires a GHz (10⁹ Hz)
clock.


MD5SUM, GZIP AND THE WHITE NOISE

A relatively weak hash like MD5SUM is way better at creating an unpredictable
stream of data than any compression function. OK, let's see:

+:git-restore:recovery2:yamui> echo | md5sum
68b329da9893e34099c7d8ad5cb9c940  -

As you can see, I have a 1/16 chance of guessing the next character of the
md5sum output. At this point you will notice that I am being unfair, because
I used the output of the md5sum command line (a human-friendly textual
representation) instead of the md5sum() binary output stream. Obviously, you
are right. Hence, let me point out that you did the same when considering
gzip. You took the whole stream, which also contains the information needed
to decompress that stream of data. If the function is modified so that the
decompression information is not sent to the output, the output cannot be
reversed anymore: f⁻¹ no longer exists.

Because of this trick we get a sort of variable-length
hash function. It is a very bad hash function, because the fixed size of
the output is a great feature. A great feature for the primary purpose for
which hash functions are currently used, but not so great for generating
entropy. In fact, if we have 8 bits of entropy, the hash can give us a
512-bit data stream (white noise), but the number of distinct 512-bit
outputs it can give us remains 256. In other words: O(8-bit-entropy) = 256 =
O(sha512sum 8-bit-entropy).
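
The counting argument can be checked with a small sketch, assuming sha512sum
is installed. The seeds are written as decimal text, which is an arbitrary
choice made here, and the function name is hypothetical:

```sh
#!/bin/sh
# Feed every possible 8-bit seed (0..255, written as decimal text) to
# sha512sum and count how many distinct digests come out.
count_distinct_digests() {
    i=0
    while [ "$i" -lt 256 ]; do
        printf '%d' "$i" | sha512sum
        i=$((i + 1))
    done | sort -u | wc -l
}

distinct=$(count_distinct_digests)
echo "distinct 512-bit outputs from 8-bit seeds: $((distinct))"
```

Each output is 512 bits long, yet an attacker who knows the procedure only
has to try the 256 candidates.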

This is what Guillermo proposed as an "entropy conservation principle".
The entropy of a closed system does not remain constant over time but
increases, and this is a currently accepted principle of physics.
Guillermo confused information with entropy. However, the principle
by which information is an immutable constant is not yet established in
physics. I suggest abandoning this kind of consideration and staying
confined to our specific sector.

In our specific sector, spectral analysis cannot refute this claim:
once all the data specifically needed for decompression is removed, the
compressed data stream cannot be statistically distinguished from white
noise in a significant way. Well, this is not 100% true. The gzip -1 output
can be distinguished from the gzip -9 output by this kind of analysis with a
certain degree of confidence. However, if we do the same with a real-world
entropy generator, say one based on thermal effects, we notice a slight
pink-noise effect, a tiny-tiny-tiny-tiny predominance of low frequencies. I
used the word tiny 4 times because the black-body law has a 4 as an

Re: RFC: initialize /dev/urandom, is it necessary? Can we do it in a better way?

2023-09-18 Thread Michael Conrad

On 9/18/23 06:14, Guillermo Rodriguez Garcia wrote:


>> everything is compressed with gzip -7. This is the worst scenario.
>> However, even in the worst scenario due to gzip one single bit of
>> difference in the input generates a completely different compressed
>> output:
>
> Compression (or any other deterministic manipulation of data) does not
> add any entropy (or "unpredictability") since the processing is 100%
> reproducible.
> In terms of entropy the output of the function is as good (or as bad)
> as the amount of entropy in the initial seed.


Even aside from that, using gzip as some sort of hash function is not 
going to be anywhere near as good as using an actual hash function, like 
sha256, sha1 or even md5.


I would expect this all goes into the kernel's own hashing inside the 
RNG and so gzip or any other hash function before delivering it to the 
kernel is probably irrelevant.


The name of the game is to find actually random bits, which you either 
need to save from the previous boot, or obtain from hardware somehow.  
The low bits of thermal sensors and multithreading scheduler timing 
nanoseconds are probably your best bet if you can't rely on having a 
hardware entropy generator.
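
A rough sketch of the timing idea, assuming a date(1) that supports %N
(GNU date does; busybox only when built with nanosecond support) and
sha256sum. The function name is hypothetical, and nothing here measures how
much real entropy the samples carry:

```sh
#!/bin/sh
# collect_jitter N: sample the nanosecond field of the clock N times and
# hash the samples into a 256-bit hex string.
# The spacing between samples depends on process spawning and scheduling.
collect_jitter() {
    samples=${1:-64}
    i=0
    while [ "$i" -lt "$samples" ]; do
        date +%N
        i=$((i + 1))
    done | sha256sum | cut -d' ' -f1
}

seed_hex=$(collect_jitter 32)
echo "$seed_hex"
```

On a system whose clock has coarse granularity, the low digits would be
constant and the result correspondingly predictable.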



-Mike C
___
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox


Re: RFC: initialize /dev/urandom, is it necessary? Can we do it in a better way?

2023-09-18 Thread Guillermo Rodriguez Garcia
Hi Roberto,

El lun, 18 sept 2023 a las 11:54, Roberto A. Foglietta (<
roberto.foglie...@gmail.com>) escribió:

> On Mon, 18 Sept 2023 at 11:20, Guillermo Rodriguez Garcia
>  wrote:
> >
> >> # RAF: seeding the urandom device with some data and a few bits of randomness.
> >> #  The randomness is put at the beginning of some text data, which is going
> >> #  to be compressed. It is expected that the whole compressed data will be
> >> #  way different each time, even if a great part of the input is constant.
> >> #  Moreover, the size of the randomness changes each time into a range of
> >> #  [32, 64] bytes, and this adds more unpredictability. Like a hash, the
> >> #  compression algorithm will produce a way different binary output by just
> >> #  changing a few bytes and initial conditions.
> >> {
> >> n=$((33 + ${RANDOM:-15}%32))
> >> dd if=/dev/random bs=$n count=1 2>&1
> >> cat /proc/cmdline /proc/*stat /init*
> >> } | pigz -$((1 + n%9))c > /dev/urandom &
> >
>
> Hi Guillermo,
>
> first of all, thanks for the feedback.
>
> > Not sure whether seeding dev/urandom with output from dev/random makes
> > much sense, since both use the same source of entropy.
>
> AFAIK, the /dev/random uses a source of entropy related to hardware
> events while /dev/urandom is a pseudo-random generator.


No, this is a common myth but it is not correct. Both random and urandom
use the same PRNG. See: https://www.2uo.de/myths-about-urandom/


> This should
> guarantee that there is a difference between the two. Immediately after
> a boot, it is supposed that many hardware events have taken place


Or not. This could be, for example, a cloned VM running on the cloud (which
is a typical use case for busybox).


> and
> therefore reading some bytes from /dev/random would not be such a big
> issue, IMHO.
>
> > Also note that dev/random will block if there is not enough entropy
> > left, so doing this in an init script might not be a very good idea --
> > specially on systems that don't have a good source of entropy available.
>
> As you might have noticed in the function that I sent later in my answer
> to Jeff, I no longer use /dev/random but /dev/urandom:
>
> udvseed(){ local n=$((33+${RANDOM:-15}%32)) u=/dev/urandom;f(){ dd if=$u bs=$n count=1; };(cd /proc;f;cat cmdline *stat;f;) 2>&1|pigz -$((1+n%9))c >$u; }
>
> This makes your first statement something to consider. Also, $RANDOM, if
> available, should be considered as generated from /dev/urandom and
> therefore as belonging to its entropy pool. If $RANDOM is not available,
> then my function is quite weak in terms of unpredictability, because it
> reads 48 bytes from a /dev/urandom which is not seeded yet and uses them
> together with some proc data that might change but can be guessed, and then
> everything is compressed with gzip -7. This is the worst scenario.
> However, even in the worst scenario due to gzip one single bit of
> difference in the input generates a completely different compressed
> output:
>

Compression (or any other deterministic manipulation of data) does not add
any entropy (or "unpredictability") since the processing is 100%
reproducible.
In terms of entropy the output of the function is as good (or as bad) as
the amount of entropy in the initial seed.

Best regards,

Guillermo Rodriguez Garcia
guille.rodrig...@gmail.com


Re: RFC: initialize /dev/urandom, is it necessary? Can we do it in a better way?

2023-09-18 Thread Roberto A. Foglietta
On Mon, 18 Sept 2023 at 11:20, Guillermo Rodriguez Garcia
 wrote:
>
>> # RAF: seeding the urandom device with some data and a few bits of randomness.
>> #  The randomness is put at the beginning of some text data, which is going
>> #  to be compressed. It is expected that the whole compressed data will be
>> #  way different each time, even if a great part of the input is constant.
>> #  Moreover, the size of the randomness changes each time into a range of
>> #  [32, 64] bytes, and this adds more unpredictability. Like a hash, the
>> #  compression algorithm will produce a way different binary output by just
>> #  changing a few bytes and initial conditions.
>> {
>> n=$((33 + ${RANDOM:-15}%32))
>> dd if=/dev/random bs=$n count=1 2>&1
>> cat /proc/cmdline /proc/*stat /init*
>> } | pigz -$((1 + n%9))c > /dev/urandom &
>

Hi Guillermo,

first of all, thanks for the feedback.

> Not sure whether seeding dev/urandom with output from dev/random makes much 
> sense, since both use the same source of entropy.

AFAIK, the /dev/random uses a source of entropy related to hardware
events while /dev/urandom is a pseudo-random generator. This should
guarantee that there is a difference between the two. Immediately after
a boot, it is supposed that many hardware events have taken place and
therefore reading some bytes from /dev/random would not be such a big
issue, IMHO.

> Also note that dev/random will block if there is not enough entropy left, so 
> doing this in an init script might not be a very good idea -- specially on 
> systems that don't have a good source of entropy available.

As you might have noticed in the function that I sent later in my answer
to Jeff, I no longer use /dev/random but /dev/urandom:

udvseed(){ local n=$((33+${RANDOM:-15}%32)) u=/dev/urandom;f(){ dd if=$u bs=$n count=1; };(cd /proc;f;cat cmdline *stat;f;) 2>&1|pigz -$((1+n%9))c >$u; }

This makes your first statement something to consider. Also, $RANDOM, if
available, should be considered as generated from /dev/urandom and
therefore as belonging to its entropy pool. If $RANDOM is not available,
then my function is quite weak in terms of unpredictability, because it
reads 48 bytes from a /dev/urandom which is not seeded yet and uses them
together with some proc data that might change but can be guessed, and then
everything is compressed with gzip -7. This is the worst scenario.
However, even in the worst scenario due to gzip one single bit of
difference in the input generates a completely different compressed
output:

redfishos:~ # cat /etc/firmware/touch_module_id_0x82.img | pigz -7c | sha1sum
1f0e7e00a47159708a5877b052d2e2c6e3489788  -
redfishos:~ # { cat /etc/firmware/touch_module_id_0x82.img; echo; } |
pigz -7c | sha1sum
d8db1ae97fc5ac8fa441db5c146d95fc43ac6d2e

In the best scenario $RANDOM provides a number that makes the
procedure change at every boot. The data read from /dev/urandom can be
from 33 to 64 bytes long and the compression level can vary between 1 and 9.
Therefore every single bit of the input must be correctly guessed,
otherwise the output will be completely different.
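
Note that both commands above end in sha1sum, so the two digests would
differ for any two distinct inputs, with or without pigz in between. A rough
sketch for looking at the compressed streams themselves, assuming gzip
(swapped in here for pigz), cmp and mktemp are available. The sample string
is made up:

```sh
#!/bin/sh
# Count how many byte positions differ between the gzip streams of two
# inputs that differ only by one trailing newline.
tmp=$(mktemp -d)
base='console=ttyS0 root=/dev/mmcblk0p2 rootwait quiet'   # made-up sample input
printf '%s' "$base"   | gzip -n -7c > "$tmp/a.gz"
printf '%s\n' "$base" | gzip -n -7c > "$tmp/b.gz"
# cmp -l prints one line per differing byte position (up to the shorter file)
differing=$(cmp -l "$tmp/a.gz" "$tmp/b.gz" 2>/dev/null | wc -l | tr -d ' ')
size_a=$(wc -c < "$tmp/a.gz" | tr -d ' ')
echo "byte positions differing: $differing of $size_a"
rm -rf "$tmp"
```

The -n flag keeps the timestamp out of the gzip header, so any difference
reported comes from the input alone.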

Instead of using $RANDOM, it is possible to use a value generated from
/dev/random by a single byte read:

randval=$(dd if=/dev/random bs=1 count=1 status=none | hexdump -ve '1/1 "%d\n"')

Or keep the first variant in which the initial data is read from
/dev/random in a range between 32 and 64 bytes, adding this trick just
in case $RANDOM is not defined.
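
A possible way to wire that fallback into the size formula, as a sketch only.
It assumes an od that accepts the POSIX -An -N -t options, and it reads
/dev/urandom rather than /dev/random to avoid blocking, which is a choice
made here and not part of the proposal above:

```sh
#!/bin/sh
# Pick a byte value 0..255 without relying on $RANDOM or hexdump.
# od -An -N1 -tu1 prints one input byte as an unsigned decimal number.
randval=$(od -An -N1 -tu1 /dev/urandom | tr -d ' \n')
# Same formula as in the script, with randval as the fallback: n is 33..64.
n=$((33 + ${RANDOM:-$randval} % 32))
echo "randval=$randval n=$n"
```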

Best regards, R-


Re: RFC: initialize /dev/urandom, is it necessary? Can we do it in a better way?

2023-09-18 Thread Guillermo Rodriguez Garcia
El lun, 18 sept 2023 a las 9:42, Roberto A. Foglietta (<
roberto.foglie...@gmail.com>) escribió:

> Hi all,
>
>  I am investigating the Android init procedure (one version, one
> device, not in general) and I found an interesting line about the
> initialization of the /dev/urandom (seeding, I suppose).
>
>  cat /proc/cmdline > /dev/urandom
>
>  Therefore, I developed a more sophisticated way to do that initialisation:
>
> # RAF: seeding the urandom device with some data and a few bits of randomness.
> #  The randomness is put at the beginning of some text data, which is going
> #  to be compressed. It is expected that the whole compressed data will be
> #  way different each time, even if a great part of the input is constant.
> #  Moreover, the size of the randomness changes each time into a range of
> #  [32, 64] bytes, and this adds more unpredictability. Like a hash, the
> #  compression algorithm will produce a way different binary output by just
> #  changing a few bytes and initial conditions.
> {
> n=$((33 + ${RANDOM:-15}%32))
> dd if=/dev/random bs=$n count=1 2>&1
> cat /proc/cmdline /proc/*stat /init*
> } | pigz -$((1 + n%9))c > /dev/urandom &
>

Not sure whether seeding dev/urandom with output from dev/random makes much
sense, since both use the same source of entropy.

Also note that dev/random will block if there is not enough entropy left,
so doing this in an init script might not be a very good idea -- specially
on systems that don't have a good source of entropy available.
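
For what it is worth, an init script can look at the kernel's own estimate
before deciding to read from /dev/random. A minimal sketch, assuming a Linux
kernel that exposes /proc/sys/kernel/random/entropy_avail. The function name
is hypothetical:

```sh
#!/bin/sh
# Print the kernel's entropy estimate, or "unknown" when the file is not
# readable (non-Linux system, restricted /proc, ...).
entropy_status() {
    f=/proc/sys/kernel/random/entropy_avail
    if [ -r "$f" ]; then
        echo "entropy_avail=$(cat "$f")"
    else
        echo "entropy_avail=unknown"
    fi
}

entropy_status
```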

Guillermo


[PATCH v2] date: exit with failure when clock_settime fails

2023-09-18 Thread Ladislav Michl
From: Ladislav Michl 

Coreutils date behaves this way since 1998-12-11 as done in their git commit
a17cdb11731e ("(main): Arrange to exit unsuccessfully when stime fails.")

Signed-off-by: Ladislav Michl 
---
 CHANGES:
 -v2: better compatibility with coreutils, add explaining commit message

 coreutils/date.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/coreutils/date.c b/coreutils/date.c
index 3a89b6caf..09d5697dc 100644
--- a/coreutils/date.c
+++ b/coreutils/date.c
@@ -166,12 +166,13 @@ int date_main(int argc UNUSED_PARAM, char **argv)
struct tm tm_time;
char buf_fmt_dt2str[64];
unsigned opt;
-   int isofmt = -1;
char *date_str;
char *fmt_dt2str;
char *fmt_str2dt;
char *filename;
char *isofmt_arg = NULL;
+   int ret = EXIT_SUCCESS;
+   int isofmt = -1;
 
opt = getopt32long(argv, "^"
"Rs:ud:r:"
@@ -287,9 +288,12 @@ int date_main(int argc UNUSED_PARAM, char **argv)
ts.tv_sec = validate_tm_time(date_str, &tm_time);
ts.tv_nsec = 0;
 
-   /* if setting time, set it */
+   /* if setting time, set the system clock to the specified date,
+* then regardless of the success of that operation,
+* format and print that date. */
if ((opt & OPT_SET) && clock_settime(CLOCK_REALTIME, &ts) < 0) {
bb_simple_perror_msg("can't set date");
+   ret = EXIT_FAILURE;
}
}
 
@@ -383,5 +387,5 @@ int date_main(int argc UNUSED_PARAM, char **argv)
}
puts(date_buf);
 
-   return EXIT_SUCCESS;
+   return ret;
 }
-- 
2.39.2



Re: RFC: initialize /dev/urandom, is it necessary? Can we do it in a better way?

2023-09-18 Thread Roberto A. Foglietta
On Mon, 18 Sept 2023 at 10:11, Jeff Pohlmeyer  wrote:
>
> On Mon, Sep 18, 2023 at 2:42 AM Roberto A. Foglietta
>  wrote:
>
> > In case the /dev/urandom initialisation is a necessity (or a best
> > practice), does it make sense to add it into busybox as an option or
> > as an application?
>
> If you are able to update to a newer version of busybox, you might
> want to check out the recently added "seedrng" applet, which seems to
> be a well-considered means of addressing this issue.

Hi Jeff,

thanks for the answer:

redfishos:~ # seedrng
seedrng: can't create directory '/var/lib/seedrng': No such file or directory
redfishos:~ # mkdir -p /var/lib/seedrng
redfishos:~ # seedrng
Saving 2048 bits of creditable seed for next boot

I think that the app could create the directory path if it does not
exist. Moreover an option to write on stdout would be nice to have.

> You can find a
> (rather lengthy) discussion here:
>
> http://lists.busybox.net/pipermail/busybox/2022-April/089545.html

About this discussion, I have noticed two main points

1. the RNG can't actually be seeded from a shell script, due to the
reliance on ioctls and the fact that entropy written into the
unprivileged /dev/urandom device is not immediately mixed in, making
subsequent seed reads dangerous.

2. I suppose that the kernel will load the generated file from the
standard folder at the next boot without further changes, but I am
not sure about that. For sure, it will not succeed in my case because
the rootfs is a volatile filesystem, and adding a link to a permanent data
partition is not a general solution (for this system and at the
moment).
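
On the volatile rootfs problem: the usage text of the busybox seedrng applet
lists a -d DIR option for the seed directory, so the seed could be kept on
whatever partition survives a reboot. A minimal sketch, assuming the
installed build accepts -d. The function name and the /data path in the
usage comment are hypothetical:

```sh
#!/bin/sh
# run_seedrng [DIR]: make sure the seed directory exists, then run seedrng
# against it. Returns 127 when no seedrng binary is found.
run_seedrng() {
    dir=${1:-/var/lib/seedrng}
    mkdir -p "$dir" || return 1
    if command -v seedrng >/dev/null 2>&1; then
        seedrng -d "$dir"
    else
        echo "seedrng not found; would run: seedrng -d $dir" >&2
        return 127
    fi
}

# Example call from an init script (path is an assumption):
# run_seedrng /data/seedrng
```

The same call has to run at every boot, since seedrng both loads the old
seed and saves a new one.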

IMHO, the best I can do is to seed /dev/urandom by injecting some
data and then retrieving some data from it. I have no clue how much
data must be read from /dev/urandom to guarantee that the entropy injected
into it is mixed in as expected.

I have created a function that generates more than 2048 bytes for
seeding /dev/urandom and reads 4 KB afterwards, hoping to trigger the mixing
of the new entropy.

udvseed(){ local n=$((33+${RANDOM:-15}%32)) u=/dev/urandom;f(){ dd if=$u bs=$n count=1; };(cd /proc;f;cat cmdline *stat;f;) 2>&1|pigz -$((1+n%9))c >$u; }

I wrote it in a way to be short. In fact, it is 153 bytes while the
seedrng app in busybox is about 1650. The function and the app are
completely different and it is not a fair comparison. However, it is
not the first time that I noticed that a busybox app can be easily
replaced with a shell script function and this reduces N times the
footprint.
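
For readers who find the one-liner hard to follow, here is an expanded
sketch of the same steps. It is not a drop-in replacement: gzip is swapped in
for pigz, dd's statistics are discarded instead of being mixed into the
stream, and an optional output path is added so the result can be inspected.
The name udvseed_verbose is made up:

```sh
#!/bin/sh
# udvseed_verbose [OUT]: write compressed seed material to OUT
# (default /dev/urandom). Writing to /dev/urandom mixes data into the
# pool but does not credit any entropy.
udvseed_verbose() {
    out=${1:-/dev/urandom}          # where to write the seed material
    n=$((33 + ${RANDOM:-15} % 32))  # 33..64 bytes per read
    level=$((1 + n % 9))            # compression level 1..9
    {
        dd if=/dev/urandom bs="$n" count=1 2>/dev/null
        cat /proc/cmdline /proc/*stat 2>/dev/null
        dd if=/dev/urandom bs="$n" count=1 2>/dev/null
    } | gzip -"$level"c > "$out"
}

# Example: udvseed_verbose /tmp/seed.gz && wc -c /tmp/seed.gz
```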

Best regards, R-


Re: RFC: initialize /dev/urandom, is it necessary? Can we do it in a better way?

2023-09-18 Thread Jeff Pohlmeyer
On Mon, Sep 18, 2023 at 2:42 AM Roberto A. Foglietta
 wrote:

> In case the /dev/urandom initialisation is a necessity (or a best
> practice), does it make sense to add it into busybox as an option or
> as an application?

If you are able to update to a newer version of busybox, you might
want to check out the recently added "seedrng" applet, which seems to
be a well-considered means of addressing this issue. You can find a
(rather lengthy) discussion here:

http://lists.busybox.net/pipermail/busybox/2022-April/089545.html


- Jeff


RFC: initialize /dev/urandom, is it necessary? Can we do it in a better way?

2023-09-18 Thread Roberto A. Foglietta
Hi all,

 I am investigating the Android init procedure (one version, one
device, not in general) and I found an interesting line about the
initialization of the /dev/urandom (seeding, I suppose).

 cat /proc/cmdline > /dev/urandom

 Therefore, I developed a more sophisticated way to do that initialisation:

# RAF: seeding the urandom device with some data and a few bits of randomness.
#  The randomness is put at the beginning of some text data, which is going
#  to be compressed. It is expected that the whole compressed data will be
#  way different each time, even if a great part of the input is constant.
#  Moreover, the size of the randomness changes each time into a range of
#  [32, 64] bytes, and this adds more unpredictability. Like a hash, the
#  compression algorithm will produce a way different binary output by just
#  changing a few bytes and initial conditions.
{
n=$((33 + ${RANDOM:-15}%32))
dd if=/dev/random bs=$n count=1 2>&1
cat /proc/cmdline /proc/*stat /init*
} | pigz -$((1 + n%9))c > /dev/urandom &

 I wish to ask people here on this mailing list, because I know that there
are Linux experts here, two questions:

1. Is initialising /dev/urandom necessary? Or does the kernel take care of
it by itself, and is doing it explicitly better anyway?
2. In your opinion, can the script above provide a reasonably
unpredictable initialisation?

In case the /dev/urandom initialisation is a necessity (or a best
practice), does it make sense to add it into busybox as an option or
as an application?

Best regards, R-