Re: RFC: initialize /dev/urandom, is it necessary? Can we do it in a better way?
On Tue, 19 Sept 2023 at 03:25, Michael Conrad wrote:
> On 9/18/23 06:14, Guillermo Rodriguez Garcia wrote:
>> everything is compressed with gzip -7. This is the worst scenario.
>> However, even in the worst scenario due to gzip one single bit of
>> difference in the input generates a completely different compressed
>> output:
>
> Compression (or any other deterministic manipulation of data) does not add
> any entropy (or "unpredictability") since the processing is 100%
> reproducible.
> In terms of entropy the output of the function is as good (or as bad) as
> the amount of entropy in the initial seed.

Hi Michael,

> Even aside from that, using gzip as some sort of hash function is not
> going to be anywhere near as good as using an actual hash function, like
> sha256, sha1 or even md5.

PREMISE

Hash functions and compression algorithms are two completely different kinds of mathematical tools. The most obvious differences are:

1. A hash produces an output of fixed size whatever the size of the input, while the size of a compressed output varies with the input.
2. A compression algorithm (f) has a counterpart (f⁻¹) that reverses the process, while a hash has none.
3. Because of point #2, a lossless compression algorithm is a one-to-one (injective) function: the same input always gives the same output, and a given output leads back to exactly one input.
4. Hash functions, by contrast, are not injective: the same input always gives the same output, but the same output can come from different inputs (collisions).

Unless a system is provided with a specific hardware component that produces entropy continuously (white noise, preferably), the main problem is to create it from a few reasonably good random inputs. As you can imagine, we could start a debate about the definition of "reasonably good random inputs" or "entropy".
Alternatively, we can accept that those definitions, restricted to our specific sector, simply mean unpredictable data: unpredictable by an attacker or, even better, by the root admin of the system.

Nanosecond time granularity cannot be predicted by an attacker, and even a system administrator would have a really hard time doing so without sophisticated external hardware instruments. Unfortunately, not all systems can provide nanosecond timing, and the main reason is the clock frequency: nanosecond granularity (10⁻⁹ s) requires a GHz (10⁹ Hz) clock.

MD5SUM, GZIP AND THE WHITE NOISE

A relatively weak hash like MD5SUM is way better at creating an unpredictable stream of data than any compression function. OK, let's see it:

+:git-restore:recovery2:yamui> echo | md5sum
68b329da9893e34099c7d8ad5cb9c940 -

As you can see, I have a 1/16 chance of guessing the next character of the md5sum output. At this point you will object that I am being unfair, because I used the output of the md5sum command line (a textual, human-friendly representation) instead of the binary output stream of md5sum(). Obviously, you are right. But let me point out that you did the same when considering gzip: you took the whole stream, which also contains the information needed to decompress it. If the function is modified so that the decompression information is not sent to the output, the output can no longer be reversed: f⁻¹ no longer exists. With this trick we get a sort of variable-length hash function. It is a very bad hash function, because the fixed size of the output is a great feature: great for the primary purpose for which hash functions are currently used, not so great for generating entropy. In fact, if we have 8 bits of entropy, the hash can give us a 512-bit data stream that looks like white noise, but the number of distinct 512-bit outputs it can give us remains 256.
In other words: O(8-bit-entropy) = 256 = O(sha512sum 8-bit-entropy).

This is what Guillermo wrote, proposing it as an "entropy conservation principle". In physics, however, the entropy of a closed system is not conserved: it never decreases over time and in general increases, and this is a currently accepted principle. Guillermo confused information with entropy. However, the principle that information is an immutable constant is not yet established in physics. I suggest abandoning this kind of consideration and staying within our specific sector.

In our specific sector, spectral analysis cannot refute this claim: once all the data specifically needed for decompression is removed, the compressed data stream cannot be statistically separated from white noise in a significant way. Well, that is not 100% true: gzip -1 output can be discriminated from gzip -9 output by this kind of analysis with a certain degree of confidence. However, if we do the same with a real-world entropy generator, say one based on thermal effects, we notice a slight pink-noise effect, a tiny-tiny-tiny-tiny predominance of low frequencies. I used the word tiny 4 times because the black-body law has a 4 as an
Re: RFC: initialize /dev/urandom, is it necessary? Can we do it in a better way?
On 9/18/23 06:14, Guillermo Rodriguez Garcia wrote:
>> everything is compressed with gzip -7. This is the worst scenario.
>> However, even in the worst scenario due to gzip one single bit of
>> difference in the input generates a completely different compressed
>> output:
>
> Compression (or any other deterministic manipulation of data) does not add
> any entropy (or "unpredictability") since the processing is 100%
> reproducible. In terms of entropy the output of the function is as good
> (or as bad) as the amount of entropy in the initial seed.

Even aside from that, using gzip as some sort of hash function is not going to be anywhere near as good as using an actual hash function, like sha256, sha1 or even md5.

I would expect this all goes into the kernel's own hashing inside the RNG and so gzip or any other hash function before delivering it to the kernel is probably irrelevant.

The name of the game is to find actually random bits, which you either need to save from the previous boot, or obtain from hardware somehow. The low bits of thermal sensors and multithreading scheduler timing nanoseconds are probably your best bet if you can't rely on having a hardware entropy generator.

-Mike C
___
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox
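The point about deterministic processing can be checked with a small experiment. What follows is only a sketch and not something posted in this thread: it assumes a gzip that accepts -n (as GNU gzip does) and a sha256sum binary, it uses plain gzip in place of pigz, and the function name count_distinct_outputs and the constant payload are made up for illustration. Every possible 8-bit seed is prepended to the same guessable data, compressed, then hashed, and the distinct results are counted.

```sh
#!/bin/sh
# Illustrative sketch: a deterministic pipeline cannot produce more distinct
# outputs than there are distinct seeds.
# Assumptions: gzip accepts -n, sha256sum exists; all names are hypothetical.

count_distinct_outputs() {
    i=0
    while [ "$i" -lt 256 ]; do
        # One of the 256 possible 8-bit seed values, followed by constant,
        # guessable data. gzip -n omits name and timestamp from the header,
        # so the compressed stream depends on the input only.
        { printf '%d\n' "$i"; printf 'constant boot data\n'; } |
            gzip -n -7 -c |
            sha256sum
        i=$((i + 1))
    done | sort -u | wc -l | tr -d ' '
}

count_distinct_outputs
```

The script prints 256: each digest looks like 256 bits of noise, yet an attacker who knows the procedure only has to try 256 candidates, exactly as many as there were seeds.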
Re: RFC: initialize /dev/urandom, is it necessary? Can we do it in a better way?
Hi Roberto,

On Mon, 18 Sept 2023 at 11:54, Roberto A. Foglietta (<roberto.foglie...@gmail.com>) wrote:

> On Mon, 18 Sept 2023 at 11:20, Guillermo Rodriguez Garcia wrote:
> >
> >> # RAF: seeding the urandom device with some data and a few bits of randomness.
> >> # The randomness is put at the beginning of some text data, which is going
> >> # to be compressed. It is expected that the whole compressed data will be
> >> # way different each time, even if a great part of the input is constant.
> >> # Moreover, the size of the randomness changes each time into a range of
> >> # [32, 64] bytes, and this adds more unpredictability. Like a hash, the
> >> # compression algorithm will produce a way different binary output by just
> >> # changing a few bytes and initial conditions.
> >> {
> >> n=$((33 + ${RANDOM:-15}%32))
> >> dd if=/dev/random bs=$n count=1 2>&1
> >> cat /proc/cmdline /proc/*stat /init*
> >> } | pigz -$((1 + n%9))c > /dev/urandom &
> >
>
> Hi Gulliermo,
>
> first of all, thank for the feedback.
>
> > Not sure whether seeding dev/urandom with output from dev/random makes
> > much sense, since both use the same source of entropy.
>
> AFAIK, the /dev/random uses a source of entropy related to hardware
> events while /dev/urandom is a pseudo-random generator.

No, this is a common myth but it is not correct. Both random and urandom use the same PRNG. See: https://www.2uo.de/myths-about-urandom/

> This should
> grant us that there is a difference between the two. immediately after
> a boot, it is supposed that many hardware events took place

Or not. This could be, for example, a cloned VM running on the cloud (which is a typical use case for busybox).

> and
> therefore reading some bytes from /dev/random would not be such a big
> issue., IMHO.
>
> > Also note that dev/random will block if there is not enough entropy
> > left, so doing this in an init script might not be a very good idea --
> > specially on systems that don't have a good source of entropy available.
>
> As as you might noticed in the function that I sent later answering
> to Jeff, I do not use anymore the /dev/random but /dev/urandom
>
> udvseed(){ local n=$((33+${RANDOM:-15}%32)) u=/dev/urandom;f(){ dd if=$u bs=$n count=1; };(cd /proc;f;cat cmdline *stat;f;) 2>&1|pigz -$((1+n%9))c >$u; }
>
> This makes your first statement something to consider. Also $RANDOM if
> available should be considered generated by the /dev/urandom and
> therefore belonging to its entropy pool. If $RANDOM is not available
> then my function is quite weak in term of unpredictability because
> read 48 bytes from /dev/urandom which is not seeded yet and use it
> with some proc data that might change but can be guessed and then
> everything is compressed with gzip -7. This is the worst scenario.
> However, even in the worst scenario due to gzip one single bit of
> difference in the input generates a completely different compressed
> output:
>

Compression (or any other deterministic manipulation of data) does not add any entropy (or "unpredictability") since the processing is 100% reproducible. In terms of entropy the output of the function is as good (or as bad) as the amount of entropy in the initial seed.

Best regards,

Guillermo Rodriguez Garcia
guille.rodrig...@gmail.com
Re: RFC: initialize /dev/urandom, is it necessary? Can we do it in a better way?
On Mon, 18 Sept 2023 at 11:20, Guillermo Rodriguez Garcia wrote:
>
>> # RAF: seeding the urandom device with some data and a few bits of randomness.
>> # The randomness is put at the beginning of some text data, which is going
>> # to be compressed. It is expected that the whole compressed data will be
>> # way different each time, even if a great part of the input is constant.
>> # Moreover, the size of the randomness changes each time into a range of
>> # [32, 64] bytes, and this adds more unpredictability. Like a hash, the
>> # compression algorithm will produce a way different binary output by just
>> # changing a few bytes and initial conditions.
>> {
>> n=$((33 + ${RANDOM:-15}%32))
>> dd if=/dev/random bs=$n count=1 2>&1
>> cat /proc/cmdline /proc/*stat /init*
>> } | pigz -$((1 + n%9))c > /dev/urandom &
>

Hi Guillermo,

first of all, thanks for the feedback.

> Not sure whether seeding dev/urandom with output from dev/random makes much
> sense, since both use the same source of entropy.

AFAIK, /dev/random uses a source of entropy related to hardware events, while /dev/urandom is a pseudo-random generator. This should guarantee that there is a difference between the two. Immediately after boot, it is assumed that many hardware events have taken place, and therefore reading some bytes from /dev/random would not be such a big issue, IMHO.

> Also note that dev/random will block if there is not enough entropy left, so
> doing this in an init script might not be a very good idea -- specially on
> systems that don't have a good source of entropy available.

As you might have noticed in the function that I sent later in my answer to Jeff, I no longer use /dev/random but /dev/urandom:

udvseed(){ local n=$((33+${RANDOM:-15}%32)) u=/dev/urandom;f(){ dd if=$u bs=$n count=1; };(cd /proc;f;cat cmdline *stat;f;) 2>&1|pigz -$((1+n%9))c >$u; }

This makes your first statement something to consider.
Also, $RANDOM, if available, should be considered as generated by /dev/urandom and therefore as belonging to its entropy pool. If $RANDOM is not available, then my function is quite weak in terms of unpredictability, because it reads 48 bytes from /dev/urandom, which is not seeded yet, and uses them together with some proc data that might change but can be guessed, and then everything is compressed with gzip -7. This is the worst scenario. However, even in the worst scenario, thanks to gzip a single bit of difference in the input generates a completely different compressed output:

redfishos:~ # cat /etc/firmware/touch_module_id_0x82.img | pigz -7c | sha1sum
1f0e7e00a47159708a5877b052d2e2c6e3489788 -
redfishos:~ # { cat /etc/firmware/touch_module_id_0x82.img; echo; } | pigz -7c | sha1sum
d8db1ae97fc5ac8fa441db5c146d95fc43ac6d2e

In the best scenario, $RANDOM provides a number that makes the procedure change at every boot. The data read from /dev/urandom can be anywhere between 33 and 64 bytes long, and the compression level can vary between 1 and 9. Therefore every single bit of the input has to be guessed correctly, otherwise the output will be completely different.

Instead of using $RANDOM, it is possible to use a value generated from /dev/random by a single-byte read:

randval=$(dd if=/dev/random bs=1 count=1 status=none | hexdump -ve '1/1 "%d\n"')

Or keep the first variant, in which the initial data is read from /dev/random with a length between 33 and 64 bytes, adding this trick just in case $RANDOM is not defined.

Best regards, R-
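The single-byte read depends on hexdump and on dd's status=none, which may not be present in every build. As a hedged alternative, the sketch below gets the same value with od; it assumes an od that supports -A, -t and -N (POSIX specifies them, but a minimal busybox od may not have them). The function names rand_byte, seed_len and comp_level are hypothetical helpers, not part of the scripts posted in this thread. The two formulas are copied from the original script so that their actual ranges can be checked.

```sh
#!/bin/sh
# Illustrative sketch; helper names are hypothetical, not from the thread.

rand_byte() {
    # -An: no offset column, -tu1: unsigned one-byte decimals, -N1: one byte.
    od -An -tu1 -N1 /dev/urandom | tr -d ' \n'
}

seed_len() {
    # Same formula as the posted script: 33 + r % 32.
    # $1 optionally overrides the random value; 15 is the script's fallback.
    r=${1:-$(rand_byte)}
    r=${r:-15}
    echo $((33 + r % 32))
}

comp_level() {
    # Same formula as the posted script: 1 + n % 9, which gives 1..9.
    echo $((1 + $1 % 9))
}

n=$(seed_len)
echo "n=$n level=$(comp_level "$n")"
```

Working the formulas through by hand: 33 + r % 32 yields lengths from 33 to 64 (never 32), and with the fallback value 15 the length is 48 and the compression level is 1 + 48 % 9 = 4.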
Re: RFC: initialize /dev/urandom, is it necessary? Can we do it in a better way?
On Mon, 18 Sept 2023 at 9:42, Roberto A. Foglietta (<roberto.foglie...@gmail.com>) wrote:

> Hi all,
>
> I am investigating the Android init procedure (one version, one
> device, not in general) and I found an interesting line about the
> initialization of the /dev/urandom (seeding, I suppose).
>
> cat /proc/cmdline > /dev/urandom
>
> Therefore, I developed a more sophisticated way to do that initialisation:
>
> # RAF: seeding the urandom device with some data and a few bits of randomness.
> # The randomness is put at the beginning of some text data, which is going
> # to be compressed. It is expected that the whole compressed data will be
> # way different each time, even if a great part of the input is constant.
> # Moreover, the size of the randomness changes each time into a range of
> # [32, 64] bytes, and this adds more unpredictability. Like a hash, the
> # compression algorithm will produce a way different binary output by just
> # changing a few bytes and initial conditions.
> {
> n=$((33 + ${RANDOM:-15}%32))
> dd if=/dev/random bs=$n count=1 2>&1
> cat /proc/cmdline /proc/*stat /init*
> } | pigz -$((1 + n%9))c > /dev/urandom &
>

Not sure whether seeding dev/urandom with output from dev/random makes much sense, since both use the same source of entropy.

Also note that dev/random will block if there is not enough entropy left, so doing this in an init script might not be a very good idea -- especially on systems that don't have a good source of entropy available.

Guillermo
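Regarding the blocking concern, an init script could look at the kernel's own entropy estimate before deciding whether to read /dev/random. This is a minimal sketch that assumes a Linux /proc is mounted; the function name entropy_status is hypothetical. How the reported number should be interpreted differs between kernel versions, so it is best treated as a hint rather than a guarantee.

```sh
#!/bin/sh
# Illustrative sketch; entropy_status is a hypothetical helper name.

entropy_status() {
    f=/proc/sys/kernel/random/entropy_avail
    if [ -r "$f" ]; then
        # The kernel's current estimate of available entropy, in bits.
        printf 'entropy_avail=%s\n' "$(cat "$f")"
    else
        # No /proc, or not a Linux system: report that nothing is known.
        printf 'entropy_avail=unknown\n'
    fi
}

entropy_status
```

A script that gets "unknown" or a very low value here would have a reason to avoid a blocking read in the boot path.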
[PATCH v2] date: exit with failure when clock_settime fails
From: Ladislav Michl

Coreutils date has behaved this way since 1998-12-11, as done in their
git commit a17cdb11731e ("(main): Arrange to exit unsuccessfully
when stime fails.")

Signed-off-by: Ladislav Michl
---
CHANGES:
 -v2: better compatibility with coreutils, add explanatory commit message

 coreutils/date.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/coreutils/date.c b/coreutils/date.c
index 3a89b6caf..09d5697dc 100644
--- a/coreutils/date.c
+++ b/coreutils/date.c
@@ -166,12 +166,13 @@ int date_main(int argc UNUSED_PARAM, char **argv)
 	struct tm tm_time;
 	char buf_fmt_dt2str[64];
 	unsigned opt;
-	int isofmt = -1;
 	char *date_str;
 	char *fmt_dt2str;
 	char *fmt_str2dt;
 	char *filename;
 	char *isofmt_arg = NULL;
+	int ret = EXIT_SUCCESS;
+	int isofmt = -1;
 
 	opt = getopt32long(argv, "^"
 			"Rs:ud:r:"
@@ -287,9 +288,12 @@ int date_main(int argc UNUSED_PARAM, char **argv)
 		ts.tv_sec = validate_tm_time(date_str, &tm_time);
 		ts.tv_nsec = 0;
 
-		/* if setting time, set it */
+		/* if setting time, set the system clock to the specified date,
+		 * then regardless of the success of that operation,
+		 * format and print that date. */
 		if ((opt & OPT_SET) && clock_settime(CLOCK_REALTIME, &ts) < 0) {
 			bb_simple_perror_msg("can't set date");
+			ret = EXIT_FAILURE;
 		}
 	}
 
@@ -383,5 +387,5 @@ int date_main(int argc UNUSED_PARAM, char **argv)
 	}
 	puts(date_buf);
 
-	return EXIT_SUCCESS;
+	return ret;
 }
--
2.39.2
Re: RFC: initialize /dev/urandom, is it necessary? Can we do it in a better way?
On Mon, 18 Sept 2023 at 10:11, Jeff Pohlmeyer wrote:
>
> On Mon, Sep 18, 2023 at 2:42 AM Roberto A. Foglietta wrote:
>
> > In case the /dev/urandom initialisation is a necessity (or a best
> > practice), does it make sense to add it into busybox as an option or
> > as an application?
>
> If you are able to update to a newer version of busybox, you might
> want to check out the recently added "seedrng" applet, which seems to
> be a well-considered means of addressing this issue.

Hi Jeff,

thanks for the answer:

redfishos:~ # seedrng
seedrng: can't create directory '/var/lib/seedrng': No such file or directory
redfishos:~ # mkdir -p /var/lib/seedrng
redfishos:~ # seedrng
Saving 2048 bits of creditable seed for next boot

I think that the app could create the directory path if it does not exist. Moreover, an option to write to stdout would be nice to have.

> You can find a
> (rather lengthy) discussion here:
>
> http://lists.busybox.net/pipermail/busybox/2022-April/089545.html

About this discussion, I have noticed two main points:

1. The RNG can't actually be seeded from a shell script, due to the reliance on ioctls and the fact that entropy written into the unprivileged /dev/urandom device is not immediately mixed in, making subsequent seed reads dangerous.

2. I suppose that the kernel will load the generated file from the standard folder at the next boot without further changes, but I am not sure about that. For sure, it will not succeed in my case, because the rootfs is a volatile filesystem and adding a link to a permanent data partition is not a general solution (for this system and at the moment).

IMHO, the best I can do is to seed /dev/urandom by injecting some data and then retrieve some data from it. I have no clue how much data must be read from /dev/urandom to guarantee that the entropy injected into it is mixed in as expected.
I have created a function that generates more than 2048 bytes for seeding /dev/urandom and reads 4 KB afterwards, hoping to trigger the mixing of the new entropy.

udvseed(){ local n=$((33+${RANDOM:-15}%32)) u=/dev/urandom;f(){ dd if=$u bs=$n count=1; };(cd /proc;f;cat cmdline *stat;f;) 2>&1|pigz -$((1+n%9))c >$u; }

I wrote it to be short. In fact, it is 153 bytes, while the seedrng app in busybox is about 1650. The function and the app are completely different, so it is not a fair comparison. However, it is not the first time I have noticed that a busybox app can easily be replaced with a shell script function, which reduces the footprint N times.

Best regards, R-
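For readers who find the one-liner hard to follow, here is an expanded sketch of the same idea. It is not the posted function: it assumes gzip with -n in place of pigz, it takes a hypothetical output argument so the result can be inspected in a file before anything is written to the device, and it discards dd's status messages, whereas the one-liner folds them into the stream with 2>&1. As point 1 above says, data written to /dev/urandom this way is not credited as entropy.

```sh
#!/bin/sh
# Illustrative, expanded variant of the udvseed one-liner.
# Assumptions: gzip accepts -n; the output argument is hypothetical.

udvseed_readable() {
    out=${1:-/dev/urandom}   # where to write; a file path allows a dry run
    src=/dev/urandom

    # 33..64 bytes per read; falls back to 15 where the shell has no $RANDOM.
    n=$((33 + ${RANDOM:-15} % 32))
    # Compression level 1..9, derived from the same number.
    level=$((1 + n % 9))

    {
        dd if="$src" bs="$n" count=1 2>/dev/null    # leading random bytes
        cat /proc/cmdline /proc/*stat 2>/dev/null   # guessable system data
        dd if="$src" bs="$n" count=1 2>/dev/null    # trailing random bytes
    } | gzip -n "-$level" -c > "$out"
}
```

Calling it as `udvseed_readable /tmp/seed-preview.gz` (a made-up path) leaves a gzip stream that can be examined with `gzip -dc`, which makes it easy to see how much of the input is predictable.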
Re: RFC: initialize /dev/urandom, is it necessary? Can we do it in a better way?
On Mon, Sep 18, 2023 at 2:42 AM Roberto A. Foglietta wrote:

> In case the /dev/urandom initialisation is a necessity (or a best
> practice), does it make sense to add it into busybox as an option or
> as an application?

If you are able to update to a newer version of busybox, you might want to check out the recently added "seedrng" applet, which seems to be a well-considered means of addressing this issue. You can find a (rather lengthy) discussion here:

http://lists.busybox.net/pipermail/busybox/2022-April/089545.html

- Jeff
RFC: initialize /dev/urandom, is it necessary? Can we do it in a better way?
Hi all,

I am investigating the Android init procedure (one version, one device, not in general) and I found an interesting line about the initialization of /dev/urandom (seeding, I suppose).

cat /proc/cmdline > /dev/urandom

Therefore, I developed a more sophisticated way to do that initialisation:

# RAF: seeding the urandom device with some data and a few bits of randomness.
# The randomness is put at the beginning of some text data, which is going
# to be compressed. It is expected that the whole compressed data will be
# way different each time, even if a great part of the input is constant.
# Moreover, the size of the randomness changes each time into a range of
# [32, 64] bytes, and this adds more unpredictability. Like a hash, the
# compression algorithm will produce a way different binary output by just
# changing a few bytes and initial conditions.
{
n=$((33 + ${RANDOM:-15}%32))
dd if=/dev/random bs=$n count=1 2>&1
cat /proc/cmdline /proc/*stat /init*
} | pigz -$((1 + n%9))c > /dev/urandom &

I would like to ask the people on this mailing list, because I know that there are Linux experts here, two questions:

1. Is it necessary to initialise /dev/urandom? Or does the kernel take care of it by itself, so that doing it explicitly is merely an improvement?

2. In your opinion, can the script above provide a reasonably unpredictable initialisation?

In case the /dev/urandom initialisation is a necessity (or a best practice), does it make sense to add it into busybox as an option or as an application?

Best regards, R-