notmuch sha1 implementation broken on (some) big-endian architectures
On Sun, Nov 24 2013, David Bremner wrote:

> The following code, when linked with libnotmuch.a and libutil.a does a
> passable imitation of sha1sum on amd64 (and I guess also i386) but
> computes a different digest on powerpc and probably sparc and s390x.
>
> In the long run we should maybe outsource hash computations to
> e.g. librhash, but I'd like a simpler fix for 0.17, if possible
>
> P.S. I blame Austin for adding the "missing-headers" test which found
> this bug ;).

This is an interesting problem. I would have guessed that this would
fail more easily on LITTLE_ENDIAN machines, if ever...

... especially as there is the line

    lib/libsha1.c:52:#if (PLATFORM_BYTE_ORDER == IS_LITTLE_ENDIAN)

... but... I could not find any (other) matches for PLATFORM_BYTE_ORDER
or IS_LITTLE_ENDIAN in the source code, in /usr/include/**/*.h or in
/usr/lib/gcc/**/*.h

I did some testing and it seems that

    #if (F == BBBAAARRR)
    <code>
    #endif

will have <code> in the output file in case neither of the above is
defined... :/

So, this could work:

    #include <endian.h> // for BYTE_ORDER && LITTLE_ENDIAN

and then

    #if (BYTE_ORDER == LITTLE_ENDIAN)

... to replace lib/libsha1.c:53 (53 now after endian.h added)

Please test on a BIG_ENDIAN machine...

In case this works, we'd then need to inform users that their
long/missing Message IDs are now coded differently in their databases...

Tomi
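The preprocessor behaviour described above follows from a C rule: inside an #if expression, any identifier that is not a defined macro is replaced by 0. Below is a minimal sketch of that rule. It reuses the two macro names from lib/libsha1.c and assumes, as the search above suggests, that nothing defines them. The helper function names are hypothetical and are not part of notmuch.

```c
#include <assert.h>

/* Assumption: neither macro is defined anywhere, as the search in the
 * message above suggests. Each undefined identifier then becomes 0 in
 * the #if expression, so the test reads (0 == 0) and is true. */
#if (PLATFORM_BYTE_ORDER == IS_LITTLE_ENDIAN)
#define UNGUARDED_BRANCH_TAKEN 1
#else
#define UNGUARDED_BRANCH_TAKEN 0
#endif

/* A defensive variant: only trust the comparison when both macros exist. */
#if defined(PLATFORM_BYTE_ORDER) && defined(IS_LITTLE_ENDIAN)
#define BYTE_ORDER_MACROS_DEFINED 1
#else
#define BYTE_ORDER_MACROS_DEFINED 0
#endif

/* Hypothetical helpers exposing the preprocessor decisions at run time. */
int unguarded_branch_taken(void) { return UNGUARDED_BRANCH_TAKEN; }
int byte_order_macros_defined(void) { return BYTE_ORDER_MACROS_DEFINED; }

/* With both macros undefined, the unguarded #if selects the
 * "little endian" branch on every host, whatever its byte order. */
void check_preprocessor_behaviour(void)
{
    assert(unguarded_branch_taken() == 1);
    assert(byte_order_macros_defined() == 0);
}
```

GCC's -Wundef option warns when an undefined identifier is evaluated in an #if directive, which would have flagged the comparison at build time.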
notmuch sha1 implementation broken on (some) big-endian architectures
David Bremner writes:

> The following code, when linked with libnotmuch.a and libutil.a does a
> passable imitation of sha1sum on amd64 (and I guess also i386) but
> computes a different digest on powerpc and probably sparc and s390x.
>
> In the long run we should maybe outsource hash computations to
> e.g. librhash, but I'd like a simpler fix for 0.17, if possible

Out of curiosity, I tried out a similar example with librhash, and it
works fine on powerpc.

    #include <errno.h>
    #include <stdio.h>   /* fprintf, printf */
    #include <string.h>  /* strerror */
    #include "rhash.h"   /* LibRHash interface */

    int main(int argc, char *argv[])
    {
      char digest[64];
      char output[130];

      rhash_library_init(); /* initialize static data */

      int res = rhash_file(RHASH_SHA1, argv[1], digest);
      if(res < 0) {
        fprintf(stderr, "LibRHash error: %s: %s\n", argv[1], strerror(errno));
        return 1;
      }

      /* convert binary digest to hexadecimal string */
      rhash_print_bytes(output, digest, rhash_get_digest_size(RHASH_SHA1),RHPR_HEX);

      printf("%s (%s) = %s\n", rhash_get_name(RHASH_SHA1), argv[1], output);
      return 0;
    }
notmuch sha1 implementation broken on (some) big-endian architectures
The following code, when linked with libnotmuch.a and libutil.a does a
passable imitation of sha1sum on amd64 (and I guess also i386) but
computes a different digest on powerpc and probably sparc and s390x.

In the long run we should maybe outsource hash computations to
e.g. librhash, but I'd like a simpler fix for 0.17, if possible

P.S. I blame Austin for adding the "missing-headers" test which found
this bug ;).

    /* 8<- */
    #include <stdio.h>
    #include "notmuch.h"

    char *
    notmuch_sha1_of_file(const char* filename);

    int
    main (int argc, char **argv)
    {
        char *digest = notmuch_sha1_of_file (argv[1]);
        printf("%s %s\n",digest,argv[1]);
        return 0;
    }
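One way such a mismatch can arise, assuming the byte-order hypothesis raised in the replies is right: SHA-1 reads each 64-byte block as sixteen big-endian 32-bit words. After a raw copy, a little-endian host has to reverse the bytes of each word, while a big-endian host must leave them alone. If the swap is applied on a big-endian host too, every input word is corrupted and the digest changes. The sketch below is not notmuch's code. load_be32 and bswap32 are hypothetical helpers, and the sample bytes are the start of the padded message "abc".

```c
#include <assert.h>
#include <stdint.h>

/* Assemble a big-endian 32-bit word with shifts. The result is the same
 * on every host, so no byte-order macro is involved. */
uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Reverse the four bytes of a word: the fix-up a little-endian host
 * needs after copying message bytes straight into a uint32_t. */
uint32_t bswap32(uint32_t x)
{
    return ((x & 0x000000FFu) << 24) | ((x & 0x0000FF00u) << 8) |
           ((x >> 8) & 0x0000FF00u)  |  (x >> 24);
}

void check_word_loading(void)
{
    /* "abc" followed by the 0x80 padding byte. */
    const unsigned char block[4] = { 0x61, 0x62, 0x63, 0x80 };

    /* The word SHA-1 expects. */
    uint32_t expected = load_be32(block);
    assert(expected == 0x61626380u);

    /* What a big-endian host would feed into the hash if it applied
     * the little-endian swap to an already correct word. */
    assert(bswap32(expected) == 0x80636261u);
}
```

Loading words with shifts, as load_be32 does, avoids the compile-time byte-order decision altogether, at the cost of a few extra operations per word.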