notmuch sha1 implementation broken on (some) big-endian architectures

2013-11-24 Thread Tomi Ollila
On Sun, Nov 24 2013, David Bremner wrote:

> The following code, when linked with libnotmuch.a and libutil.a does a
> passable imitation of sha1sum on amd64 (and I guess also i386) but
> computes a different digest on powerpc and probably sparc and s390x.
>
> In the long run we should maybe outsource hash computations to
> e.g. librhash, but I'd like a simpler fix for 0.17, if possible
>
> P.S. I blame Austin for adding the "missing-headers" test which found
> this bug ;).

This is an interesting problem. I would have guessed that this would
fail more readily on LITTLE_ENDIAN machines, if at all...

... especially as there is this line:

lib/libsha1.c:52:#if (PLATFORM_BYTE_ORDER == IS_LITTLE_ENDIAN)

... but...

I could not find any (other) matches for PLATFORM_BYTE_ORDER nor
IS_LITTLE_ENDIAN in source code or in /usr/include/**/*.h or
in /usr/lib/gcc/**/*.h

I did some testing, and it seems that

#if (F == BBBAAARRR)
code
#endif

will have "code" in the output file when neither of the above is
defined... :/

So, this could work:

#include <endian.h>  // for BYTE_ORDER && LITTLE_ENDIAN

and then

#if (BYTE_ORDER == LITTLE_ENDIAN)
...

to replace lib/libsha1.c:52 (which becomes line 53 once the endian.h
include is added)


Please test on a BIG_ENDIAN machine...


If this works, we would then need to inform users that their
long/missing Message-IDs are now encoded differently in their
databases...


Tomi


notmuch sha1 implementation broken on (some) big-endian architectures

2013-11-24 Thread David Bremner
David Bremner writes:

> The following code, when linked with libnotmuch.a and libutil.a does a
> passable imitation of sha1sum on amd64 (and I guess also i386) but
> computes a different digest on powerpc and probably sparc and s390x.
>
> In the long run we should maybe outsource hash computations to
> e.g. librhash, but I'd like a simpler fix for 0.17, if possible

Out of curiosity, I tried a similar example with librhash, and it
works fine on powerpc.

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include "rhash.h" /* LibRHash interface */

int main(int argc, char *argv[])
{
  unsigned char digest[64];
  char output[130];

  if (argc < 2) {
    fprintf(stderr, "usage: %s FILE\n", argv[0]);
    return 1;
  }

  rhash_library_init(); /* initialize static data */

  /* compute the binary SHA-1 digest of the named file */
  int res = rhash_file(RHASH_SHA1, argv[1], digest);
  if (res < 0) {
    fprintf(stderr, "LibRHash error: %s: %s\n", argv[1], strerror(errno));
    return 1;
  }

  /* convert binary digest to hexadecimal string */
  rhash_print_bytes(output, digest, rhash_get_digest_size(RHASH_SHA1), RHPR_HEX);

  printf("%s (%s) = %s\n", rhash_get_name(RHASH_SHA1), argv[1], output);
  return 0;
}


notmuch sha1 implementation broken on (some) big-endian architectures

2013-11-24 Thread David Bremner

The following code, when linked with libnotmuch.a and libutil.a does a
passable imitation of sha1sum on amd64 (and I guess also i386) but
computes a different digest on powerpc and probably sparc and s390x.

In the long run we should maybe outsource hash computations to
e.g. librhash, but I'd like a simpler fix for 0.17, if possible

P.S. I blame Austin for adding the "missing-headers" test which found
this bug ;).

/* 8<- */

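#include <stdio.h>

#include "notmuch.h"
/* declared by hand here; notmuch.h does not declare it */
char * notmuch_sha1_of_file(const char* filename);

int
main (int argc, char **argv)
{
    char *digest;

    if (argc < 2) {
        fprintf (stderr, "usage: %s FILE\n", argv[0]);
        return 1;
    }

    digest = notmuch_sha1_of_file (argv[1]);
    if (digest == NULL) {
        fprintf (stderr, "could not hash %s\n", argv[1]);
        return 1;
    }

    /* same layout as sha1sum output: digest, two spaces, file name */
    printf ("%s  %s\n", digest, argv[1]);
    return 0;
}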

