Hello again!

I wrote:
> > #elif defined(BF_PTR)
> >
> > /* This is normally very good */
> >
> > #define BF_ENC(LL,R,S,P) \
> >       LL^=P; \
> >       LL^= (((*(BF_LONG *)((unsigned char *)&(S[  0])+((R>>BF_0)&BF_M))+ \
> >               *(BF_LONG *)((unsigned char *)&(S[256])+((R>>BF_1)&BF_M)))^ \
> >               *(BF_LONG *)((unsigned char *)&(S[512])+((R>>BF_2)&BF_M)))+ \
> >               *(BF_LONG *)((unsigned char *)&(S[768])+((R<<BF_3)&BF_M)));
> Observe &BF_M! Remember that it had *two* LSB cleared? What does it
> mean? It means that on LP64 it's going to generate misaligned access!
> Indeed, two bits means 32-bit alignment, doesn't it? Boom! The code
> dumps the core with BUS ERROR. Well, probably not on Alpha (I don't have
> one to test, yet...).
No, it doesn't, for reasons I don't understand... Can anybody explain?
The compiler does generate ldq instructions, and the assembler manual
says "data-alignment exception is signaled", but the program runs with
no BUS ERROR (unlike Solaris 7/64). It must be the kernels (tested under
Digital Unix and Linux) playing smart and fetching the unaligned data in
the trap handler...
> But first I want to point out that misaligned
> access is actually a minor problem here! All the shifts and masks are
> chosen in assumption that S is an array of 32 bits values!!! No wonder
> it doesn't work on LP64s...
> > #else
> >
> > /* This will always work, even on 64 bit machines and strangly enough,
> >  * on the Alpha it is faster than the pointer versions (both 32 and 64
> >  * versions of BF_LONG) */
> Well, Alpha does have misaligned access instructions that are known
> for *hurting* performance of memory access bound algorithms like this
> one!
Whatever the impact of the misaligned-access instructions is, I don't
believe it's comparable with what I see when instrumenting the following
program:

unsigned long func (unsigned long *p,int i)
{  /* i&~3 is only 4-byte aligned, so every other 8-byte load is misaligned */
   return *(unsigned long *)((unsigned char *)p + (i&~3));  }
int main ()
{  unsigned long p[16] = {0}; int i=0;
   while (1) func(p,i), i=(i+1)&63;
}

It runs 30 times slower under Linux and 15 times slower under OSF!!!
Yes, it's ldq that is generated in func()...
> 
> Ways to solve the puzzle:
> 
> - force BF_LONG to unsigned int on *all* platforms;
> - pick BF_* depending on sizeof(long);
> 
> I myself would vote for the first alternative unless someone can either:
> 
> - confirm that *(unsigned long *)((unsigned char *)p+(i&~7)) generates the
> unwanted unaligned load instruction on Alpha;
As you now realize, the performance degradation on Alpha was presumably
caused *not* by the presence of misaligned load instructions, but by
ordinary (aligned) load instructions hitting misaligned addresses and
generating a lot of traps in the BF_PTR loop! I've also bothered to
instrument an aligned 32-bit load and can confirm that there's *no*
notable performance difference between 64-bit and 32-bit loads as long
as they're properly aligned.

Bottom line. I vote (no "would" this time:-) for #define BF_LONG
unsigned int on *all* platforms. Patch is damn simple:

*** crypto/bf/blowfish.h.orig   Tue Dec 22 16:59:56 1998
--- crypto/bf/blowfish.h        Tue Apr 20 14:18:20 1999
***************
*** 66,84 ****
  #define BF_ENCRYPT    1
  #define BF_DECRYPT    0
  
! /* If you make this 'unsigned int' the pointer variants will work on
!  * the Alpha, otherwise they will not.  Strangly using the '8 byte'
!  * BF_LONG and the default 'non-pointer' inner loop is the best configuration
!  * for the Alpha */
! #if defined(__sgi)
! #  if (_MIPS_SZLONG==64)
! #    define BF_LONG unsigned int
! #  else
! #    define BF_LONG unsigned long
! #  endif
! #else
! #  define BF_LONG unsigned long
! #endif
  
  #define BF_ROUNDS     16
  #define BF_BLOCK      8
--- 66,72 ----
  #define BF_ENCRYPT    1
  #define BF_DECRYPT    0
  
! #define BF_LONG unsigned int
  
  #define BF_ROUNDS     16
  #define BF_BLOCK      8

Well, cleaning up the #ifdef spaghetti at the beginning of bf_locl.h
sounds like a good idea as well, but I leave it as homework for the
development team:-)

Cheers. Andy.
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [EMAIL PROTECTED]
Automated List Manager                           [EMAIL PROTECTED]
