Alex Stapleton wrote:
> On 11 Jul 2007, at 16:35, Tres Seaver wrote:
>
>> [EMAIL PROTECTED] wrote:
>>> Hi everyone,
>>>
>>> I'm happy to see a nice and compact result with zero bloat. I'm also
>>> happy you guys kept alignment within the request/response struct and
>>> that would help performance.
>>>
>>> I see byte ordering is mentioned twice; the length field both in the
>>> request and response.
>>>
>>> While network byte ordering (Big Endian) is traditionally the 'right'
>>> thing to do (or the default thing to do), in most cases it's a minor
>>> performance hit due to constant swapping. Since we're implementing a
>>> binary protocol specifically to avoid/minimize minor performance hits
>>> and since this is a brand new protocol I would recommend to keep all
>>> values as Little Endian because:
>>>
>>> - It's easier if all values are kept to the same endianness;
>>>   reduces confusion.
>>
>> Heh, agreed. Any numeric value larger than one byte should be in
>> network order, which removes the confusion. ;)
>>
>>> - Nowadays MOST (but obviously not all) servers are running little
>>>   endian. So this saves byte swapping for most people's cases and
>>>   thus a few cycles are spared on each request -- isn't that the
>>>   whole point? ;)
>>
>> -1. Burden of proof is on those wanting host order to show
>> *measured* overhead on real workloads.
>
> It's 1 single extra instruction (BSWAP) to convert each multibyte
> value. So the overhead is rather low.
>
> My quick benchmark on this managed to do 20,000,000,000 htonl() calls
> (implemented as BSWAP) in 0.88 seconds.
>
> On one hand it's almost no performance hit, on the other,
> intentionally adding any performance penalty seems like a bad call.
> It would make implementation somewhat simpler to only support network
> ordering, and supporting both orders is probably not going to be
> justified by the performance gains, which I imagine will be close to 0.

Thanks for quantifying. 44 picoseconds per command seems like pretty
tolerable overhead to me.

> +1 for network ordering only. (And I'm an Intel user ;)

Agreed. Given the possibility of pipeline stalls on modern CPUs, it is
quite credible that a network-only implementation is faster, even on
Intel, than one which sniffs the magic byte to determine whether to
*do* the swapping.


Tres.
-- 
===================================================================
Tres Seaver          +1 540-429-0999          [EMAIL PROTECTED]
Palladion Software   "Excellence by Design"    http://palladion.com
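
For anyone following along, here is a minimal sketch of what "network ordering only" could look like for the length field discussed above. The helper names put_length() and get_length() are hypothetical and not part of any proposed protocol code; the sketch assumes a 32-bit length and a POSIX system that provides htonl()/ntohl() in <arpa/inet.h>.

```c
#include <arpa/inet.h>  /* htonl, ntohl (POSIX) */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store a 32-bit length at buf[0..3] in network
 * (big-endian) byte order, regardless of the host's endianness. */
static void put_length(unsigned char *buf, uint32_t host_len)
{
    assert(buf != NULL);
    uint32_t wire = htonl(host_len);  /* byte swap on little-endian hosts, no-op on big-endian */
    memcpy(buf, &wire, sizeof wire);  /* memcpy avoids unaligned stores */
}

/* Hypothetical helper: read the length back into host byte order. */
static uint32_t get_length(const unsigned char *buf)
{
    assert(buf != NULL);
    uint32_t wire;
    memcpy(&wire, buf, sizeof wire);
    return ntohl(wire);
}
```

Both helpers always convert, so there is no branch on a magic byte to decide whether swapping is needed; that unconditional path is the simplification being argued for in this thread.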
