On Mon, 2008-12-22 at 13:47 -0600, Kenneth Marshall wrote: 
> Dear PostgreSQL developers,
> 
> I am re-sending this to keep this last change to the
> internal hash function on the radar.
> 

Hi Ken,

A few comments:

1. New patch with very minor changes attached.

2. I reverted the change you made to indices.sgml. We still don't use
WAL for hash indexes, and in my opinion we should continue to discourage
their use until we do use WAL. We can add back in the comment that hash
indexes are suitable for large keys if we have some results to show
that.

3. There was a regression test failure in union.sql because the ordering
of the results was different. I updated the regression test.

4. Hash functions affect a lot more than hash indexes, so I ran through
a variety of tests that use a HashAggregate plan. Test setup and results
are attached. These results show no difference between the old and the
new code (about 0.1% better).

5. The hash index build time shows some improvement. The new code won in
every instance in which a there were a lot of duplicates in the table
(100 distinct values, 50K of each) by around 5%.

The new code appeared to be the same or slightly worse in the case of
hash index builds with few duplicates (1000000 distinct values, 5 of
each). The difference was about 1% worse, which is probably just noise.

Note: I'm no expert on hash functions. Take all of my tests with a grain
of salt.

I would feel a little better if I saw at least one test that showed
better performance of the new code on a reasonable-looking distribution
of data. The hash index build that you showed only took a second or two
-- it would be nice to see a test that lasted at least a minute.

Regards,
        Jeff Davis


Attachment: test_results.tar.gz
Description: application/compressed-tar

diff --git a/src/backend/access/hash/hashfunc.c b/src/backend/access/hash/hashfunc.c
index 96d5643..8a236b5 100644
--- a/src/backend/access/hash/hashfunc.c
+++ b/src/backend/access/hash/hashfunc.c
@@ -200,39 +200,94 @@ hashvarlena(PG_FUNCTION_ARGS)
  * hash function, see http://burtleburtle.net/bob/hash/doobs.html,
  * or Bob's article in Dr. Dobb's Journal, Sept. 1997.
  *
- * In the current code, we have adopted an idea from Bob's 2006 update
- * of his hash function, which is to fetch the data a word at a time when
- * it is suitably aligned.  This makes for a useful speedup, at the cost
- * of having to maintain four code paths (aligned vs unaligned, and
- * little-endian vs big-endian).  Note that we have NOT adopted his newer
- * mix() function, which is faster but may sacrifice some randomness.
+ * In the current code, we have adopted Bob's 2006 update of his hash
+ * which fetches the data a word at a time when it is suitably aligned.
+ * This makes for a useful speedup, at the cost of having to maintain
+ * four code paths (aligned vs unaligned, and little-endian vs big-endian).
+ * It also two separate mixing functions mix() and final(), instead
+ * of a slower multi-purpose function.
  */
 
 /* Get a bit mask of the bits set in non-uint32 aligned addresses */
 #define UINT32_ALIGN_MASK (sizeof(uint32) - 1)
+#define rot(x,k) (((x)<<(k)) | ((x)>>(32-(k))))
 
 /*----------
  * mix -- mix 3 32-bit values reversibly.
- * For every delta with one or two bits set, and the deltas of all three
- * high bits or all three low bits, whether the original value of a,b,c
- * is almost all zero or is uniformly distributed,
- * - If mix() is run forward or backward, at least 32 bits in a,b,c
- *	 have at least 1/4 probability of changing.
- * - If mix() is run forward, every bit of c will change between 1/3 and
- *	 2/3 of the time.  (Well, 22/100 and 78/100 for some 2-bit deltas.)
+ *
+ * This is reversible, so any information in (a,b,c) before mix() is
+ * still in (a,b,c) after mix().
+ *
+ * If four pairs of (a,b,c) inputs are run through mix(), or through
+ * mix() in reverse, there are at least 32 bits of the output that
+ * are sometimes the same for one pair and different for another pair.
+ * This was tested for:
+ * * pairs that differed by one bit, by two bits, in any combination
+ *   of top bits of (a,b,c), or in any combination of bottom bits of
+ *   (a,b,c).
+ * * "differ" is defined as +, -, ^, or ~^.  For + and -, I transformed
+ *   the output delta to a Gray code (a^(a>>1)) so a string of 1's (as
+ *   is commonly produced by subtraction) look like a single 1-bit
+ *   difference.
+ * * the base values were pseudorandom, all zero but one bit set, or
+ *   all zero plus a counter that starts at zero.
+ * 
+ * This does not achieve avalanche.  There are input bits of (a,b,c)
+ * that fail to affect some output bits of (a,b,c), especially of a.  The
+ * most thoroughly mixed value is c, but it doesn't really even achieve
+ * avalanche in c. 
+ * 
+ * This allows some parallelism.  Read-after-writes are good at doubling
+ * the number of bits affected, so the goal of mixing pulls in the opposite
+ * direction as the goal of parallelism.  I did what I could.  Rotates
+ * seem to cost as much as shifts on every machine I could lay my hands
+ * on, and rotates are much kinder to the top and bottom bits, so I used
+ * rotates.
  *----------
  */
 #define mix(a,b,c) \
 { \
-  a -= b; a -= c; a ^= ((c)>>13); \
-  b -= c; b -= a; b ^= ((a)<<8); \
-  c -= a; c -= b; c ^= ((b)>>13); \
-  a -= b; a -= c; a ^= ((c)>>12);  \
-  b -= c; b -= a; b ^= ((a)<<16); \
-  c -= a; c -= b; c ^= ((b)>>5); \
-  a -= b; a -= c; a ^= ((c)>>3);	\
-  b -= c; b -= a; b ^= ((a)<<10); \
-  c -= a; c -= b; c ^= ((b)>>15); \
+  a -= c;  a ^= rot(c, 4);  c += b; \
+  b -= a;  b ^= rot(a, 6);  a += c; \
+  c -= b;  c ^= rot(b, 8);  b += a; \
+  a -= c;  a ^= rot(c,16);  c += b; \
+  b -= a;  b ^= rot(a,19);  a += c; \
+  c -= b;  c ^= rot(b, 4);  b += a; \
+}
+
+/*----------
+ * final -- final mixing of 3 32-bit values (a,b,c) into c
+ *
+ * Pairs of (a,b,c) values differing in only a few bits will usually
+ * produce values of c that look totally different.  This was tested for
+ * * pairs that differed by one bit, by two bits, in any combination
+ *   of top bits of (a,b,c), or in any combination of bottom bits of
+ *   (a,b,c).
+ * * "differ" is defined as +, -, ^, or ~^.  For + and -, I transformed
+ *   the output delta to a Gray code (a^(a>>1)) so a string of 1's (as
+ *   is commonly produced by subtraction) look like a single 1-bit
+ *   difference.
+ * * the base values were pseudorandom, all zero but one bit set, or
+ *   all zero plus a counter that starts at zero.
+ *     
+ * The use of separate functions for mix() and final() allow for a
+ * substantial performance increase since final() does not need to
+ * do well in reverse, but is does need to affect all output bits.
+ * mix(), on the other hand, does not need to affect all output
+ * bits (affecting 32 bits is enough).The original hash function had
+ * a single mixing operation that had to satisfy both sets of requirements
+ * and was slower as a result.
+ *----------
+ */
+#define final(a,b,c) \
+{ \
+  c ^= b; c -= rot(b,14); \
+  a ^= c; a -= rot(c,11); \
+  b ^= a; b -= rot(a,25); \
+  c ^= b; c -= rot(b,16); \
+  a ^= c; a -= rot(c,4);  \
+  b ^= a; b -= rot(a,14); \
+  c ^= b; c -= rot(b,24); \
 }
 
 /*
@@ -260,8 +315,7 @@ hash_any(register const unsigned char *k, register int keylen)
 
 	/* Set up the internal state */
 	len = keylen;
-	a = b = 0x9e3779b9;			/* the golden ratio; an arbitrary value */
-	c = 3923095;				/* initialize with an arbitrary value */
+	a = b = c = 0x9e3779b9 + len + 3923095;
 
 	/* If the source pointer is word-aligned, we use word-wide fetches */
 	if (((long) k & UINT32_ALIGN_MASK) == 0)
@@ -445,7 +499,7 @@ hash_any(register const unsigned char *k, register int keylen)
 #endif /* WORDS_BIGENDIAN */
 	}
 
-	mix(a, b, c);
+	final(a, b, c);
 
 	/* report the result */
 	return UInt32GetDatum(c);
@@ -465,11 +519,10 @@ hash_uint32(uint32 k)
 				b,
 				c;
 
-	a = 0x9e3779b9 + k;
-	b = 0x9e3779b9;
-	c = 3923095 + (uint32) sizeof(uint32);
+	a = b = c = 0x9e3779b9 + (uint32) sizeof(uint32) + 3923095;
+	a += k;
 
-	mix(a, b, c);
+	final(a, b, c);
 
 	/* report the result */
 	return UInt32GetDatum(c);
diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out
index 295dace..be046a9 100644
--- a/src/test/regress/expected/union.out
+++ b/src/test/regress/expected/union.out
@@ -263,16 +263,16 @@ ORDER BY 1;
 SELECT q2 FROM int8_tbl INTERSECT SELECT q1 FROM int8_tbl;
         q2        
 ------------------
-              123
  4567890123456789
+              123
 (2 rows)
 
 SELECT q2 FROM int8_tbl INTERSECT ALL SELECT q1 FROM int8_tbl;
         q2        
 ------------------
-              123
  4567890123456789
  4567890123456789
+              123
 (3 rows)
 
 SELECT q2 FROM int8_tbl EXCEPT SELECT q1 FROM int8_tbl ORDER BY 1;
@@ -305,16 +305,16 @@ SELECT q1 FROM int8_tbl EXCEPT SELECT q2 FROM int8_tbl;
 SELECT q1 FROM int8_tbl EXCEPT ALL SELECT q2 FROM int8_tbl;
         q1        
 ------------------
-              123
  4567890123456789
+              123
 (2 rows)
 
 SELECT q1 FROM int8_tbl EXCEPT ALL SELECT DISTINCT q2 FROM int8_tbl;
         q1        
 ------------------
-              123
  4567890123456789
  4567890123456789
+              123
 (3 rows)
 
 --
@@ -341,8 +341,8 @@ SELECT f1 FROM float8_tbl EXCEPT SELECT f1 FROM int4_tbl ORDER BY 1;
 SELECT q1 FROM int8_tbl INTERSECT SELECT q2 FROM int8_tbl UNION ALL SELECT q2 FROM int8_tbl;
         q1         
 -------------------
-               123
   4567890123456789
+               123
                456
   4567890123456789
                123
@@ -353,15 +353,15 @@ SELECT q1 FROM int8_tbl INTERSECT SELECT q2 FROM int8_tbl UNION ALL SELECT q2 FR
 SELECT q1 FROM int8_tbl INTERSECT (((SELECT q2 FROM int8_tbl UNION ALL SELECT q2 FROM int8_tbl)));
         q1        
 ------------------
-              123
  4567890123456789
+              123
 (2 rows)
 
 (((SELECT q1 FROM int8_tbl INTERSECT SELECT q2 FROM int8_tbl))) UNION ALL SELECT q2 FROM int8_tbl;
         q1         
 -------------------
-               123
   4567890123456789
+               123
                456
   4567890123456789
                123
@@ -416,8 +416,8 @@ LINE 1: ... int8_tbl EXCEPT SELECT q2 FROM int8_tbl ORDER BY q2 LIMIT 1...
 SELECT q1 FROM int8_tbl EXCEPT (((SELECT q2 FROM int8_tbl ORDER BY q2 LIMIT 1)));
         q1        
 ------------------
-              123
  4567890123456789
+              123
 (2 rows)
 
 --
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to