Hi,

I have an application which stores a large amount of hex-encoded hash
strings (nearly 100 GB of them), which means:

   - The number of distinct characters (alphabet) is limited to 16
   - Each string is of the same length, 64 characters
   - The strings are essentially random

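As an aside, because the alphabet is hex, each 64-character string encodes only 32 bytes of actual data, so storing the raw bytes in a bytea column would roughly halve both the column and any index built on it. A sketch of what that could look like (the table and column names here are just the placeholders from my CREATE INDEX example later in this message):

   -- Sketch only: "t" and "field_name" are placeholder names.
   -- decode(..., 'hex') converts the 64-char hex string to 32 raw bytes.
   ALTER TABLE t ADD COLUMN field_bytes bytea;
   UPDATE t SET field_bytes = decode(field_name, 'hex');
   CREATE INDEX ON t USING btree (field_bytes);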
Creating a B-tree index on this results in the index size being larger than
the table itself, and there are disk space constraints.

I've found the SP-GiST radix tree index and thought it could be a good
match for the data because of the above constraints. However, an attempt to
create it (as in CREATE INDEX ON t USING spgist(field_name)) apparently
takes more than 12 hours (while a similar B-tree index takes a few hours at
most), so I interrupted it on the assumption that it was not going to
finish in a reasonable time. Some slides I found on SP-GiST suggest that
both its build time and its size make it unsuitable for this purpose.

My question is: what would be the most size-efficient index for this
situation?
