Re: [PERFORM] Creating large database of MD5 hash values

2008-05-28 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Decibel! wrote:
>> If you do this *please* post it. I really think it would be worth
>> while for us to have fixed-size data types for common forms of binary
>> data; MD5, SHA1 and SHA256 come to mind.
> Why do you think it would be worth while? Giv

Re: [PERFORM] Creating large database of MD5 hash values

2008-05-28 Thread Bruce Momjian
Decibel! wrote:
> On Apr 11, 2008, at 10:25 AM, Alvaro Herrera wrote:
>
> Sorry, yes, I'm behind on email... :(
>
> > If MD5 values will be your primary data and you'll be storing millions
> > of them, it would be wise to create your own datatype and operators
> > with the most compact and

Re: [PERFORM] Creating large database of MD5 hash values

2008-05-24 Thread Decibel!
On Apr 11, 2008, at 10:25 AM, Alvaro Herrera wrote:

Sorry, yes, I'm behind on email... :(

> If MD5 values will be your primary data and you'll be storing millions
> of them, it would be wise to create your own datatype and operators
> with the most compact and efficient representation possible.

Re: [PERFORM] Creating large database of MD5 hash values

2008-04-11 Thread Florian Weimer
* Jon Stewart:
>> BYTEA is slower to load and a bit inconvenient to use from DBI, but
>> occupies less space on disk than TEXT or VARCHAR in hex form (17 vs 33
>> bytes with PostgreSQL 8.3).
> Can you clarify the "slower to load" point? Where is that pain point
> in the postgres architecture?

Re: [PERFORM] Creating large database of MD5 hash values

2008-04-11 Thread Jon Stewart
> > 1. Which datatype should I use to represent the hash value? UUIDs are
> > also 16 bytes...
>
> BYTEA is slower to load and a bit inconvenient to use from DBI, but
> occupies less space on disk than TEXT or VARCHAR in hex form (17 vs 33
> bytes with PostgreSQL 8.3).

Can you clarify the "s

Re: [PERFORM] Creating large database of MD5 hash values

2008-04-11 Thread Alvaro Herrera
Jon Stewart wrote:
> Hello,
>
> I am creating a large database of MD5 hash values. I am a relative
> newb with PostgreSQL (or any database for that matter). The schema and
> operation will be quite simple -- only a few tables, probably no
> stored procedures -- but I may easily end up with seve

Re: [PERFORM] Creating large database of MD5 hash values

2008-04-11 Thread Florian Weimer
* Jon Stewart:
> 1. Which datatype should I use to represent the hash value? UUIDs are
> also 16 bytes...

BYTEA is slower to load and a bit inconvenient to use from DBI, but
occupies less space on disk than TEXT or VARCHAR in hex form (17 vs 33
bytes with PostgreSQL 8.3).

> 2. Does it make sense
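
The "17 vs 33 bytes" figure quoted above can be reproduced with simple arithmetic. The sketch below is only an illustration, not part of the original message: it assumes each short value carries a 1-byte length header (the constant name is made up for this example) and ignores row overhead and alignment padding.

```python
import hashlib

# Assumption for this sketch: a short variable-length value carries a
# 1-byte length header, which matches the figures quoted in the thread.
ASSUMED_HEADER_BYTES = 1

digest = hashlib.md5(b"example input").digest()  # raw binary form
hex_form = digest.hex()                          # hex text form

raw_size = len(digest) + ASSUMED_HEADER_BYTES    # BYTEA-style storage
hex_size = len(hex_form) + ASSUMED_HEADER_BYTES  # TEXT/VARCHAR-style storage

print(raw_size, hex_size)
```

At several hundred million rows, the 16-byte difference per value adds up to several gigabytes before any index is counted.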

Re: [PERFORM] Creating large database of MD5 hash values

2008-04-11 Thread Chris
> 1. Which datatype should I use to represent the hash value? UUIDs are
> also 16 bytes...

md5's are always 32 characters long so probably varchar(32).

> 2. Does it make sense to denormalize the hash set relationships?

The general rule is normalize as much as possible then only denormalize whe
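
To make the size comparison in question 1 concrete, here is a minimal sketch (not from the thread) showing that an MD5 digest is 32 characters in hex and 16 bytes in raw form, so it fits exactly into a UUID-shaped value. The input string is arbitrary, and whether a database's UUID type is a good home for hashes is a separate question this does not answer.

```python
import hashlib
import uuid

data = b"example input"  # arbitrary sample input for illustration

hex_digest = hashlib.md5(data).hexdigest()  # text form, as for varchar(32)
raw_digest = hashlib.md5(data).digest()     # binary form

# A UUID is a 128-bit value, so the 16 raw bytes map onto it one-to-one.
as_uuid = uuid.UUID(bytes=raw_digest)

print(len(hex_digest), len(raw_digest))
print(as_uuid.hex == hex_digest)
```

Note that the canonical string form of a UUID adds four hyphens, so it is 36 characters long rather than 32.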

[PERFORM] Creating large database of MD5 hash values

2008-04-10 Thread Jon Stewart
Hello,

I am creating a large database of MD5 hash values. I am a relative newb with PostgreSQL (or any database for that matter). The schema and operation will be quite simple -- only a few tables, probably no stored procedures -- but I may easily end up with several hundred million rows of hash v