> > The following is the obvious PGSQL code, you'd obviously need something > else to stop duplicates. > > SELECT array_to_string(array(( > SELECT SUBSTRING('abcdefghjklmnpqrstuvwxyz23456789' > FROM mod((random()*32)::int, 32)+1 FOR 1) > FROM generate_series(1,5))),''); > > As this only generates five characters and each character can be one of > 32 values, you've got about 33554432 choices and you'd have a 50% chance > of getting a duplicate after 7240 values. This assumes I wrote the > above code correctly. It's also not amazing because PG's random number > generator is defined to return a value between 0 and 1 inclusive, it's > generally much more useful if it runs from 0 to less than 1 and would > mean that I wouldn't need the "mod" above and would remove the (slight) > biasing towards choosing 'a'. >
That does actually work! I'm not sure why you're saying that there's a 50% chance of duplication after 7240 values though. With 33 million combinations, I would have thought that duplications would become equally likely at the 16,777,216 mark. I hadn't thought of coding it the way you did, which is an interesting way of approaching it! Thom