Mark,

I think what Greg suggested was sha1(number) as the key instead of requiring uuid as the key... it would perform the same function as far as you r use case is concerned.

As a similar example (using MD5):

CREATE SEQUENCE marks_seq START 1 INCREMENT 1;
CREATE TABLE your_tbl (
    your_key          VARCHAR(32) NOT NULL DEFAULT md5(nextval('marks_seq')),
    your_picture      BYTEA NOT NULL,
    PRIMARY KEY (your_key));

INSERT INTO your_tbl (your_picture) VALUES ('some bytea data');

select * from your_tbl ;
             your_key             |  your_picture
----------------------------------+-----------------
 c4ca4238a0b923820dcc509a6f75849b | some bytea data

same functionality from your standpoint.



On 9/8/05, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
On Thu, Sep 08, 2005 at 01:45:10PM -0000, Greg Sabino Mullane wrote:
> > For a rather simple example, consider a site that associates a picture
> > with each member. If the pictures are named 1.jpg, 2.jpg, 3.jpg, etc.
> > it makes it ridiculously easy to write a script to pull all of the
> > pictures off the site. This can be bothersome, as the only type of
> > person who would do this, is the type of person with an illegitimate
> > motivation. I want the data to be easily and freely accessible as
> > specific objects, but I do not wish to provide an easy way of
> > dumping all of the data as a unit.
> I don't think security through obscurity is a good reason to add something
> to PostgreSQL. Either make the pictures accessible to everyone, or put
> some other security mechanism in place. As far as pictures, you could
> certainly used the SHA1 checksum as your primary key, which would satisfy
> all your requirements.

I think you missed two parts. The first is, that it isn't really an
attempt at security. The pictures are still freely available. There
is no need to block accesses.

The goal is to prevent a dump of the database. Nobody should be able to
trivially download all the pictures at one time.

I'm not sure what you are suggesting the SHA1 checksum be applied to.
Not the picture, for sure, as this would require that my tables store
a primary key, and a picture key, leading me right back to something
that seems a little ridiculous.

If you mean the SHA1 checksum of some constant secret password, and
the primary key, possibly SERIAL, then sure - but that's ignoring the
other benefits of UUID. If I'm going to use UUID, I might as well use
only UUID, and use the SHA1 checksum of the constant secret password,
and the UUID. So, the idea has merit - and was suggested by another
person - but it doesn't mean using the UUID alone, without all the
extra layers on top I would have to write, is the wrong choice. Why
would I go out of my way to jump through so many hoops, when I can
very easily, use the UUID?

You're asking me to do a lot, only so that you can point out that
'UUID' is not necessary. Heck, I could use the timestamp, and a
random key as two separate fields. There are always work arounds.

Is it good to use work arounds? Or a type that already works fine
for the purposes? :-)

> > In my case, it isn't only pictures. I don't want people pulling all
> > the data off the site as a dump, and using it how they wish, but I do
> > wish to make the data freely available, and easily accessible from a
> > web browser.
> .htaccess? Cookies? Encrypted data? iptables? All of these are better
> solutions than random filenames.

Nope. The files are publically accessible. No reason the secure their
access, and securing their access makes the whole system less efficient.

> > Not that everybody should rip out SERIAL and replace it with UUID,
> > but it really isn't that bad, and in some cases, such as mine,
> > I don't see the point of using both, and choose to instead allow
> > UUID to solve many of my concerns at the same time, with an
> > acceptable for me cost in database pages.
> We're doing something similar to UUIDs here - LSID (Life Science
> Identifiers). But being as the whole point of LSIDs is to share
> data, the "random" factor is not needed. If ever PG did do something
> like UUID (and despite the paragraphs above, I could see a use for it),
> I'd rather do something like picking from a predetermined range of values
> at random and deplete the pool as you go rather than create large
> values with a low (but finite) probability of collision.

I don't need the random factor. I only need it to be difficult to guess.

The timestamps have sufficient granularity, that they are effectively
difficult to guess, and a foreach loop would be expensive. If none of
the bits were random, I would still be happy.

If I had LSID support, and no UUID support, perhaps I'd use LSID
instead. I have no special attachment to UUID, other than it satisfies
my concerns and purposes in a convenient manner. I can merge my tables
without fiddling with the sequence definitions. I can hand out
external identifiers for specific objects which are effectively
unguessable. I can avoid having two primary keys, and use only the
one. Any type that satisfied these concerns would make me happy.

Cheers,
mark

--
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED]     __________________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   |
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
                       and in the darkness bind them...

                           http://mark.mielke.cc/


---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend



--
Respectfully,

Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
http://www.enterprisedb.com/

Reply via email to