On Tue, 10 Jul 2007, Alexander Staubo wrote:

> My take: Stick with TOAST unless you need fast random access. TOAST
> is faster, more consistently supported (eg., in Slony) and easier
> to work with.

Toasted bytea columns also have some other disadvantages:

1.

It is impossible to create the value in chunks, which means you'll
need to issue one very big "insert" containing the whole file
contents. If your client library does not support binary arguments to
prepared statements, you'll have to escape this data, which makes it
several times bigger (I think it can be 4 times bigger). To parse and
save it the server will need several copies of this data (I think the
server would need at least 10 times more memory than the binary data
size).

If I'm not mistaken this means that for saving a 20MB data file the
server would need at least 200MB of memory - a huge amount. The
client would also need over 100MB.


I've worked around this with a temporary table:
        create temporary table chunks (
                chunk_nr int primary key,
                chunk_data bytea not null
        );
and an aggregate:
        create aggregate bytea_accum(
                sfunc = byteacat,
                basetype = bytea,
                stype = bytea,
                initcond = ''
        );
I put 1MB chunks into this "chunks" table and then do:
        insert into attachments
                (filename, filedata)
        select
                'filename.txt', bytea_accum(chunk_data)
        from (
                select chunk_data
                from chunks
                order by chunk_nr
        ) ordered_chunks;
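
The chunk loading itself is just a series of small inserts, for
example with a prepared statement (the chunk literals below are
placeholders - a real client would bind the bytea parameter, ideally
in binary form):
        prepare put_chunk (int, bytea) as
                insert into chunks (chunk_nr, chunk_data)
                values ($1, $2);
        execute put_chunk (1, '...first 1MB of the file...');
        execute put_chunk (2, '...second 1MB of the file...');
Each execute sends only one chunk, so neither side has to hold the
whole escaped file in memory at once.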

I've proposed that it would be easier if there were a memory-efficient
function:
        bytea_from_lo(oid lo_id) returns bytea
but there was no feedback.
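
For illustration, such a function can be approximated in PL/pgSQL with
the server-side large object functions (lo_open, loread, lo_close) -
but note that this naive sketch still assembles the whole value in
memory by concatenation, which is exactly what a proper built-in
version should avoid; the 1MB read size is arbitrary:
        create or replace function bytea_from_lo(lo_id oid)
        returns bytea as $$
        declare
                fd      integer;
                piece   bytea;
                result  bytea := '';
        begin
                fd := lo_open(lo_id, 262144);  -- 262144 = INV_READ
                loop
                        piece := loread(fd, 1048576);  -- up to 1MB per read
                        exit when length(piece) = 0;
                        result := result || piece;
                end loop;
                perform lo_close(fd);
                return result;
        end;
        $$ language plpgsql;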


2.

The same problem exists when reading this bytea data back. I'm also
doing this in chunks - I've set external storage on this column:
        alter table attachments alter column filedata set storage external;
With this set, the function
        substring(filedata from [offset] for [chunk_size])
is efficient.
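
For example, reading the file back in 1MB pieces could look like this
(substring offsets are 1-based; the filename is just a placeholder):
        select substring(filedata from 1 for 1048576)
        from attachments
        where filename = 'filename.txt';

        select substring(filedata from 1048577 for 1048576)
        from attachments
        where filename = 'filename.txt';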


================

With large objects you will not be able to enforce constraints in the
database, for example:
- you will be able to delete a lo which is referenced elsewhere;
- you won't be able to limit lo size;
- I think you won't be able to limit access to a lo;
- you will be able to delete a reference to a lo without
  deleting the object itself (you can use contrib/vacuum_lo for
  garbage collecting though - see the trigger sketch below).
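
That last point can be partly worked around with a delete trigger; a
minimal sketch, assuming a hypothetical table documents with an oid
column file_oid that references a large object, and using the
server-side lo_unlink function:
        create or replace function delete_referenced_lo()
        returns trigger as $$
        begin
                perform lo_unlink(old.file_oid);
                return old;
        end;
        $$ language plpgsql;

        create trigger documents_lo_cleanup
                after delete on documents
                for each row execute procedure delete_referenced_lo();
This still does nothing for the size and access control limitations.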

Regards
Tometzky
-- 
...although Eating Honey was a very good thing to do, there was a
moment just before you began to eat it which was better than when you
were...
                                                      Winnie the Pooh
