On Wed, Oct 10, 2012 at 10:56 PM, Craig Ringer <ring...@ringerc.id.au> wrote:
> On 10/11/2012 01:35 PM, tigran2-postg...@riatest.com wrote:
>
>> Using files stored outside the database creates all sorts of problems.
>> For starters you lose ACID guarantees. I would prefer to keep them in
>> the database. We did a lot of experiments with Large Objects and they
>> really worked fine (stored hundreds of LOBs ranging from a few MB up to
>> 1GB). Postgres does a really good job with Large Objects. If it was not
>> for the pg_dump problem I would not hesitate to use LOBs.
>
> Yeah, a pg_dump mode that dumped everything but large objects would be
> nice.

I remember when that was the only way pg_dump worked, and it caused plenty
of problems. But yeah, --only-lobs and --no-lobs might be nice switches.

> Right now I find storing large objects in the DB such a pain from a
> backup management point of view that I avoid it where possible.
>
> I'm now wondering about the idea of implementing a pg_dump option that
> dumped large objects into a directory tree like
>     lobs/[loid]/[lob_md5]
> and wrote out a restore script that loaded them using `lo_import`.

That reminds me of the problems that occurred when we used to require LOBs
to be backed up to binary archive formats....

> During dumping, temporary copies could be written to something like
> lobs/[loid]/.tmp.nnnn with the md5 being calculated on the fly as the
> byte stream is read. If the dumped file had the same md5 as the existing
> one it'd just delete the tempfile; otherwise the tempfile would be
> renamed to the calculated md5.
>
> That way incremental backup systems could manage the dumped LOB tree
> without quite the same horrible degree of duplication as is currently
> faced when using lo in the database with pg_dump.

How do incremental backup systems handle lots of data with pg_dump anyway?
I would think that's not the approach I would take to incremental backups
with PostgreSQL....

> A last_modified timestamp on `pg_largeobject_metadata` would be even
> better, allowing the cost of reading and discarding rarely-changed large
> objects to be avoided.

It might be interesting to look at the issue of large objects from a total
backup perspective. I do wonder, though, where the end would be. You could
have 500MB text fields, and those might pose backup issues as well. I
suppose better LOB support in backups would give additional options for
management.

Best Wishes,
Chris Travers
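
P.S. For concreteness, here is a rough, untested sketch of what the dump
side Craig describes might look like as an external Python script. It
assumes psycopg2; the DSN, chunk size, and "lobs" output directory are
placeholders. The matching restore script would presumably just issue one
server-side lo_import('<path>', <loid>) per dumped file.

#!/usr/bin/env python
# Sketch of the scheme described above: stream each large object to
# lobs/<loid>/.tmp.<pid>, compute the md5 on the fly, then rename the
# tempfile to lobs/<loid>/<md5>, or drop it if that file already exists
# so unchanged LOBs produce no new files for an incremental backup tool.

import hashlib
import os

import psycopg2

CHUNK = 1024 * 1024  # read LOBs in 1MB chunks

def dump_lobs(dsn, outdir="lobs"):
    conn = psycopg2.connect(dsn)
    try:
        cur = conn.cursor()
        # pg_largeobject_metadata has one row per large object (9.0+)
        cur.execute("SELECT oid FROM pg_largeobject_metadata ORDER BY oid")
        for (loid,) in cur.fetchall():
            lobdir = os.path.join(outdir, str(loid))
            os.makedirs(lobdir, exist_ok=True)
            tmppath = os.path.join(lobdir, ".tmp.%d" % os.getpid())

            lob = conn.lobject(loid, "rb")
            md5 = hashlib.md5()
            with open(tmppath, "wb") as f:
                while True:
                    chunk = lob.read(CHUNK)
                    if not chunk:
                        break
                    md5.update(chunk)
                    f.write(chunk)
            lob.close()

            finalpath = os.path.join(lobdir, md5.hexdigest())
            if os.path.exists(finalpath):
                # Unchanged since the last dump; leave the tree alone.
                os.remove(tmppath)
            else:
                os.rename(tmppath, finalpath)
    finally:
        conn.close()

if __name__ == "__main__":
    dump_lobs("dbname=mydb")  # placeholder connection string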