Re: How to store "blobs" efficiently for small and large sizes, with random access

2022-10-20 Thread Dominique Devienne
On Thu, Oct 20, 2022 at 12:21 PM Laurenz Albe wrote: > On Thu, 2022-10-20 at 10:32 +0200, Dominique Devienne wrote: > > FWIW, when Oracle introduced SecureFile blobs years ago in v11, it > > represented a leap forward in > > performance, and back then we were seeing them being 3x faster than LO >

Re: How to store "blobs" efficiently for small and large sizes, with random access

2022-10-20 Thread Ron
On 10/20/22 03:32, Dominique Devienne wrote: [snip] And from the posts here, the issues with large blobs may be more related to backup/restore perhaps, than runtime performance. From my long experience as a first a programmer and then a database administrator, backups and archiving are at the

Aw: Re: How to store "blobs" efficiently for small and large sizes, with random access

2022-10-20 Thread Karsten Hilbert
> I don't know what exactly they are, but I suspect that they are just > files (segments?) in Oracle's "file system" (tablespaces/datafiles). > So pretty much what we recommend. Maybe so, but if those large segments are presented "seamlessly" in the form of a table integrated with PGs

Re: How to store "blobs" efficiently for small and large sizes, with random access

2022-10-20 Thread Laurenz Albe
On Thu, 2022-10-20 at 10:32 +0200, Dominique Devienne wrote: > But before I finish this thread for now, I'd like to add that I > consider unfortunate a state of affairs where > NOT putting the data in the DB is the mostly agreed upon advice. It > IMHO points to a weak point of > PostgreSQL, which

Re: How to store "blobs" efficiently for small and large sizes, with random access

2022-10-20 Thread Andreas Joseph Krogh
På torsdag 20. oktober 2022 kl. 10:32:44, skrev Dominique Devienne < ddevie...@gmail.com >: On Wed, Oct 19, 2022 at 5:05 PM Laurenz Albe wrote: > On Wed, 2022-10-19 at 12:48 +0200, Dominique Devienne wrote: > > On Wed, Oct 19, 2022 at 12:17 PM Andreas Joseph Krogh

Re: How to store "blobs" efficiently for small and large sizes, with random access

2022-10-20 Thread Dominique Devienne
On Wed, Oct 19, 2022 at 5:05 PM Laurenz Albe wrote: > On Wed, 2022-10-19 at 12:48 +0200, Dominique Devienne wrote: > > On Wed, Oct 19, 2022 at 12:17 PM Andreas Joseph Krogh > > wrote: > > > First advice, don't do it. We started off storing blobs in DB for “TX > > > safety” > > Not really an

Re: How to store "blobs" efficiently for small and large sizes, with random access

2022-10-20 Thread Dominique Devienne
On Wed, Oct 19, 2022 at 5:30 PM Alvaro Herrera wrote: > That's exactly what I was trying to say. > If there's no compression, we don't read prior chunks. Great to read that. I'll probably try to benchmark w/ and w/o compression eventually. Need to deal with other issues first, will take a while

Re: How to store "blobs" efficiently for small and large sizes, with random access

2022-10-20 Thread Dominique Devienne
On Wed, Oct 19, 2022 at 4:29 PM Daniel Verite wrote: > In your case I would probably opt for bytea (as opposed to large > objects), and slicing the blobs in the application in chunks Thanks for the advice, and the valuable info on LO permissions. --DD

Re: How to store "blobs" efficiently for small and large sizes, with random access

2022-10-19 Thread Alvaro Herrera
On 2022-Oct-19, Dominique Devienne wrote: > Anybody has an answer to my question regarding how substr() works on > bytea values? I.e. is it "pushed down" / optimized enough that it > avoids reading the whole N-byte value, to then pass it to substr(), > which then returns an M-byte value (where M

Re: How to store "blobs" efficiently for small and large sizes, with random access

2022-10-19 Thread Laurenz Albe
On Wed, 2022-10-19 at 12:48 +0200, Dominique Devienne wrote: > On Wed, Oct 19, 2022 at 12:17 PM Andreas Joseph Krogh > wrote: > > First advice, don't do it. We started off storing blobs in DB for “TX > > safety” > > Not really an option, I'm afraid. You should reconsider. Ruling out that

Re: How to store "blobs" efficiently for small and large sizes, with random access

2022-10-19 Thread Ron
On 10/19/22 06:38, Andreas Joseph Krogh wrote: På onsdag 19. oktober 2022 kl. 13:21:38, skrev Dominique Devienne : On Wed, Oct 19, 2022 at 1:00 PM Andreas Joseph Krogh wrote: > Ok, just something to think about; Thank you. I do appreciate the feedback. > Will your

Re: How to store "blobs" efficiently for small and large sizes, with random access

2022-10-19 Thread Ron
On 10/19/22 04:47, Dominique Devienne wrote: [snip] PS: Another drawback of lo is that because it's a single table, it still subject to the 32 TB limit on a relation. The 4TB limit per lo is way more than we need, but the 32 TB limit may actually be more of an issue for our larger clients,

Re: How to store "blobs" efficiently for small and large sizes, with random access

2022-10-19 Thread Daniel Verite
Dominique Devienne wrote: > PostgreSQL bytea is much better and simpler, except limited to 1GB... > Bytea also has no direct random access, except via substr[ing], but > how efficient and "random access" is that? Bytea contents are compressed before being sliced (in chunks of

Re: How to store "blobs" efficiently for small and large sizes, with random access

2022-10-19 Thread Dominique Devienne
On Wed, Oct 19, 2022 at 3:05 PM Alvaro Herrera wrote: > On 2022-Oct-19, Dominique Devienne wrote: > > OTOH, lo has random access, which I also need... > > Generally speaking, bytea sucks for random access, because if a TOAST > item is compressed, it has to be always read from the beginning in

Re: How to store "blobs" efficiently for small and large sizes, with random access

2022-10-19 Thread Dominique Devienne
On Wed, Oct 19, 2022 at 3:01 PM Daniel Verite wrote: > Dominique Devienne wrote: > > the fact the lo table is unique for the whole database would allow > > users to see blobs from any schema, as I understand it. > Each large object has its own set of permissions. This is a significant >

Re: How to store "blobs" efficiently for small and large sizes, with random access

2022-10-19 Thread Alvaro Herrera
On 2022-Oct-19, Dominique Devienne wrote: > Upfront, I have to state that I'm not keen on lo, because of security > considerations. We store blobs in many different schemas, and users > can access some schemas, and not others. So the fact the lo table is > unique for the whole database would

Re: How to store "blobs" efficiently for small and large sizes, with random access

2022-10-19 Thread Daniel Verite
Dominique Devienne wrote: > the fact the lo table is unique for the whole database would allow > users to see blobs from any schema, as I understand it. Direct access to pg_largeobject is only possible for superusers. If lo_compat_privileges is on, any user can read any large object with

Re: How to store "blobs" efficiently for small and large sizes, with random access

2022-10-19 Thread esconsult1
We had the same thought of storing the blobs inside LO’s as well many years ago. But ultimately chose cloud storage and stored a pointer in the database instead. Now that we are approaching a terabyte of just normal data I don’t regret this decision one bit. Just handling backups and storage is

Re: How to store "blobs" efficiently for small and large sizes, with random access

2022-10-19 Thread Dominique Devienne
On Wed, Oct 19, 2022 at 1:38 PM Andreas Joseph Krogh wrote: > There's a reason “everybody” advices to move blobs out of DB, I've learned. > I get that. I really do. But the alternative has some real downsides too. Especially around security, as I already mentioned. That's why I'd like if

Re: How to store "blobs" efficiently for small and large sizes, with random access

2022-10-19 Thread Andreas Joseph Krogh
På onsdag 19. oktober 2022 kl. 13:21:38, skrev Dominique Devienne < ddevie...@gmail.com >: On Wed, Oct 19, 2022 at 1:00 PM Andreas Joseph Krogh wrote: > Ok, just something to think about; Thank you. I do appreciate the feedback. > Will your database grow beyond

Re: How to store "blobs" efficiently for small and large sizes, with random access

2022-10-19 Thread Dominique Devienne
On Wed, Oct 19, 2022 at 1:00 PM Andreas Joseph Krogh wrote: > Ok, just something to think about; Thank you. I do appreciate the feedback. > Will your database grow beyond 10TB with blobs? The largest internal store I've seen (for the subset of data that goes in the DB) is shy of 3TB. But we

Re: How to store "blobs" efficiently for small and large sizes, with random access

2022-10-19 Thread Andreas Joseph Krogh
På onsdag 19. oktober 2022 kl. 12:48:24, skrev Dominique Devienne < ddevie...@gmail.com >: On Wed, Oct 19, 2022 at 12:17 PM Andreas Joseph Krogh mailto:andr...@visena.com>> wrote: First advice, don't do it. We started off storing blobs in DB for “TX safety” Not

Re: How to store "blobs" efficiently for small and large sizes, with random access

2022-10-19 Thread Dominique Devienne
On Wed, Oct 19, 2022 at 12:17 PM Andreas Joseph Krogh wrote: > First advice, don't do it. We started off storing blobs in DB for “TX > safety” > Not really an option, I'm afraid. > , but backup/restore quickly became too cumbersome so we ended up moving > all blobs out and only store

Sv: How to store "blobs" efficiently for small and large sizes, with random access

2022-10-19 Thread Andreas Joseph Krogh
First advice, don't do it. We started off storing blobs in DB for “TX safety”, but backup/restore quickly became too cumbersome so we ended up moving all blobs out and only store reference in DB. This required us to make a “vacuum system” that cleans up the blob-storage regularly as

How to store "blobs" efficiently for small and large sizes, with random access

2022-10-19 Thread Dominique Devienne
Hi. I'd like some advice storing blobs (millions of them), ranging from very small, to large > 1GB. I know about bytea versus lo, and I have probably read most of what's out there about them :) Upfront, I have to state that I'm not keen on lo, because of security considerations. We store blobs