On Mon, Feb 07, 2022 at 10:10:53AM -0700, David G. Johnston wrote: > On Mon, Feb 7, 2022 at 9:58 AM Esteban Zimanyi <estebanzima...@gmail.com> > wrote: > > > > > As suggested by David, this goes beyond the "traditional" usage of > > PostgreSQL. Therefore my questions are > > * What is the suggested strategy to splitting these 2K attributes into > > vertically partitioned tables where the tables are linked by the primary > > key (e.g. trip number in the example above). Are there any limitations/best > > practices in the number/size of TOASTED attributes that a table should > > contain. > > * In each partitioned table containing N TOASTED attributes, given the > > above requirements, are there any limitations/best practices in storing > > them using extended storage or an alternative one such as external. > > > > > Frankly, the best practice is "don't have that many columns". Since you > do, I posit that you are just going to have to make decisions (possibly > with experimentation) on your own. Or maybe ask around on a MobilityDB > forum what people using that tool and having these kinds of data structures > do. From a core PostgreSQL perspective you've already deviated from the > model structures that it was designed with in mind. > I'm really confused that you'd want the data value itself to contain a > timestamp that, on a per-row basis, should be the same timestamp that every > other value on the row has. Extracting the timestamp to it own column and > using simpler and atomic data types is how core PostgreSQL and the > relational model normalization recommend dealing with this situation. Then > you just break up the attributes of a similar nature into their own tables > based upon their shared nature. In almost all cases relying on "main" > storage.
Actually looking at the original example: > CREATE TYPE tint ( > internallength = variable, > [...] > storage = extended, > alignment = double, > [...] > ); I'm wondering if it's just some miscommunication here. If the tint data type only needs to hold a timestamp and an int, I don't see why it would be varlerna at all. So if a single tint can internally hold thousands of (int, timestamptz), a bit like pgpointcloud, then having it by default external (so both possibly out-of-line and compressed) seems like a good idea, as you can definitely hit the 8k boundary, it should compress nicely and you also avoid some quite high tuple header overhead.