Re: Compressing large column by moving it to a unique table

2024-02-13 Thread Ron Johnson
On original_hashed, I think I'd try moving start_timestamp into its own index.

On Tue, Feb 13, 2024 at 12:02 PM Adrian Garcia Badaracco <adr...@adriangb.com> wrote:
> I'm using PostgreSQL 15.5.
>
> Here's a self-contained example. I included links to public pgMustard query plans.
>
> Gist …
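In SQL terms, that suggestion might look like the sketch below; original_hashed is a table name from the thread, but the exact column name and the existing index layout are assumptions on my part:

```sql
-- Hypothetical: a dedicated index on the timestamp column, so time-range
-- scans don't have to rely on a composite index led by another column.
CREATE INDEX IF NOT EXISTS original_hashed_start_timestamp_idx
    ON original_hashed (start_timestamp);
```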

Re: Compressing large column by moving it to a unique table

2024-02-13 Thread Adrian Garcia Badaracco
I'm using PostgreSQL 15.5. Here's a self-contained example. I included links to public pgMustard query plans.

Gist link: https://gist.github.com/adriangb/05a01cca99a438de2a286945903c44f4

Also copied below for archiving:

```sql
CREATE OR REPLACE FUNCTION random_bytes(length integer) RETURNS …
```
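The preview cuts off at the start of the gist's SQL. As a rough idea of what a helper with that signature might look like (an assumption for illustration, not the gist's actual code), one way is to build it on pgcrypto's gen_random_bytes(), which returns at most 1024 bytes per call:

```sql
CREATE EXTENSION IF NOT EXISTS pgcrypto;

-- Hypothetical sketch: concatenate 1024-byte chunks until the requested
-- length is reached. Not taken from the gist linked above.
CREATE OR REPLACE FUNCTION random_bytes(length integer)
RETURNS bytea
LANGUAGE sql
AS $$
    SELECT string_agg(gen_random_bytes(least(1024, length - (chunk - 1) * 1024)), ''::bytea)
    FROM generate_series(1, ceil(length / 1024.0)::integer) AS chunk;
$$;
```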

Re: Compressing large column by moving it to a unique table

2024-02-13 Thread Ron Johnson
1. Show us the PG version, view definition, the exact query that's slow, and the EXPLAIN (ANALYZE).
2. Presumably there's an index on each table's *_hash column?

On Tue, Feb 13, 2024 at 8:48 AM Adrian Garcia Badaracco wrote:
> Thank you for the reply Ron.
>
> Yes there are many fewer (<1%) the …
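For reference, the kind of thing being asked for might look like this; the table and column names (original_hashed, new_table, attributes_hash) and the query itself are assumptions based on the rest of the thread, not the poster's actual schema:

```sql
-- 2. An index on the hash column each table joins/looks up on (names assumed):
CREATE INDEX IF NOT EXISTS new_table_attributes_hash_idx
    ON new_table (attributes_hash);

-- 1. Version plus the plan for the actual slow query (placeholder query shown):
SELECT version();

EXPLAIN (ANALYZE, BUFFERS)
SELECT o.id, o.start_timestamp, n.attributes
FROM original_hashed o
JOIN new_table n USING (attributes_hash)
WHERE o.start_timestamp >= now() - interval '1 day';
```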

Re: Compressing large column by moving it to a unique table

2024-02-13 Thread Adrian Garcia Badaracco
Thank you for the reply Ron.

Yes, there are many fewer rows (<1% of the original) in new_table.

Thanks for making me think of normalization; I hadn't seen it that way. Although there is no theoretical relationship between the rows in the other columns in the original table and the attributes …

Re: Compressing large column by moving it to a unique table

2024-02-12 Thread Ron Johnson
On Mon, Feb 12, 2024 at 10:12 PM Adrian Garcia Badaracco <adr...@adriangb.com> wrote:
> I am using Timescale so I'll be mentioning some timestamp stuff but I think this is a general postgres question for the most part.
>
> I have a table with some fixed, small columns (id, timestamp, etc) and …

Compressing large column by moving it to a unique table

2024-02-12 Thread Adrian Garcia Badaracco
I am using Timescale, so I'll be mentioning some timestamp stuff, but I think this is a general Postgres question for the most part.

I have a table with some fixed, small columns (id, timestamp, etc.) and a large JSONB column (let's call it `attributes`). `attributes` has 1000s of schemas, but given …
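The rest of the thread discusses deduplicating that JSONB column into its own table keyed by a content hash. A minimal sketch of that layout, with all table and column names as illustrative assumptions rather than the poster's actual schema:

```sql
-- Each distinct attributes payload is stored exactly once, keyed by a
-- hash of its contents (e.g. sha256 of the canonical JSON text).
CREATE TABLE attributes_lookup (
    attributes_hash bytea PRIMARY KEY,
    attributes      jsonb NOT NULL
);

-- The wide table keeps only the small fixed columns plus the hash,
-- so the large JSONB value is not repeated per row.
CREATE TABLE original_hashed (
    id              bigint      NOT NULL,
    start_timestamp timestamptz NOT NULL,
    attributes_hash bytea       NOT NULL REFERENCES attributes_lookup (attributes_hash)
);

-- Reads join the hash back to the payload:
-- SELECT o.id, o.start_timestamp, a.attributes
-- FROM original_hashed o JOIN attributes_lookup a USING (attributes_hash);
```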