On Sun, 29 May 2022, 15:58 Tom Lane, <t...@sss.pgh.pa.us> wrote:

> Shaheed Haque <shaheedha...@gmail.com> writes:
> > Unfortunately, the real query which I think should behave very
> > similarly is still at the several-seconds level despite using the
> > index. ...
>
> >     -> Bitmap Heap Scan on paiyroll_payrun (cost=26.88..30.91 rows=1
> > width=4) (actual time=32.488..2258.891 rows=62 loops=1)
> >       Recheck Cond: ((company_id = 173) AND ((snapshot ->
> > 'employee'::text) ? '16376'::text))
> >       Filter: (((snapshot #>
> > '{employee,16376,last_run_of_employment}'::text[]) <> 'true'::jsonb)
> > OR ((snapshot #> '{employee,16376,pay_graph}'::text[]) <> '0'::jsonb)
> > OR ((snapshot #> '{employee,16376,state,employment,-1,2}'::text[]) <=
> > '0'::jsonb))
> >       Heap Blocks: exact=5
> >         -> BitmapAnd (cost=26.88..26.88 rows=1 width=0) (actual
> > time=0.038..0.039 rows=0 loops=1)
> >           -> Bitmap Index Scan on paiyroll_payrun_company_id_ce341888
> > (cost=0.00..6.56 rows=304 width=0) (actual time=0.016..0.016 rows=304
> > loops=1)
> >                 Index Cond: (company_id = 173)
> >           -> Bitmap Index Scan on idx1 (cost=0.00..20.07 rows=9
> > width=0) (actual time=0.021..0.021 rows=62 loops=1)
> >                 Index Cond: ((snapshot -> 'employee'::text) ?
> '16376'::text)
>
> > IIUC, at the bottom, the indices are doing their thing, but a couple
> > of layers up, the "Bitmap Heap Scan" jumps from ~30ms to 2200ms. But I
> > cannot quite see why.
>
> I suppose it's the execution of that "Filter" condition, which will
> require perhaps as many as three fetches of the "snapshot" column.
>

Thanks, that's clearly in the frame.

You really need to rethink that data structure.  Sure, you can store tons
> of unorganized data in a jsonb column, but you pay for that convenience
> with slow access.  Normalizing the bits you need frequently into a more
> traditional relational schema is the route to better-performing queries.
>

Ack. Indeed, the current design works very well for all of the access
patterns other than this one, which only recently came into view as a
problem.

Ahead of contemplating a design change I have been looking at how to
optimise this bit. I'm currently mired in a crash course on SQL syntax as
pertains to JSONB, jsonpath et. al. And the equally mysterious side effects
of "?" and "@>" and so on in terms of the amount of data being fetched etc.
(and all wrapped in a dose of ORM for good measure).

I'll write separately with more specific questions if needed on those
details.

Thanks again for the kind help.

Shaheed


>                         regards, tom lane
>

Reply via email to