This looks like the same optimizer problem that occasionally plagues our
customers.  Whenever the estimated rows of a join==1, but the actual rows
is higher, the optimizer may choose very poor plans.  I made some attempts
to fix.  The very simple fix is to never estimate 1 for a join result.
Even using 2 works remarkably well as a defense against this problem.


https://github.com/labkey-matthewb/postgres/commit/b1fd99f4deffbbf3db2172ccaba51a34f18d1b1a

I also made a much more correct but complicated patch to track both
uniqueness and selectivity thought the optimizer, but I didn't quite push
that over the finish line (I made a mistake in the hash join code, and got
distracted by my day job before finishing it).

  https://github.com/labkey-matthewb/postgres/commits/struct_selectivity

The second path is certainly better approach, but needs someone to pick up
the mission.

Matt

On Wed, Apr 26, 2017 at 8:00 AM, Gerardo Herzig <gher...@fmed.uba.ar> wrote:

> Some other approaches you could try:
>
> 1) What about an hashed index? You could make
> CREATE INDEX ON FIELD (unit_id, hashtext(field_name))
>
> and changing your query accordingly....
>
> "....where hashtext(FIELD.FIELD_NAME)=hashtext('SHEETS_PRESENT') ...."
>
> 2) Partitioning (not native yet, but can be simulated through
> inheritance), like in
> https://www.postgresql.org/docs/current/static/ddl-partitioning.html
> This could work well if you have a sort of limited different values in
> FIELD.FIELD_NAME
>
> Gerardo
>
> ----- Mensaje original -----
> > De: "Alessandro Ferrucci" <alessandroferru...@gmail.com>
> > Para: pgsql-performance@postgresql.org
> > Enviados: MiƩrcoles, 26 de Abril 2017 0:19:37
> > Asunto: Re: [PERFORM] Slow query with 3 table joins
> >
> >
> >
> > After about 40 inutes the slow query finally finished and the result
> > of the EXPLAIN plan can be found here:
> >
> >
> > https://explain.depesz.com/s/BX22
> >
> >
> > Thanks,
> > Alessandro Ferrucci
> >
> >
> > On Tue, Apr 25, 2017 at 11:10 PM, Alessandro Ferrucci <
> > alessandroferru...@gmail.com > wrote:
> >
> >
> >
> >
> > Hello - I am migrating a current system to PostgreSQL and I am having
> > an issue with a relatively straightforward query being extremely
> > slow.
> >
> >
> > The following are the definitions of the tables:
> >
> >
> > CREATE TABLE popt_2017.unit
> > (
> > id serial NOT NULL,
> > unit_id text,
> > batch_id text,
> > create_date timestamp without time zone DEFAULT now(),
> > update_date timestamp without time zone,
> > CONSTRAINT unit_pkey PRIMARY KEY (id)
> > )
> > WITH (
> > OIDS=FALSE
> > );
> >
> >
> > CREATE TABLE popt_2017.field
> > (
> > id serial NOT NULL,
> > unit_id integer,
> > subunit_data_id integer,
> > field_name character varying(50),
> > page_id character varying(20),
> > page_type character varying(20),
> > batch_id character varying(20),
> > file_name character varying(20),
> > data_concept integer,
> > "GROUP" integer,
> > omr_group integer,
> > pres integer,
> > reg_data text,
> > ocr_conf text,
> > ocr_dict text,
> > ocr_phon text,
> > create_date timestamp without time zone DEFAULT now(),
> > update_date timestamp without time zone,
> > CONSTRAINT field_pkey PRIMARY KEY (id),
> > CONSTRAINT field_subunit_data_id_fkey FOREIGN KEY (subunit_data_id)
> > REFERENCES popt_2017.subunit (id) MATCH SIMPLE
> > ON UPDATE NO ACTION ON DELETE NO ACTION,
> > CONSTRAINT field_unit_id_fk FOREIGN KEY (unit_id)
> > REFERENCES popt_2017.unit (id) MATCH FULL
> > ON UPDATE NO ACTION ON DELETE NO ACTION,
> > CONSTRAINT field_unit_id_fkey FOREIGN KEY (unit_id)
> > REFERENCES popt_2017.unit (id) MATCH SIMPLE
> > ON UPDATE NO ACTION ON DELETE NO ACTION
> > )
> > WITH (
> > OIDS=FALSE
> > );
> >
> >
> > CREATE TABLE popt_2017.answer
> > (
> > id serial NOT NULL,
> > field_id integer,
> > ans_status integer,
> > ans text,
> > luggage text,
> > arec text,
> > kfi_partition integer,
> > final boolean,
> > length integer,
> > create_date timestamp without time zone DEFAULT now(),
> > update_date timestamp without time zone,
> > CONSTRAINT answer_pkey PRIMARY KEY (id),
> > CONSTRAINT answer_field_id_fk FOREIGN KEY (field_id)
> > REFERENCES popt_2017.field (id) MATCH FULL
> > ON UPDATE NO ACTION ON DELETE NO ACTION,
> > CONSTRAINT answer_field_id_fkey FOREIGN KEY (field_id)
> > REFERENCES popt_2017.field (id) MATCH SIMPLE
> > ON UPDATE NO ACTION ON DELETE NO ACTION
> > )
> > WITH (
> > OIDS=FALSE
> > );
> >
> >
> > Below are the index definitions for those tables:
> >
> >
> > UNIT:
> > CREATE UNIQUE INDEX unit_pkey ON unit USING btree (id);
> > CREATE INDEX unit_unit_id_idx ON unit USING btree (unit_id);
> >
> >
> > FIELD:
> > CREATE UNIQUE INDEX field_pkey ON field USING btree (id)
> > CREATE INDEX field_unit_id_idx ON field USING btree (unit_id)
> > CREATE INDEX field_subunit_id_idx ON field USING btree
> > (subunit_data_id)
> > CREATE INDEX field_field_name_idx ON field USING btree (field_name)
> >
> >
> > ANSWER:
> > CREATE UNIQUE INDEX answer_pkey ON answer USING btree (id)
> > CREATE INDEX answer_field_id_idx ON answer USING btree (field_id)
> > CREATE INDEX answer_ans_idx ON answer USING btree (ans)
> >
> >
> > The tables each have the following number of rows:
> >
> >
> > UNIT: 10,315
> > FIELD: 139,397,965
> > ANSWER: 3,463,300
> >
> >
> > The query in question is:
> >
> >
> > SELECT
> > UNIT.ID AS UNIT_ID,
> > UNIT.UNIT_ID AS UNIT_UNIT_ID,
> > UNIT.BATCH_ID AS UNIT_BATCH_ID,
> > UNIT.CREATE_DATE AS UNIT_CREATE_DATE,
> > UNIT.UPDATE_DATE AS UNIT_UPDATE_DATE
> > FROM
> > UNIT, FIELD, ANSWER
> > WHERE
> > UNIT.ID =FIELD.UNIT_ID AND
> > FIELD.ID =ANSWER.FIELD_ID AND
> > FIELD.FIELD_NAME='SHEETS_PRESENT' AND
> > ANSWER.ANS='2';
> >
> >
> > I attempted to run an EXPLAIN (ANALYZE,BUFFERS) and the query has
> > been running for 32 minutes now, So I won't be able to post the
> > results (as I've never been able to get the query to actually
> > finish.
> >
> >
> > But, if I remove the join to UNIT (and just join FIELD and ANSWER)
> > the resulting query is sufficiently fast, (the first time it ran in
> > roughly 3 seconds), the query as such is:
> >
> >
> > SELECT * FROM
> > ANSWER, FIELD
> > WHERE
> > FIELD.ID =ANSWER.FIELD_ID AND
> > FIELD.FIELD_NAME='SHEETS_PRESENT' AND
> > ANSWER.ANS='2';
> >
> >
> > The EXPLAIN ( ANALYZE, BUFFERS ) output of that query can be found
> > here https://explain.depesz.com/s/ueJq
> >
> >
> > These tables are static for now, so they do not get DELETEs or
> > INSERTS at all and I have run VACUUM ANALYZE on all the affected
> > tables.
> >
> >
> > I'm running PostgreSQL PostgreSQL 9.2.4 on x86_64-unknown-linux-gnu,
> > compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-52), 64-bit
> >
> >
> > I'm running this on RHEL 6.9
> >
> >
> > On a server with 32 GB of ram, 2 CPUs.
> >
> >
> > The following are the changes to postgresql.conf that I have made:
> >
> >
> > shared_buffers = 7871MB
> > effective_cache_size = 23611MB
> > work_mem = 1000MB
> > maintenance_work_mem = 2048MB
> >
> >
> > I have not changed the autovacuum settings, but since the tables are
> > static for now and I've already ran VACUUM that should not have any
> > effect.
> >
> >
> > Any assistance that could be provided is greatly appreciated.
> >
> >
> > Thank you,
> > Alessandro Ferrucci
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > --
> >
> > Signed,
> > Alessandro Ferrucci
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>

Reply via email to