That is really weird. Just having a visible vs. invisible PK should not
have made any difference at all. In InnoDB there is always a PK, if you
don't define one, an invisible 48-bit integer one will be defined for you.

Reasoning behind my approach was that secondary keys point at the PK, and
the table is clustered in PK. So what I proposed should have avoided the
additional key dereference that might have save about half of the time.

So if explicit vs. implicit PK makes a 10x difference, and it is not due to
buffer pool being hot vs. cold, that is most definitely a critical
performance bug you just stumbled upon.


On Sun, 29 Oct 2023, 21:08 Ivan Krylov via discuss, <
[email protected]> wrote:

> On Sun, 29 Oct 2023 18:12:53 +0200
> Gordan Bobic <[email protected]> wrote:
>
> > ALTER TABLE mol_trans
> > DROP INDEX spid_flag_wl,
> > ADD COLUMN id int unsigned auto_increment,
> > ADD PRIMARY KEY (species_id, flag, wl_vac, id);
>
> I wasn't able to add a primary key like this. My version of MariaDB
> only allows the id column in the beginning of the compound primary key.
> Additionally, trying to add a primary key to an existing table results
> in an error message complaining about the index for the table being
> corrupted (and rolling back the implicit transaction).
>
> I tried recreating the table with PRIMARY KEY (species_id, flag,
> wl_vac, upper_id, lower_id). By itself, the combination of upper_id and
> lower_id must be unique, and it's our mistake that we didn't properly
> declare them as foreign keys into a different table. Unfortunately,
> this didn't improve the performance.
>
> What _did_ improve performance was creating the id column with the type
> INT UNSIGNED AUTO_INCREMENT and setting it to be the primary key. With
> the index spid_flag_wl(species_id, flag, wl_vac) recreated, the EXPLAIN
> output now looks a bit differently:
>
>            id: 1
>   select_type: SIMPLE
>         table: mtr
>          type: range
> possible_keys: spid_flag_wl
>           key: spid_flag_wl
>       key_len: 16
>           ref: NULL
>          rows: 5487882
>         Extra: Using index condition
>
> ...and I get my 3024559 rows in slightly more than 6 seconds.
>
> So many thanks for giving me a pointer in the direction that eventually
> helped solve my problem, even if I don't fully understand why it works.
> Is the lesson here to always create a synthetic primary key for a table?
>
> I've also tried recreating the table with the "natural" primary key of
> (upper_id, lower_id), but it takes much longer to reinsert the rows (it
> still isn't complete after tens of minutes, and previously reinsert
> would be done in just a few minutes), probably because the rows don't
> go in the natural order by upper_id and lower_id at all.
>
> --
> Best regards,
> Ivan
> _______________________________________________
> discuss mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
>
_______________________________________________
discuss mailing list -- [email protected]
To unsubscribe send an email to [email protected]

Reply via email to