That is really weird. Just having a visible vs. invisible PK should not have made any difference at all. In InnoDB there is always a PK, if you don't define one, an invisible 48-bit integer one will be defined for you.
Reasoning behind my approach was that secondary keys point at the PK, and the table is clustered in PK. So what I proposed should have avoided the additional key dereference that might have save about half of the time. So if explicit vs. implicit PK makes a 10x difference, and it is not due to buffer pool being hot vs. cold, that is most definitely a critical performance bug you just stumbled upon. On Sun, 29 Oct 2023, 21:08 Ivan Krylov via discuss, < [email protected]> wrote: > On Sun, 29 Oct 2023 18:12:53 +0200 > Gordan Bobic <[email protected]> wrote: > > > ALTER TABLE mol_trans > > DROP INDEX spid_flag_wl, > > ADD COLUMN id int unsigned auto_increment, > > ADD PRIMARY KEY (species_id, flag, wl_vac, id); > > I wasn't able to add a primary key like this. My version of MariaDB > only allows the id column in the beginning of the compound primary key. > Additionally, trying to add a primary key to an existing table results > in an error message complaining about the index for the table being > corrupted (and rolling back the implicit transaction). > > I tried recreating the table with PRIMARY KEY (species_id, flag, > wl_vac, upper_id, lower_id). By itself, the combination of upper_id and > lower_id must be unique, and it's our mistake that we didn't properly > declare them as foreign keys into a different table. Unfortunately, > this didn't improve the performance. > > What _did_ improve performance was creating the id column with the type > INT UNSIGNED AUTO_INCREMENT and setting it to be the primary key. With > the index spid_flag_wl(species_id, flag, wl_vac) recreated, the EXPLAIN > output now looks a bit differently: > > id: 1 > select_type: SIMPLE > table: mtr > type: range > possible_keys: spid_flag_wl > key: spid_flag_wl > key_len: 16 > ref: NULL > rows: 5487882 > Extra: Using index condition > > ...and I get my 3024559 rows in slightly more than 6 seconds. > > So many thanks for giving me a pointer in the direction that eventually > helped solve my problem, even if I don't fully understand why it works. > Is the lesson here to always create a synthetic primary key for a table? > > I've also tried recreating the table with the "natural" primary key of > (upper_id, lower_id), but it takes much longer to reinsert the rows (it > still isn't complete after tens of minutes, and previously reinsert > would be done in just a few minutes), probably because the rows don't > go in the natural order by upper_id and lower_id at all. > > -- > Best regards, > Ivan > _______________________________________________ > discuss mailing list -- [email protected] > To unsubscribe send an email to [email protected] >
_______________________________________________ discuss mailing list -- [email protected] To unsubscribe send an email to [email protected]
