https://bugzilla.wikimedia.org/show_bug.cgi?id=10788

--- Comment #22 from Asher Feldman <afeld...@wikimedia.org> ---
Yes, I am so long as the interface doesn't provide pagination based on
limit+offset queries. 

mysql:root@db1051 [enwiki]> select count(*) from revision where rev_page =
1952670;
+----------+
| count(*) |
+----------+
|   941113 |
+----------+
1 row in set (0.68 sec)

mysql:root@db1051 [enwiki]> select count(1) as count, rev_user from revision
where rev_page = 1952670 group by rev_user order by count desc limit 1;
+--------+----------+
| count  | rev_user |
+--------+----------+
| 139166 |  6327251 |
+--------+----------+

mysql:root@db1051 [enwiki]> explain select * from revision where rev_page =
1952670 and rev_user = 6327251 order by rev_timestamp desc limit 25;
+------+-------------+----------+------+---------------------------------------+----------------+---------+-------+---------+-------------+
| id   | select_type | table    | type | possible_keys                        
| key            | key_len | ref   | rows    | Extra       |
+------+-------------+----------+------+---------------------------------------+----------------+---------+-------+---------+-------------+
|    1 | SIMPLE      | revision | ref  | PRIMARY,page_timestamp,user_timestamp
| page_timestamp | 4       | const | 2707632 | Using where |
+------+-------------+----------+------+---------------------------------------+----------------+---------+-------+---------+-------------+
1 row in set (0.03 sec)

2707632 looks bad, but:

mysql:root@db1051 [enwiki]> flush status;
mysql:root@db1051 [enwiki]> select * from revision where rev_page = 1952670 and
rev_user = 6327251 order by rev_timestamp desc limit 25;
....

mysql:root@db1051 [enwiki]> show status like 'Handler_read%';
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| Handler_read_first       | 0     |
| Handler_read_key         | 1     |
| Handler_read_last        | 0     |
| Handler_read_next        | 0     |
| Handler_read_prev        | 151   |
| Handler_read_rnd         | 0     |
| Handler_read_rnd_deleted | 0     |
| Handler_read_rnd_next    | 0     |
+--------------------------+-------+
8 rows in set (0.02 sec)

That example is for the most edited page and the user with the most edits.  The
worst case for the current schema would be a user with only one edit of the
most edited page. 

mysql:root@db1051 [enwiki]> explain select * from revision where rev_page =
1952670 and rev_user = 4305640 order by rev_timestamp desc limit 25;
+------+-------------+----------+------+---------------------------------------+----------------+---------+-------+------+-------------+
| id   | select_type | table    | type | possible_keys                        
| key            | key_len | ref   | rows | Extra       |
+------+-------------+----------+------+---------------------------------------+----------------+---------+-------+------+-------------+
|    1 | SIMPLE      | revision | ref  | PRIMARY,page_timestamp,user_timestamp
| user_timestamp | 4       | const |  479 | Using where |
+------+-------------+----------+------+---------------------------------------+----------------+---------+-------+------+-------------+
1 row in set (0.03 sec)

mysql:root@db1051 [enwiki]> flush status;
Query OK, 0 rows affected (0.03 sec)

mysql:root@db1051 [enwiki]> select * from revision where rev_page = 1952670 and
rev_user = 4305640 order by rev_timestamp desc limit 25;
+-----------+----------+-------------+---------------------+----------+---------------+----------------+----------------+-------------+---------+---------------+---------------------------------+
| rev_id    | rev_page | rev_text_id | rev_comment         | rev_user |
rev_user_text | rev_timestamp  | rev_minor_edit | rev_deleted | rev_len |
rev_parent_id | rev_sha1                        |
+-----------+----------+-------------+---------------------+----------+---------------+----------------+----------------+-------------+---------+---------------+---------------------------------+
| 134864881 |  1952670 |   134137447 | /* User-reported */ |  4305640 |
AM01NU06      | 20070531180220 |              0 |           0 |    1905 |    
134864494 | l2ac4niowr2u88gl84ii3vz9cbrkc6t |
+-----------+----------+-------------+---------------------+----------+---------------+----------------+----------------+-------------+---------+---------------+---------------------------------+
1 row in set (0.13 sec)

mysql:root@db1051 [enwiki]> show status like 'Handler_read%';
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| Handler_read_first       | 0     |
| Handler_read_key         | 1     |
| Handler_read_last        | 0     |
| Handler_read_next        | 0     |
| Handler_read_prev        | 480   |
| Handler_read_rnd         | 0     |
| Handler_read_rnd_deleted | 0     |
| Handler_read_rnd_next    | 0     |
+--------------------------+-------+
8 rows in set (0.03 sec)

130ms is a little sucky but for a worst case, this should still be ok.

The nice thing is that hopefully we'll be able to use extended_keys in MariaDB
5.5.31.  Everything is currently migrating to 5.5.30 which has an optimizer bug
that I reported regarding extended_keys so I have it disabled in production for
now.  It allows the primary key to be fully utilized as the right side member
of every secondary key.  The revision primary key is:

  PRIMARY KEY (`rev_page`,`rev_id`),

and we have in production:

  KEY `user_timestamp` (`rev_user`,`rev_timestamp`),

So with extended_keys, rev_page will be useable without having to increase the
index side by duplicating as with the proposed page_user_timestamp.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
_______________________________________________
Wikibugs-l mailing list
Wikibugs-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l

Reply via email to