limit不参与SQL成本计算致索引失效

描述

测试环境对某应用进行压测时,QPS突然陡降,监控显示mysql的CPU荷载跑满且有大量的慢查询。于是查看慢查询sql,发现该sql理论上不应该出现慢查询。

简化后的SQL:select * from test_table where a = 0 and b = 0 limit 1a字段和b字段已经建立了联合索引后,mysql执行该sql时却选择了其他的错误索引导致了慢sql。

复现

1
2
3
4
5
6
7
8
9
CREATE TABLE `test_table` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`a` int(11) NOT NULL,
`b` int(11) NOT NULL,
`content` varchar(32) NOT NULL,
PRIMARY KEY (`id`),
KEY `a_index` (`a`) USING BTREE,
KEY `b_a_index` (`b`,`a`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
delimiter  //
CREATE PROCEDURE init_test_table()
BEGIN
DECLARE index_id INT;
SET index_id = 0;
START TRANSACTION;
WHILE index_id < 2000000 DO
insert into test_table(a,b,content) values (0,0,CONCAT('content',index_id));
SET index_id = index_id + 1;
END WHILE;
COMMIT;
END //
delimiter ;
call init_test_table();

update test_table set b = 1 where id < 300000;
update test_table set a = 1 where id > 1700000;

该测试表一共填充了200W条记录,其中前30W条记录的b=1,后30W条记录的a=1,数据分布情况如下图:

image

1
select * from test_table where a = 0 and b = 0 limit 1;

该sql当前最佳使用的索引是b_a_index,mysql使用该索引只需要通过索引找到一条符合查询条件的记录ID后,再对ID做一次回表操作即可。但实际上,如下,mysql使用了错误的索引a_index,导致sql在执行where查询条件时发生了30万条记录的回表,造成了慢查询。

在对sql使用强制索引后,对比发现,a_index索引下sql的平均耗时在450毫秒左右,而b_a_index索引下sql的平均耗时仅需要0.5毫秒左右,两者相差近千倍的耗时。这一对比,更加证实mysql的执行计划选择了错误的索引导致sql的执行效率下降。

1
2
3
4
5
6
mysql> explain select * from test_table where a = 0 and b = 0 limit 1;
+----+-------------+------------+------------+------+-------------------+---------+---------+-------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+------+-------------------+---------+---------+-------+--------+----------+-------------+
| 1 | SIMPLE | test_table | NULL | ref | a_index,b_a_index | a_index | 4 | const | 997354 | 8.08 | Using where |
+----+-------------+------------+------------+------+-------------------+---------+---------+-------+--------+----------+-------------+
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
mysql> show profiles;
+----------+------------+-------------------------------------------------------------------------------+
| Query_ID | Duration | Query |
+----------+------------+-------------------------------------------------------------------------------+
| 49 | 0.44619325 | select * from test_table where a = 0 and b = 0 limit 1 |
| 50 | 0.466321 | select * from test_table where a = 0 and b = 0 limit 1 |
| 51 | 0.45312325 | select * from test_table where a = 0 and b = 0 limit 1 |
| 52 | 0.4334895 | select * from test_table where a = 0 and b = 0 limit 1 |
| 53 | 0.48915875 | select * from test_table where a = 0 and b = 0 limit 1 |
| 54 | 0.000488 | select * from test_table force index(b_a_index) where a = 0 and b = 0 limit 1 |
| 55 | 0.0005185 | select * from test_table force index(b_a_index) where a = 0 and b = 0 limit 1 |
| 56 | 0.0006425 | select * from test_table force index(b_a_index) where a = 0 and b = 0 limit 1 |
| 57 | 0.00053575 | select * from test_table force index(b_a_index) where a = 0 and b = 0 limit 1 |
| 58 | 0.00069475 | select * from test_table force index(b_a_index) where a = 0 and b = 0 limit 1 |
+----------+------------+-------------------------------------------------------------------------------+

原因

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
"range_scan_alternatives": [
{
"index": "a_index",
"ranges": [
"0 <= a <= 0"
],
"index_dives_for_eq_ranges": true,
"rowid_ordered": true,
"using_mrr": false,
"index_only": false,
"rows": 997354,
"cost": 1200000,
"chosen": true
},
{
"index": "b_a_index",
"ranges": [
"0 <= b <= 0 AND 0 <= a <= 0"
],
"index_dives_for_eq_ranges": true,
"rowid_ordered": true,
"using_mrr": false,
"index_only": false,
"rows": 997354,
"cost": 1200000,
"chosen": false,
"cause": "cost"
}
]

通过mysql的optimizer tracing输出该sql的执行计划,发现执行计划中mysql估算出a_indexb_a_index两个的索引的执行成本相同,输出如上。因此mysql选择了第一个索引,也就是a_index

对于a_index索引,由于索引字段只有a,不能满足查询条件的ab两个查询条件使用覆盖索引条件,因此该索引需要在where查询时就需要对数据进行回表操作。见下图数据分布,where a = 0 and b = 0的查询条件下,由于前30万条数据都是b=1,因此前30万条记录均不符合查询条件而导致30万条记录的回表操作,产生了30万次的回表性能浪费。

而对于b_a_index索引,该索引字段有ab,对于where a = 0 and b = 0的查询条件下不需要做回表操作,直接通过innodb的B+Tree索引结构快速获取到一个符合查询条件的记录ID后,仅对该一条记录做回表操作即完成sql执行。因此性能消耗远远低于a_index索引。

image

之所以mysql的执行计划估算两个索引的执行成本相同,是因为mysql在评估执行成本时不考虑limit条件,即参与执行计划计算的sql是select * from test_table where a = 0 and b = 0。对于select * from test_table where a = 0 and b = 0的sql执行,a_indexb_a_index两个的索引的执行成本相同。虽然不知道innodb引擎在设计时为什么在执行计划过程中会不考虑limit条件,但这一现象,确实导致了当前场景下的索引失效问题。

解决

  1. 强制索引
  2. 拆分查询,将原sql拆成如下的右连接查询
1
2
3
select * from test_table where a = 0 and b = 0 limit 1;
select * from test_table force index(b_a_index) where a = 0 and b = 0 limit 1
select * from test_table t1 right join (select id from test_table where a = 0 and b = 0 limit 1) t2 on t1.id = t2.id;
1
2
3
4
5
6
mysql> explain select * from test_table where a = 0 and b = 0 limit 1;
+----+-------------+------------+------------+------+-------------------+---------+---------+-------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+------+-------------------+---------+---------+-------+--------+----------+-------------+
| 1 | SIMPLE | test_table | NULL | ref | a_index,b_a_index | a_index | 4 | const | 997354 | 8.08 | Using where |
+----+-------------+------------+------------+------+-------------------+---------+---------+-------+--------+----------+-------------+
1
2
3
4
5
6
7
8
mysql> explain select t1.* from test_table t1 right join (select id from test_table where a = 0 and b = 0 limit 1) t2 on t1.id = t2.id;
+----+-------------+------------+------------+--------+-------------------+-----------+---------+-------------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+--------+-------------------+-----------+---------+-------------+--------+----------+-------------+
| 1 | PRIMARY | <derived2> | NULL | system | NULL | NULL | NULL | NULL | 1 | 100 | NULL |
| 1 | PRIMARY | t1 | NULL | const | PRIMARY | PRIMARY | 4 | const | 1 | 100 | NULL |
| 2 | DERIVED | test_table | NULL | ref | a_index,b_a_index | b_a_index | 8 | const,const | 997354 | 100 | Using index |
+----+-------------+------------+------------+--------+-------------------+-----------+---------+-------------+--------+----------+-------------+

image

image

image

>