Some recent notes on MySQL

I drew on a lot of material from the MySQL Performance Blog and read parts of High Performance MySQL, Second Edition, which gave me some new insight into how MySQL indexes are used and optimized.

The main takeaways:
1) Take full advantage of clustered and covering indexes, while paying attention to the scenarios and conditions under which they apply.
2) The book covers Optimizing Sorts and Optimizing LIMIT and OFFSET in two separate chapters, but the two share a common theme.


InnoDB indexes
Clustered index (Clustered Index): InnoDB tables are clustered; MyISAM tables are not
Non-clustered index (Secondary Index)

Other concepts
Composite index (Composite Index, also called a Multiple-Column Index)
Covering index (Covering Index), illustrated in the sketch below
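
To make these terms concrete, here is a minimal sketch; the table and index names are hypothetical, invented only for illustration:

-- InnoDB organizes rows by the PRIMARY KEY, so the primary key is the
-- clustered index; every other index is a secondary index.
CREATE TABLE orders (
    order_id   INT UNSIGNED NOT NULL AUTO_INCREMENT,
    user_id    INT UNSIGNED NOT NULL,
    status     TINYINT      NOT NULL,
    created_at DATETIME     NOT NULL,
    PRIMARY KEY (order_id),                  -- clustered index
    KEY idx_user_status (user_id, status)    -- composite secondary index
) ENGINE=InnoDB;

-- idx_user_status is also a covering index for this query: user_id, status,
-- and the implicitly stored primary key order_id all live in the index, so
-- the row data is never read.
SELECT order_id, status FROM orders WHERE user_id = 42;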

How MySQL uses indexes efficiently
Using an index to find rows
The main purpose of an index is to find rows quickly, without scanning the whole data set.
Using an index to sort data
Another great benefit of a BTREE index is that it can return data in sorted order, avoiding an external sort when executing queries that require sorting.

How MySQL uses indexes efficiently
Using an index to read data
Some storage engines (MyISAM and InnoDB included) can also use an index to read data, avoiding reading the row data itself.
If MySQL is reading only the index and not accessing rows, you will see "Using index" in the EXPLAIN output.
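
For example, a sketch against the hypothetical orders table above:

-- Every column this query touches lives in idx_user_status, so EXPLAIN
-- reports "Using index" in the Extra column (the rows are never read):
EXPLAIN SELECT order_id, status FROM orders WHERE user_id = 42;

-- Selecting a column outside the index (created_at) forces row lookups,
-- and "Using index" disappears:
EXPLAIN SELECT order_id, created_at FROM orders WHERE user_id = 42;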

Making MySQL use the right index
If an index exists, MySQL uses the "best" one, unless the optimizer decides that a table scan would be more efficient.
Whether to scan used to depend on whether the "best" index spanned more than about 30% of the table.
A fixed percentage no longer decides between index and scan; other factors, such as table size, row count, and I/O block size, are weighed as well.

Sometimes MySQL does not use an index even though one is available. One such case is when the optimizer estimates that using the index would require MySQL to access a large fraction of the rows in the table.
(In that case, a table scan is probably faster, because it needs fewer seeks.) However, if such a query uses LIMIT to retrieve only some of the rows, MySQL does use the index, because it can find the few rows much faster and return them in the result.

OPTIMIZE TABLE
ANALYZE TABLE
USE/FORCE INDEX
IGNORE INDEX
Negate use of PRIMARY KEY
Make MySQL think the problem is harder than it really is (see the sketch below)
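
A minimal sketch of these knobs, again against the hypothetical orders table (the index name is an assumption):

-- Rebuild the table / refresh index statistics so the optimizer has
-- accurate row estimates:
OPTIMIZE TABLE orders;
ANALYZE TABLE orders;

-- Per-query hints: suggest, require, or forbid a given index:
SELECT * FROM orders USE INDEX (idx_user_status)    WHERE user_id = 42;
SELECT * FROM orders FORCE INDEX (idx_user_status)  WHERE user_id = 42;
SELECT * FROM orders IGNORE INDEX (idx_user_status) WHERE user_id = 42;

-- "Negate use of PRIMARY KEY": wrap the key in a no-op expression the
-- optimizer cannot use an index for:
SELECT * FROM orders WHERE order_id + 0 = 42;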


How MySQL optimizes sorting
In some cases, MySQL can use an index to satisfy an ORDER BY clause without doing a separate sort.
Even if the ORDER BY does not exactly match the index, the index can still be used as long as all of the unused index parts and all of the extra ORDER BY columns are constants in the WHERE clause, as the sketch below shows.
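
For instance, assuming a hypothetical index on (user_id, created_at) in addition to the orders sketch above:

-- Satisfied from the (user_id, created_at) index with no filesort:
-- the unused leading part (user_id) is a constant in the WHERE clause.
SELECT * FROM orders WHERE user_id = 42 ORDER BY created_at;

-- Needs a filesort: user_id is not bound to a constant, so no index
-- matches the requested order.
SELECT * FROM orders WHERE status = 1 ORDER BY created_at;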

Problems with indexes
They add overhead to every write
A primary key that does not grow sequentially adds further overhead

The ideal case (sketched below)
Only one clustered index
All queries go through the primary key
Rows are inserted in increasing primary-key order
The smallest suitable types are used
...
-> Sphinx
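
A minimal sketch of that ideal, using a hypothetical append-only table:

-- A monotonically increasing AUTO_INCREMENT key means new rows always
-- append to the end of the clustered index: pages fill sequentially and
-- never split.
CREATE TABLE events (
    event_id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    payload  VARBINARY(255)  NOT NULL,
    PRIMARY KEY (event_id)    -- the one and only (clustered) index
) ENGINE=InnoDB;

-- Point lookups by primary key are the cheapest possible access path:
SELECT payload FROM events WHERE event_id = 12345;

-- A random key (e.g. a UUID) would instead scatter inserts across the
-- clustered index and cause page splits: the non-sequential-growth
-- overhead noted above.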

==============================================================

Optimizing Sorts

The last issue we want to cover in this case study is sorting. Sorting small result sets
with filesorts is fast, but what if millions of rows match a query? For example, what if
only sex is specified in the WHERE clause?
We can add special indexes for sorting these low-selectivity cases. For example, an
index on (sex,rating) can be used for the following query:
mysql> SELECT <cols> FROM profiles WHERE sex='M' ORDER BY rating LIMIT 10;
This query has both ORDER BY and LIMIT clauses, and it would be very slow without
the index.
Even with the index, the query can be slow if the user interface is paginated and
someone requests a page that's not near the beginning. This case creates a bad combination of ORDER BY and LIMIT with an offset:
mysql> SELECT <cols> FROM profiles WHERE sex='M' ORDER BY rating LIMIT 100000, 10;
Such queries can be a serious problem no matter how they're indexed, because the
high offset requires them to spend most of their time scanning a lot of data that they
will then throw away. Denormalizing, precomputing, and caching are likely to be the
only strategies that work for queries like this one. An even better strategy is to limit
the number of pages you let the user view. This is unlikely to impact the user's experience, because no one really cares about the 10,000th page of search results.
Another good strategy for optimizing such queries is to use a covering index to
retrieve just the primary key columns of the rows you'll eventually retrieve. You can
then join this back to the table to retrieve all desired columns. This helps minimize
the amount of work MySQL must do gathering data that it will only throw away.
Here's an example that requires an index on (sex, rating) to work efficiently:

mysql> SELECT <cols> FROM profiles INNER JOIN (
    -> SELECT <primary key cols> FROM profiles
    -> WHERE sex='M' ORDER BY rating LIMIT 100000, 10
    -> ) AS x USING(<primary key cols>);



Optimizing LIMIT and OFFSET

Queries with LIMITs and OFFSETs are common in systems that do pagination, nearly
always in conjunction with an ORDER BY clause. It's helpful to have an index that supports the ordering; otherwise, the server has to do a lot of filesorts.
A frequent problem is having a high value for the offset. If your query looks like
LIMIT 10000, 20, it is generating 10,020 rows and throwing away the first 10,000 of
them, which is very expensive. Assuming all pages are accessed with equal frequency, such queries scan half the table on average. To optimize them, you can
either limit how many pages are permitted in a pagination view, or try to make the
high offsets more efficient.
One simple technique to improve efficiency is to do the offset on a covering index,
rather than the full rows. You can then join the result to the full row and retrieve the
additional columns you need. This can be much more efficient. Consider the following query:
mysql> SELECT film_id, description FROM sakila.film ORDER BY title LIMIT 50, 5;
If the table is very large, this query is better written as follows:
mysql> SELECT film.film_id, film.description
    -> FROM sakila.film
    ->    INNER JOIN (
    ->       SELECT film_id FROM sakila.film
    ->       ORDER BY title LIMIT 50, 5
    ->    ) AS lim USING(film_id);
This works because it lets the server examine as little data as possible in an index
without accessing rows, and then, once the desired rows are found, join them against
the full table to retrieve the other columns from the row. A similar technique applies
to joins with LIMIT clauses.
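
As a sketch of that last point (assuming the standard sakila schema), the same deferred join can feed a multi-table join:

mysql> SELECT film.film_id, film.title, category.name
    -> FROM sakila.film
    ->    INNER JOIN (
    ->       SELECT film_id FROM sakila.film
    ->       ORDER BY title LIMIT 50, 5
    ->    ) AS lim USING(film_id)
    ->    INNER JOIN sakila.film_category USING(film_id)
    ->    INNER JOIN sakila.category USING(category_id);

The LIMIT is applied to a narrow index scan first; only the five surviving film_id values take part in the joins.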

==============================================================

--EOF--
