SQL Solutions: Move from Scan to Seek

Yes, that’s the thumb rule to target while optimizing SQL Queries “Scans are bad, Seeks are good“.

Scan indicates reading the whole of the index/table looking for matches – the time this takes is proportional to the size of the index.

Seek, on the other hand, indicates b-tree structure of the index to seek directly to matching records – time taken is only proportional to the number of matching records.

In general an index seek is preferable to an index scan (when the number of matching records is proportionally much lower than the total number of records), as the time taken to perform an index seek is constant regardless of the total number of records in your table. However, in certain situations, an index scan can be faster than an index seek (sometimes significantly faster) – usually when the table is very small, or when a large percentage of the records match the predicate.

	Scan	Seek
Heap	Table Scan
Clustered Index	Clustered Index Scan	Clustered Index Seek
Non-clustered Index	Index Scan	Index Seek

Table Scans: With no indexes defined/available at all that are relevant to your query the planner is forced to use a table scan meaning that every row is looked at. This can result in every page relating to the table’s data being read from disk (worst case scenario). However, for some queries, table scan is used even when a useful index is present – this is usually because the data in the table is so small that it is more hassle to traverse the indexes.

Index Scans: Reads the entire index—all the rows in the index order and occurs when an index exists that is only partially helpful.The index might only be partially helpful if there is relatively low selectivity in the distribution of the data. Index scan means the DB reads the data in an Index directly, when it finds what it wants in the Index , it uses the record addresses in the index to go to the table and read only what it requires.There’s a performance advantage to an index scan because an index tends to be much narrower than the full table column.

Clustered Index Scan: In a table without a clustered index (a heap table), data pages are not linked together – so traversing pages requires a lookup into the Index Allocation Map. A clustered table, however, has it’s data pages linked in a doubly linked list – making sequential scans a bit faster. Of course, in exchange, you have the overhead of dealing with keeping the data pages in order on INSERTs, UPDATEs, and DELETEs. A heap table, however, requires a second write to the Index Allocation Map.

Index Seeks: traverses a B-tree and walks through leaf nodes seeking only the matching or qualifying rows based on the filter criteria. Only relevant index pages need to be read instead of every page in the index (or table).

Clustered Indexes: With a clustered index the table data is stored in the leaf nodes of that index instead of being in a separate heap structure. This means that there will never need to be any extra row lookups after finding rows using that index no matter what columns are needed [unless you have off-page data like TEXT columns or VARCHAR(MAX) columns containing large object (LOB) data types]. You can have only one clustered index for this reason, so chose the one carefully in order to get maximum gain.

SQL Solutions

Monday, 1 February 2016

Move from Scan to Seek

No comments:

Post a Comment