{"id":153,"date":"2016-06-22T00:00:00","date_gmt":"2013-01-01T00:00:00","guid":{"rendered":"https:\/\/www.sqlhammer.com\/clustered-indexes-sql-server"},"modified":"2026-03-18T21:52:33","modified_gmt":"2026-03-18T21:52:33","slug":"clustered-indexes-sql-server","status":"publish","type":"post","link":"https:\/\/sqlhammer.com\/index.php\/2016\/06\/22\/clustered-indexes-sql-server\/","title":{"rendered":"Clustered Indexes in SQL Server"},"content":{"rendered":"<p>In the <a href=\"http:\/\/www.sqlhammer.com\/heaps-in-microsoft-sql-server\/\">last post<\/a> of this blog series, I discussed what a heap is and how it impacts query performance.<\/p>\n<p>In this post, I will describe clustered indexes and how they differ from a heap. They are indexes but also tables. Like a heap, the clustered index stores the data pages and is indistinguishable from that physical table.<\/p>\n<h2>When should you use clustered indexes?<\/h2>\n<p>Clustered indexes should be your default choice for structuring your table data. Many people will demonize heaps and declare clustered indexes as the only choice. I disagree with that mentality, instead I embrace the DBA\u2019s <em>it depends<\/em>.<\/p>\n<p>Heaps can outperform clustered indexes in some scenarios but the scenarios are less common. Review the <a href=\"http:\/\/www.sqlhammer.com\/heaps-in-microsoft-sql-server\/\">last post<\/a> for an explanation of the few cases where heaps can be better.<\/p>\n<p>I recommend that you create a clustered index for all tables unless you can prove that the clustered index is a burden or that a heap will outperform it. I will discuss the performance grains of the clustered index below.<\/p>\n<h2>Clustered index structure<\/h2>\n<p>A clustered index is a B+-Tree as described in the <a href=\"http:\/\/www.sqlhammer.com\/the-b-in-b-tree-indexing-sql-server\/\">first post<\/a> of this series. It is a hierarchical tree which has a root level, zero to many intermediate levels, and a leaf level. 
The leaf level is where all of the table data is stored, with pointers to the leaf pages that come before and after it in index key sort order.<\/p>\n<p><a href=\"http:\/\/www.sqlhammer.com\/?attachment_id=3073\"><img decoding=\"async\" src=\"\/wp-content\/uploads\/Clustered-Index-Structure-1024x560.jpg\" alt=\"\" \/><\/a><\/p>\n<p>Each index also has at least one index allocation map (<a href=\"http:\/\/sqlity.net\/en\/2315\/index-allocation-map\/\">IAM<\/a>) page. These pages are allocated for each 4 GB chunk of each of the three <a href=\"http:\/\/blogs.msdn.com\/b\/sqlserverstorageengine\/archive\/2006\/06\/25\/646865.aspx\">allocation units<\/a> below, at least one per unit type.<\/p>\n<ul>\n<li>IN_ROW_DATA allocation unit<\/li>\n<li>LOB_DATA allocation unit<\/li>\n<li>ROW_OVERFLOW_DATA allocation unit<\/li>\n<\/ul>\n<p>The intermediate nodes contain pointers to a subset of the leaf level pages. The number of intermediate levels and pages is determined by the size of the table. A small table might not have any intermediate levels while a large table may have many.<\/p>\n<pre><code class=\"language-\">DBCC IND ('AdventureWorks2012','Person.Address',1)<\/code><\/pre>\n<p><a href=\"http:\/\/www.sqlhammer.com\/?attachment_id=3074\"><img decoding=\"async\" src=\"\/wp-content\/uploads\/clustered-iam-page.png\" alt=\"\" \/><\/a><\/p>\n<p>Above you can see three types of index pages. The first row, PageType 10, is an IAM page. As seen below, the IAM page records page allocations.<\/p>\n<pre><code class=\"language-\">DBCC PAGE ('AdventureWorks2012',1,836,3);<\/code><\/pre>\n<p><a href=\"http:\/\/www.sqlhammer.com\/?attachment_id=3075\"><img decoding=\"async\" src=\"\/wp-content\/uploads\/IAM-page.jpg\" alt=\"\" \/><\/a><\/p>\n<p>The second row is an intermediate page, PageType 2. 
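<\/p>\n<p>A note if you want to run the DBCC PAGE commands in this post yourself: DBCC PAGE is undocumented, and by default its output goes to the error log rather than to your session. Enable trace flag 3604 first to send the output to the client:<\/p>\n<pre><code class=\"language-\">-- Route DBCC output to the client session instead of the error log\nDBCC TRACEON(3604);<\/code><\/pre>\n<p>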
Only pointers to the leaf level are found here.<\/p>\n<pre><code class=\"language-\">DBCC PAGE ('AdventureWorks2012',1,840,3);<\/code><\/pre>\n<p><a href=\"http:\/\/www.sqlhammer.com\/?attachment_id=3076\"><img decoding=\"async\" src=\"\/wp-content\/uploads\/intermediate-page-1024x377.jpg\" alt=\"\" \/><\/a><\/p>\n<p>Third, we have the first of many data pages, PageType 1. Below are the bitmap and column data.<\/p>\n<pre><code class=\"language-\">DBCC PAGE ('AdventureWorks2012',1,20847,3);<\/code><\/pre>\n<p><a href=\"http:\/\/www.sqlhammer.com\/?attachment_id=3077\"><img decoding=\"async\" src=\"\/wp-content\/uploads\/data-page.jpg\" alt=\"\" \/><\/a><\/p>\n<h3>Sort order<\/h3>\n<p>The data in a clustered index is <em>logically<\/em> sorted, but that does not guarantee it will be <em>physically<\/em> sorted. Physical sorting is simply a common misconception. In fact, <a href=\"http:\/\/www.sqlservercentral.com\/blogs\/discussionofsqlserver\/2012\/10\/21\/does-a-clustered-index-really-physically-store-the-rows-in-key-order\/\">the rows on a given page are not sorted<\/a> even though all rows contained on that page will be appropriate to its place in the logical sort order. Also, <a href=\"https:\/\/sqlwithmanoj.com\/2013\/06\/02\/clustered-index-do-not-guarantee-physically-ordering-or-sorting-of-rows\/\">the pages on disk are not guaranteed to be sorted<\/a> by the logical key either.<\/p>\n<p>The most likely time you will have a physically sorted clustered index is immediately after an index rebuild operation. If you are trying to optimize for sequential reads, setting a fill factor to leave free space on your pages will help limit how often pages fall physically out of order, at the expense of disk space.<\/p>\n<h2>What makes a good clustered index?<\/h2>\n<p>Clustered indexes impact the read and write performance of every query and every data manipulation operation. 
There are four key characteristics of a good clustered index.<\/p>\n<p><em>In order of importance\u2026<\/em><\/p>\n<h3>1. Keep it skinny<\/h3>\n<p><strong>Your clustered indexes should be as <em>narrow<\/em> as possible.<\/strong><\/p>\n<p>A clustered index includes all columns of your table, because it <em>is<\/em> the table. This means that each page you read holds the smallest number of rows possible for that table, making each logical read cost more than a read from a narrower non-clustered index would.<\/p>\n<p>The lack of a suitable non-clustered index will force SQL Server to read the clustered index. In addition, the clustered index will be used for key lookups. All non-clustered index rows include the clustered key. <em>More on non-clustered indexes in the next post.<\/em> This makes it possible for SQL Server to use a non-clustered index and then look up the corresponding rows in the clustered index to complete the query.<\/p>\n<h4>Performance impact<\/h4>\n<p>In the query below, there is a non-clustered index which covers the person\u2019s name, but SQL Server must refer back to the clustered index to pull the AdditionalContactInfo field, since it is not included in the non-clustered index. These key lookups are costly.<\/p>\n<pre><code class=\"language-\">SELECT [LastName],\n       [FirstName],\n       [MiddleName],\n       [AdditionalContactInfo]\nFROM [Person].[Person]\nWHERE [LastName] = 'Duffy';<\/code><\/pre>\n<p><a href=\"http:\/\/www.sqlhammer.com\/?attachment_id=3078\"><img decoding=\"async\" src=\"\/wp-content\/uploads\/Key-Lookup.jpg\" alt=\"\" \/><\/a><\/p>\n<p>The clustered key being stored in the non-clustered index is the main reason that heaps are chosen for certain designs. In the <a href=\"http:\/\/www.sqlhammer.com\/heaps-in-microsoft-sql-server\/\">heap post<\/a> you learned that a row ID (RID) is 8 bytes. On a heap, a RID is included in every non-clustered index instead of a clustered key. 
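<\/p>\n<p>One way to eliminate the key lookup in the example above is a covering index that carries the missing columns at the leaf level with INCLUDE. This is a sketch, not a blanket recommendation, and the index name is hypothetical; including a wide column such as AdditionalContactInfo trades disk space for read performance:<\/p>\n<pre><code class=\"language-\">-- Hypothetical covering index: LastName is the key, the other columns ride along at the leaf level\nCREATE NONCLUSTERED INDEX IX_Person_LastName_Covering\nON [Person].[Person] ([LastName])\nINCLUDE ([FirstName], [MiddleName], [AdditionalContactInfo]);<\/code><\/pre>\n<p>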
The size of your clustered key depends on the data types of the columns that make up the key. The maximum size of a clustered key is 900 bytes. This is why you should strive to keep your clustered keys very narrow. You do not want a large key bloating all of your non-clustered indexes. It will cost disk space and limit the number of rows you can store on each non-clustered index page.<\/p>\n<p>To show how many rows fit on each of our pages, I have borrowed a query from <a href=\"http:\/\/sqlfool.com\/author\/michelle\/\">Michelle Ufford<\/a>\u2019s <a href=\"http:\/\/sqlfool.com\/2009\/02\/estimating-rows-per-page\/\">Estimating Rows per Page<\/a> post.<\/p>\n<pre><code class=\"language-\">Select object_name(i.object_id) As 'tableName'\n, i.name As 'indexName'\n, i.type_desc\n, Max(p.partition_number) As 'partitions'\n, Sum(p.rows) As 'rows'\n, Sum(au.data_pages) As 'dataPages'\n, Sum(p.rows) \/ Sum(au.data_pages) As 'rowsPerPage'\nFrom sys.indexes As i\nJoin sys.partitions As p\nOn i.object_id = p.object_id\nAnd i.index_id = p.index_id\nJoin sys.allocation_units As au\nOn p.hobt_id = au.container_id\nWhere au.type_desc = 'IN_ROW_DATA'\nAND object_name(i.object_id) = 'Person'\nGroup By object_name(i.object_id)\n, i.name\n, i.type_desc<\/code><\/pre>\n<p>The original clustered index on the Person table was a single INT, 4 bytes wide. The table holds only 5 rows per data page, while the associated non-clustered indexes held 307 and 186 rows per page.<\/p>\n<p><a href=\"http:\/\/www.sqlhammer.com\/?attachment_id=3080\"><img decoding=\"async\" src=\"\/wp-content\/uploads\/Person-ClusterKey-1.jpg\" alt=\"\" \/><\/a><\/p>\n<p>By adding the rowguid column (16 bytes) and the ModifiedDate column (8 bytes), the clustered key becomes a total of 28 bytes, 7 times the size of the original. 
This did not affect our clustered index, but it did make our non-clustered indexes 22% and 35% less effective.<\/p>\n<p><a href=\"http:\/\/www.sqlhammer.com\/?attachment_id=3079\"><img decoding=\"async\" src=\"\/wp-content\/uploads\/Person-ClusterKey-2.jpg\" alt=\"\" \/><\/a><\/p>\n<p>By default, a primary key creates a clustered index matching its key columns, but this is not a requirement. Often data architects will choose a different clustered key so that they can keep it small. One common strategy is to create an integer identity column and use it as the clustered key. Aside from being very narrow, this method has additional benefits; keep reading.<\/p>\n<h3>2. Static<\/h3>\n<p><strong>Your clustered keys should <em>not<\/em> be updated.<\/strong><\/p>\n<p>Your non-clustered indexes reference table data by storing the clustered key. This makes a clustered key update very expensive, because every non-clustered index has to be updated to reflect the change to the modified clustered index row. It is possible to create up to 999 non-clustered indexes on a table, and it is common to see 5-10 on OLTP database tables or several dozen on OLAP database tables.<\/p>\n<p>Keeping your key static removes the problem completely. There are two ways to achieve a static key. The first is to understand how your system works with the table and select a set of columns that make sense and that you have no intention of ever including in an update statement. The second is to create an identity column. A column with a small data type satisfies the narrow recommendation from above and, because it has no direct link to reality, there will be no reason to update it.<\/p>\n<h4>Performance impact<\/h4>\n<p>To demonstrate the write impact of updating a clustered index key, I have a table with five indexes, one of which is the clustered index. The two indexes in yellow include the SalesOrderId field. 
The two indexes in orange include the CustomerId field. I will update each of these fields in turn to compare an update that touches two non-clustered indexes against one that appears to touch only two other indexes, except that one of them is the clustered index.<\/p>\n<p><a href=\"http:\/\/www.sqlhammer.com\/?attachment_id=3081\"><img decoding=\"async\" src=\"\/wp-content\/uploads\/Index-setup-static.jpg\" alt=\"\" \/><\/a><\/p>\n<p>First, I back up the log to truncate it, preventing old transactions from slipping into our numbers. In addition, I have named my transactions to make parsing the log file easier. Updating a single row\u2019s clustered key results in 11 log records written.<\/p>\n<pre><code class=\"language-\">--\/* Backup log to verify that it has been logically cleared\nBACKUP LOG AdventureWorks2014\nTO DISK = 'C:\\Backups\\AdventureWorks2014_Log.trn'\nWITH INIT;\n--*\/\n\nBEGIN TRANSACTION Take1\n\nUPDATE t\nSET SalesOrderId = 6\nFROM dbo.SalesOrderHeader t\nWHERE SalesOrderId = 7\n\nCOMMIT\n\n--Show log records to indicate writes\nSELECT [Transaction ID],\nSUM([Log Record Length]) TotalLogSize,\nCOUNT(*) [TotalLogRecords]\nFROM sys.fn_dblog(NULL,NULL)\nWHERE [Transaction ID] = '0000:0000bfe4'\nGROUP BY [Transaction ID]<\/code><\/pre>\n<p><a href=\"http:\/\/www.sqlhammer.com\/?attachment_id=3082\"><img decoding=\"async\" src=\"\/wp-content\/uploads\/Clustered-update.jpg\" alt=\"\" \/><\/a><\/p>\n<p>Next I will update a single record\u2019s CustomerId field. Two indexes will be updated but only 7 log records are written. This is because the clustered key update forced writes for all of the non-clustered indexes, while the CustomerId update only forced writes for two of the five indexes.<\/p>\n<pre><code class=\"language-\">--\/* Backup log to verify that it has been logically cleared\nBACKUP LOG AdventureWorks2014\nTO DISK = 'C:\\Backups\\AdventureWorks2014_Log.trn'\nWITH INIT;\n--*\/\n\nBEGIN TRANSACTION Take2\n\nUPDATE t\nSET CustomerId = 7\nFROM dbo.SalesOrderHeader t\nWHERE SalesOrderID = 6\n\nCOMMIT\n\n--Show log records to indicate writes\nSELECT [Transaction ID],\nSUM([Log Record Length]) TotalLogSize,\nCOUNT(*) [TotalLogRecords]\nFROM sys.fn_dblog(NULL,NULL)\nWHERE [Transaction ID] = '0000:0000bff6'\nGROUP BY [Transaction ID]<\/code><\/pre>\n<p><a href=\"http:\/\/www.sqlhammer.com\/?attachment_id=3083\"><img decoding=\"async\" src=\"\/wp-content\/uploads\/Nonclustered-update.jpg\" alt=\"\" \/><\/a><\/p>\n<p>Should you care about four additional writes during your update? No, not if it is only four. However, this performance hit scales up rather quickly. Below I updated the entire table with the same queries as above, just without the WHERE clause. The SalesOrderId update is 44% slower than the CustomerId update, which makes a large difference at only 35,000 rows.<\/p>\n<p><a href=\"http:\/\/www.sqlhammer.com\/?attachment_id=3084\"><img decoding=\"async\" src=\"\/wp-content\/uploads\/large-scale-index-update.jpg\" alt=\"\" \/><\/a><\/p>\n<h3>3. Sequential<\/h3>\n<p><strong>Your clustered key values should be inserted in the order that your clustered key is sorted.<\/strong><\/p>\n<p>One term I have heard when referring to clustered indexes is that they should be <em>ever increasing<\/em>. I phrased my recommendation specifically because this depends upon the sort order of the columns in your key. 
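<\/p>\n<p>As a sketch of what sort order means here (the table and index names are hypothetical), the direction is declared per key column when the index is created:<\/p>\n<pre><code class=\"language-\">-- Hypothetical descending clustered key; inserts should then supply decreasing values\nCREATE CLUSTERED INDEX CIX_EventLog_EventId\nON dbo.EventLog (EventId DESC);<\/code><\/pre>\n<p>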
If you ever choose to use a descending key instead of the default ascending key, then your inserts should have a decreasing value.<\/p>\n<p>The sequential nature of your inserts is important because of page splits. If you used a GUID as your clustered key (<em>please do not do that<\/em>), the random nature of something like the NEWID() function would cause a lot of page splits. This assumes that you are using the default fill factor of 100. You can limit the page splits by using a lower fill factor such as 80, but your table would then be 20% larger. That can end up being a huge waste of space. It will also further reduce the number of rows that fit on each page, increasing the expense of each read operation.<\/p>\n<p>Not only does a page split allocate a new page; if that page is not at the end of the index, the intermediate and\/or adjacent pages will need to be updated as well.<\/p>\n<h4>Performance impact<\/h4>\n<p>The tables below will be used to demonstrate the difference in page splits between sequential key inserts and non-sequential key inserts.<\/p>\n<pre><code class=\"language-\">DROP TABLE dbo.PageSplits_Sequential\nCREATE TABLE dbo.PageSplits_Sequential\n(\nid INT IDENTITY(1,1) NOT NULL,\ntxt VARCHAR(30) NOT NULL,\ntxt1 VARCHAR(30) NOT NULL,\ntxt2 VARCHAR(30) NOT NULL,\ntxt3 VARCHAR(30) NOT NULL,\ntxt4 VARCHAR(30) NOT NULL\n)\n\nINSERT INTO dbo.PageSplits_Sequential\n(\ntxt ,txt1 ,txt2 ,txt3 ,txt4\n)\nSELECT TOP 10000\nREPLICATE('a',20) ,REPLICATE('a',20)\n,REPLICATE('a',20) ,REPLICATE('a',20)\n,REPLICATE('a',20)\nFROM sys.columns c\nCROSS JOIN sys.columns c1\n\nALTER TABLE dbo.PageSplits_Sequential\nADD CONSTRAINT PK_PageSplits_Sequential\nPRIMARY KEY CLUSTERED (id);\n\nCREATE NONCLUSTERED INDEX IX_PageSplits_Sequential_txt\nON dbo.PageSplits_Sequential (txt)\nCREATE NONCLUSTERED INDEX IX_PageSplits_Sequential_txt1\nON dbo.PageSplits_Sequential (txt1)\nCREATE NONCLUSTERED INDEX IX_PageSplits_Sequential_txt2\nON dbo.PageSplits_Sequential (txt2)\nCREATE NONCLUSTERED INDEX IX_PageSplits_Sequential_txt3\nON dbo.PageSplits_Sequential (txt3)\nCREATE NONCLUSTERED INDEX IX_PageSplits_Sequential_txt4\nON dbo.PageSplits_Sequential (txt4)\n\nDROP TABLE dbo.PageSplits_NonSequential\nCREATE TABLE dbo.PageSplits_NonSequential\n(\nid INT NOT NULL,\ntxt VARCHAR(30) NOT NULL,\ntxt1 VARCHAR(30) NOT NULL,\ntxt2 VARCHAR(30) NOT NULL,\ntxt3 VARCHAR(30) NOT NULL,\ntxt4 VARCHAR(30) NOT NULL\n)\n\nINSERT INTO dbo.PageSplits_NonSequential\nSELECT id * 2\n,txt,txt1,txt2,txt3,txt4\nFROM dbo.PageSplits_Sequential\n\nALTER TABLE dbo.PageSplits_NonSequential\nADD CONSTRAINT PK_PageSplits_NonSequential\nPRIMARY KEY CLUSTERED (id);\n\nCREATE NONCLUSTERED INDEX IX_PageSplits_NonSequential_txt\nON dbo.PageSplits_NonSequential (txt)\nCREATE NONCLUSTERED INDEX IX_PageSplits_NonSequential_txt1\nON dbo.PageSplits_NonSequential (txt1)\nCREATE NONCLUSTERED INDEX IX_PageSplits_NonSequential_txt2\nON dbo.PageSplits_NonSequential (txt2)\nCREATE NONCLUSTERED INDEX IX_PageSplits_NonSequential_txt3\nON dbo.PageSplits_NonSequential (txt3)\nCREATE NONCLUSTERED INDEX IX_PageSplits_NonSequential_txt4\nON dbo.PageSplits_NonSequential (txt4)<\/code><\/pre>\n<p>Since the tables and indexes have just been created, they are fresh and show no page splits in <a href=\"https:\/\/msdn.microsoft.com\/en-us\/library\/ms174281.aspx\">sys.dm_db_index_operational_stats<\/a>. <em>The below query is taken from <a href=\"http:\/\/social.technet.microsoft.com\/wiki\/contents\/articles\/25473.sql-server-dmv-to-track-page-split.aspx\">this article<\/a>.<\/em><\/p>\n<pre><code class=\"language-\">SELECT\nIOS.INDEX_ID,\nO.NAME AS OBJECT_NAME,\nI.NAME AS INDEX_NAME,\nIOS.LEAF_ALLOCATION_COUNT AS PAGE_SPLIT_FOR_INDEX,\nIOS.NONLEAF_ALLOCATION_COUNT PAGE_ALLOCATION_CAUSED_BY_PAGESPLIT\nFROM SYS.DM_DB_INDEX_OPERATIONAL_STATS(DB_ID(),NULL,NULL,NULL) IOS\nINNER JOIN SYS.INDEXES I ON IOS.INDEX_ID=I.INDEX_ID\nAND IOS.OBJECT_ID = I.OBJECT_ID\nINNER JOIN SYS.OBJECTS O ON IOS.OBJECT_ID=O.OBJECT_ID\nWHERE O.TYPE_DESC='USER_TABLE'\nAND O.Name LIKE 'PageSplits%'\nORDER BY o.name DESC, i.name ASC<\/code><\/pre>\n<p><a href=\"http:\/\/www.sqlhammer.com\/?attachment_id=3085\"><img decoding=\"async\" src=\"\/wp-content\/uploads\/PageSplits-Fresh.jpg\" alt=\"\" \/><\/a><\/p>\n<p>By adding an equal number of rows to each table, we can see that the non-sequential inserts produce more page splits. It is important to note that not all page splits are bad. In the case of the sequential inserts, a page split is registered each time a new page is allocated at the end of the index. These are considered good page splits. The non-sequential inserts produce both good and bad page splits. Bad page splits occur when a page splits in the middle of the index. These are bad because they fragment the index and might require splitting intermediate level pages, which increases the number of writes even further. The intermediate level splits are indicated in the PAGE_ALLOCATION_CAUSED_BY_PAGESPLIT column. This table is rather small so the count will be low. 
These counts can grow rather large at scale, however.<\/p>\n<pre><code class=\"language-\">INSERT INTO dbo.PageSplits_Sequential\n(\ntxt,txt1,txt2,txt3,txt4\n)\nSELECT TOP 20000\nREPLICATE('a',20),REPLICATE('a',20)\n,REPLICATE('a',20),REPLICATE('a',20)\n,REPLICATE('a',20)\nFROM sys.columns c\nCROSS JOIN sys.columns c1\n\nINSERT INTO dbo.PageSplits_NonSequential\nSELECT id - 1\n,txt,txt1,txt2,txt3,txt4\nFROM dbo.PageSplits_NonSequential\n\nINSERT INTO dbo.PageSplits_NonSequential\nSELECT TOP 10000 id + 50000\n,txt,txt1,txt2,txt3,txt4\nFROM dbo.PageSplits_NonSequential<\/code><\/pre>\n<p><a href=\"http:\/\/www.sqlhammer.com\/?attachment_id=3086\"><img decoding=\"async\" src=\"\/wp-content\/uploads\/PageSplits-AfterInserts.jpg\" alt=\"\" \/><\/a><\/p>\n<h3>4. Uniqueness<\/h3>\n<p><strong>When possible, your clustered keys should be unique.<\/strong><\/p>\n<p>SQL Server needs to uniquely identify each row within the clustered index. Uniqueness is not a requirement for a clustered index, however. Instead, SQL Server will append a uniqueifier to each duplicate row. The uniqueifier is 4 bytes. Those 4 bytes alone can cause a lot of bloat if you have a lot of duplicates, especially when accounting for the bloat to all of the non-clustered indexes. However, there is an additional penalty if the table does not already contain any variable length columns. In that case, there are 4 more bytes of overhead to manage the variable length uniqueifier, resulting in a total of 8 bytes of bloat per duplicate row.<\/p>\n<p>To put this in perspective, if you were to use a single non-unique INT column (4 bytes in size), the potential uniqueifier cost could triple the size of your clustered key.<\/p>\n<h4>Performance impact<\/h4>\n<p>I created two versions of the Person.Person table, dbo.Person_Unique and dbo.Person_WithDups. 
They contain exactly the same number of rows, but dbo.Person_WithDups includes around 9,000 duplicate BusinessEntityIDs.<\/p>\n<pre><code class=\"language-\">Select object_name(i.object_id) As 'tableName'\n, i.name As 'indexName'\n, i.type_desc\n, Max(p.partition_number) As 'partitions'\n, Sum(p.rows) As 'rows'\n, Sum(au.data_pages) As 'dataPages'\n, Sum(p.rows) \/ Sum(au.data_pages) As 'rowsPerPage'\nFrom sys.indexes As i\nJoin sys.partitions As p\nOn i.object_id = p.object_id\nAnd i.index_id = p.index_id\nJoin sys.allocation_units As au\nOn p.hobt_id = au.container_id\nWhere au.type_desc = 'IN_ROW_DATA'\nAND (object_name(i.object_id) = 'Person_WithDups'\nOR object_name(i.object_id) = 'Person_Unique')\nGroup By object_name(i.object_id)\n, i.name\n, i.type_desc<\/code><\/pre>\n<p><a href=\"http:\/\/www.sqlhammer.com\/?attachment_id=3087\"><img decoding=\"async\" src=\"\/wp-content\/uploads\/Person-Uniqueness.jpg\" alt=\"\" \/><\/a><\/p>\n<p>With identical row counts, the Person_Unique table has 40% fewer data pages. Instead of 5 rows per page in the Person_Unique table, the Person_WithDups table can only hold 3 rows per page. This is because of the additional overhead induced by the clustered index uniqueifier.<\/p>\n<h2>Next time<\/h2>\n<p>The clustered index is the recommended default choice for structuring your table data. You learned the best practices for designing your clustered keys and how they impact the performance of your system and the resulting non-clustered indexes. 
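<\/p>\n<p>Putting the four guidelines together, a minimal sketch of a default clustered key design is a narrow, static, sequential, and unique integer identity column (the table and constraint names here are hypothetical):<\/p>\n<pre><code class=\"language-\">-- Narrow (4 bytes), static (no link to reality), sequential (identity), unique (primary key)\nCREATE TABLE dbo.Example\n(\nExampleId INT IDENTITY(1,1) NOT NULL,\nPayload VARCHAR(100) NOT NULL,\nCONSTRAINT PK_Example PRIMARY KEY CLUSTERED (ExampleId)\n);<\/code><\/pre>\n<p>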
In the next part of this series I will cover SQL Server\u2019s easiest go-faster button, non-clustered indexes.<\/p>\n<h2>References<\/h2>\n<p>[1] <a href=\"http:\/\/logicalread.solarwinds.com\/sql-server-clustered-indices-mc03\/#.Vr4N9VJUjFk\">About SQL Server Clustered Indices<\/a> (Solarwinds)<\/p>\n<p>[2] <a href=\"https:\/\/www.simple-talk.com\/sql\/learn-sql-server\/effective-clustered-indexes\/\">Effective Clustered Indexes<\/a> (Simple Talk)<\/p>\n<p>[3] <a href=\"https:\/\/books.google.com\/books?id=nvLpCgAAQBAJ&#038;pg=PA241&#038;lpg=PA241&#038;dq=when+to+clustered+indexes+outperform+heaps&#038;source=bl&#038;ots=KLmv8IBMRc&#038;sig=P7Y7B8bqATC_jQoTwMdn4Vka5MM&#038;hl=en&#038;sa=X&#038;ved=0ahUKEwiS-t_9jvbKAhUINj4KHYioA5wQ6AEIQzAG#v=onepage&#038;q=when%20to%20clustered%20indexes%20outperform%20heaps&#038;f=false\">Expert Performance Indexing<\/a> (Book)<\/p>\n<p>[4] <a href=\"http:\/\/sqlwithmanoj.com\/2013\/06\/02\/clustered-index-do-not-guarantee-physically-ordering-or-sorting-of-rows\/\">Clustered Index Do Not Guarantee Physically Ordering or Sorting of Rows<\/a> (SQL with Manoj)<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In the last post of this blog series, I discussed what a heap is and how it impacts query performance. In this post, I will describe clustered indexes and how they differ from a heap. They are indexes but also tables. 
Like a heap, the clustered index stores the data pages and is indistinguishable from [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":550,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3,7,9,13,14,21],"tags":[],"class_list":["post-153","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-administration","category-community","category-development","category-microsoft-sql-server","category-performance","category-t-sql"],"_links":{"self":[{"href":"https:\/\/sqlhammer.com\/index.php\/wp-json\/wp\/v2\/posts\/153","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/sqlhammer.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sqlhammer.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sqlhammer.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/sqlhammer.com\/index.php\/wp-json\/wp\/v2\/comments?post=153"}],"version-history":[{"count":1,"href":"https:\/\/sqlhammer.com\/index.php\/wp-json\/wp\/v2\/posts\/153\/revisions"}],"predecessor-version":[{"id":415,"href":"https:\/\/sqlhammer.com\/index.php\/wp-json\/wp\/v2\/posts\/153\/revisions\/415"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/sqlhammer.com\/index.php\/wp-json\/wp\/v2\/media\/550"}],"wp:attachment":[{"href":"https:\/\/sqlhammer.com\/index.php\/wp-json\/wp\/v2\/media?parent=153"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sqlhammer.com\/index.php\/wp-json\/wp\/v2\/categories?post=153"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sqlhammer.com\/index.php\/wp-json\/wp\/v2\/tags?post=153"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}