Every time I think I understand indexes, I later discover I don’t. Now I feel like I REALLY do (take with a grain of salt), and this is how I understand them to be…
Indexes are all about speed of access. While a primary key lets us uniquely identify a row, it doesn’t help us find the row really fast – that’s what an index is for. An index helps you jump directly to the relevant record.
A clustered index is the one column in a database that the data is ordered by. You choose one column to order all the data by down on the disk level where it is saved.
Most often, the primary key on any table is the clustered index. Whichever column is the clustered index, the database will order the data in the table based on that column.
Each table can have only one clustered index, but you can have others which aren’t clustered.
If you query on the clustered index column – the query will be very fast. However, if you query on a column without an index, the Database management system has no idea where that particular record might turn up, so it would require it to read through every row of the database looking for it. this is also known as a Full Table Scan, which is massively inefficient. The more rows, the worse it gets. If you get to 100000s, it is highly inefficient.
So you might do a non-clustered index, or secondary index, on another column. This is like an index in the back of the book.
Check it out below – non-clustered Indexes are EXACTLY like indexes at the back of a book.
You choose the column to index on, e.g. LastName. Then you order them alphabetically so it is easy to straight away find, for example, where ‘Hall’ is. Then, it will tell you the ‘page’ or actually ‘row’ where this data is held, via the ID, and it will retrieve that row.
It’s not as fast as going straight through a clustered index. But it’s still good. and better than a full table scan.
Why not have an index on every column? Well, indexes are good when reading from a database, but the cost is when you write or update the database, because they must be maintained.
These non clustered indexes are data structures, extra data structures you have to add things to as your table grows. That’s more work, and makes updating the database slower and more inefficient.