Linked lists are linear collections of data very much like an array, but instead of the data being arranged in one continous block, each piece of data is linked to the next one in the chain using pointers coupled with the data. More specifically such a structure is a singly linked list; a doubly linked list has two pointers for each element, with the second pointing to the previous element in the list. Here we will consider singly linked lists, as doubly linked lists are simply a natural extension.

Singly Linked List
Singly linked list

By using pointers we can insert and delete elements from anywhere on the list in constant time, instead of time proportional to number of elements like in an array. This is possible because now the elements do not rely on being adjacent in RAM to be connected, instead the pointer associated with the element simply points to wherever the next piece of data may be, regardless of where it is in RAM. This means that to add a new piece of data, we just have to allocate some more memory and move the destination of a single existing pointer. This is just two operations and so insertion has an efficiency of O(1). The same holds for deletion, but instead we’re altering the pointer of the previous element to point to the element after the element to be deleted.

Here’s a simple class describing a singly-linked list. I recommend that you open that and read through it along with the rest of this tutorial.

This seems great, we’ve removed the array’s limitations so we can throw those away and use lists all the time! Sadly this isn’t the case, for several reasons. For one, I said that insertion and deletion was O(1) anywhere, but actually is only partly true (sorry). A singly linked list has no way of accessing elements before a given element in constant time, you see; it has to start at the beginning and work its way forwards until it finds the correct element. This means that we can’t insert an element before an existing one in constant time, we can only insert an element efficiently after. The exception to this is when we insert an element before the head, as there’s no element before it to change the pointer of.

Doubly linked lists do have a way of accessing the previous element though, so one advantage of them is that they allow for constant time insertion and deletion before or after any element.

Doubly Linked List
Doubly linked list

Because we have to iterate through n-1 elements to insert an element before the nth, insertion before is an O(n) operator. What about deletion? Deletion at the start is easily done, just move the head element. Deletion after an element is easy too, we just move a pointer over the element to be deleted, and make it point to the next element (which we know from the element being deleted). Deletion of the tail element (the last element in the list) is somewhat more tricky; if we delete the element after the second last element, then we only need to move the tail to point to the second last element. But if we delete the tail element directly, we have to loop through the entire list to find the second last element! We’ve done essentially the same operation, but doing it directly took O(n) not O(1) time! That’s OK, we’ll just use insert after and erase after routines instead.

As I kept mentioning in the last paragraph, to retrieve an element from a list you must either know the pointer to it before hand, or you must find that pointer by iterating over all the existing elements until you find the element you want. This deals another blow to lists, as whereas arrays have constant time element access, linked lists only have linear time (O(x)). By allowing insertion and deletion anywhere, we’ve removed the ability to quickly access data from anywhere within it. This drawback is lessened somewhat by using a doubly linked list, as that enables us to iterate forwards as well as backwards through the data. A doubly linked list still has O(n) access time, but if we know that an element is closer to the end than to the beginning we can get there faster by starting from the end.

Surely we should use doubly linked lists all the time then? Once again, this is not always a good idea, in this case because there are two sides to efficiency, time and space. Arrays use up space in RAM equal to the number of elements multiplied by the size of each element. This is the least space that a data structure can ever use, short of compressing the data. Singly linked lists use all this, and more, as each element has a pointer associated with it. On a 32-bit computer these pointers will take up 4 bytes, and on a 64-bit computer, 8. So linked lists of ints on 64-bit computers use more space for the pointers than for the data itself! In big-o notation this is the same efficiency, but clearly the linked lists take up more space than the arrays; this is the limitation of using this notation.

For a doubly linked list, the problem becomes twice as bad, since now two pointers are being stored instead of one. If quick insertion and deletion is crucial though, or you’re dealing with large elements (class instances filled with many variables say, perhaps even other data structures), then percentage-wise the additional memory usages becomes neglible with respect to the overall size of the structure, data included.

Instead of using my inefficient implementation of a singly linked list, the STL offers two classes to do the same job. std::list is a doubly linked list class, and std:forward_list is a singly linked list. The latter is only available in the new C++11 standard though, which your compiler may or may not support. I’ll just cover std::lists for that reason.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
#include <list>
#include <iostream>
#include <string>

int main(void)
{
    std::list<std::string> names; // Same declaration as for vectors

    names.push_back("Sandra"); // Add a piece of data to the end of the list
    names.push_back("Bill");
    names.push_back("Catherine");
    names.push_front("Timothy"); // Add a piece of data to the start of the list
    names.pop_back(); // Remove the last element from the list

    // STL containers use iterators to loop through elements. Iterators are like
    // pointers to elements in the container, but are handled slightly differently
    for(std::list<std::string>::iterator it = names.begin(); it != names.end(); ++it)
    {
        std::cout << *it << std::endl;
    }

    return 0;
}

There is one other type of list that we haven’t covered yet, the circular list. Circular lists can either be singly or doubly linked, but have the head and tail elements linked together too, curling this list into a circle. This allows for continuous iteration through the list, and further improves the speed of iteration through the structure, so long as you know roughly the location of the element you’re looking for. Circular lists do not have an implementation in the STL however, and I personally have never found a use for them that a standard list could not simply do.