Deciphering the Range-Based For Loop in C++

June 14, 2021

The C++11 standard introduced a number of quality-of-life improvements to the language. Chief among them is the range-based for loop, which provides a neat syntax for looping from the beginning to the end of a container. This article discusses some of the nuances of the range-based for and explains how it can be used with custom classes.

Recognizing the Syntax

Below is a minimal example of the range-based for at work.

#include <iostream>
#include <vector>
int main()
{
    std::vector<int> nums = {1, 2, 3, 4, 5};
    for (const int& num : nums)
    {
        std::cout << num << " ";
    }
}

The above program initializes a vector of integers. It then loops through each element in the vector, from beginning to end, printing each in turn.

How it Works

In the abstract, the range-based for looks like this.

for (element : iterable)
{
    statement
}

It begins with the familiar for keyword, followed by an element declaration and an iterable, separated by a colon. The iterable must be an expression that the compiler knows how to traverse, which we will explain in short order. The element represents the current element of the container. In the statement block, operate on the element as you would in a traditional for loop. Indeed, the range-based for is little more than a traditional for loop in disguise. To support range-based for syntax, the compiler transforms the above statement into a traditional for loop at compile time, which looks essentially like the following:

for (auto begin = iterable.begin(), end = iterable.end();
     begin != end;
     ++begin)
{
    element = *begin; // element is declared exactly as written in the loop header
    statement
}

The compiler initializes the begin and end iterators in the first clause of the for loop. In the second clause, it compares those iterators and continues only while they differ. Inside the loop, the element variable is initialized from the current element, and the loop body is executed. Finally, the third clause is executed, incrementing the begin iterator. When begin is equal to end, i.e. when the container has been fully traversed, the loop terminates.

Working with Custom Classes

As you can see, range-based for is essentially a traditional for loop based on iterators. As such, the iterable expression that we provide must be something the loop can obtain iterators from. In particular, to use one of our own classes with range-based for, the class must implement begin and end member functions. Each function should return something that works like an iterator. At the very least, the iterator must support:

  • Incrementing using the ++ operator
  • Dereferencing using the * operator
  • Comparison using the != operator

Most of the standard library containers, like vector and list, implement the begin and end functions and return compatible iterators from those functions. This is why they work with range-based for out of the box. Implementing full-featured iterators for our own classes can be complex, and that task is beyond the scope of this article. However, for a good starting point, take a look at the implementation of iterators for std::array or std::vector, which typically amount to simple pointer arithmetic.

Other Expansions of Range-Based For

The expansion of range-based for which uses iterable.begin() and iterable.end() covers the standard library containers, but it is not the only transformation that the compiler can do. For example, the compiler supports range-based for on built-in arrays, even though such arrays have no member functions. If the compiler detects that the iterable is a built-in array, it sets the begin iterator to be a pointer to the beginning of the array. It also sets the end iterator to point just past the end of the array, using its inherent knowledge of the fixed size of the array.

Similarly, range-based for will work on classes that do not implement member begin and end functions, so long as comparable free functions are defined in the same namespace as the class, where the compiler finds them through argument-dependent lookup. For instance, if our iterable class lives in global scope, then instead of defining those functions in the class, we might define the following functions in global scope.

class iterable
{
    ...
};

// Take the container by reference so both iterators refer to the same object.
iterator begin(iterable& i);
iterator end(iterable& i);

In this situation, the compiler will use the global functions to set the begin and end iterators. You can read more about these expansions here.

Applications and Limitations

The range-based for can provide several benefits. First, the loop syntax can be easier to read than traditional loops. In addition, hand-written container traversal is prone to off-by-one errors, i.e. running over or under the size of the container. Provided the container is not modified during the loop and the loop is not exited early, the range-based for processes each element in a container exactly once.

While range-based for can be useful, it has its limitations. The loop is designed to iterate in only one direction: forward. Many containers provide iterators for iterating backwards over a container, using different functions such as rbegin and rend, but the loop syntax itself offers no way to select these functions. Another issue is that the range-based for caches the end iterator instead of refreshing it on each loop evaluation. Thus, the construct may break when adding elements to or removing elements from the container during the loop, as the cached end iterator may no longer reflect the true end of the container, and for some containers, such as vector, the modification may invalidate the iterators altogether. In these circumstances, it is better to use a traditional for loop.



© 2021 Mustafa Moiz.