Generator Functions

Steve the Dev

In Generator Functions are Awesome, I wrote about Generator Functions and why you would use them (if you haven't read that article, I recommend reading it after this one). Generator Functions are a weird enough feature in JavaScript that they warrant two back-to-back articles. This time around, I'm going to spend some time talking about what they are and how you use them.

What is a Generator Function?

A Generator Function provides a built-in JavaScript technique for creating and iterating through what amounts to a Singly-Linked List. The main benefit, however, is that it improves code clarity by separating complex looping behavior from the loop body. Consider the following example:

function* loop() {
    yield 1;
    yield 2;
    yield 3;
    return 4;
}

var it = loop();
console.log(it.next()); // { value: 1, done: false }
console.log(it.next()); // { value: 2, done: false }
console.log(it.next()); // { value: 3, done: false }
console.log(it.next()); // { value: 4, done: true }

What we've just done is create an iterator that steps through four integer values. Intuitively, you may see this function and assume that loop() runs its whole body up front and hands the finished list back to the calling function. In actuality, calling loop() runs none of the body. Each call to next() runs the function up to the next yield, and the yield keyword then pauses execution and returns control to the calling context, much as control returns to the caller while a callback or setTimeout() is pending. This is a completely different way of thinking about JavaScript functions, so let's unwrap what's happening here.

  1. In the first step, we create our Generator-Iterator. Think of it as being a cursor that moves through the linked-list nodes using the next() method.
    var it = loop();
    // Our primitive linked-list looks like this:  1 → 2 → 3 → 4
  2. We invoke the next() method to move to the first node in the chain. The loop() function is activated, and we run to the first yield value. The state of the function execution is saved in it, and a copy of the node is returned. The node holds the value of the first node (1, in this case) in the value property and a boolean done property to identify when the last yield has been completed (false, in this case).
    it.next(); // { value: 1, done: false }
    // Our primitive linked-list looks like this: [1]→ 2 → 3 → 4
  3. We invoke the next() method again to move to the second node in the chain. The loop() function is activated, and we pick up where we left off. The loop() function runs from the end of the first yield to the second yield, and again returns a result.
    it.next(); // { value: 2, done: false }
    // Our primitive linked-list looks like this:  1 →[2]→ 3 → 4
  4. We invoke the next() method a third time to move to the third node in the chain. The loop() function is activated again, and we pick up where we left off. The loop() function runs from the second yield to the third yield, and returns another list node.
    it.next(); // { value: 3, done: false }
    // Our primitive linked-list looks like this:  1 → 2 →[3]→ 4
  5. We invoke the next() method a fourth and final time to move to the last node in the chain. The loop() function is activated again, and we pick up where we left off. Since there are no more yield statements, the system continues until the end of the function and attaches the result of the function to the resulting object. (If no return is defined, then an undefined value will be used). Since there are no more yield statements to hit, the done property is set to true.
    it.next(); // { value: 4, done: true }
    // Our primitive linked-list looks like this:  1 → 2 → 3 →[4]
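
To make the pausing in the steps above visible, here is a minimal sketch that records the order in which things happen. The tracedLoop() function and the log array are names I made up for this illustration; they are not part of the example above.

```javascript
// Record events in order so the pause/resume behavior is visible.
const log = [];

function* tracedLoop() {
    log.push('body started');
    yield 1;
    log.push('resumed after first yield');
    yield 2;
    log.push('resumed after second yield');
    return 3;
}

const it = tracedLoop();          // creates the Generator-Iterator; the body has not run yet
log.push('iterator created');

const first = it.next();          // runs the body up to the first yield
log.push('got ' + first.value);

const second = it.next();         // resumes after the first yield, stops at the second
log.push('got ' + second.value);

const last = it.next();           // resumes and runs to the return statement
log.push('got ' + last.value + ', done: ' + last.done);

const extra = it.next();          // a finished generator keeps reporting done

console.log(log.join('\n'));
console.log(extra);               // { value: undefined, done: true }
```

Note that 'iterator created' is logged before 'body started': nothing inside the function runs until the first next() call.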

Using Generators in a Loop

Using the next() method, we can use most of the standard loops to walk over our Generator-Iterator. However, two things become clear very quickly. First, that iterating like this is very cumbersome; and second, that getting the return value out of a generator function really complicates the looping logic.

var it, node;
for (it = loop(); (node = it.next()) && !node.done; ) {
    console.log(node.value); // 1, 2, 3
}
console.log(node.value); // 4

The solution to the loop() function's return value is simple: replace return with yield. This makes sure that a node's done property is only set to true once all of the desired values have been iterated, which aligns its behavior with comparing an integer to the length of an array when iterating (i < array.length).

function* loop() {
    yield 1;
    yield 2;
    yield 3;
    yield 4; // return 4;
}

This lets us get all of the Generator-Iterator's values, but we're still stuck with that ugly (node = it.next()) && !node.done line. Luckily, the Generator-Iterator is an Iterable, and so we can make use of the for ... of syntax for a much more attractive and expressive loop:

for (let value of loop()) {
    console.log(value); // 1, 2, 3, 4
}
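
The reason return had to become yield is that for ... of stops as soon as done is true and throws away the value attached to that final node. The sketch below shows this with a hypothetical loopWithReturn() that keeps the original return 4; the spread operator and Array.from() behave the same way.

```javascript
// Same as the first version of loop(): three yields and a return.
function* loopWithReturn() {
    yield 1;
    yield 2;
    yield 3;
    return 4;
}

// for...of only sees nodes where done is false.
const seen = [];
for (const value of loopWithReturn()) {
    seen.push(value);
}
console.log(seen);                          // [ 1, 2, 3 ]

// Spread and Array.from() consume iterables the same way.
const spread = [...loopWithReturn()];
const copied = Array.from(loopWithReturn());
console.log(spread);                        // [ 1, 2, 3 ]
console.log(copied);                        // [ 1, 2, 3 ]
```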

Iterables in Generators

One of the weirder behaviors that Generator Functions provide is their ability to automatically iterate over other iterables with the yield* expression (note the asterisk). Anything that can be iterated with a for ... of loop can be delegated to with yield* from within a Generator Function. For this reason, it can be accurately described as an esoteric convenience feature. Whether it is clearer than the for ... of loop is up to the developer, but it is certainly terser, which may appeal to client-side JavaScript developers more than Node.js developers.

// More Terse
function* deferredLoop() {
    yield* loop();
}

for (var value of deferredLoop()) {
    console.log(value);
}

// Less Esoteric
function* iteratedLoop() {
    for (var value of loop()) {
        yield value;
    }
}

for (var value of iteratedLoop()) {
    console.log(value);
}
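
Since yield* accepts any iterable, it is not limited to other generators. Here is a small sketch with made-up inner() and outer() functions that delegates to an array, a string, and a generator. It also shows one thing the for ... of version cannot do directly: the yield* expression evaluates to the delegated generator's return value.

```javascript
function* inner() {
    yield 'a';
    yield 'b';
    return 'inner result';
}

function* outer() {
    yield* [1, 2];                   // arrays are iterable
    yield* 'hi';                     // strings iterate character by character
    const result = yield* inner();   // evaluates to the value inner() returned
    yield result;
}

const values = [...outer()];
console.log(values); // [ 1, 2, 'h', 'i', 'a', 'b', 'inner result' ]
```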

Passing Arguments to Generator-Iterators

One of the benefits of using the node-based iteration is the ability to communicate information back into the Generator Function. One common use-case when processing data in a for-loop is to move "backward" in an iteration or to skip a step. In a Generator Function, the next() method may accept a parameter and return it as the value of the yield expression. We can take advantage of this feature to provide the same "skipping" behavior.

function* generateArray(array) {
    for (var i = 0; i < array.length; ++i) {
        var skip = yield array[i];
        if ('undefined' !== typeof skip) {
            i += skip - 1;
        }
    }
}

var it = generateArray([1, 2, 3, 4]);

console.log(it.next(  )); // { value: 1, done: false } -- start; an argument here would be ignored
console.log(it.next( 1)); // { value: 2, done: false } -- move one to the right
console.log(it.next(-1)); // { value: 1, done: false } -- move one to the left
console.log(it.next( 0)); // { value: 1, done: false } -- don't move

This feature provides a lot of flexibility in how Generator-Iterators are used, but with a not-insignificant cost: the for ... of loop always calls next() without an argument, so a generator that depends on received values has to be driven by hand.
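
As a rough sketch of that trade-off, the snippet below repeats generateArray() from above and drives it two ways. The "every other element" rule in the second loop is just something I picked for illustration.

```javascript
function* generateArray(array) {
    for (var i = 0; i < array.length; ++i) {
        var skip = yield array[i];
        if ('undefined' !== typeof skip) {
            i += skip - 1;
        }
    }
}

// for...of never passes a value to next(), so skip is always undefined
// and the generator simply walks the array in order.
const plain = [];
for (const value of generateArray([1, 2, 3, 4])) {
    plain.push(value);
}
console.log(plain);      // [ 1, 2, 3, 4 ]

// To send values in, we have to call next() ourselves.
// Passing 2 each time moves two steps to the right.
const everyOther = [];
const it = generateArray([1, 2, 3, 4, 5]);
let node = it.next();    // the first call only starts the generator
while (!node.done) {
    everyOther.push(node.value);
    node = it.next(2);
}
console.log(everyOther); // [ 1, 3, 5 ]
```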

Style and Convention

There is no syntactic difference between function* loop, function *loop, and function * loop. I prefer to append the asterisk (*) to the function keyword as a matter of convention because generators are a special type of function, and it makes more sense to me with anonymous functions.

function foo() {
    return function* () { /* do things */ };
}

Unfortunately, Generator Arrow-Functions do not exist, with no word on future support. (Async generator functions, written async function*, were standardized in ES2018.) If you find yourself in need of a generator that shares the surrounding this (which I often do), you'll be forced to fall back on some old favorites:

() => {
    const self = this; //< Scoping "this"
    return (function* () { //< Closure
        yield Promise.resolve(self);
    })();
};
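
To show that pattern in context, here is a minimal sketch using a hypothetical counter object. The arrow function picks up this from the enclosing method, and the generator reaches it through self because a function* expression gets its own this.

```javascript
const counter = {
    start: 3,
    countdown() {
        // The arrow function shares "this" with countdown().
        const make = () => {
            const self = this; // keep a reference the generator can close over
            return (function* () {
                for (let n = self.start; n > 0; --n) {
                    yield n;
                }
            })();
        };
        return make();
    }
};

const ticks = [...counter.countdown()];
console.log(ticks); // [ 3, 2, 1 ]
```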

Conclusion

A Generator Function is an underappreciated (and poorly documented) technique for iterating through a series of data. It's a very powerful feature that is difficult to explain without highly simplified toy problems. The real strength of a Generator Function lies not in its ability to iterate through an array or some fixed data-set, but in its ability to separate potentially complex looping logic from the behavioral logic. As an added bonus, this allows the looping logic to be refined or replaced at a later date without having to modify the loop body, which lends itself especially to the prototyping and stubbing that tends to occur early in API-based projects.

The more complicated or prone to change a loop's logic is, the more your code will benefit from using a Generator Function; if for no other reason than because this:

for (var prime of primes(1, 1000)) {
    console.log(prime);
}

... is easier to grok than this ...

for (var i = 1; i < 1000; ++i) {
    var isPrime = i > 1; // 1 is not a prime number
    for (var j = 2; isPrime && j < Math.abs(i); ++j) {
        if ( (i / j) === ((i / j)|0) ) {
            isPrime = false;
        }
    }
    if (isPrime) {
        console.log(i);
    }
}
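
The primes() function in the first snippet isn't defined anywhere in this article, so here is one way it might look. This is only a sketch: I'm assuming the upper bound is exclusive (to match i < 1000 above) and starting no lower than 2, since 1 is not prime. It also uses the remainder operator and stops at the square root, rather than the division trick from the nested loop.

```javascript
// Hypothetical primes(min, max): yields each prime p with min <= p < max.
function* primes(min, max) {
    for (let i = Math.max(2, min); i < max; ++i) {
        let isPrime = true;
        // Trial division; checking divisors up to sqrt(i) is enough.
        for (let j = 2; isPrime && j * j <= i; ++j) {
            if (i % j === 0) {
                isPrime = false;
            }
        }
        if (isPrime) {
            yield i;
        }
    }
}

const firstFew = [];
for (const prime of primes(1, 30)) {
    firstFew.push(prime);
}
console.log(firstFew); // [ 2, 3, 5, 7, 11, 13, 17, 19, 23, 29 ]
```

The loop body at the call site stays a single console.log(), and the prime-finding logic can be swapped out later without touching it.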
