Type inference for Haskell, part 20

Posted on January 21, 2019

This is part 20 of the series on type inference for Haskell.

(THIH Section 11.5, From Types to Type Schemes)

The previous few posts dealt with some processing that gets applied to the predicates for a binding. The predicates were converted to head-normal form and then simplified to remove redundant predicates. Even after all that processing, there might be some issues with the predicates. Today we’re going to look at the first of the possible issues.

Consider this example, which has an outer binding foo and an inner binding bar:

foo x =
  let bar s = show x ++ s ++ show x
  in bar ", "

When processing the inner binding bar, we’ll discover a constraint that the type of x (let’s call that tx) needs to implement Show. That’s because we’re calling show x¹. This need is expressed as the predicate (Show tx). The type of bar is then a little bit odd:

(Show tx) => String -> String

The predicate mentions the type variable tx, but tx doesn’t appear anywhere in the type! It’s a bit silly to attach the predicate there. It would be much better to just defer that predicate to the outer binding for foo. The type for foo then looks like a good place for that predicate:

(Show tx) => tx -> String

With the predicate moved out there, the type of bar is then simply String -> String.

When a predicate is moved out to the outer binding, we call that “deferring” the predicate. When the predicate stays on the current binding, we call that “retaining” the predicate.

The logic for this is actually just one line. If you look at the code from THIH, you’ll find this line in the split function. The main job of the split function is to split apart the deferred and the retained predicates, but it also does other stuff we’ll get to in another post.

    let (ds, rs) = partition (all (`elem` fs) . tv) ps'

That is one very dense line. What does it do?

To start with, you need to know what data is found in the variables it references:

ps' contains the list of predicates that were returned from context reduction. For the bar function, this would be (Show tx).
fs is the set of type variables in the assumptions. These are the type variables in the types of other variables that are currently in scope. For the bar function, this would include x (which has type tx), so the set fs includes tx.

This expression relies on a few functions:

tv is the function defined many posts ago to find free type variables.
all and elem are built-in functions.
partition is a function from Data.List that splits a list into two based on some boolean function. Elements go in the first list when the function returns true, and in the second when it returns false.

This results in two variables. ds is the deferred predicates and rs is the retained predicates.

So, this line is saying, defer a predicate if all the type variables in that predicate are used in types from the outer scope. If not, assume the predicate belongs to this function.

It’s worth noting that the logic wasn’t written the other way around - checking if the type variables are used in the current binding. That is because there might be some predicates that have type variables that don’t reference either this binding’s type or any of the variables in scope. Those are potential ambiguities. I’ll look at those in the next post.

We haven’t yet covered how class constraints are actually found. The logic basically collects all the predicates from the all functions called in the expression.↩