Category Theory: The Beginning

April 13, 2017


Samuel Eilenberg and Mac Lane introduces Category Theory on their paper titled “General Theory of Natural Equivalences” in 1945
When I first heard about the word Category Theory on my journey to functional programming, I was quick to dismiss it. I write codes for living, so I thought those are the mathematical models of something that has no value on my day-to-day job. Any article I read about CT is laden with math symbols I did not understand and the vocabulary was not familiar for me. I’m living in non-English speaking country, so English math vocabulary is a thing I’m studying at my 20s, not in my teenage years. I can program just fine without any knowledge of Category Theory. I heard and I swept it under the rug.

That changed when I observe steady stream of new languages claiming to be Functional Languages entering the mainstream. Libraries pops out to support functional style programming on previously imperative languages. I sense something changing in the programming world, and my foreboding become more apparent recently. Nowadays, the way people do programming is different compared to 10 years ago. Multicore is the norm, running on multiple small machines is how you do big software nowadays, not on the big bulky mainframes. New techniques based on old ideas coming out. Programmers are now constantly put in a hot water of concurrency. It’s getting harder and harder to make sense and prove the program correctness. Software becomes so complex and it’s easy to lose track.

We’re in the middle of what physicisits will say a phase transition. Paradigm is shifting due too multicore revolution and the ploriferation of cloud providers. Object Oriented Programming make sense in last century, but it starts to show cracks. It does not model concurrency and parallelism easily and correctly. Multi-threading doesn’t really solve the problem. It introduces buggy design when used recklessly. People starts to face deadlock, livelock, starvation, and races everyday. Not only on single machine but also on their datacenters due to separation of processing onto multiple machines. We need something better to model and solve this problem.

It turns out, computer operations can be modeled onto mathematics formula, axioms, and laws. Something that hard to be made sense of, now can be proven. Mathematicians have been doing this for years, and now we can steal their knowledge to model our computation even before we go down to hours of coding session.

Category Theory forces us to think about the Type, which fortunately, can be verified by compiler. Compiler is good on following instructions to the minute details. We’re not, we tend to forget things. We invent Test Driven Development to make sure our software is correct and consistent. We invent Domain Driven Design to abstract out our code and domain from the infrastructure nitty gritty. And now we have Category Theory to build the Domain Model and making sure of its correctness by working with types. Something that the compiler can verify quickly and correctly.

‘The’ Category

Category is a embarrasingly simplistic yet powerful concept. It has two things: Objects and Morphism . A Categories must follows Composition Law and obeying some Properties.

Objects

Let’s start with Objects. In this article a Category is defined by cursive script. Let’s say we have category , The objects of Category are expressed with and can be represented below:

Representation of objects within Category in Category Theory

From diagram above we do know that , , , , and are . We don’t know what are them. We also don’t know their properties. We just see it as black box. We don’t focus too deeply on that.

Objects in Category Theory loosely corresponds to types in programming. As a programmer, it’s easier to think object as types. It can be anything, Integer, Boolean, Order, GameObject, anything. In fact, by thinking like this, we’re jumping into Category of Programming.

Morphism

Morphism is relation between objects in a category. Morphism also called Arrow . I use it interchangeably.

Morphisms between objects in category C. Each object has relation with other objects

If and are objects in , then is a morphism from to and is collection of all morphisms from to . is also called the Hom-Set of to . All objects and morphisms in the figure above defines the category . A morphism can go from one object to another or to itself. For example, morphism goes from to .

In programming, morphisms corresponds to functions . So morphisms from object to in Category Theory can be interpreted in programming as a function that takes parameter of type and returns something of type .

In the study of Category Theory, we focus more into morphisms or functions. We understand objects or types from the relations from each other and composition of those relations.

Composition

To form a category there must be a composition between morphisms. Composition denoted by small circle pronounced “after”. If and are morphisms which and . There must be .

Composition between morphism f and g within category 𝒞

In programming, composition is the same as applying one function to another. See code below written in Kotlin.


fun <A, B> f(a: A): B { /* do something to A and returns B */ }
fun <B, C> g(b: B): C { /* do something to B and returns C */ }

// compose the function
fun <A, C> gAfterf: (a: A): C = g(f(a)) 

We’ll go back to composition later on. For now let’s move on to the Identity properties.

Category Laws

Inside category, there are laws to be satisfied: the Identity Law and the Associativity Law .

Identity Law

Identity is a morphism form an object to itself which: For each object in , there’s an identity so that for each

This can be expressed in picture:

Identity property of morphism in Category

.

In programming, a morphism is a function that returns the same value of the same type. Written in Kotlin:

fun <A> identity(a: A) = a

It seems to be useless. However, as Category Theory does not concern much about objects, but its morphisms. This identity morphism is essential for further study of Category Theory. Think identity morphism like number zero. It was made to express nothingness. Identity morphism express no operation.

Associativity Law

Composition is associative. If we have morphism of , , that can be composed. Meaning their objects match end-to-end, this associativity law holds:

Therefore, we don’t need to put parenthesis when expressing function composition. If you like pretty pictures, the law can be represented by this:

Associativity law should hold between objects within category.

In programming, associativity holds for functions. But when we’re dealing with other categories such as Set, it is less obvious.

Composition and Programming


George A. Miller is American psychologist who had written coincidence between limit of short-term memory and one-dimensional judgement task.
When we code, we chop up our big problem into smaller problems and then to smaller units. In the context of domain driven design, we divide our program into domains. Each domains can consists of several services. Each services consists of many functions and every function consists of expressions. We compose these smaller parts to build the bigger parts. We do this because of limitation of our brain. A psychologist, George A. Miller has even said on his paper about this phenomena.

A beautiful or elegan code is a chunk of code that can fit and digested easily by our brain. A good size of composable code has, to quote from Barotz Milewski: “Their surface area has to increase slower than their volume”. The surface area is the information we need to compose code, the volume is information we need to implement them.

One thing that struck me is that Category Theory doesn’t want us to peek inside the object. An object is black box, a ‘something’. We only know their properties by examining the relation between them. In Object-Oriented Programming this means that we model our program based on their interface (just surface area, no volume) with their methods is the morphism. The litmus test for non-composable program can be summarised to one sentence, this also paraphrased from Barotz Milewski:

The moment you have to peek and dive into implementation details to understand how to compose it with other object. You’ve lost advantage of your programming paradigm.

We access and services by their interfaces like REST API right? You should not even know how they work. All you know you want to compose your system or app with theirs. All you want to know is how to call them.

Summary

This article is the discovery I made when I study Category Theory on my free time. It helps me decompose problem into composable smaller chunk and write less code to achieve same result. Hope this article can wet your appetite on studying and utilising Category Theory when you create code.

comments powered by Disqus
Category Theory: The Beginning - April 13, 2017 - Didiet Noor