Monday, December 8, 2025

A Symbology for Concatenative Combinators

I believe that one of the problems with the study of concatenative combinators is a fundamental misunderstanding of the symbology used to represent them. (Most of the explanations I have seen skip steps, or conflate the function being executed with the data on the top of the stack.) The theory is simple enough. After all, it is essentially an order-reversed version of the Lambda Calculus implemented on a Turing Machine style tape, effectively implementing the Church-Turing thesis. Now that I’ve thoroughly confused you, let’s get on with it.

Definitions and Terminology

The present: The current instruction being executed. Data is a very simple form of instruction. (2 is a function that pushes the number 2 onto the stack.) Represented as an instruction placed inside a pair of vertical bars:
|X|

The future: Instructions waiting to be executed. This is better known as the command queue. Represented as everything to the right of the present bars, with the leftmost as the next instruction to be executed:
|| X Y Z

The past: Data derived from the previous execution of instructions. This is better known as the data stack. Represented as everything placed to the left of the present bars, with the newest on top:
a b c ||

Comments are denoted by a double backslash.
\\SingleWordComment
\\ Comment to the end of the line. Notice the space after the backslashes.
\\[Multiline
comment \\]
\\[Comment followed by \\] instructions and data

Basic Procedure

Instructions are devoured from the future, executed in the present, and produce data that gets appended to the past. This act of executing an instruction or otherwise advancing through the program’s execution is depicted as the forward arrow → of time. Instructions follow the data they act upon. (This is known as postfix or reverse Polish notation.)

|| a b c X Y Z → |a| b c X Y Z → a || b c X Y Z → a |b| c X Y Z → a b || c X Y Z

In this summary, simple data elements are written as numbers or lower case variables. Instructions are written as upper case variables, or actual names. Blocks (unpacked and executed immediately) are represented with parenthesis (X Y Z). Quoted blocks, or quotes (treated as data until unquoted), are represented with curly braces {X Y Z}. Lists of data are represented as square brackets [a b c].

|| [a b] c → |[a b]| c → [a b] || c
|| {X Y} Z → |{X Y}| Z → {X Y} || Z
|| (X Y) Z → |(X Y)| Z → || X Y Z

A single backslash represents a temporary quote, or pass. It passes the instruction to its right directly to the stack without executing it. The future is transferred to the past without passing through the present. The pass character may be prepended to the instruction for clarity and succinctness.

a b c || \ X Y Z → a b c |\| X Y → a b c X || Y
a b c || \X Y Z → a b c |\X| Y → a b c X || Y
a b c || \( X Y) Z → a b c |\(X Y)| Z → a b c (X Y) || Z

Simple Instructions

|| 1 2 + → |1| 2 + → 1 || 2 + → 1 |2| + → 1 2 || + → 1 2 |+| → 3 ||
1 2 |-| → -1 ||
1 2 |*| → 2 ||
1 2 |/| → 0.5 ||
5 3 |div| → 1 || \\ Integer division.
5 3 |mod| → 2 || \\ Modulo arithmetic = remainder after integer division.
5 3 |divMod| → 1 2 || \\ A function can produce multiple return values.
1 2 |dup| → 1 2 2 ||
1 2 |drop| → 1 ||

Predicates are functions that return a Boolean value (true or false).
2 1 |>| → true ||
1 2 |>| → false ||
3 |odd?| → true ||
3 |even?| → false ||

Eval

You can now understand what the eval command does. It takes the top element of the stack and moves it to the present for execution.

(X Y Z) |eval| → |(X Y Z)| → || X Y Z
{X Y Z} |eval| → |{X Y Z}| → {X Y Z} ||
[a b c] |eval| → |[a b c]| → [a b c] ||
a b c |eval| → a b |c| → a b c ||
1 2 + |eval| → 1 2 |+| → 3 ||

The dup (duplicate) command duplicates the top element of the data stack without evaluating it.

a b |dup| → a b b ||

So, to summarize what we have so far, here is a very simple program.

|| a \dup eval → |a| \dup eval → a || \dup eval → a |\dup| eval → a dup || eval →
a dup |eval| → a |dup| → a a ||

Part of the simple genius of concatenative programming is that a long series of predictable steps can be replaced by what they eventually produce.

a \dup eval => a a

As is probably obvious, a passed command followed immediately by eval reduces to just the bare command.

\X eval => X

Do

do is a more powerful form of eval.
{X Y Z} |do| → || X Y Z
(X Y Z) |do| → || X Y Z
[1 2 3] |do| → [1 2 3] ||

do dereferences variables in lists. eval does not.
\\ Where a=1, b=2, c=3.
[a b c] |eval| → [a b c] ||
[a b c] |do| → [1 2 3] ||

do evaluates each element in a list as an instruction, then enlists the results.
[a 1 + b 2 * c dup] |do| → → [2 4 3 3] ||

Combinators

A combinator is simply an instruction that acts on other instructions or quotes. This normally happens through rearranging, quoting, or unquoting.

Here are some of the more common combinators. The actual commands vary a bit from language to language, but these will give you the sense of them.

a b |dup| → b b ||
a b |drop| → a ||
a b |quote| → a {b} ||
a {b c} |dequote| → a b c ||
a b |nip| → b ||
a b |swap| → b a ||
a b |over| → a b a ||
a b c |rotl| → b c a ||
a b c |rotr| → c a b ||
a b |dupd| → a a b ||
a b c |swapd| → b a c ||
{a} b |append| → {a b} ||
{a} b |prepend| → {b a} ||
{a} {b} |concat| → {a b} ||
a {b} |curry| → {a b} || \\ Named for Haskell Curry, who developed combinators.
{a} {b} |curry| → {{a} b} || \\ Some languages use cons for this.

Now we get into some of the more advanced combinators. These start moving things from the past to the future. They’re incredibly handy. (Functional bros call these higher order functions.)

a b X |dip| → a || X \b \\ Data moved to the future is passified.
a b {X Y} |dip| → a || X Y \b \\ Function quotes are unpacked.
a b X |keep| → a b || X \b
a b {X Y} |keep| → a b || X Y \b \\ This is the last quote I’m going to explicitly show.
a b X Y |doBoth| → a || X \b Y \\ Spreads two functions across two data.
a b X |doToBoth| → a || X \b X \\ Applies one function to two data.
a X Y |doBothTo| → a || X \a Y \\ Applies two functions to one data.

There are a whole series of combinators that act on lists of data, generally producing lists of data. They’re just different common iterators.

[1 2 3] {2 *} |map| → [2 4 6] || \\ Apply a function to each element.
[2 3 4] 0 {+} |reduce| → 9 || \\ Apply a function to each element in turn, from a baseline.
[2 3 4] 1 {*} |reduce| → 24 ||
[2 3 4] 0 {*} |reduce| → 0 ||
[1 2 3] 0 {+} |mapReduce| → [1 3 6] || \\ Produces a running total.
[1 2 3 4 5 6] {odd?} |filter| → [1 3 5] || \\ Requires a predicate.
[1 2 3 4 5 6] {3 >} |filter| → [4 5 6] ||

Control Flow

Control flow in concatenative languages is accomplished by, you guessed it, combinators. For example, the if combinator takes three arguments: a predicate and two functions (or quotes). If the predicate resolves as true, if evaluates the first function, otherwise the second.

3 || (2 >) {“greater than 2”} {“less than 2”} if → || “greater than 2”

The while combinator takes a predicate and a function. It repeatedly evaluates the function as long as the predicate returns true.

10 || {dup 0 >} {dup print “ …” println 1 -} while “Liftoff!” println

Notice that we could take everything in the future part of the above, and create a new instruction from it.

{ {dup 0 >} {dup print “ …” println 1 -} while “Liftoff!” println } “countdown” define

Now we can use countdown with any starting value we want.

20 countdown


I hope this summary helped clarify the topic. There are a lot more details to explore, but this should have provided a solid foundation. Remember, the vertical bars || surrounding the present instruction are a notational tool for understanding. They don’t actually appear in any program.

Something modern concatenative style programming sneaks in almost for free: anonymous functions. (Functional bros call them lambdas.) They’re just quotes. Pass them around like any other piece of data, because that’s all they are.

Reference 1: The Theory of Concatenative Combinators, Brent Kirby, 2002
Reference 2: The Mathematical Foundations of Joy, Manfred Von Thun, 2001

No comments:

Post a Comment

I reserve the right to remove egregiously profane or abusive comments, spam, and anything else that really annoys me. Feel free to agree or disagree, but let's keep this reasonably civil.