6.001/Recitation 23

Rob Speer

Contents

Register machines

Register machines are hypothetical pieces of computing hardware, controlled by a Lispy assembly language. This way we can gloss over the messy bits.

At first, a register machine is a set of registers and operator boxes that are interconnected in a particular way, called a "datapath diagram". Data flows between registers at specified times along datapaths.

A register machine for Euclid's GCD algorithm.
A register machine for Euclid's GCD algorithm.

But then you need a controller language to describe which datapaths to activate when. And once you make your controller language sufficiently general, you can describe any register machine you want. Datapath diagrams become redundant.

The controller language

A controller (that is, an assembly-like program that deals with registers) might look like this:

(controller
 test-b
   (test (op =) (reg b) (const 0))
   (branch (label gcd-done))
   (assign t (op rem) (reg a) (reg b))
   (assign a (reg b))
   (assign b (reg t))
   (goto (label test-b))
 gcd-done)

All controllers begin with the tag 'controller, as you may expect by this point. The rest is a list of labels and instructions. A label is a bare symbol that gives a name to the following instruction.

These are the possible instructions:

  • Data is moved between registers with assign. Assign commands take a destination, which must be a register (where else would you put it?), and then a source, which could be:
    • A register: (reg t)
    • The result of an operation: (op rem) (reg a) (reg b)
    • A constant: (const 0)
    • A label: (label test-b). Think of a label as being a constant for a line number. The register machine assembler finds the labels and converts them to numbers.
  • goto is how you control the flow of the program: it goes to a certain line number. It takes a source, like assign. I don't think you're allowed to use an (op) or (const) as the source. I certainly hope not.
  • test is to make decisions. Its source is a boolean operation. The result is stored in a branch register.
  • branch should happen immediately after a test. If the branch register is set to true, it acts like goto; if false, it does nothing.
  • perform is for operations that don't require an assignment: (perform (op print) (reg a))

We haven't got much to work with, but we can make a factorial machine out of this.

But something is seriously cramping our style. What's missing?

Recursion

Recursion would sure be nice if we want to do anything Scheme-like. We want to be able to call subroutines an arbitrary number of times, without losing track of the values in our registers. The subroutine may step all over our registers, so we need to save what we're doing.

Shazam. Now there's a stack. We have two more instructions we can use:

  • save: push the contents of a register onto the stack
  • restore: pop the stack and put its top value in a given register

The calling convention

  1. Save all important registers onto the stack.
  2. Store the return label in the continue register. (All subroutines end with (goto (reg continue)).
  3. Call the subroutine.
  4. Restore the registers from the stack, in the opposite order.
  • How do we make an arbitrary number of subroutine calls with just one continue register?
  • What are other possible conventions we could have used?
  • Write a recursive factorial function this way.

Tail-call optimization

Isn't the end of that factorial call a bit redundant?

Seen from this perspective, tail-call optimization seems like a blatantly obvious thing to do.

WTF

The following twisted and pointless register machine comes from Dan Zaharopol. What does it do? Can you break it down into understandable chunks?

(registers n foo bar continue)           Contract:
(operations + - * / < =)                   input: n, continue
(controller                                output: result
mystery-proc                               writes:
  (assign n (op +) (reg n) (reg n))        stack:
  (save n)
  (goto (label mystery1))
whynot
  (assign n (op -) (reg n) (const 2))
  (assign continue (label mystery4))
  (save bar)
  (save n)
  (goto (label mystery1))
mystery3
  (assign result (op *) (reg n) (reg n))
  (goto (reg bar))
mystery1
  (save n)
  (save continue)
  (restore n)
  (restore foo)
  (assign bar (reg n))
  (restore n)
  (test (op <) (reg foo) (const 0))
  (branch (label mystery2))
  (test (op =) (reg n) (const 0))
  (branch (label mystery3))
  (save n)
  (test (op =) (reg n) (const -5))
  (branch (label uh-oh))
  (goto (label whynot))
mystery2
  (goto (label mystery3))
uh-oh
  (assign n (op /) (const 1) (const 0))
  (goto (reg bar))
mystery4
  (restore bar)
  (restore foo)
  (assign n (op /) (reg foo) (const 2))
  (assign result (op +) (reg n) (reg result))
  (goto (reg bar))
)