Writing a MachineFunctionPass in LLVM
January 27, 2017
I’ve been hacking on LLVM lately, and I recently needed to write a MachineFunctionPass
to analyze some IR instructions while they got converted to assembly, since I was
working with machine-dependent representations in LLVM as opposed to machine-independent IR.
Unfortunately, LLVM’s splendid Writing an LLVM Pass
doc (which has a great introduction to IR-level passes) didn’t fully cover how
to write a MachineFunctionPass (or rather, how to get it running), at least not well
enough for noobs like me to understand. This mailing list thread
was invaluable in getting me started, but I’ll elaborate a bit more.
The post below assumes you’ve read and understand the Writing an LLVM Pass doc. It also assumes you have a basic familiarity with the kind of tools LLVM offers out-of-the-box.
The opt tool doesn’t make any machine-dependent optimizations
and only runs on IR, so it makes sense that MachineFunctionPasses don’t work in opt,
since they only run on MachineInstrs (if this doesn’t make sense to you,
check out Eli Bendersky’s Life of an instruction in LLVM post).
The cool part of opt is that you can write a pass out-of-source and choose to
dynamically load it as a shared object library into opt without recompiling
the opt tool itself, which takes a shit-ton of time. Unfortunately, there is no
such nice modular way to write machine-dependent passes for llc. You simply
need to hack LLVM’s source to get llc to run your pass when
you invoke it for the architecture you’re working on.
Enough talk, let’s dive in!
So, let’s say I want to write a MachineFunctionPass that dumps the
MachineFunction. Let’s call our file X86MachineInstrPrinter.cpp.
Whenever you start navigating a codebase as intimidating as LLVM’s, you often wonder,
“How the heck do you figure out how xyz works without documentation?” The clichéd
answer is simply: read the source. I ended up making friends with
grep -nr "[search term]" . and ctags, and life got a bit better.
The key takeaway from the first link is that all optimizations on MachineInstrs
are in the form of MachineFunctionPasses. If you follow the LLVM Reviews page
(and you should!), try to get hold of a review involving such an
optimization. The diffs should give you an idea of the additions you need to make
to get your stuff working (and shhh, find some sample code). My helper link was this
and sample file was
Moving on, in lib/Target/X86/X86TargetMachine.cpp, add the snippet below. (Note
that we’ll be adding our pass under the addPreRegAlloc() function because we
choose to print our MachineInstrs before register allocation takes place.)
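To see why the choice of hook matters, here’s a standalone toy model. It is not the LLVM snippet and it uses no LLVM headers: every type in it (ToyMachineFunction, ToyPass, ToyInstrPrinter, ToyRegAlloc, ToyPassConfig, ToyX86PassConfig) and the printBeforeRegAlloc() helper are made-up stand-ins. The only thing borrowed from LLVM is the idea that the pass configuration exposes an addPreRegAlloc() hook which a target overrides to schedule passes just before register allocation.

```cpp
#include <cassert>
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Stand-in for a machine function: a name plus instructions as plain strings.
struct ToyMachineFunction {
  std::string Name;
  std::vector<std::string> Instrs;
};

// Stand-in for a MachineFunctionPass.
struct ToyPass {
  virtual ~ToyPass() = default;
  // Same convention as LLVM passes: return true only if the function changed.
  virtual bool runOnMachineFunction(ToyMachineFunction &MF) = 0;
};

// Analysis-only pass: writes out every instruction and changes nothing.
struct ToyInstrPrinter : ToyPass {
  std::ostream &OS;
  explicit ToyInstrPrinter(std::ostream &Out) : OS(Out) {}
  bool runOnMachineFunction(ToyMachineFunction &MF) override {
    OS << MF.Name << ":\n";
    for (const std::string &MI : MF.Instrs)
      OS << "  " << MI << "\n";
    return false;
  }
};

// Pretend register allocator: rewrites the virtual register %vreg0 to %eax.
struct ToyRegAlloc : ToyPass {
  bool runOnMachineFunction(ToyMachineFunction &MF) override {
    bool Changed = false;
    for (std::string &MI : MF.Instrs) {
      std::string::size_type Pos = MI.find("%vreg0");
      if (Pos != std::string::npos) {
        MI.replace(Pos, 6, "%eax");
        Changed = true;
      }
    }
    return Changed;
  }
};

// Stand-in for the generic pass configuration: fixed order plus one hook.
class ToyPassConfig {
  std::vector<std::unique_ptr<ToyPass>> Passes;

protected:
  void addPass(std::unique_ptr<ToyPass> P) {
    assert(P && "cannot schedule a null pass");
    Passes.push_back(std::move(P));
  }
  // Targets override this to schedule passes right before register allocation.
  virtual void addPreRegAlloc() {}

public:
  virtual ~ToyPassConfig() = default;
  // Builds the pipeline (hook first, then the allocator) and runs it once.
  void run(ToyMachineFunction &MF) {
    addPreRegAlloc();
    addPass(std::make_unique<ToyRegAlloc>());
    for (auto &P : Passes)
      P->runOnMachineFunction(MF);
  }
};

// Stand-in for a target's pass config: the one place the printer is hooked in.
class ToyX86PassConfig : public ToyPassConfig {
  std::ostream &OS;
  void addPreRegAlloc() override {
    addPass(std::make_unique<ToyInstrPrinter>(OS));
  }

public:
  explicit ToyX86PassConfig(std::ostream &Out) : OS(Out) {}
};

// Runs the toy pipeline and returns what the printer saw.
std::string printBeforeRegAlloc(ToyMachineFunction &MF) {
  std::ostringstream OS;
  ToyX86PassConfig Config(OS);
  Config.run(MF);
  return OS.str();
}
```

Because the printer is scheduled from the hook, the text returned by printBeforeRegAlloc() still mentions %vreg0, while the function itself has been rewritten to use %eax by the time the pipeline finishes. In real LLVM the corresponding pieces would be your pass’s runOnMachineFunction() and the X86 pass configuration in X86TargetMachine.cpp.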
Add X86MachineInstrPrinter.cpp to the X86 target’s list of source files in the build configuration
and compile LLVM from your build directory. If you have a computer like mine,
you might want to get a cup of coffee, despite the fact that you just need to recompile llc.
The next time you run llc, you’ll see your MachineInstrs being printed. How exciting!
Phew, that was long. Hopefully, this gave you some insight into how one goes about figuring out a large codebase like LLVM without losing one’s mind or going in too deep looking for reasoning (“To make an apple pie from scratch, you must first invent the universe.” - Carl Sagan).
You might notice that all the pass-related function names
follow the same naming scheme. You might think this is just a convention; why not
try changing one of them where you define them?
You’ll be greeted with a bunch of cryptic error messages.
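The short version of why: the names are generated by token pasting. Here’s a minimal standalone sketch of the mechanism. It is a simplified imitation, not LLVM’s actual macro: ToyRegistry, TOY_INITIALIZE_PASS, registerToyPasses() and the "x86-machineinstr-printer" string are all made up for illustration. What it does mirror is that LLVM’s pass-registration macros build a function called initialize<PassName>Pass out of the pass name you hand them.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a pass registry: it only records pass names.
struct ToyRegistry {
  std::vector<std::string> Names;
};

// Simplified imitation of a registration macro: the pass name is pasted
// into the name of the function that the macro defines.
#define TOY_INITIALIZE_PASS(passName, arg)                 \
  void initialize##passName##Pass(ToyRegistry &Registry) { \
    Registry.Names.push_back(arg);                         \
  }

// The kind of declaration a target header would carry. Its spelling has to
// match exactly what the macro generates from the pass name.
void initializeX86MachineInstrPrinterPass(ToyRegistry &Registry);

// Expands to the definition of initializeX86MachineInstrPrinterPass().
TOY_INITIALIZE_PASS(X86MachineInstrPrinter, "x86-machineinstr-printer")

// Calls the generated function through the hand-written declaration.
std::vector<std::string> registerToyPasses() {
  ToyRegistry Registry;
  initializeX86MachineInstrPrinterPass(Registry);
  assert(Registry.Names.size() == 1 && "exactly one pass was registered");
  return Registry.Names;
}
```

Rename the pass in only one place, say the first macro argument, and the hand-written declaration no longer has a matching definition. The build then fails at link time with an undefined reference that never mentions the macro, which is roughly what makes the real errors so cryptic.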
To make sense of them, take a look at
/include/llvm/PassSupport.h. And holy shit, the
entire file is full of giant macros in the wild with all the function names