Recently, I started building a SQLite3 module for Deno. The module works by compiling SQLite3 with a custom 'Virtual File System (VFS)' to WASM and wrapping it in a small JavaScript API.
Throughout the project I learned a lot about compiling WASM modules, avoiding JavaScript glue code, and even got to expose two issues with Deno.
Using a compiled language like C and importing it into JavaScript is really exciting! It enables modules that are much faster than anything we could write in JavaScript and lets us use libraries like SQLite3 without platform-specific runtime extensions.
Below is a tutorial on compiling C libraries for use with Deno.
A simple demo module
Let's start with a module.c which will allow us to use a few simple C functions:
int total_count = 0;

// Modules can have state
int count() {
  return ++total_count;
}

// They can do work
int add(int a, int b) {
  return a + b;
}

// But they only work with numbers
char * string() {
  return "Hello, world!";
}
Compiling the WASM binary
Now the first step is to build a WASM binary from this. WASM is supported by LLVM, so all we need to do to compile our module.c file is run:
clang --target=wasm32 --no-standard-libraries -Wl,--export-all -Wl,--no-entry -o module.wasm module.c
(Caveat: If you're on macOS, the default LLVM shipped by Apple does not support WASM. But you can easily install one that does, e.g. via Homebrew using brew install llvm.)
Let's unpack this command:
- --target=wasm32 tells LLVM that we want to produce a WebAssembly binary
- --no-standard-libraries does exactly what it says; this means we won't have access to things like <stdlib.h>, <stdio.h>, or <string.h>
- -Wl,--export-all tells the linker to export all symbols
- -Wl,--no-entry tells the linker we don't need a main function
- -o module.wasm specifies the output file
Having built our module, we can now proceed to use the functions it exports. In Deno, creating a WebAssembly instance is super simple:
import * as module from "./module.wasm";
console.log(module);
should output:
{ __data_end, __dso_handle, __global_base, __heap_base, __wasm_call_ctors, add, count, memory, string, total_count }
Calling Functions
Having done this, functions exported by the module can be called as if they were regular JavaScript functions:
// ...
console.log(module.count(), module.count(), module.count());
console.log("1+1 =", module.add(1, 1));
1 2 3
1+1 = 2
However, there are a few things to be aware of. Firstly, until interface types are supported, WASM modules can only input and output numbers. This means that reading things like strings from our WASM module is a bit more involved:
// ...
function readString(ptr) {
  // Get the memory's bytes, starting from the beginning of the string.
  const mem = new Uint8Array(module.memory.buffer, ptr);
  // Find the length of the string.
  // (doing this in C would be faster)
  let length;
  for (length = 0; mem[length] !== 0; length++);
  // Decode the string (see note about optimization below)
  return new TextDecoder("utf-8").decode(
    new Uint8Array(mem.buffer, ptr, length),
  );
}
console.log(readString(module.string()));
Hello, world!
(Note that this can be done more efficiently; see here for an example using an optimization stolen from EMSCRIPTEN.)
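As a rough idea of what a faster version could look like, here is a minimal sketch. It is not necessarily the optimization referred to above; it simply reuses one TextDecoder and lets the built-in indexOf find the terminating zero byte. The memory argument and the hand-filled WebAssembly.Memory are stand-ins for module.memory so that the snippet runs without a compiled binary.

```javascript
// One decoder, created once and reused for every call.
const decoder = new TextDecoder("utf-8");

function readString(memory, ptr) {
  const mem = new Uint8Array(memory.buffer);
  // indexOf scans for the terminating 0 byte in native code.
  const found = mem.indexOf(0, ptr);
  // Fall back to the end of memory if no terminator is found.
  const end = found === -1 ? mem.length : found;
  // subarray creates a view on the same buffer; no bytes are copied.
  return decoder.decode(mem.subarray(ptr, end));
}

// Stand-in for module.memory: place "Hello, world!" at offset 16.
// Fresh WASM memory is zero-filled, so the terminator is already there.
const memory = new WebAssembly.Memory({ initial: 1 });
new Uint8Array(memory.buffer).set(
  new TextEncoder().encode("Hello, world!"),
  16,
);

console.log(readString(memory, 16));
```

With the real module, the call would be readString(module.memory, module.string()).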
Another limitation comes from JavaScript. In JavaScript, numbers are always double-precision floating point values, so there is no 64-bit integer number type. This means that integers are always cast to 32 bits when passed from JavaScript. To get the maximum precision a JavaScript number can carry, we have to pass the value as a double and convert it in C:
void needs_64_bit_int(long long int number) {
  // ...
}

// becomes
void needs_64_bit_int(double number_js) {
  long long int number = (long long int)number_js;
  // ...
}
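To get a feeling for where these limits sit on the JavaScript side, the following small sketch shows the 32-bit wrap-around and the point where doubles stop representing integers exactly. The helper name asInt32 is made up for this illustration; the | 0 it uses applies the same 32-bit conversion that a number goes through when it is passed to an int parameter.

```javascript
// Hypothetical helper: the 32-bit integer conversion applied by `| 0`.
function asInt32(n) {
  return n | 0;
}

// Anything above 32 bits wraps around.
console.log(asInt32(2 ** 32 + 5)); // 5

// Doubles hold integers exactly only up to 2^53 - 1.
console.log(Number.MAX_SAFE_INTEGER === 2 ** 53 - 1); // true

// Beyond that, neighbouring integers collapse into the same double.
console.log(2 ** 53 === 2 ** 53 + 1); // true
```

So passing a double gives us 53 bits of integer precision rather than the full 64.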
Exporting only specific functions
So far we used the -Wl,--export-all linker option to export all the symbols declared in our module. However, this also exported some symbols we don't care about. To only export specific symbols we have multiple options:
- Export specific symbols with the -Wl,--export,SYMBOL linker flag, e.g. -Wl,--export,count
- Use attributes in our code to control what gets exported. EMSCRIPTEN defines the macro EMSCRIPTEN_KEEPALIVE. We can do the same like so:
  #define KEEPALIVE __attribute__((used)) __attribute__((visibility ("default")))
  // and then e.g.
  int KEEPALIVE count() {
    // ...
  }
We can also combine the two. E.g. it makes sense to use an attribute for your own code, and export standard library functions you need to access from JavaScript (e.g. malloc and free) using linker flags.
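Whichever option you pick, it can be handy to check which symbols actually ended up in the binary. A minimal sketch using the standard WebAssembly.Module.exports function is shown below. The byte array is a tiny hand-assembled module that exports a single add function and only stands in for a real build; with a real module.wasm you would load the file's bytes instead (for example with Deno.readFile).

```javascript
// Hand-assembled stand-in for module.wasm: exports add(i32, i32) -> i32.
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic + version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section
  0x03, 0x02, 0x01, 0x00, // function section
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export "add"
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // code
]);

// Compiling is enough to inspect a module; no instance is needed.
const wasmModule = new WebAssembly.Module(bytes);

// Each entry describes one export as { name, kind }.
const exported = WebAssembly.Module.exports(wasmModule).map(
  (entry) => `${entry.kind} ${entry.name}`,
);

console.log(exported); // [ "function add" ]
```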
EMSCRIPTEN
So far, our module was extremely simple. We did not use any C standard libraries, and could compile our module directly with a standard LLVM clang compiler. If we need any standard headers like <stdlib.h> or <string.h>, this will no longer work.
Anything we want to use that doesn't know its memory requirements up front (e.g. uses malloc) will not be able to compile this way. To get these standard libraries, we need to use an SDK that includes libraries which run on a WASM VM.
The EMSCRIPTEN compiler provides us with an LLVM compiler and a POSIX libc which runs in a browser. This will also work for Deno, but their implementation produces a non-trivial amount of glue code which is tricky to work with. It is also based on constraints which don't quite align with our needs; for example, deno-sqlite runs faster when not using the EMSCRIPTEN-provided glue.
The exported function for instantiating modules is also not that nice. It implements a .then (like a promise) but isn't a real promise, which caused me all sorts of headaches.
All of this does not (!) mean that EMSCRIPTEN is bad. On the contrary, there are very impressive demos that demonstrate what the compiler is capable of. It's just built for a different use-case.
WASI and standard libraries
WASI is short for WebAssembly System Interface and is an emerging standard which aims to let WebAssembly target environments outside of the browser. It is not yet supported by Deno, but the WASI SDK still provides a quite nice standard library which we can use as long as we don't use any headers that need syscalls (specifically, <stdio.h> is something we sadly can't use).
Note that you can use these features if you provide the module with the necessary WASI imports when instantiating it. That, however, means that you won't be able to import the binary directly from a JavaScript file.
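To illustrate what providing imports at instantiation time means, here is a minimal sketch of the mechanism. It does not implement any WASI function: the hand-assembled bytes describe a made-up module that imports one function, env.get, and exports run, which just calls it. For a real WASI build, WebAssembly.Module.imports would tell you which namespace and function names the import object has to supply.

```javascript
// Hand-assembled, hypothetical module: imports env.get() -> i32 and
// exports run() -> i32, which simply calls the imported function.
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic + version
  0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7f, // type section: () -> i32
  0x02, 0x0b, 0x01, 0x03, 0x65, 0x6e, 0x76, // import section: "env"
  0x03, 0x67, 0x65, 0x74, 0x00, 0x00, // ... "get", function, type 0
  0x03, 0x02, 0x01, 0x00, // function section
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01, // export "run"
  0x0a, 0x06, 0x01, 0x04, 0x00, 0x10, 0x00, 0x0b, // code: call 0
]);

const wasmModule = new WebAssembly.Module(bytes);

// List what the module expects from its host.
const required = WebAssembly.Module.imports(wasmModule);
console.log(required.map((i) => `${i.module}.${i.name} (${i.kind})`));

// The import object mirrors that list: namespace first, then name.
const imports = { env: { get: () => 42 } };
const instance = new WebAssembly.Instance(wasmModule, imports);

console.log(instance.exports.run()); // 42
```

Because the import object has to be passed in by hand, a module like this cannot be pulled in with a plain import statement.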
If you need to emulate a complete POSIX environment, you should use EMSCRIPTEN as your compiler. But if you can get away with not using these APIs (notably, that is the case for wrapping SQLite), using the libc provided by the WASI SDK allows you to build .wasm modules which are fast and self-sufficient.
You can download the WASI SDK for your platform here.
Compiling using the WASI SDK
To compile our module, we simply need to use the clang provided by the SDK and specify where the linker should look for the standard libraries:
$(WASI_SDK_DIR)/bin/clang --target=wasm32-unknown-wasi -Wl,--no-entry -nostartfiles --sysroot $(WASI_SDK_DIR)/share/wasi-sysroot -o module.wasm module.c
As before, there is a lot going on here. These are the new options:
- --target=wasm32-unknown-wasi targets WASM with WASI
- -nostartfiles means we don't want to link in any standard initialization actions provided by the OS
- --sysroot tells the linker where to find the WASI libc
Bundling WASM files for Deno
Update (16th Feb. 2020): This problem has been resolved starting with Deno v0.33.0. This section of this guide will be left here for completeness, but is no longer required.
Update (18th Mar. 2020): Deno has removed .wasm imports for their v1.0 release. Thus this section is relevant again.
This will be irrelevant once this issue is resolved. Until then, Deno will not correctly fetch .wasm binaries over the network. For local files, using
import * as module from "./module.wasm";
is fine. But to reliably import files via the network, we need to embed the WASM binary into our JavaScript source. (We could also fetch it dynamically, but that means requesting network permission and also does not play too well with relative imports.)
I have seen people use hexadecimal encoding. This encodes 1 byte as 2 ASCII bytes, doubling the size. We can do better by using base 64 encoding, which encodes 3 bytes as 4 ASCII bytes and so 'only' adds an extra 33%.
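The difference is easy to check. The sketch below encodes the same bytes both ways and compares the lengths; toHex and toBase64 are small helpers written for this illustration, and the 300 sample bytes are arbitrary stand-ins for a real binary.

```javascript
// Hex: every byte becomes two characters.
function toHex(bytes) {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
}

// Base 64: every group of three bytes becomes four characters.
function toBase64(bytes) {
  let binary = "";
  for (const b of bytes) binary += String.fromCharCode(b);
  return btoa(binary);
}

// Arbitrary sample data standing in for the contents of module.wasm.
const sample = new Uint8Array(300).map((_, i) => i % 256);

console.log(sample.length);           // 300
console.log(toHex(sample).length);    // 600
console.log(toBase64(sample).length); // 400
```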
A base 64 string of our binary is readily obtained, e.g. using:
base64 module.wasm
It is also easily decoded in JavaScript. This gives us the following module file, which can be imported similarly to a bare .wasm as import module from "./module.js":
// WASM binary
const base64 = "\
AGFzbQEAAAABBwFgAn9/AX8DAgEABAUBcA\
EBAQUDAQACBggBfwFBgIgECwcQAgZtZW1v\
cnkCAANhZGQAAAoJAQcAIAEgAGoLAA0Ebm\
FtZQEGAQADYWRkAD4JcHJvZHVjZXJzAQxw\
cm9jZXNzZWQtYnkBBWNsYW5nHjkuMC4wIC\
h0YWdzL1JFTEVBU0VfOTAwL2ZpbmFsKQ==\
";
// Decode base 64 to typed array
function decode(base64) {
  const str = atob(base64);
  const bytes = new Uint8Array(str.length);
  for (let i = 0; i < str.length; i++) {
    bytes[i] = str.charCodeAt(i);
  }
  return bytes;
}

// Export WASM binary instance's exports
const { instance } = await WebAssembly.instantiate(decode(base64));
export default instance.exports;
Thank You!
I'd like to end by saying THANK YOU to the amazing people behind WebAssembly and Deno. I also want to acknowledge the posts and materials I used when figuring out how to do this. The following were especially useful and/or interesting: