There are two main benefits WebAssembly provides:
By avoiding the simultaneous asm.js constraints of AOT-compilability and good performance even on engines without specific asm.js optimizations, a new standard makes it much easier to add the features required to reach native levels of performance.
Comparing the two, even for engines which already optimize asm.js, the benefits outweigh the costs.
WebAssembly was designed with a variety of use cases in mind.
We think so. There was an early prototype with demos [1, 2], which showed that decoding a binary WebAssembly-like format into asm.js can be efficient. And as the WebAssembly design has changed there have been more experiments with polyfilling.
Overall, optimism has been increasing for quick adoption of WebAssembly in browsers, which is great, but it has decreased the motivation to work on a polyfill.
Polyfilling WebAssembly to asm.js is also less urgent because alternatives exist. For example, a reverse polyfill (compiling asm.js to WebAssembly) exists, and it allows shipping a single build that can run as either asm.js or WebAssembly. It is also possible to build a project into two parallel asm.js and WebAssembly builds by just flipping a switch in Emscripten, which avoids polyfill time on the client entirely. A third option, for code that is not performance-critical, is to use a compiled WebAssembly interpreter such as binaryen.js.
However, a WebAssembly polyfill is still an interesting idea and should in principle be possible.
As WebAssembly evolves it will support more languages than C/C++, and we hope that other compilers, such as GCC, will support it as well, even for C/C++. The WebAssembly working group found it easier to start with LLVM support because they had more experience with that toolchain from their Emscripten and PNaCl work.
We hope that proprietary compilers also gain WebAssembly support, but we’ll let vendors speak about their own platforms.
The WebAssembly Community Group would be delighted to collaborate with more compiler vendors, take their input into consideration in WebAssembly itself, and work with them on ABI matters.
Yes! WebAssembly defines a text format to be rendered when developers view the source of a WebAssembly module in any developer tool. Also, a specific goal of the text format is to allow developers to write WebAssembly modules by hand for testing, experimenting, optimizing, learning and teaching purposes. In fact, by dropping all the coercions required by asm.js validation, the WebAssembly text format should be much more natural to read and write than asm.js. Outside the browser, command-line and online tools that convert between text and binary will also be made readily available. Lastly, a scalable form of source maps is also being considered as part of the WebAssembly tooling story.
The LLVM compiler infrastructure has a lot to recommend it: it has an existing intermediate representation (LLVM IR) and binary encoding format (bitcode). It has code generation backends targeting many architectures and is actively developed and maintained by a large community. In fact, PNaCl already uses LLVM as a basis for its binary format. However, the goals and requirements that LLVM was designed to meet are subtly mismatched with those of WebAssembly.
WebAssembly has several requirements and goals for its Instruction Set Architecture (ISA) and binary encoding:
LLVM IR is meant to make compiler optimizations easy to implement, and to represent the constructs and semantics required by C, C++, and other languages on a large variety of operating systems and architectures. This means that by default the IR is not portable (the same program has different representations for different architectures) or stable (it changes over time as optimization and language requirements change). It has representations for a huge variety of information that is useful for implementing mid-level compiler optimizations but is not useful for code generation (but which represents a large surface area for codegen implementers to deal with). It also has undefined behavior (largely similar to that of C and C++) which makes some classes of optimization feasible or more powerful, but which can lead to unpredictable behavior at runtime. LLVM’s binary format (bitcode) was designed for temporary on-disk serialization of the IR for link-time optimization, and not for stability or compressibility (although it does have some features for both of those).
None of these problems are insurmountable. For example PNaCl defines a small portable subset of the IR with reduced undefined behavior, and a stable version of the bitcode encoding. It also employs several techniques to improve startup performance. However, each customization, workaround, and special solution means less benefit from the common infrastructure. We believe that by taking our experience with LLVM and designing an IR and binary encoding for our goals and requirements, we can do much better than adapting a system designed for other purposes.
Note that this discussion applies to use of LLVM IR as a standardized format. LLVM’s clang frontend and midlevel optimizers can still be used to generate WebAssembly code from C and C++, and will use LLVM IR in their implementation similarly to how PNaCl and Emscripten do today.
Optimizing compilers commonly have fast-math flags which permit the compiler to relax the rules around floating point in order to optimize more aggressively. This can include assuming that NaNs or infinities don’t occur, ignoring the difference between negative zero and positive zero, making algebraic manipulations which change how rounding is performed or when overflow might occur, or replacing operators with approximations that are cheaper to compute.
These optimizations effectively introduce nondeterminism; it isn’t possible to determine how the code will behave without knowing the specific choices made by the optimizer. This often isn’t a serious problem in native code scenarios, because all the nondeterminism is resolved by the time native code is produced. Since most hardware doesn’t have floating point nondeterminism, developers have an opportunity to test the generated code, and then count on it behaving consistently for all users thereafter.
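As a small illustration of why such rewrites change results, the sketch below uses Python floats (IEEE 754 doubles) to show that reassociating an addition changes the rounded answer, and that the two zeros a fast-math optimizer may treat as interchangeable are in fact distinguishable. The specific values are chosen only for illustration; this does not model any particular compiler.

```python
import math

# Reassociation: a fast-math optimizer may rewrite (a + b) + c as a + (b + c).
a, b, c = 0.1, 0.2, 0.3
left_to_right = (a + b) + c
reassociated = a + (b + c)
print(left_to_right == reassociated)  # False: the two orders round differently

# Signed zero: the two zeros compare equal, yet their signs can be observed.
pos_zero, neg_zero = 0.0, -0.0
print(pos_zero == neg_zero)                 # True
print(math.copysign(1.0, neg_zero))         # -1.0

# NaN: an optimizer that assumes NaNs never occur could fold x != x to False.
nan = float("nan")
print(nan != nan)                           # True
```

Each of these differences is invisible in most programs, which is why the behavior depends on exactly which rewrites the optimizer happened to apply.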
WebAssembly implementations run on the user side, so there is no opportunity for developers to test the final behavior of the code. Nondeterminism at this level could cause distributed WebAssembly programs to behave differently in different implementations, or change over time. WebAssembly does have some nondeterminism in cases where the tradeoffs warrant it, but fast-math flags are not believed to be important enough:
pow, and so on. WebAssembly’s strategy for such functions is to allow them to be implemented as library routines in WebAssembly itself (note that x86’s cos instructions are slow and imprecise and are generally avoided these days anyway). Users wishing to use faster and less precise math functions on WebAssembly can simply select a math library implementation which does so.
mul they don’t have to worry about NaN for example doesn’t make them any faster, because NaN is handled quickly and transparently in hardware on all modern platforms.
The mmap syscall has many useful features. While these are all packed into one overloaded syscall in POSIX, WebAssembly unpacks this functionality into multiple
A significant feature of mmap that is missing from the above list is the ability to allocate disjoint virtual address ranges. The reasoning for this
mmap with what appears to be noncontiguous memory allocation (but, under the hood is just coordinated use of
The benefit of allowing noncontiguous virtual address allocation would be if it allowed the engine to interleave a WebAssembly module’s linear memory with other memory allocations in the same process (in order to mitigate virtual address space fragmentation). There are two problems with this:
The amount of linear memory needed to hold an abstract size_t would then also need to be determined by an abstraction, and then partitioning the linear memory address space into segments for different purposes would be more complex. The size of each segment would depend on how many size_t-sized objects are stored in it. This is theoretically doable, but it would add complexity and there would be more work to do at application startup time.
Also, allowing applications to statically know the pointer size can allow them to be optimized more aggressively. Optimizers can better fold and simplify integer expressions when they have full knowledge of the bitwidth. And, knowing memory sizes and layouts for various types allows one to know how many trailing zeros there are in various pointer types.
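The trailing-zeros point can be sketched as follows. The pointer width, element size, and base address are hypothetical values chosen for illustration: when the size and alignment of a type are known statically, every pointer to an element of that type is known to have a fixed number of low zero bits.

```python
def trailing_zero_bits(value: int) -> int:
    """Count the low-order zero bits of a positive integer."""
    return (value & -value).bit_length() - 1

POINTER_BITS = 32  # assumed fixed pointer width
ELEMENT_SIZE = 8   # hypothetical element type: 8 bytes, 8-byte aligned
BASE = 0x1000      # hypothetical aligned base address of an array

# Addresses of the first few elements, wrapped to the pointer width.
addresses = [(BASE + i * ELEMENT_SIZE) % (1 << POINTER_BITS) for i in range(4)]

# Knowing the layout guarantees log2(ELEMENT_SIZE) trailing zero bits.
guaranteed_zeros = trailing_zero_bits(ELEMENT_SIZE)
print(guaranteed_zeros)  # 3
print(all(trailing_zero_bits(addr) >= guaranteed_zeros for addr in addresses))  # True
```

With an abstract pointer size, the element size and therefore this guarantee would not be available until the abstraction is resolved.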
Also, C and C++ deeply conflict with the concept of an abstract size_t. Constructs like sizeof are required to be fully evaluated in the front-end of the compiler because they can participate in type checking. And even before that, it’s common to have predefined macros which indicate pointer sizes, allowing code to be specialized for pointer sizes at the very earliest stages of compilation. Once specializations are made, information is lost, scuttling attempts to introduce abstractions.
And finally, it’s still possible to add an abstract size_t in the future if the need arises and practicalities permit it.
A great number of applications don’t ever need as much as 4 GiB of memory. Forcing all these applications to use 8 bytes for every pointer they store would significantly increase the amount of memory they require, and decrease their effective utilization of important hardware resources such as cache and memory bandwidth.
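A rough back-of-the-envelope sketch of this cost follows. It assumes a hypothetical binary tree node (two pointers and a 4-byte key) and a simplified layout rule in which every field is aligned to its own size; real ABIs may differ in detail.

```python
def struct_size(field_sizes):
    """Size of a C-like struct, assuming each field is aligned to its own
    size and the struct is padded to its largest field alignment."""
    offset = 0
    for size in field_sizes:
        offset = (offset + size - 1) // size * size  # round up to alignment
        offset += size
    max_align = max(field_sizes)
    return (offset + max_align - 1) // max_align * max_align

def tree_node_size(pointer_size):
    # Hypothetical node: left pointer, right pointer, 4-byte key.
    return struct_size([pointer_size, pointer_size, 4])

node32 = tree_node_size(4)
node64 = tree_node_size(8)
print(node32, node64)  # 12 24

nodes = 1_000_000
print(nodes * node32, nodes * node64)  # 12000000 24000000
```

Under these assumptions the pointer-heavy structure doubles in size, so half as many nodes fit in the same cache or memory bandwidth budget.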
The motivations and performance effects here should be essentially the same as those that motivated the development of the x32 ABI for Linux.
Even Knuth found it worthwhile to give us his opinion on this issue at some point, in a flame about 64-bit pointers.