Poincare takes text input such as 1+2*3 and turns it into a tree structure that can be simplified, approximated and pretty-printed.
Each node of a tree represents either an operator or a value. All nodes have a type (Type::Multiplication…) and some also store a value.
Depending on their type, expressions are either childless (Type::Rational) or store pointers to their children (we call those children operands). To ease tree traversal, each node also keeps a pointer to its parent: that information is somewhat redundant but makes dealing with the expression tree much easier.
N-ary operations such as Addition are the only types that can hold an arbitrary number of operands. Other expressions have a fixed number of operands: for instance, an AbsoluteValue will only ever have one child.
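The node structure described above can be sketched as follows. This is a minimal illustration, not Poincare's actual class hierarchy: the names `Node`, `addOperand`, `depth` and `buildExample` are hypothetical, and values are reduced to a single `int` for brevity.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical names for illustration only.
enum class Type { Rational, Addition, Multiplication, AbsoluteValue };

struct Node {
  Type type;
  int value;                                    // only meaningful for value nodes
  Node* parent = nullptr;                       // redundant, but eases traversal
  std::vector<std::unique_ptr<Node>> operands;  // empty for childless nodes

  explicit Node(Type t, int v = 0) : type(t), value(v) {}

  // Attach a child and keep its parent pointer in sync.
  Node* addOperand(std::unique_ptr<Node> child) {
    child->parent = this;
    operands.push_back(std::move(child));
    return operands.back().get();
  }
};

// Walk up the parent pointers to count how deep a node sits in the tree.
int depth(const Node* n) {
  int d = 0;
  while (n->parent != nullptr) {
    n = n->parent;
    ++d;
  }
  return d;
}

// Build the tree for 1+2*3: an Addition whose second operand is a Multiplication.
std::unique_ptr<Node> buildExample() {
  auto root = std::make_unique<Node>(Type::Addition);
  root->addOperand(std::make_unique<Node>(Type::Rational, 1));
  Node* product = root->addOperand(std::make_unique<Node>(Type::Multiplication));
  product->addOperand(std::make_unique<Node>(Type::Rational, 2));
  product->addOperand(std::make_unique<Node>(Type::Rational, 3));
  return root;
}
```

Thanks to the parent pointer, a helper such as `depth` needs no extra bookkeeping: it simply climbs from a node to the root.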
The type of a C++ object is used by the compiler to generate a vtable. A vtable is a lookup table that tells which function to call for a given object class, hence creating polymorphism. Once the vtable has been built, the compiler completely discards the type information of a given object.
The problem with vtables is that they allow polymorphism based on a single class only (single dispatch): you can have different code called on a Node depending on whether it’s an addition or a multiplication. But vtables can’t handle dynamic behavior that depends on the types of two objects at once.
That case happens quite often in Poincare: for example, if an expression contains the addition of another addition, we can merge both nodes into a single one: $1+(\pi+x)$ becomes $1+\pi+x$ (see figure below). We want to apply this behavior only if both nodes are additions.
The C++ standard has support for keeping type information at runtime, a feature known as RTTI. However, that feature is quite comprehensive and overkill for what we needed, so we implemented an equivalent solution manually: each expression subclass implements a type() function that returns its type.
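To illustrate the idea, here is a minimal sketch of a manual type() function used to merge nested additions. Apart from type() and the Type enumeration mentioned in the text, every name here (`Expression` as laid out below, `Symbol`, `mergeNestedAdditions`) is an assumption made for the example and does not claim to match Poincare's real code.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

enum class Type { Symbol, Addition };

class Expression {
public:
  virtual ~Expression() = default;
  // Manual replacement for RTTI: each subclass reports its own type.
  virtual Type type() const = 0;
  std::vector<std::unique_ptr<Expression>> operands;
};

class Symbol : public Expression {
public:
  explicit Symbol(char n) : name(n) {}
  Type type() const override { return Type::Symbol; }
  char name;
};

class Addition : public Expression {
public:
  Type type() const override { return Type::Addition; }

  // Merge operands that are themselves additions: 1+(pi+x) becomes 1+pi+x.
  // Only one level is flattened, which is enough if operands are already flat.
  void mergeNestedAdditions() {
    std::vector<std::unique_ptr<Expression>> merged;
    for (auto& op : operands) {
      if (op->type() == Type::Addition) {
        // Parent and child are both additions: adopt the grandchildren.
        for (auto& grandchild : op->operands) {
          merged.push_back(std::move(grandchild));
        }
      } else {
        merged.push_back(std::move(op));
      }
    }
    operands = std::move(merged);
  }
};
```

The decision depends on two types at once: the code lives in `Addition` (first type, resolved by the vtable) and inspects `op->type()` (second type, resolved by the manual check).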
Lexing and parsing are done by a homemade lexer and parser available here.
Expression simplification is done in place and directly modifies the expression. Simplifying is a two-step process: first the expression is reduced, then it is beautified. So far, we have excluded matrices from the simplification process to avoid the extra complexity caused by the non-commutativity of matrix multiplication.
To simplify an expression, one needs to find relevant patterns. Searching for a given pattern can be extremely slow if done the wrong way. To make pattern searching much more efficient, we sort the operands of commutative operations.
To sort those operands, we defined an order on expressions with the following features:
In the example, both root nodes are r so we compare their last operands. Both are equal to $\pi$ so we compare the next operands. As 3 > 2, this settles the order relation between the two expressions.
Moreover, the simplification order has a few additional rules:
A Rational is always the first operand.
Comparing an Addition a with an Expression e is equivalent to comparing a with an Addition whose single operand is e. Same goes for the Multiplication.
To compare a Power p with an Expression e, we compare $p$ with $e^1$.
Thanks to these rules, the order groups similar terms together and thus avoids quadratic complexity when factorizing. For example, it groups expressions with the same base together (e.g. $\pi$ and $\pi^3$) and terms with the same non-rational factors together (e.g. $\pi$ and $2*\pi$).
Last but not least, as this order is total, it makes checking if two expressions are identical very easy.
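A total order of this kind can be sketched as a recursive three-way comparison. The criteria below (type first, then value, then operands from last to first, then operand count) are assumptions chosen to mirror the worked example above; they are not Poincare's actual rules, and `Expr`, `compare`, `sortOperands` and `identical` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// The enumerator order doubles as the type ranking (an assumption).
enum class Type { Rational, Symbol, Power, Multiplication, Addition };

struct Expr {
  Type type;
  int value;  // rational value or symbol id; 0 for operators
  std::vector<Expr> operands;
};

Expr rational(int v) { return Expr{Type::Rational, v, {}}; }
Expr symbol(int id) { return Expr{Type::Symbol, id, {}}; }

// Returns a negative, zero or positive number, in the spirit of strcmp.
int compare(const Expr& a, const Expr& b) {
  if (a.type != b.type) return a.type < b.type ? -1 : 1;
  if (a.value != b.value) return a.value < b.value ? -1 : 1;
  // Compare operands pairwise, starting from the last one.
  std::size_t i = a.operands.size();
  std::size_t j = b.operands.size();
  while (i > 0 && j > 0) {
    int c = compare(a.operands[--i], b.operands[--j]);
    if (c != 0) return c;
  }
  if (i == j) return 0;   // both operand lists exhausted
  return i < j ? -1 : 1;  // the expression with fewer operands comes first
}

// Sort the operands of a commutative operation with the order above.
void sortOperands(Expr& e) {
  std::sort(e.operands.begin(), e.operands.end(),
            [](const Expr& a, const Expr& b) { return compare(a, b) < 0; });
}

// Because the order is total, equality is just "compares equal".
bool identical(const Expr& a, const Expr& b) { return compare(a, b) == 0; }
```

With this sketch, two products whose last operand is the same symbol are ordered by their next operand, and sorting a sum moves its rational terms to the front.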
The reduction phase is the most important part of simplification. It happens recursively and bottom-up: we first reduce the operands of an expression before reducing the expression itself. That way, when reducing itself, an expression can assume that its operands are already reduced (and thus rely on some very useful knowledge such as “there is no Subtraction among my operands”). Every type of Expression has its own reduction rules.
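The bottom-up recursion can be sketched as follows. The specific rewrite used here, turning $a-b$ into $a+(-1)*b$, is an assumption chosen for illustration, and `Expr`, `reduce` and `contains` are hypothetical names rather than Poincare's API.

```cpp
#include <cassert>
#include <utility>
#include <vector>

enum class Type { Rational, Addition, Subtraction, Multiplication };

struct Expr {
  Type type;
  int value;  // only meaningful for Rational
  std::vector<Expr> operands;
};

Expr rational(int v) { return Expr{Type::Rational, v, {}}; }

void reduce(Expr& e) {
  // Bottom-up: operands are reduced before the expression itself.
  for (Expr& op : e.operands) reduce(op);

  // At this point no operand contains a Subtraction any more.
  if (e.type == Type::Subtraction) {
    // Assumed rewrite: a-b becomes a+(-1)*b.
    Expr negated{Type::Multiplication, 0,
                 {rational(-1), std::move(e.operands[1])}};
    Expr lhs = std::move(e.operands[0]);
    e = Expr{Type::Addition, 0, {std::move(lhs), std::move(negated)}};
  }
}

// Returns true if the tree rooted at e contains a node of type t.
bool contains(const Expr& e, Type t) {
  if (e.type == t) return true;
  for (const Expr& op : e.operands) {
    if (contains(op, t)) return true;
  }
  return false;
}
```

After `reduce` runs on $5-(3-1)$, the whole tree is made of Addition, Multiplication and Rational nodes only, which is the kind of invariant each node can rely on when applying its own rules.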
To decrease the set of possible expression types in reduced expressions, we turn … into Power and so on:
Here is a short tour of the reduction rules for the main expression types:
Additions are reduced by applying common mathematics rules
Multiplications apply the following rules
Powers apply the following rules
To avoid infinite loops, reduction is contextualized on the parent expression. As a consequence, an expression can only be reduced once it has been attached to its parent expression.
This phase rewrites expressions in a more readable form. Divisions, subtractions and Naperian logarithms reappear at this step. Parentheses are also added so that the tree can be printed in infix notation without any ambiguity. This phase is also recursive, but top-down: we first beautify the node itself and then beautify its operands.
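As a rough sketch of the top-down traversal, the function below restores a Subtraction from the pattern $a+(-1)*b$. Both the pattern and the names (`Expr`, `beautify`) are assumptions made for this example; the real beautification rules are richer and also handle divisions, logarithms and parentheses.

```cpp
#include <cassert>
#include <utility>
#include <vector>

enum class Type { Rational, Addition, Subtraction, Multiplication };

struct Expr {
  Type type;
  int value;  // only meaningful for Rational
  std::vector<Expr> operands;
};

Expr rational(int v) { return Expr{Type::Rational, v, {}}; }

void beautify(Expr& e) {
  // Top-down: the node itself is beautified first...
  if (e.type == Type::Addition && e.operands.size() == 2) {
    Expr& rhs = e.operands[1];
    bool isNegation = rhs.type == Type::Multiplication &&
                      rhs.operands.size() == 2 &&
                      rhs.operands[0].type == Type::Rational &&
                      rhs.operands[0].value == -1;
    if (isNegation) {
      // Assumed rule: a+(-1)*b is displayed as a-b.
      Expr b = std::move(rhs.operands[1]);
      e.operands[1] = std::move(b);
      e.type = Type::Subtraction;
    }
  }
  // ...and only then its operands.
  for (Expr& op : e.operands) beautify(op);
}
```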
Expressions can be approximated with the method approximate(), which returns another (dynamically allocated) expression that can be either:
To approximate an expression, we first approximate its operands (which are guaranteed to be either a complex or a matrix of complexes) and then approximate the expression depending on its type (an Addition adds its operand approximations, for example).
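The recursion can be sketched as below. This is a simplified illustration under stated assumptions: it returns a `std::complex<double>` by value instead of a dynamically allocated expression, it ignores matrices entirely, and the `Expr` layout is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <vector>

enum class Type { Rational, Addition, Multiplication, AbsoluteValue };

struct Expr {
  Type type;
  int numerator;    // only meaningful for Rational
  int denominator;  // only meaningful for Rational, must be non-zero
  std::vector<Expr> operands;
};

std::complex<double> approximate(const Expr& e) {
  // First approximate the operands...
  std::vector<std::complex<double>> values;
  for (const Expr& op : e.operands) values.push_back(approximate(op));

  // ...then combine them according to the type of the expression.
  switch (e.type) {
    case Type::Rational:
      return {static_cast<double>(e.numerator) / e.denominator, 0.0};
    case Type::Addition: {
      std::complex<double> sum = 0.0;
      for (const auto& v : values) sum += v;
      return sum;
    }
    case Type::Multiplication: {
      std::complex<double> product = 1.0;
      for (const auto& v : values) product *= v;
      return product;
    }
    case Type::AbsoluteValue:
      return {std::abs(values.at(0)), 0.0};  // exactly one operand expected
  }
  return {0.0, 0.0};  // not reached: all enumerators are handled above
}
```

For instance, approximating $\frac{1}{2}+|-3|*2$ first evaluates the rational leaves, then the absolute value and the product, and finally the sum.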
Poincare is responsible for laying out expressions in 2D as in a textbook. The ExpressionLayout class represents the layout on screen of an Expression, and can be derived from an Expression by a function call.
ExpressionLayout is also a tree structure, although the layout tree does not exactly follow the expression tree.
ExpressionLayout is useful to align several layouts relative to each other.