Lately I spent some time writing a parser using NimbleParsec and was really pleased with how the library works. So I decided to write up a small tutorial on building a parser for boolean logic and how I approached building the grammar and the parser.
This is what we will attempt to parse:
true
andfalse
The boolean constants()
Grouping of expressionsnot
Negation of expressionsand
Logical conjunction of expressionsor
Logical disjunction of expressions
Precedence rules
To correctly parse boolean expressions we need to take into account the operator precedence rules. They are listed underneath from highest to lowest precedence.
# (highest to lowest)
() -> not -> and -> or
# Some examples how precedence rules apply:
A || B && C == (A || (B && C))
(A || B) && C == ((A || B) && C)
! A && B && C == (((!A) && B) && C)
Grammar for our language
Lets get started by defining an EBNF for our language. To be honest for me this is where I spent most of the time. Once the grammar is defined, the implementation in NimbleParsec is just a walk in the park.
We will build it up while we go along.
Constants
Lets define our true and false values, and begin defining our expression as one of these constants.
<expr> :== <const>
<const> :== "true" | "false"
Grouping expressions
Now that we have our constants defined we can start implementing the operators following the operator precedence. First one to tackle is: ()
, which allows grouping expressions.
<expr> :== "(" <expr> ")" | <const>
<const> :== "true" | "false"
As you can see the <expr>
just became recursive. It is not that powerful yet because we only have <const>
in our grammar, but it will be pretty amazing later on.
Negation
Lets make it possible to negate our <expr>
, our grammar becomes:
<expr> :== <not> <expr> | "(" <expr> ")" | <const>
<const> :== "true" | "false"
<not> :== "!"
With our grammar so far we can declare expressions like:
!(!(!true))
!!false
Conjunction
So this one is a bit more involved, basically we want to make it possible to repeat the and
operator zero or multiple times:
<expr> && <expr> && <expr>
We are going to rename the <expr>
we had previously defined into <factor>
so we can optionally repeat it. Furthermore we will add the <and>
operator.
Our grammar now becomes:
<expr> :== <factor> {<and> <factor>}
<factor> :== <not> <factor> | "(" <expr> ")" | <const>
<const> :== "true" | "false"
<not> :== "!"
<and> :== "&&"
In EBNF {...}
means 0 or multiple repetitions of what is inside the braces. So <factor> {<and> <factor>}
reads as <factor>
optionally followed by multiple <and> <factor>
Its worth noting that we did not change the "(" <expr> ")"
inside <factor>
…
Disjunction
Finishing with the: or
operator, basically this operator is just like the <and>
operator, and like in the previous step joins the <expr>
zero or multiple times:
<expr> || <expr> || <expr>
Again we rename the <expr>
we had previously defined, this time into <term>
. And we add the <or>
operator.
Our grammar finally becomes:
<expr> :== <term> {<or> <term>}
<term> :== <factor> {<and> <factor>}
<factor> :== <not> <factor> | "(" <expr> ")" | <const>
<const> :== "true" | "false"
<not> :== "!"
<and> :== "&&"
<or> :== "||"
Again we did not change the "(" <expr> ")"
inside <factor>
Parser implementation
Ok, now we can start building our parser using the grammar we defined above.
Some remarks:
I will not add any whitespace handling to keep the code concise. You can have a look at https://github.com/slapers/ex_sel which handles whitespaces and adds arithmetics and comparison operations.
You can find the parser in the following gist
Project setup
Create a new project using mix:
$ mix new parser
$ cd parser
Edit the mix.exs file and add the nimble parsec dependency:
defp deps do
[{:nimble_parsec, "~> 0.4"}]
end
Install the dependencies:
$ mix deps.get
Parser code and tests
defmodule Parser do
import NimbleParsec
not_ = string("!") |> label("!")
and_ = string("&&") |> replace(:&&) |> label("&&")
or_ = string("||") |> replace(:||) |> label("||")
lparen = ascii_char([?(]) |> label("(")
rparen = ascii_char([?)]) |> label(")")
# <const> :== "true" | "false"
true_ = string("true") |> replace(true) |> label("true")
false_ = string("false") |> replace(false) |> label("false")
const = choice([true_, false_]) |> label("boolean")
# <factor> :== <not> <expr> | "(" <expr> ")" | <const>
negation = not_ |> ignore |> parsec(:factor) |> tag(:!)
grouping = ignore(lparen) |> parsec(:expr) |> ignore(rparen)
defcombinatorp(:factor, choice([negation, grouping, const]))
# <term> :== <factor> {<and> <factor>}
defcombinatorp(
:term,
parsec(:factor)
|> repeat(and_ |> parsec(:factor))
|> reduce(:fold_infixl)
)
# <expr> :== <term> {<or> <term>}
defparsec(
:expr,
parsec(:term)
|> repeat(or_ |> parsec(:term))
|> reduce(:fold_infixl)
)
defp fold_infixl(acc) do
acc
|> Enum.reverse()
|> Enum.chunk_every(2)
|> List.foldr([], fn
[l], [] -> l
[r, op], l -> {op, [l, r]}
end)
end
end
And to top it off some tests:
defmodule ParserTest do
use ExUnit.Case
doctest Parser
defp parse(input), do: input |> Parser.expr() |> unwrap
defp unwrap({:ok, [acc], "", _, _, _}), do: acc
defp unwrap({:ok, _, rest, _, _, _}), do: {:error, "could not parse" <> rest}
defp unwrap({:error, reason, _rest, _, _, _}), do: {:error, reason}
@err "expected boolean while processing !, " <>
"followed by factor or (, followed by expr, followed by ) or boolean"
test "parses consts" do
assert true == parse("true")
assert false == parse("false")
assert {:error, @err} == parse("1")
end
test "parses negations" do
assert {:!, [true]} == parse("!true")
assert {:!, [false]} == parse("!false")
end
test "parses conjunctions" do
assert {:&&, [{:!, [true]}, false]} == parse("!true&&false")
assert {:!, [{:&&, [false, true]}]} == parse("!(false&&true)")
end
test "parses disjunctions" do
assert {:||, [true, {:&&, [true, false]}]} == parse("true||true&&false")
assert {:&&, [{:||, [true, true]}, false]} == parse("(true||true)&&false")
end
end