How to Build a Quantifier Machine |
|||||||||||||||||
| Issue
#4 in Advanced Programming in Qi This short study is about how to use functional programming to test first-order logic conjectures about finite domains. A program that does this I call a quantifier machine. Some of the material here is drawn from my forthcoming book Logic, Proof and Computation. You can find the link to the text file for the code for this study at the very bottom of this page. Bon voyage! Quantifiers Two common operations used in functional programming are testing a list to see if some element of the list has a property P and testing a list to see if all elements of the list have the property P. These can be encoded in Qi as two higher order functions. (define
some Thus (some [8 9 10 11 12 13] prime?) returns true (meaning that some element of the domain {8 9 10 11 12 13} is prime) whereas (all [11 12] prime?) returns false. What if we want to test if two distinct elements of a domain are prime? The problem with our treatment of quantification is that it lacks the concept of a bound variable. We want to be able to say something like (exists
x [8 9 10 11 12 13] (exists y [8 9 10 11 12 13] (and (and
(prime? x) (prime? y)) (not (= x y))))) An Eager Quantifier Machine
Figure 1 The Quantifier Machine as a Black Box An alternative is to build a special quantifier machine (QM) which will receive quantified first-order sentences and evaluate them using the semantics of first-order logic. The machine will output either true or false (figure 1). If the quantifiers will range over finite domains and the predicates used in the sentence are computable, then the evaluation of the sentence will halt. There are several ways we can tackle this. The easiest is to represent the inputs to the quantifier machine as list structures. A nice bonus of writing first-order sentences as lists, is that we can use the syntax of first-order logic. So our representation can look like this. [exists
x [8 9 10 11 12 13] where prime? is defined as (define prime? (define prime-help The simplest machine which will handle this sort of input I call the Eager Quantifier Machine. Here is the code. (define
quantifier-machine The
penultimate line of the eager quantifier machine (below)
uses on-the-fly currying to handle n-place (n > 1)
predicates by successively generating closures which are
in turn recycled back into the quantifer machine. The
expression (subst Y X P) substitutes Y for X throughout
P. (3-) (quantifier-machine [exists x [8 9 10
11 12 13] [exists y [8 9 10 11 12 13] [[[prime? x] &
[prime? y]] & [~ [x = y]]]]]). (4-) (quantifier-machine [exists x [8 9 10
11 12 13] [exists y [8 9 10 11 12 13] [x = [+ y 4]]]]) Conjectures Involving Large
Domains Twin primes are primes whose difference is 2. 3 and 5 are twin primes and so is 11 and 13. How could we test the following conjecture? There are twin primes between 1000 and 2000. Using our eager quantifier machine, we need to first write down 1001 numbers in a list and then set a symbol to denote the list. This is not practical. The problem is that we have learnt no way of representing very large lists. But there is a way of doing this which uses streams or lazy lists. A stream is a three element list composed of the following items. 1. The
first element is the head of the list. For example, we can represent the stream of natural numbers from 1000 to 2000 by the following stream. [1000 (+ 1) (= 2001)] = is an alternative to the equality relation = which is one of a few special forms in Qi (i.e. like the others if, @p, output, input+, error, cons and let, its not intended to be curried or partially applied and has its own type rule). If streams can be manipulated like lists then we need analogs to the bread-and-butter list functions used in Qi like head, tail and cons. Here are the analogs.
force, delay and end? are not hard to
define. How about push, the analog to cons? If I push an
element x on a stream I should make a larger stream. But
I cannot simply place x at the front of the stream
without making an list of four objects, which is not a
stream. On the other hand, if I simply throw away the
first element y of the stream and replace it with x, then
I have not made a larger stream but simply replaced y. Another useful operation is unpack; which unpacks a stream by turning it into the corresponding list. For instance (unpack [1000 (+ 1) (= 2001)]) will generate [1000 1001 .. 1999 2000]. (define
unpack Armed with streams, we can now approach the twin prime conjecture. All that is needed is to evaluate each of the following. set allocates a value to a symbol and value recovers it. (set
1000to2000 (unpack [1000 (+ 1) (= 2001)])) A Lazy Quantifier Machine This works fine, but an alternative would be to design a lazy quantifier machine (LQM) that operates directly with streams, rather than requiring them to be unpacked into lists. Here it is. (define
lazy-quantifier-machine The LQM is more memory efficient than the eager version, but not only this, but the LQM is actually expressively more powerful, since it allows us to test conjectures that range over infinite domains. For example, domain of natural numbers is just the stream. [0 (+ 1) (/. X false)] Here the terminating function always returns false, so this is an example of an infinite stream. So we can write (lazy-quantifier-machine [exists x [10000000 (+ 1) (/. X false)] [prime? x]]) to test the conjecture that there exists a prime greater than 107 (assuming that we define prime? to be a recognisor for prime numbers). Hence the LQM allows us to test semi-decidable conjectures. If there is a prime, the LQM will halt with true; otherwise it will run forever. Of course we know that for this particular conjecture, the LQM will return true. For universal quantifiers ranging over infinite domains, the LQM will return false if there is a counterinstance and run forever if the conjecture is true. There is a class of undecidable problems which even the LQM cannot help us with; these are characterised by mixed existential and universal quantifiers which both range over infinite domains. Here is an example. There is no greatest prime number. Though the truth of this sentence has been known over 2,000 years, the LQM will not return any verdict on it. Written in first-order notation the sentence appears as "x (prime x) ® $y ((prime y) & (y > x)) Suppose the statement is true. Since it uses a universal quantifier over an infinite domain, the quantifier machine will never return a result. Suppose it is false, then the quantifier machine will eventually find the greatest prime and spend forever looking for a bigger one. In either case, the program is silent. So the LQM is not an omniscient oracle; but still we can make it better. We don't get the full power of quantification until we can manipulate dependent streams. Dependent Streams and Goldbach's Conjecture Goldbachs's Conjecture (first formulated in 1746 by the Prussian mathematician Goldbach) states that every even number > 2 is the sum of two primes. How can we test Goldbach's Conjecture using our technology? One way to do it is this: we start with a stream for the even numbers > 2, for each element X drawn from this stream, we take samples Y and Z from two streams of prime numbers whose minimum is 2 and whose maximum is X. If we find that there is no Y and Z such that X = Y+Z then the Conjecture is false. If there is such a Y and Z, then we take N+2 and repeat. This gives us a semi-decision procedure. If the Conjecture is false, the program will halt. If it is true, it will run forever. Now there is something important here. The ranges of the streams that generate Y and Z are dependent on the value generated from X - this is what I call a dependent stream. However here we hit a snag - our LQM program does not handle dependent streams. The Limitations of the LQM There are two limitations of the LQM.
Bound Variables Revisited We have been searching for an implementation of a bound variable; but functional programming already has bound variables. Lambda abstractions bind variables - can we use this mechanism to implement quantification? The answer is "yes" - if we are prepared to accept a hard syntax. For instance here is a definition of the universal quantifier forall using streams. (define forall An example (forall [3 (+ 2) (= 9)] prime?) means every second number from 3 to 7 is prime. Quantifiers work with streams. So lets define a stream as a type in Qi. (datatype stream X : A; S
: (A --> A); E : (A --> boolean); Here is the definition of exists using streams. (define exists An example; (exists [33 (+ 2) (= 39)] prime?) means at least one of 33, 35, 37 is prime. There are no bound variables in the inputs of these functions, but no matter because lambda-bound variables will do the trick. Here is Goldbach's Conjecture. (forall [4 (+ 2) (/. N false)] (/.
X (exists [2 next-prime (< X)] (/. Y (exists [2
next-prime (< X)] Its looks pretty horrible; lets go through it. The first bit (forall [4 (+ 2) (/. N false)] ... just tells us we are quantifying over the domain of even numbers > 2. The bound variable X that follows does the job of the bound variable in quantification. This makes it hard to read because here the bound variable follows the domain of its quantification. Never mind for the moment. The next bit .... (exists [2 next-prime (< X)] .... says there exists a number in the range of primes between 2 and a value less than X. This number is Y and the last bit ... (exists [2 next-prime (< X)] (/. Z (= X (+ Y Z)))))))) says there exists a number Z in the range of primes between 2 and a value less than X such that X = Y + Z. Phew! So it captures Goldbach's Conjecture and it uses dependent domains and it is amazingly compact. If you look the Qi book, Goldbach's Conjecture occupies 20 lines in chapter 4 at least - excluding the prime program. This chappie does it in one line with a 3 line help function. But its hard to read - some syntactic sugar would be nice. I'll come back to this. The next-prime stuff? Here it is. (define next-prime Super Quantification I'm now going to talk about super quantification which is at a level more abstract than ordinary quantifiers. Super quantifiers allow us to tie together a lot of diverse processes from universal and existential quantification to FOR loops and filters. Super quantifiers begin from an observation. The code for forall and exists is very similar; one uses true and the other false, one uses or and the other and. Suppose we parameterise these definitions to extract the common content - we get the code for super quantification. Here is the code for super quantification. (define super Haskell programmers will recognise the code for super quantification as a close relative of their foldr function. The main difference is that there are 4 and not 3 arguments to the function and the function operates on streams and not lists. What does this function do and why is it so super? Well first we can define forall and exists in it. (define forall Thats not all - we can also define the FOR loop beloved of procedural programmers since FORTRAN was a lad. Here is the code. (define for Lets try it out. Print the numbers 0 to 99. (101+) (for [0 (+ 1) (= 100)] print) Filtering a stream gathers all elements of a certain kind. But with super quantification, filtering over streams is easy. Here is the definition. (define filter Here is an example; find all the perfect squares between 1 and 99. (104+) (filter [1 (+ 1) (= 100)] (/. X
(integer? (sqrt X)))) You can also devise mapping functions over streams and other such stuff with super quantification; I leave you to explore this. One source of annoyance. Our encoding of super quantifiers uses strict evaluation because Qi uses strict evaluation in all functions except or and and if. A higher-order function will be strict in its evaluation and this is also the case with super. Unfortunately, when we pass lazy functions to super like and and or we want the evaluation in super to be lazy too. The only practical solution is to make these special cases in a case statement. The result works, but cannot be type checked as before (the explicit use of booleans in the function puts constraints on the type checker). (define super [X S E] P F Y -> (if (E X) Y
(if (= F or)
(or (P X) (super [(S
X) S E] P F Y))
(if (= F and)
(and (P X) (super [(S X) S E] P F Y)) (F (P X) (super [(S X) S E] P F Y)))))) Sugaring the Syntax Our new treatment of quantification is very powerful, but the syntax leaves something to be desired. Really this is unavoidable, since the position of the lambda bound variables determines the proper evaluation of the whole expression. However it would be nice to have something more readable. Qi contains a facility for changing its own syntax. In fact you can make Qi look like almost anything using this facility. The facility is called macroexpand. macroexpand is a function that is applied to every input, whether typed or read from a file. By default, macroexpand is the identity function i.e. it does nothing. However you, the user, can change the definition of this function to suit your convenience. For example, suppose we want an infix +; the following macroexpansion does it. (0-)(define macroexpand Ergo, if you don't like Qi syntax - invent your own! OK, lets invent our own syntax for quantification to make our lambda stuff more readable. Here is my shot at it. (define macroexpand Now I can enter Goldbach's conjecture in a friendlier form and reduce the bracketing. (forall X [4 (+ 2) (/. N false)] (exists Y [2 next-prime (< X)] (exists Z [2 next-prime (< X)] (= X (+ Y Z))))) Now this can be read quite easily left to right. For all X from the stream [4 (+ 2) (/. X false)], there is a Y from the stream [2 next-prime (< X)] such that there is a Z from the stream [2 next-prime (< X)] such that (= X (+ Y Z)). Olé! Mark Copyright
(c) 2005, Mark Tarver
|
|||||||||||||||||