Oct 01 2011

Shakespearean Simians

Category: My Web Log | Admin @ 10:53

French mathematician Émile Borel was among the first to pose versions of these questions (not in their original form): How many monkeys would it take to reproduce a work of Shakespeare (or any other piece of literature), and how long would the process take? And, given infinitely many monkeys or unlimited time, what is the probability of success? The method: the monkeys all type at random on standard 50-key typewriters.

To give a sense of the scale of the task, I will borrow some statistics and a quote from Seth Lloyd’s Programming the Universe. The figures assume a standard 50-key typewriter. Ignoring capitalization, the probability of randomly typing ‘h’ is 1 in 50; of typing ‘ha’, 1 in 2,500; of typing ‘ham’, 1 in 125,000. Typing the 22 characters of ‘hamlet. act i, scene i’ has a probability of 1 in 50^22, which is on the order of 10^-38 (approximately, “it would take a billion billion monkeys, each typing ten characters per second, for each of the roughly billion billion seconds since the universe began”).
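The arithmetic above is easy to check. The following is a minimal Python sketch, assuming every keystroke is independent and uniformly distributed over the 50 keys; the function name `probability_of_typing` is made up for illustration. The last lines give a rough reading of Lloyd's quote by treating every character typed as the start of a fresh attempt, which is an approximation.

```python
from fractions import Fraction

KEYS = 50  # keys on the typewriter, as assumed in the text


def probability_of_typing(text: str, keys: int = KEYS) -> Fraction:
    """Chance that len(text) uniformly random keystrokes spell out text exactly."""
    return Fraction(1, keys ** len(text))


for target in ["h", "ha", "ham", "hamlet. act i, scene i"]:
    p = probability_of_typing(target)
    print(f"{target!r}: 1 in {p.denominator:,} (about {float(p):.1e})")

# Rough check of the quote: a billion billion monkeys, ten characters per
# second, for a billion billion seconds. Counting each character typed as
# one attempt is a simplification.
attempts = 10**18 * 10 * 10**18
expected_hits = attempts * probability_of_typing("hamlet. act i, scene i")
print(f"expected matches: about {float(expected_hits):.2f}")
```

Under these assumptions the expected number of matches comes out below one, which fits the quote's point: even at that scale the monkeys would be lucky to produce the opening line once.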

Many experiments have been carried out to answer Émile Borel’s question using both real and virtual monkeys, but most of them have failed or come to a standstill. One of the latest to try is Jesse Anderson, an American programmer. Equipped with the Hadoop framework and Amazon’s cloud service, EC2, Mr. Anderson started the virtual project in August and has recently reported a 99.990% completion rate for Shakespeare’s collected works (~3,695,990 characters) using millions of virtual monkeys. How is this possible within such a short time period? Mr. Anderson’s success isn’t due to intelligent algorithms or secret access to quantum computers; it comes from the very convenient constraints he has built into his program. One constraint is that punctuation and spaces are disregarded; another is that each monkey produces only a 9-character text string at each time interval. The latter constraint lets Mr. Anderson check each generated string against Shakespeare’s works and keep any that match a passage, which explains his high success rate.
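To illustrate why these constraints help so much, here is a small single-machine Python sketch of the chunk-matching idea. It is not Mr. Anderson's Hadoop code: the 26-letter lowercase alphabet, the function names, and the coverage bookkeeping are all assumptions made for illustration. The demo also uses 3-letter chunks on one line of Hamlet rather than 9-letter chunks on the complete works, because there are 26^9 (about 5.4 trillion) possible 9-letter strings, far too many to sample in a quick example.

```python
import random
import string


def normalize(text: str) -> str:
    """Keep letters only, lowercased (no spaces or punctuation)."""
    return "".join(ch for ch in text.lower() if ch in string.ascii_lowercase)


def simulate(target: str, chunk_len: int, attempts: int, seed: int = 0) -> float:
    """Return the fraction of target characters covered by matched random chunks."""
    rng = random.Random(seed)
    # Map every chunk_len-letter window of the target to its start positions.
    windows = {}
    for i in range(len(target) - chunk_len + 1):
        windows.setdefault(target[i:i + chunk_len], []).append(i)
    covered = [False] * len(target)
    for _ in range(attempts):
        # One "monkey" types chunk_len random letters.
        chunk = "".join(rng.choices(string.ascii_lowercase, k=chunk_len))
        # Keep the chunk only if it appears somewhere in the target.
        for start in windows.get(chunk, ()):
            for j in range(start, start + chunk_len):
                covered[j] = True
    return sum(covered) / len(target) if target else 0.0


text = normalize("To be, or not to be, that is the question.")
print(f"{simulate(text, chunk_len=3, attempts=200_000):.1%} of characters matched")
```

The key point is that a short chunk only has to match some passage of the text, not the whole text in order, so coverage accumulates piece by piece instead of requiring one astronomically unlikely run.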

Without any constraints, Borel’s project is nearly impossible to simulate on contemporary computers. Until quantum computing becomes ubiquitous, the practical answer to Émile Borel’s question will remain ‘infinity’. Various computer scientists and physicists have suggested that it would be far easier for randomly typing monkeys to recreate computer programs, which are often shorter, less imaginative, and less coherent than literature, than to create masterpieces. So this raises the question: How many monkeys would it take to randomly write Jesse Anderson’s computer program, and how long would it take?