The semiotic argument against naturalism

Although I am not really interested in evolution debates generally, I recently came across a piece written by John Lennox that seemed to be so fresh and interesting that my mind ticked over with its implications until late that night. Lennox is a mathematician at Oxford, and a prolific author and debater.

Few responsible scientists would deny evolution wholesale. I think that God probably created in a way that we can – and should – investigate scientifically, just that Darwinian evolution as proposed by Dawkins et al. wasn’t it. I don’t doubt evolution for theological reasons but simply because it is incomplete as a scientific theory. Some scientistic fundamentalists like Richard Dawkins do make grandiose claims about its explanatory power, but most serious secular scientists recognise the severe shortcomings of the theory.

Over the years, several arguments against purely Darwinian evolution have also been advanced. Michael Behe’s argument from irreducible complexity has endured much scrutiny, but has not been refuted as far as I can tell. However, since it requires in-depth knowledge of molecular biology not available to most people, it isn’t always an effective tool for layperson discussions. I think Alvin Plantinga’s Evolutionary Argument Against Naturalism (EAAN) does provide his “undefeated defeater”, but after critiques and rejoinders the logic becomes so advanced that most people can’t follow, and so it isn’t practical either. Enter the semiotic argument against naturalism. By applying powerful (and easily understood) results from modern mathematics and information theory to natural systems, it seems to me that a sound argument for the logical impossibility of naturalistic origins of the DNA molecule can be made. Lennox’s thorough treatment in Chapters 8-11 of God’s Undertaker is highly recommended, and what follows here is a brief and heavily simplified summary of his argument.

The argument can be written as follows:

1. Biological systems are information systems
2. The information contained in a DNA molecule is algorithmically incompressible.
3. Such information-producing algorithms aren’t present in nature
4. Algorithms that produce incompressible pieces of information have to themselves be more complex, or receive a more complex input of information, than that which they produce, and therefore do not produce new information.
5. Therefore the algorithm of evolution by natural selection (or any other unguided process) cannot produce any new information, including that contained in the DNA molecule.

1) Biological systems are information systems

What lies at the heart of every living thing is not a fire, warm breath, nor a “spark of life”. It is information, words, instructions… Think of billions of discrete digital characters… If you want to understand life, think about digital technology. Richard Dawkins.

The concept of information is central both to genetics and evolutionary theory. John Maynard Smith

The problem of the origin of life is clearly basically equivalent to the problem of the origin of biological information. Bernd-Olaf Küppers

In Shannon information theory (which concerns the transmission of information in a channel), there is a difference between syntactic and semantic information. EVOY OL UI represents syntactic information: 10 symbols. But the symbols don’t mean anything; they are not semantic. I LOVE YOU contains the same letters, but now represents semantic information in a way that no other arrangement of letters would convey in the English language.

The DNA molecule contains about 7 billion bits of semantic information, coded with a 4-letter alphabet: A, G, C, and T (adenine, guanine, cytosine, and thymine). Their order in the genome, along with the RNA molecule, codes for the production of various proteins, like instructions in a computer program or a recipe book, except that this recipe book would fill a whole library. Clearly, biological systems are information systems of a high order.
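
The 7-billion-bit figure can be reproduced by simple counting. This is a minimal sketch, assuming a genome of about 3.5 billion letters (the length used later in this post) and a fixed-length code over the four-letter alphabet; the variable names are illustrative only.

```python
import math

ALPHABET = "AGCT"                # the four DNA letters
GENOME_LENGTH = 3_500_000_000    # assumed length in letters (figure used later in the post)

# A fixed-length code needs log2(alphabet size) bits per letter.
bits_per_letter = math.log2(len(ALPHABET))
total_bits = int(bits_per_letter * GENOME_LENGTH)

print(bits_per_letter)  # bits needed per letter
print(total_bits)       # total bits for the whole sequence
```

The count depends only on the alphabet size and the length of the sequence, not on what the sequence means.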

2) The information contained in a DNA molecule is algorithmically incompressible.

Assume that you come across two strings 3.5 billion letters long: The one says ILOVEYOUILOVEYOUILOVEYOU… and the other is a string of random letters typed by a monkey at a keyboard. Are both equally complex? Not at all: the first string is simply ILOVEYOU repeated many times, and can be produced by an algorithm like

Repeat “ILOVEYOU” 437.5 million times.

The second string cannot be shortened in an algorithm like the one above. Lennox remarks that algorithmic compressibility is a good measure of randomness.
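
The difference between the two strings can be tried out on a small scale. The sketch below is only an approximation of the idea: it uses strings of 80,000 letters instead of billions, a seeded pseudorandom generator as a stand-in for the monkey, and zlib as a stand-in for "the shortest algorithm". A general-purpose compressor gives only an upper bound on how short a description can be, so the numbers are indicative, not a proof.

```python
import random
import zlib

N = 80_000  # scaled-down length; the post's strings are billions of letters long

# String 1: ILOVEYOU repeated, as in the "Repeat ..." algorithm.
repeated = ("ILOVEYOU" * (N // 8)).encode("ascii")

# String 2: stand-in for the monkey's typing, using a seeded generator
# so the run is reproducible (an assumption made for this sketch).
rng = random.Random(0)
letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
monkey = "".join(rng.choice(letters) for _ in range(N)).encode("ascii")

def compressed_size(data: bytes) -> int:
    """Size in bytes after zlib compression at the highest level."""
    return len(zlib.compress(data, 9))

print(len(repeated), compressed_size(repeated))  # original vs compressed size
print(len(monkey), compressed_size(monkey))      # original vs compressed size
```

The repeated string shrinks to a tiny fraction of its length, while the random-looking string shrinks only as far as its 26-letter alphabet allows.
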
Now suppose you have a string 3.5 billion letters long, but that the letters are those of the books in a library. They are as complex as the monkey’s string in the sense that they are incompressible, but they exhibit specified complexity. That is, the information is semantic and intelligible to us because we have learned the English language independently of them, and can interpret them. Both random strings and strings of specified complexity are therefore algorithmically incompressible, but the latter convey semantic information, whilst the former don’t. An analogy of ink spilled on a page vs. a message written on a page can also be used: both geometries are equally unlikely, but the one conveys information, whilst the other doesn’t.

Suppose now that you have a 10-word sentence. There are 3 628 800 ways in which its words could be ordered, but only 1 grammatical way in which it conveys the correct meaning. In the same way, there are 10^320 sequence alternatives for the genome to code the simplest biologically significant amino acids, and only a few of them work.
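
The number of orderings of a 10-word sentence is 10 factorial, assuming all ten words are distinct. A one-line check in Python:

```python
import math

words = 10
orderings = math.factorial(words)  # ways to arrange 10 distinct words
print(orderings)  # 3628800
```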

Notice here that the argument isn’t about likelihood and improbability; it is about the information content of the genome, and the fact that such information is algorithmically incompressible.

3) Such information-producing algorithms aren’t present in nature

Can specific randomness be the guaranteed product of a mechanical, law-like process, like a primordial soup left to the mercy of familiar laws of physics and chemistry? No it couldn’t. No known law of nature could achieve this… We conclude that biologically relevant macromolecules simultaneously possess two vital properties: randomness and extreme specificity. A chaotic process could possibly achieve the former property but would have a negligible probability of achieving the latter. Paul Davies, The Fifth Miracle

Pre-biological evolution is a contradiction in terms. Theodosius Dobzhansky.

Notice that careful scientists use the word ‘negligible’ to describe the probability of this situation arising accidentally or randomly. However, this is not a fine-tuning argument relying on small probabilities; no-one seriously believes that semantic information like this arises randomly. The question, for all scientists, is ‘by what mechanism did it arise’?

One possible objection to this might be that the DNA molecule could be determined somehow through its physics and chemistry, a kind of “self-assembly” as is sometimes present in nature, and that it therefore isn’t as unlikely as we might think. However, this possibility has been roundly dismissed:

Whatever may be the origin of a DNA configuration, it can function as a code only if its order is not due to the forces of potential energy. It must be as physically indeterminate as the sequence of words is on a printed page. Michael Polanyi

… the message [printed on the page] is not derivable from the physics and chemistry of paper and ink. John Lennox

Attempts to relate the idea of order with biological organization or specificity must be regarded as a play on words which cannot stand careful scrutiny. Informational macromolecules can code genetic messages and therefore carry information because the sequence of bases or residues is affected very little, if at all, by physicochemical factors. Hubert Yockey, Information Theory and Biology

4) Algorithms that produce incompressible information have to themselves be longer, or receive a more complex input of information, than that which they produce, and therefore do not produce new information.

A machine does not create any new information, but it performs a very valuable transformation of known information. Léon Brillouin

The brilliant mathematician Kurt Gödel, described by many as the best logician since Aristotle, laid the groundwork for this premise. Gödel was an Austrian mathematician and philosopher who later worked at Princeton’s Institute for Advanced Study with his best friend Albert Einstein. He became famous for two proofs which he devised at the age of 24, refuting Bertrand Russell and Alfred North Whitehead’s magisterial Principia Mathematica and showing that mathematical knowledge is always incomplete – proofs with such far-reaching and profound implications that the scientific world is still reeling 85 years later:

Kurt Gödel showed that it is impossible to establish the internal logical consistency of a very large class of deductive systems – elementary arithmetic, for example – unless one adopts principles of reasoning so complex that their internal consistency is as open to doubt as that of the systems themselves. Gordana Dodig-Crnkovic

After trying for many years to salvage his life’s work, Russell was forced to concede defeat:

I wanted certainty in the kind of way in which people want religious faith. I thought that certainty is more likely to be found in mathematics than anywhere…But after some twenty years of arduous toil, I came to the conclusion that there was nothing more that I could do in the way of making mathematical knowledge indubitable.

Although Gödel didn’t develop this further, he foresaw the implications of his work:

More generally, Gödel believes that mechanism in biology is a prejudice of our time which will be disproved. In this case, one disproval, in Gödel’s opinion, will consist in a mathematical theorem to the effect that the formation within geological times of a human body by the laws of physics (or any other laws of a similar nature), starting from a random distribution of the elementary particles and the field, is as unlikely as the separation by chance of the atmosphere into its components. Hao Wang. Nature’s imagination – the frontiers of scientific vision.

The mathematician Gregory Chaitin did build on Gödel’s work, and found that you cannot prove that a sequence of numbers has a greater complexity than the program required to generate it.

Chaitin’s arguments are based on the Turing machine. This is an abstract mathematical construct named after its inventor, the brilliant mathematician Alan Turing, who worked at Bletchley Park in the UK during the Second World War and led the team that cracked the famous Enigma code. The upshot of Chaitin’s work is to make possible the idea that no Turing machine can generate information that does not either belong to its input or its own informational structure. Why is this important? Because according to the Church-Turing thesis, any computational device whatsoever (past, present or future) can be simulated by a Turing machine. On that basis, any result obtained for Turing machines can be at once translated into the digital world. John Lennox, God’s Undertaker

Bernd-Olaf Küppers took this even further:

In sequences that carry semantic information the information is clearly coded irreducibly in the sense that it is not further compressible. Therefore there do not exist any algorithms that generate meaningful sequences where those algorithms are shorter than the sequences they generate. Complexity and Gödel’s incompleteness theorem, ACM SIGACT News, no. 9, April 1971, 11-12

Küppers’ conjecture seems to me to be the weakest part of the argument and warrants further investigation. He admits that this is a conjecture since, by Chaitin’s result, it is impossible to prove, for a given sequence and algorithm, that there is no shorter algorithm that could generate the sequence. It is a conjecture by mathematical standards, but passes as sound scientific theory with flying colours. No counterexample has ever been devised; moreover, it is self-evident to me personally, as I have coded many algorithms during my career. The point of writing them is to solve problems you couldn’t solve yourself – to give you new information once you interpret the results. But algorithms only ever do what you tell them to do. Ask any first-time programmer: this can be very frustrating. Without highly directed, thoughtful, intelligent input (it is much like poetry), all you do is make mistakes faster. GIGO, or “Garbage In, Garbage Out”, is a programming mantra that has even become a cliché.

5) Therefore the algorithm of evolution by natural selection (or any other unguided process) cannot produce any new information, including that contained in the DNA molecule.

As Lennox says, “much interesting and difficult work remains to be done in this area.” Nevertheless, the implications of such a conclusion are staggering. If this semantic information could not be produced naturalistically, it had to come from outside the system. Notwithstanding the known intertextual references, there may be even more to John’s opening statement that “in the beginning was the Word” than we think.

This is not a god-of-the-gaps argument in the classical sense, although it could be seen as a God-of-the-gap argument. The nature of the gap is important: generally, god-of-the-gaps arguments are criticized because they are based on an absence of knowledge: “we don’t know how it happened, so God must have done it.” As scientific knowledge progresses, we find a naturalistic explanation and the gap vanishes, along with the god. However, the gap in this case is due to a presence of knowledge. Up to now naturalistic biogenesis was an evolution-of-the-gaps argument. “We don’t know how, but evolution must have done it!” But that doesn’t work anymore; it is not that we don’t know how it could arise naturalistically, it is that we are proving that it couldn’t.

12 thoughts on “The semiotic argument against naturalism”

  1. So, why is the Origin of Life still attributed to Evolution by Natural Processes, if this is conclusive proof that that could not have happened?

    • That’s a good question! Strictly speaking, Chaitin’s theorem still needs to be made watertight, although it makes intuitive sense. So I guess some people would use that as a way out. But I doubt whether most atheists would take it that far – the reasons for unbelief are usually not that rational. I think you can chalk up a lot of it to the Semmelweis reflex. (I can’t add the hyperlink here, but check out the Wiki.)

      What most people do when they come across something that could really alter their beliefs, is that they then go to their favourite popular author (let’s pick on Dawkins) and read a poor refutation of the idea. This usually satisfies them, and they stop reading.

      I’d love to read someone who has engaged with this idea on a meaningful level, but I haven’t come across a sceptical author like that yet.

      Oh, and I also wrote about this idea in the post on the God Elusion.

  2. Pingback: Morality and God: Is there a Connection? | Standard Deviations

  3. The origin of life is NOT attributed to evolution. Evolution is a process that has and continues to take place AFTER life arose.

  4. #2 is false. There is a very efficient algorithm for compressing DNA, and several freely available software implementations. Using a reference genome and storing just the differences from it makes DNA very compressible. Using the chimp genome to compress the human genome, for example, is very efficient (giving us about a 50- to 100-fold compression). For comparison, using the human genome as a reference for an individual’s genome compresses an 800MB genome down to about 2MB (so about a 400-fold compression). These are much better than the compression ratios one gets from compressing (for example) ASCII text. Of course, this is exactly what we expect from common descent.

    Also #3 is demonstrably false too. Information-producing (as defined by compression size) algorithms are found in nature. For just one example, neutral drift with purifying selection is a mechanism that increases the information (i.e. increases entropy and decreases compression efficiency) without degrading function, is directly observable in the lab, and can be reproduced with simple simulations.

    Also #4 is false. Algorithms that produce apparently incompressible information can be much smaller than the compression size of the data they produce. For example, random noise (e.g. neutral drift) is always very hard to compress (this is the white noise paradox). The noise does not need to be ontologically random. Any process that produces uncorrelated changes to DNA increases apparent information (i.e. increases measured entropy and decreases compression efficiency). For example, noise added by a hidden pseudorandom function is entirely defined by a few bits in a random seed (therefore it is low information), but it produces a very, very large number of apparently incompressible bits that are essentially indistinguishable from high information content, because we do not know the seed used to produce the sequence, or the pseudorandom algorithm. So the empirical compression is only a very loose upper bound on information, and an extremely bad estimate when we do not know the generative process (e.g. when the random seed is unknown to us). Of course, quantum noise is incompressible. We do not need quantum mechanics though; Brownian motion and cosmic radiation produce infinite information too. So do chaotic attractors (even when we know the generating function and the random seed!).

    It’s a cute idea, and I appreciate you reproducing his argument here, but with three major errors his “therefore #5” does not fly.

    • Thank you for the comment, Joshua! I appreciate it. You seem to raise some good points, and I’d like to respond to them. Unfortunately, I have some big deadlines at the moment, but I hope to get back to you.

      • Whenever you can.

        I can anticipate two objections to my objection, so I might as well deal with them now.

        First, I’m only addressing the biological evolution component of this argument, NOT the origin of life. It is already known that the origin of life is an open problem in biology, so that part of the argument is superfluous. I am instead focusing on evolution, which is defined as common descent. This is the confirmed (as far as we can tell in science) theory that current forms of life share common ancestors among earlier life forms. This means I am starting from the DNA of a living organism, and just need to show how the processes I’ve described will increase the compression size of this DNA over time.

        Second, the random processes I have described will increase compression size of DNA over time (this is just a restatement of the 2nd law of thermodynamics), but one might object that this is not a “semiotic” system. Well all we have to do is apply these processes to a genome, and remember that the mutations that break the code will not continue to reproduce (because they do not produce viable life) so they can be deleted (this is negative selection). In this way, we have a “semiotic” system that has ever growing increased compression size.

        I put “semiotic” in quotes, because DNA is not a true semiotic system, but that is beside the point anyway. Lennox seems to argue that DNA _*IS*_ a language and a code and a semiotic system. This turns out to be false. It is more correct to say that in some limited ways DNA is a little like a semiotic system. Yes, there are similarities, but there are many very important differences. A mathematician like Lennox does not know this, but a computational biologist like me does. Before you contest me on this, I encourage you to try listing out the consequential differences between, say, DNA and English. After reading Wikipedia, the error in logic here should become fairly obvious. For the record, Francis Bacon explains that this is an example of falling prey to the Idol of the Theatre.
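
The mutate-and-filter procedure described in this reply can be sketched as a toy simulation. Everything here is an assumption made for illustration: "function" is modelled as a fixed block of constrained positions, a mutation that hits that block is simply discarded, and zlib's output length stands in for compression size. It measures compressed size only, which is the definition of information the commenter adopts.

```python
import random
import zlib

def compressed_size(seq):
    """zlib-compressed length in bytes, used as a rough proxy for compression size."""
    return len(zlib.compress(seq.encode("ascii"), 9))

def drift_with_purifying_selection(genome, constrained, generations, rng):
    """Propose one random point mutation per generation; discard it if it
    hits a constrained (functional) site, otherwise keep it."""
    seq = list(genome)
    for _ in range(generations):
        pos = rng.randrange(len(seq))
        if pos in constrained:
            continue  # negative selection: the broken variant leaves no descendants
        seq[pos] = rng.choice("ACGT")
    return "".join(seq)

rng = random.Random(7)
start = "ACGT" * 5_000               # 20,000 bases, deliberately repetitive
constrained = set(range(0, 2_000))   # toy "functional" region: the first 2,000 bases
end = drift_with_purifying_selection(start, constrained, 50_000, rng)

print(compressed_size(start), compressed_size(end))  # before vs after drift
print(end[:2_000] == start[:2_000])                  # constrained region is untouched
```

In this toy the constrained region is preserved exactly while the compressed size of the whole sequence grows by a large factor.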


  5. I had a chance to talk to Lennox. According to him, his argument is NOT about evolution. Rather it is about the origin of life exclusively. This raises the question: are you trying to extend his argument beyond abiogenesis to evolution? I do not think this works for all the reasons I have already explained. Peace.

    • Hi Joshua
      Thank you for asking him – I was going to do the same. As I understand it, his argument is about evolution not being a credible explanation for biogenesis, since evolution as an algorithm doesn’t create information. So I think some of your arguments may still be applicable. If it seems as though I was trying to extend his argument, that’s probably a lack of clarity on my part. I haven’t looked at it for a while, though. Just trying to finish my PhD, but when I am done (which should be soon, hopefully) I’ll have a chance to revisit this. It’s great to have input from someone with more knowledge about computational biology than I have, though, so I really appreciate your posts!
      Kind regards,

      • Did not know you are a PhD student. Send me a private email sometime about that (I am easy to find online).

        As I have explained, it is demonstrably false that evolution does not increase information (if we define information by compressibility). It absolutely does. However, evolution requires replicating units to proceed. Absent replicating units, e.g. before abiogenesis, it is irrelevant to the conversation if our focus is abiogenesis.

        The best argument against abiogenesis is that we do not yet have a plausible mechanism, and it can only be demonstrated to have occurred one time. Our failure to detect any type of life outside our solar system (btw, life on Mars would most likely be seeded from the Earth) should not stop us from looking. However, it should make us suspicious of any grand claims that life arising is “inevitable.” The fact that life does not appear to have arisen multiple times on earth (as far as we can tell) should make us very cautious about claims like this. Though questions about God are outside the domain of science, it is obviously reasonable to think God created the first cell.

        It does not take a contrarian book to make this point. Everything I’ve stated here is almost entirely agreed upon by most scientists at this time.

        There are several arguments for abiogenesis, both scientific and religious.

        Scientifically, several theories of abiogenesis make predictions that explain subtle details about biology. For example.

        1. The direct-templating hypothesis is a mechanism by which the genetic code arises. It predicts that amino acids will bind to the codons or anti-codons that code them in the modern genetic code.

        2. The iron-sulfur theory of primordial metabolism predicts that self-sustaining metabolism would be possible before an RNA world, without requiring phosphorus. We find a metabolic “fossil” that suggests this may be possible and identifies a coherent metabolic network in extant life.

        3. The RNA world hypothesis explains why ribosomes include so much RNA, instead of being composed primarily of protein, as most enzymes are.

        None of this is proof, but these are certainly arbitrary details about life that make no real sense unless it arose by abiogenesis. Even if abiogenesis remains entirely unlikely, there are two reasons we cannot formally rule it out: (1) regions of the universe that extend far beyond the visible universe or (2) a multiverse. The fact that it appears to have happened only one time means that just increasing the number of trials this way is enough to provide a scientifically justifiable (even if it is not proven) explanation.

        Religiously (speaking from my point of view as a Christian), the best argument for abiogenesis is Genesis 1, which literally says (and appears to be mirroring statements by Aristotle) that the “land” and “water” gave forth plants and animals of “many kinds”. It was not till the 1800s that spontaneous generation was falsified. Until that point, it was (it seemed to them) a matter of direct observation that life could arise from non-life. Given that this is what the Bible literally says, and that this is what most learned people thought for at least the last 3000 years, it goes against the traditional interpretation of Scripture and its plain reading to reject abiogenesis.

        So what really happened? I do not know. I wasn’t there.

