15.1 The Genetic Code

Learning Objectives

In this section, you will explore the following questions:

  • What is the central dogma of protein synthesis?
  • What is the genetic code, and how does nucleotide sequence prescribe the amino acid and polypeptide sequence?

Connection for AP® Courses

Since the rediscovery of Mendel’s work in the 1900s, scientists have learned much about how the genetic blueprints stored in DNA are capable of replication, expression, and mutation. Just as the 26 letters of the English alphabet can be arranged into what seems to be a limitless number of words (with new ones added to the dictionary every year), the four nucleotides of DNA—A, T, C, and G—can generate sequences of DNA called genes that specify tens of thousands of polymers of amino acids. In turn, these sequences can be transcribed into mRNA and translated into proteins, which orchestrate nearly every function of the cell. The genetic code refers to the DNA alphabet [adenine (A), cytosine (C), guanine (G), and thymine (T)], the RNA alphabet [adenine (A), cytosine (C), guanine (G), and uracil (U)], and the polypeptide alphabet (20 amino acids). But how do genes located on a chromosome ultimately produce a polypeptide that can result in a physical phenotype such as hair or eye color—or a disease such as cystic fibrosis or hemophilia?

The central dogma describes the normal flow of genetic information from DNA to mRNA to protein: DNA in genes specifies sequences of mRNA which, in turn, specify amino acid sequences in proteins. The process requires two steps: transcription and translation. During transcription, genes are used to make messenger RNA (mRNA). In turn, the mRNA is used to direct the synthesis of proteins during the process of translation. Translation also requires two other types of RNA: transfer RNA (tRNA) and ribosomal RNA (rRNA). The genetic code is a triplet code, with each RNA codon consisting of three consecutive nucleotides that specify one amino acid or the release of the newly formed polypeptide chain; for example, the mRNA codon CAU specifies the amino acid histidine. The code is degenerate; that is, some amino acids are specified by more than one codon, like synonyms you study in your English class (different word, same meaning). For example, CCU, CCC, CCA, and CCG are all codons for proline. It is important to remember that the genetic code is nearly universal: almost all organisms on Earth use the same code. Small variations in codon assignment exist in mitochondria and some microorganisms.

Deviations from the simple scheme of the central dogma continue to be discovered as researchers explore gene expression with new technology. For example, the human immunodeficiency virus (HIV) is a retrovirus that stores its genetic information in single-stranded RNA molecules. Upon infection of a host cell, the viral RNA is used as a template by the virally encoded enzyme reverse transcriptase to synthesize DNA. The viral DNA is later transcribed into mRNA and translated into proteins. Some RNA viruses, such as the influenza virus, never go through a DNA step. Their RNA genome is replicated by a virally encoded RNA-dependent RNA polymerase.

The content presented in this section supports the Learning Objectives outlined in Big Idea 1 and Big Idea 3 of the AP® Biology Curriculum Framework. The Learning Objectives merge Essential Knowledge content with one or more of the seven Science Practices. These Learning Objectives provide a transparent foundation for the AP® Biology course, along with inquiry-based laboratory experiences, instructional activities, and AP® Exam questions.

Big Idea 1 The process of evolution drives the diversity and unity of life.
Enduring Understanding 1.B Organisms are linked by lines of descent from common ancestry.
Essential Knowledge 1.B.1 Organisms share many conserved core processes and features that evolved and are widely distributed among organisms today.
Science Practice 3.1 The student can pose scientific questions.
Science Practice 7.2 The student can connect concepts in and across domain(s) to generalize or extrapolate in and/or across enduring understandings and/or big ideas.
Learning Objective 1.15 The student is able to describe specific examples of conserved core biological processes and features shared by all domains or within one domain of life, and how these shared, conserved core processes and features support the concept of common ancestry for all organisms.
Big Idea 3 Living systems store, retrieve, transmit, and respond to information essential to life processes.
Enduring Understanding 3.A Heritable information provides for continuity of life.
Essential Knowledge 3.A.1 DNA, and in some cases RNA, is the primary source of heritable information.
Science Practice 6.5 The student can evaluate alternative scientific explanations.
Learning Objective 3.1 The student is able to construct scientific explanations that use the structure and functions of DNA and RNA to support the claim that DNA and, in some cases, that RNA are the primary sources of heritable information.

The Science Practices Assessment Ancillary contains additional test questions for this section that will help you prepare for the AP exam. These questions address the following standards:

  • [APLO 3.4]
  • [APLO 3.25]

The cellular process of transcription generates messenger RNA (mRNA), a mobile molecular copy of one or more genes with an alphabet of A, C, G, and U. Translation of the mRNA template converts nucleotide-based genetic information into a protein product. Protein sequences consist of 20 commonly occurring amino acids; therefore, it can be said that the protein alphabet consists of 20 letters (Figure 15.2). Each amino acid is defined by a three-nucleotide sequence called the triplet codon. Different amino acids have different chemistries (such as acidic versus basic, or polar and nonpolar) and different structural constraints. Variation in amino acid sequence gives rise to enormous variation in protein structure and function.

 
Structures of the twenty amino acids are given. Six amino acids (glycine, alanine, valine, leucine, methionine, and isoleucine) are non-polar and aliphatic, meaning they do not have a ring. Six amino acids (serine, threonine, cysteine, proline, asparagine, and glutamine) are polar but uncharged. Three amino acids (lysine, arginine, and histidine) are positively charged. Two amino acids, glutamate and aspartate, are negatively charged. Three amino acids (phenylalanine, tyrosine, and tryptophan) are aromatic.
Figure 15.2 Structures of the 20 amino acids found in proteins are shown. Each amino acid is composed of an amino group (NH3+), a carboxyl group (COO-), and a side chain (blue). The side chain may be nonpolar, polar, or charged, as well as large or small. It is the variety of amino acid side chains that gives rise to the incredible variation of protein structure and function.

The Central Dogma: DNA Encodes RNA; RNA Encodes Protein

The flow of genetic information in cells from DNA to mRNA to protein is described by the central dogma (Figure 15.3), which states that genes specify the sequence of mRNAs, which, in turn, specify the sequence of proteins. The decoding of one molecule to another is performed by specific proteins and RNAs. Because the information stored in DNA is so central to cellular function, it makes intuitive sense that the cell would make mRNA copies of this information for protein synthesis, while keeping the DNA itself intact and protected. The copying of DNA to RNA is relatively straightforward, with one nucleotide being added to the mRNA strand for every nucleotide read in the DNA strand. The translation to protein is a bit more complex because three mRNA nucleotides correspond to one amino acid in the polypeptide sequence. However, the translation to protein is still systematic and colinear, such that nucleotides 1 to 3 correspond to amino acid 1, nucleotides 4 to 6 correspond to amino acid 2, and so on.

To make a protein, genetic information encoded by the DNA must be transcribed onto an mRNA molecule. The RNA is then processed by splicing to remove introns and by the addition of a 5′ cap and a poly-A tail. A ribosome then reads the sequence on the mRNA, and uses this information to string amino acids into a protein.
Figure 15.3 Instructions on DNA are transcribed onto messenger RNA. Ribosomes are able to read the genetic information inscribed on a strand of messenger RNA and use this information to string amino acids together into a protein.
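
The one-to-one copying step and the colinear grouping into triplets can be sketched in a few lines of Python. This is only an illustration: the function names transcribe and split_codons and the example template strand are invented for this sketch, and the template is assumed to be written 3′ to 5′ so that the resulting mRNA reads 5′ to 3′.

```python
# Illustrative sketch only; the names and the example sequence are hypothetical.

# Base-pairing rules used when RNA is built on a DNA template strand.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the mRNA (5' to 3') complementary to a DNA template written 3' to 5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


def split_codons(mrna: str) -> list[str]:
    """Group an mRNA into consecutive triplets: nucleotides 1-3, 4-6, and so on.

    Any incomplete triplet at the end is dropped.
    """
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


template = "TACGTACCTATT"       # hypothetical template strand, 3' to 5'
mrna = transcribe(template)     # one RNA nucleotide per DNA nucleotide
print(mrna)                     # AUGCAUGGAUAA
print(split_codons(mrna))       # ['AUG', 'CAU', 'GGA', 'UAA']
```

Each triplet in the printed list corresponds to one position in the polypeptide (or to a stop signal), which is what colinear means here. The sketch ignores RNA processing such as splicing.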

The Genetic Code Is Degenerate and Universal

Given the different numbers of letters in the mRNA and protein alphabets, scientists theorized that combinations of nucleotides corresponded to single amino acids. Nucleotide doublets would not be sufficient to specify every amino acid because there are only 16 possible two-nucleotide combinations (4²). In contrast, there are 64 possible nucleotide triplets (4³), which is far more than the number of amino acids. Scientists theorized that amino acids were encoded by nucleotide triplets and that the genetic code was degenerate. In other words, a given amino acid could be encoded by more than one nucleotide triplet. This was later confirmed experimentally; Francis Crick, Sydney Brenner, Leslie Barnett, and R.J. Watts-Tobin used the chemical mutagen proflavin to insert one, two, or three nucleotides into the gene of a virus. When one or two nucleotides were inserted, protein synthesis was completely abolished. When three nucleotides were inserted, the protein was synthesized and functional.


 
This demonstrated that three nucleotides specify each amino acid. These nucleotide triplets are called codons. The insertion of one or two nucleotides completely changed the triplet reading frame, thereby altering the message for every subsequent amino acid (Figure 15.4). Though insertion of three nucleotides caused an extra amino acid to be inserted during translation, the integrity of the rest of the protein was maintained.
Illustration shows a frameshift mutation in which the reading frame is altered by the deletion of two nucleotides.
Figure 15.4 The deletion of two nucleotides shifts the reading frame of an mRNA and changes the entire protein message, creating a nonfunctional protein or terminating the protein synthesis altogether.
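
The effect of inserting one, two, or three nucleotides on the reading frame can be reproduced with a small Python sketch. The message and the helper names split_codons and insert_bases are hypothetical, chosen only for illustration; the sketch shows how the triplets regroup and does not model the original proflavin experiment.

```python
# Illustrative sketch only; the message and helper names are hypothetical.

def split_codons(mrna: str) -> list[str]:
    """Read an mRNA in triplets from its first nucleotide; drop any incomplete tail."""
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


def insert_bases(mrna: str, position: int, bases: str) -> str:
    """Return a copy of the mRNA with extra bases inserted before index `position`."""
    return mrna[:position] + bases + mrna[position:]


original = "AUGGCUGAAUUUCCA"
print(0, split_codons(original))
for extra in ("G", "GG", "GGG"):
    mutant = insert_bases(original, 3, extra)
    print(len(extra), split_codons(mutant))
```

With one or two inserted bases, every triplet after the insertion point differs from the original, and in this particular example the single insertion happens to create the triplet UGA, a stop codon. With three inserted bases, one extra triplet (GGG) appears and the original triplets that follow are restored.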

Scientists painstakingly solved the genetic code by translating synthetic mRNAs in vitro and sequencing the proteins they specified (Figure 15.5).

Figure shows all 64 codons. Sixty-one of these code for amino acids, and three are stop codons.
Figure 15.5 This figure shows the genetic code for translating each nucleotide triplet in mRNA into an amino acid or a termination signal in a nascent protein. (credit: modification of work by the National Institutes of Health)
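
The 64 entries in Figure 15.5 follow from the counting argument made earlier in this section: four possible bases at each of three positions. A minimal Python sketch, using the standard-library function itertools.product to enumerate the combinations, makes the comparison between doublets and triplets concrete. The variable names are chosen only for this example.

```python
from itertools import product

BASES = "UCAG"  # the four RNA nucleotides

# Enumerate every ordered combination of two and of three bases.
doublets = ["".join(p) for p in product(BASES, repeat=2)]
triplets = ["".join(p) for p in product(BASES, repeat=3)]

print(len(doublets))  # 16, too few to give each of 20 amino acids its own code word
print(len(triplets))  # 64, more than enough for 20 amino acids
```

Because 64 is much larger than 20, a triplet code leaves room for several codons to specify the same amino acid.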

Sixty-one of the 64 codons instruct the addition of a specific amino acid to a polypeptide chain. The remaining three terminate protein synthesis and release the polypeptide from the translation machinery. These triplets are called nonsense codons, or stop codons. Another codon, AUG, also has a special function: in addition to specifying the amino acid methionine, it serves as the start codon to initiate translation. The reading frame for translation is set by the AUG start codon near the 5′ end of the mRNA.
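
The roles of the start and stop codons can be illustrated with a short Python sketch of translation. It builds the standard codon table from a 64-letter string of one-letter amino acid symbols listed in U, C, A, G order, with * marking the three stop codons. The function name translate and the example mRNA are hypothetical, and the sketch makes a simplifying assumption: it starts at the first AUG it finds, whereas real initiation depends on additional signals in the mRNA.

```python
# Illustrative sketch only; translate() and the example mRNA are hypothetical.

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG until a stop codon or the end of the mRNA."""
    start = mrna.find("AUG")          # the start codon sets the reading frame
    if start == -1:
        return ""                     # no start codon, so nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":            # stop codon: release the polypeptide
            break
        protein.append(residue)
    return "".join(protein)


print(translate("GGAUGCAUCCUUAAGC"))  # MHP (methionine, histidine, proline)
```

In the example, the two nucleotides before AUG and the nucleotides after UAA are not translated, because the reading frame begins at the start codon and ends at the stop codon.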

The genetic code is universal. With a few exceptions, virtually all species use the same genetic code for protein synthesis. Conservation of codons means that a purified mRNA encoding the globin protein in horses could be transferred to a tulip cell, and the tulip would synthesize horse globin. That there is only one genetic code is powerful evidence that all life on Earth shares a common origin, especially considering that there are about 10⁸⁴ possible combinations of 20 amino acids and 64 triplet codons.
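
The text does not say how the number of possible codes is counted. One way to arrive at a figure of this size, offered here as an assumption rather than as the stated derivation, is to let each of the 64 codons be assigned independently to any of 21 meanings (one of the 20 amino acids, or stop). The Python sketch below computes that count.

```python
import math

# Assumption for this sketch: every codon may independently take any of 21 meanings.
meanings = 20 + 1          # 20 amino acids plus a stop signal
codons = 4 ** 3            # 64 possible triplets

assignments = meanings ** codons
print(len(str(assignments)))                 # 85 digits
print(round(math.log10(assignments), 1))     # 84.6, i.e. on the order of 10**84
```

Under this counting, the code shared by nearly all organisms is one choice out of an astronomically large number of alternatives.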

Link to Learning

Transcribe a gene and translate it to protein using complementary pairing and the genetic code at the Genetic Science Learning Center.

Some hereditary and age-related diseases are caused by translation errors. Explain why an error in translation may cause disease.

  1. If there is an error in translation, the correct lipids will not be made for signaling, storage of energy, or to perform vital functions. This can cause hereditary and age-related diseases.
  2. Translation is the process in which a particular segment of DNA is copied into RNA (mRNA) by the enzyme RNA polymerase. An error in such copying can lead to various hereditary and age-related diseases.
  3. Translation is the process used by ribosomes to synthesize proteins from amino acids. If there is an error in this process, the correct proteins will not be made to build important body tissue or perform vital functions, thus leading to hereditary and age-related diseases.
  4. Translation is the process Golgi bodies use to synthesize proteins from amino acids. If there is an error in this process, the correct proteins will not be made to build important body tissue or perform vital functions.

Science Practice Connection for AP® Courses

Think About It

  • A strand of DNA has the nucleotide sequence 3′……GCT GTC AAA TTC GAT……5′. What is the sequence of mRNA that is complementary to this DNA sequence? Using the chart of codons in the text, determine the sequence of amino acids which can be generated from this strand of DNA.
  • How does degeneracy of the genetic code make cells less vulnerable to mutations? What is an advantage of degeneracy with respect to the negative impact of random mutations on natural selection and evolution?

Degeneracy is believed to be a cellular mechanism to reduce the negative impact of random mutations. Codons that specify the same amino acid typically differ by only one nucleotide. In addition, amino acids with chemically similar side chains are encoded by similar codons. This nuance of the genetic code means that a single-nucleotide substitution often either specifies the same amino acid and has no effect, or specifies a similar amino acid, so that the protein is not rendered completely nonfunctional.
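
The buffering effect of degeneracy can be quantified with a short Python sketch that tries every possible single-nucleotide substitution in each of the 61 amino acid codons of the standard code. The function name classify_substitutions is hypothetical, and the sketch treats every substitution as equally likely, which is a simplification of how mutations actually occur.

```python
# Illustrative sketch only; classify_substitutions() is a hypothetical helper.

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_substitutions() -> dict[str, int]:
    """Count the outcomes of all single-nucleotide changes in amino acid codons."""
    counts = {"synonymous": 0, "missense": 0, "nonsense": 0}
    for codon, residue in CODON_TABLE.items():
        if residue == "*":
            continue                      # skip the three stop codons
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue              # not a change
                mutant = codon[:pos] + base + codon[pos + 1:]
                new_residue = CODON_TABLE[mutant]
                if new_residue == residue:
                    counts["synonymous"] += 1   # same amino acid
                elif new_residue == "*":
                    counts["nonsense"] += 1     # creates a stop codon
                else:
                    counts["missense"] += 1     # different amino acid
    return counts


print(classify_substitutions())
# {'synonymous': 134, 'missense': 392, 'nonsense': 23}
```

Of the 549 possible substitutions, 134 (roughly one in four) leave the amino acid unchanged. The sketch does not measure how chemically similar the replacement amino acid is in the missense cases.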

Scientific Method Connection


 
Which Has More DNA: A Kiwi or a Strawberry?

 

Question—Would a kiwifruit and strawberry that are approximately the same size (Figure 15.6) also have approximately the same amount of DNA?

Photographs show a thin slice of a green kiwi fruit and a bowl of strawberries.
Figure 15.6 Do you think that a kiwi or a strawberry has more DNA per fruit? [credit (kiwi): "Kelbv"/Flickr; credit (strawberry): Alisdair McDiarmid]

Background—Genes are carried on chromosomes and are made of DNA. All mammals are diploid, meaning they have two copies of each chromosome. However, not all plants are diploid. The common strawberry is octoploid (8n) and the cultivated kiwi is hexaploid (6n). Research the total number of chromosomes in the cells of each of these fruits and think about how this might correspond to the amount of DNA in these fruit cells’ nuclei. Read about the technique of DNA isolation to understand how each step in the isolation protocol helps liberate and precipitate DNA.

Hypothesis—Hypothesize whether you would be able to detect a difference in DNA quantity from similarly sized strawberries and kiwis. Which fruit do you think would yield more DNA?

Test your hypothesis—Isolate the DNA from a strawberry and a kiwi that are similarly sized. Perform the experiment in at least triplicate for each fruit.

  1. Prepare a bottle of DNA extraction buffer from 900 mL water, 50 mL dish detergent, and two teaspoons of table salt. Mix by inversion (cap it and turn it upside down a few times).
  2. Grind a strawberry and a kiwifruit by hand in a plastic bag, or by using a mortar and pestle, or in a metal bowl and using the end of a blunt instrument. Grind for at least two minutes per fruit.
  3. Add 10 mL of the DNA extraction buffer to each fruit, and mix well for at least one minute.
  4. Remove cellular debris by filtering each fruit mixture through cheesecloth or porous cloth and into a funnel placed in a test tube or other appropriate container.
  5. Pour ice-cold ethanol or isopropanol (rubbing alcohol) into the test tube. You should observe white, precipitated DNA.
  6. Gather the DNA from each fruit by winding it around separate glass rods.

Record your observations—Because you are not quantitatively measuring DNA volume, for each trial you can record whether the two fruits produced the same or different amounts of DNA as observed by your eye. If one or the other fruit produced noticeably more DNA, record this as well. Determine whether your observations are consistent with several pieces of each fruit.

Analyze your data—Did you notice an obvious difference in the amount of DNA produced by each fruit? Were your results reproducible?

Draw a conclusion—Given what you know about the number of chromosomes in each fruit, can you conclude that chromosome number necessarily correlates to DNA amount? Can you identify any drawbacks to this procedure? If you had access to a laboratory, how could you standardize your comparison and make it more quantitative?

 
