The Shape of a Protein Molecule Directly Determines Its

From a chemical point of view, proteins are by far the most structurally complex and functionally sophisticated molecules known. This is perhaps not surprising, once one realizes that the structure and chemistry of each protein has been developed and fine-tuned over billions of years of evolutionary history. We start this chapter by considering how the location of each amino acid in the long string of amino acids that forms a protein determines its three-dimensional shape. We will then use this understanding of protein structure at the atomic level to describe how the precise shape of each protein molecule determines its function in a cell.

The Shape of a Protein Is Specified by Its Amino Acid Sequence

Recall from Chapter 2 that there are 20 types of amino acids in proteins, each with different chemical properties. A protein molecule is made from a long chain of these amino acids, each linked to its neighbor through a covalent peptide bond (Figure 3-1). Proteins are therefore also known as polypeptides. Each type of protein has a unique sequence of amino acids, exactly the same from one molecule to the next. Many thousands of different proteins are known, each with its own particular amino acid sequence.

Figure 3-1. A peptide bond. This covalent bond forms when the carbon atom from the carboxyl group of one amino acid shares electrons with the nitrogen atom from the amino group of a second amino acid. As indicated, a molecule of water is lost in this condensation (more…)

The repeating sequence of atoms along the core of the polypeptide chain is referred to as the polypeptide backbone. Attached to this repetitive chain are those portions of the amino acids that are not involved in making a peptide bond and which give each amino acid its unique properties: the 20 different amino acid side chains (Figure 3-2). Some of these side chains are nonpolar and hydrophobic (“water-fearing”), others are negatively or positively charged, some are reactive, and so on. Their atomic structures are presented in Panel 3-1, and a brief list with abbreviations is provided in Figure 3-3.

Figure 3-2. The structural components of a protein. A protein consists of a polypeptide backbone with attached side chains. Each type of protein differs in its sequence and number of amino acids; therefore, it is the sequence of the chemically different side chains (more…)

Panel 3-1. The 20 Amino Acids Found in Proteins.

Figure 3-3. The 20 amino acids found in proteins. Both three-letter and one-letter abbreviations are listed. As shown, there are equal numbers of polar and nonpolar side chains. For their atomic structures, see Panel 3-1 (pp. 132–133).

As discussed in Chapter 2, atoms behave almost as if they were hard spheres with a definite radius (their van der Waals radius). The requirement that no two atoms overlap greatly limits the possible bond angles in a polypeptide chain (Figure 3-4). This constraint and other steric interactions severely restrict the variety of three-dimensional arrangements of atoms (or conformations) that are possible. Nevertheless, a long flexible chain, such as a protein, can still fold in an enormous number of ways.

Figure 3-4. Steric limitations on the bond angles in a polypeptide chain. (A) Each amino acid contributes three bonds to the backbone of the chain. The peptide bond is planar (gray shading) and does not permit rotation. By contrast, rotation can occur about (more…)

The folding of a protein chain is, however, further constrained by many different sets of weak noncovalent bonds that form between one part of the chain and another. These involve atoms in the polypeptide backbone, as well as atoms in the amino acid side chains. The weak bonds are of three types: hydrogen bonds, ionic bonds, and van der Waals attractions, as explained in Chapter 2 (see p. 57). Individual noncovalent bonds are 30–300 times weaker than the typical covalent bonds that create biological molecules. But many weak bonds can act in parallel to hold two regions of a polypeptide chain tightly together. The stability of each folded shape is therefore determined by the combined strength of large numbers of such noncovalent bonds (Figure 3-5).

Figure 3-5. Three types of noncovalent bonds that help proteins fold. Although a single one of these bonds is quite weak, many of them often form together to create a strong bonding arrangement, as in the example shown. As in the previous figure, R is used as a general (more…)

A fourth weak force also has a central role in determining the shape of a protein. As described in Chapter 2, hydrophobic molecules, including the nonpolar side chains of particular amino acids, tend to be forced together in an aqueous environment in order to minimize their disruptive effect on the hydrogen-bonded network of water molecules (see p. 58 and Panel 2-2, pp. 112–113). Therefore, an important factor governing the folding of any protein is the distribution of its polar and nonpolar amino acids. The nonpolar (hydrophobic) side chains in a protein, belonging to such amino acids as phenylalanine, leucine, valine, and tryptophan, tend to cluster in the interior of the molecule (just as hydrophobic oil droplets coalesce in water to form one large droplet). This enables them to avoid contact with the water that surrounds them inside a cell. In contrast, polar side chains, such as those belonging to arginine, glutamine, and histidine, tend to arrange themselves near the outside of the molecule, where they can form hydrogen bonds with water and with other polar molecules (Figure 3-6). When polar amino acids are buried within the protein, they are usually hydrogen-bonded to other polar amino acids or to the polypeptide backbone (Figure 3-7).

Figure 3-6. How a protein folds into a compact conformation. The polar amino acid side chains tend to gather on the outside of the protein, where they can interact with water; the nonpolar amino acid side chains are buried on the inside to form a tightly packed hydrophobic (more…)

Figure 3-7. Hydrogen bonds in a protein molecule. Large numbers of hydrogen bonds form between adjacent regions of the folded polypeptide chain and help stabilize its three-dimensional shape. The protein depicted is a portion of the enzyme lysozyme, and the hydrogen (more…)

Proteins Fold into a Conformation of Lowest Energy

As a result of all of these interactions, each type of protein has a particular three-dimensional structure, which is determined by the order of the amino acids in its chain. The final folded structure, or conformation, adopted by any polypeptide chain is generally the one in which the free energy is minimized. Protein folding has been studied in a test tube by using highly purified proteins. A protein can be unfolded, or denatured, by treatment with certain solvents, which disrupt the noncovalent interactions holding the folded chain together. This treatment converts the protein into a flexible polypeptide chain that has lost its natural shape. When the denaturing solvent is removed, the protein often refolds spontaneously, or renatures, into its original conformation (Figure 3-8), indicating that all the information needed for specifying the three-dimensional shape of a protein is contained in its amino acid sequence.

Figure 3-8. The refolding of a denatured protein. (A) This experiment demonstrates that the conformation of a protein is determined solely by its amino acid sequence. (B) The structure of urea. Urea is very soluble in water and unfolds proteins at high concentrations, (more…)

Each protein normally folds up into a single stable conformation. However, the conformation often changes slightly when the protein interacts with other molecules in the cell. This change in shape is often crucial to the function of the protein, as we see later.

Although a protein chain can fold into its correct conformation without outside help, protein folding in a living cell is often assisted by special proteins called molecular chaperones. These proteins bind to partly folded polypeptide chains and help them progress along the most energetically favorable folding pathway. Chaperones are vital in the crowded conditions of the cytoplasm, since they prevent the temporarily exposed hydrophobic regions in newly synthesized protein chains from associating with each other to form protein aggregates (see p. 357). However, the final three-dimensional shape of the protein is still specified by its amino acid sequence: chaperones simply make the folding process more reliable.

Proteins come in a wide variety of shapes, and they are generally between 50 and 2000 amino acids long. Large proteins generally consist of several distinct protein domains, structural units that fold more or less independently of each other, as we discuss below. The detailed structure of any protein is complicated; for simplicity a protein’s structure can be depicted in several different ways, each emphasizing different features of the protein.

Panel 3-2 (pp. 138–139) presents four different depictions of a protein domain called SH2, which has important functions in eucaryotic cells. Constructed from a string of 100 amino acids, the structure is displayed as (A) a polypeptide backbone model, (B) a ribbon model, (C) a wire model that includes the amino acid side chains, and (D) a space-filling model. Each of the three horizontal rows shows the protein in a different orientation, and the image is colored in a way that allows the polypeptide chain to be followed from its N-terminus to its C-terminus.

Panel 3-2. Four Different Ways of Depicting a Small Protein Domain: the SH2 Domain (Courtesy of David Lawson.).

Panel 3-2 shows that a protein’s conformation is amazingly complex, even for a structure as small as the SH2 domain. But the description of protein structures can be simplified by the recognition that they are built up from several common structural motifs, as we discuss next.

The α Helix and the β Sheet Are Common Folding Patterns

When the three-dimensional structures of many different protein molecules are compared, it becomes clear that, although the overall conformation of each protein is unique, two regular folding patterns are often found in parts of them. Both patterns were discovered about 50 years ago from studies of hair and silk. The first folding pattern to be discovered, called the α helix, was found in the protein α-keratin, which is abundant in skin and its derivatives, such as hair, nails, and horns. Within a year of the discovery of the α helix, a second folded structure, called a β sheet, was found in the protein fibroin, the major constituent of silk. These two patterns are particularly common because they result from hydrogen-bonding between the N–H and C=O groups in the polypeptide backbone, without involving the side chains of the amino acids. Thus, they can be formed by many different amino acid sequences. In each case, the protein chain adopts a regular, repeating conformation. These two conformations, as well as the abbreviations that are used to denote them in ribbon models of proteins, are shown in Figure 3-9.

Figure 3-9. The regular conformation of the polypeptide backbone observed in the α helix and the β sheet. (A, B, and C) The α helix. The N–H of every peptide bond is hydrogen-bonded to the C=O of a neighboring peptide bond located (more…)

The core of many proteins contains extensive regions of β sheet. As shown in Figure 3-10, these β sheets can form either from neighboring polypeptide chains that run in the same orientation (parallel chains) or from a polypeptide chain that folds back and forth upon itself, with each section of the chain running in the direction opposite to that of its immediate neighbors (antiparallel chains). Both types of β sheet produce a very rigid structure, held together by hydrogen bonds that connect the peptide bonds in neighboring chains (see Figure 3-9D).

Figure 3-10. Two types of β sheet structures. (A) An antiparallel β sheet (see Figure 3-9D). (B) A parallel β sheet. Both of these structures are common in proteins.

An α helix is generated when a single polypeptide chain twists around on itself to form a rigid cylinder. A hydrogen bond is made between every fourth peptide bond, linking the C=O of one peptide bond to the N–H of another (see Figure 3-9A). This gives rise to a regular helix with a complete turn every 3.6 amino acids. Note that the protein domain illustrated in Panel 3-2 contains two α helices, as well as β sheet structures.
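The helix geometry described above lends itself to a little arithmetic. The following Python sketch is illustrative only: the figure of 3.6 amino acids per turn and the every-fourth-bond hydrogen-bonding pattern come from the text, whereas the rise of 0.15 nm per amino acid is a commonly quoted value assumed here, and the function names are hypothetical.

```python
RESIDUES_PER_TURN = 3.6       # from the text: one complete turn every 3.6 amino acids
RISE_PER_RESIDUE_NM = 0.15    # assumption: commonly quoted value, not given in the text


def rotation_per_residue_deg(residues_per_turn=RESIDUES_PER_TURN):
    """Angle by which the helix turns from one amino acid to the next."""
    return 360.0 / residues_per_turn


def helix_length_nm(n_residues, rise=RISE_PER_RESIDUE_NM):
    """Approximate length of a helix along its axis, under the assumed rise."""
    return n_residues * rise


def hydrogen_bond_partners(n_residues, offset=4):
    """Pair position i (C=O) with position i + offset (N-H), numbering from 1."""
    return [(i, i + offset) for i in range(1, n_residues - offset + 1)]


if __name__ == "__main__":
    print(rotation_per_residue_deg())      # close to 100 degrees per amino acid
    print(helix_length_nm(20))             # close to 3 nm for a 20-residue helix
    print(hydrogen_bond_partners(6))
```

Because each position bonds to the position four places along, every backbone group in the interior of the helix is paired, which is consistent with the regular, rigid cylinder described above.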

Short regions of α helix are especially abundant in proteins located in cell membranes, such as transport proteins and receptors. As we discuss in Chapter 10, those portions of a transmembrane protein that cross the lipid bilayer usually cross as an α helix composed largely of amino acids with nonpolar side chains. The polypeptide backbone, which is hydrophilic, is hydrogen-bonded to itself in the α helix and shielded from the hydrophobic lipid environment of the membrane by its protruding nonpolar side chains (see also Figure 3-77).

In other proteins, α helices wrap around each other to form a particularly stable structure, known as a coiled-coil. This structure can form when the two (or in some cases three) α helices have most of their nonpolar (hydrophobic) side chains on one side, so that they can twist around each other with these side chains facing inward (Figure 3-11). Long rodlike coiled-coils provide the structural framework for many elongated proteins. Examples are α-keratin, which forms the intracellular fibers that reinforce the outer layer of the skin and its appendages, and the myosin molecules responsible for muscle contraction.

Figure 3-11. The structure of a coiled-coil. (A) A single α helix, with successive amino acid side chains labeled in a sevenfold sequence, “abcdefg” (from bottom to top). Amino acids “a” and “d” in such a sequence (more…)

The Protein Domain Is a Fundamental Unit of Organization

Even a small protein molecule is built from thousands of atoms linked together by precisely oriented covalent and noncovalent bonds, and it is extremely difficult to visualize such a complicated structure without a three-dimensional display. For this reason, various graphic and computer-based aids are used. A CD-ROM produced to accompany this book contains computer-generated images of selected proteins, designed to be displayed and rotated on the screen in a variety of formats.

Biologists distinguish four levels of organization in the structure of a protein. The amino acid sequence is known as the primary structure of the protein. Stretches of polypeptide chain that form α helices and β sheets constitute the protein’s secondary structure. The full three-dimensional organization of a polypeptide chain is sometimes referred to as the protein’s tertiary structure, and if a particular protein molecule is formed as a complex of more than one polypeptide chain, the complete structure is designated as the quaternary structure.

Studies of the conformation, function, and evolution of proteins have also revealed the central importance of a unit of organization distinct from the four just described. This is the protein domain, a substructure produced by any part of a polypeptide chain that can fold independently into a compact, stable structure. A domain usually contains between 40 and 350 amino acids, and it is the modular unit from which many larger proteins are constructed. The different domains of a protein are often associated with different functions. Figure 3-12 shows an example: the Src protein kinase, which functions in signaling pathways inside vertebrate cells (Src is pronounced “sarc”). This protein has four domains: the SH2 and SH3 domains have regulatory roles, while the two remaining domains are responsible for the kinase catalytic activity. Later in the chapter, we shall return to this protein, in order to explain how proteins can form molecular switches that transmit information throughout cells.

Figure 3-12. A protein formed from four domains. In the Src protein shown, two of the domains form a protein kinase enzyme, while the SH2 and SH3 domains perform regulatory functions. (A) A ribbon model, with ATP substrate in red. (B) A space-filling model, with (more…)

The smallest protein molecules contain only a single domain, whereas larger proteins can contain as many as several dozen domains, usually connected to each other by short, relatively unstructured lengths of polypeptide chain. Figure 3-13 presents ribbon models of three differently organized protein domains. As these examples illustrate, the central core of a domain can be constructed from α helices, from β sheets, or from various combinations of these two fundamental folding elements. Each different combination is known as a protein fold. So far, about 1000 different protein folds have been identified among the ten thousand proteins whose detailed conformations are known.

Figure 3-13. Ribbon models of three different protein domains. (A) Cytochrome b562, a single-domain protein involved in electron transport in mitochondria. This protein is composed almost entirely of α helices. (B) The NAD-binding domain of the enzyme lactic (more…)

Few of the Many Possible Polypeptide Chains Will Be Useful

Since each of the 20 amino acids is chemically distinct and each can, in principle, occur at any position in a protein chain, there are 20 × 20 × 20 × 20 = 160,000 different possible polypeptide chains four amino acids long, or $20^{n}$ different possible polypeptide chains $n$ amino acids long. For a typical protein length of about 300 amino acids, more than $10^{390}$ ($20^{300}$) different polypeptide chains could theoretically be made. This is such an enormous number that to produce just one molecule of each kind would require many more atoms than exist in the universe.
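The counting argument above is easy to check numerically. The short Python sketch below is only an illustration (the function names are hypothetical): it counts the possible chains of a given length and uses logarithms to compare the result for 300 amino acids with a power of ten.

```python
import math


def count_chains(length, alphabet_size=20):
    """Number of distinct chains of the given length built from 20 amino acids."""
    return alphabet_size ** length


def log10_chains(length, alphabet_size=20):
    """Base-10 logarithm of the chain count, convenient for very large lengths."""
    return length * math.log10(alphabet_size)


if __name__ == "__main__":
    print(count_chains(4))                 # chains four amino acids long
    print(log10_chains(300))               # exponent of ten for 300 amino acids
    print(len(str(count_chains(300))))     # number of decimal digits in 20**300
```

Since the logarithm of the count for 300 amino acids lies between 390 and 391, the number of possible chains is indeed somewhat larger than ten raised to the power 390.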

Only a very small fraction of this vast set of conceivable polypeptide chains would adopt a single, stable three-dimensional conformation; by some estimates, less than one in a billion. The vast majority of possible protein molecules could adopt many conformations of roughly equal stability, each conformation having different chemical properties. And yet virtually all proteins present in cells adopt unique and stable conformations. How is this possible? The answer lies in natural selection. A protein with an unpredictably variable structure and biochemical activity is unlikely to help the survival of a cell that contains it. Such proteins would therefore have been eliminated by natural selection through the enormously long trial-and-error process that underlies biological evolution.

Because of natural selection, not only is the amino acid sequence of a present-day protein such that a single conformation is extremely stable, but this conformation has its chemical properties finely tuned to enable the protein to perform a particular catalytic or structural function in the cell. Proteins are so precisely built that the change of even a few atoms in one amino acid can sometimes disrupt the structure of the whole molecule so severely that all function is lost.

Proteins Can Be Classified into Many Families

Once a protein had evolved that folded up into a stable conformation with useful properties, its structure could be modified during evolution to enable it to perform new functions. This process has been greatly accelerated by genetic mechanisms that occasionally produce duplicate copies of genes, allowing one gene copy to evolve independently to perform a new function (discussed in Chapter 7). This type of event has occurred quite often in the past; as a result, many present-day proteins can be grouped into protein families, each family member having an amino acid sequence and a three-dimensional conformation that resemble those of the other family members.

Consider, for example, the serine proteases, a large family of protein-cleaving (proteolytic) enzymes that includes the digestive enzymes chymotrypsin, trypsin, and elastase, and several proteases involved in blood clotting. When the protease portions of any two of these enzymes are compared, parts of their amino acid sequences are found to match. The similarity of their three-dimensional conformations is even more striking: most of the detailed twists and turns in their polypeptide chains, which are several hundred amino acids long, are virtually identical (Figure 3-14). The many different serine proteases nevertheless have distinct enzymatic activities, each cleaving different proteins or the peptide bonds between different types of amino acids. Each therefore performs a distinct function in an organism.

Figure 3-14. The conformations of two serine proteases compared. The backbone conformations of elastase and chymotrypsin. Although only the shaded amino acids in the polypeptide chain are the same in the two proteins, the two conformations are very similar (more…)

The story we have told for the serine proteases could be repeated for hundreds of other protein families. In many cases the amino acid sequences have diverged much further than for the serine proteases, so that one cannot be sure of a family relationship between two proteins without determining their three-dimensional structures. The yeast α2 protein and the engrailed protein, for example, are both gene regulatory proteins in the homeodomain family. Because they are identical in only 17 of their 60 amino acid residues, their relationship became certain only when their three-dimensional structures were compared (Figure 3-15).

Figure 3-15. A comparison of a class of DNA-binding domains, called homeodomains, in a pair of proteins from two organisms separated by more than a billion years of evolution. (A) A ribbon model of the structure common to both proteins. (B) A trace of the α-carbon (more…)

The various members of a large protein family often have distinct functions. Some of the amino acid changes that make family members different were no doubt selected in the course of evolution because they resulted in useful changes in biological activity, giving the individual family members the different functional properties they have today. But many other amino acid changes are effectively “neutral,” having neither a beneficial nor a damaging effect on the basic structure and function of the protein. In addition, since mutation is a random process, there must also have been many deleterious changes that altered the three-dimensional structure of these proteins sufficiently to harm them. Such faulty proteins would have been lost whenever the individual organisms making them were at enough of a disadvantage to be eliminated by natural selection.

Protein families are readily recognized when the genome of any organism is sequenced; for example, the determination of the DNA sequence for the entire genome of the nematode Caenorhabditis elegans has revealed that this tiny worm contains more than 18,000 genes. Through sequence comparisons, the products of a large fraction of these genes can be seen to contain domains from one or another protein family; for example, there appear to be 388 genes containing protein kinase domains, 66 genes containing DNA and RNA helicase domains, 43 genes containing SH2 domains, 70 genes containing immunoglobulin domains, and 88 genes containing DNA-binding homeodomains in this genome of 97 million base pairs (Figure 3-16).
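To put these counts in proportion, the rough Python sketch below converts them into percentages of the total gene number. The gene counts are those quoted above; the total is taken as exactly 18,000 for the sake of the arithmetic, which is an assumption, since the text gives only "more than" 18,000, so the percentages are slight overestimates. The names used are hypothetical.

```python
# Assumption: the total is rounded to 18,000; the text says "more than" this number.
TOTAL_GENES = 18_000

# Gene counts per domain family, as quoted in the text for C. elegans.
DOMAIN_GENE_COUNTS = {
    "protein kinase": 388,
    "DNA and RNA helicase": 66,
    "SH2": 43,
    "immunoglobulin": 70,
    "homeodomain": 88,
}


def percent_of_genes(counts, total):
    """Express each gene count as a percentage of the total number of genes."""
    return {name: 100.0 * n / total for name, n in counts.items()}


if __name__ == "__main__":
    for name, pct in percent_of_genes(DOMAIN_GENE_COUNTS, TOTAL_GENES).items():
        print(f"{name}: about {pct:.2f}% of genes")
```

On this estimate, genes with protein kinase domains make up roughly 2% of the worm's genes, while each of the other families listed accounts for less than 1%.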

Figure 3-16. Percentage of total genes containing one or more copies of the indicated protein domain, as derived from complete genome sequences. Note that one of the three domains selected, the immunoglobulin domain, has been a relatively late addition, and its relative (more…)

Proteins Can Adopt a Limited Number of Different Protein Folds

It is astounding to consider the rapidity of the increase in our knowledge about cells. In 1950, we did not know the order of the amino acids in a single protein, and many even doubted that the amino acids in proteins are arranged in an exact sequence. In 1960, the first three-dimensional structure of a protein was determined by x-ray crystallography. Now that we have access to hundreds of thousands of protein sequences from sequencing the genes that encode them, what technical developments can we look forward to next?

It is no longer a big step to progress from a gene sequence to the production of large amounts of the pure protein encoded by that gene. Thanks to DNA cloning and genetic engineering techniques (discussed in Chapter 8), this step is often routine. But there is still nothing routine about determining the complete three-dimensional structure of a protein. The standard technique based on x-ray diffraction requires that the protein be subjected to conditions that cause the molecules to aggregate into a large, perfectly ordered crystalline array, that is, a protein crystal. Each protein behaves quite differently in this respect, and protein crystals can be generated only through exhaustive trial-and-error methods that often take many years to succeed, if they succeed at all.

Membrane proteins and large protein complexes with many moving parts have generally been the most difficult to crystallize, which is why only a few such protein structures are displayed in this book. Increasingly, therefore, large proteins have been analyzed through determination of the structures of their individual domains: either by crystallizing isolated domains and then bombarding the crystals with x-rays, or by studying the conformations of isolated domains in concentrated aqueous solutions with powerful nuclear magnetic resonance (NMR) techniques (discussed in Chapter 8). From a combination of x-ray and NMR studies, we now know the three-dimensional shapes, or conformations, of thousands of different proteins.

By carefully comparing the conformations of known proteins, structural biologists (that is, experts on the structure of biological molecules) have concluded that there are a limited number of ways in which protein domains fold up, maybe as few as 2000. As we saw, the structures for about 1000 of these protein folds have thus far been determined; we may, therefore, already know half of the total number of possible structures for a protein domain. A complete catalog of all of the protein folds that exist in living organisms would therefore seem to be within our reach.

Sequence Homology Searches Can Identify Close Relatives

The present database of known protein sequences contains more than 500,000 entries, and it is growing very rapidly as more and more genomes are sequenced, revealing huge numbers of new genes that encode proteins. Powerful computer search programs are available that allow one to compare each newly discovered protein with this entire database, looking for possible relatives. Homologous proteins are defined as those whose genes have evolved from a common ancestral gene, and these are identified by the discovery of statistically significant similarities in amino acid sequences.

With such a large number of proteins in the database, the search programs find many nonsignificant matches, resulting in a background noise level that makes it very difficult to pick out all but the closest relatives. Generally speaking, a 30% identity in the sequence of two proteins is needed to be certain that a match has been found. However, many short signature sequences (“fingerprints”) indicative of particular protein functions are known, and these are widely used to find more distant homologies (Figure 3-17).
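The 30% rule of thumb can be made concrete with a small calculation. The Python sketch below computes percent identity for two sequences that have already been aligned; it is a minimal illustration, not the method used by any particular search program. The function names, the gap symbol, and the choice to ignore gapped positions in the denominator are all assumptions, and the example sequences are artificial strings built to reproduce the 17-in-60 identity quoted earlier for the two homeodomain proteins.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of compared positions at which two aligned sequences agree.

    Positions where either sequence has a gap are skipped (an assumption;
    other conventions for the denominator exist).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def is_confident_match(aligned_a, aligned_b, threshold=30.0):
    """Apply the rule of thumb from the text: about 30% identity or more."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Artificial sequences: 17 identical positions out of 60.
    seq_a = "A" * 17 + "G" * 43
    seq_b = "A" * 17 + "L" * 43
    print(percent_identity(seq_a, seq_b))
    print(is_confident_match(seq_a, seq_b))
```

An identity of 17 in 60 falls just below the threshold, which is consistent with the observation that the relationship between the two homeodomain proteins became certain only when their structures were compared.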

Figure 3-17. The use of short signature sequences to find homologous protein domains. The two short sequences of 15 and 9 amino acids shown can be used to search large databases for a protein domain that is found in many proteins, the SH2 domain. Here, the (more…)

These protein comparisons are important because related structures often imply related functions. Many years of experimentation can be saved by discovering that a new protein has an amino acid sequence homology with a protein of known function. Such sequence homologies, for example, first indicated that certain genes that cause mammalian cells to become cancerous are protein kinases. In the same way, many of the proteins that control pattern formation during the embryonic development of the fruit fly were quickly recognized to be gene regulatory proteins.

Computational Methods Allow Amino Acid Sequences to Be Threaded into Known Protein Folds

We know that there are an enormous number of ways to make proteins with the same three-dimensional structure, and that, over evolutionary time, random mutations can cause amino acid sequences to change without a major change in the conformation of a protein. For this reason, one current goal of structural biologists is to determine all the different protein folds that proteins have in nature, and to devise computer-based methods to test the amino acid sequence of a domain to identify which one of these previously determined conformations the domain is likely to adopt.

A computational technique called threading can be used to fit an amino acid sequence to a particular protein fold. For each possible fold known, the computer searches for the best fit of the particular amino acid sequence to that structure. Are the hydrophobic residues on the inside? Are the sequences with a strong propensity to form an α helix in an α helix? And so on. The best fit gets a numerical score reflecting the estimated stability of the structure.
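The idea behind threading can be illustrated with a deliberately simplified Python sketch. Here each known fold is reduced to a string marking every position as buried ("B") or exposed ("E"), and a sequence scores a point wherever a nonpolar residue sits in a buried position or a polar residue sits in an exposed one. This toy scoring asks only the first of the questions above (are the hydrophobic residues on the inside?); it is not how real threading programs estimate stability. The fold templates, the names, the scoring values, and the particular grouping of amino acids into a nonpolar set are all assumptions made for illustration.

```python
# Assumption: one common grouping of nonpolar amino acids (one-letter codes).
NONPOLAR = set("AGVLIPFMWC")


def score_alignment(sequence, environment):
    """+1 when residue type suits its position (nonpolar/buried or polar/exposed), else -1."""
    score = 0
    for residue, env in zip(sequence, environment):
        buried = env == "B"
        hydrophobic = residue in NONPOLAR
        score += 1 if buried == hydrophobic else -1
    return score


def best_fit(sequence, environment):
    """Slide the sequence along the template without gaps; return (score, offset)."""
    n, m = len(sequence), len(environment)
    if n > m:
        return None
    best = None
    for offset in range(m - n + 1):
        score = score_alignment(sequence, environment[offset:offset + n])
        if best is None or score > best[0]:
            best = (score, offset)
    return best


def thread(sequence, folds):
    """Score the sequence against every fold template and report the best one."""
    results = {}
    for name, environment in folds.items():
        fit = best_fit(sequence, environment)
        if fit is not None:
            results[name] = fit
    if not results:
        return None, results
    best_name = max(results, key=lambda name: results[name][0])
    return best_name, results


if __name__ == "__main__":
    # Hypothetical fold templates: B = buried position, E = exposed position.
    folds = {
        "fold_1": "BBEEBBEB",
        "fold_2": "EEBBEEBE",
        "fold_3": "EBEBEBEBEB",
    }
    print(thread("LVKDIFEA", folds))
```

In this toy example one template stands out with a much higher score than the others, which mirrors the situation in which a single known fold emerges as a good fit for a sequence.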

In many cases, one particular three-dimensional structure will stand out as a good fit for the amino acid sequence, suggesting an approximate conformation for the protein domain. In other cases, none of the known folds will seem possible. By applying x-ray and NMR studies to the latter class of proteins, structural biologists hope to be able to expand the number of known folds rapidly, aiming for a database that contains the complete library of protein folds that exist in nature. With such a library, plus expected improvements in the computational methods used for threading, it may eventually become possible to obtain an approximate three-dimensional structure for a protein as soon as its amino acid sequence is known.

Some Protein Domains, Called Modules, Form Parts of Many Different Proteins

As previously stated, most proteins are composed of a series of protein domains, in which different regions of the polypeptide chain have folded independently to form compact structures. Such multidomain proteins are believed to have originated when the DNA sequences that encode each domain accidentally became joined, creating a new gene. Novel binding surfaces have often been created at the juxtaposition of domains, and many of the functional sites where proteins bind to small molecules are found to be located there (for an example see Figure 3-12). Many large proteins show clear signs of having evolved by the joining of preexisting domains in new combinations, an evolutionary process called domain shuffling (Figure 3-18).

Figure 3-18. Domain shuffling. An extensive shuffling of blocks of protein sequence (protein domains) has occurred during protein evolution. Those portions of a protein denoted by the same shape and color in this diagram are evolutionarily related. Serine proteases (more…)

A subset of protein domains have been especially mobile during evolution; these so-called protein modules are generally somewhat smaller (40–200 amino acids) than an average domain, and they seem to have particularly versatile structures. The structure of one such module, the SH2 domain, was illustrated in Panel 3-2 (pp. 138–139). The structures of some additional protein modules are illustrated in Figure 3-19.

Figure 3-19. The three-dimensional structures of some protein modules. In these ribbon diagrams, β-sheet strands are shown as arrows, and the N- and C-termini are indicated by red spheres. (Adapted from M. Baron, D.K. Norman, and I.D. Campbell, Trends Biochem. (more…)

Each of the modules shown has a stable core structure formed from strands of β sheet, from which less-ordered loops of polypeptide chain protrude. The loops are ideally situated to form binding sites for other molecules, as most flagrantly demonstrated for the immunoglobulin fold, which forms the basis for antibody molecules (see Figure 3-42). The evolutionary success of such β-sheet-based modules is likely to have been due to their providing a convenient framework for the generation of new binding sites for ligands through small changes to these protruding loops.

A second feature of protein modules that explains their utility is the ease with which they can be integrated into other proteins. Five of the six modules illustrated in Figure 3-19 have their N- and C-terminal ends at opposite poles of the module. This “in-line” arrangement means that when the DNA encoding such a module undergoes tandem duplication, which is not unusual in the evolution of genomes (discussed in Chapter 7), the duplicated modules can be readily linked in series to form extended structures, either with themselves or with other in-line modules (Figure 3-20). Stiff extended structures composed of a series of modules are especially common in extracellular matrix molecules and in the extracellular portions of cell-surface receptor proteins. Other modules, including the SH2 domain and the kringle module illustrated in Figure 3-19, are of a “plug-in” type. After genomic rearrangements, such modules are usually accommodated as an insertion into a loop region of a second protein.

Figure 3-20. An extended structure formed from a series of in-line protein modules. Four fibronectin type 3 modules (see Figure 3-19) from the extracellular matrix molecule fibronectin are illustrated in (A) ribbon and (B) space-filling models. (Adapted from D.J. (more…)

The Human Genome Encodes a Complex Set of Proteins, Revealing Much That Remains Unknown

The result of sequencing the human genome has been surprising, because it reveals that our chromosomes contain only 30,000 to 35,000 genes. With regard to gene number, we would appear to be no more than 1.4-fold more complex than the tiny mustard weed, and less than 2-fold more complex than a nematode worm. The genome sequences also reveal that vertebrates have inherited nearly all of their protein domains from invertebrates, with only 7 percent of identified human domains being vertebrate-specific.

Each of our proteins is on average more complicated, however. A process of domain shuffling during vertebrate evolution has given rise to many novel combinations of protein domains, with the result that there are nearly twice as many combinations of domains found in human proteins as in a worm or a fly. Thus, for example, the trypsinlike serine protease domain is linked to at least 18 other types of protein domains in human proteins, whereas it is found covalently joined to only 5 different domains in the worm. This extra variety in our proteins greatly increases the range of protein–protein interactions possible (see Figure 3-78), but how it contributes to making us human is not known.

The complexity of living organisms is staggering, and it is quite sobering to note that we currently lack even the tiniest hint of what the function might be for more than 10,000 of the proteins that have thus far been identified in the human genome. There are certainly enormous challenges ahead for the next generation of cell biologists, with no shortage of fascinating mysteries to solve.

Larger Protein Molecules Often Contain More Than One Polypeptide Chain

The same weak noncovalent bonds that enable a protein chain to fold into a specific conformation also allow proteins to bind to each other to produce larger structures in the cell. Any region of a protein’s surface that can interact with another molecule through sets of noncovalent bonds is called a binding site. A protein can contain binding sites for a variety of molecules, both large and small. If a binding site recognizes the surface of a second protein, the tight binding of two folded polypeptide chains at this site creates a larger protein molecule with a precisely defined geometry. Each polypeptide chain in such a protein is called a protein subunit.

In the simplest case, two identical folded polypeptide chains bind to each other in a “head-to-head” arrangement, forming a symmetric complex of two protein subunits (a dimer) held together by interactions between two identical binding sites. The Cro repressor protein, a gene regulatory protein that binds to DNA to turn genes off in a bacterial cell, provides an example (Figure 3-21). Many other types of symmetric protein complexes, formed from multiple copies of a single polypeptide chain, are commonly found in cells. The enzyme neuraminidase, for example, consists of four identical protein subunits, each bound to the next in a “head-to-tail” arrangement that forms a closed ring (Figure 3-22).

Figure 3-21. Two identical protein subunits binding together to form a symmetric protein dimer. The Cro repressor protein from bacteriophage lambda binds to DNA to turn off viral genes. Its two identical subunits bind head-to-head, held together by a combination of (more…)

Figure 3-22. A protein molecule containing multiple copies of a single protein subunit. The enzyme neuraminidase exists as a ring of four identical polypeptide chains. The small diagram shows how the repeated use of the same binding interaction forms the structure. (more…)

Many of the proteins in cells contain two or more types of polypeptide chains. Hemoglobin, the protein that carries oxygen in red blood cells, is a particularly well-studied example (Figure 3-23). It contains two identical α-globin subunits and two identical β-globin subunits, symmetrically arranged. Such multisubunit proteins are very common in cells, and they can be very large. Figure 3-24 provides a sampling of proteins whose exact structures are known, allowing the sizes and shapes of a few larger proteins to be compared with the relatively small proteins that we have thus far presented as models.

Figure 3-23. A protein formed as a symmetric assembly of two different subunits. Hemoglobin is an abundant protein in red blood cells that contains two copies of α globin and two copies of β globin. Each of these four polypeptide chains contains a (more…)

Figure 3-24. A collection of protein molecules, shown at the same scale. For comparison, a DNA molecule bound to a protein is also illustrated. These space-filling models represent a range of sizes and shapes. Hemoglobin, catalase, porin, alcohol dehydrogenase, and (more…)

Some Proteins Form Long Helical Filaments

Some protein molecules can assemble to form filaments that may span the entire length of a cell. Most simply, a long chain of identical protein molecules can be constructed if each molecule has a binding site complementary to another region of the surface of the same molecule (Figure 3-25). An actin filament, for example, is a long helical structure produced from many molecules of the protein actin (Figure 3-26). Actin is very abundant in eucaryotic cells, where it constitutes one of the major filament systems of the cytoskeleton (discussed in Chapter 16).

Figure 3-25. Protein assemblies. (A) A protein with just one binding site can form a dimer with another identical protein. (B) Identical proteins with two different binding sites often form a long helical filament. (C) If the two binding sites are disposed appropriately (more…)

Figure 3-26. Actin filaments. (A) Transmission electron micrographs of negatively stained actin filaments. (B) The helical organization of actin molecules in an actin filament. (A, courtesy of Roger Craig.)

Why is a helix such a common structure in biology? As we have seen, biological structures are often formed by linking subunits that are very similar to each other, such as amino acids or protein molecules, into long, repetitive chains. If all the subunits are identical, the neighboring subunits in the chain can often fit together in only one way, adjusting their relative positions to minimize the free energy of the contact between them. As a result, each subunit is positioned in exactly the same way in relation to the next, so that subunit 3 fits onto subunit 2 in the same way that subunit 2 fits onto subunit 1, and so on. Because it is very rare for subunits to join up in a straight line, this arrangement generally results in a helix, a regular structure that resembles a spiral staircase, as illustrated in Figure 3-27. Depending on the twist of the staircase, a helix is said to be either right-handed or left-handed (Figure 3-27E). Handedness is not affected by turning the helix upside down, but it is reversed if the helix is reflected in the mirror.

Figure 3-27. Some properties of a helix. (A–D) A helix forms when a series of subunits bind to each other in a regular way. At the bottom, the interaction between two subunits is shown; behind them are the helices that result. These helices have two (A), three (more…)

Helices occur commonly in biological structures, whether the subunits are small molecules linked together by covalent bonds (for example, the amino acids in an α helix) or large protein molecules that are linked by noncovalent forces (for example, the actin molecules in actin filaments). This is not surprising. A helix is an unexceptional structure, and it is generated simply by placing many similar subunits next to each other, each in the same strictly repeated relationship to the one before.
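The geometric argument can be checked with a short Python sketch. It builds a chain by applying one fixed step (a rotation plus a rise) to each subunit in turn, then tests the two statements about handedness. The function names, the choice of the z axis, and the numbers are illustrative assumptions, not taken from the text.

```python
import math


def next_subunit(point, turn_deg, rise):
    """Apply the one fixed step relating each subunit to the one before it:
    rotate by turn_deg about the z axis, then shift by rise along z."""
    x, y, z = point
    a = math.radians(turn_deg)
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a),
            z + rise)


def build_chain(n_subunits, turn_deg, rise, radius=1.0):
    """Place subunits by repeating the same step, starting at (radius, 0, 0)."""
    points = [(radius, 0.0, 0.0)]
    while len(points) < n_subunits:
        points.append(next_subunit(points[-1], turn_deg, rise))
    return points


def handedness(points):
    """Classify a chain by the signed volume spanned by its first three steps."""
    if len(points) < 4:
        raise ValueError("need at least four subunits")
    # Vectors from each subunit to the next: p0->p1, p1->p2, p2->p3.
    d = [tuple(b[i] - a[i] for i in range(3))
         for a, b in zip(points[:3], points[1:4])]
    cross = (d[1][1] * d[2][2] - d[1][2] * d[2][1],
             d[1][2] * d[2][0] - d[1][0] * d[2][2],
             d[1][0] * d[2][1] - d[1][1] * d[2][0])
    triple = sum(d[0][i] * cross[i] for i in range(3))
    if abs(triple) < 1e-12:
        return "none"  # straight line or flat ring, not a helix
    return "right" if triple > 0 else "left"


helix = build_chain(8, turn_deg=100.0, rise=0.5)
upside_down = [(x, -y, -z) for x, y, z in helix]  # 180-degree turn about x
mirrored = [(x, y, -z) for x, y, z in helix]      # reflection in the xy plane

print(handedness(helix))        # right
print(handedness(upside_down))  # right
print(handedness(mirrored))     # left
```

In this sketch a turn of exactly 0 gives a straight line and a rise of exactly 0 gives a flat ring; every other combination of turn and rise produces a helix, which is one way to see why helices are so common.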

A Protein Molecule Can Have an Elongated, Fibrous Shape

Most of the proteins we have discussed so far are globular proteins, in which the polypeptide chain folds up into a compact shape like a ball with an irregular surface. Enzymes tend to be globular proteins: even though many are large and complicated, with multiple subunits, most have an overall rounded shape (see Figure 3-24). In contrast, other proteins have roles in the cell requiring each individual protein molecule to span a large distance. These proteins generally have a relatively simple, elongated three-dimensional structure and are commonly referred to as fibrous proteins.


One large family of intracellular fibrous proteins consists of α-keratin, introduced earlier, and its relatives. Keratin filaments are extremely stable and are the main component in long-lived structures such as hair, horn, and nails. An α-keratin molecule is a dimer of two identical subunits, with the long α helices of each subunit forming a coiled-coil (see Figure 3-11). The coiled-coil regions are capped at each end by globular domains containing binding sites. This enables this class of protein to assemble into ropelike intermediate filaments, an important component of the cytoskeleton that creates the cell’s internal structural scaffold (see Figure 16-16).

Fibrous proteins are especially abundant outside the cell, where they are a main component of the gel-like extracellular matrix that helps to bind collections of cells together to form tissues. Extracellular matrix proteins are secreted by the cells into their surroundings, where they often assemble into sheets or long fibrils. Collagen is the most abundant of these proteins in animal tissues. A collagen molecule consists of three long polypeptide chains, each containing the nonpolar amino acid glycine at every third position. This regular structure allows the chains to wind around one another to generate a long regular triple helix (Figure 3-28A). Many collagen molecules then bind to one another side-by-side and end-to-end to create long overlapping arrays, thereby generating the extremely tough collagen fibrils that give connective tissues their tensile strength, as described in Chapter 19.
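The "glycine at every third position" rule is simple enough to test on a sequence written in one-letter code. The Python sketch below is only an illustration: the function name is hypothetical and the example sequences are invented, not real collagen fragments.

```python
def glycine_repeat_frame(sequence):
    """Return the frame (0, 1, or 2) in which every third residue is glycine,
    or None if no frame satisfies the rule."""
    for frame in range(3):
        every_third = sequence[frame::3]
        if every_third and all(residue == "G" for residue in every_third):
            return frame
    return None


print(glycine_repeat_frame("GPPGPAGKD"))  # 0
print(glycine_repeat_frame("PGPPGAKGD"))  # 1
print(glycine_repeat_frame("GPPAPAGKD"))  # None
```

Checking all three frames matters because a fragment taken from a longer chain need not begin with the glycine of a repeat.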

Figure 3-28. Collagen and elastin. (A) Collagen is a triple helix formed by three extended protein chains that wrap around one another. Many rodlike collagen molecules are cross-linked together in the extracellular space to form unextendable collagen fibrils (more…)

In complete contrast to collagen is another protein in the extracellular matrix, elastin. Elastin molecules are formed from relatively loose and unstructured polypeptide chains that are covalently cross-linked into a rubberlike elastic meshwork: unlike most proteins, they do not have a uniquely defined stable structure, but can be reversibly pulled from one conformation to another, as illustrated in Figure 3-28B. The resulting elastic fibers enable skin and other tissues, such as arteries and lungs, to stretch and recoil without tearing.

Extracellular Proteins Are Frequently Stabilized by Covalent Cross-Linkages

Many protein molecules are either attached to the outside of a cell’s plasma membrane or secreted as part of the extracellular matrix. All such proteins are directly exposed to extracellular conditions. To help maintain their structures, the polypeptide chains in such proteins are often stabilized by covalent cross-linkages. These linkages can either tie two amino acids in the same protein together, or connect different polypeptide chains in a multisubunit protein. The most common cross-linkages in proteins are covalent sulfur–sulfur bonds. These disulfide bonds (also called S–S bonds) form as proteins are being prepared for export from cells. As described in Chapter 12, their formation is catalyzed in the endoplasmic reticulum by an enzyme that links together two pairs of –SH groups of cysteine side chains that are adjacent in the folded protein (Figure 3-29). Disulfide bonds do not change the conformation of a protein but instead act as atomic staples to reinforce its most favored conformation. For example, lysozyme, an enzyme in tears that dissolves bacterial cell walls, retains its antibacterial activity for a long time because it is stabilized by such cross-linkages.

Figure 3-29. Disulfide bonds. This diagram illustrates how covalent disulfide bonds form between adjacent cysteine side chains. As indicated, these cross-linkages can join either two parts of the same polypeptide chain or two different polypeptide chains. Since the (more…)

Disulfide bonds generally fail to form in the cell cytosol, where a high concentration of reducing agents converts S–S bonds back to cysteine –SH groups. Apparently, proteins do not require this type of reinforcement in the relatively mild environment inside the cell.

Protein Molecules Often Serve as Subunits for the Assembly of Large Structures

The same principles that enable a protein molecule to associate with itself to form rings or filaments operate to generate much larger structures in the cell: supramolecular structures such as enzyme complexes, ribosomes, protein filaments, viruses, and membranes. These large objects are not made as single, giant, covalently linked molecules. Instead they are formed by the noncovalent assembly of many separately manufactured molecules, which serve as the subunits of the final structure.

The use of smaller subunits to build larger structures has several advantages:


A large structure built from one or a few repeating smaller subunits requires only a small amount of genetic information.

Both assembly and disassembly can be readily controlled, reversible processes, since the subunits associate through multiple bonds of relatively low energy.

Errors in the synthesis of the structure can be more easily avoided, since correction mechanisms can operate during the course of assembly to exclude malformed subunits.

Some protein subunits assemble into flat sheets in which the subunits are arranged in hexagonal patterns. Specialized membrane proteins are sometimes arranged this way in lipid bilayers. With a slight change in the geometry of the individual subunits, a hexagonal sheet can be converted into a tube (Figure 3-30) or, with more changes, into a hollow sphere. Protein tubes and spheres that bind specific RNA and DNA molecules form the coats of viruses.

Figure 3-30. An example of single protein subunit assembly requiring multiple protein–protein contacts. Hexagonally packed globular protein subunits can form either a flat sheet or a tube.

The formation of closed structures, such as rings, tubes, or spheres, provides additional stability because it increases the number of bonds between the protein subunits. Moreover, because such a structure is created by mutually dependent, cooperative interactions between subunits, it can be driven to assemble or disassemble by a relatively small change that affects each subunit individually. These principles are dramatically illustrated in the protein coat or capsid of many simple viruses, which takes the form of a hollow sphere (Figure 3-31). Capsids are often made of hundreds of identical protein subunits that enclose and protect the viral nucleic acid (Figure 3-32). The protein in such a capsid must have a particularly adaptable structure: it must not only make several different kinds of contacts to create the sphere, it must also change this arrangement to let the nucleic acid out to initiate viral replication once the virus has entered a cell.

Figure 3-31. The capsids of some viruses, all shown at the same scale. (A) Tomato bushy stunt virus; (B) poliovirus; (C) simian virus 40 (SV40); (D) satellite tobacco necrosis virus. The structures of all of these capsids have been determined by x-ray crystallography (more…)

Figure 3-32. The structure of a spherical virus. In many viruses, identical protein subunits pack together to create a spherical shell (a capsid) that encloses the viral genome, composed of either RNA or DNA (see also Figure 3-31). For geometric reasons, no more than (more…)

Many Structures in Cells Are Capable of Self-Assembly

The information for forming many of the complex assemblies of macromolecules in cells must be contained in the subunits themselves, because purified subunits can spontaneously assemble into the final structure under the appropriate conditions. The first large macromolecular aggregate shown to be capable of self-assembly from its component parts was tobacco mosaic virus (TMV). This virus is a long rod in which a cylinder of protein is arranged around a helical RNA core (Figure 3-33). If the dissociated RNA and protein subunits are mixed together in solution, they recombine to form fully active viral particles. The assembly process is unexpectedly complex and includes the formation of double rings of protein, which serve as intermediates that add to the growing viral coat.

Figure 3-33. The structure of tobacco mosaic virus (TMV). (A) An electron micrograph of the viral particle, which consists of a single long RNA molecule enclosed in a cylindrical protein coat composed of identical protein subunits. (B) A model showing part of the (more…)

Another complex macromolecular aggregate that can reassemble from its component parts is the bacterial ribosome. This structure is composed of about 55 different protein molecules and 3 different rRNA molecules. If the individual components are incubated under appropriate conditions in a test tube, they spontaneously re-form the original structure. Most importantly, such reconstituted ribosomes are able to perform protein synthesis. As might be expected, the reassembly of ribosomes follows a specific pathway: after certain proteins have bound to the RNA, this complex is then recognized by other proteins, and so on, until the structure is complete.

It is still not clear how some of the more elaborate self-assembly processes are regulated. Many structures in the cell, for example, seem to have a precisely defined length that is many times greater than that of their component macromolecules. How such length determination is achieved is in many cases a mystery. Three possible mechanisms are illustrated in Figure 3-34. In the simplest case, a long core protein or other macromolecule provides a scaffold that determines the extent of the final assembly. This is the mechanism that determines the length of the TMV particle, where the RNA chain provides the core. Similarly, a core protein is thought to determine the length of the thin filaments in muscle, as well as the length of the long tails of some bacterial viruses (Figure 3-35).

Figure 3-34. Three mechanisms of length determination for large protein assemblies. (A) Coassembly along an elongated core protein or other macromolecule that acts as a measuring device. (B) Termination of assembly because of strain that accumulates in the polymeric (more…)

Figure 3-35. An electron micrograph of bacteriophage lambda. The tip of the virus tail attaches to a specific protein on the surface of a bacterial cell, after which the tightly packaged DNA in the head is injected through the tail into the cell. The tail has a precise (more…)

The Formation of Complex Biological Structures Is Often Aided by Assembly Factors

Not all cellular structures held together by noncovalent bonds are capable of self-assembly. A mitochondrion, a cilium, or a myofibril of a muscle cell, for example, cannot form spontaneously from a solution of its component macromolecules. In these cases, part of the assembly information is provided by special enzymes and other cellular proteins that perform the function of templates, guiding construction but taking no part in the final assembled structure.

Even relatively simple structures may lack some of the ingredients necessary for their own assembly. In the formation of certain bacterial viruses, for example, the head, which is composed of many copies of a single protein subunit, is assembled on a temporary scaffold composed of a second protein. Because the second protein is absent from the final viral particle, the head structure cannot spontaneously reassemble once it has been taken apart. Other examples are known in which proteolytic cleavage is an essential and irreversible step in the normal assembly process. This is even the case for some small protein assemblies, including the structural protein collagen and the hormone insulin (Figure 3-36). From these relatively simple examples, it seems very likely that the assembly of a structure as complex as a mitochondrion or a cilium will involve temporal and spatial ordering imparted by numerous other cell components.

Figure 3-36. Proteolytic cleavage in insulin assembly. The polypeptide hormone insulin cannot spontaneously re-form efficiently if its disulfide bonds are disrupted. It is synthesized as a larger protein that is cleaved by a proteolytic enzyme after the (more…)


Summary

The three-dimensional conformation of a protein molecule is determined by its amino acid sequence. The folded structure is stabilized by noncovalent interactions between different parts of the polypeptide chain. The amino acids with hydrophobic side chains tend to cluster in the interior of the molecule, and local hydrogen-bond interactions between neighboring peptide bonds give rise to α helices and β sheets.

Globular regions, known as domains, are the modular units from which many proteins are constructed; such domains generally contain 40–350 amino acids. Small proteins typically consist of only a single domain, while large proteins are formed from several domains linked together by short lengths of polypeptide chain. As proteins have evolved, domains have been modified and combined with other domains to construct new proteins. Domains that participate in the formation of large numbers of proteins are known as protein modules. Thus far, about 1000 different ways of folding up a domain have been observed, among more than about 10,000 known protein structures.

Proteins are brought together into larger structures by the same noncovalent forces that determine protein folding. Proteins with binding sites for their own surface can assemble into dimers, closed rings, spherical shells, or helical polymers. Although mixtures of proteins and nucleic acids can assemble spontaneously into complex structures in a test tube, many biological assembly processes involve irreversible steps. Consequently, not all structures in the cell are capable of spontaneous reassembly after they have been dissociated into their component parts.



Originally posted 2022-08-07 14:17:30.
