Tutorial Study Questions
for Protein Explorer

Protein Explorer: www.umass.edu/microbio/chime/explorer
Protein Explorer's Tutorial: www.umass.edu/microbio/chime/explorer/pe_tut.htm
"B&T" refers to Introduction to Protein Structure by Branden & Tooze, 2nd ed. (Garland).
(Answers available to teachers who identify themselves fully in an email request. Include your name, position, institution, city and country, and an institutional URL confirming your identity, or else you will get no reply. Include a promise not to post the answers on the web.)

You can enter your answers digitally. Pull down Netscape's File menu and click Edit Page.
Save this early and often as a file on your disk or diskette! If you don't save it, you'll lose your answers!

    Background

  1. Roughly what fraction of the proteins in the human proteome can be explored in PE?
    Answer:

  2. What categories of proteins cannot readily be explored in 3D? Why?
    Answer:

  3. Can PE show you a protein's structure if you provide it with the amino acid sequence?
    Answer:

  4. Why are we using Protein Explorer instead of RasMol?
    Answer:

    Chapter I: FirstView of 1d66

  5. What experimental method was used to determine the atomic coordinates in 1d66.pdb?
    Answer:

  6. What different experimental method was used to determine atomic coordinates for the second largest group of molecules in the Protein Data Bank?
    Answer:

  7. Why should we believe that the structures determined by the more common method accurately represent the structures of proteins in aqueous solution?
    Answer:

  8. What is the most common reason that structural data are unavailable for soluble proteins of wide interest?
    Answer:

  9. What does "PDB" mean?
    Answer:

  10. What does "mmCIF" mean?
    Answer:

  11. Do the waters in 1d66 have hydrogens? Why?
    Answer:

  12. For every water molecule you can see in Protein Explorer for 1d66, how many water molecules present in the crystal are invisible? Why?
    Answer:

  13. What is the point of the CPK color scheme?
    Answer:

    What are the CPK colors for

  14. How many disulfide bonds are present in 1d66?
    Answer:

    Chapter II: Molecule Information Window, PDB Header, & Sequences

  15. What are the full chemical names for the two organic hetero groups in 3pcb? (Hint: even with a PE session running for 1d66, the entry page is still open in another window. Bring the window titled Protein Explorer Entry Options to the front, and enter 3pcb in the slot in the middle gray square. This will start another PE session. You can have as many sessions going concurrently as your computer will support comfortably.)
    Answer:

  16. What are the full names of the hetero groups DMV and APX? What PDB ID codes contain these?
    Answer:

  17. Load the PDB file containing DMV. How many disulfide bonds? How many cysteine residues?
    Answer:

  18. In the same PDB file as the previous question:
    1. What is the length of Chain A? Explain, saying where you got each piece of information. 
    2. Why are the lengths of chains B-D different? Or are they? 
    3. Why are 524 residues listed under SEQRES? What happened to the missing 10 residues? 

  19. What is the N-terminal residue in 2zta? (full name)
    Answer:

  20. For chain A in 1d66:
    1. How many residues have coordinates assigned in chain A? (Hint: you can't see the residue in the graphic model if coordinates are not assigned. Click to identify!) 
    2. How many residues do the SEQRES records in the PDB file header list? 
    3. Do the first and last residues with coordinates agree with the residues of the same numbers in the SEQRES records? 
    4. How do you explain any discrepancies between the SEQRES records and the residues assigned coordinates? 
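
    The comparison asked for in this question can also be made outside PE. The sketch below is one possible way to do it, assuming the standard fixed-column PDB layout (chain ID in column 12 of SEQRES records and column 22 of ATOM records). The helper names and the toy records are hypothetical and invented for illustration; they are not taken from 1d66.

```python
from collections import defaultdict

def count_residues(pdb_lines):
    """Return (seqres_counts, coord_counts), each a dict keyed by chain ID."""
    seqres = defaultdict(list)      # chain -> residue names listed in SEQRES
    with_coords = defaultdict(set)  # chain -> residue number + insertion code
    for line in pdb_lines:
        record = line[:6]
        if record == "SEQRES":
            # Column 12 is the chain ID; residue names start at column 20.
            seqres[line[11]].extend(line[19:70].split())
        elif record == "ATOM  ":
            # Column 22 is the chain ID; columns 23-27 hold resSeq and iCode.
            with_coords[line[21]].add(line[22:27])
    return ({c: len(r) for c, r in seqres.items()},
            {c: len(r) for c, r in with_coords.items()})

# Hypothetical helpers that build correctly aligned toy records.
def seqres_line(ser, chain, n, residues):
    return f"SEQRES {ser:>3} {chain} {n:>4}  " + " ".join(residues)

def ca_line(serial, resname, chain, resseq):
    return f"ATOM  {serial:>5}  CA  {resname:>3} {chain}{resseq:>4} "

# Made-up fragment: five residues in SEQRES, coordinates for only three.
toy = [
    seqres_line(1, "A", 5, ["MET", "LYS", "LEU", "LEU", "SER"]),
    ca_line(1, "LYS", "A", 2),
    ca_line(2, "LEU", "A", 3),
    ca_line(3, "LEU", "A", 4),
]

print(count_residues(toy))  # ({'A': 5}, {'A': 3})
```

    To try this on a real file, pass the lines of the downloaded PDB file instead of the toy list. The sketch ignores HETATM records and does not treat multi-model files specially, so use it only as a cross-check on what you see in PE.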

  21. The quaternary structure of a protein molecule can be described using one letter for each sequence-distinct chain, and number subscripts for the number of each type of chain. Thus, immunoglobulin G (IgG) is H2L2 (for Heavy and Light chains, e.g. 1igt), and hemoglobin is A2B2. Don't confuse these A and B designations with the names of individual chains in PDB files, which are always different for each chain even when the chains have identical sequences. Thus, the chain names in a hemoglobin PDB file (e.g. 2hhd) are A, B, C, and D. Give the quaternary structures of the contents of PDB files 6AT1, 3PCB, and 1BL8. Extra credit: 1HTM (look carefully!).
    Answer:
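
    The notation described in this question can be sketched in a few lines of code. This is only an illustration: the function name and the toy sequences are hypothetical, letters are assigned in order of first appearance (so the heavy and light chains of IgG would come out as A and B rather than H and L), subscripts are written as plain digits, and a subscript of 1 is omitted by assumption.

```python
from collections import Counter
from string import ascii_uppercase

def quaternary_formula(chain_sequences):
    """Return a formula such as 'A2B2' from a dict of chain name -> sequence."""
    counts = Counter(chain_sequences.values())
    # Give each distinct sequence a letter, in order of first appearance.
    letters = {}
    for seq in chain_sequences.values():
        if seq not in letters:
            letters[seq] = ascii_uppercase[len(letters)]
    # Append the copy number only when there is more than one copy.
    return "".join(
        f"{letter}{counts[seq] if counts[seq] > 1 else ''}"
        for seq, letter in letters.items()
    )

# Made-up four-chain file: chains A and C match, as do chains B and D.
toy_chains = {"A": "MKV", "B": "GHL", "C": "MKV", "D": "GHL"}
print(quaternary_formula(toy_chains))  # A2B2
```

    Note that the chain names (the dict keys) play no part in the result. Only the sequences decide which chains count as the same type.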

  22. Is it OK to use the direct link to Protein Explorer at RCSB? Why?
    Answer:

    Chapter III: QuickViews Menus

  23. For chain D in 1d66:
    1. Does the number of residues in PE's Sequences display agree with the SEQRES records? 
    2. When you click on the end residues, do their numbers agree with the Sequences display? (Hint: smoothed trace displays, such as produced by Cartoon for DNA, don't respond to clicking. Read the help for Cartoon.) 
    3. Explain any discrepancies. (Hint: use Seq3D to display the terminal residues.) 

  24. How many of the helices in chain A of 1d66 are amphipathic?
    (Hint:
    Answer:

  25. Concerning chain A of 1d66:
    1. How many of the amphipathic helices have their hydrophobic sides satisfied by contact with other hydrophobic moieties?
    2. What possibilities can you offer to account for the ones that aren't satisfied?

    Answer:

  26. Do you think the overall charge of 1d66 is positive or negative? Give structural evidence from 1d66 to support your answer. How does this fit with its function?
    Extra credit: calculate the isoelectric point at http://www.embl-heidelberg.de/cgi/pi-wrapper.pl (Get the sequence most easily from RCSB's Structure Explorer, Sequence Details, "Download all chains in FASTA format".)
    Answer:

  27. How many atoms of what element coordinate each pair of Cd ions in 1d66?
    Answer:

  28. Which, if any, of the 4 catalytic site residues in 1AI4 are not bonded noncovalently to the HAA substrate analog, according to QuickViews DISPLAY Contacts?
    Answer:

  29. Describe quantitatively the distance criteria an atom must meet in order to be shown as "likely noncovalently bonded" in PE's Contact Surfaces display.
    Answer:

  30. Regarding the PDB file header for 1OSA, each SITE record (a record is a line) includes the number 12. What does this number mean? (Hint: click on the hyperlinked word SITE.)
    Answer:

  31. Regarding SITE EF1 in 1OSA:
    1. How many oxygens contact the metal ion? (Hint: use Contacts.) 
    2. Are any of these oxygens not in amino acid sidechains? 
    3. Optional: See B&T pp. 109-110 (calmodulin) and pp. 24-26 (EF hands). 

  32. Are the gaps listed below physical or virtual? (Hint: Use Seq3D's "Scrutinize range", and click on the residues immediately preceding and following the gap.)
    1. Gap in 4CSM? 
    2. 4-residue gap in chain B of 1IGT, starting at residue 158?  

  33. Regarding the contact surface for chain A of 1d66: (Hint: in QuickViews, SELECT chain A, DISPLAY Contacts.)
    1. What elements are noncovalently bound in the protein-protein interaction region? (Remember that only atoms shown as balls are likely to be noncovalently bound.) 
    2. What elements are noncovalently bound in the DNA backbone interaction with chain A? 
    3. What elements are noncovalently bound in the DNA base interactions with chain A? 
    4. What DNA sequence is recognized by chain A? Bear in mind that sequence-specific recognition depends largely on hydrogen bonding to base-specific donors/acceptors. (Optional: see B&T pp. 124-125 and, after you try answering this question, pp. 187-189 on Gal4.)

  34. Movie #1, dipeptide: (Hint: use Netscape's Edit, Find in Page to hunt for the word "movie" in the Tutorial.)
    1. Name these two amino acids, amino terminal first. Give full names, 3-letter abbreviations, and one-letter codes. 
    2. What are the names of their sidechain nitrogen and oxygen atoms? 
    3. What do the Greek letters in these names represent?

  35. DNA vs. RNA:
    1. How many ribonucleotides are in 124D? 
    2. How many deoxyribonucleotides are in 124D? 
    3. How many ribonucleotides are in 1OKA? 
    4. How many deoxyribonucleotides are in 1OKA? 

  36. For Chime's built-in hydrogen bond display, in a region of alpha helix:
    1. How many alpha carbons are between the donor and acceptor of one hydrogen bond (following the backbone)? 
    2. So in alpha helices, backbone-to-backbone hydrogen bonds connect every (3rd, 4th, 5th?) residue. 
    3. What categories of hydrogen bonds are not shown? 

  37. Real bonds vs. backbones and backbone-to-backbone hydrogen bonds. After running Tutorial movie #2:
    1. Do the backbone traces (green) correspond exactly to the positions of real bonds? 
    2. When protein hydrogen bonds are rendered as "backbone-to-backbone", do they correspond exactly to the positions of real bonds? 
    3. DISPLAY HBonds, checking "donor-to-acceptor". Now does the hbond correspond exactly to the position of a real bond? 

  38. For the DNA in 1d66:
    1. Three hydrogen bonds connect which two bases? 
    2. Two hydrogen bonds connect which two bases? 
    3. Explain the longest hydrogen bond. 

  39. Regarding the 3 hbonds between G11 and C28 in 1d66, as Chime depicts them (movie #3):
    1. What are their lengths? 
    2. Based on these lengths, which of these 3 could realistically be hydrogen bonds? 
    3. Is it possible for a hydrogen bond to cross a carbon atom as depicted for the middle-length hbond here? 
    4. Is it possible for a hydrogen bond to cross a carbon-carbon bond as depicted for the longest hbond here? 
    5. Which of the two bases here is not in Watson-Crick orientation regarding the other base? 
    6. Why do you think the authors of 1d66 oriented this base out of Watson-Crick position? (Just speculate.) 

  40. Regarding the three cation-pi interactions shown for 1b07:
    1. Are all of these energetically significant according to CaPTURE? 
    2. Did PE miss any energetically significant cation-pi interactions?