*** For details on how to be removed from this list visit the CCP4 home page http://www.ccp4.ac.uk ***
Ian Tickle wrote:
This is a potentially error-prone way of extracting lines from a PDB file, because the PDB record format is inherently not free-format (i.e. it is not space-separated with no field allowed to be blank; rather, the data are in fixed columns). awk is designed to process free-format data, so it is totally unsuited to the task.
Even for small PDB files containing heme, the original awk expression is useless, because some atom names contain spaces:

HETATM 8614 FE   HEM   101       0.407  31.517  78.648  1.00 35.55
HETATM 8615  CHA HEM   101       2.786  31.976  81.117  1.00 37.07
HETATM 8619  N A HEM   101       2.012  32.838  78.920  1.00 36.69
HETATM 8630  N B HEM   101       0.307  32.155  76.784  1.00 37.34
HETATM 8638  N C HEM   101      -1.283  30.423  78.355  1.00 37.97
HETATM 8646  N D HEM   101       0.615  30.975  80.458  1.00 36.88

But if you are on Linux, awk is probably gawk, and gawk has a FIELDWIDTHS variable that lets you keep the old syntax but separate fields by fixed width rather than by a field-separator character:

set ligID=HEM
gawk '$1 == "HETATM" && substr($4, 2) == "'$ligID'"' \
  FIELDWIDTHS="6 5 5 4 2 4 4 8 8 8" pdbin.pdb > pdbout.pdb

Note that the FIELDWIDTHS="..." assignment comes outside the awk expression. Fixed-width fields keep their padding, so $1 has to be compared against the full six-character record name (the heme records are HETATM), and with these widths the residue name sits in $4 behind the alternate-location column, hence the substr().
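For anyone without gawk, the same fixed-column idea can be sketched in Python. This is only an illustration under stated assumptions: the column positions are the standard PDB ones (record name in columns 1-6, atom name in 13-16, residue name in 18-20), the function name select_residue is made up for this example, the heme lines are the records quoted above re-aligned to those columns, and the ALA line is a hypothetical record added for contrast.

```python
# Column slices (0-based, end-exclusive) for the standard PDB ATOM/HETATM layout.
RECORD = slice(0, 6)       # columns 1-6:   record name
ATOM_NAME = slice(12, 16)  # columns 13-16: atom name (may contain spaces)
RES_NAME = slice(17, 20)   # columns 18-20: residue name


def select_residue(lines, res_name, record_types=("ATOM", "HETATM")):
    """Return the coordinate records whose residue-name columns equal res_name."""
    selected = []
    for line in lines:
        # strip() removes the column padding before comparing.
        if (line[RECORD].strip() in record_types
                and line[RES_NAME].strip() == res_name):
            selected.append(line)
    return selected


SAMPLE = [
    "HETATM 8614 FE   HEM   101       0.407  31.517  78.648  1.00 35.55",
    "HETATM 8619  N A HEM   101       2.012  32.838  78.920  1.00 36.69",
    # Hypothetical record, not from the original message:
    "ATOM      1  CA  ALA A   1      11.000  12.000  13.000  1.00 20.00",
]

if __name__ == "__main__":
    for record in select_residue(SAMPLE, "HEM"):
        print(record)
    # Whitespace splitting breaks on the "N A" atom name:
    # the fourth token is "A", not the residue name.
    print(SAMPLE[1].split()[3])
```

Slicing by column keeps "N A" together as one atom name, whereas splitting on whitespace shifts every later field by one for that record.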
