

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On 
> Behalf Of Ian Tickle
> Sent: 20 January 2006 11:22
> To: Joel Bard
> Cc: [email protected]
> Subject: RE: [ccp4bb]: electron density map & Pymol

> > # generate a pdb with just our ligand
> > awk '$1 == "ATOM" && $5 == "'$ligID'"' $pdbin > ligmap_lig${ligID}.pdb
> 
> This is a potentially error-prone way of extracting lines 
> from a PDB file, because the PDB record format is not 
> free-format (i.e. space-separated, with no field allowed to 
> be blank); rather, the data are in fixed columns.  awk is 
> designed to process free-format data, so it is totally 
> unsuited to the task.

<snip>
  
> Unix provides a versatile utility for handling fixed-format 
> records that many people don't seem to be aware of, namely 
> egrep (or grep -E), i.e. grep with extended regular 
> expressions.  The following command will handle all of the 
> above cases very neatly, and is easily generalised to perform 
> similar tasks:
> 
> egrep  "^(ATOM  |HETATM).{15}$ligID"  $pdbin

It seems that awk provides a better solution than egrep after all,
though not in the way originally suggested:

awk  --posix  "/^(ATOM  |HETATM).{15}$ligID/"  $pdbin

This uses the same regexp as before (but note the enclosing /.../
delimiters), and runs about 7 times faster than egrep (on an Intel PC
running SuSE 9.1).  Also note the '--posix' flag, which GNU awk needs
in order to recognise the {15} interval expression in the regexp.
Using the same regexp in a Perl program runs marginally faster still
than the awk program.
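
For anyone who wants to try the fixed-column match on a small file, here
is a minimal sketch.  The helper name pdb_select and the four sample
records are made up for illustration and are not from the thread.  It
assumes the ID being matched starts at column 22 (the chain ID column of
ATOM/HETATM records), which is where the 6-character record name plus
.{15} leaves the match position.  It uses grep -E rather than awk only
because that needs no extra flags; the same regexp can be dropped into
the awk command above.

```shell
# Hypothetical helper (not from the original thread): print ATOM/HETATM
# records whose text starting at column 22 matches the given ID.
#   $1 = ID to match at column 22 (interpolated into the regexp as-is)
#   $2 = PDB file
pdb_select() {
    grep -E "^(ATOM  |HETATM).{15}$1" "$2"
}

# Made-up sample records laid out in the standard PDB columns.
sample=$(mktemp)
cat > "$sample" <<'EOF'
ATOM      1  N   ALA A   1      11.104   6.134  -6.504  1.00  0.00           N
ATOM      2  CA  ALA A   1      11.639   6.071  -5.147  1.00  0.00           C
ATOM      3  N   GLY B   2      12.500   7.200  -4.100  1.00  0.00           N
HETATM    4  C1  LIG B 101      13.000   8.000  -3.000  1.00  0.00           C
EOF

# Select everything with "B" in column 22 (one ATOM and one HETATM record).
matches=$(pdb_select B "$sample")
printf '%s\n' "$matches"

rm -f "$sample"
```

Because the ID is pasted straight into the regexp, any regexp
metacharacters in it would need escaping; for a plain chain letter or
residue number that is not an issue.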

-- Ian


