Thanks Tim! Having read your link, I see I had the same difficulties using chainsaw with multi-subunit PDB files that you did. I hadn't come across mrtailor, though; it looks very handy and I'll have a look at it. Oh well, the exercise was worthwhile nevertheless :)
Best,
Oliver.

On Sat, Jan 4, 2014 at 12:51 PM, Tim Gruene <t...@shelx.uni-ac.gwdg.de> wrote:

> Dear Oliver,
>
> merely as a comment: the program mrtailor
> (http://shelx.uni-ac.gwdg.de/~tg/research/programs/mrtailor) achieves
> something very similar: it requires an alignment between the sequence
> inside the PDB file and the target sequence and operates on all matching
> chains. The result is very similar to chainsaw but intended for use
> with multi-chain PDB files, i.e. the non-matching chains are left
> untouched in the output PDB rather than removed.
>
> Surely there is no python exercise involved ;-)
>
> Best,
> Tim
>
> On 01/04/2014 03:54 PM, Oliver Clarke wrote:
> > Hi all,
> >
> > I’m just going to post this in case it’s useful to anyone else - and
> > if not, at least I’ll be able to google it for later reference.
> >
> > It was more of a python learning exercise for me than anything else -
> > I’m sure there is probably a better and easier way to do this - but
> > perhaps it will be useful to someone.
> >
> > This is a script that is useful in a fairly specific situation - when
> > you have a molecular replacement solution of a large, multi-protein
> > complex and you want to mutate each chain in the search model to
> > match the target sequence.
> >
> > It reads a file (seq_name) containing the target sequences in fasta
> > format (with an additional “>” line after the last sequence; the
> > order of the sequences is not important), matches each of them to the
> > best-aligning chain in the molecule (mol_id), and aligns and mutates
> > the chain before finally associating each target sequence with the
> > appropriate chain. At the moment it only looks for one chain matching
> > each target sequence, but it would be easy enough to modify it to
> > take account of multiple identical copies. Unfortunately at the
> > moment it spawns one info-dialog per align_and_mutate job, because I
> > don’t know how to automatically dismiss them from within the script.
> >
> > It uses a couple of modules from Biopython, so Biopython will need to
> > be installed and accessible to coot for this script to work (place
> > the Bio and BioSQL directories inside the site-packages subdir of
> > Coot’s python installation and modify the PYTHONPATH variable in
> > /usr/local/bin/coot accordingly). It will also only work with coot
> > nightlies r4872 or later.
> >
> > Incidentally, biopython seems to play pretty happily with coot, no
> > negative effects that I’ve noted so far, and it has a few useful
> > tools, particularly for dealing with sequences (e.g. being able to
> > directly blast the sequence of a PDB and return the sequence of the
> > top hit).
> >
> > Any comments/criticisms much appreciated - am trying to teach myself
> > python but am starting from scratch pretty much, so if there is any
> > obviously stupid code in this script I’d love to know how to make it
> > better!
> >
> > Cheers,
> > Oliver.
>
> --
> Dr Tim Gruene
> Institut fuer anorganische Chemie
> Tammannstr. 4
> D-37077 Goettingen
>
> GPG Key ID = A46BEE1A
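
For anyone following the thread, the matching step described in the quoted message (read the target sequences from a fasta file that ends with a bare ">" line, then pair each target with the best-matching chain) could look roughly like the sketch below. This is not the script from the original post: it assumes the chain sequences are already available as plain strings, it uses difflib from the standard library as a crude stand-in for a real alignment score instead of Biopython, and it makes no Coot calls. The names read_fasta, similarity and match_targets_to_chains, and the example sequences, are hypothetical.

```python
from difflib import SequenceMatcher


def read_fasta(text):
    """Parse FASTA text into {name: sequence}; a bare trailing '>' line is ignored."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A bare '>' carries no name, so it just closes the previous record.
            name = line[1:].strip() or None
            if name is not None:
                records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def similarity(a, b):
    """Crude similarity in [0, 1]; an assumed stand-in for an alignment score."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def match_targets_to_chains(targets, chains):
    """Map each target name to the ID of the chain whose sequence is most similar."""
    matches = {}
    for name, target_seq in targets.items():
        best_chain = max(chains, key=lambda cid: similarity(target_seq, chains[cid]))
        matches[name] = best_chain
    return matches


# Hypothetical example data, not taken from the original post.
fasta_text = """>subunit_alpha
MKTAYIAKQRQISFVKSHFSRQ
>subunit_beta
GSHMLEDPVDAFKQLLNNAGW
>
"""
model_chains = {
    "A": "GSHMLEDPADAFKQLLNNAGW",   # resembles subunit_beta
    "B": "MKTAYIAKQRQLSFVKSHFSRQ",  # resembles subunit_alpha
}

targets = read_fasta(fasta_text)
assignment = match_targets_to_chains(targets, model_chains)
for name, chain_id in assignment.items():
    # In a Coot session this is where each chain would be aligned and mutated.
    print(name, "->", chain_id)
```

Like the script described above, this picks only one chain per target sequence; handling multiple identical copies would mean keeping every chain whose score is close to the best one rather than just the maximum.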