Please see below.

--- Gabriel Genellina <[EMAIL PROTECTED]> wrote:
> On Mon, 19 Nov 2007 21:15:16 -0300, Henry <[EMAIL PROTECTED]> wrote:
>
> > On 19/11/2007, Francesco Pietra <[EMAIL PROTECTED]> wrote:
> >>
> >> How to insert "TER" records recursively, i.e. some thousand fold, in a
> >> file like in the following example? "H2 WAT" is the only constant
> >> characteristic of the line after which to insert "TER"; that
> >> distinguishes also for lines
> >
> > If every molecule is water, and therefore 3 atoms, you can use this fact
> > to insert TER in the right place. You don't need recursion:
> >
> > f = open( "atoms.txt", "rt" )
> > lineCount = 0
> > for line in f.xreadlines( ):
> >     lineCount = lineCount + 1
> >     print line
> >     if lineCount == 3:
> >         lineCount = 0
> >         print "TER"
> > f.close( )
>
> A small variation can handle the original, more generic condition "insert
> TER after the line containing H2 WAT"
>
> f = open("atoms.txt", "r")
> for line in f:
>     print line
>     if "H2 WAT" in line:
>         print "TER"
> f.close()
>
> (also, note that unless you're using Python 2.2 or earlier, the xreadlines
> call does no good)

I tried the latter script (which also works if there are other molecules in
the file, as is my case) and encountered two problems:

(1) "TER" records were inserted, as seen in the shell window, but the file on
disk was not modified. With your script saved as "ter_insert.py", I obtained
the modified file with the classic

    $ python ter_insert.py 2>&1 | tee file.out

Now "file.out" had "TER" inserted where I wanted. It may well be that I used
your script incorrectly.
(2) An extra blank line is inserted after every line (this is not an artifact
of how I wrote the output file), except between "TER" and the line that
follows it, as shown below:

TER
ATOM 27400 O WAT 4178 20.289 4.598 26.491 1.00 0.00 W20 O

ATOM 27401 H1 WAT 4178 19.714 3.835 26.423 1.00 0.00 W20 H

ATOM 27402 H2 WAT 4178 21.173 4.237 26.554 1.00 0.00 W20 H

TER
ATOM 27403 O WAT 4585 23.340 3.428 25.621 1.00 0.00 W20 O

ATOM 27404 H1 WAT 4585 22.491 2.985 25.602 1.00 0.00 W20 H

ATOM 27405 H2 WAT 4585 23.826 2.999 26.325 1.00 0.00 W20 H

TER
ATOM 27406 O WAT 4966 22.359 0.555 27.001 1.00 0.00 W20 O

ATOM 27407 H1 WAT 4966 21.820 1.202 27.456 1.00 0.00 W20 H

ATOM 27408 H2 WAT 4966 22.554 -0.112 27.659 1.00 0.00 W20 H

TER
END

"END" is how Protein Data Bank (pdb) files end. As these files are extremely
sensitive to formatting, can the script be modified to avoid these extra
lines? I have not yet tested whether the extra lines really cause problems (it
takes time, because I have to go to the big cluster), but they do take up a
lot of space in the shell window.

A nearly perfect script. Thank you

francesco

>
> --
> Gabriel Genellina
>
> --
> http://mail.python.org/mailman/listinfo/python-list
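
Both problems appear to come from print: it writes only to the shell (so the file on disk stays unchanged unless the output is redirected), and it adds its own newline after a line that already ends in one (hence the blank lines). A minimal sketch of one possible fix follows, assuming a current Python 3 interpreter rather than the Python 2 used in the thread. The function name insert_ter and the output file name used afterwards are hypothetical, not part of the quoted scripts.

```python
def insert_ter(src_path, dst_path, marker="H2 WAT"):
    """Copy src_path to dst_path, adding a "TER" line after every line
    that contains marker.

    Assumption: marker appears with exactly this spacing in the input,
    as in the quoted script's test ("H2 WAT" in line).
    """
    with open(src_path, "r") as src, open(dst_path, "w") as dst:
        for line in src:
            # Each line read from the file already ends in "\n", so it is
            # written back unchanged; no second newline is added.
            dst.write(line)
            if marker in line:
                # Guard for a final line that lacks a trailing newline.
                if not line.endswith("\n"):
                    dst.write("\n")
                dst.write("TER\n")
```

Calling insert_ter("atoms.txt", "atoms_ter.txt") would leave the input untouched and write the result to a second file, which can be inspected before it replaces the original.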