In relation to this, I just noticed something very important:
If you "set auto_zoom, off" before loading a series of structures, you'll
boost PyMOL's loading performance dramatically: by 10X at least...perhaps
much more. Just zoom once manually after everything is loaded in.
cmd.set("auto_zoom","off")
# now load your structures
...
# now zoom
cmd.zoom()
This change makes it possible to load hundreds of structures containing
hundreds of thousands of atoms in a reasonable amount of time. For example,
on my dual 1 GB G5 with Shark-optimized G5 beta code, I loaded 800 PDB
structures containing a total of 1.4 million atoms in just over 426 seconds
-- apparently the situation isn't nearly as bad as I'd feared.
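For scripted loading, the tip above can be sketched as a small helper. This is a minimal sketch, not official PyMOL code: `batch_load` is a hypothetical name, and the `cmd` parameter stands in for PyMOL's `pymol.cmd` module (passed in explicitly so the sketch can be exercised outside PyMOL):

```python
from glob import glob


def batch_load(cmd, pattern="*.pdb"):
    """Load all files matching `pattern` with auto-zoom disabled,
    then zoom once at the end instead of once per load."""
    cmd.set("auto_zoom", "off")        # skip per-load view recomputation
    names = []
    for path in sorted(glob(pattern)):
        cmd.load(path)
        names.append(path[:-4])        # object name = filename sans ".pdb"
    cmd.set("auto_zoom", "on")         # restore the default behavior
    cmd.zoom()                         # single zoom after everything is in
    return names
```

Inside PyMOL you would call it as `batch_load(pymol.cmd)`; taking `cmd` as an argument is just a convenience of this sketch.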
Cheers,
Warren
--
mailto:[email protected]
Warren L. DeLano, Ph.D.
Principal Scientist
DeLano Scientific LLC
Voice (650)-346-1154
Fax (650)-593-4020
> -----Original Message-----
> From: Ben Allen [mailto:[email protected]]
> Sent: Tuesday, September 07, 2004 2:48 PM
> To: Warren DeLano
> Cc: [email protected]
> Subject: Re: [PyMOL] long loading times as the number of existing objects increases
>
> Warren-
> Thanks for your prompt response!
> Given the fundamental issues you mentioned, I think I will
> change my script so that it loads files only when they are
> needed and deletes the associated objects when they are no
> longer being displayed. Initially, I rejected this solution
> as less efficient, but apparently the specific situation with
> pymol actually makes it more efficient!
>
> Although the current version of my script uses a lot of
> outside information to determine which files are loaded, how
> they are colored, and how they are aligned (and so I haven't
> included it), the following illustrates what I'm talking about:
>
> #!/usr/bin/env python
>
> from glob import glob
> from time import time
>
> if __name__ == 'pymol':
>     from pymol import cmd
>     t1 = time()
>     for pdb in glob('*.pdb'):
>         print pdb
>         cmd.load(pdb)
>     t2 = time()
>     print t2 - t1
>
> If (from pymol) I cd to a directory that has 50 pdb files and
> run this script, it takes about 105 sec to complete. If I
> include simple alignment and color commands, as follows:
>
> #!/usr/bin/env python
>
> from glob import glob
> from time import time
>
> if __name__ == 'pymol':
>     from pymol import cmd
>     t1 = time()
>     objects = []
>     for pdb in glob('*.pdb'):
>         print pdb
>         cmd.load(pdb)
>         objects.append(pdb[:-4])
>         cmd.fit(objects[-1] + ' and name ca', objects[0] + ' and name ca')
>         cmd.color('wheat', objects[-1] + ' and elem c')
>     t2 = time()
>     print t2 - t1
>
> it still takes the same amount of time as before. This is only
> one data point (50 structures), because I didn't want to
> repeat the benchmarks for larger sets of structures, but it
> seems to indicate that the limiting step is the actual
> loading of the pdb files, and not the subsequent
> aligning/coloring steps.
>
> Thanks again for letting me know which direction I should go.
> I'll let you know if I get any insight into the origin of the
> original issue.
>
> -Ben
>
> On Sep 7, 2004, at 11:51 AM, Warren DeLano wrote:
>
> Ben,
>
> Thanks for the great benchmarks! PyMOL is definitely showing
> non-linear behavior when it comes to loading a lot of objects. I
> don't know exactly why this is, but I can tell you that I didn't
> originally envision (and thus didn't optimize PyMOL for) loading so
> many objects.
>
> As it currently stands, there are a number of places where PyMOL does
> things using lists when it should be using hashes, and there are many
> tasks (such as selecting atoms) that are linearly dependent (or
> worse) on the total number of atoms and coordinate sets present in
> the system. All of these issues will be addressed in time, but it may
> take considerable work to correct them. Unfortunately, these are more
> than just bugs -- they are limitations in the original design. Such
> limitations are now the bane of my existence; my dreams are filled
> with questions of "How do we fix or improve the software without
> breaking existing PyMOL usage?" Remodeling an airplane full of
> passengers while you're flying it is much more challenging than when
> it is empty and on the ground. : )
>
> My current advice is to find creative ways of limiting the total
> number of atoms and objects loaded into PyMOL at one time. One way to
> do this is to create subsets which contain just those atoms you'd
> like to see. Another approach is to run multiple PyMOL instances
> simultaneously.
>
> Cheers,
> Warren
>
> PS. It would be great if you could send us one of your more
> challenging example scripts to use as a test case for improvement --
> and if you do spot simple bottlenecks in the code, such information
> could be very helpful.
>
> --
> mailto:[email protected]
> Warren L. DeLano, Ph.D.
> Principal Scientist
> DeLano Scientific LLC
> Voice (650)-346-1154
> Fax (650)-593-4020
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Ben Allen
> Sent: Tuesday, September 07, 2004 10:32 AM
> To: [email protected]
> Subject: [PyMOL] long loading times as the number of existing objects increases
>
> I have a situation in which I need to load a large number of separate
> pdb files into a single pymol session. In this case, the number is
> ~150, but it could potentially be more. However, the amount of time
> required to load a file appears to be strongly dependent on the
> number of files already loaded. For example:
>
> # of structures loaded    time to load all structures (seconds)
>  5                          0.82
> 10                          2.49
> 20                         11.05
> 30                         29.85
> 40                         62.48
> 50                        115.25
> 60                        189.79
> 70                        302.67
> 80                        432.82
> 90                        589.23
>
> Unfortunately, this means that loading 150 structures takes over an
> hour. I observe this behavior whether I am loading the structures all
> at once using a python script, or one at a time. In both cases, I am
> using the cmd.load() api function, but the built-in load command
> gives similar results. The structures I am loading are (nearly)
> identical: each has 263 residues (in a single chain); each individual
> pdb file is about 215KB.
>
> I am running this on a dual 2.0 GHz G5 system with 1.5 GB memory. The
> long loading times are consistent between the two versions of pymol I
> have installed: OSX/X11 hybrid version 0.97 and MacPyMOL version
> 0.95. During the long loading times, there is plenty of memory
> available, but the processor load stays at 50% (i.e. one processor on
> my machine is fully loaded throughout).
>
> My gut feeling is that this situation should not be, but I don't yet
> understand the structure of the code well enough to debug it. Can
> anyone shed light on this issue?
>
> Thanks in advance,
> Ben Allen
>
> _______________________________________________
> PyMOL-users mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/pymol-users