Bill:

For the NONMEM parallelization process, you must specify a maximum number of CPUs and the computers that are available. There is no pre-run facility to assess the maximum number of CPUs needed. You can set it to the maximum physically available and allow NONMEM's load-balancing algorithm to determine whether it needs only a subset of them.

The default launching programs for NONMEM have limited capability to search on their own for available computers on the network. You may substitute your own launching program. There is no way to scale the number of worker threads in the middle of a run, unless this capability comes from the launching program.
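To make the idea of "set the maximum and let the load balancer use a subset" concrete, here is a minimal Python sketch. It is not NONMEM's actual load-balancing algorithm, which this thread does not describe. The function names `candidate_counts` and `pick_worker_count`, the doubling schedule, and the 5% tolerance are all assumptions made for illustration. The sketch takes measured seconds per function evaluation at several worker counts and returns the smallest count that is close to the fastest.

```python
def candidate_counts(max_cpus):
    """Worker counts to try: 1, 2, 4, ... capped at the user-specified maximum.

    The doubling schedule is an assumption for this sketch, not NONMEM behavior.
    """
    if max_cpus < 1:
        raise ValueError("max_cpus must be at least 1")
    counts, n = [], 1
    while n < max_cpus:
        counts.append(n)
        n *= 2
    counts.append(max_cpus)  # always include the specified maximum itself
    return counts


def pick_worker_count(timings, tolerance=0.05):
    """Return the smallest worker count whose time is near the best observed.

    `timings` maps a worker count to measured seconds per function
    evaluation. `tolerance` is the allowed fractional slowdown relative to
    the fastest timing (hypothetical default of 5%).
    """
    if not timings:
        raise ValueError("need at least one timing")
    best = min(timings.values())
    acceptable = [n for n, t in timings.items() if t <= best * (1.0 + tolerance)]
    return min(acceptable)


if __name__ == "__main__":
    # Made-up timings for illustration only; not measurements from NONMEM.
    example = {1: 8.0, 2: 4.2, 4: 2.3, 8: 1.5, 16: 1.48}
    print(candidate_counts(16))
    print(pick_worker_count(example))
```

With timings like these, going from 8 to 16 workers buys almost nothing, so a scheduler could leave the remaining CPUs free for other jobs.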
Robert J. Bauer, Ph.D.
Vice President, Pharmacometrics
ICON Development Solutions
Tel: (215) 616-6428
Mob: (925) 286-0769
Email: robert.ba...@iconplc.com
Web: www.icondevsolutions.com

________________________________
From: Denney, William S. [mailto:william_den...@merck.com]
Sent: Wednesday, September 22, 2010 9:58 PM
To: Bauer, Robert; nmusers
Subject: RE: [NMusers] FW: NM7.2 Parallelization and Virtualization

Hi Bob,

Thanks for the detailed response. I'm thinking about how cluster software will handle a variable number of threads being spawned, and how it may be helpful to know (or estimate relatively closely) the number of threads a priori.

Out of curiosity, will it be possible to specify a maximum number of threads for the auto-tuning search? Will NONMEM have the ability to detect (with MPI) the available number of worker CPUs? Might there be a pre-run (and therefore relatively quick) method to calculate the number of CPUs?

For some of our workloads, I'm imagining that we'll want higher-priority jobs to be able to execute by taking resources from lower-priority jobs. Is there any way to scale the number of worker threads during the middle of a run?

Thanks,

Bill

________________________________
From: owner-nmus...@globomaxnm.com [mailto:owner-nmus...@globomaxnm.com] On Behalf Of Bauer, Robert
Sent: Wednesday, September 22, 2010 1:21 PM
To: nmusers
Subject: [NMusers] FW: NM7.2 Parallelization and Virtualization

Bill:

Yes, we are planning to release NONMEM 7.2 in January. Among other exciting features, it will have parallel computing.

From our tests, we have found that the optimal number of cores depends on the problem. At one extreme, suppose the problem contains many subjects, and each subject takes a long time to evaluate because of a large number of differential equations and/or a large number of dose events, so that one subject takes a minute to evaluate on each function evaluation. Then as many cores as there are subjects would still be efficient.
Our parallelization algorithm does not split up the problem beyond one subject per core. At the other extreme, if the problem takes just 0.01 second to evaluate all subjects for a function evaluation, then it is not worth using parallel processing.

For each function call, the manager core packages a subset of subjects and sends the data to a worker core, then the worker core returns its results to the manager, and the manager summarizes the information from all of the workers. For the next function call, the process begins again. In NONMEM there is an optional algorithm that will determine how many nodes it actually needs for the job by timing the first few iterations.

NONMEM can parallelize across computers as well as across individual cores on those computers. However, depending on your intranet connection between computers, the process will be a little slower across computers than among cores on the manager computer alone.

Two passing methods will be available: file passing interface (FPI) and message passing interface (MPI). FPI is built in to NONMEM and is available upon installation of NONMEM, but is somewhat less efficient than MPI, especially for small problems. MPI is more efficient, but requires installation of third-party software that is free and widely used, and we recommend you set this up for your cluster. See the web site http://phase.hpcc.jp/mirrors/mpi/mpich2/

I think 8 to 16 nodes per computer with about 2 GB memory per node should be ideal for almost any problem in NONMEM. Alternatively, 0.5 GB per node is enough for many NONMEM problems. Your operating system can be Windows or Linux. We have not tried Mac OS X.

I do not know enough about virtualization. We have no such facility at ICON, and we will not be supporting that environment, so you would be on your own to troubleshoot such an environment.
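The manager/worker cycle described above (package a subset of subjects per worker, evaluate, return, summarize) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not NONMEM's implementation: the function names are hypothetical, a thread pool stands in for FPI or MPI transport, contiguous equal-sized chunks are an assumed partitioning, and "summarize" is assumed to mean summing per-subject objective contributions.

```python
from concurrent.futures import ThreadPoolExecutor


def split_subjects(subject_ids, max_workers):
    """Partition subjects into contiguous chunks, one chunk per worker.

    Never makes more chunks than subjects, mirroring the statement that the
    problem is not split beyond one subject per core.
    """
    n_workers = max(1, min(max_workers, len(subject_ids)))
    base, extra = divmod(len(subject_ids), n_workers)
    chunks, start = [], 0
    for w in range(n_workers):
        size = base + (1 if w < extra else 0)  # spread the remainder evenly
        chunks.append(subject_ids[start:start + size])
        start += size
    return chunks


def worker_evaluate(chunk, subject_objective):
    """Worker side: evaluate each subject in the chunk, return a partial sum."""
    return sum(subject_objective(s) for s in chunk)


def manager_function_call(subject_ids, subject_objective, max_workers):
    """Manager side: one function evaluation spread across the workers.

    A thread pool is used here only as a stand-in for the real transport.
    """
    chunks = split_subjects(subject_ids, max_workers)
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        partials = list(
            pool.map(lambda c: worker_evaluate(c, subject_objective), chunks)
        )
    return sum(partials)  # manager summarizes the workers' results


def toy_objective(subject_id):
    """Hypothetical per-subject contribution, for demonstration only."""
    return float(subject_id) ** 2


if __name__ == "__main__":
    subjects = list(range(1, 11))
    print(split_subjects(subjects, 4))
    print(manager_function_call(subjects, toy_objective, 4))
```

Because the whole cycle repeats on every function call, the fixed cost of packaging and sending data is paid each time, which is why very cheap function evaluations gain little from parallelization.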
One aspect of parallelization is that NONMEM sends a copy of its program (nonmem.exe on Windows, nonmem on Linux) to the worker computer, and then loads it there. Therefore, the worker computers must run the same operating system (although not necessarily the same version) as the manager computer. For Intel Fortran, the worker computer does not have to have Intel Fortran installed. If you use the MPI system, though, the MPI DLL files or shared library files must be installed on every worker computer. For gfortran, the worker computer does not have to have gfortran installed, but may require its shared libraries to be available.

Robert J. Bauer, Ph.D.

________________________________
From: Denney, William S. [mailto:william_den...@merck.com]
Sent: Tuesday, September 21, 2010 10:02 AM
To: Bauer, Robert
Subject: NM7.2 Parallelization and Virtualization

Hi Bob,

I hope that you're doing well these days. I have a couple of questions about parallelization and virtualization that I was wondering if you could help answer to assist with preparing our cluster for NM7.2.

Parallelization

As I recall from our April training, you mentioned that a new feature of NM7.2 is going to be parallelization. I was wondering if that's still one of the planned features, and if so, do you have any insight about the best ways to set up a cluster for parallel NM running? From historical experience, I know that execution of many types of parallel jobs tends to max out around 4-8 cores, though I can imagine NONMEM-like jobs requiring relatively little message passing. Do you have an idea of how well NM7.2 will parallelize? Is there a recommended number of cores/node? Will NM7 parallelize across nodes?

Virtualization

Do you have any experience with virtualization of NONMEM-running clusters?
Are there any pros to it over bare-metal operation from a NONMEM standpoint?

Thanks,

Bill

Notice: This e-mail message, together with any attachments, contains information of Merck & Co., Inc. (One Merck Drive, Whitehouse Station, New Jersey, USA 08889), and/or its affiliates Direct contact information for affiliates is available at http://www.merck.com/contact/contacts.html) that may be confidential, proprietary copyrighted and/or legally privileged. It is intended solely for the use of the individual or entity named on this message. If you are not the intended recipient, and have received this message in error, please notify us immediately by reply e-mail and then delete it from your system.

ICON plc made the following annotations.
------------------------------------------------------------------------------
This e-mail transmission may contain confidential or legally privileged information that is intended only for the individual or entity named in the e-mail address. If you are not the intended recipient, you are hereby notified that any disclosure, copying, distribution, or reliance upon the contents of this e-mail is strictly prohibited. If you have received this e-mail transmission in error, please reply to the sender, so that ICON plc can arrange for proper delivery, and then please delete the message.

Thank You,
ICON plc
South County Business Park
Leopardstown
Dublin 18
Ireland
Registered number: 145835