[OMPI users] Fwd: Announcing the release of BLCR 0.6.3

2008-01-23 Thread Josh Hursey
Attention anyone using BLCR for checkpoint/restart functionality: you
are encouraged to upgrade to the latest release of BLCR, 0.6.3. This
fixes a data corruption problem seen by a number of users.


The release announcement is enclosed, and below is a link to the  
resolved bug if you are interested in more details.

http://upc-bugs.lbl.gov/bugzilla/show_bug.cgi?id=2001

-- Josh


Begin forwarded message:


From: "Paul H. Hargrove" 
Date: January 22, 2008 3:14:31 PM GMT-05:00
To: checkpo...@lbl.gov
Subject: Announcing the release of BLCR 0.6.3

First, let me apologize for the download problems many of you  
encountered with the 0.6.2 release.


Second, only a week after releasing 0.6.2, there is a 0.6.3 release  
to fix a floating-point corruption problem on the x86-64  
architecture (present in 0.6.2 and all previous releases for the  
x86-64 architecture).


The 0.6.3 release is now available from the BLCR Downloads page:
http://ftg.lbl.gov/CheckpointRestart/CheckpointDownloads.shtml

From the NEWS file:

0.6.3

January 22, 2008
Bug-fix release.
- This release fixes bug 2001 which was causing intermittent floating-
  point register corruption on x86-64, even after BLCR was unloaded.


-Paul

PS
You are receiving this either because you are on the checkpo...@lbl.gov
list, or because you've recently sent email to the list (or me directly)
asking about BLCR status.


--
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
HPC Research Department   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900





[OMPI users] Need explanation for the following ORTE error message

2008-01-23 Thread David Gunter
A user of one of our OMPI 1.2.3 builds encountered the following error  
message during an MPI job run:


ORTE_ERROR_LOG: File read failure in file
util/universe_setup_file_io.c at line 123

He reported that the job ran normally other than that but we are  
wondering what this message means.


Thanks,
david
--
David Gunter
HPC-3: Parallel Tools Team
Los Alamos National Laboratory





Re: [OMPI users] Need explanation for the following ORTE error message

2008-01-23 Thread Ralph H Castain



On 1/23/08 8:26 AM, "David Gunter"  wrote:

> A user of one of our OMPI 1.2.3 builds encountered the following error
> message during an MPI job run:
> 
> ORTE_ERROR_LOG: File read failure in file
> util/universe_setup_file_io.c at line 123

It means that at some point in the past, an mpirun attempted to start up,
started to write a file that includes info on its name and contact info, and
then was aborted. The user subsequently restarted the job, it saw the file
and attempted to read it, but the info in the file was incomplete.

This can be ignored - we eliminated that handshake from future versions, so
you'll never see it after 1.2.

Ralph

> 
> He reported that the job ran normally other than that but we are
> wondering what this message means.
> 
> Thanks,
> david
> --
> David Gunter
> HPC-3: Parallel Tools Team
> Los Alamos National Laboratory
> 
> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] Excessive Use of CPU System Resources with OpenMPI 1.2.4 using TCP only ..

2008-01-23 Thread Terry Frankcombe
On Tue, 2008-01-22 at 20:19 +0100, Pignot Geoffroy wrote:
> You could try the following MCA setting in your mpirun command
> --mca mpi_yield_when_idle 1

Yes, but to repeat what was said above, it is first essential that you
read:

and the related






[OMPI users] Jeffrey M Ceason is out of the office.

2008-01-23 Thread Jeffrey M Ceason

I will be out of the office starting 01/23/2008 and will not return until
02/04/2008.

I will respond to your message when I return.



[OMPI users] Information about multi-path on IB-based systems

2008-01-23 Thread David Gunter
Do I need to do anything special to enable multi-path routing on  
InfiniBand networks?  For example, are there command-line arguments to  
mpiexec or the like?


Thanks,
david
--
David Gunter
HPC-3: Parallel Tools Team
Los Alamos National Laboratory




[OMPI users] CfP 3rd Workshop on Virtualization in HPC Cluster and Grid Computing Environments (VHPC'08)

2008-01-23 Thread Michael Alexander
Apologies if you received multiple copies of this message.


===
CALL FOR PAPERS

3rd Workshop on Virtualization in High-Performance Cluster
and Grid Computing (VHPC'08)

as part of Euro-Par 2008, Las Palmas de Gran Canaria, Canary Islands,
Spain

===


Date: August 26-29, 2008

Euro-Par 2008: http://europar2008.caos.uab.es/
Workshop URL: http://xhpc.wu-wien.ac.at/


SUBMISSION DEADLINE:
Abstracts: February 4, 2008
Full Paper: April 14, 2008


Scope:
Virtual machine monitors (VMMs) are becoming tightly integrated with
standard OS distributions, leading to increased adoption in many
application areas including scientific, educational, and high-performance
computing (HPC). VMMs allow for the concurrent execution of potentially
large numbers of virtual machines, providing encapsulation, isolation,
and the possibility for migrating VMs between physical hosts. These
features enable physical clusters to be treated as "computation pools",
where a variety of execution environments can be dynamically
instantiated on the underlying hardware. VM technology is therefore
opening up new architectures and services for HPC in cluster and grid
environments, but consensus has not yet emerged on the best models
and tools. This workshop aims to bring together researchers and
practitioners working on virtualization in HPC environments, with the
goal of sharing experience and promoting the development of a
research community in this emerging area.

The workshop will be one day in length, composed of 20 min paper
presentations, each followed by 10 min discussion sessions.
Presentations may be accompanied by interactive demonstrations.
The workshop will also include a 30 min panel discussion by presenters.




TOPICS

Topics include, but are not limited to, the following subject matters:

- Virtualization in cluster and grid environments
- Workload characterizations for VM-based clusters
- VM cluster and grid architectures
- Cluster reliability, fault-tolerance, and security
- Compute job entry and scheduling
- Compute workload load leveling
- Cluster and grid filesystems for VMs
- VMMs, VMs and QoS guarantees
- Research and education use cases
- VM cluster distribution algorithms
- MPI, PVM on virtual machines
- System sizing
- Hardware support for virtualization
- High-speed interconnects in hypervisors
- Hypervisor extensions and utilities for cluster and grid computing
- Network architectures for VM-based clusters
- VMMs/Hypervisors on large SMP machines
- Performance models
- Performance management and tuning of hosts and guest VMs
- Power considerations
- VMM performance tuning on various load types
- Xen/other VMM cluster/grid tools
- High-speed device access from VMs
- Management, deployment of clusters and grid environments with VMs
- Information systems for virtualized clusters
- Management of system images for virtual machines
- Integration with relevant standards e.g. CIM, GLUE, OGF, etc.


PAPER SUBMISSION

Papers submitted to each workshop will be reviewed by at least two
members of the program committee and external reviewers. Submissions
should include abstract, key words, the e-mail address of the
corresponding author, and must not exceed 10 pages, including tables
and figures at a main font size no smaller than 11 point. Submission
of a paper should be regarded as a commitment that, should the paper be
accepted, at least one of the authors will register and attend the
conference to present the work.

Accepted papers will be published in the Springer LNCS series - the
format must be according to the Springer LNCS Style. Initial
submissions are in PDF; authors of accepted papers will be asked to
provide source files.

http://www.springer.de/comp/lncs/authors.html


Submission Link:
http://www.edas.info/newPaper.php?c=6123



IMPORTANT DATES


Abstract submission due: February 4, 2008
Full paper submission due: April 14, 2008
Acceptance notification: May 3, 2008
Camera-ready due: May 26, 2008
Conference: August 26-29, 2008


CHAIR


Michael Alexander (chair), WU Vienna, Austria
Stephen Childs (co-chair), Trinity College, Dublin, Ireland


PROGRAM COMMITTEE


Jussara Almeida, Federal University of Minas Gerais, Brazil
Padmashree Apparao, Intel Corp., US
Hassan Barada, Etisalat University College, UAE
Volker Buege, University of Karlsruhe, Germany
Simon Crosby, Xensource, UK
Marcus Hardt, Forschungszentrum Karlsruhe, Germany
Sverre Jarp, CERN, Switzerland
Krishna Kant, Intel Corporation, US
Yves Kemp, University of Karlsruhe, Germany
Naoya Maruyama, Tokyo Institute of Technology, Japan
Jean-Marc Menaud, Ecole des Mines de Nantes, France
José E. Moreira, IBM Watson Research Center, US
Yoshio Turner, HP Labs
Andreas Unterkircher, CERN, Switzerland
Dongyan Xu, Purdue University, US


GENERAL INFORMATION


The workshop will be held as part of Euro-Par 2008, Las Palmas de
Gran Canaria, Canary Islands, Spain.

Eur