On Tuesday 18 August 2009 08:56:06 Johan Hake wrote:
> On Monday 17 August 2009 23:51:39 Anders Logg wrote:
> > On Mon, Aug 17, 2009 at 11:20:08PM +0200, Johan Hake wrote:
> > > On Monday 17 August 2009 19:19:40 Anders Logg wrote:
> > > > On Mon, Aug 17, 2009 at 07:09:11PM +0200, DOLFIN wrote:
> > > > > changeset: 6762:ca407204632a1b0430099c243c915a151b2bd941
> > > > > parent: 6759:efc24a341e41e9e0c83616be4613d819fe95ccb6
> > > > > user: Anders Logg <l...@simula.no>
> > > > > date: Mon Aug 17 19:08:56 2009 +0200
> > > > > files: site-packages/dolfin/compile_function.py
> > > > >        site-packages/dolfin/jit.py
> > > > > description:
> > > > > Make JIT compiler work in parallel. The process number is added to
> > > > > the signature to create a unique signature for each process. This
> > > > > means that each process will compile its own form. This may not be
> > > > > optimal and could possibly be handled by Instant. On the other
> > > > > hand, it seems to work nicely and might also be advantageous when
> > > > > processes don't share a common cache.
> > > >
> > > > The Poisson Python demo now runs as is without the need for first
> > > > running it in serial (to handle JIT compilation):
> > >
> > > Did it not work before this change? I know Martin added some file locks
> > > to prevent simultaneous compilations of the same module.
> >
> > No, it didn't work before. I get things like
> >
> > In instant.build_module: Path
> > '/home/logg/.instant/cache/form_f38430af401fbeddb9be4091a6fcde37cef9fa35'
> > already exists, but module wasn't found in cache previously. Not
> > overwriting, assuming this module is valid.
> > Traceback (most recent call last):
> >   File "demo.py", line 23, in <module>
> >     V = FunctionSpace(mesh, "CG", 1)
> >   File "/home/logg/scratch/src/fenics-dev/dolfin-dev/local/lib/python2.6/site-packages/dolfin/functionspace.py", line 181, in __init__
> >     FunctionSpaceBase.__init__(self, mesh, element)
> >   File "/home/logg/scratch/src/fenics-dev/dolfin-dev/local/lib/python2.6/site-packages/dolfin/functionspace.py", line 43, in __init__
> >     ufc_element, ufc_dofmap = jit(self._element)
> >   File "/home/logg/scratch/src/fenics-dev/dolfin-dev/local/lib/python2.6/site-packages/dolfin/jit.py", line 67, in jit
> >     return jit_compile(form, options)
> >   File "/home/logg/scratch/lib/fenics-dev/lib/python2.6/site-packages/ffc/jit/jit.py", line 56, in jit
> >     return jit_element(object, options)
> >   File "/home/logg/scratch/lib/fenics-dev/lib/python2.6/site-packages/ffc/jit/jit.py", line 125, in jit_element
> >     (compiled_form, module, form_data) = jit_form(form, options)
> >   File "/home/logg/scratch/lib/fenics-dev/lib/python2.6/site-packages/ffc/jit/jit.py", line 102, in jit_form
> >     os.unlink(signature + ".h")
> > OSError: [Errno 2] No such file or directory:
> > 'form_f38430af401fbeddb9be4091a6fcde37cef9fa35.h'
>
> It looks like the error comes from unlinking a file more than one time
> (done in ffc/jit.py), and not in instant. I will look at it.
>
> > I guess the second process tries to read the generated file but
> > it's not ready yet (still being generated by the first process).
> >
> > It would be good to handle the parallel JIT compilation as part of
> > Instant, but I don't know what the best solution is.
> >
> > > > mpirun -n 4 python demo.py
> > >
> > > Do I have to set some environmental variables to make this work. I
> > > can't get it to work (probably some stupid error) :P
> >
> > No, nothing. It should work out of the box.
> >
> > > Johan
> > >
> > > When running the above command I get:
> > >
> > > ssh: connect to host hake-laptop port 22: Connection refused
> >
> > Can you run other processes in parallel?
> >
> > mpirun -n 4 ls
> >
> > Maybe you need to install sshd? I didn't know it was required.
>
> Yes, that did the trick! openssh-server in ubuntu, btw, and I also had to
> put my public ssh keys in my own authorized keys.

Also, I had to add an alias for mpirun:

  alias mpirun='mpirun -x PYTHONPATH -x LD_LIBRARY_PATH -x PKG_CONFIG_PATH -x PATH -x DISPLAY'

which forwards the listed environment variables, and I had to set
DISPLAY=:0.0 in my .bashrc file. I know that others do not have to forward
these variables, but I couldn't find a way around it.

Johan

> Johan
>
> > --
> > Anders
> >
> > > --------------------------------------------------------------------------
> > > A daemon (pid 32065) died unexpectedly with status 255 while
> > > attempting to launch so we are aborting.
> > >
> > > There may be more information reported by the environment (see above).
> > >
> > > This may be because the daemon was unable to find all the needed shared
> > > libraries on the remote node. You may set your LD_LIBRARY_PATH to have
> > > the location of the shared libraries on the remote nodes and this will
> > > automatically be forwarded to the remote nodes.
> > > --------------------------------------------------------------------------
> > > --------------------------------------------------------------------------
> > > mpirun noticed that the job aborted, but has no info as to the
> > > process that caused that situation.
> > > --------------------------------------------------------------------------
> > > mpirun: clean termination accomplished
>
> _______________________________________________
> DOLFIN-dev mailing list
> DOLFIN-dev@fenics.org
> http://www.fenics.org/mailman/listinfo/dolfin-dev
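
For anyone who would rather not rely on a shell alias, the same forwarding can be sketched from Python. This is only an illustration: build_mpirun_command is a made-up helper name, not part of DOLFIN, FFC or Instant, and skipping variables that are unset in the current environment is my own choice rather than something the alias does. The only mpirun options used are -x and -n, as in the commands quoted in this thread.

```python
import os

# Variables forwarded by the mpirun alias discussed in this thread.
FORWARDED_VARS = ["PYTHONPATH", "LD_LIBRARY_PATH", "PKG_CONFIG_PATH", "PATH", "DISPLAY"]


def build_mpirun_command(num_procs, program, forwarded=FORWARDED_VARS, environ=None):
    """Build an argv list for mpirun that exports selected variables with -x.

    Hypothetical helper: variables missing from `environ` are skipped
    (an assumption made for this sketch, not behaviour of the alias).
    """
    environ = os.environ if environ is None else environ
    argv = ["mpirun"]
    for name in forwarded:
        if name in environ:
            argv += ["-x", name]  # forward this variable to the launched processes
    argv += ["-n", str(num_procs)]
    argv += list(program)
    return argv
```

The resulting list can be handed to subprocess.call, for example subprocess.call(build_mpirun_command(4, ["python", "demo.py"])), which should behave like the aliased command line as long as mpirun is on the PATH.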