On Wed, Oct 30, 2013 at 2:14 PM, Anders Logg <[email protected]> wrote:
> On Wed, Oct 30, 2013 at 11:25:06AM +0100, Johannes Ring wrote:
>> On Tue, Oct 29, 2013 at 9:49 PM, Anders Logg <[email protected]> wrote:
>> > On Tue, Oct 29, 2013 at 10:12:35AM +0000, Garth N. Wells wrote:
>> >> On 2013-10-29 10:03, Anders Logg wrote:
>> >> > On Tue, Oct 29, 2013 at 10:30:03AM +0100, Johannes Ring wrote:
>> >> >> On Tue, Oct 29, 2013 at 10:06 AM, Martin Sandve Alnæs
>> >> >> <[email protected]> wrote:
>> >> >>> Some of the buildbots have been offline for a while.
>> >> >>
>> >> >> I have restarted one (wheezy-amd64) and disabled the others now
>> >> >> (osx-10.6 and sid-amd64). The osx-10.6 buildbot machine has been
>> >> >> upgraded to OS X 10.8, but is currently not ready to be used as a
>> >> >> buildbot slave.
>> >> >
>> >> > Great. I think we should only enable buildbots that are known to work.
>> >> >
>> >> >>> Some of them have strange timeouts.
>> >> >>
>> >> >> sid-amd64 hangs when compiling the fem convergence benchmark. I thought
>> >> >> this had been fixed when gcc 4.8.2 was uploaded to Debian unstable,
>> >> >> but it was not. Can someone else reproduce this timeout when compiling
>> >> >> this benchmark with gcc 4.8? We might want to disable building of this
>> >> >> benchmark.
>> >
>> > Where is the sid-amd64 buildbot? Is it offline? I don't see it in the list.
>>
>> I removed it since it was just hanging, but I will add it back when we
>> disable the fem convergence benchmark.
>
> ok.
>
>> >> > It hangs for me with gcc 4.8.1. Does anyone know when a new gcc will
>> >> > enter Ubuntu?
>> >> >
>> >>
>> >> I wouldn't hold your breath - Ubuntu has a poor track record in
>> >> releasing bug fixes for development tools.
>> >>
>> >> > The benchmark builds fine if I disable P5 for R3. I suggest we disable
>> >> > that benchmark for now and re-enable it later. We should open an issue
>> >> > so we don't forget it. (I can do this later today.)
>> >> >
>> >> >>> And at least two display this error:
>> >> >>>
>> >> >>> dolfin-master-full-precise-amd64
>> >> >>>
>> >> >>> [ 0%] Building CXX object
>> >> >>> test/unit/la/cpp/CMakeFiles/test_LinearOperator.dir/LinearOperator.cpp.o
>> >> >>> Linking CXX executable test_LinearOperator
>> >> >>> /home/buildbot/fenicsbbot/master/dolfin-full/lib/libdolfin.so:
>> >> >>> undefined reference to `METIS_Free'
>> >> >>
>> >> >> That error was first encountered in this build:
>> >> >>
>> >> >> http://fenicsproject.org:8010/builders/dolfin-master-full-precise-amd64/builds/417
>> >> >
>> >> > Looks like Garth?
>> >> >
>> >>
>> >> The correlation between code changes and buildbot errors has been
>> >> very weak of late. The error is a linking problem which is probably
>> >> due to a library configuration problem on the buildbot.
>> >
>> > Can we get it fixed?
>>
>> This was fixed after I removed the libparmetis-dev Debian package from
>> the buildbot. It is strange, though, that it suddenly (in build 417)
>> started to pick up libparmetis.so from /usr/lib instead of the locally
>> installed library:
>>
>> http://fenicsproject.org:8010/builders/dolfin-master-full-precise-amd64/builds/416/steps/configure%20%28enable%20all%29/logs/CMakeCache.txt
>> http://fenicsproject.org:8010/builders/dolfin-master-full-precise-amd64/builds/417/steps/configure%20%28enable%20all%29/logs/CMakeCache.txt
>>
>> The Debian package is version 3.1.1, so it shouldn't use that one
>> anyway since we require 4.0.2.
>>
>> > I'm seeing the following errors at the moment:
>> >
>> > master:
>> >
>> > * ImportError: libteuchos.so: cannot open shared object file: No such file
>> >   or directory;
>>
>> This happened after I upgraded to Trilinos 11.4.1 but it is fixed now
>> after everything was rebuilt last night.
>> However, there is a new error
>> on this buildbot slave (wheezy-amd64):
>>
>> $ mpirun -np 3 ./demo_bcs
>> [debian-bbot:22320] *** An error occurred in MPI_Barrier
>> [debian-bbot:22320] *** on communicator MPI_COMM_WORLD
>> [debian-bbot:22320] *** MPI_ERR_COMM: invalid communicator
>> [debian-bbot:22320] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
>> [debian-bbot:22319] 2 more processes have sent help message
>> help-mpi-errors.txt / mpi_errors_are_fatal
>> [debian-bbot:22319] Set MCA parameter "orte_base_help_aggregate" to 0
>> to see all help / error messages
>> $ OMPI_MCA_orte_base_help_aggregate=0 mpirun -np 3 ./demo_bcs
>> [debian-bbot:22324] *** An error occurred in MPI_Barrier
>> [debian-bbot:22324] *** on communicator MPI_COMM_WORLD
>> [debian-bbot:22324] *** MPI_ERR_COMM: invalid communicator
>> [debian-bbot:22324] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
>> [debian-bbot:22325] *** An error occurred in MPI_Barrier
>> [debian-bbot:22325] *** on communicator MPI_COMM_WORLD
>> [debian-bbot:22325] *** MPI_ERR_COMM: invalid communicator
>> [debian-bbot:22325] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
>> [debian-bbot:22326] *** An error occurred in MPI_Barrier
>> [debian-bbot:22326] *** on communicator MPI_COMM_WORLD
>> [debian-bbot:22326] *** MPI_ERR_COMM: invalid communicator
>> [debian-bbot:22326] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
>
> I have no idea what this is, but since it appears in two places, both
> Wheezy in next and master, I would assume a problem with MPI in Wheezy?
I am trying to figure out this one.

>> > * ImportError: libteuchos.so: cannot open shared object file: No such file
>> >   or directory;
>> > * Strange segfaults on osx-10.7
>>
>> I am looking into this.
>
> Great.

Simply restarting the buildbot process fixed this problem.

Johannes

_______________________________________________
fenics mailing list
[email protected]
http://fenicsproject.org/mailman/listinfo/fenics
