Re: [petsc-dev] DMDAGlobalToNatural errors with Ubuntu:latest; gcc 7 & Open MPI 2.1.1

2019-07-31 Thread Fabian.Jakub via petsc-dev
Awesome, many thanks for your efforts!

On 7/31/19 9:17 PM, Zhang, Junchao wrote:
> Hi, Fabian,
> I found it is an OpenMPI bug w.r.t. self-to-self MPI_Send/Recv using
> MPI_ANY_SOURCE for message matching: OpenMPI does not put the correct value
> in the recv buffer.
> I have a workaround in jczhang/fix-ubuntu-openmpi-anysource.
> I tested with your petsc_ex.F90 and $PETSC_DIR/src/dm/examples/tests/ex14.
> The majority of valgrind errors disappeared. The few that remain are in
> ompi_mpi_init and we can ignore them.
> I filed a bug report to OpenMPI and hope they can fix it in Ubuntu.
> Thanks.
> --Junchao Zhang
> On Tue, Jul 30, 2019 at 9:47 AM Fabian.Jakub via petsc-dev wrote:
> Dear Petsc Team,
> Our cluster recently switched to Ubuntu 18.04, which has gcc 7.4 and
> (Open MPI) 2.1.1. With this I ended up with segfaults and valgrind
> errors in DMDAGlobalToNatural.
> This is evident in a minimal Fortran example, the attached
> petsc_ex.F90, with the following error:
> ==22616== Conditional jump or move depends on uninitialised value(s)
> ==22616==    at 0x4FA5CDB: PetscTrMallocDefault (mtr.c:185)
> ==22616==    by 0x4FA4DAC: PetscMallocA (mal.c:413)
> ==22616==    by 0x5090E94: VecScatterSetUp_SF (vscatsf.c:652)
> ==22616==    by 0x50A1104: VecScatterSetUp (vscatfce.c:209)
> ==22616==    by 0x509EE3B: VecScatterCreate (vscreate.c:280)
> ==22616==    by 0x577B48B: DMDAGlobalToNatural_Create (dagtol.c:108)
> ==22616==    by 0x577BB6D: DMDAGlobalToNaturalBegin (dagtol.c:155)
> ==22616==    by 0x5798446: VecView_MPI_DA (gr2.c:720)
> ==22616==    by 0x51BC7D8: VecView (vector.c:574)
> ==22616==    by 0x4F4ECA1: PetscObjectView (destroy.c:90)
> ==22616==    by 0x4F4F05E: PetscObjectViewFromOptions (destroy.c:126)
> and consequently wrong results in the natural vec.
> I checked whether I had forgotten something in the Fortran example, but I
> can also see the same error, i.e. not being valgrind clean, in pure C PETSc:
> cd $PETSC_DIR/src/dm/examples/tests && make ex14 && mpirun --allow-run-as-root -np 2 valgrind ./ex14
> I then tried various docker/podman linux distributions to make sure that
> my setup is clean and to me it seems that this error is confined to the
> particular gcc version 7.4 and (Open MPI) 2.1.1 from the ubuntu:latest repo.
> I tried other images from dockerhub including
> gcc:7.4.0 <-- could install neither openmpi nor mpich through apt, but
>   works with --download-openmpi and --download-mpich
> ubuntu:rolling (19.04) <-- works
> debian:latest & :stable <-- works
> ubuntu:latest (18.04) <-- fails with openmpi, but works with mpich
>   or with petsc-configure --download-openmpi or --download-mpich
> Is this error with (Open MPI) 2.1.1 a known issue? In the meantime, I
> guess I'll go with a custom mpi install, but given that ubuntu:latest is
> widely used, do you think there is an easy solution to the error?
> I guess you are not eager to delve into this issue with old mpi versions
> but in case you find some spare time, maybe you find the root cause
> and/or a workaround.
> Many thanks,
> Fabian

Re: [petsc-dev] Issues with Fortran Interfaces for PetscSort routines

2019-07-29 Thread Fabian.Jakub via petsc-dev
Fixes it for me. Many thanks for the prompt reply!

On 7/30/19 12:34 AM, Zhang, Junchao wrote:
> Fixed in jczhang/fix-sort-fortran-binding and will be in master later. Thanks.
> --Junchao Zhang
> On Mon, Jul 29, 2019 at 10:14 AM Fabian.Jakub via petsc-dev wrote:
> Dear Petsc,
> Commit 051fd8986cf23c0556f4229193defe128fafa1f7 changed the C signature
> of the sorting routines and as a result I cannot compile against them
> anymore from Fortran.
> I tried to rebuild Petsc from scratch and did a make allfortranstubs but
> still to no avail.
> I attach a simple fortran program that calls PetscSortInt and gives the
> following error at compile time.
> petsc_fortran_sort.F90:15:27:
>    call PetscSortInt(N, x, ierr)
> Error: Rank mismatch in argument ‘b’ at (1) (scalar and rank-1)
> Same applies for other routines such as PetscSortIntWithArrayPair...
> I am not sure where to find the FortranInterfaces and currently had no
> time to dig deeper.
> Please let me know if I have missed something stupid.
> Many thanks,
> Fabian
> P.S. Petsc was compiled with
> --with-fortran
> --with-fortran-interfaces
> --with-shared-libraries=1

[petsc-dev] Error in HDF5 dumps of DMPlex labels

2019-02-14 Thread Fabian.Jakub via petsc-dev
Dear Petsc Team!

I had an issue when writing out DMPlex objects through hdf5.

This comes from a DMLabel that has only entries on non-local mesh points.
The DMLabel write only includes local parts of the label and so leads to
a zero sized write for the index set.
This seems to be fine except that the hdf5 chunksize is set to zero
which is not allowed.

I added a minimal example to illustrate the error.
It creates a 2D DMPlex in serial, distributes it, labels the nonlocal
points in the mesh and dumps it via PetscObjectViewer to HDF5.
Run with:

   make plex.h5

I also attached a quick fix to override the chunksize.

Please let me know if you need anything extra and also if this is expected
behavior... I could certainly live with the fact that DMLabel is not supposed
to work this way.

Many thanks,


From a8fb918b6f1ef49b8b14c5c492581ff84d484eb6 Mon Sep 17 00:00:00 2001
From: "Fabian.Jakub" 
Date: Thu, 14 Feb 2019 13:21:28 +0100
Subject: [PATCH] fix hdf5 chunksizes of 0

  chunksize must not be 0.
  H5Pset_chunk(chunkspace, dim, chunkDims) will otherwise give an error.
  This happened for example when dumping a dmlabel inside a dmplex which
  has only entries on non-local points.
 src/vec/is/is/impls/general/general.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/vec/is/is/impls/general/general.c b/src/vec/is/is/impls/general/general.c
index c6930cf..476a149 100644
--- a/src/vec/is/is/impls/general/general.c
+++ b/src/vec/is/is/impls/general/general.c
@@ -264,7 +264,7 @@ static PetscErrorCode ISView_General_HDF5(IS is, PetscViewer viewer)
   ierr = PetscHDF5IntCast(N/bs,dims + dim);CHKERRQ(ierr);
   maxDims[dim]   = dims[dim];
-  chunkDims[dim] = dims[dim];
+  chunkDims[dim] = PetscMax(1,dims[dim]);
   if (bs >= 1) {
     dims[dim]  = bs;

program main
#include "petsc/finclude/petsc.h"

  use petsc
  implicit none

  PetscErrorCode :: ierr
  PetscInt, parameter :: petscint_dummy=0
  integer, parameter :: pi=kind(petscint_dummy)

  type(tDM) :: dm, dmdist

  call PetscInitialize(PETSC_NULL_CHARACTER,ierr); CHKERRQ(ierr)

  call create_plex(PETSC_COMM_WORLD, dm)
  call PetscObjectViewFromOptions(dm, PETSC_NULL_VEC, "-show_serial_plex", ierr); CHKERRQ(ierr)

  call distribute_dmplex(dm, dmdist)
  call PetscObjectViewFromOptions(dmdist, PETSC_NULL_VEC, "-show_dist_plex", ierr); CHKERRQ(ierr)

  call label_non_local_points(dmdist)
  call PetscObjectViewFromOptions(dmdist, PETSC_NULL_VEC, "-show_labeled_plex", ierr); CHKERRQ(ierr)

  call DMDestroy(dmdist, ierr);CHKERRQ(ierr)
  call DMDestroy(dm, ierr);CHKERRQ(ierr)
  call PetscFinalize(ierr)

contains

subroutine create_plex(comm, dm)
  integer, intent(in) :: comm
  type(tDM), intent(out) :: dm
  integer :: myid

  PetscInt :: i, k, Nfaces, Nedges, Nverts, chartsize

  call mpi_comm_rank(comm, myid, ierr);CHKERRQ(ierr)
  call DMPlexCreate(comm, dm, ierr);CHKERRQ(ierr)
  call PetscObjectSetName(dm, 'testplex', ierr);CHKERRQ(ierr)
  call DMSetDimension(dm, 2_pi, ierr);CHKERRQ(ierr)

  if(myid.eq.0) then

    !  16---11----17----12---18
    !   |         /|\         |
    !   |        / | \        |
    !   |  0    /  |  \    3  |
    !   |      /   |   \      |
    !   6     7    8    9    10
    !   |    /     |     \    |
    !   |   /      |      \   |
    !   |  /   1   |   2   \  |
    !   | /        |        \ |
    !   |/         |         \|
    !  13----4----14----5----15

    Nfaces = 4
    Nedges = 9
    Nverts = 6
    chartsize = 19
  else
    Nfaces = 0
    Nedges = 0
    Nverts = 0
    chartsize = 0
  endif

  call DMPlexSetChart(dm, 0_pi, chartsize, ierr); CHKERRQ(ierr)

  ! Preallocation
  k = 0
  ! Each cell has 3 edges
  do i = 1, Nfaces
    call DMPlexSetConeSize(dm, k, 3_pi, ierr); CHKERRQ(ierr)
    k = k+1
  enddo

  ! Each edge has 2 vertices
  do i = 1, Nedges
    call DMPlexSetConeSize(dm, k, 2_pi, ierr); CHKERRQ(ierr)
    k = k+1
  enddo

  call DMSetUp(dm, ierr); CHKERRQ(ierr) ! Allocate space for cones

  if(myid.eq.0) then
    ! Setup Connections
    call DMPlexSetCone(dm,  0_pi, [6_pi, 7_pi,11_pi], ierr); CHKERRQ(ierr)
    call DMPlexSetCone(dm,  1_pi, [4_pi, 8_pi, 7_pi], ierr); CHKERRQ(ierr)
    call DMPlexSetCone(dm,  2_pi, [5_pi, 9_pi, 8_pi], ierr); CHKERRQ(ierr)
    call DMPlexSetCone(dm,  3_pi, [9_pi,10_pi,12_pi], ierr); CHKERRQ(ierr)

    call DMPlexSetCone(dm,  4_pi, [13_pi,14_pi], ierr); CHKERRQ(ierr)

[petsc-dev] patch for wrong integer type

2019-02-04 Thread Fabian.Jakub via petsc-dev
Dear Petsc Team,

I recently had segfaults when dumping DMPlex objects through the
PetscObjectViewer into hdf5 files.

This happens to me with 64 bit integers and I think there is a PetscInt
where an int should be placed.

Please have a look at the attached patch.


From 78a8c48ed0956277273d50a22c456bf0d43db235 Mon Sep 17 00:00:00 2001
From: "Fabian.Jakub" 
Date: Mon, 4 Feb 2019 19:48:37 +0100
Subject: [PATCH] fix integer type given to PetscStrToArray

Had segfaults when dumping DMPlexs through HDF5 with 64-bit integers
Given integer was PetscInt* but expects int*
 src/sys/classes/viewer/impls/hdf5/hdf5v.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/sys/classes/viewer/impls/hdf5/hdf5v.c b/src/sys/classes/viewer/impls/hdf5/hdf5v.c
index a87d77c..ae58aa8 100644
--- a/src/sys/classes/viewer/impls/hdf5/hdf5v.c
+++ b/src/sys/classes/viewer/impls/hdf5/hdf5v.c
@@ -1009,7 +1009,7 @@ static PetscErrorCode PetscViewerHDF5Traverse_Internal(PetscViewer viewer, const
   const char rootGroupName[] = "/";
   hid_t  h5;
   PetscBool  exists=PETSC_FALSE;
-  PetscInt   i,n;
+  int        i, n;
   char   **hierarchy;
   char   buf[PETSC_MAX_PATH_LEN]="";
   PetscErrorCode ierr;
