Jose,

I really appreciate your fix for this. I will try out SLEPc 3.8 with the mentioned option, and also with fblaslapack. I will get back to you on whether that fixes the problem.
--
Regards,
Ramki

On 10/23/17, 12:03 PM, "Jose E. Roman" <[email protected]> wrote:

To close this old thread, I would like to mention that in SLEPc 3.8 we have added a command-line option that should fix the problem:

  -ds_parallel synchronized

This option forces a synchronization of the results of local computations in DS (that involve LAPACK calls), so that all MPI processes have exactly the same result. The lack of such synchronization was causing the failure you reported.

If this option is not provided, the behaviour is the same as in SLEPc 3.7 and before, i.e., all processes do the computation redundantly (-ds_parallel redundant).

Jose

> On 16 Jun 2017, at 17:36, Jose E. Roman <[email protected]> wrote:
>
> I still need to work on this, but in principle my previous comments are confirmed. In particular, in my tests it seems that the problem does not appear if PETSc has been configured with --download-fblaslapack
> If you have a deadline, I would suggest that you go this way until I can find a more definitive solution.
>
> Jose
>
>> On 16 Jun 2017, at 14:50, Kannan, Ramakrishnan <[email protected]> wrote:
>>
>> Jose/Barry,
>>
>> Excellent. This is good news. I have a deadline on this code next Wednesday and hope it is not a big one to address. Please keep me posted.
>> --
>> Regards,
>> Ramki
>>
>> On 6/16/17, 8:44 AM, "Jose E. Roman" <[email protected]> wrote:
>>
>> I was able to reproduce the problem. I will try to track it down.
>> Jose
>>
>>> On 16 Jun 2017, at 2:03, Barry Smith <[email protected]> wrote:
>>>
>>> Ok, got it.
>>>
>>>> On Jun 15, 2017, at 6:56 PM, Kannan, Ramakrishnan <[email protected]> wrote:
>>>>
>>>> You don't need to install. Just download and extract the tar file. There will be a folder of include files. Point to this folder in build.sh.
>>>>
>>>> Regards, Ramki
>>>> Android keyboard at work. Excuse typos and brevity
>>>> From: Barry Smith
>>>> Sent: Thursday, June 15, 2017 7:54 PM
>>>> To: "Kannan, Ramakrishnan"
>>>> CC: "Jose E. Roman", [email protected]
>>>> Subject: Re: [petsc-users] slepc NHEP error
>>>>
>>>> brew install Armadillo fails for me on brew install hdf5. I have reported this to Homebrew and hopefully they'll have a fix within a couple of days so I can try to run the test case.
>>>>
>>>> Barry
>>>>
>>>>> On Jun 15, 2017, at 6:34 PM, Kannan, Ramakrishnan <[email protected]> wrote:
>>>>>
>>>>> Barry,
>>>>>
>>>>> Attached is the quick test program I extracted out of my existing code. It is not clean, but you should still be able to follow it. I use slepc 3.7.3 and 32 bit real petsc 3.7.4.
>>>>>
>>>>> This requires armadillo from http://arma.sourceforge.net/download.html. Just extract it and set the correct path to armadillo in build.sh.
>>>>>
>>>>> I compiled and ran the code. The error and the output file are also in the tar.gz file.
>>>>>
>>>>> I appreciate your kind support and look forward to an early resolution.
>>>>> --
>>>>> Regards,
>>>>> Ramki
>>>>>
>>>>> On 6/15/17, 4:35 PM, "Barry Smith" <[email protected]> wrote:
>>>>>
>>>>>> On Jun 15, 2017, at 1:45 PM, Kannan, Ramakrishnan <[email protected]> wrote:
>>>>>>
>>>>>> Attached is the latest error w/ 32 bit petsc and the uniform random input matrix. Let me know if you are looking for more information.
>>>>>
>>>>> Could you please send the full program that reads in the data files and runs SLEPc generating the problem? We don't have any way of using the data files you sent us.
>>>>>
>>>>> Barry
>>>>>
>>>>>> --
>>>>>> Regards,
>>>>>> Ramki
>>>>>>
>>>>>> On 6/15/17, 2:27 PM, "Jose E. Roman" <[email protected]> wrote:
>>>>>>
>>>>>>> On 15 Jun 2017, at 19:35, Barry Smith <[email protected]> wrote:
>>>>>>>
>>>>>>> So where in the code is the decision on how many columns to use made? If we look at that it might help see why it could ever produce different results on different processes.
>>>>>>
>>>>>> After seeing the call stack again, I think my previous comment is wrong. I really don't know what is happening. If the number of columns was different in different processes, it would have failed before reaching that line of code.
>>>>>>
>>>>>> Ramki: could you send me the matrix somehow? I could try it on a machine here. Which options are you using for the solver?
>>>>>>
>>>>>> Jose
>>>>>>
>>>>>> <slepc.e614138><Arows.tar.gz>
>>>>>
>>>>> <testslepc.tar.gz>
