ok, so what I get from this conversation is the following todo list:

1. check out the tests in src/mpi/romio/test
2. revisit the atomicity issue. You are right that there are scenarios where it 
might be required; the fact that we were not able to hit the issues in our 
tests is no evidence that they cannot occur.
3. work on an update of the FAQ section.



-----Original Message-----
From: users <users-boun...@lists.open-mpi.org> On Behalf Of Dave Love via users
Sent: Monday, January 18, 2021 11:14 AM
To: Gabriel, Edgar via users <users@lists.open-mpi.org>
Cc: Dave Love <dave.l...@manchester.ac.uk>
Subject: Re: [OMPI users] 4.1 mpi-io test failures on lustre

"Gabriel, Edgar via users" <users@lists.open-mpi.org> writes:

>> How should we know that's expected to fail?  It at least shouldn't fail like 
>> that; set_atomicity doesn't return an error (which the test is prepared for 
>> on a filesystem like pvfs2).  
>> I assume doing nothing, but appearing to, can lead to corrupt data, and I'm 
>> surprised that isn't being seen already.
>> HDF5 requires atomicity -- at least to pass its tests -- so presumably 
>> anyone like us who needs it should use something mpich-based with recent or 
>> old romio, and that sounds like most general HPC systems.  
>> Am I missing something?
>> With the current romio everything I tried worked, but we don't get that 
>> option with openmpi.
>
> First of all, it is mentioned on the FAQ sites of Open MPI, although 
> admittedly it is not entirely up to date (it also lists external32 support 
> as missing, which has however been available since 4.1).

Yes, the FAQ was full of confusing obsolete material when I last looked.
Anyway, users can't be expected to check whether any particular operation is 
expected to fail silently.  I should have said that
MPI_File_set_atomicity(3) explicitly says the default is true for multiple 
nodes, and doesn't say the call is a no-op with the default implementation.  I 
don't know whether the MPI spec allows not implementing it, but I at least 
expect an error return if it doesn't.
As far as I remember, that's what romio does on a filesystem like pvfs2 (or 
lustre when people know better than implementers and insist on noflock); I 
mis-remembered from before, thinking that ompio would be changed to do the 
same.  From that thread, I did think atomicity was on its way.

Presumably an application requests atomicity for good reason, and can take 
appropriate action if the status indicates it's not available on that 
filesystem.
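
To be concrete, what I'd expect an application to be able to do is roughly the 
sketch below (file name and fallback action are placeholders, and it obviously 
only helps if the implementation reports honestly rather than silently 
accepting the request):

  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
      MPI_File fh;
      int flag = 0;

      MPI_Init(&argc, &argv);

      /* the default error handler for files is MPI_ERRORS_RETURN,
         so the return codes can be tested directly */
      MPI_File_open(MPI_COMM_WORLD, "testfile",
                    MPI_MODE_CREATE | MPI_MODE_RDWR, MPI_INFO_NULL, &fh);

      int err = MPI_File_set_atomicity(fh, 1);
      MPI_File_get_atomicity(fh, &flag);

      if (err != MPI_SUCCESS || !flag) {
          /* atomic mode not available on this filesystem/component:
             serialize the conflicting writers ourselves, or bail out */
          fprintf(stderr, "atomic mode not supported here\n");
      }

      MPI_File_close(&fh);
      MPI_Finalize();
      return 0;
  }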

> You don't need atomicity for the HDF5 tests, we are passing all of them to 
> the best my knowledge, and this is one of the testsuites that we do run 
> regularly as part of our standard testing process.

I guess we're just better at breaking things.

> I am aware that they have an atomicity test - which we pass, for whatever 
> reason. This also highlights, btw, the issue(s) that I am having with the 
> atomicity option in MPI I/O. 

I don't know what the application of atomicity is in HDF5.  Maybe it isn't 
required for typical operations, but I assume it's not used blithely.  However, 
I'd have thought HDF5 should be prepared for something like pvfs2, and at least 
not abort the test at that stage.
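
For what it's worth, HDF5 exposes this per file through H5Fset_mpi_atomicity, 
which can return failure, so a test could in principle skip rather than abort.  
A rough sketch of what I mean, assuming a parallel HDF5 build (the file name is 
just an example):

  #include <hdf5.h>
  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
      MPI_Init(&argc, &argv);

      /* open a file collectively with the MPI-IO driver */
      hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
      H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
      hid_t file = H5Fcreate("atomic.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

      /* ask for atomic mode; if that fails, skip the dependent part
         of the test rather than aborting the whole run */
      if (H5Fset_mpi_atomicity(file, 1) < 0) {
          fprintf(stderr, "atomic mode not available, skipping\n");
      }

      H5Fclose(file);
      H5Pclose(fapl);
      MPI_Finalize();
      return 0;
  }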

I've learned to be wary of declaring concurrent systems working after a few 
tests.  In fact, the phdf5 test failed for me like this when I tried across 
four lustre client nodes with 4.1's defaults.  (I'm confused about the striping 
involved, because I thought I set it to four, and now it shows as one on that 
directory.)

  ...
  Testing  -- dataset atomic updates (atomicity)
  Proc 9: *** Parallel ERRProc 54: *** Parallel ERROR ***
      VRFY (H5Sset_hyperslab succeeded) failed at line 4293 in t_dset.c
  aborting MPI proceProc 53: *** Parallel ERROR ***

Unfortunately I hadn't turned on backtracing, and I wouldn't get another job 
through for a while.

> The entire infrastructure to enforce atomicity is actually in place in ompio, 
> and I can give you the option to enforce strict atomic behavior for 
> all files in ompio (just not on a per-file basis), just be aware that the 
> performance will nose-dive. This is not just the case with ompio but also with 
> romio; you can read up on that topic on various discussion boards, look 
> at NFS-related posts (where you need the atomicity for correctness in 
> basically all scenarios).

I'm fairly sure I accidentally ran tests successfully on NFS4, at least 
single-node.  I never found a good discussion of the topic, and what I have 
seen about "NFS" was probably specific to NFS3 and non-POSIX compliance, though 
I don't actually care about parallel i/o on NFS.  The information we got about 
lustre was direct from Rob Latham, as nothing showed up online.

I don't like fast-but-wrong, so I think there should be the option of 
correctness, especially as it's the documented default.

> Just as another data point, in the 8+ years that ompio has been available, 
> there was not one issue reported related to correctness due to missing the 
> atomicity option.

Yes, I forget some history over the years, like that one on a local
filesystem:
<https://www.mail-archive.com/users@lists.open-mpi.org/msg32752.html>.

> That being said, if you feel more comfortable using romio, it is completely 
> up to you. Open MPI offers this option, and it is incredibly easy to set the 
> default parameters on a  platform for all users such that romio is being used.

Unfortunately that option fails the tests.
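
For reference, I take it the setting meant is something like the one below, 
either system-wide or per user (the component name is my guess for 4.1 and 
changes between releases; ompi_info lists what is actually available):

  # in $prefix/etc/openmpi-mca-params.conf or $HOME/.openmpi/mca-params.conf
  io = romio321
  # or per run:  mpirun --mca io romio321 ...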

> We are doing with our limited resources the best we can, and while ompio is 
> by no means perfect, we try to be responsive to issues reported by users and 
> value constructive feedback and discussion.

I'm sorry to sound off, but experience -- not just mine -- is that issues 
typically don't get resolved; Mark's issue has been open for a year.  It 
probably doesn't help that people are even told off for using the issue 
tracker.  Generally it's not surprising if there's a shortage of effort when 
outside contributions seem unwelcome.  I've tried to contribute several times.  
The final attempt wasted two or three days: I was encouraged to get the port of 
current romio into a decent state while the same work was being done separately 
"behind the scenes", and that still hasn't been released.
